

Use of compare-by-hash is justified by mathematical calculations based on assumptions that range from unproven to demonstrably wrong. The short lifetime and rapid obsolescence of cryptographic hash functions make them unsuitable for use in long-lived systems. When hash collisions do occur, they cause silent errors and bugs that are difficult to repair. What should worry computer scientists most about compare-by-hash is that real people are running real workloads that will execute incorrectly on systems that use it. Perhaps research would be better directed towards alternatives to, or improvements on, compare-by-hash that avoid the problems described. At the very least, future research using compare-by-hash should include a more careful analysis of the risk of hash collisions.