next up previous
Next: Comparing probabilities Up: Questions about compare-by-hash Previous: Cryptographic hashes

Silent, deterministic, hard-to-fix errors

Ordinarily, anyone who discovered two inputs that hash to the same SHA-1 output would become world-famous overnight. On a system using compare-by-hash, that person would instead just silently read or overwrite the wrong data (which is more than a bug, it's a security hole). To understand why silent errors are so bad, think about silent disk corruption. Sometimes the corruption goes undetected until long after the last backup with correct data has been destroyed.

In addition, any two inputs that hash to the same value will always be treated incorrectly, whereas most hardware errors are transitory and data-independent. Redundant disks or servers provide no protection against data-dependent, deterministic errors. To avoid this, we could add a random seed every time we compute the hash, but we won't save anything except in the most extreme cases if we have to recompute hashes on every candidate local block every time we compare a block.

Once a hash collision has been found and a demonstrably buggy test program created using the colliding inputs, how will you fix the bug? Usually, the response to a test program that demonstrates a bug in the system is to fix the bug. In this case, the underlying algorithm is the bug.


next up previous
Next: Comparing probabilities Up: Questions about compare-by-hash Previous: Cryptographic hashes
2003-06-16