[tahoe-dev] collision-resistance vs. second-pre-image resistance (was: Tahoe performance)
zooko at zooko.com
Wed Feb 25 16:44:31 UTC 2009
Dear Jack Lloyd:
I enjoyed your example of how collisions in rsync's hashes could be
exploited (quoted below). Devious!
I've been trying to articulate a general principle about this, and
your example has helped. The general principle is that if you have
an immutable mapping from identifier to static object (i.e., to a
specific sequence of bytes), then users will assume that the
important property is that the object you get is always *that one
object*. They will assume this property even if you as the
cryptographer specify the weaker property that the object you get
will always be an object which was chosen by the person who produced
The practical significance of this distinction is that you have to
double the size of the identifier. For secure-hash-based
identifiers, an identifier of size K bits is sufficient to give a K-
bit-security level against second-pre-image attacks (which provides
the 'was intended by the creator of the identifier' property), but an
identifier size of 2K bits is required to give K-bit-security against
collision attacks (which provides the 'that one object' property).
The current state of the art of security engineering often errs by
specifying and enforcing second-pre-image resistance and not thinking
that someone might rely on it for collision-resistance. I think Jack
Lloyd's devious exploit of the rsync hashes is another example of
this pattern. The recent widely publicized Rogue CA which exploits
collisions in MD5 is a beautiful example of this.
Ferguson and Schneier have argued, both in their book "Practical
Cryptography" and in recent SHA-3 discussions, that this pattern is
so common that the wise thing to do is simply to *always* use 2K-bit
identifiers and try to resist collisions -- not just second-pre-images.
However, I am just now reminded of Tyler Close's argument that users
will *not* assume that a mutable file cap -- an identifier of a
mutable file -- will stably point to the same sequence of bytes over
time. So we might be able to get away with having only K-bit
identifiers for those.
It would have to be carefully engineered to make sure that a sneaky
person couldn't screw up the system by generating multiple public
keys whose hashes collided, even though that same person knows the
signing key and has the option of just using it to sign new file
contents, as normal.
Thanks to Tyler Close for arguing that K-bit secure hashes were good
enough for the case that the creator has complete control over the
identified object anyway. (At Zeitgeist Bar a few years ago after a
On Feb 20, 2009, at 19:03 PM, Jack Lloyd wrote:
> On Thu, Feb 19, 2009 at 02:19:42PM -0800, Brian Warner wrote:
>> the same weak checksums), and put A1 on the destination machine
>> and A2
>> on the source, then 'rsync ./A2 remote:A1' would fail to properly
>> replace A1 with A2. But they're both your own files anyways.
> Not necessarily: I can think of many scenarios where it might be
> advantageous to ensure that rsync (or more generally, the system
> backup) would skip backing up a new version of a file. Such situations
> can occur anytime the creator of the file has interests that do not
> align with the person performing the backup operation.
> For instance, say I wish to hide a logic bomb on a corporate server,
> that six months after I left the company would activate and perform
> some unfriendly action. However I don't want to leave evidence behind,
> and the machine in question is backed up nightly to an incorruptible
> offsite system. If I simply uploaded the malicious executable, after
> it triggered one could go back and find said malicious code in the
> backups, providing forensic evidence that could be used against me. So
> instead I could prepare two files, the logic bomb executable plus an
> innocouous file which rsync would consider to be identical. Since most
> binary formats (eg ELF binary files, Office documents) are quite
> flexible, I could perhaps even create my dummy file to be something
> meaningful. Then I place the innocouous version where it will be
> backed up, after which I replace it with the malicious version. Then
> numerous backup operations would occur, and (assuming the malicious
> version deleted or replaced itself after triggering) leave no evidence
> behind, without having to corrupt the backup server or process itself.
More information about the tahoe-dev