[tahoe-dev] a crypto puzzle about digital signatures and future compatibility

Brian Warner warner at lothar.com
Thu Aug 27 01:49:53 UTC 2009

Nathan wrote:
> The scenario of a 1.7 client producing two malicious files is a little
> off, IMO.

At lunch today, Nathan and I discussed this further. There are two
separate sorts of attacks, which (I think) roughly parallel the
difference between first-preimage and second-preimage attacks on hash

Attack A is where Alice uploads a file, derives a filecap, gives the
filecap to Bob, and then Bob downloads the file. Bob desires to see
whatever file Alice wanted him to see, and to not rely upon the servers
or other non-Alice parties to achieve this goal. The attacker (someone
other than Alice) can give Bob any shares they like. The attacker wins
if Bob accepts a file which is different than what Alice wanted him to

Attack B is where Alice uploads a file, Bob gets the filecap and
downloads it, Carol gets the same filecap and downloads it, and Carol
desires to see the same file that Bob saw. (Bob and Carol may be the
same person at different times, or Bob may have signed a contract
referencing the filecap and Carol is the judge who later enforces the
contract). The attackers (who may be Alice and/or other parties) get to
craft the filecap and the shares however they like. The attackers win if
Bob and Carol accept different documents.

In Attack A, Alice is the good-guy, and wants to help Bob win. Any
additional verification data she can put into the filecap will help. If
she only puts a CRC in there, Bob won't have a very good chance against
an active attacker (although he's got a great chance against line
noise). An md5sum is much better, and a sha256sum better still.

In Attack B, Alice is the bad-guy, and any additional verification data
she adds will hurt (hurts her bad-guy chances, helps Bob+Carol's
good-guy chances). So she's motivated to provide as little as possible.
The best way to do this and still *look* like she's playing by the rules
is to claim obsolescence. (or to claim that she's from so far into the
future that even SHA256 is beneath her.. "what? you don't understand
SHA9? tough luck, you should have updated by now, where I come from you
can get SHA256-breaking quantum wristwatches with your breakfast
cereal".. Bob is only helped by verification data that he can actually

I always get confused about the difference between first-preimage and
second-preimage, but I think there's a correspondence here. In Attack A,
the attacker doesn't get to choose the filecap (i.e. the hash of the
message): they've got to create shares to match a specific
pre-determined cap. In Attack B, Alice can craft an arbitrarily complex
message, taking advantage of a known collision or whatever.

Now, of course, the actual point of interest is to define what it means
(in Attack A) for Alice to want to help Bob "win". Certainly Bob loses
if he reads the wrong document. And clearly he wins if he reads the
right document[1]. But it's a judgement/usability call to say if not
reading any document (i.e. detecting the hash failure) is a relative win
or a loss.

If the contents of the message were critical ("cut the red wire for
sure" vs "I don't know which wire to cut so I'll just run away from the
bomb"), then you might prefer unavailability (detect the hash failure
and ignore the message) over inconsistency (slight probability of
believing the wrong mesage, dependent upon external factors like the
advancement of science). But if the message contents are less critical
("pick up some milk on the way home"), the judgment call might run the
other way. You may have other safeguards in place to handle undetected
hash failures (the message says "pick up a 747 on the way home", but I
don't have enough cash for that, and also the grocery store is out of
stock of airplanes this week).

Like the CAP Theorem, you can't get both properties. And like the CAP
Theorem, the discussion frequently gets emotional, because everybody has
some internal notion of what is probable, and what other safeguards
might be in place, etc. Also, you can't generally apply the usual
rational arguments to the probabilities, because they aren't really
quantifiable. Sure, there's a well-established lower bound provided by
the Birthday paradox, but that's effectively zero. The real threat is
some motivated attacker who's smarter than you (and your hash-algorithm
building friends), and there's no way to write down a probability of
that (at least for SHA and all the other functions that don't reduce to
a well-known hard problem).

So we can imagine somebody who's one of those rational economist types,
trying to make a judgement call. They just finished deciding to not buy
a lottery ticket (payoff*probability < price), and deciding to drive to
work (utility-gain > chance-of-accident), and to skip the skydiving trip
(metric-fun < chance-of-death*non-funness-of-death), and now they want
to decide whether they want their document-validating system to
fail-safe or not (i.e. prefer consistency or availability), and they've
got their utility function all set to go. But you give them a
probability-of-failure function that says max(really-tiny-number,
completely-unknowable-number). There's no way to evaluate that. So
rationality goes out the window and people just wing it.

And what's worse, what they hear from the crypto folks sounds like
sky-is-falling talk: "SHA-1 is broken", but then it's decades if ever
before somebody actually uses a collision to steal money from them. They
never hear what they want to hear, which is "use algorithm XYZ and
you'll be safe forever and can stop thinking about this". They're always
being told to do more work. It's hard to evaluate something as a process
when you're asking for a single immediate tool.

Anyways, so, there are arguments in both directions. I personally lean
towards the availability side, in the "make it possible for future
generations to be more secure" style (i.e. ship lots of hashes, the
readers check all that they can).


More information about the tahoe-dev mailing list