[tahoe-dev] [tahoe-lafs] #753: use longer storage index / cap for collision resistance

tahoe-lafs trac at allmydata.org
Fri Jul 10 14:14:43 UTC 2009


#753: use longer storage index / cap for collision resistance
---------------------------+------------------------------------------------
 Reporter:  warner         |           Owner:           
     Type:  defect         |          Status:  new      
 Priority:  major          |       Milestone:  undecided
Component:  code-encoding  |         Version:  1.4.1    
 Keywords:                 |   Launchpad_bug:           
---------------------------+------------------------------------------------

Comment(by zooko):

 Thanks for this analysis.  I like your comments at the end that we want a
 bit of "overkill" in the number of files and the chance of collision.
 People who don't want to rely on the collision-resistance of secure hash
 functions at ''all'' are probably not part of our market, but people who
 are willing to do so in principle, yet would feel better with a nice fat
 margin of safety, are definitely in our market.

 Note that if you generate a new write cap (private key), and it turns out
 that the same write cap has already been generated by someone else, then
 you have gained the ability to write to their mutable file or directory!
 That's why I have been thinking that 96 bits was too few for
 write caps.  Originally I had been thinking something like "It would
 probably not be worthwhile for any attacker to spend 2^96^ computer power
 trying to forge a write cap.".  But this way of thinking discounts two
 important factors: chance of success and number of targets.  If there are
 2^40^ writecaps in use, then an attacker who does a mere 2^36^ work
 (excluding the cost of checking whether each writecap that they generate
 is already out there) has about a 2^-20^ chance of forging
 ''someone's'' writecap.  (Thanks to Daniel J. Bernstein's papers and
 mailing list postings for helping me understand this important point.
 http://cr.yp.to/snuffle/bruteforce-20050425.pdf )
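
 To make that arithmetic concrete, here is a tiny back-of-the-envelope
 sketch (illustrative Python, not Tahoe code; the function name is made
 up): with T targets in a space of 2^b^ values, W independent random
 guesses hit ''some'' target with probability roughly W*T/2^b^.

 {{{
 def forge_probability(bits, targets, work):
     # Probability that `work` independent random guesses hit at least
     # one of `targets` values drawn from a space of 2**bits
     # (small-probability approximation: work * targets / 2**bits).
     return (work * targets) / 2.0 ** bits

 # 96-bit writecaps, 2**40 of them in use, attacker does 2**36 guesses:
 print(forge_probability(96, 2 ** 40, 2 ** 36))  # ~9.5e-07, about 2^-20
 }}}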

 However, a storage-index collision doesn't sound nearly as serious to me.
 No integrity or confidentiality is necessarily lost due to a storage-index
 collision, right?  Well, it could complicate the question of "which
 servers are handling which shares of this mutable file" -- an issue that
 is already not well managed.

 Anyway, nowadays I think of storage-indexes as a separate layer built on
 top of the crypto layer.  People can define their storage indexes as
 secure hashes of some pieces of the capabilities if they want, or choose
 random storage indices, or hierarchical ones based on DNS names, or just
 not have storage indices at all and require every downloader to query
 every server.  None of these choices should impact the security of the
 crypto layer, as long as the crypto layer includes integrity checking
 using the verifycap itself on the storage-server side.
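
 For instance, two equally legitimate derivations might look like the
 sketch below (illustrative Python only; the tag string and function
 names are hypothetical, and this is not necessarily how Tahoe derives
 its storage indices):

 {{{
 import hashlib, os

 def storage_index_from_cap(cap_bytes):
     # Derive a 128-bit storage index by hashing (part of) the
     # capability under a tag; anyone holding that piece of the cap
     # can recompute the same index.
     return hashlib.sha256(b"storage-index-tag:" + cap_bytes).digest()[:16]

 def random_storage_index():
     # Or pick a purely random index with no cryptographic relationship
     # to the cap at all; downloaders then need some other way to learn
     # which index to ask the servers for.
     return os.urandom(16)
 }}}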

 I think we should write a document called something like "crypto failure
 modes (What could possibly go wrong?)" that explains what the consequences
 are of each different type of failure.  (As requested by Nathan:
 http://allmydata.org/pipermail/tahoe-dev/2009-April/001661.html .)

 The one discussed above is caused by "two people choose the same write cap
 (signing key) seed (possibly due to malice)".  That one leads to an
 integrity failure, where one of the people thinks that they are the only
 one with a write-cap to a specific file or directory, but actually someone
 else also has the same write-cap.

 So I think that is the worst one (because I value integrity highly).
 Another similar integrity failure could come about from a failure of the
 digital signature algorithm -- i.e. if someone were able to forge digital
 signatures even without the writecap.  (Note that a collision in the hash
 function used by the digital signature algorithm could cause this.  People
 who don't want to rely on collision-resistance of secure hash functions at
 ''all'' can't even rely on RSA, DSA, SSH, SSL, or GPG, although I should
 hasten to add that those algorithms typically include some randomness in
 the input to their secure hash function, to make it that much harder for
 attackers to cause collisions.)
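
 (To illustrate that last point with a sketch -- illustrative Python
 only, not how any of those systems actually implement it: prefixing
 the message with a fresh random salt before hashing means a collision
 precomputed offline on two chosen messages no longer applies, because
 the attacker cannot predict the salt.)

 {{{
 import hashlib, os

 def randomized_digest(message):
     # Hash a fresh random salt together with the message.  The signer
     # would sign this digest and publish the salt alongside the
     # signature so that verifiers can recompute it.
     salt = os.urandom(16)
     return salt, hashlib.sha256(salt + message).digest()
 }}}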

 After those integrity failures, there are confidentiality failures.  The
 obvious one is someone being able to crack AES-128-CTR without having the
 readkey.  Another one is if the content-hash-key encryption were to
 generate the same readkey for two different immutable files.  I suppose
 that's another reason why using an {{{added convergence secret}}} is safer
 (although I hasten to add that I see no reason to distrust the collision-
 resistance of SHA-256d-truncated-to-128-bits at present).
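
 In sketch form (illustrative Python; the tag string and truncation are
 hypothetical and this is not Tahoe's exact tagged-hash derivation), the
 added convergence secret enters the readkey derivation something like
 this, so two users encrypting the same plaintext with different secrets
 get different readkeys:

 {{{
 import hashlib

 def readkey_for_immutable_file(content, convergence_secret=b""):
     # Content-hash-key ("convergent") encryption: the AES-128 readkey
     # is derived from the file's content, so identical plaintexts
     # converge to identical ciphertexts.  Mixing in a per-user
     # convergence secret breaks that linkage across users who choose
     # different secrets.
     content_hash = hashlib.sha256(content).digest()
     return hashlib.sha256(
         b"readkey-tag:" + convergence_secret + content_hash
     ).digest()[:16]
 }}}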

 Note that of course in practice the dangers from bugs and from operator
 error (e.g. misplacing your keys) are a lot greater than these algorithmic
 crypto risks.  So much greater that the former are pretty much guaranteed
 to happen and the latter probably never will.  Nonetheless, I value
 getting the crypto part right so that it is secure and also so that
 everyone who is willing to rely on crypto in principle is willing to rely
 on our crypto, so thanks for your help with this.

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/753#comment:1>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

