[tahoe-dev] [tahoe-lafs] #453: safely add plaintext_hash to immutable UEB

tahoe-lafs trac at allmydata.org
Tue Feb 23 09:43:57 UTC 2010

#453: safely add plaintext_hash to immutable UEB
 Reporter:  warner                         |           Owner:           
     Type:  enhancement                    |          Status:  new      
 Priority:  major                          |       Milestone:  undecided
Component:  code-encoding                  |         Version:  1.0.0    
 Keywords:  integrity newcaps performance  |   Launchpad_bug:           

Comment(by warner):

 I like to support parallelism and performance, so if there really is a
 tradeoff between having a flat hash and being able to do high-speed
 super-parallel uploads, I'll prefer the choice that gives us performance.
 This may mean having the flat hash be optional: if the uploader chooses to
 provide it, and if the downloader chooses to download the entire file,
 the downloader will check it. This also gives enough information for a
 simple downloader to validate the whole file (or for a little shell script
 that's comparing hashes of files on disk against data in the UEB).
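
 A minimal sketch of the "simple downloader" check described above: hash a
 whole file on disk in one linear pass and compare it to an expected flat
 hash. Plain SHA-256 is an assumption made for illustration, not the tagged
 hash that tahoe actually uses, and the names `flat_hash` and
 `check_flat_hash` are hypothetical.

```python
import hashlib


def flat_hash(path, chunk_size=1 << 20):
    """Hash a whole file in one linear pass.

    Assumption: plain SHA-256 stands in for whatever hash the UEB stores.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_flat_hash(path, expected_hex):
    """Return True if the file on disk matches the expected flat hash."""
    return flat_hash(path) == expected_hex.lower()
```

 Every byte has to pass through the single hasher in order, which is the
 property that conflicts with highly parallel uploads.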

 So I think there should be four hash-like items in the shares:

  * flat plaintext hash: optional
  * tree plaintext hash: optional
  * flat ciphertext hash: optional
  * tree ciphertext hash: mandatory

 and the flat hashes are only checked by a downloader who is fetching the
 entire file. (Note that my new downloader code is very segment-at-a-time
 random-access oriented, so even in the near term the downloader might
 ignore the flat hashes.)
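
 The four items and the rule for when each is checked could be modelled
 roughly as follows. This is a sketch only: the class name `ShareHashes`,
 its field names, and `hashes_to_check` are hypothetical, and checking the
 plaintext tree whenever it is present is an assumption, not something the
 ticket specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ShareHashes:
    """Hypothetical container for the four hash-like items in a share."""
    tree_ciphertext: bytes                    # mandatory
    flat_plaintext: Optional[bytes] = None    # optional
    tree_plaintext: Optional[bytes] = None    # optional
    flat_ciphertext: Optional[bytes] = None   # optional


def hashes_to_check(hashes: ShareHashes, whole_file: bool) -> List[str]:
    """Name the hashes a downloader would verify for this download."""
    checks = ["tree_ciphertext"]  # always present, always checked
    if hashes.tree_plaintext is not None:
        # Assumption: a tree hash is usable for any segment-level read.
        checks.append("tree_plaintext")
    if whole_file:
        # Flat hashes only make sense once every byte has been fetched.
        if hashes.flat_ciphertext is not None:
            checks.append("flat_ciphertext")
        if hashes.flat_plaintext is not None:
            checks.append("flat_plaintext")
    return checks
```

 A segment-at-a-time downloader would call this with `whole_file=False`
 and so never look at the flat hashes.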

 Then, on the day that we write a super-parallelized upload hasher (one day
 after all tahoe users install an 18 exabyte-per-second DSL line, and two
 days after we reduce the protocol to a single roundtrip, otherwise it
 wouldn't make any significant difference), we also add a tahoe.cfg option
 that enables or disables the generation of the flat hashes. Enabled would
 result in slower upload hashing (one core would have to linearly see every
 byte of the file). Disabled would result in faster uploads but would lose
 those simple flat hashes that some downloaders might want to use.
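
 A rough sketch of what such a switch might look like. The option name
 `upload.generate_flat_hashes`, its placement in `[client]`, and its default
 of enabled are all assumptions; no such option exists in the ticket. The
 Merkle construction is deliberately simplified (plain SHA-256, odd levels
 padded by repeating the last node) and is not tahoe's actual hash tree.

```python
import configparser
import hashlib


def flat_hashes_enabled(cfg_text):
    """Read the hypothetical on/off switch from tahoe.cfg-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Assumption: flat hashes stay on unless explicitly disabled.
    return parser.getboolean("client", "upload.generate_flat_hashes",
                             fallback=True)


def leaf_hashes(segments):
    """Each leaf depends on one segment only, so this step can be spread
    across cores (done serially here for simplicity)."""
    return [hashlib.sha256(s).digest() for s in segments]


def merkle_root(leaves):
    """Simplified Merkle root: pair up nodes level by level."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # simplification: repeat the last node
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def upload_hashes(segments, generate_flat):
    """Always build the tree hash; add the flat hash only when enabled."""
    result = {"tree": merkle_root(leaf_hashes(segments))}
    if generate_flat:
        h = hashlib.sha256()
        for s in segments:  # strictly sequential: one hasher sees every byte
            h.update(s)
        result["flat"] = h.digest()
    return result
```

 With the switch off, only the per-segment work remains, which is the part
 that parallelizes.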

 Oh, and I forgot to mention this in the original description: we can put
 encrypted plaintext merkle tree hashes in the old
 section, which will be ignored by old clients as long as they don't see a
 {{{plaintext_root_hash}}} key in the UEB. This will let us quietly add an
 encrypted plaintext hash tree without impacting compatibility with older
 clients.
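
 The compatibility argument can be sketched as a key lookup. Only the
 `plaintext_root_hash` key comes from the comment above; the function name
 and the `encrypted_plaintext_root_hash` key are hypothetical, used here
 just to show a new-style UEB that old clients would not react to.

```python
def old_client_reads_plaintext_section(ueb):
    """An old client consults the plaintext hash section only when the
    UEB carries a plaintext_root_hash key; otherwise it ignores it."""
    return "plaintext_root_hash" in ueb


# Hypothetical UEB written by a newer uploader: the root of the encrypted
# tree lives under a different (made-up) key, so old clients skip it.
new_style_ueb = {
    "crypttext_root_hash": b"\x00" * 32,
    "encrypted_plaintext_root_hash": b"\x01" * 32,  # assumed key name
}

# UEB that does carry the key old clients look for.
old_style_ueb = {
    "crypttext_root_hash": b"\x00" * 32,
    "plaintext_root_hash": b"\x02" * 32,
}
```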

Ticket URL: <http://allmydata.org/trac/tahoe/ticket/453#comment:4>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid
