[tahoe-dev] [tahoe-lafs] #678: converge same file, same K, different M

tahoe-lafs trac at allmydata.org
Tue Aug 25 08:37:44 UTC 2009

#678: converge same file, same K, different M
 Reporter:  zooko          |           Owner:           
     Type:  enhancement    |          Status:  new      
 Priority:  major          |       Milestone:  undecided
Component:  code-encoding  |         Version:  1.3.0    
 Keywords:                 |   Launchpad_bug:           

Comment(by warner):

 One significant barrier to this is the share hash tree. When a
 file is uploaded, we take the hash of all shares (well, the block
 hash tree root for all shares) and put them into a list, then we
 build another merkle tree over that list, and store the root of
 that tree in the UEB (where it gets folded into the
 integrity-checking part of the filecap).

 This list is of length "N", the number of shares that were
 created, padded up to a power-of-two boundary (for the merkle
 tree). This makes it impossible to add more shares to the same
 hash tree.

 So if a file has already been uploaded with 3-of-10, and now you
 decide you'd like to encode it with 3-of-16, then shares 0..9
 will be the same, but the merkle tree that was distributed with
 the first upload won't contain the hashes for shares 10..15, and
 the UEB will be different for the two uploads, so the filecaps
 will be different (to be specific, the readkey [and therefore the
 storage-index] part will be the same, but the UEB hash that
 provides all the integrity information will be different).

 Let's analyze the compatibility grid. One one axis is the person
 doing the download: either Sally holding the "short" 3-of-10
 filecap, or Lucy holding the "long" 3-of-16 filecap. The other is
 the class of share being used:

  * A: the original 10 shares from the 3-of-10 upload
  * B: shares 0..9 from the new 3-of-16 upload
  * C: shares 10..15 from the new 3-of-16 upload

 (in ideal circumstances, the 3-of-16 upload would skip the "B"
 shares, but if that uploader didn't happen to see the "A" shares
 out there already, they might generate "B" shares)

 Sally handles "A" shares just fine, as usual. When she sees "B"
 shares, the current code will reject them because the UEB is
 different than the filecap. The shares don't contain a full share
 hash tree, but rather just the minimal chain necessary to
 validate that one share. If she has enough "A" shares to know the
 expected value of the "B" share in question (i.e. that leaf of
 the short tree), then she can use the "B" share with confidence
 (she'd use the share, but ignore the [long] hash chain that comes
 with it). This basically means she must get that share's sibling:
 if she has A0, she can validate B1; if she has A3, she can
 validate B2, etc.

 Should she see "C" shares (say that 5 times fast!), she's stuck.
 There is no hash chain that will get her from the C share's hash
 to her filecap's UEB hash. She could download the C share anyways
 and throw it into FEC, and then test the resulting ciphertext
 segment against the ciphertext hash tree: if it matches, great,
 the share was good. If it doesn't, then she won't even know which
 share had a problem.

 Now, what about Lucy? When she sees an "A" share, she'll be in
 the same boat as Sally looking at a "B" share. If Lucy manages to
 find a sibling "B" share for each potential "A" share, then she
 can validate the share hash, and use the share. But if she can't
 (such as if there are *no* "B" shares, because the uploader
 didn't create any because it saw the "A" shares out there), then
 she won't be able to validate the "A" shares, and then she'll be
 in the Sally-vs-C-shares boat: try FEC and catch problems with
 the ciphertext hash.

 So, what could be done to improve this?

  * include a full copy of the share hash tree in each share, not
    just the minimal chain: this would let Sally use arbitrary "B"
    shares and Lucy use arbitrary "A" shares, without forcing them
    to locate a sibling share. It still wouldn't let Sally use "C"
    shares with confidence.
  * allow the Downloader to speculate a bit: keep track of which
    shares went into FEC, if the ciphertext hash fails then
    iterate through the various combinations of shares, use some
    sort of constraint-based logic to figure out which
    combinations are still worth trying. The combinatorics will be
    smaller with fewer non-validated shares, so the peer-selection
    code should prioritize validatable shares.
  * if you plan ahead, you can pretend you're encoding to 3-of-16
    or 3-of-32 or whatever, but then only actually upload the
    first 10 shares. You still have to do all the encoding work,
    however, because you have to compute the correct merkle tree
    to put in those 10 shares, but you can save the bandwidth and
    storage space for the shares that you'll create later. It's
    not clear where the break-even point would be.

 I'm not sure what else could be done.

 For mutable files, it's a slightly different story. The encoding
 scheme is basically the same, but if you're willing to modify all
 the existing shares, you can increase the size of the share hash
 tree later on (build a bigger tree, sign the root, then update
 all shares with the new tree). Unfortunately I believe that the
 share format puts the share hash tree before the share data, so
 if you increase its size, you must also move the data (and the
 remote API has no insert() method), so you'll basically have to
 re-upload everything, including the old share data. So it's
 possible, but expensive.

Ticket URL: <http://allmydata.org/trac/tahoe/ticket/678#comment:3>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

More information about the tahoe-dev mailing list