[tahoe-dev] [tahoe-lafs] #1181: new-downloader requests too much data, builds up

tahoe-lafs trac at tahoe-lafs.org
Wed Aug 18 23:52:09 UTC 2010


#1181: new-downloader requests too much data, builds up
--------------------------+-------------------------------------------------
 Reporter:  warner        |           Owner:      
     Type:  defect        |          Status:  new 
 Priority:  major         |       Milestone:  soon
Component:  code-network  |         Version:  1.8β
 Keywords:                |   Launchpad Bug:      
--------------------------+-------------------------------------------------
 In [comment:43:ticket:1170 #1170 comment:43] I describe a problem that
 I noticed in the way the new downloader code handles data for long
 reads (hundreds of segments). It looks like we ask for data which we
 don't end up using. The {{{._received}}} {{{DataSpans}}} structure on
 one of the later-to-arrive shares grew by about 350 ranges (and 35kB)
 over the course of 762 segments. The one increment I looked at was by
 64 bytes.
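
 A quick sanity check on those numbers (my arithmetic from the figures
 above, assuming the leftovers are mostly 32-byte hash-tree nodes):

 {{{
 # ~35kB spread across ~350 leftover ranges:
 35000 / 350    # => roughly 100 bytes per range, a few hash-sized pieces
 # the single 64-byte increment:
 64 / 32        # => exactly two 32-byte hash-tree nodes
 }}}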

 I have two guesses, the second more likely than the first:

  * our use of {{{IncompleteHashTree.needed_hashes()}}} with the
    {{{include_leaf=True}}} argument is telling us that we should ask
    for e.g. the hash of block0, so we ask for that part of the hash
    tree. But then we retrieve block0 itself, and can hash it ourselves
    and add it to the hash tree, meaning we never use the H(block0)
    data from {{{self._received}}}. This would result in a 32-byte hash
    value being added to (and never removed from) {{{self._received}}}
    once every other segment.

  * The ciphertext hash tree nodes are requested from all shares,
    because we never know which one will arrive first (and which ones
    won't arrive at all). But we only consume the data for these hash
    nodes out of the first share that provides them: we leave the
    others alone. This will result in a variable amount of data being
    left over in each share: larger amounts when the ciphertext hash
    tree path down to the current segnum leaf flips over higher
    branches (half of the time you'll only get one extra node, a
    quarter of the time you'll get two extra nodes, etc, peaking at the
    first segment and at segment numsegs/2, where you have to fetch a
    full log2(numsegs) nodes); see the counting sketch just after this
    list.
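
 To put a number on that second guess, here's a back-of-the-envelope
 sketch (standalone Python, not Tahoe code; {{{new_path_nodes()}}} is
 just an illustrative helper) that counts how many ciphertext-hash-tree
 nodes the path to each segment introduces relative to the previous
 segment's path:

 {{{
 # Rough model of the pattern described above: in a complete binary
 # hash tree over `numsegs` leaves, how many nodes on the path to leaf
 # `segnum` were not already on the path to leaf `segnum - 1`?

 def new_path_nodes(segnum, numsegs):
     depth = 0                    # tree depth, i.e. ceil(log2(numsegs))
     while (1 << depth) < numsegs:
         depth += 1
     if segnum == 0:
         return depth             # the first segment needs the full path
     trailing_zeros = 0           # the paths to segnum-1 and segnum
     while segnum % 2 == 0:       # diverge this many levels up
         trailing_zeros += 1
         segnum //= 2
     return min(trailing_zeros + 1, depth)

 if __name__ == "__main__":
     numsegs = 762
     counts = [new_path_nodes(s, numsegs) for s in range(numsegs)]
     # half the segments need 1 new node, a quarter need 2, and so on:
     # the average comes out to roughly 2 nodes (~64 bytes) per segment
     print("average new nodes per segment: %.2f"
           % (float(sum(counts)) / len(counts)))
 }}}

 An average of about two 32-byte nodes (~64 bytes) of never-consumed
 hash data per segment, on each share that loses the race, is in the
 same ballpark as the increment noted above.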

 I'm not so sure it's the first, since I ran the unit tests with those
 {{{include_leaf=}}} arguments set to {{{False}}}, and they failed.

 But the ciphertext hash tree feels pretty likely.

 The simple fix is to just discard the whole {{{self._received}}}
 structure after each segment is complete: in the ideal case (a
 correctly guessed segsize), it should be empty at that point anyway,
 and even if the segsize guess was wrong, the bandwidth hit for
 flushing the data is probably no more than a single block (and the
 round-trip hit is probably zero). Here's a patch to implement the
 simple fix:

 {{{
 diff --git a/src/allmydata/immutable/downloader/share.py b/src/allmydata/immutable/downloader/share.py
 index f7ed4e8..413f907 100644
 --- a/src/allmydata/immutable/downloader/share.py
 +++ b/src/allmydata/immutable/downloader/share.py
 @@ -531,6 +531,9 @@ class Share:
              for o in observers:
                  # goes to SegmentFetcher._block_request_activity
                  o.notify(state=COMPLETE, block=block)
 +            # now clear our received data, to dodge the #1170 spans.py
 +            # complexity bug
 +            self._received = DataSpans()
          except (BadHashError, NotEnoughHashesError), e:
              # rats, we have a corrupt block. Notify our clients that they
              # need to look elsewhere, and advise the server. Unlike
 }}}


 The complex fix is to consult the ciphertext hash tree when these node
 hashes arrive, and either add them to the hash tree or discard them
 (instead of the current practice of always storing them and then only
 removing them later if the hash tree still needs them). It might be
 easier to do this if {{{DataSpans}}} had a feature to label each byte
 (in some efficient way), so you could ask it for the ranges labelled
 "ciphertext hash tree nodes" and discard just those. (This might also
 help with pipelining segments, so you merge requests but still get the
 large blocks of data back in the right order, to minimize your
 buffering needs.)
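
 Here is a minimal standalone sketch of what such a labelling feature
 might look like. This is not the real {{{DataSpans}}} API:
 {{{LabeledDataSpans}}} and its method names are invented for
 illustration, and a real version would merge adjacent ranges the way
 {{{spans.py}}} does, but it shows the "discard everything with this
 label" operation the complex fix would want:

 {{{
 # Illustrative only: a label-aware spans container, NOT the real
 # allmydata.util.spans.DataSpans. Ranges are kept as a flat list of
 # (start, data, label) tuples, with no merging of adjacent ranges.

 class LabeledDataSpans(object):
     def __init__(self):
         self._ranges = []      # list of (start, data, label)

     def add(self, start, data, label):
         self._ranges.append((start, data, label))

     def get_labeled(self, label):
         # e.g. get_labeled("cthash") -> the ciphertext-hash-tree ranges
         return [(start, data) for (start, data, l) in self._ranges
                 if l == label]

     def discard_label(self, label):
         # drop every range carrying this label, e.g. once the first
         # share to arrive has satisfied the ciphertext hash tree
         self._ranges = [r for r in self._ranges if r[2] != label]

     def total_len(self):
         return sum(len(data) for (_, data, _) in self._ranges)
 }}}

 With something like this, the hash-node bytes could be tagged when the
 requests are issued, and a single {{{discard_label("cthash")}}} after
 each segment would drop the copies that were never consumed without
 touching block data that might still be needed.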

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1181>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

