[tahoe-dev] [tahoe-lafs] #928: start downloading as soon as you know where to get K shares

tahoe-lafs trac at allmydata.org
Wed Jan 27 06:07:03 UTC 2010


#928: start downloading as soon as you know where to get K shares
-----------------------------------------------+----------------------------
 Reporter:  zooko                              |           Owner:  zooko
     Type:  defect                             |          Status:  new  
 Priority:  major                              |       Milestone:  1.6.0
Component:  code-peerselection                 |         Version:  1.5.0
 Keywords:  download availability performance  |   Launchpad_bug:       
-----------------------------------------------+----------------------------
 The current code ([4186]) performs immutable file download in four phases.
 In phase 1 it sends a {{{get_buckets()}}} remote invocation message to
 every storage server that it knows about
 ([source:src/allmydata/immutable/download.py@4164#L871
 CiphertextDownloader._get_all_shareholders()]).  In phase 2 it requests
 the CEB (a.k.a. UEB) from each server in turn until it gets a valid CEB
 ([source:src/allmydata/immutable/download.py@4164#L956
 CiphertextDownloader._obtain_uri_extension()]).  In phase 3 it requests
 the crypttext hash tree from each server in turn until it gets a valid
 hash tree ([source:src/allmydata/immutable/download.py@4164#L1002
 CiphertextDownloader._get_crypttext_hash_tree()]).  In phase 4 it actually
 downloads the blocks of data from servers in parallel.  Out of the shares
 which it learned about in phase 1, which shares does it choose? It chooses
 "primary shares" first (their sharenum is < K) because those can be
 erasure-decoded at no computational cost, then it chooses randomly from
 among secondary shares until it has K.
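
 The share-selection rule described above can be sketched roughly as
 follows.  This is an illustration only, not the code in download.py; the
 function name {{{choose_shares}}} and its arguments are hypothetical.

```python
import random


def choose_shares(found_sharenums, k, rng=random):
    """Pick k share numbers, preferring primary shares (sharenum < k).

    Hypothetical helper: found_sharenums is any iterable of share numbers
    learned about in phase 1; rng is anything with a sample() method.
    """
    found = sorted(set(found_sharenums))
    if len(found) < k:
        raise ValueError("need %d shares, only found %d" % (k, len(found)))
    # Primary shares are taken first, in ascending order.
    primary = [s for s in found if s < k]
    secondary = [s for s in found if s >= k]
    # Fill the remainder with a random choice among secondary shares.
    needed = k - len(primary)
    return primary + rng.sample(secondary, needed)
```

 With K=3 and shares 0, 1, 2, 5 and 7 available, this picks exactly the
 three primary shares; with only share 0 among the primaries it would keep
 share 0 and draw the other two at random from the secondary shares.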

 Now currently phase 1 does not end, and therefore phase 2 does not begin,
 until all servers have answered the {{{get_buckets()}}}!  To close this
 ticket, make it so phase 1 ends as soon as at least {{{K}}} buckets have
 been found.

 The nice non-invasive way to do this is to replace the DeferredList in
 [source:src/allmydata/immutable/download.py@4164#L871
 CiphertextDownloader._get_all_shareholders()] with an object that acts
 like a DeferredList but fires once at least {{{K}}} shares (or possibly
 once at least {{{K}}} servers) have been found.
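
 To make the idea concrete, here is a minimal sketch of "fire once at least
 {{{K}}} shares have been found".  It uses the standard-library asyncio in
 place of Twisted Deferreds so that it runs on its own; it is not a patch
 for download.py.  The name {{{gather_until_k_shares}}} and the shape of
 its {{{queries}}} argument are assumptions made for the example, and
 cancelling the unanswered queries is a simplification (a real downloader
 might prefer to keep listening for late answers).

```python
import asyncio


async def gather_until_k_shares(queries, k):
    """Return as soon as k distinct shares are known.

    queries: dict mapping a server id to an awaitable that yields the set
    of share numbers held by that server (a stand-in for get_buckets()).
    Returns a dict mapping sharenum -> set of server ids.  If every server
    answers before k shares are seen, returns whatever was found.
    """
    tasks = {asyncio.ensure_future(q): server for server, q in queries.items()}
    found = {}
    pending = set(tasks)
    try:
        while pending and len(found) < k:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is not None:
                    continue  # a failed server contributes no shares
                for sharenum in task.result():
                    found.setdefault(sharenum, set()).add(tasks[task])
    finally:
        # Simplification: stop waiting on servers that have not answered.
        for task in pending:
            task.cancel()
    return found
```

 The caller no longer waits for the slowest server: with K=3, two quick
 servers holding shares 0, 1 and 2 are enough for the call to return, even
 if a third server takes much longer to answer.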

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/928>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

