[tahoe-dev] [tahoe-lafs] #1340: consider share-at-a-time uploader

Thu Jan 27 19:09:11 UTC 2011

#1340: consider share-at-a-time uploader
---------------------------+------------------------------------------------
 Reporter:  warner         |           Owner:           
     Type:  enhancement    |          Status:  new      
 Priority:  major          |       Milestone:  undecided
Component:  code-encoding  |         Version:  1.8.1    
 Keywords:                 |   Launchpad Bug:           
---------------------------+------------------------------------------------
 As discussed in this thread:

  http://tahoe-lafs.org/pipermail/tahoe-dev/2011-January/005963.html

 (in response to discussion about UIs for #467 static-server-selection),
 Chris Palmer described Octavia's is-it-uploaded-yet UI. I liked it, and
 I'd like to explore how we might achieve something similar in Tahoe.
 This ticket is to make sure we don't forget the idea.

 The basic goal is to give each upload a sort of "red/yellow/green-light"
 status indicator, showing how healthy/robust the file is. When the
 upload first starts, it stays in the "red" state until there are enough
 shares pushed to recover the file if it were to be deleted from the
 original computer, then transitions into the "yellow" state. In that
 state, more shares are uploaded until we've achieved the desired
 diversity, at which point it goes "green".

 The key differences from our current uploader:

  * we set N higher than usual, maybe 25 or even 50
  * we don't upload all N shares. Instead, we define some lower threshold
    "Nup" (closer to our old N value) which indicates how many shares we
    want to upload. The idea is that we could create more shares if we
    wanted to, without changing the encoding parameters or the filecaps.
  * we store the whole file locally for longer, and do multiple passes
    over it, instead of the almost-streaming approach we have now
   * the first pass generates all shares, builds all the hash trees, but
     only uploads 'k' shares (discarding the rest)
   * subsequent passes re-use the cached hash trees, generate all shares
     again, but only upload some other subset of the shares, maybe 'k' at
     a time, or maybe more.
   * retain the whole file and the cached hash trees until we've uploaded
     Nup shares
  * if the network is partitioned before Nup is achieved, or the client
    is shut down, resume generating and uploading shares when it comes
    back
  * have a realtime status icon which shows the aggregate
    red/yellow/green state of all current uploads. This lets users know
    when it's safe to close their laptops.

 The general notion is to sacrifice some efficiency to reduce the time
 needed to get to a mostly-safe upload. The original downloader generates
 and uploads all shares in parallel, which eliminates wasted CPU cycles
 and almost doesn't need the disk (and #320 plus some new CHK format
 could make it fully streaming). But if the process is interrupted before
 the very last close() is sent to each share, the whole upload fails and
 must be started again from scratch. This new uploader would need more
 local disk storage (to handle multiple passes), and waste some amount of
 CPU (encoding shares that were then discarded, unless they too were
 stored on local disk, and we learned from Tahoe's predecessor that disks
 don't do matrix transpositions well), but would get the file to at least
 a recoverable state in about the same time a normal non-encoded FTP
 upload would have finished, and then gets better after that.

 (caching the hashtrees would save some of the hashing time on the second
 pass, which may or may not be a win, since hashing is generally pretty
 quick too)

 Some assumptions that must be tested before this scheme is at all
 realistic:

  * generating extra shares (and then discarding them without pushing) is
    cheap
   * generating N=25 or N=50 shares (and never uploading them, just
     keeping the encoding space in reserve for the future) is cheap
  * building the hash trees on each pass, or caching them, is cheap
  * we can tolerate the extra disk IO (multiple passes over each file)

 At present, {{{zfec}}} doesn't have an API to create fewer than N
 shares: you have to make all of them and then throw some away. It might
 be possible to enhance {{{zfec}}} to allow this (and of course do it
 faster than create-and-discard: ideally, the CPU time would be
 proportional to the number of shares we retain), but I haven't looked at
 the Reed-Solomon encoding scheme enough to tell. If so, then the second
 and subsequent passes could avoid the encoding waste (we'd still
 generate each share twice: once on the first pass to build the hash
 trees, and a second time on some subsequent pass when we actually push
 that share).

 Octavia uses pure replication (k=1), which removes the question of
 encoding overhead, so it's a bit easier to use this scheme in Octavia
 than in Tahoe.

 The big big win of this approach is the UI. The k-of-N encoding
 parameters don't matter quite so much when you can keep creating more
 shares until you're happy with their distribution, so there might be
 less need for users to choose k/N instead of sticking with the defaults.
 And the "red/yellow/green" status light is a dead-simple UI indicator,
 like the way Dropbox tells you when it's done moving data around.

 The next step is probably to do some zfec performance tests, to find out
 what setting N=25 (and then discarding the extra shares) would do to our
 upload speed.

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1340>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage