[tahoe-dev] [tahoe-lafs] #1169: documentation for the new download status page

tahoe-lafs trac at tahoe-lafs.org
Thu Aug 12 20:10:51 UTC 2010


#1169: documentation for the new download status page
-------------------------------+--------------------------------------------
     Reporter:  zooko          |       Owner:  somebody              
         Type:  defect         |      Status:  new                   
     Priority:  major          |   Milestone:  soon                  
    Component:  documentation  |     Version:  1.8β                  
   Resolution:                 |    Keywords:  immutable download wui
Launchpad Bug:                 |  
-------------------------------+--------------------------------------------

Comment (by warner):

 I'm slowly (i.e. post-1.8.0, maybe in a 1.8.1) changing the data on this
 page
 and adding visualizations, so I don't want to put too much energy into
 documenting a transient/unstable data structure quite yet. But here's a
 quick
 summary of what's on the new downloader status page.

 First, what's involved in a download?

  * downloads are triggered by {{{read()}}} calls, each with a starting
 offset
    (defaults to 0) and a length (defaults to the whole file). A regular
    webapi GET request will result in a whole-file {{{read()}}} call
  * each {{{read()}}} call turns into an ordered sequence of
    {{{get_segment()}}} calls. A whole-file read will fetch all segments,
 in
    order, but partial reads or multiple simultaneous reads will result in
    random-access of segments. Segment reads always return ciphertext: the
    layer above that (in {{{read()}}}) is responsible for decryption.
  * before we can satisfy any segment reads, we need to find some shares.
    ("DYHB" is an abbreviation for "Do You Have Block", and is the message
 we
    send to storage servers to ask them if they have any shares for us. The
    name is historical, from Mnet/MV, but nicely distinctive. Tahoe's
 actual
    message name is {{{remote_get_buckets()}}}.) Responses come back
    eventually, or don't.
  * Once we get enough positive DYHB responses, we have enough shares to
 start
    downloading. We send "block requests" for various pieces of the share.
    Responses come back eventually, or don't.
  * When we get enough block-request responses for a given segment, we can
    decode the data and satisfy the segment read.
  * When the segment read completes, some or all of the segment data is
 used
    to satisfy the {{{read()}}} call (if the read call started or ended in
 the
     middle of a segment, we'll only use part of the data; otherwise we'll
 use
    all of it).
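
 The segment-selection logic in the last two steps can be sketched as
 follows (a simplified, synchronous illustration; these names are
 hypothetical, not Tahoe's actual API, and the real downloader is
 asynchronous and also decrypts the ciphertext that {{{get_segment()}}}
 returns):

```python
# Hypothetical sketch: satisfy a read() by fetching only the segments
# that cover [offset, offset+length). Decryption is omitted here, so
# fetch_segment() is assumed to return plaintext for simplicity.

def read(offset, length, segment_size, fetch_segment):
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    out = bytearray()
    for segnum in range(first, last + 1):
        seg_start = segnum * segment_size
        data = fetch_segment(segnum)
        # trim if the read starts or ends in the middle of this segment
        lo = max(offset, seg_start) - seg_start
        hi = min(offset + length, seg_start + len(data)) - seg_start
        out += data[lo:hi]
    return bytes(out)
```

 A whole-file GET maps to read(0, filesize, ...), which walks the
 segments in order; a partial read only touches the segments it
 overlaps, which is where the random-access pattern comes from.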

 With that background, here is the data currently on the download-status
 page:

 * "DYHB Requests": this shows every Do-You-Have-Block query sent to
 storage
   servers and their results. Each line shows the following:
  * the serverid to which the request was sent
  * the time at which the request was sent. Note that all timestamps are
    relative to the start of the first {{{read()}}} call and indicated with
 a
    "{{{+}}}" sign
  * the time at which the response was received (if ever)
  * the share numbers that the server has, if any
  * the elapsed time taken by the request
  * also, each line is colored according to the serverid. This color is
 also
    used in the "Requests" section below.
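
 As an illustration, one row of this table could be modeled like so
 (hypothetical names, not the actual status-gathering code; the
 relative-timestamp formatting matches the "{{{+}}}" convention noted
 above):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class DYHBRequest:
    # Hypothetical record mirroring the columns described above.
    serverid: str                     # server the query was sent to
    sent: float                       # absolute time the query went out
    received: Optional[float] = None  # absolute response time, or None
    shnums: Tuple[int, ...] = ()      # share numbers the server reported

    def elapsed(self):
        """Round-trip time of the request, or None if no response yet."""
        return None if self.received is None else self.received - self.sent

def fmt_rel(t, start):
    """Render a timestamp relative to the first read() call, '+' prefixed."""
    return "+%.6fs" % (t - start)
```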

 * "Read Events": this shows all the !FileNode {{{read()}}} calls and their
   overall results. Each line shows:
  * the range of the file that was requested (as {{{[OFFSET:+LENGTH]}}}). A
    whole-file GET will start at 0 and read the entire file.
  * the time at which the {{{read()}}} was made
  * the time at which the request finished, either because the last byte of
    data was returned to the {{{read()}}} caller, or because they cancelled
    the read by calling {{{stopProducing}}} (i.e. closing the HTTP
 connection)
  * the number of bytes returned to the caller so far
  * the time spent on the read, so far
  * the total time spent in AES decryption
  * total time spent paused by the client ({{{pauseProducing}}}), generally
    because the HTTP connection's buffer filled up; most streaming media
    players pause delivery to limit how much data they have to buffer
  * effective speed of the {{{read()}}}, not including paused time
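
 The effective-speed figure is presumably just bytes delivered over
 active (non-paused) time; a sketch of that arithmetic (not the actual
 status code):

```python
def effective_read_speed(bytes_returned, elapsed, paused):
    """Bytes/second for a read(), excluding time the consumer spent
    paused via pauseProducing()."""
    active = elapsed - paused
    if active <= 0:
        return None  # no active time yet, nothing meaningful to report
    return bytes_returned / active
```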

 * "Segment Events": this shows each {{{get_segment()}}} call and its
   resolution. This table is not well organized, and my post-1.8.0 work
 will
   clean it up a lot. In its present form, it records "request" and
 "delivery"
   events separately, indicated by the "type" column.
  * Each request shows the segment number being requested and the time at
    which the {{{get_segment()}}} call was made
  * Each delivery shows:
   * segment number
   * range of file data (as {{{[OFFSET:+SIZE]}}}) delivered
   * elapsed time spent doing ZFEC decoding
   * overall elapsed time fetching the segment
   * effective speed of the segment fetch
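
 A "delivery" row could be modeled like this (again hypothetical names;
 the {{{[OFFSET:+SIZE]}}} label and the speed figure follow the column
 descriptions above):

```python
from dataclasses import dataclass

@dataclass
class SegmentDelivery:
    # Hypothetical record for a "delivery" row in the Segment Events table.
    segnum: int         # which segment was delivered
    offset: int         # file offset of the delivered data
    size: int           # bytes delivered
    decode_time: float  # seconds spent in ZFEC decoding
    elapsed: float      # total seconds from request to delivery

    def range_label(self):
        """Render the delivered range in the page's [OFFSET:+SIZE] style."""
        return "[%d:+%d]" % (self.offset, self.size)

    def fetch_speed(self):
        """Effective bytes/sec for the whole fetch, decode time included."""
        return self.size / self.elapsed if self.elapsed > 0 else None
```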

 * "Requests": this shows every block-request sent to the storage servers.
   Each line shows:
  * the server to which the request was sent
  * which share number it is referencing
  * the portion of the share data being requested (as {{{[OFFSET:+SIZE]}}})
  * the time the request was sent
  * the time the response was received (if ever)
  * the amount of data that was received (which might be less than SIZE if
 we
    tried to read off the end of the share)
  * the elapsed time for the request (RTT=Round-Trip-Time)
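
 The short-read case in the second-to-last item amounts to the server
 clamping the request to the end of the share; a hypothetical helper
 showing that arithmetic:

```python
def bytes_received(req_offset, req_size, share_size):
    """Data actually returned by a block-request: a read that extends
    past the end of the share is truncated by the server."""
    if req_offset >= share_size:
        return 0
    return min(req_size, share_size - req_offset)
```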

 Also note that each Request line is colored according to the serverid it
 was
 sent to. And all timestamps are shown relative to the start of the first
 read() call: for example, the first DYHB message was sent at
 {{{+0.001393s}}}, about 1.4 milliseconds after the read() call started
 everything off.

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1169#comment:1>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

