[tahoe-dev] Perf-related architecture question

Zooko O'Whielacronx zooko at zooko.com
Wed Jul 21 16:20:43 UTC 2010


Kyle:


On Wed, Jul 21, 2010 at 12:01 AM, Kyle Markley <kyle at arbyte.us> wrote:
> I am running a helper,

Why?

A helper is useful only when the helper has substantially wider pipes
to the storage servers than the storage client has.
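To make "substantially wider pipes" concrete, here is a back-of-the-envelope
sketch. The bandwidth numbers are made up for illustration; the expansion
factor uses the default 3-of-10 encoding, and the point is only that the
client sends roughly 1x the file to the helper instead of roughly N/K times
the file directly to the storage servers:

    # Back-of-the-envelope comparison of uploading with and without an
    # erasure-coding helper.  The bandwidth figures are invented for
    # illustration; the expansion factor comes from the default 3-of-10
    # encoding.

    K, N = 3, 10                  # shares.needed, shares.total (defaults)
    expansion = N / K             # ~3.3x more bytes on the wire than the file

    file_size_MB = 100
    client_uplink_MBps = 0.1      # hypothetical slow home uplink
    helper_uplink_MBps = 10.0     # hypothetical well-connected helper

    # Without a helper: the client erasure-codes locally and pushes N/K
    # times the file size over its own uplink.
    direct_seconds = file_size_MB * expansion / client_uplink_MBps

    # With a helper: the client pushes roughly 1x the file size (ciphertext)
    # to the helper, and the helper pushes the expanded shares to the servers.
    helper_seconds = (file_size_MB / client_uplink_MBps
                      + file_size_MB * expansion / helper_uplink_MBps)

    print(f"direct upload: ~{direct_seconds:.0f} s")
    print(f"via helper:    ~{helper_seconds:.0f} s")
    # If the client's uplink were as wide as the helper's, the helper would
    # only add work -- which is why it pays off only when the helper has
    # much wider pipes to the storage servers.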

(Also the helper is only used for immutable file upload, not for
immutable file download, mutable file upload, or mutable file
download.)

We should probably start calling it something like an "erasure-coding
helper" to emphasize this.

Disclosure: I never liked the erasure-coding helper. I wanted to
improve the existing upload and repair-and-rebalancing instead of
implementing a second kind of upload. I still sort of feel this
way—having two ways to do something instead of one way doubles the
engineering costs of improving the thing. For example, a lot of people
really want different share-placement strategies (see
wiki:ServerSelection [1]). If someone were to implement that (and I
would be very grateful if they did!), they would have to think about
what to do in the case that the user is using an upload helper. I
would probably encourage them to just not support that case and let
the new share placement strategy apply only to storage clients which
are doing the normal upload process.

There, I just wrote a short essay on this subject on ticket #283.

> I'd like to understand what's slowing things down here -- it
> looks like things ought to be able to run about 10x this speed.  Are there
> a lot of serialized network round-trip messages in the upload protocol, or
> something?

There are indeed a few round-trips that could be optimized out. Brian
is working on that with regard to the downloader, and in fact he should
have some patches ready for review any day now!

However, that probably doesn't account for more than a small fraction
of the delay that you're seeing.

> I'm also curious about how the helper distributes shares to the storage
> nodes.  In my configuration of 4 storage nodes, 3 are wired at 100Mbps and
> 1 is wireless.  It looks like when the helper is distributing shares, this
> happens at roughly the same pace to all nodes, despite some nodes having
> faster connections than others.  I would have expected the wired nodes to
> finish receiving their shares significantly sooner than the wireless node.

Aha! The uploader (whether it is the storage client in your own node
or the storage client inside the erasure-coding helper) uploads to
each of the storage servers in parallel, advancing segment by segment
in lockstep. Its rate will therefore be about the rate of the slowest
storage server times K.

To let the fastest storage servers run ahead and finish uploading
would either require storing all of the file data in RAM or revisiting
the file (as stored on disk) more than once.
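A toy calculation of that effect (the per-server rates below are invented;
only the K-times-the-slowest-server relationship comes from the explanation
above):

    # Toy model: each segment is cut into blocks and one block goes to each
    # storage server in parallel; the next segment cannot start until the
    # slowest server has accepted its block, so the effective throughput on
    # file data is about k * (slowest server's rate).
    k = 3
    server_rates_MBps = [12.0, 12.0, 12.0, 1.5]  # hypothetical: 3 wired, 1 wireless

    effective_MBps = k * min(server_rates_MBps)
    print(f"effective upload rate ~= {effective_MBps} MB/s")  # 4.5 MB/s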

Note that there is a pipeline which will hold up to 50 KB of outgoing
data per storage server in memory. In practice, if you have
max-seg-size=128 KiB and k=3 (the defaults), then you pipeline up to two
outgoing blocks per storage server. You could experiment with making
that pipeline deeper and see whether that changes the behavior:

http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/layout.py?rev=4308#L114
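For reference, here is how the 50 KB figure relates to block size under
those defaults (the 50,000-byte pipeline limit is the one at the layout.py
link above; the rest is plain arithmetic, and the "about two blocks"
reading is approximate):

    max_seg_size = 128 * 1024      # bytes per segment (default)
    k = 3                          # shares needed (default)
    pipeline_limit = 50_000        # bytes buffered in memory per server

    block_size = max_seg_size // k # ~43,690 bytes per outgoing block
    print(f"block size          ~= {block_size} bytes")
    print(f"blocks per pipeline ~= {pipeline_limit / block_size:.1f}")
    # Roughly one full block fits in the buffer with room to begin queuing
    # the next, i.e. about two blocks can be outstanding toward each server
    # at once.  Deepening the pipeline would mean raising that 50,000-byte
    # limit at the line linked above and rebuilding, then measuring whether
    # more blocks in flight changes the observed speed.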

Regards,

Zooko


