[tahoe-dev] Removing the dependency of immutable read caps on UEB computation

Brian Warner warner at lothar.com
Mon Oct 5 05:06:57 UTC 2009


Shawn Willden wrote:

> Hmm. I didn't realize that segment size was dependent on 'k'. I
> thought segments were fixed at 128 KiB? Or is that buckets? Or blocks?
> I'm still quite hazy on the precise meaning of bucket and block.

128KiB is the *maximum* segment size. The actual size is (I think):

 round_up_to_multiple_of(k, min(filesize, 128KiB))

The deal is that each segment will be effectively split into 'k' pieces
(plus N-k redundant blocks), so the segment needs to be a multiple of
'k' in size. Alacrity (how soon a download can hand back its first
validated bytes) gets worse as the segment size grows, so we put an
upper bound on it. And efficiency goes down as the number of segments
rises, so we use just one segment if we can.
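
As a rough sketch of that formula (my reading of the expression above,
not code lifted from Tahoe; the helper names segment_size and
num_segments are made up for illustration):

```python
KiB = 1024
MAX_SEGMENT_SIZE = 128 * KiB  # the upper bound mentioned above


def round_up_to_multiple_of(k, size):
    """Smallest multiple of k that is >= size."""
    return ((size + k - 1) // k) * k


def segment_size(filesize, k, max_segment_size=MAX_SEGMENT_SIZE):
    """Segment size per the formula above (hypothetical helper)."""
    return round_up_to_multiple_of(k, min(filesize, max_segment_size))


def num_segments(filesize, k):
    """How many segments a file of `filesize` bytes needs (assumes filesize > 0)."""
    seg = segment_size(filesize, k)
    return -(-filesize // seg)  # ceiling division


if __name__ == "__main__":
    for filesize in (1000, 10**6):
        print(filesize, segment_size(filesize, 3), num_segments(filesize, 3))
```

A file smaller than the bound ends up as a single segment, padded up to
a multiple of 'k'; larger files get several segments of the capped size.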

When we talk about segments, we're talking about segments of plaintext
and/or ciphertext. The segment is what goes into FEC. Each output of FEC
is called a block: one segment yields N blocks, one for each share, so
there are NUMSEGS blocks in each share (and N shares per file).
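
To make the bookkeeping concrete, here is a sketch that only tracks
which block lands in which share. It does no erasure coding at all, and
share_layout is a hypothetical name, not something from the Tahoe
source:

```python
def share_layout(numsegs, n):
    """Map each share number to the blocks it holds.

    Each block is labeled (segnum, sharenum): the output of FEC for
    segment `segnum` that is destined for share `sharenum`.
    """
    return {
        sharenum: [(segnum, sharenum) for segnum in range(numsegs)]
        for sharenum in range(n)
    }


if __name__ == "__main__":
    layout = share_layout(numsegs=4, n=10)
    print(len(layout), "shares,", len(layout[0]), "blocks per share")
```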

Storage servers keep track of buckets, each one labeled with a storage
index. Each bucket has one or more shares (all for the same file). The
client asks a storage server for access to a bucket by naming a
storage-index, and the server responds with a list of all the shareids
that it has for that SI (i.e. all the shares in that bucket).

At least, that's the way we've been using the terms in Tahoe (not always
consistently, I'm afraid).
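
A toy model of the bucket lookup described above, purely to pin down the
terms. The class and method names are invented for this sketch and are
not the storage server's real interface:

```python
from collections import defaultdict


class StorageServerSketch:
    """Toy model: one bucket per storage index, holding shares by shareid."""

    def __init__(self):
        # storage index -> {shareid: share data}
        self._buckets = defaultdict(dict)

    def put_share(self, storage_index, shareid, data):
        self._buckets[storage_index][shareid] = data

    def list_shares(self, storage_index):
        """Return the shareids held for this SI (empty list if none)."""
        return sorted(self._buckets.get(storage_index, {}))


if __name__ == "__main__":
    server = StorageServerSketch()
    server.put_share(b"si-1", 3, b"share three")
    server.put_share(b"si-1", 0, b"share zero")
    print(server.list_shares(b"si-1"))
```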

>> Hm, it sounds like some of the use case might be addressed by making
>> it easier to run additional code in the tahoe node (i.e. a tahoe
>> plugin),

> On the "plugin" point, I'm thinking that I want to implement my backup
> server as a Tahoe plugin. I'm not sure it makes sense to implement it
> as a part of Tahoe, because Tahoe is a more general-purpose system.
> From a practical perspective, though, my backup server is (or will be)
> a Twisted application, it should live right next to a Tahoe node, and
> it should start up whenever the Tahoe node starts and stop whenever
> the Tahoe node stops. Seems like a good case for a plugin.

So, the plugin idea I had was to have tahoe.cfg name the plugins that
you want to load (as well as any plugin-specific configuration to use),
then import the code with one of the various plugin frameworks that
we've got floating around (twisted.python.plugin,
setuptools+entrypoints, whatever mercurial uses, whatever trac uses). I
designed the tahoe node as a hierarchy of twisted.application.service
instances in anticipation of this: the 'Client' service is the one that
gives you an API to upload/download files. The plugin would then be a
Service instance that gets attached as a service-child of that Client.
This would basically give it start+stop hooks, and then it could e.g.
upload files with:

 d = self.parent.upload(UPLOADABLE)

We'd need to think of some convenient ways to let plugins do more than
that, probably by adding some sort of hooks into the webapi dispatcher
so the plugin could have a status page and some CLI controls.
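
A dependency-free sketch of the service-child arrangement described
above. The Service and MultiService classes here are small stand-ins
that imitate the parent/child start+stop behaviour of
twisted.application.service.Service and MultiService; they are not
Twisted's code. Client.upload is faked, and BackupPlugin is a
hypothetical plugin name:

```python
class Service:
    """Stand-in for a Twisted-style service with start/stop hooks."""

    def __init__(self):
        self.parent = None
        self.running = False

    def setServiceParent(self, parent):
        parent.addService(self)

    def startService(self):
        self.running = True

    def stopService(self):
        self.running = False


class MultiService(Service):
    """Stand-in for a service that starts and stops its children."""

    def __init__(self):
        super().__init__()
        self.services = []

    def addService(self, child):
        child.parent = self
        self.services.append(child)
        if self.running:
            child.startService()

    def startService(self):
        super().startService()
        for child in self.services:
            child.startService()

    def stopService(self):
        for child in reversed(self.services):
            child.stopService()
        super().stopService()


class Client(MultiService):
    """Fake 'Client' service: records uploads instead of performing them."""

    def __init__(self):
        super().__init__()
        self.uploaded = []

    def upload(self, uploadable):
        self.uploaded.append(uploadable)
        return "fake-result"  # the real call would hand back a Deferred


class BackupPlugin(Service):
    """Hypothetical plugin attached as a service-child of the Client."""

    def backup(self, uploadable):
        if not self.running:
            raise RuntimeError("plugin is not running")
        return self.parent.upload(uploadable)


if __name__ == "__main__":
    client = Client()
    plugin = BackupPlugin()
    plugin.setServiceParent(client)
    client.startService()      # starts the plugin too
    plugin.backup(b"some data")
    client.stopService()       # stops the plugin too
```

The point is just that attaching the plugin as a child gives it the
node's lifecycle for free, and self.parent is its handle on the upload
API.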

cheers,
 -Brian


