[tahoe-dev] [Twisted-Python] announcing allmydata.org "Tahoe" v0.9

zooko zooko at zooko.com
Mon Mar 17 03:18:20 UTC 2008

Adding Cc: the tahoe-dev list, which is probably the most logical  
place for follow-ups.

On Mar 16, 2008, at 1:00 PM, John Wells wrote:

> On Thu, Mar 13, 2008 at 3:52 PM, zooko <zooko at zooko.com> wrote:
>> ANNOUNCING: Allmydata.org "Tahoe" version 0.9
>>  We are pleased to announce the release of version 0.9 of  
>> allmydata.org
>>  "Tahoe".
> Very impressive looking project.

Thank you!  We're doing our best to make it possible for other people  
to use allmydata.org "Tahoe", re-use its source code, or at least  
learn from our mistakes.  Please feel free to post to tahoe-dev or  
open a ticket at http://allmydata.org if you try to do one of these  
things and fail.

> I'm curious...it appears that the focus is security and reliability,
> but how well does it perform?

What sort of performance are you interested in?  There are several  
measures of performance (storage efficiency, transfer speed, network  
efficiency, conserving CPU cycles, memory usage, etc.) and many use  
cases.  Tahoe performs very well at a few things and terribly at many  
things.  Below, I'll assume that the kind of performance you were  
interested in is a pleasurable experience downloading movies, since  
that is one of the things that Tahoe is best at.

There are some basic automated performance measurements on The  
Performance Page [1], linked from The Dev Page [2] of the wiki.

Those measurements say that if you are downloading a file over a home  
DSL connection, it might take one quarter of a second to begin a  
download, followed by 500 KB/s sustained transfer speed.
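
As a back-of-the-envelope illustration of those two figures, here is a
minimal sketch. The function name and the 700 MB example file are
illustrative assumptions, not measurements from The Performance Page,
and 1 KB is taken to mean 1000 bytes.

```python
def estimated_download_seconds(size_bytes, startup_s=0.25,
                               rate_bytes_per_s=500_000):
    """Rough estimate: fixed startup delay plus size / sustained rate.

    The defaults are the figures quoted above (0.25 s to begin,
    500 KB/s sustained); real transfers will vary.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return startup_s + size_bytes / rate_bytes_per_s


# Hypothetical 700 MB movie file.
print(estimated_download_seconds(700_000_000))  # 1400.25
```

So under these assumptions a 700 MB file would take a little over 23
minutes, and the startup delay is negligible for anything but very
small files.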

Tahoe performance actually compares favorably with BitTorrent for  
this use.  Our file encoding allows streaming download, so you can  
click to begin downloading a movie, and then you can go ahead and  
start watching the movie while it is still downloading.
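
To make the streaming point concrete, here is a rough sketch of how
long you would have to wait before pressing play so that playback
never catches up with the download. It assumes a constant transfer
rate and a constant movie bitrate, which real networks and real video
files do not have; the function name and the example sizes are
hypothetical.

```python
def seconds_before_playback(size_bytes, duration_s, startup_s=0.25,
                            rate_bytes_per_s=500_000):
    """Minimum wait so the download finishes no later than playback.

    Assumes constant rate and constant bitrate (a simplification).
    If the file downloads faster than it plays, only the startup
    delay remains.
    """
    transfer_s = size_bytes / rate_bytes_per_s
    return startup_s + max(0.0, transfer_s - duration_s)


# Hypothetical 90-minute movies of two different sizes.
print(seconds_before_playback(700_000_000, 90 * 60))    # 0.25
print(seconds_before_playback(4_700_000_000, 90 * 60))  # 4000.25
```

In the first case the movie's bitrate (about 130 KB/s) is below the
500 KB/s transfer rate, so you can start watching almost immediately.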

Also, Tahoe can transfer data faster than BitTorrent does, because it  
assumes that all clients are deserving of the best possible service  
-- it doesn't use throttling as a way to incentivize cooperation.   
That's good for performance, but by the same token it means you can't  
expect cooperation from arbitrary Tahoe nodes.  If you want this kind  
of service from the storage servers, you have to persuade them to  
serve you, either because you are a friend in their friendnet, or  
because you are a customer.  (In the future, other, more general kinds  
of service relationship will be supported -- we have a detailed plan  
for that, which you are welcome to ask about on tahoe-dev.)

As far as I know, Tahoe has not been scaled up to more than a couple  
of dozen storage servers or clients or more than a few hundred GB of  
storage.  This is going to be changing rapidly in the near future as  
allmydata.com is moving our customers' data onto a Tahoe grid.



[1] http://allmydata.org/trac/tahoe/wiki/Performance
[2] http://allmydata.org/trac/tahoe/wiki/Dev

P.S.  In the future, some people might refer to the allmydata.org  
Tahoe secure, decentralized filesystem design as "LAUGFS", which  
stands for "Least AUthority Grid FileSystem".
