[tahoe-dev] Accounting, 2010 edition

Brian Warner warner at lothar.com
Mon Dec 20 22:16:37 UTC 2010


On 12/20/10 9:38 AM, Shawn Willden wrote:
> On Mon, Dec 20, 2010 at 10:35 AM, Ravi Pinjala <ravi at p-static.net>
> wrote:
> 
>     One thing I can't figure out: how do you handle losing your caps
>     for a share? In that case, it'd still count against your storage
>     limits, but you wouldn't be able to delete it.
> 
> 
> If lease expiration is turned on, your shares will go away if you
> aren't renewing the leases regularly. Without the caps, you can't
> renew the leases.

Yeah, the accounting data is tied to the leases: every account that has
a lease on share X is "charged" for that share. When the last lease
from Bob expires, Bob is no longer charged. When the last lease from
anyone expires, the share is deleted.
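
A minimal sketch of that charging rule, assuming one lease per account
per share and an in-memory table. The class and method names are
hypothetical, not taken from the Tahoe-LAFS storage server:

```python
from collections import defaultdict


class LeaseTable:
    """Hypothetical model: a share is charged to every account holding a lease."""

    def __init__(self):
        # share_id -> {account: expiration_time}
        self.leases = defaultdict(dict)
        # share_id -> size in bytes
        self.share_sizes = {}

    def add_or_renew(self, share_id, account, size, now, duration):
        # Adding and renewing are the same operation here: push the
        # expiration time for this account's lease into the future.
        self.share_sizes[share_id] = size
        self.leases[share_id][account] = now + duration

    def usage(self, account):
        # An account is charged for every share it still holds a lease on.
        return sum(
            self.share_sizes[share_id]
            for share_id, holders in self.leases.items()
            if account in holders
        )

    def expire(self, now):
        # Drop expired leases; delete any share left with no leases at all.
        deleted = []
        for share_id in list(self.leases):
            holders = self.leases[share_id]
            for account in [a for a, exp in holders.items() if exp <= now]:
                del holders[account]
            if not holders:
                del self.leases[share_id]
                del self.share_sizes[share_id]
                deleted.append(share_id)
        return deleted
```

If Alice and Bob both lease share X and only Alice's lease lapses, Alice
stops being charged but the share stays on disk until Bob's lease lapses
too.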

You bring up a good point, though, which we've thought about a little
bit, but haven't answered thoroughly: keeping the client's view in sync
with the server's. There's a multi-way tradeoff between traffic
overhead, storage usage, and data safety. I think limited-time leases
and renewals are the safest approach overall: your data is safe as long
as you renew it frequently enough, the storage-server's space is
eventually freed as long as the expiration time is set low enough, and
the traffic used to do the renewals is low as long as you renew
infrequently enough.
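
The three-way tradeoff can be made concrete with a rough sketch. The
function and its parameter names are hypothetical, and the numbers in
the usage note are illustrative only, not Tahoe-LAFS defaults:

```python
def renewal_tradeoff(lease_days, renew_every_days, num_shares):
    """Rough model of the safety / space / traffic tradeoff (hypothetical)."""
    return {
        # Data safety: how long renewals can be late before a lease lapses.
        # A non-positive value means leases expire between renewals.
        "slack_days": lease_days - renew_every_days,
        # Server space: an abandoned share lingers at most one lease period
        # after its final renewal.
        "max_days_to_free_space": lease_days,
        # Traffic: average number of lease renewals sent per day.
        "renewals_per_day": num_shares / renew_every_days,
    }
```

For example, renewal_tradeoff(31, 7, 1000) gives 24 days of slack and
about 143 renewals per day. Shortening the lease frees space sooner but
shrinks the slack, and renewing less often cuts traffic but does the
same.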

The servers currently provide an instant-expire API, via the
cancel_lease() method, but it's not easy to use in practice (knowing
when you are really done with a share is hard, since it might be
referenced by multiple directories, and distributed reference counting
is doubly hard).

The two additional APIs we've thought about are batch-renew and
batch-cancel-everything-else methods. The batch-renew is what you'd use
after you've done a deep-traversal of your directory tree (i.e. "tahoe
manifest"), and then you (or somebody you've paid to renew your shares
while you're on vacation) renew all of them in a single call. We could
use Bloom Filters here to reduce the amount of data transmitted: a
filter's false positives would add leases to a few extra files, which
won't hurt too much.
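
A rough sketch of how a Bloom-filter batch-renew could work. No such
API exists yet, so BloomFilter, batch_renew, and the filter parameters
are all hypothetical; the point is only that the client ships a compact
filter instead of the whole manifest, and the server renews every share
that matches it:

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over byte strings (illustrative parameters)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(b"%d:" % i + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True for every added item; occasionally true for others too.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def batch_renew(server_shares, renew_filter):
    # Server side (hypothetical): renew every held share matching the filter.
    return [share for share in server_shares if share in renew_filter]
```

A Bloom filter never reports a false negative, so every share in the
manifest gets renewed. A false positive only renews a share the client
did not ask about, which costs some space until that lease expires.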

The other API would be used in a similar way, right after you build a
manifest, but it would mean "immediately cancel all my leases on shares
that weren't in the manifest". This would be safe only if you could
lock your directory tree against changes, build a manifest, issue the
batch-cancel, then finally unlock the tree. Otherwise you might be
telling the server to cancel something that you actually care about (but
started caring about too late for it to be included in the manifest).
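
The race can be shown with plain sets. This is a toy model with
hypothetical names, not a proposed server API:

```python
def cancel_everything_else(leased_shares, manifest):
    """Return (kept, cancelled) for a hypothetical batch-cancel call."""
    manifest = set(manifest)
    kept = leased_shares & manifest
    cancelled = leased_shares - manifest
    return kept, cancelled


# The client builds a manifest from an unlocked directory tree.
manifest = {"dir", "file1"}

# The server's view of this account's leases at that moment.
leased = {"dir", "file1", "old"}

# A new file is uploaded after the manifest was built.
leased.add("file2")

kept, cancelled = cancel_everything_else(leased, manifest)
# cancelled holds "old" (intended) and "file2" (still wanted).
```

Holding a lock on the tree from the start of the manifest until the
cancel completes would keep "file2" from appearing in between.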

So yeah, the batch-renew seems useful for a curator or paid-maintainer
service of some sort, but the cancel-everything-else feels fraught with
problems. So I've put off implementing either.

Hopefully timed-lease-expiration and periodic-renewal will satisfy all
three parties (users who care about their data, server-admins who care
about their free disk space, and network-admins who care about the extra
traffic of lease renewals) for a while.

cheers,
 -Brian


