[tahoe-dev] [tahoe-lafs] #633: lease-expiring share crawler

tahoe-lafs trac at allmydata.org
Wed Feb 18 20:19:56 UTC 2009

#633: lease-expiring share crawler
 Reporter:  warner        |           Owner:  warner
     Type:  task          |          Status:  new   
 Priority:  major         |       Milestone:  1.4.0 
Component:  code-storage  |         Version:  1.3.0 
 Keywords:                |   Launchpad_bug:        
Changes (by zooko):

 * cc: tahoe-dev@… (added)


 Hm...  You know, maintaining a (semi-)sorted list can be cheap.  What
 about this, for example:

 We have a separate directory called "expiries".  In it, there is a
 hierarchy of directories.  The first layer is a set of directories named by
 the expiry time, as a unix timestamp at megasecond granularity.  Inside each
 "megasecond" directory there is a set of directories named by the
 kilosecond, and inside each "kilosecond" directory there is a set of
 directories named by the second.  Inside each "second" directory is a set
 of 0-length files whose names are the storage indices of all the shares
 which are due to expire that second.
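
 A minimal sketch of that layout in Python.  The function name
 expiry_dir is hypothetical, and naming each level by the full truncated
 timestamp (rather than by the remainder within its parent) is an
 assumption; the description above does not pin that down.

```python
import os

def expiry_dir(root, expiry_time):
    """Return the "second" directory for an expiry time in unix seconds.

    Assumption: each level is named by the timestamp truncated to that
    granularity, e.g. 1234988396 -> root/1234/1234988/1234988396.
    """
    t = int(expiry_time)
    megasecond = t // 1_000_000
    kilosecond = t // 1_000
    return os.path.join(root, str(megasecond), str(kilosecond), str(t))
```

 A share with storage index SI expiring at time T would then be
 represented by a 0-length file at os.path.join(expiry_dir(root, T), SI).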

 Whenever you update the lease on a share, you add that share's SI into the
 new expiry directory, remove its SI from the old expiry directory, and
 update the expiry stamp stored with the share itself.  That's it.  You
 could also remove the SI from the expiry directory whenever you remove a
 lease on a share.
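
 The update step could look roughly like this.  All names here are
 hypothetical, and write_canonical stands in for whatever actually stores
 the expiry stamp with the share.  The sketch also assumes one expiry time
 per share and the truncated-timestamp directory naming.

```python
import os

def expiry_dir(root, expiry_time):
    # Assumed layout: root/<megasecond>/<kilosecond>/<second>
    t = int(expiry_time)
    return os.path.join(root, str(t // 1_000_000), str(t // 1_000), str(t))

def add_marker(root, si, expiry_time):
    """Create the 0-length file named by the storage index."""
    d = expiry_dir(root, expiry_time)
    os.makedirs(d, exist_ok=True)
    with open(os.path.join(d, si), "w"):
        pass

def remove_marker(root, si, expiry_time):
    """Remove the marker; a missing marker is harmless for a cache."""
    try:
        os.remove(os.path.join(expiry_dir(root, expiry_time), si))
    except FileNotFoundError:
        pass

def renew_lease(root, si, old_expiry, new_expiry, write_canonical):
    """Add the new marker, drop the old one, then update the share."""
    add_marker(root, si, new_expiry)
    # Skip removal when both times map to the same marker file.
    if old_expiry is not None and int(old_expiry) != int(new_expiry):
        remove_marker(root, si, old_expiry)
    write_canonical(si, new_expiry)
```

 Adding the new marker before removing the old one means a crash in
 between leaves a stale extra marker rather than no marker at all.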

 Now whenever you want to find shares whose leases have expired, you need
 only look at the appropriate megasecond, kilosecond, and second, thus
 saving twelve hours of grovelling through all the shares looking for
 expired leases.
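
 A sketch of the lookup, with hypothetical names.  It assumes the
 truncated-timestamp directory naming, and it visits every bucket at or
 before "now" rather than a single second, on the assumption that a
 periodic pass would need to catch everything that expired since the last
 one.  Buckets in the future are pruned without being opened.

```python
import os

def _numeric_children(path):
    """Return sorted (number, full_path) pairs for all-digit subdirectories."""
    try:
        names = os.listdir(path)
    except FileNotFoundError:
        return []
    found = [(int(n), os.path.join(path, n))
             for n in names if n.isascii() and n.isdigit()]
    return sorted(p for p in found if os.path.isdir(p[1]))

def expired_candidates(root, now):
    """List (expiry_second, storage_index) pairs due at or before now."""
    now = int(now)
    results = []
    for ms, ms_dir in _numeric_children(root):
        if ms > now // 1_000_000:
            continue  # whole megasecond is in the future
        for ks, ks_dir in _numeric_children(ms_dir):
            if ks > now // 1_000:
                continue  # whole kilosecond is in the future
            for s, s_dir in _numeric_children(ks_dir):
                if s > now:
                    continue  # not due yet
                for si in sorted(os.listdir(s_dir)):
                    results.append((s, si))
    return results
```

 The results are only candidates; the timestamp stored with each share
 still has to be checked before anything is deleted.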

 Note that the failure modes introduced by this scheme are "soft" because
 the expiries directory can be thought of as merely a "cache" of the
 canonical expiry timestamps which are stored with each share.  Corruption
 of the expiries directory never results in premature deletion of a share,
 since you always check the canonical timestamp from the share itself
 before deletion.  Corruption of the expiries directory *can* result in
 failure to delete an expired share, but this is usually less costly than
 the other kind of failure, and it can always be corrected by performing
 one of those 12-hour-grovels to fix or regenerate the expiries directory.
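
 Both halves of that argument can be sketched as follows.  The names are
 hypothetical, and the canonical_expiry dict is a stand-in for reading the
 timestamp stored with each share, which in practice is the slow grovel.
 The directory naming is the same assumed truncated-timestamp layout.

```python
import os
import shutil

def shares_safe_to_delete(candidates, canonical_expiry, now):
    """Keep only candidates whose canonical timestamp has really passed."""
    confirmed = []
    for si in candidates:
        expiry = canonical_expiry.get(si)
        # A bogus or stale index entry never causes a deletion by itself.
        if expiry is not None and expiry <= now:
            confirmed.append(si)
    return confirmed

def rebuild_expiries(root, canonical_expiry):
    """Discard the expiries directory and regenerate it from the shares."""
    shutil.rmtree(root, ignore_errors=True)
    for si, expiry in canonical_expiry.items():
        t = int(expiry)
        d = os.path.join(root, str(t // 1_000_000), str(t // 1_000), str(t))
        os.makedirs(d, exist_ok=True)
        with open(os.path.join(d, si), "w"):
            pass  # 0-length marker named by the storage index
```

 Since the rebuild reads only the canonical timestamps, it can be run at
 any time without risking a share.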

Ticket URL: <http://allmydata.org/trac/tahoe/ticket/633#comment:1>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid
