[tahoe-dev] down with filesystems! up with the web! -- Re: [tahoe-lafs] #776: users are confused by "tahoe rm"

Brian Warner warner at lothar.com
Wed Jan 6 23:07:07 UTC 2010


James A. Donald wrote:

> Suppose objects are in the cloud.  Suppose there is a storage cost 
> against each object.  Objects can own other objects in which case the 
> storage cost is charged against the owner, and ownership relationships 
> form a tree, the root of the tree being an object that represents the user.
> 
> In that case the delete, copy and create relationships are necessarily 
> treelike, necessarily work the way manipulation commands in Microsoft 
> Explorer work, even though the browsable links may well form an 
> arbitrary graph, as shortcuts in the Microsoft explorer do.

Incidentally, the approach we've been planning to take with "ownership"
and storage costs is to put a list of leases on each object. Each lease
is held by a specific owner. Each lease simultaneously incurs a storage
cost and causes the object to be retained against garbage-collection. An
arbitrary number of "owners" can hold leases on the same object, and
each of them incurs a cost. The expectation is that each user will
periodically do a deep-traversal of their starting directories and add
or renew a lease on all reachable objects. If they happen to have a
purely tree-shaped graph, and don't share anything with anyone else, all
of "their" files will wind up with a single lease owned by the user.
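To make the lease idea concrete, here is a rough sketch of the model
described above. It is not Tahoe's actual code: the Node class and the
renew_leases, storage_cost and collectable helpers are names made up for
illustration, a lease is reduced to the owner's name, and lease expiry
is left out entirely.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False)  # compare by identity, so nodes can be kept in a set
class Node:
    """Hypothetical stored object: a file, or a directory with children."""
    name: str
    size: int = 0
    children: List["Node"] = field(default_factory=list)
    leases: Set[str] = field(default_factory=set)  # owners holding a lease


def renew_leases(root, owner):
    """Deep-traverse from root and add owner's lease to every reachable node."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:  # the graph need not be a tree, so skip repeats
            continue
        seen.add(node)
        node.leases.add(owner)
        stack.extend(node.children)


def storage_cost(nodes, owner):
    """Each lease incurs the full cost of the object it is held on."""
    return sum(n.size for n in nodes if owner in n.leases)


def collectable(nodes):
    """Objects with no lease at all are eligible for garbage collection."""
    return [n for n in nodes if not n.leases]
```

If two users both link the same directory under their own starting
directory and each runs renew_leases, every object under that directory
ends up with two leases, and both users are charged for it. A user whose
graph is a private tree ends up with exactly one lease per object.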

So if I share a directory with you, and you link it into your "rootcap"
starting directory, then we'll both wind up "owning" it. As David-Sarah
pointed out elsewhere, you could choose between a "weakref" that doesn't
update the leases (you don't get charged for the storage costs, but if I
choose to drop my own reference, those files may get GCed and then you
couldn't read them anymore) or a "strongref" that does (so you do get
charged, but you also have confidence that the files will stick around
regardless of what I do). We considered a scheme in which having N
owners means that each one gets charged 1/Nth the cost, but we decided
that it would be too surprising to have your costs jump up and down
based upon other people's actions, especially if that caused you to lose
data (due to a quota violation).
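The difference between the two charging schemes can be shown with a
little arithmetic. This is only an illustration: full_charge and
split_charge are hypothetical helper names, and the sizes and owner
names are invented.

```python
def full_charge(size, owners, who):
    """Planned scheme: every lease holder is charged the object's full size."""
    return size if who in owners else 0


def split_charge(size, owners, who):
    """Rejected scheme: N lease holders are each charged 1/Nth of the size."""
    return size / len(owners) if who in owners else 0


size = 900
before = {"brian", "you", "carol"}  # three owners hold leases
after = {"you"}                     # the other two have dropped theirs

# Under full_charge your cost is the same before and after.
# Under split_charge it triples without you having done anything.
costs = {
    "full": (full_charge(size, before, "you"), full_charge(size, after, "you")),
    "split": (split_charge(size, before, "you"), split_charge(size, after, "you")),
}
```

With the split scheme, a user sitting comfortably under a quota of, say,
500 would be pushed over it the moment the other two owners dropped
their leases. With the full-charge scheme the cost of holding a lease
depends only on the object and on your own decision to hold it.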

A scheme in which each object has only one (human) owner would make
sharing harder, in my mind: if I create a file, give it to you, and then
forget all about it, do I still own the file? And if the file gets
marked as owned-by-you when I first share it with you, but then you
forget all about it, do you still own the file? And who gets to change
the ownership field? Can I force a file upon you?

And a scheme in which objects are owned by other objects (rather than
users) would require chasing up-pointers to find out which human
ultimately owned any particular object, which would violate Tahoe's
least-authority goals (to wit: sharing a file should not also share any
containing directories).

These sorts of questions are harder to answer in the distributed Tahoe
world than in, say, the single-administrative-domain Unix/Windows world,
because files lack "up" pointers to any containing directories, and
because there is no central authority with control over user accounts.
In Unix, chown(8) requires super-user authority, but Tahoe has no
superuser. And enforcing a tree-shaped structure is easier when each
object maintains an accurate reference-count, which usually requires
up-pointers.

cheers,
 -Brian


