[tahoe-dev] Sharing Tahoe file system directories? Loops, lost objects, etc.

Jed Donnelley jed at nersc.gov
Tue Apr 15 21:08:55 UTC 2008


Brian and Tahoe-dev,

I hope you and they don't mind my sharing some higher level messages
regarding the Tahoe file system with my cap-talk colleagues.

On 4/14/2008 11:40 PM, Brian Warner wrote:
> On Sat, 12 Apr 2008 09:35:06 -0700
> Jed Donnelley <capability at webstart.com> wrote:
> 
>> As I mentioned in my first message shared to cap-talk and tahoe-dev,
>> I believe the above semantics are missing the distinction between
>> read-only and "deep read-only" and actually only providing
>> "deep read-only" - namely anything that is fetched through a
>> read-only directory is itself made read-only (file or directory).
>>
>> This "deep read-only" I think is the more important of the two,
>> and perhaps simple read-only isn't worth the complication?  With
>> simple read-only I could share a directory with you and you
>> could fetch anything out of it without diminution (e.g. fetch
>> a read-write capability to a file or a directory), but
>> you could not change the entries in the directory (e.g.
>> delete or add anything).
> 
> Welcome Jed!
> 
> Our "mutable file" objects have both read-only and read+write caps, in which
> the read-only cap is derived from the read+write one. In ocaps terms, the
> read+write cap is a reference to an object which gives you two abilities: the
> ability to change the contents of the file, and the ability to retrieve a
> read-only reference. The read-only reference has only one ability: reading
> the contents of the file.
> 
> (Immutable files, of course, have just the read-only cap).

As it must be I think.

> Then we have "directory node" objects that are based upon mutable files.
> These are really just a way of interpreting the contents of a mutable file.

So far you have exactly the semantics of the LINCS/NLTSS equivalent objects.

> By encrypting the write-caps of the child objects (whether they be files or
> subdirectories) the right way, we achieve transitive readonlyness, aka "deep
> read-only", which (as you've pointed out) is far more useful.

This is different.  I'll guess and you can correct me.  I guess that
you encrypt the "write-caps" with a key that is part of the write capability
to the directory node.  If that's the case then I also guess that, for
each object that has a writable version, you store two versions in the
directory: the writable version encrypted as above and the read-only
(deep read-only of course for directory nodes) version unencrypted.

Is that guess correct about your "trick"?
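
To make the guess above concrete, here is a minimal Python sketch.  It is
not Tahoe's actual format or code: the function names, the
"child-writecaps" tag, the entry layout, and the toy SHA-256 keystream are
all assumptions chosen for illustration, and a real design would use a
vetted cipher.

```python
import hashlib
import hmac
import os


def derive(key: bytes, tag: bytes) -> bytes:
    """One-way derivation: the output does not reveal `key`."""
    return hmac.new(key, tag, hashlib.sha256).digest()


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(
            hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        )
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def make_entry(dir_write_key: bytes, child_write_cap: str,
               child_read_cap: str) -> dict:
    """Hypothetical directory entry: plain read cap, encrypted write cap."""
    nonce = os.urandom(16)
    child_key = derive(dir_write_key, b"child-writecaps")
    return {
        "ro": child_read_cap,
        "nonce": nonce,
        "rw_enc": keystream_xor(child_key, nonce, child_write_cap.encode()),
    }


def open_entry(entry: dict, dir_write_key: bytes = None) -> str:
    """Without the directory write key only the read cap is reachable."""
    if dir_write_key is None:
        return entry["ro"]
    child_key = derive(dir_write_key, b"child-writecaps")
    return keystream_xor(child_key, entry["nonce"], entry["rw_enc"]).decode()
```

Under these assumptions a holder of only the directory read cap gets the
child read cap from every entry, which is what makes the read-only access
transitive.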

> So there isn't really a "shallow read-only" cap. If we didn't do the
> encrypted-child-write-caps trick, then the ability to read the underlying
> mutable file would be a shallow-read-only cap, but then we wouldn't have any
> way to provide deep-read-only caps. Choosing where to place and protect the
> child write caps forces us to pick between the two options, and we went with
> deep-read-only.

Hmmm.  I believe I can understand your design choice.  However, it doesn't
seem to me that the choice was either/or.  I accept that deep read-only is
by far the more important of the two read-only forms of access.  However, I
believe (perhaps part of my guess above) that if you wished you could add
shallow read-only access for directories.  I guess that you could make
a shallow read-only capability to a directory consist of essentially a
read-only capability to the underlying mutable file together with the
key to decrypt the writable versions of the capabilities in the file.
While I can understand that you may have constraints on the number of
keys that you can include in a capability, you could of course always
use the expedient of storing the second key (in this case the key
used to encrypt the writable capabilities in the directory) encrypted
by the first (in this case the key for the shallow read-only directory
capability) in the directory file.  The above may include too much
guessing about too detailed aspects of the implementation, so I'll pop
up a level.
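
The key-wrapping expedient described above could be sketched as follows.
This is only an illustration under stated assumptions, not anything Tahoe
implements: the derivation chain (write key -> shallow key -> deep key),
the tag strings, and the XOR-based wrapping are all hypothetical choices
made here so that a deep read-only holder, who can read the directory
file, still cannot unwrap the key.

```python
import hashlib
import hmac
import os


def derive(key: bytes, tag: bytes) -> bytes:
    """One-way derivation: the output does not reveal `key`."""
    return hmac.new(key, tag, hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_directory_keys(write_key: bytes) -> dict:
    """Hypothetical key layout for one directory."""
    shallow_key = derive(write_key, b"shallow-ro")   # shallow read-only cap
    deep_key = derive(shallow_key, b"deep-ro")       # deep read-only cap
    child_key = derive(write_key, b"child-writecaps")
    # Stored in the directory file itself, visible to every reader, but
    # only useful to someone who holds shallow_key (or the write key).
    wrapped_child_key = xor_bytes(child_key, derive(shallow_key, b"wrap"))
    return {
        "shallow_key": shallow_key,
        "deep_key": deep_key,
        "child_key": child_key,
        "wrapped_child_key": wrapped_child_key,
    }


def unwrap_child_key(cap_key: bytes, wrapped_child_key: bytes) -> bytes:
    """Recovers the child-writecap key only if cap_key is the shallow key."""
    return xor_bytes(wrapped_child_key, derive(cap_key, b"wrap"))
```

With this layout the capability itself still carries a single key; the
second key travels inside the directory file.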

Why would I want a shallow read-only directory capability?  One example
is to manage a project with other colleagues who I trust with write
access to some of the underlying objects.  I can manage the project by
choosing what to put into the shallow read-only directory (including
whether some of the pieces are writable, shallow read-only, or deep
read-only capabilities to directories) - nobody who I give it to can
modify it - but everybody who I give the shallow read-only capability
to can extract what's in it and write to those objects for which I
chose to share write access.

Of course one can achieve the functional effect of such a shallow read-only
directory by simply sharing a read-only file containing capabilities
to the objects to share with the group.  Unfortunately (I believe)
the interfaces (e.g. those that service the user interfaces) treat
directories differently than they do files.  This means that if I'm
forced to use a read-only file in this way to share capabilities, I
would be unable to use the other directory interfaces to access this
set of capabilities.  I suppose an alternative design philosophy to
achieve this value would be to design all the interfaces so they
can accept a file wherever they accept a directory?  Perhaps you have
some other means to achieve this value?

While I'm on the subject of directory semantics, I may as well exhaust
the access modes for directory services that I've provided in the past.
Perhaps others will wish to share their favorite sort of directory access
just to get them out on the table.

The other sort of access that I've found useful in the past is
"append-only".  This access allows one to insert a named capability
into the directory but not to otherwise modify the contents (e.g. not
to delete anything from the directory - rename, etc.).
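
As an illustration of that access mode, here is a minimal Python sketch of
an append-only facet enforced by the directory service rather than by
cryptography.  The class and method names are hypothetical, and Python
does not truly hide the wrapped directory, so this shows the interface
only, not a secure implementation.

```python
class Directory:
    """Hypothetical full-access directory: name -> capability string."""

    def __init__(self):
        self._entries = {}

    def has(self, name: str) -> bool:
        return name in self._entries

    def insert(self, name: str, cap: str) -> None:
        self._entries[name] = cap

    def delete(self, name: str) -> None:
        del self._entries[name]

    def lookup(self, name: str) -> str:
        return self._entries[name]


class AppendOnlyFacet:
    """Exposes insertion of new names only: no lookup, delete, or rename."""

    def __init__(self, directory: Directory):
        self._dir = directory

    def insert(self, name: str, cap: str) -> None:
        # Refuse to overwrite, since replacing an entry would modify
        # existing contents rather than append to them.
        if self._dir.has(name):
            raise KeyError(f"name already present: {name!r}")
        self._dir.insert(name, cap)
```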

In thinking about this a bit while considering the Tahoe file system,
I don't think there is any cryptographic trick that is likely to achieve
"append-only".  I think the main use we had for such directories
was to allow some such "append-only" directories to be widely accessible
so that others could "give" us (e.g. me) object access by inserting
capabilities to them into such an "append-only" directory.  Since Tahoe
file system capabilities are data capabilities that can be sent in
messages, I don't believe such an "append-only" directory access mode
is really needed.  In this day and age widely opening up even "append-only"
access to a directory seems unwise (e.g. denial of service attacks filling
the directory with names, using it for file storage, etc.).

> Incidentally, we're thinking about introducing another level in our next
> design: see http://allmydata.org/trac/tahoe/ticket/308 for details. In short,
> there are actually three levels of access for mutable files: write, read, and
> verify. We're talking about adding a "deep-verify" cap to dirnodes, which
> would allow the holder to verify the integrity of the directory, and all of
> its children, but would not allow them to see the plaintext of anything. This
> "deep-verify" cap would also allow the holder to total up the amount of space
> consumed by the directory and its children, which is obviously of use to a
> company which wants to charge for the storage being provided but which
> doesn't want to be able to see the files being placed there.

Hmmm.  I read the above details.  I have to admit that I don't understand
the requirement for the proposed traverse and verify forms of directory
access, "Writecaps beget readcaps.  Readcaps beget traversecaps. Traversecaps
beget verifycaps."  e.g. where you say, "If we had traversal caps, then
customers could give us the traversal cap instead of their read cap. We
could still see the shape of their filesystem (and probably the length
of their filenames, and the size of their files), but perhaps that would
be little enough exposure that customers would be comfortable with
revealing it."

However, one thing I will mention, in the hope that you are
thinking ahead: imagine a day (soon if I can get there from here)
when some Tahoe directories contain capabilities from different
introducers.  E.g. suppose I set up my "friendnet" with a separate
introducer (as I believe I'm forced to do) and then store some
capabilities from both the allmydata introducer and my friendnet
introducer into directories introduced by each.  Perhaps you
have such mixed directories now in testing?

How would/do traverse and verify capabilities to directories work
in the above context?  How would you imagine them being used?

You refer to the notion of a "verifier manifest (a set of verifier
caps for all files and directories reachable from some root)".
It seems to me you are touching on what I consider "accounting"
issues.  I believe such accounting issues can and should be kept
as 'back end' mechanisms.  It doesn't seem to me wise to make them
visible through the directory interface (e.g. traverse and
verify directory capabilities).  I believe Allmydata should be
able to find and account for all objects it is servicing for
a given account *without* having to traverse any directory
structure to find them.
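
A minimal sketch of what such a back-end mechanism might look like, in
Python.  Everything here is an assumption for illustration (the
StorageLedger class, the account and object identifiers, recording sizes
at upload time); it is not a description of how Allmydata accounts for
storage.

```python
class StorageLedger:
    """Hypothetical per-account ledger kept by the storage back end."""

    def __init__(self):
        # account -> {object_id: size_bytes}
        self._by_account = {}

    def record_upload(self, account: str, object_id: str,
                      size_bytes: int) -> None:
        self._by_account.setdefault(account, {})[object_id] = size_bytes

    def record_delete(self, account: str, object_id: str) -> None:
        self._by_account.get(account, {}).pop(object_id, None)

    def usage(self, account: str) -> int:
        """Total bytes stored for an account; no directory is consulted."""
        return sum(self._by_account.get(account, {}).values())
```

Because the ledger is filled in when objects are stored, an object that
is no longer reachable from any directory would still be counted.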

A related topic is that of directory loops, lost objects, etc.
Perhaps without bias I can ask you about that topic to see
where your considerations have taken you?  For example, while
"lost" objects can't be found in any directory tree (e.g. no
extant directory structure, perhaps the service for a needed
directory is currently off-line) they are still stored.  How are
they "accounted" for?

I should mention that if you'd prefer that I (we?) butt
out of your design considerations and limit interactions
to testing what you've implemented, I'll be happy to do so
without rancor.  I can well understand the time trade-offs
in considering additional, tangential feature/design input.
However, I'll state the obvious when I note that many of us
on cap-talk have experience designing and implementing systems
with similar enough feature/design sets that you may find
comparisons worthwhile, to a point.

--Jed  http://www.webstart.com/jed/



