[tahoe-dev] mutable file and directory safety in allmydata.org "tahoe" 0.9

zooko zooko at zooko.com
Wed Mar 12 15:48:11 UTC 2008


Folks:

We need to release v0.9 before taking the time to either implement  
Tahoe Lock Files (and Brian doesn't like the idea), or Recovery From  
Colliding Writes (and I don't like the idea of relying on recovery as  
a means of handling multi-writer).

Therefore, what is left to do for v0.9 is:

1.  Fix the ?t=uri and ?t=set_children methods and the DELETE method  
to check the directory's version number, do delta application of the  
requested change, upload the new version of the directory, catch  
UncoordinatedWriteError, re-download, re-apply the requested change,  
upload, etc. until the upload succeeds.  (To understand more about  
this idiom, read on.)

2.  Document the limitations on safe usage of v0.9.
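
A minimal sketch of the retry idiom described in item 1, in case it helps to see the control flow. Everything here is a hypothetical stand-in, not Tahoe's actual API: `FakeDirectoryNode`, `modify_with_retry`, and the `max_tries` cap are assumptions for illustration (the text above says to retry until the upload succeeds), and the in-memory version check is atomic, which a grid of separate storage servers does not give you for free.

```python
class UncoordinatedWriteError(Exception):
    """Raised when an upload finds that the stored version changed since download."""


class FakeDirectoryNode:
    """Hypothetical in-memory stand-in for a mutable directory (not Tahoe's API)."""

    def __init__(self):
        self.version = 0
        self.children = {}

    def download(self):
        # Return the current version number and a private copy of the children.
        return self.version, dict(self.children)

    def upload(self, based_on_version, new_children):
        # Refuse the write if someone else has written since we downloaded.
        if based_on_version != self.version:
            raise UncoordinatedWriteError("directory changed since download")
        self.children = dict(new_children)
        self.version += 1


def modify_with_retry(node, apply_delta, max_tries=10):
    """Download, apply the requested change, upload; start over on a collision.

    max_tries is an assumption added so the sketch cannot loop forever.
    """
    for _ in range(max_tries):
        version, children = node.download()
        apply_delta(children)  # re-apply the change to the fresh copy
        try:
            node.upload(version, children)
            return children
        except UncoordinatedWriteError:
            continue  # re-download and try again
    raise UncoordinatedWriteError("gave up after %d tries" % max_tries)


# Usage: add one child to an empty directory.
node = FakeDirectoryNode()
modify_with_retry(node, lambda children: children.update({"a": "URI:a"}))
```

The point of passing the change as a function (`apply_delta`) rather than as a finished directory listing is that the change can be re-applied to whatever contents the re-download returns, instead of overwriting them with a stale copy.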


The limitations on safe usage of v0.9 are summarized thusly:

a.  Immutable files are really, really safe.  They are as safe as we  
could make them.  If at least three out of the ten servers that the  
client used when it uploaded the file are still in working order, then  
you can get the file back.

b.  Mutable files and directories are as safe as we could make them  
-- just like immutable files -- except for the following two caveats:

  i. Mutable files do not have the property of "write but don't  
overwrite".  Whenever you write a mutable file, you are overwriting  
whatever is currently in that file, and there is no way for you to  
know what was there.  Therefore, for example, if you download a  
mutable file, append some new data to the end of it, and write your  
new version, there is no way for you to know whether someone else  
uploaded a different change after you downloaded the old version and  
before you uploaded your new version.  If they did, then your upload  
of a new version will silently wipe out their change.

  ii. Mutable files have a ...
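
The lost update described in caveat i can be reproduced with a toy model. This is only a sketch: `MutableFile` is a hypothetical in-memory object whose single write operation replaces the whole contents, and "alice" and "bob" are made-up writers; none of it is Tahoe code.

```python
class MutableFile:
    """Hypothetical mutable file whose only write operation is a full overwrite."""

    def __init__(self, contents=b""):
        self.contents = contents

    def read(self):
        return self.contents

    def overwrite(self, new_contents):
        # No check of what is currently stored: the new contents simply win.
        self.contents = new_contents


f = MutableFile(b"base;")

# Two writers download the same old version.
alice_copy = f.read()
bob_copy = f.read()

# Bob appends and uploads first, then Alice uploads her append of the old version.
f.overwrite(bob_copy + b"bob;")
f.overwrite(alice_copy + b"alice;")

final = f.read()  # Bob's change is gone and nobody was told
```

Neither `overwrite` call fails, which is the whole problem: Alice has no way to learn that her upload wiped out Bob's.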


Argh.  Folks: I just went to implement "robust application of  
set_children", as per #1 above, and discovered *two* previously  
unknown ways that multiple uncoordinated writes to a directory can  
cause silent data loss.  I see a way to prevent one of these newly  
discovered problems by extending the mutable file write protocol --  
it would probably take a few days because Brian and I have previously  
discussed it and more or less know what to do.  I see no way to  
prevent the other newly discovered problem without a multiple-week  
development effort.


Okay, let's back up a step.  Basically what is going on here is that  
we designed the Tahoe mutable file scheme with the explicit  
requirement that the user never tries to do multiple writes at the  
same time.  Subsequently, allmydata.com started using it with  
multiple writes (there was a Documentation Failure), and in our  
testing nothing bad happened, and so we started saying "Oh, maybe  
writes are sufficiently robust that it is okay to use them like this."

However, this was a mistake.

We already know of ways that using Tahoe that way can silently lose  
data, depending on how many writers there are and how often they  
write, and on when the network connections, the clients, or the  
servers crash.

Worse, we don't know whether there are other, more likely, ways that  
using Tahoe this way can silently lose data.  Nobody has gone back  
over the whole design to understand what the implications of this  
usage are, and when I just tried to implement a simple extension to  
the wapi to allow a more convenient way to add children to a  
directory, I found *two* more.

However, there is an easy solution.  One sufficiently easy that we  
can implement it today and release v0.9 on schedule.  That solution  
is to leave the Tahoe API with the same requirement that it has  
always had since mutable files were first designed:

== The Prime Coordination Directive: "Don't Do That" ==

http://allmydata.org/trac/tahoe/browser/docs/mutable.txt?rev=2253#L48


and for users (i.e. allmydata.com) to make sure that they don't do  
that.  I will talk to Mike Booker to be sure, but I'm pretty sure  
that allmydata.com can easily enough avoid uncoordinated writes in  
the Allmydata 3.0 product, simply by having few or no shared- 
writeable directories, or by creating a simple centralized lock  
server when necessary.


Regards,

Zooko

P.S.  Someday someone might search history for instances of the term  
"LAUGFS", which stands for "Least AUthority Grid File System".   
Hello, there, searcher from the future!



