[tahoe-dev] safety and Tahoe Lock Files

zooko zooko at zooko.com
Tue Mar 4 02:48:30 UTC 2008

following-up to my own post

On Mar 3, 2008, at 5:20 PM, zooko wrote:

> When you give someone a write-cap to a mutable file-or-directory, M1,
> which you yourself are also intending to write into in the future,
> you also give them a write-cap to a mutable Tahoe lockfile, L1.
> Thereafter, whenever you want to write to M1, you first read L1 to
> see if it is currently locked.  If L1 is empty (zero length), then M1
> is currently unlocked.
> To lock M1, you pick a random 32-byte string and write that string
> into L1.
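
As a rough illustration of the locking step quoted above, here is a
minimal Python sketch.  The InMemoryMutableFile class is a hypothetical
stand-in for a Tahoe mutable file, not the Tahoe API, and is_locked and
try_lock are names invented for this example.  It covers only the steps
quoted above: read L1, treat zero length as unlocked, and write a random
32-byte string to lock.

```python
import os


class InMemoryMutableFile:
    """Hypothetical stand-in for a Tahoe mutable file (not the Tahoe API)."""

    def __init__(self):
        self._contents = b""

    def read(self):
        return self._contents

    def write(self, data):
        self._contents = data


def is_locked(lockfile):
    # An empty (zero-length) lock file means the protected file is unlocked.
    return len(lockfile.read()) != 0


def try_lock(lockfile):
    """Write a random 32-byte lock string into the lock file if it is empty.

    Returns the lock string on success, or None if the file was already locked.
    """
    if is_locked(lockfile):
        return None
    lock_string = os.urandom(32)
    lockfile.write(lock_string)
    return lock_string
```

Here try_lock(L1) hands back the lock string when L1 was empty, and None
when some other writer's lock string was already there.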

A good question to ask about this proposal is: why incur all the
overhead of a separate lockfile L1?  Instead of attempting to write
your random lock string into L1, you could just attempt to write your
actual data into M1.

The answer is that there is a tiny but unavoidable chance that an  
uncoordinated write to a Tahoe mutable file or directory will destroy  
both the old and new contents of that file or directory, resulting in  
permanent data loss.  This chance is quite remote -- the only way it  
could happen is if there were an unfortunate coincidence of servers  
failing or getting disconnected from the network at the same time as  
the writers failed or got disconnected from the network, and even
then it would happen only if a couple of timing patterns fell out the
wrong way.  However, the more simultaneous uncoordinated writers
there are writing to a given mutable file or directory, and the more
frequently they write simultaneously, the more opportunities
there are for an unlucky pattern of sudden network outages and
crashes to cause permanent data loss.

We're strongly averse to even a small risk of permanent data loss, and
we would like to be able to say:

Data Safety Guarantee: No matter what pattern of network outages
occurs, no matter whether your clients crash in the middle of
performing writes, and no matter whether a limited subset of the
servers crash, have internal errors, or turn out to be subverted by
criminals, there is *still* zero possibility of permanent data
loss, as long as there are at least K well-behaving servers left
which have shares of one version of your file.
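
To make the condition at the end of that guarantee concrete, here is a
small Python sketch.  It assumes a simplified model that is not spelled
out in this post: each server holds at most one share, of exactly one
version, and a version counts as recoverable when at least K
well-behaving servers hold a share of it.  The function name and the
server and version labels are hypothetical.

```python
from collections import Counter


def recoverable_versions(shares_by_server, well_behaved, k):
    """Return the set of versions held by at least k well-behaving servers.

    shares_by_server maps a server name to the version whose share it holds.
    well_behaved is the set of servers that are reachable and honest.
    """
    counts = Counter(
        version
        for server, version in shares_by_server.items()
        if server in well_behaved
    )
    return {version for version, n in counts.items() if n >= k}


# A write of v2 was interrupted after reaching only two of five servers.
shares = {"s1": "v1", "s2": "v1", "s3": "v1", "s4": "v2", "s5": "v2"}

# With K = 3 and every server well-behaved, the old version survives.
all_ok = recoverable_versions(shares, {"s1", "s2", "s3", "s4", "s5"}, 3)

# If s3 then fails, neither version has K shares left.
after_failure = recoverable_versions(shares, {"s1", "s2", "s4", "s5"}, 3)
```

In this toy model all_ok contains only v1, and after_failure is empty:
an interrupted write followed by one server failure leaves neither the
old nor the new version with K shares, which is the kind of loss the
guarantee is meant to rule out.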

Note: to formulate this safety guarantee precisely, you have to think
about how certain patterns of network outages and server failures
could leave some servers holding the previous version of your file
while others hold the new version.  Analyzing this safety guarantee in
terms of which set of servers is well-behaved and is available to your
writer at what times is a difficult but feasible task.  No such
guarantee can be offered in the presence of an unbounded number of
simultaneous uncoordinated writes -- when the Tahoe storage servers
are under such a load, it is always possible to permanently lose data
due to an unlucky pattern of network disconnections.

Now, if we use the Tahoe Lock Files technique, then the lock file L1  
is exposed to this lack of safety -- an unlucky pattern of failures  
might cause both the current value of L1 and the new value that you  
are attempting to write to be lost.  However, this is no big deal!   
The only thing that is lost is the lock string.  The precious data  
over in M1 remains safe from the (small) danger caused by
uncoordinated writes.


