[tahoe-dev] mutable file safety: current plan of attack

Brian Warner warner-tahoe at allmydata.com
Wed Mar 12 19:11:41 UTC 2008

We just got off a conference call with Zooko, Brian, Peter, and MikeB. Our
conclusions were:

 1. we're planning to ship tahoe-0.9.0 with one change: the
    MutableFileNode.update fix that we've discussed earlier, for which I
    proposed an API in my earlier message. This will remove the current
    serious directory-contents-loss problem.

 2. the FUSE plugin will do what it can to reduce the rate of directory
    writes (and proportionally reduce the rate of colliding writes, and
    therefore reduce the risk of data loss due to colliding writes). This
    means batching, including a fix to the current "ping-pong" batching
    problem in which copying two directories at the same time will bypass the
    batching algorithm.

 3. next week (after PyCon), we will implement "k+epsilon", which reduces the
    potential for rollback attacks (accidental and otherwise). This will
    enable us to safely switch to k=1 (#332), which removes the problem with
    data loss due to write collisions. We will probably implement #303
    (choose highest-available version) at the same time. This fix will not
    make it into 0.9.0 or the allmydata.com 3.0 final, but will make it into
    the following release (hopefully within a month or so).

      k=1 removes the possibility of data loss due to colliding writes, at
      the expense of a higher expansion factor and more susceptibility to
      rollback; k+epsilon reduces that susceptibility.

 4. after k+epsilon, we'll implement #272 recovery, to improve the
    consistency of files that have suffered from write collisions.
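
As a rough illustration of item 3 (a sketch, not Tahoe's actual code): one
plausible reading of "k+epsilon" is that the reader queries k+epsilon servers
rather than just k, then takes the highest sequence number for which it has
seen at least k distinct shares (the #303 rule). The function names, the
(seqnum, sharenum) tuples, and N=10 below are all assumptions made for the
example.

```python
from collections import defaultdict


def choose_version(responses, k):
    """Return the highest seqnum that has at least k distinct shares.

    responses: iterable of (seqnum, sharenum) pairs, one per share seen.
    Returns None if no version is recoverable from these responses.
    """
    shares_by_seqnum = defaultdict(set)
    for seqnum, sharenum in responses:
        shares_by_seqnum[seqnum].add(sharenum)
    recoverable = [s for s, shares in shares_by_seqnum.items()
                   if len(shares) >= k]
    return max(recoverable) if recoverable else None


def read_version(servers, k, epsilon):
    """Query the first k+epsilon servers and choose a version.

    servers: list of per-server share lists (hypothetical stand-in for
    real storage servers); each entry is a list of (seqnum, sharenum).
    """
    responses = []
    for server_shares in servers[:k + epsilon]:
        responses.extend(server_shares)
    return choose_version(responses, k)


def expansion_factor(k, n):
    """Storage used per byte of plaintext for k-of-n encoding."""
    return n / k


# One stale server (seqnum 4) and two up-to-date servers (seqnum 5).
servers = [[(4, 0)], [(5, 1)], [(5, 2)]]

stale = read_version(servers, k=1, epsilon=0)   # only asks the stale server
fresh = read_version(servers, k=1, epsilon=2)   # also sees the newer shares

# Assumed N=10: the cost of moving from k=3 to k=1.
cost_k3 = expansion_factor(3, 10)
cost_k1 = expansion_factor(1, 10)
```

With k=1 and epsilon=0 the reader accepts whatever the first server holds, so
a single stale server is enough to roll the file back; the extra epsilon
queries are what let it notice the newer version.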

Other solutions discussed but shelved include:

 * add a locking server, using a foolscap protocol, in which clients obtain a
   limited-time lock and use it for whatever they like. Add a Tahoe webapi to
   obtain and release locks (using the POST /lock API that Zooko proposed the
   other day, but passing a FURL in the arguments, and backing it with
   foolscap instead of a Tahoe mutable file). Each account would get a lock
   identifier, and the FUSE plugin would be required to hold the lock while
   performing any directory-modification operations. This would be
   implemented in Tahoe as a convenience to the Windows FUSE plugin which,
   because it is written in C++ instead of Python+Twisted, cannot use
   foolscap directly.

 * add a lock FURL to the arguments of all webapi directory modification
   calls, with semantics of "obtain this lock before doing any modification".

 * change the way we present root caps to the FUSE plugin and the web
   frontend to give each system a single writable directory (and read-only
   access to the others). This would remove the possibility of simultaneous
   writes, but would represent a significant feature/usability loss.
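
For reference, the "limited-time lock" in the first shelved option could look
roughly like the in-memory sketch below. It is only an illustration of the
lease semantics: it does not use foolscap, FURLs, or the webapi, and the
class and method names are hypothetical.

```python
import secrets
import time


class LeaseLockTable:
    """Per-identifier locks that expire on their own after a fixed time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be tested
        self._held = {}              # lock_id -> (token, expiry_time)

    def acquire(self, lock_id, duration):
        """Return a token if the lock is free or expired, else None."""
        now = self._clock()
        holder = self._held.get(lock_id)
        if holder is not None and holder[1] > now:
            return None              # someone else still holds a live lease
        token = secrets.token_hex(8)
        self._held[lock_id] = (token, now + duration)
        return token

    def release(self, lock_id, token):
        """Release the lock only if the caller holds the current lease."""
        holder = self._held.get(lock_id)
        if holder is not None and holder[0] == token:
            del self._held[lock_id]
            return True
        return False
```

A client would acquire the lock for its account's identifier, perform its
directory modifications, and release it. If the client crashes, the lease
simply expires, which is the point of making the lock limited-time.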

