[tahoe-dev] BackupDB proposal, and core functionality improvements for v1.4.0

zooko zooko at zooko.com
Wed Sep 24 13:49:07 UTC 2008


Hello Brian.  I just got around to really reading your BackupDB
proposal, since Mike Booker has implemented a similar backupdb for the
allmydata.com Windows client (in C#).

I like your design quite well, and I have a couple of suggestions.
Below I comment only on the parts that I would change -- everything
else in your design I like as is.  :-)

At the end of this message I mention some unrelated "core improvements
in Tahoe" that I hope you will work on in the future.


On May 28, 2008, at 16:42, Brian Warner wrote:

 > The upload process is:
 >
 >  1. record the file's size and modification time
 >  2. upload the file into the grid, obtaining an immutable file
 > read-cap
 >  3. add an entry to the 'caps' table, with the read-cap, and the
 > current time
 >  4. extract the read-key from the read-cap, add an entry to
 > 'keys_to_files'
 >  5. add an entry to 'last_upload'

Hm.  There is a failure which happens rarely but which has ugly
consequences.  (You mentioned this in your design document.)  It could
happen that the file has an mtime of X, our backup process does step 1
and the part of step 2 that reads the file contents, and then new
contents are written to that file quickly enough that its mtime is
still X.  (If the new contents are put into place with a relink
instead of by overwriting the file, then the relevant part of step 2
could be simply opening the file rather than reading it.)

The ugly consequence is that the new file contents never get backed
up, even if the user (possibly because they suspect that this might
have just happened) specifically tries to force the file to get backed
up.

I guess a way to prevent this is to add this step:

Step 1.5: if the mtime that we just read in step 1 is equal to the
current time, within the granularity of mtimes on the underlying
filesystem, then do not back the file up yet; instead wait until its
mtime is old enough that we could tell whether it had changed, then go
back to step 1.

What do you think?  This would impose no cost in the common case (that
the mtime is old), and even if the mtime is current, on most of our
supported platforms the granularity of mtime is sufficiently small
that this would impose only a momentary delay.
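A sketch of the guard I have in mind, assuming we know (or
conservatively over-estimate) the filesystem's mtime granularity; the
two-second constant below is just a worst-case guess on my part:

    import os
    import time

    # Assumption: a conservative worst-case mtime granularity (FAT is 2s,
    # ext3 is ~1s; NTFS and ext4 are much finer, so the wait is usually tiny).
    MTIME_GRANULARITY = 2.0

    def wait_until_mtime_is_stable(path):
        # Step 1.5: if the file's mtime is so recent that a later write could
        # leave the mtime unchanged, wait until that can no longer happen.
        while True:
            mtime = os.stat(path).st_mtime
            age = time.time() - mtime
            if age >= MTIME_GRANULARITY:
                return mtime    # old enough: any later change would bump the mtime
            time.sleep(MTIME_GRANULARITY - age)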


 > If the path is present but the mtime differs, the file may have
 > changed. [...]  The client will compute the CHK read-key for the
 > file by hashing its contents [...] then checks the 'keys_to_files'
 > table to see if this file has been uploaded before

Well, it may or may not be a win to hash the current file content in
order to check whether that content has already been uploaded, but
"the mtime differs" is not a good heuristic for when to try that
technique.

Consider several policies for "when do we hash the current file
content in order to check whether that file content is already
recorded as uploaded in the db":

1.  Never.  The backupdb relies solely on mtimes and sizes to decide
whether a file has already been backed up.  If someone mv's a file
from a to b, this will evade the backupdb's notice and the file will
get re-backed-up.  If someone uses touch(1) on a file that is already
backed up, it will get re-backed-up.  If someone downloads a file onto
their local computer whose contents happen to be the same as a file
they previously backed up, the new file will get re-backed-up.

2.  Whenever there is a file content already backed up with the same
size and mtime.  When you ask the backupdb whether file /a/b/c is
already backed up, then (if the backupdb doesn't already have a note
that the path /a/b/c with the same size and mtime is backed up) it
checks for any file of the same size and mtime in its db.  If there is
one, it hashes /a/b/c to see whether the current contents of /a/b/c
match the hash of that other already-backed-up file.  This means that
if you back up a file, then mv it to a new location, and later back it
up again from the new location, the backupdb will recognize it as
previously backed up.  However, if you back it up, delete it, and
later re-download it to a new location, its mtime will have changed,
so it will evade this heuristic.

3.  Whenever there is a file content already backed up with the same
size.  Like #2, but also checking whether files with different mtimes
still have the same content.

4.  (This is what you proposed above, if I understand correctly.)
Whenever it goes to back up a file which was previously backed up but
which has changed since then.  This policy fails to notice when you mv
or download the file to a new path (it doesn't notice that the same
file contents were already backed up), and it often wastes time
hashing when you've changed the contents of a file on disk to a new
version which has never before been backed up.

So I would favor another policy over #4, and in fact for the time
being I would favor policy #1 for simplicity -- the backupdb doesn't
have to know how to hash files at all for the first design.

In the future, we might like to have a more aggressive policy like #2
or #3.  Note that with such a policy we would unnecessarily hash a
file with a given (path, size, mtime) tuple at most once -- after that
first time we would remember its hash in the db.
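To illustrate policy #2 (including the hash-caching just mentioned),
here is the sort of lookup I have in mind, against the same
hypothetical tables as in the earlier sketch.  The content hash below
is just a stand-in for the real CHK read-key computation, and I'm
pretending the readkey column stores exactly that hash:

    import hashlib
    import os

    def content_hash(path):
        # Stand-in for computing the CHK read-key: hash the file's contents.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(65536), b""):
                h.update(block)
        return h.hexdigest()

    def already_backed_up(db, path):
        # Policy #2: hash the file only if some already-backed-up file has the
        # same size and mtime, and remember the result so that a given
        # (path, size, mtime) tuple is hashed at most once.
        st = os.stat(path)
        size, mtime = st.st_size, st.st_mtime
        # Fast path: this exact path/size/mtime is already recorded.
        if db.execute("SELECT 1 FROM last_upload WHERE path=? AND size=? AND mtime=?",
                      (path, size, mtime)).fetchone():
            return True
        # Only bother hashing if *some* backed-up file has this size and mtime.
        # (Policy #3 would drop the mtime test here.)
        if not db.execute("SELECT 1 FROM last_upload WHERE size=? AND mtime=?",
                          (size, mtime)).fetchone():
            return False
        key = content_hash(path)
        hit = db.execute("""SELECT lu.readcap FROM keys_to_files k
                            JOIN last_upload lu ON lu.path = k.path
                            WHERE k.readkey = ?""", (key,)).fetchone()
        if hit:
            # Same contents already uploaded: record this path too, so future
            # checks take the fast path and we never hash this tuple again.
            db.execute("INSERT OR REPLACE INTO last_upload VALUES (?, ?, ?, ?)",
                       (path, size, mtime, hit[0]))
            db.commit()
            return True
        return False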


 > If the read-key *is* found in the 'keys_to_files' table, then the
 > file has been uploaded before, but we should consider performing a
 > file check / verify operation to make sure we can skip a new upload.

Let's separate checking/verification/repair from the backupdb's job.
Assume that there is some service that you can rely on to verify that
a file that you earlier backed up is fully present and fully
redundantly encoded.  Now the backupdb can be a lot dumber.

(Such a service is, of course, what I am working on right now and what
is the blocker for the Tahoe 1.3.0 release.)

Thanks!

Nice design!

Now if only someone would implement it, in a sufficiently portable
language such as C/C++/Python...  :-)


By the way, I really hope that you, Brian, allocate time in the future
for some improvements to core Tahoe functionality which you are
uniquely well prepared to work on.  Backupdb would be a nice
improvement, and you would do an excellent job of designing and
implementing it, but there are a lot of other people who could also
design and implement a backupdb.  There are relatively few other
people who could contribute to these core pieces of Tahoe:

Things that I'm slightly embarrassed that Tahoe doesn't already do:

  * #483 repairer service (this one is my job for v1.3.0)
  * #119 lease expiration / deletion / garbage-collection
  * #320 add streaming upload to HTTP interface
  * #346 increase share-size field to 8 bytes, remove 12GiB filesize
     limit

Things that dramatically increase performance (in one case) and make
capabilities actually usable:

  * #217 DSA-based mutable files -- small URLs, fast file creation

Things that open up the way to standardization and re-implementation
of Tahoe and fit it into more deployment scenarios:

  * #510 use plain HTTP for storage server protocol?

Let's make a plan for when to make progress on some of these.  Do you
think we can do one of these in v1.4.0 before the end of the year?


Regards,

Zooko

tickets mentioned in this e-mail:

http://allmydata.org/trac/tahoe/ticket/483 # repairer service
http://allmydata.org/trac/tahoe/ticket/119 # lease expiration / deletion / garbage-collection
http://allmydata.org/trac/tahoe/ticket/320 # add streaming upload to HTTP interface
http://allmydata.org/trac/tahoe/ticket/346 # increase share-size field to 8 bytes, remove 12GiB filesize limit
http://allmydata.org/trac/tahoe/ticket/217 # DSA-based mutable files -- small URLs, fast file creation
http://allmydata.org/trac/tahoe/ticket/510 # use plain HTTP for storage server protocol?

---
http://allmydata.org -- Tahoe, the Least-Authority Filesystem
http://allmydata.com -- back up all your files for $5/month



