[tahoe-dev] Backup plans, part 1

Shawn Willden shawn-tahoe at willden.org
Mon Jan 26 16:42:42 UTC 2009


Prototyping uncovered the need for a few changes:

On Thursday 22 January 2009 09:44:52 pm Shawn Willden wrote:
> The same general mechanisms used to handle versioning of individual 
> files will be used to handle backupdb versioning.

Since both the previous and current versions of the backupdb are available 
during backup, I won't use the rdiff delta method.  Also, since I'm 
processing both of them anyway, it's most convenient to use an ad-hoc, sort 
of patch-like delta format that expresses additions, changes and removals of 
whole records.
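
Roughly, I expect the delta computation to look something like this (just a 
sketch; the function and operation names are illustrative, not settled):

    def compute_delta(old_db, new_db):
        # old_db and new_db are dicts mapping pathname -> backupdb record.
        # Each delta entry is a whole-record operation: "add", "change",
        # or "remove".
        delta = []
        for path in sorted(set(old_db) | set(new_db)):
            if path not in old_db:
                delta.append(("add", path, new_db[path]))
            elif path not in new_db:
                delta.append(("remove", path))
            elif old_db[path] != new_db[path]:
                delta.append(("change", path, new_db[path]))
        return delta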

The change records will naturally end up sorted by pathname, which helps in 
another way -- each delta will be efficiently searchable, even when large.  
This means that rather than downloading a full backupdb revision plus a 
series of deltas, then applying the deltas to update the backupdb to the 
desired version (HEAD, usually), then searching for a file, it will be 
possible to search each full revision or delta separately, in sequence, to 
find the latest information about a particular file.  This avoids the need to 
actually apply the changes and works particularly nicely if random access to 
grid-based files is implemented.
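
For lookups, that means something like this (a sketch only, reusing the 
delta entries from above):

    def find_latest(path, full_revision, deltas):
        # Scan the deltas newest-first; the first mention of the path wins.
        # A real implementation would binary-search each sorted delta (or
        # use random access into grid-based files); a linear scan is shown
        # here for brevity.
        for delta in reversed(deltas):
            for op in delta:
                if op[1] == path:
                    if op[0] == "remove":
                        return None              # deleted in this delta
                    return op[2]                 # record from "add"/"change"
        return full_revision.get(path)           # fall back to full revision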

> The backupdb will be in JSON format and contain the following for each
> file in the backed-up set:
>
> o  Complete pathname
> o  File hash (base 64, 16 octets)
> o  Platform-dependent metadata, not including filename/path
> o  Write cap of the backup file revlog (see below)
> o  Write cap of the backup file index (see below)
> o  Librsync signature

It turns out that this creates files that are simply too big to work with 
conveniently for backups of large file systems (and I've only got a little 
over 2TB!).  In particular, it's nice to be able to mmap() the backupdb, and 
that's not generally possible with large files -- 32-bit platforms have a 
hard limit at 2 GiB, and few platforms handle mmap'ing of large files well.

To make the backupdb file smaller, I'm going to separate librsync signatures 
into other files.  Without those, backupdb records are < 100 bytes each (on 
Linux, at least -- we'll see what happens when we throw ACLs and resource 
forks into the metadata).

The signatures will be placed in separate files, up to 64**n of them, 
selected based on the first n characters of the base-64 file hash.  n will 
default to 1, but can be increased if needed (no idea on the heuristics of 
that, but I suspect we won't care for a long time).  The signature files will 
contain:

o  File hash (base 64, 16 octets)
o  Librsync signature (base 64, variable length)
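
Picking the signature file for a given hash is then just a prefix lookup; 
roughly (the file-naming scheme here is illustrative only):

    def signature_file_for(file_hash_b64, n=1):
        # The first n base-64 characters of the file hash select the
        # bucket, for up to 64**n signature files.  A real implementation
        # would need a filesystem-safe encoding, since standard base-64
        # can contain '/' and '+'.
        return "sigs-" + file_hash_b64[:n]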

Signature files will be versioned with the same record-based deltas.

A related issue that cropped up is that the backup code ends up hashing each 
file twice:  Once with the librsync rolling hash, and then again with SHA-256 
to generate the file hash.  To reduce this duplication of effort, the "file 
hash" will actually be the SHA-256 hash of the librsync signature.

> The most common use of the backupdb is to generate a new backupdb (and
> incidentally back up some files).  To make that efficient, I'll take care
> to ensure that although the entries don't have to be in any particular
> order, all entries with a common prefix (aka common parent directory)
> are contiguous.

I missed something obvious here:  prefix-contiguity alone still leaves 
finding particular entries difficult, and it makes streaming comparison hard 
to implement (and potentially requires an arbitrary amount of memory).  Sorting 
the entries provides prefix-contiguity and allows efficient lookups and merge 
comparisons.  Also, it's very cheap to do:  Just do an in-order traversal of 
the filesystem tree.
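
Something like this (what matters for the merge is just that the old and new 
scans use the same deterministic order):

    import os

    def walk_sorted(root):
        # In-order traversal: entries in each directory are visited in
        # sorted order, and a directory's files come before its
        # subdirectories' contents.  That gives prefix-contiguity and a
        # deterministic order suitable for streaming merge comparison.
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()    # sort in place so os.walk recurses in order
            for name in sorted(filenames):
                yield os.path.join(dirpath, name)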

	Shawn.


