[tahoe-dev] proposal: add padding

Mon Jul 15 23:49:37 UTC 2013

The way this would work in ZFS, say, is that file data (and metadata)
are written in fixed-size blocks.  So if you were looking at an
encrypted ZFS dataset you'd see every object as having a size that is
a multiple of that object's recordsize (block size).  An object's
exact size would be stored in the object's metadata, which could also
be encrypted (I don't know if it is in ZFS).
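
To make the arithmetic concrete, here is a minimal sketch in Python,
assuming every object is padded up to a whole number of records as
described above.  The function name padded_size and the 128 KiB
RECORDSIZE value are illustrative assumptions, not taken from ZFS or
Tahoe-LAFS code.

```python
RECORDSIZE = 128 * 1024  # assumed block size in bytes, for illustration only


def padded_size(logical_size: int, recordsize: int = RECORDSIZE) -> int:
    """Return the on-disk size an observer would see for an object.

    Assumes the object is stored as a whole number of recordsize-byte
    blocks, so the true length is rounded up to the next multiple.
    """
    if recordsize <= 0:
        raise ValueError("recordsize must be positive")
    if logical_size < 0:
        raise ValueError("logical_size must be non-negative")
    blocks = -(-logical_size // recordsize)  # ceiling division
    return blocks * recordsize


# Files of 1 byte and of exactly one record look identical from outside;
# one more byte spills into a second block and becomes distinguishable.
small = padded_size(1)
full = padded_size(RECORDSIZE)
spill = padded_size(RECORDSIZE + 1)
```

Under these assumptions an observer learns only the block count, so
any two files whose sizes round up to the same multiple cannot be told
apart by size alone.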

You can't avoid leaking some information about file sizes unless
you're willing to create enormous logical files for actually-small
files.

If you want to be able to verify integrity, relocate data, or dedup it
while lacking keys, then you need to leak a fair bit: you have to
organize the filesystem into a set of objects, each object having
visible block pointers (i.e., hashes of blocks) and a size in number
of blocks.  An observer can't see directly which objects are
directories, but access patterns will generally be indicative.
Alternatively you could encrypt everything, and fill all empty space
with random data, but even then access patterns will tend to indicate
what is likely to be read-only or mostly metadata.
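
A rough sketch of the keyless-verification layout described above, in
Python.  It assumes the visible block pointers are SHA-256 hashes of
the ciphertext blocks; the names StoredObject, make_object, verify and
shared_blocks are hypothetical and do not describe ZFS's or
Tahoe-LAFS's actual on-disk format.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Set


@dataclass
class StoredObject:
    # Visible to anyone holding the storage, with or without keys.
    block_hashes: List[bytes]

    @property
    def size_in_blocks(self) -> int:
        return len(self.block_hashes)


def make_object(ciphertext_blocks: List[bytes]) -> StoredObject:
    """Record one hash per (already encrypted) block."""
    return StoredObject([hashlib.sha256(b).digest() for b in ciphertext_blocks])


def verify(obj: StoredObject, ciphertext_blocks: List[bytes]) -> bool:
    """Check integrity using only ciphertext and the visible hashes."""
    if len(ciphertext_blocks) != obj.size_in_blocks:
        return False
    return all(
        hashlib.sha256(b).digest() == h
        for b, h in zip(ciphertext_blocks, obj.block_hashes)
    )


def shared_blocks(a: StoredObject, b: StoredObject) -> Set[bytes]:
    """Hashes present in both objects: candidates for keyless dedup."""
    return set(a.block_hashes) & set(b.block_hashes)
```

Everything verify and shared_blocks consume is exactly what leaks:
the number of blocks per object and which blocks are identical across
objects.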

Nico

PS: I know exceedingly little about Tahoe-LAFS's on-disk format.