[tahoe-dev] Question about convergence keys

Brian Warner warner-tahoe at allmydata.com
Wed Aug 13 02:31:52 UTC 2008

> I guess if you want to store a mixture of small really confidential data
> and large semi-confidential/public data, then you'd create two nodes with
> distinct convergence keys. Or is there some more subtle way of achieving
> the same result?

Aye, that's the rub: how do you tell whether a given file is confidential,
and whether it is guessable? You might presume that large files are
not very guessable (and use some sort of heuristic like "use a null
convergence secret for all files larger than 2MB"), but we can think of
several counter-examples that are large, secret, and low-entropy (i.e.
guessable). Base something off the filename? But then your security
properties depend upon how you choose to name your files.

The lack of a clear and safe heuristic, coupled with experimental data
showing that convergence did not provide a significant reduction in disk
usage, led us to choose non-convergent uploads (i.e. randomly generated
convergence domains) for the current Tahoe release.
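
To make the two options concrete, here is a rough sketch of the size-based
heuristic quoted above next to the non-convergent alternative. This is not
Tahoe's code: the function names, the 32-byte secret length, and the reading
of "2MB" as 2*1024*1024 bytes are all assumptions made for illustration.

```python
import os

# Assumption: "2MB" from the heuristic above, read as 2 * 1024 * 1024 bytes.
SIZE_THRESHOLD = 2 * 1024 * 1024


def heuristic_secret(file_size: int, private_secret: bytes) -> bytes:
    """Hypothetical size-based heuristic.

    Files larger than the threshold get a null (empty) secret, so anyone
    uploading the same bytes converges on the same key. Everything else
    uses the caller's private secret.
    """
    if file_size > SIZE_THRESHOLD:
        return b""
    return private_secret


def non_convergent_secret() -> bytes:
    """Fresh random secret per upload, so no two uploads converge.

    The 32-byte length is an arbitrary choice for this sketch.
    """
    return os.urandom(32)
```

The weakness of the first function is the one described above: a large file
that is secret but low-entropy still gets the null secret, and nothing about
its size tells you that.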

> OK, that's what I was hoping. The key isn't exactly the file hash, so
> knowing the bare file hash doesn't let you decrypt it.

Oh, yeah. As Zooko pointed out, Tahoe doesn't use flat hashes anywhere: we
always have a per-purpose tag mixed into the hash. The encryption key is
deterministically derived from the file contents, but it is distinct from a
simple SHA256 hash of the file.
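
As an illustration of what "a per-purpose tag mixed into the hash" can look
like, here is a minimal sketch. The tag string, the length-prefixed framing,
and the 16-byte key length are assumptions made for this example and are not
claimed to match Tahoe's actual derivation.

```python
import hashlib


def _frame(b: bytes) -> bytes:
    # Length-prefix a field so the boundary between fields is unambiguous.
    return b"%d:%s," % (len(b), b)


def tagged_hash(tag: bytes, data: bytes) -> bytes:
    """SHA-256 over a framed purpose tag followed by the data."""
    return hashlib.sha256(_frame(tag) + data).digest()


def convergent_key(contents: bytes, convergence_secret: bytes = b"") -> bytes:
    """Deterministic key from file contents plus a convergence secret.

    The tag below is hypothetical, as is truncating to 16 bytes.
    """
    tag = b"example_convergent_encryption_key_v1"
    return tagged_hash(tag, _frame(convergence_secret) + contents)[:16]
```

With this shape, the same contents and secret always give the same key, a
different secret gives a different key, and someone who only knows the plain
SHA-256 of the file does not thereby hold the key.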

