[tahoe-dev] DSA-based mutable files, key sizes

Brian Warner warner-tahoe at allmydata.com
Fri Mar 7 08:57:44 UTC 2008

Zooko and I were chatting on IRC tonight about our #217 DSA-based
mutable files. There's a diagram attached[1] to that ticket with the
encryption scheme we're considering. Our goals (in order) are:

 * provide good confidentiality and integrity properties
 * have small write- and read- caps, short enough to put on a 72-character
   wide email message (including the "" prefix)
   without wrapping
 * Allow non-readers some number of verification bits that they can use to
   discourage "roadblock" attacks: where somebody who can guess your storage
   index creates their own files (with a different private key) to prevent
   you from uploading your own shares
 * provide the next layer up with three levels of symmetric keys. The
   current RSA-based mutable files offer two levels: the write-key and
   the read-key (which is used to provide transitive readonlyness). Three
   levels would add transitive-verify, which would allow the delegation
   of a single root "deep-verify" cap, that grants the holder the ability
   to traverse the full directory tree, but only get verify caps out of
   each node.

The last property is a wishlist item: it might be cool to have, but we aren't
tied to it. It would simplify the development of file-repair services: you
would only need to hand out a single deep-verify cap instead of creating a
full manifest of shallow-verify caps and giving the whole list to the repair
agent. On the other hand, there are good reasons to have each client produce
a full manifest on a regular basis (like adding and removing leases). So
maybe we won't bother.

The idea behind the third property (verification bits) is to make sure that
some portion of the storage index (64 out of the 128 bits, in our current
plan) is derived solely from the public key being used. Since the whole share
is hashed and signed by this key, anybody can validate every bit of the
share, up to that public key. But how do we know it's the *right* public key?
All caps (write-cap, read-cap, and verify-cap) contain a full 256-bit hash of
the public key, so the holders of those caps can be confident that they know
which public key is in use. But the storage server, who has the share but
none of these caps, cannot benefit from this full-sized hash.

Normally, storage servers are completely unaware of the data being stored
inside these shares. They accept requests to put bytes 1,2,3 inside a slot
with storage index 4, and don't look too closely at the contents. But, there
is a particular kind of attack that we've dubbed the "roadblock" attack, for
which the server might want to examine the data. This attack would be mounted
by a storage server, probably the first storage server that you talk to when
you're creating a brand-new mutable file (say, when you create a new
directory). This storage server has a brief window of time: after you've
asked them to allocate a given storage index, but before you've asked anybody
else to do the same. During this window they can quickly contact all of the
other storage servers that you're likely to contact, and issue their own
slot-allocation request (using your storage index). They fill those slots
with their own data (specifically using a different write-enabler).
Eventually, your client makes contact with the other servers, but by that
point it's too late: those servers have filled the slot, and your
write-enabler won't match, so you can't replace the data with the correct
one. If they do this to enough servers, you'll be unable to place enough
shares, and you'll be prevented from writing to the mutable file that you
just created.

This is a pretty unlikely attack, and as such we don't expect to ever see it
in practice. But, because it seemed like a good idea, we changed the design
of our DSA-based mutable files to include a certain amount of protection
against it. We made part of the storage index be the hash of the public key:
just the public key, not any other piece of information. This way, if we ever
suspect that a roadblock attack is happening, we can have the storage server
(which knows the public key and the share contents but not the verify-cap)
check to see if the public key matches the storage index. If it doesn't, then
we know that we've got one of these bogus shares in place, and the storage
server can simply delete it (allowing the real client to reclaim the slot).

If we see a lot of this problem, we can change the storage server code to
automatically check the share during upload: it can parse the slot contents,
verify the hashes up to the pubkey, then verify that the pubkey's hash
matches the storage index. Any problem is grounds to delete the share. Also,
any client which detects the signs of a roadblock attack (unable to write due
to mismatching write-enabler on a brand new slot) can petition the storage
server to validate the share on-demand, giving the clients a way to fix the

But, there isn't enough room in the read-cap to put a full isolated hash of
the pubkey. If you look at the diagram, you'll notice that the read-cap
contains a left portion and a right portion. The left portion is the hash of
the secret "salt" and the pubkey. The "salt" is available to holders of the
write-cap and the read-cap: if you know the salt and the pubkey, you can
derive the read-cap, and if you know the read-cap, you can fetch the share
(and the pubkey), which gives you enough information to validate the pubkey
with a full 256 bits (192 from the left portion, another 64 from the right).

The right portion is only 64 bits long, but derives *only* from the pubkey
and not from the salt. This is important, because that right portion is used
to derive the right portion of all subordinate caps, all the way down to the
storage index. This means that there are 64 bits of the storage index which
are derived from nothing but the pubkey, and those are the 64 bits that the
storage server can use to validate the pubkey.

Now, 64 bits isn't quite strong enough for high-quality cryptographic
validation purposes. It's still more operations than the age of the universe,
but with the way computers are getting faster, 64 bits doesn't give us a
sufficiently comfortable margin. That's why we say that these bits
"discourage" a roadblock attack rather than claiming that they "prevent" one.
It's an awful lot of discouragement: you'd have to be unlucky enough to hit a
malicious storage server on the first try, then they'd have to be lucky
enough to find a 64-bit collision in SHA-256d (generating a new DSA keypair
for each attempt) in the 10 or 20 milliseconds that take place before you
manage to talk to the next few servers.

Anyways, the actual point of this message is to highlight a problem that we
discovered this evening, while talking about the deep-verify cap. We failed
to realize earlier that we're faced with a frustrating choice for our verify
caps. It appears that we can either:

 * get a full 256 bits of pubkey hash in the verify cap, or
 * be able to derive the verify cap from the read-cap (without having access
   to the full pubkey)

but not both.

This is disappointing. Without a strong hash of the pubkey in the verify cap,
somebody could (with a lot of work) create a share that would make the
verifier think that the data was ok, but which is actually bogus (the
write-cap or read-cap holder would be able to detect the problem, but not the
verify-cap holder). But if we can't derive the verify-cap from the read-cap,
then it will take a lot of network traffic to generate a verify-cap manifest:
for every mutable file in the tree, we have to retrieve a partial share and
extract the pubkey from it (so we can compute the full 256-bit hash and
insert it into the verify-cap).

Now, this might not be that bad. When we walk the tree, we have to visit and
read every dirnode anyways. So if we only use mutable files for dirnodes,
then these pubkey fetches don't add any additional traffic. It would require
some creative juggling of the order in which we construct the manifest (sort
of depth-first: see the child dirnode, fetch its contents, defer traversing
the contents but use the pubkey to create the verify-cap, do the same for all
children, then finally traverse the children), but it's probably tractable.

But in general, we like to have a nice, clean, linear relationship between
our caps. write-cap -> read-cap -> verify-cap -> storage-index. (or in this
case, write-cap -> read-cap -> deep-verify-cap -> verify-cap ->
storage-index). Needing to do network traffic just to get from a superior cap
to a subordinate one is.. annoying.

So, what can we do? In general, every data-bearing cap (write-cap, read-cap,
and in this case deep-verify-cap) needs to provide bits for two distinct
purposes: confidentiality and integrity. In immutable file caps, these are
completely disjoint values: the 128-bit AES encryption key provides
confidentiality, and the 256-bit UEB hash provides integrity. The clever
thing about the #217 DSA approach is that it combines these two purposes into
one string: the left part of the read-cap provides 192 bits of
confidentiality (derived from the 256-bit secret salt), as well as a full 256
bits of integrity (in two pieces, but they total to 256 bits). The bits in
the right part give us integrity (but not confidentiality), while the bits in
the left part gives us *both* confidentiality and integrity. This is really
clever, and I'm very impressed by Zooko's cryptographic design skills.

The verify-cap problem is that every read-cap bit that provides
confidentiality (that 192-bit left part) is only useful for integrity
purposes to people that know the read secret. So if you have the read-cap
(but haven't bothered to fetch the pubkey), and you're creating a verify-cap
to give to people who will never get to see the read-cap, then you only have
64 bits of integrity material to offer them. All of the other bits are
useless: you'd have to share the read-key to give them a chance to validate
them, and that would violate the confidentiality goals.

So, our options are:

 1: make the right part of the read-cap larger. It is 64 bits now, but we'd
    want at least 128 bits to have an appropriate security level for our
    integrity properties. A 72-character cap minus 26-char prefix gives us 46
    characters to work with, and if we use base64 encoding that gives us
    roughly 276 bits. We plan to use about four characters for human-readable
    version/type information (so you can tell whether the cap is read-only or
    read-write, etc), which only leaves us 256 bits to work with. So making
    the right part larger means making the left part smaller. Fortunately the
    integrity properties are expressed in both sides. So, we could shrink the
    right part to 128 bits (which is all we get from the resulting AES
    encryption anyways), and expand the right part to 128. This would give us
    128 bits of confidentiality, and 256 bits of integrity (for readers), and
    128 bits of integrity (for non-readers). We could then make the
    verify-cap derive just from the right part. This would reduce the
    integrity level of the verify-cap from 256 bits (in our current scheme)
    to 128 bits, but it would allow the verify-cap to be generated without
    fetching the pubkey. Is 128 bits enough?

 2: require a pubkey fetch to generate the read-cap

 3: derive the verify-cap just from the read-cap, leaving the left-right
    split as it is. This would reduce the integrity level of the verify-cap
    from 256 bits to 64 bits.

 4: do something even more clever

Zooko is working on adding DSA to pycryptopp (#331), and he's starting with
an elliptic curve implementation (ECDSA instead of the traditional discrete
log-based DSA). This means that both private and public keys will be about
200 bits, and signatures will be about 400 bits. This is slightly smaller
than the 256-bit private key we'd been designing around around. The 200-bit
ECDSA public key is significantly smaller than the 2048-bit DLDSA key we'd
been expecting. This opens up the following possibilities:

 * clearly, 256 bits of confidentiality is overkill, both because AES only
   takes 128 bits, and because the private key is now only 200 bits.
 * the public key is small enough to put inside a cap. Unfortunately, if the
   cap is limited to 256 bits, then it isn't small enough to share the cap
   space with anything else (and the pubkey is public knowledge, so it
   can't provide any integrity). There are 56 bits left over, but, hey, they
   give away DES crackers in cereal boxes these days.

Working from the bottom up, we could make the storage index be the 128-bit
hash of the pubkey. The verify cap could then be just the pubkey, nothing
else. The write-cap should still be just the private key, of course. That
leaves the read-cap in the middle. If we include the whole pubkey, then we
can use it to derive the verify-cap without an extra fetch. But the
left-part/right-part split is still required: every bit that is useful to the
verify-cap must be stolen from the confidentiality level.

Ideally we would make the left-part of the read-cap be a 128-bit hash of the
write-key (giving us 128 bits of confidentiality), and the right-part would
be the 200 bit-long pubkey (giving all caps 200 bits of integrity: the whole
pubkey). If we did this, the verify-cap and storage-index would both be just
the pubkey. If we wanted a four-level symmetric-key scheme, then the
deep-verify cap would just be the storage index plus the 128-bit hash of the
left-part .

Unfortunately that makes for a 328-bit read-cap, which needs 55 characters
minimum, which expands to 85 characters once you add the prefix and header
bytes. Write-caps would be 200 bits, and verify caps would also be 200 bits
(not that we care about short verify caps). What a pity! This would be a
really clean approach, with great security properties. To get it down to 256
bits, we'd have to use 128-bit ECDSA keys, which only gives us a security
factor of 64, which is too small. Maybe if we switched to a unicode-encoded
base65531 scheme we could get around the 72-column limit.[2]

I think the compromise is to have the read-cap be a left-part that's a
128-bit hash of the private key, and a right-part that's a 128-bit hash of
the public key, and have the verify-cap and storage index be copies of the
right-part. This would give writers 200 bits of integrity, everyone else 128
bits of integrity, and 128 bits of confidentiality. In this scheme, the
pubkey is still stored on the server, but verify-caps and the storage index
can be derived from the read-cap without fetching the pubkey. The only users
of the pubkey that's embedded in the share are read-cap holders and a storage
server that wants to validate the share. (in the 328-bit read-cap, only
storage servers would need a copy of the key. If we used a 200-bit storage
index, we wouldn't even need that copy).

Or, we can do option #1 (with Zooko's salt trick), with a 128/128 split,
which gives readers 256 bits of integrity instead of just 128.

Whew! That was fun. Now, what should we do about it?


[1]: http://allmydata.org/trac/tahoe/attachment/ticket/217/mutable-DSA.2.png
[2]: this sentence inserted to see if you were still paying attention. We're
     almost at the end, time to wake up now.

More information about the tahoe-dev mailing list