[tahoe-dev] #466 state-of-the-patch

Tue Feb 15 18:07:29 UTC 2011

> We currently use serverids for three things:
>
>  * A: [share-placement: claim of independent failure modes]
>  * B: [permutation seed]
>  * C: [shared secret seed]
>
> In the future, we would like to also use serverids to:
>
>  * D: [server-selection UI handle]
>  * E: [Accounting handle]

As Zooko mentioned, we had a very productive conversation on IRC. Here's
some of what I took away:

* We can think of those uses for a serverid as "properties of each
  server". Some of them need to be "secure" in some sense, others do
  not. We don't necessarily need to use the same value for each.
* write-enablers are merely one way to express "mutation authority" (the
  right to modify a mutable share), but there could be others.

The key insight that I'd been missing was this:

*** <zooko> We'll never have a foolscap-free world with write-enablers.

What he specifically meant was that the non-Foolscap protocols we're
thinking of are unencrypted, which means we can't use shared secrets as
mutation authority: we must use something else.

What really matters is how the server can distinguish between a valid
mutation request (from someone who holds the writecap) and an invalid
request. We want to ensure that the server only accepts shares that were
produced by a writecap-holder (to prevent forgery), and ideally which
have a higher sequence number than those which came before (to prevent
rollback attacks).

In the future-world, we'll be using a new mutable-file format for which
an ECDSA pubkey (or hash thereof) is used as the storage-index, and the
server can verify the whole share down to this pubkey. In that system,
mutation authority is based solely upon being able to submit a valid
share, because the server can tell whether a share is valid or not, and
whether it is being put in the right place. (a nice side-effect is that
clients could safely use a Helper for publishing mutable files).

In the current SDMF/MDMF world, we can accomplish most of that: the
server has enough information to validate the share down to the embedded
RSA public key, but does not know whether that pubkey is the right one
(in SDMF, the storage-index is the hash of the data encryption key, not
the RSA verifying key). It also has no way to verify that the encrypted
private key is correct. For both pubkey and encprivkey, the best it can
do is compare them against a previously assumed-valid share and require
that they match. This leaves open a so-called "roadblock attack", where
the attacker uploads a bad initial share which can never be overwritten,
but this is minor, and we've never lost sleep over it.

Server-side verification of an SDMF share is possible, but somewhat
expensive (I measure 615us for a 2512-byte share on my 2.53GHz Core2Duo
OS-X laptop, 3.8ms for a 330kB share, and 21ms for a 2MB share, vs
probably a nanosecond or two for a simple WE string compare), and gets
more expensive as the complexity of the share format grows (merkle trees
vs flat hash, etc).

Write-enablers were a cheap and easy proxy, with both good and bad
points:

pros:
 + they are trivial for the server to verify, especially important when
   we consider using low-power ARM boxes as storage servers
 + allow servers to remain ignorant of how shares are formatted (because
   they don't need to decide whether a share is valid or not), enabling
   version flexibility: this makes it easier to deploy new encoding
   schemes (MDMF, Elk Point, etc)
cons:
 - require an encrypted channel (tied to some serverid, so clients
   reveal the shared-secret over the *right* encrypted channel)
 - allow clients to create invalid shares (not a security problem since
   clients already have the authority to delete shares, but unfortunate)

So as we figure out forwards-compatibility, what we really care about is
how to retain the ability for clients of all versions to mutate any
share that they can otherwise understand, even as we change the
protocols and channels by which they send those share-mutation messages.

I see four things that affect this:

   1: server version (what formats it knows about and can thus validate)

   2: share version/format (SDMF, MDMF, something new and unrecognized)

   3: the credentials presented: write-enabler, signed mutation request,
      SDMF RSA signature, some new format's ECDSA signature

   4: the channel over which the credentials are sent (foolscap,
      unencrypted HTTP, HTTPS-validated-by-servername+CA,
      HTTPS-validated-by-hash-of-cert/YURL-style)

I propose that we consider four methods to express mutation authority:

Method A: (current) use a TubID-based write-enabler. Fast, gives us
          version flexibility, but can only be used over Foolscap, and
          breaks server-side share migration
Method B: verify RSA signature in SDMF/MDMF shares. Slower, not tied to
          storage-index, needs server version that is equal or newer
          than the client, but can be used over HTTP, or on a migrated
          share.
Method C: verify ECDSA sig in a new mutable-file format. Same as B but
          *is* tied to storage-index so thwarts roadblock attack.
Method D: client provides ECDSA pubkey as write-enabler. share-mutation
          requests that are signed by this pubkey are accepted. Provides
          version flexibility, unencrypted channels, migrated shares,
          but doesn't prevent roadblock attacks.

We're already doing A. To add B, we change the storage server to respond
to an incorrect write-enabler by creating a copy of the share with the
mutations applied, extracting the pubkey from the old share, asserting
that it matches the pubkey in the new share (same for the encprivkey),
then verify the entire share up to that pubkey, and finally assert that
the sequence number is higher than in the old share. That would (for a
significant speed penalty) allow mutable shares to be migrated from one
foolscap-based server to another (once moved, they'd never have the
right write-enabler and would always fall back to the RSA verification).
It would also let a future unencrypted HTTP protocol still express
mutation authority without needing to do any fancy serverid mapping. To
add C, we need a new mutable-file format. To add D, we need some new
transport options, described below.
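
The A-then-B fallback described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: the Share class, the
accept_mutation() name, and the verify_share callback are hypothetical
stand-ins, not the storage server's real API, and the actual SDMF/MDMF
parsing and RSA verification are left behind the callback.

```python
import hmac
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Share:
    """Hypothetical parsed view of a mutable share (real layouts differ)."""
    pubkey: bytes
    encprivkey: bytes
    seqnum: int


def accept_mutation(old: Share, new: Share,
                    presented_we: bytes, stored_we: bytes,
                    verify_share: Callable[[Share], bool]) -> bool:
    """Decide whether 'new' (the old share with mutations applied) may replace 'old'."""
    # Method A: a matching write-enabler is accepted without looking inside the share.
    if hmac.compare_digest(presented_we, stored_we):
        return True
    # Method B: the pubkey and encprivkey must match the assumed-valid old share...
    if new.pubkey != old.pubkey or new.encprivkey != old.encprivkey:
        return False
    # ...the whole share must verify up to that pubkey (assumed to be done by the callback)...
    if not verify_share(new):
        return False
    # ...and the sequence number must go up, to block rollback.
    return new.seqnum > old.seqnum


# Demo with a verifier stub that treats every share as well-signed.
old_share = Share(b"pub", b"enc", 4)
always_valid = lambda share: True
migrated_ok = accept_mutation(old_share, Share(b"pub", b"enc", 5),
                              b"wrong-we", b"stored-we", always_valid)
rollback = accept_mutation(old_share, Share(b"pub", b"enc", 3),
                           b"wrong-we", b"stored-we", always_valid)
```

Note that the write-enabler path deliberately skips share validation, which
matches the "version flexibility" property listed among the pros earlier.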

(it still might be a good idea to build a write-enabler-changing
protocol, to let us avoid paying the migrated-share speed penalty
forever. The trick is to not let one server abuse this protocol to learn
or replace the WE on a different server. See below for one approach)

So the proposal is to continue using the FURL's TubID to seed
write-enabler computation even when the FURL is delivered in a signed
announcement. The seed is a property of the channel used to deliver the
mutate-share message, not a property of the server which announced the
channel.

== Server IDs ==

So, the preceding section suggests that the thing-C [shared secret seed]
use of a "serverid" is really about expressing mutation authority. What
about the other uses?

* thing-A [share-placement]: this is how server A claims that it is
  different from server B, suggesting that clients should send A+B two
  shares instead of just one. Each client will maintain a set of usable
  servers: a new announcement can replace an old one iff the thing-A
  value is the same.

  Two related servers should be able to describe themselves as
  identical, but two unrelated servers should not be able to spoof each
  other. So any pubkey-like identifier will do, it doesn't even need to
  be stable. For this purpose, we could use TubIDs or pubkeys, it
  doesn't matter as long as no server gets advertised using both at the
  same time (which would look like two distinct servers).

* thing-B [permutation-seed]: keeping this stable over the long term
  will improve performance. When this changes, downloads will spend more
  time and bandwidth searching for shares (since they'll effectively be
  looking in random places), re-uploading-an-existing-file will place
  duplicate shares (since they won't look in the right place for
  pre-existing shares), and mutable-file changes will take longer (they
  won't look in the right places). A "file rebalance" operation (not
  implemented yet) would fix all of these problems by moving the shares
  into their new correct places.

  I think we can allow servers to change their permutation-seed (i.e.
  allow them to move to a different pseudorandom place in the permuted
  ring for each file), by asking what new abilities this would grant to
  a server. If they were at the start of the permuted ring (and held a
  share) and move away, it would have the same effect as simply not
  responding to the DYHB request. If they used to be elsewhere in the
  ring and moved to the start, the effect is to make clients perform a
  useless extra DYHB query. Neither is a big security loss.

  So permutation-seed could be anything, and doesn't really need to be a
  verified property. Servers can announce their desired permutation-seed
  via the introducer. For stability, existing servers should continue to
  use their TubID for this, but new servers (who have received no shares
  yet) could use their pubkey.

* thing-D [server-selection UI handle]: this needs to be unforgeable: if
  a user decides to trust server X with their data, because X has a good
  reputation or because there's some other relationship in place
  (payment, reciprocal accounting, etc), then we need to prevent server
  Y from taking advantage of that relationship. Any pubkey-like value
  will do, but stable is better. I think we should use the #466
  signed-announcement pubkey for this, rather than the tubid.
  Future-world servers (which don't use foolscap at all) will have a
  pubkey but not a tubid, and then servers which were around before HTTP
  will benefit from whatever reputation they accrued during their
  pure-foolscap days.

  What about unsigned announcements? I see three choices:
    - 1: ID=None if unsigned, =pubkey if signed
    - 2: ID=tubid if unsigned, =pubkey if signed
    - 3: ID=tubid forever for old servers, =pubkey forever for new
         servers (just like $BASEDIR/permutation_seed below), except
         ID=pubkey once foolscap goes away

  We can evaluate each by how they affect the client-server relationship
  across the unsigned->signed->HTTP-only transitions. 3 breaks it across
  the signed->HTTP-only transition. 2 breaks it across the
  unsigned->signed transition. 1 breaks it during the unsigned period
  (you wouldn't be able to express explicit-server-selection unless your
  servers and Introducer are new enough to handle signatures). I think
  that 1 (ID=None or =pubkey) might be easier overall.

* thing-E [Accounting handle]: this needs the same properties as thing-D.
  There might be an argument to make this a different key, but I think
  it's safe to use the announcement-signing pubkey as before.
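
To make the thing-B discussion above concrete, here is a minimal sketch of a
per-file permuted ring. It assumes each server's position is a hash of the
storage index concatenated with that server's permutation seed; the choice
of SHA-1, the permuted_order() name, and the sample seeds are assumptions
made for illustration, not a statement of what the current code does.

```python
import hashlib


def permuted_order(storage_index: bytes, seeds: dict) -> list:
    """Return server names sorted by their pseudorandom position for this file."""
    def ring_position(server):
        # Assumed scheme: hash(storage_index + permutation_seed).
        return hashlib.sha1(storage_index + seeds[server]).digest()
    return sorted(seeds, key=ring_position)


storage_index = b"example-storage-index"
seeds = {"A": b"seed-a", "B": b"seed-b", "C": b"seed-c", "D": b"seed-d"}

before = permuted_order(storage_index, seeds)
# Server B announces a different permutation seed.
after = permuted_order(storage_index, dict(seeds, B=b"new-seed-b"))

# Only B can move; everyone else keeps the same relative order.
others_before = [s for s in before if s != "B"]
others_after = [s for s in after if s != "B"]
```

Under this model, a server that changes its seed only relocates itself in
each file's ring, which is why the worst outcomes are a missed share or one
wasted DYHB query.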

== Announcement format ==

So that says that announcements should be signed by a pubkey, and the
pubkey will be used as a UI handle for identifying the server when
configuring explicit server selection choices or for Accounting purposes
(both now, and for a future announcement that includes an HTTP URL but
not a FURL). The announcement dictionary will contain a key named
"permutation-seed", to be used by clients when computing the permuted
server list, which will be filled with the TubID for legacy servers, and
with the pubkey for new servers. They will also have a key named "FURL"
which tells clients how to connect to the server being advertised, and
clients will compute write-enablers according to the tubid of that FURL.
A future HTTP-only server will omit the "FURL" key.

=== Server Changes ===

I'm thinking that servers should generate an ECDSA keypair when
$BASEDIR/private/something.privkey and $BASEDIR/something.pubkey don't
already exist, and write them into those files. Announcements will be
signed by something.privkey, and something.pubkey will be displayed on
the WUI for human display and accounting purposes. The tubid will
continue to appear on the WUI, but in a diminished role.

Then, the server should write a $BASEDIR/permutation_seed file at
startup if it doesn't already exist. If they have shares already, they
write their tubid into that file. If they don't, they should write the
pubkey. The contents of this file will be used to populate their
introducer announcement's ["permutation-seed"] field.
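
The startup rule above could look something like the following sketch. The
function name, its parameters, and the plain-text file encoding are all
hypothetical; how the server decides that it "has shares already" is left
to the caller.

```python
import os
import tempfile


def load_or_create_permutation_seed(basedir: str, tubid: str, pubkey: str,
                                    has_shares: bool) -> str:
    """Return the stable permutation seed, writing the file on first startup."""
    path = os.path.join(basedir, "permutation_seed")
    if os.path.exists(path):
        # Once written, the file wins: the seed must stay stable.
        with open(path) as f:
            return f.read().strip()
    # Legacy servers (already holding shares) keep their tubid; new ones use the pubkey.
    seed = tubid if has_shares else pubkey
    with open(path, "w") as f:
        f.write(seed + "\n")
    return seed


with tempfile.TemporaryDirectory() as legacy_dir, \
        tempfile.TemporaryDirectory() as fresh_dir:
    legacy_seed = load_or_create_permutation_seed(
        legacy_dir, "tubid-abc", "pubkey-xyz", has_shares=True)
    fresh_seed = load_or_create_permutation_seed(
        fresh_dir, "tubid-abc", "pubkey-xyz", has_shares=False)
    # A later restart, after shares have arrived, must not change the seed.
    fresh_again = load_or_create_permutation_seed(
        fresh_dir, "tubid-abc", "pubkey-xyz", has_shares=True)
```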

=== Client Changes ===

Clients currently keep track of servers through a class in
storage_client.py that implements the "IServerDescriptor" interface.
This was a start towards abstracting out various aspects of a server, to
enable things like S3-based servers to sit next to normal Foolscap-based
ones. I think we should add the following methods to this interface:

 * get_permutation_seed()
 * get_foolscap_secret_seed()
 * get_printable_serverid()

Then the current NativeStorageClientDescriptor implementation can
provide different values according to what the announcement said.
permutation_seed will come from the announcement. foolscap_secret_seed
will come from the FURL. printable_serverid will come from the pubkey
used to sign the announcement, or the tubid if unsigned. I think this
also helps the formatting/versioning questions: announcement signatures
can include a pubkey like "v0-abc123.." even though the resulting
printable serverid is presented as "serverid0-abc123" or whatever.
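
A rough sketch of how a descriptor might derive those three values from an
announcement follows. AnnouncedServer is a hypothetical stand-in for
NativeStorageClientDescriptor, the "v0-" to "serverid0-" mapping is just
the example formatting from the paragraph above, and falling back to the
tubid when an old announcement lacks "permutation-seed" is an assumption.

```python
from typing import Optional


class AnnouncedServer:
    """Hypothetical descriptor built from one introducer announcement."""

    def __init__(self, announcement: dict, signing_pubkey: Optional[str]):
        self._ann = announcement
        self._pubkey = signing_pubkey  # None for unsigned announcements

    def _tubid(self) -> Optional[str]:
        furl = self._ann.get("FURL")
        if furl is None:
            return None  # a future HTTP-only server has no FURL
        if not furl.startswith("pb://"):
            raise ValueError("not a FURL: %r" % (furl,))
        # FURLs look like pb://TUBID@location-hints/swissnum
        return furl[len("pb://"):].split("@", 1)[0]

    def get_permutation_seed(self) -> Optional[str]:
        # Assumption: legacy announcements without the key fall back to the tubid.
        return self._ann.get("permutation-seed", self._tubid())

    def get_foolscap_secret_seed(self) -> Optional[str]:
        # Write-enablers stay tied to the channel (the FURL's tubid).
        return self._tubid()

    def get_printable_serverid(self) -> Optional[str]:
        if self._pubkey is None:
            return self._tubid()
        if self._pubkey.startswith("v0-"):
            return "serverid0-" + self._pubkey[len("v0-"):]
        return self._pubkey


furl = "pb://tubidxyz@example.invalid:1234/swissnum"
signed = AnnouncedServer({"permutation-seed": "tubidxyz", "FURL": furl},
                         signing_pubkey="v0-abc123")
unsigned = AnnouncedServer({"FURL": furl}, signing_pubkey=None)
```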

Later, we'll move server-messaging functionality out of upload.py and
downloader/*.py into the IServerDescriptor object. Ideally it should
have a method like "send_mutation", making the IServerDescriptor
responsible for figuring out what sort of HTTP or Foolscap message is
required, as well as what sort of credentials need to be included. Then
implementing a new protocol (HTTP, S3, etc) would be a matter of
providing a new kind of IServerDescriptor object.

=== Next Changes ===

After that, to get to HTTP for real, we should release a
tahoe-version-X1 with the following work:

 * build the server-side write-enabler-replacement code: a method that
   accepts proof of knowledge of the existing write-enabler and accepts
   a new write-enabler to add to the share. For simple share migration
   (moving from tubid A to tubid B), the client sends B a message with
   (nonce,H(nonce+WE(A)+tubidB),WE(B)), and the server adds WE(B) iff
   the hash matches (this prevents server C from soliciting a
   H(nonce+WE(A)+tubidC) from the client and using it to set a bad WE on
   server B). The nonce prevents the server from learning WE(A) in the
   process (not a zero-knowledge proof but good enough).
 * build client-side WE-replacement code: react to BadWriteEnablerError
   by invoking the previous method
   - at this point, all shares can be safely migrated between servers
 * build server-side replace-WE-with-pubkey-enabler code: given the same
   proof as above, the client provides an ECDSA pubkey, and the server
   writes it down as representing mutation authority. The corresponding
   privkey is derived by hashing the writecap (or just the writekey).
 * build server-side signed-mutation messages: express the test_and_set
   operation as a single binary string, then build a method that takes
   that string plus a signature and a pubkey. Accept the mutation
   request if the signature is good and the pubkey matches the stored
   pubkey-enabler
 * build client-side signed-mutation code: based upon server version
   data, fall back to signed-mutation calls when write-enabler fails
 * build client-side deep-upgrade code: walk through all writecaps, add
   pubkey-enabler to each
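
The write-enabler-replacement exchange in the first item above, i.e.
(nonce,H(nonce+WE(A)+tubidB),WE(B)), can be sketched as below. SHA-256,
the 16-byte nonce, and the function names are assumptions for
illustration, and plain concatenation is used only for brevity: a real
encoding would need unambiguous framing (netstrings, say).

```python
import hashlib
import hmac
import os


def we_proof(nonce: bytes, old_we: bytes, target_tubid: bytes) -> bytes:
    """H(nonce + WE(A) + tubid of the server the request is meant for)."""
    return hashlib.sha256(nonce + old_we + target_tubid).digest()


def client_build_request(old_we: bytes, new_we: bytes, target_tubid: bytes):
    nonce = os.urandom(16)
    return nonce, we_proof(nonce, old_we, target_tubid), new_we


def server_handle_request(stored_we: bytes, my_tubid: bytes, request):
    """Return the new write-enabler to add, or None if the proof fails."""
    nonce, proof, new_we = request
    expected = we_proof(nonce, stored_we, my_tubid)
    return new_we if hmac.compare_digest(proof, expected) else None


we_a = b"write-enabler-for-tubid-A"
we_b = b"write-enabler-for-tubid-B"

# Honest migration: the client addresses server B directly.
accepted = server_handle_request(
    we_a, b"tubidB", client_build_request(we_a, we_b, b"tubidB"))

# Server C solicits a proof bound to tubidC, then replays it at B with a bad WE.
nonce, proof_for_c, _ = client_build_request(we_a, b"unused", b"tubidC")
replayed = server_handle_request(we_a, b"tubidB", (nonce, proof_for_c, b"bad-we"))
```

Binding the target tubid into the hash is what makes the replayed proof
useless on server B.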

Then we release tahoe-version-X2 with:

 * build HTTP transport, with signed-mutation messages (but of course
   not WE-based mutation messages)

And finally a tahoe-version-X3 that removes foolscap.

Then to get from foolscap to HTTP, the steps are:

 * tahoe-X0 is current 1.8.2
 * upgrade all storage servers to X1
 * upgrade at least one client per rootcap to X1
 * deep-upgrade all writecaps to add pubkey-enabler
 * - now shares are mutable by X0 *and* are ready for HTTP
 * release tahoe-X2: add HTTP
 * upgrade some servers, some clients
 * new clients use HTTP against new servers, using pubkey-enabler, or
   foolscap against old servers. old clients use foolscap.
 * finish upgrading all clients
 * release tahoe-X3: remove foolscap
 * upgrade some servers
 * X1 clients can no longer access files, but X2 clients can
 * finish upgrading all servers
 * everything is now on HTTP

== Remaining Questions ==

 * what should human-manipulated serverids look like? desiderata:
   cut-and-pasteable, printable, stable, unique, not confused with other
   identifiers like filecaps or privkeys. "serverid0-abc123.."?
 * figure out unicode/ascii/utf-8/json in encoding formats, curse JSON
   for not handling binary, probably use netstrings
 * set thing-D=None for unsigned announcements, and require sigs for
   explicit server-selection?