[tahoe-dev] Idea for grid-based aliases

Mon Aug 24 18:24:56 UTC 2009

There has been some discussion of ways to create aliases for Tahoe URLs, in 
order to allow users to provide shorter, nicer URLs.  I suggested (though I'm 
sure I wasn't the first) the idea of variable-security aliases that would 
allow the user to ask Tahoe to generate aliases of arbitrary length.

At the time of that discussion, I don't think we got into the question of 
where the aliases would be stored, though I was assuming that they'd have to 
be stored in some Tahoe node, and that administrative access to that node 
would be sufficient to reveal them.  Thinking about the issue a little this 
morning, I have an idea about how we could store them in the grid.

If something like this were implemented, the user interface would need to 
provide a strong, clear warning to the user that by generating a 
reduced-length alias, he or she is *irrevocably* discarding some degree of 
Tahoe's security guarantees.

However, we don't want to throw away any more security than necessary.  So, 
the security requirements for such variable-length aliases, as I see them, 
are:

1.  The alias must be bound to the full URI.  This allows the user to be 
certain (to the degree possible for the alias length) that an attacker cannot 
insert another file into the grid "underneath" the alias.

2.  It must be possible for the user to easily distinguish the capability 
provided by an alias (write or read), simply by looking at the alias.

The second requirement is arguably more about usability than security.  Some 
other requirements:

3.  The alias mechanism should be flexible enough to accommodate not only all 
of the current capability types, but future types as well.

4.  It should be possible to guarantee that an alias can always be created, 
even in the presence of an observed collision.

5.  The alias mechanism should make use of existing Tahoe infrastructure, 
layering on new capability without requiring deep modifications.

One way to achieve these requirements is by storing the target URI in an 
immutable file (which I'll call a "URI file"), and then aliasing the URI of 
that file.  The reason for aliasing the URI file URI rather than the target 
URI is to accommodate various types of URIs in a flexible and extensible 
manner.

Here's the process I envision:

1.  Hash the target URI and truncate the result to the length requested by the 
user.  Convert it to ASCII using base 32.
2.  Prepend a character indicating capability type to the truncated hash.  The 
result is the alias.
3.  Hash the ASCII alias to produce an encryption key.
4.  Hash the encryption key to produce a storage ID.
5.  Check to see if the storage ID exists in the grid.  If it does not, go to 
step 6.
5a. If it does, retrieve the file and see if it's an alias for the same target 
URI.  If it is, we're done.
5b. If it is an alias for a different URI, add one to the alias length 
requested by the user and start over.
6.  Store an immutable file containing the target URI in the grid, using the 
key from step 3 as the CHK encryption key.
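
To make the creation steps concrete, here is a rough Python sketch.  It is 
not Tahoe code, and several things in it are my assumptions for illustration: 
plain SHA-256 stands in for whatever hash Tahoe would actually use, the alias 
length is counted in base-32 characters, the prefix characters "r" and "w" 
are made up, "URI:CHK:example"-style strings are placeholders, and a Python 
dict stands in for the grid.  Encryption of the URI file is left out entirely.

```python
import base64
import hashlib

# Hypothetical capability-type prefixes; the proposal does not name them.
CAP_PREFIX = {"read": "r", "write": "w"}
MAX_HASH_CHARS = 52  # a SHA-256 digest is 52 base-32 characters unpadded


def _h(data):
    """Stand-in hash (assumption: plain SHA-256)."""
    return hashlib.sha256(data).digest()


def make_alias(target_uri, cap_type, length):
    """Steps 1-2: truncated base-32 hash of the URI plus a type prefix."""
    if not 1 <= length <= MAX_HASH_CHARS:
        raise ValueError("alias length out of range")
    b32 = base64.b32encode(_h(target_uri.encode("ascii")))
    b32 = b32.decode("ascii").lower().rstrip("=")
    return CAP_PREFIX[cap_type] + b32[:length]


def derive_ids(alias):
    """Steps 3-4: alias -> encryption key -> storage ID."""
    key = _h(alias.encode("ascii"))
    storage_id = _h(key)
    return key, storage_id


def create_alias(grid, target_uri, cap_type, length):
    """Steps 5-6: lengthen the alias until it is free or already ours."""
    while True:
        alias = make_alias(target_uri, cap_type, length)
        _key, storage_id = derive_ids(alias)
        existing = grid.get(storage_id)
        if existing is None:
            # Step 6. A real node would encrypt the URI file with _key.
            grid[storage_id] = target_uri
            return alias
        if existing == target_uri:
            return alias  # step 5a: the alias already exists
        length += 1  # step 5b: collision, so try a longer alias
```

Calling create_alias(grid, uri, "read", 6) twice for the same URI returns 
the same alias and stores only one URI file, which is the behaviour step 5a 
asks for.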

Note that the collision detection and handling in step 5 is there only to 
ensure that a unique alias can be generated, not because it prevents any 
attacks.
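
What does resist substitution is the alias length itself.  As a 
back-of-the-envelope sketch (assuming base-32 encoding, so 5 bits per 
character, and a hash that behaves like a random function), an attacker who 
wants a different URI that hashes to a given alias needs on the order of 2^n 
hash evaluations, where n is the number of bits kept.  The function names 
here are mine, purely for illustration.

```python
def alias_security_bits(hash_chars, bits_per_char=5):
    """Bits of the URI hash retained by an alias (prefix not counted)."""
    return hash_chars * bits_per_char


def expected_substitution_work(hash_chars):
    """Rough number of hash evaluations needed to match a given alias."""
    return 2 ** alias_security_bits(hash_chars)
```

So an alias with 8 hash characters keeps 40 bits, and one with 16 keeps 80.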

When a Tahoe node is presented with an alias, it must:

1.  Hash the alias to generate the encryption key.
2.  Hash the encryption key to generate the storage ID.
3.  Retrieve the URI file (this retrieval must be done without a UEB hash).
4.  Hash the URI file contents and verify that the result matches the alias.
5.  Retrieve and return the resource referenced by the target URI.

The same approach can be used to allow user-selected aliases, though there 
would be essentially no security against substitution, and alias creation 
could fail if the chosen name is already taken.  The approach would be to use 
the hash of the user-selected alias as the encryption key of the URI file.
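
A sketch of the user-selected variant, with the same caveats as before 
(plain SHA-256, a dict as the grid, no encryption, and names that are mine 
rather than anything in Tahoe).  Note that nothing ties the alias to the 
URI, so there is no verification step on lookup.

```python
import hashlib


class AliasTakenError(Exception):
    """Raised when a user-selected alias already points elsewhere."""


def _storage_id(alias):
    # alias -> encryption key -> storage ID (assumption: plain SHA-256)
    key = hashlib.sha256(alias.encode("utf-8")).digest()
    return hashlib.sha256(key).digest()


def store_user_alias(grid, alias, target_uri):
    """Store target_uri under a user-chosen alias, or fail if taken."""
    sid = _storage_id(alias)
    existing = grid.get(sid)
    if existing is not None and existing != target_uri:
        raise AliasTakenError(alias)
    grid[sid] = target_uri


def lookup_user_alias(grid, alias):
    """Return the stored URI; it cannot be checked against the alias."""
    return grid[_storage_id(alias)]
```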

Comments?

	Shawn.