[tahoe-dev] darcs patch: Allow client-specified keys

Shawn Willden shawn-tahoe at willden.org
Wed Feb 11 16:11:49 UTC 2009

On Monday 09 February 2009 03:53:19 pm Brian Warner wrote:
> Cool! Some thoughts:

Thanks for the feedback -- and for the help on IRC in understanding the less 
transparent aspects of the codebase.  You guys definitely do a great job of 
helping new contributors get up to speed.

I think the attached patch addresses all of your points.  Let me know if I 
missed something, or if you think something else is needed.
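
For anyone who wants to try the new argument from a script, here is a minimal
sketch of building the "key=encoding-value" query argument described in the
webapi.txt changes.  It is written in present-day Python 3 and is not part of
the patch.  The helper names key_query_arg and derive_key are hypothetical, and
hashing with SHA-256 then truncating to 16 bytes is only one assumed way to
follow the advice about mixing the FEC parameters into the key.

```python
import base64
import hashlib


def derive_key(file_secret, k, n, segsize):
    """Hypothetical helper: mix a per-file secret with the FEC parameters.

    file_secret must be bytes and must differ for every file; the FEC
    parameters (k, N, segsize) are hashed in so that re-encoding the same
    file with different parameters yields a different key.
    """
    params = ("|%d|%d|%d" % (k, n, segsize)).encode("ascii")
    # Assumption: SHA-256 truncated to 128 bits; the patch does not
    # prescribe any particular derivation.
    return hashlib.sha256(file_secret + params).digest()[:16]


def key_query_arg(key, encoding="base32"):
    """Format a 16-byte key as the 'key=encoding-value' query argument."""
    if len(key) != 16:
        raise ValueError("key must be exactly 16 bytes")
    if encoding in ("hex", "base16"):
        value = base64.b16encode(key).decode("ascii")
    elif encoding == "base32":
        # Lowercase, unpadded base32, matching the example in webapi.txt.
        value = base64.b32encode(key).decode("ascii").rstrip("=").lower()
    else:
        raise ValueError("unknown encoding: %r" % (encoding,))
    return "key=%s-%s" % (encoding, value)


# The key used in the documentation example.
example_key = bytes.fromhex("B6F39C58C25A501B6FDF4AF94E07BB5D")
print(key_query_arg(example_key, "hex"))
print(key_query_arg(example_key, "base32"))
```

The resulting string would be appended to the upload URL, for example
"PUT /uri?" followed by the output of key_query_arg().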

BTW:  Here's an odd thing - I tried twice to submit this patch via "darcs 
send".  Both times, darcs reported success, but the message never showed up 
in the mailing list.  But my first post in this thread was made that same 

-------------- next part --------------
Wed Feb 11 09:08:49 MST 2009  Shawn Willden <shawn-tahoe at willden.org>
  * Allow client-specified keys
  Adds support for client-specified keys for immutable files.  Clients
  using either web or internal APIs can specify their own keys.

New patches:

[Allow client-specified keys
Shawn Willden <shawn-tahoe at willden.org>**20090211160849
 Adds support for client-specified keys for immutable files.  Clients
 using either web or internal APIs can specify their own keys.
] {
hunk ./docs/frontends/webapi.txt 330
  mutable file, and return its write-cap in the HTTP respose. The default is
  to create an immutable file, returning the read-cap as a response.
+ To use a randomly-generated key as the encryption key for the file, add
+ the query argument "key=random".
+ To specify a key, add a query argument of the form "key=encoding-value",
+ where 'encoding' is one of 'hex', 'base16', or 'base32' ('hex' and 'base16'
+ are synonyms) and 'value' is the 128-bit key in that encoding.  For example,
+ the following are all equivalent, and permitted:
+ 	  key=hex-B6F39C58C25A501B6FDF4AF94E07BB5D
+ 	  key=base16-B6F39C58C25A501B6FDF4AF94E07BB5D
+	  key=base32-w3zzywgcljibw367jl4u4b53lu
+ Be VERY careful that you know what you're doing if you use this feature.
+ Choosing bad keys could compromise the security of your files.  Also, make
+ sure that every file is encrypted with a unique key because uploading
+ different files with the same key will result in a storage index collision.
+ Even uploading the same file encoded with different FEC parameters will cause
+ a collision if you use the same key.  It's a good idea to hash the FEC
+ parameters (k, N, segsize) into your key to be sure that doesn't happen.
+ Normally, you should omit the 'key' argument and let Tahoe construct a
+ content hash key (CHK), which is secure and unique, and which makes your
+ uploads idempotent.
 === Creating A New Directory ===
 POST /uri?t=mkdir
hunk ./docs/frontends/webapi.txt 654
  the upload results page. The default is to create an immutable file,
  returning the upload results page as a response.
+ To use a randomly-generated key as the encryption key for the file, add
+ the argument "key=random".
+ To specify a key, add an argument of the form "key=encoding-value", where
+ 'encoding' is one of 'hex', 'base16', or 'base32' ('hex' and 'base16' are
+ synonyms) and 'value' is the 128-bit key in that encoding.  For example, the
+ following are all equivalent, and permitted:
+ 	  key=hex-B6F39C58C25A501B6FDF4AF94E07BB5D
+ 	  key=base16-B6F39C58C25A501B6FDF4AF94E07BB5D
+	  key=base32-w3zzywgcljibw367jl4u4b53lu
+ Be VERY careful that you know what you're doing if you use this feature.
+ Choosing bad keys could compromise the security of your files.  Also, make
+ sure that every file is encrypted with a unique key because uploading
+ different files with the same key will result in a storage index collision.
+ Even uploading the same file encoded with different FEC parameters will cause
+ a collision if you use the same key.  It's a good idea to hash the FEC
+ parameters (k, N, segsize) into your key to be sure that doesn't happen.
+ Normally, you should omit the 'key' argument and let Tahoe construct a
+ content hash key (CHK), which is secure and unique, and which makes your
+ uploads idempotent.
 POST /uri/$DIRCAP/[SUBDIRS../]?t=upload
hunk ./src/allmydata/immutable/upload.py 1101
 class FileHandle(BaseUploadable):
-    def __init__(self, filehandle, convergence):
+    def __init__(self, filehandle, convergence, key=None):
         Upload the data from the filehandle.  If convergence is None then a
         random encryption key will be used, else the plaintext will be hashed,
hunk ./src/allmydata/immutable/upload.py 1113
         self._key = None
         self.convergence = convergence
         self._size = None
+        self.chosen_key = key
+        if key is not None:
+            assert convergence is None # Can't specify both key and convergence
+            assert isinstance(key, str) and len(key) == 16
     def _get_encryption_key_convergent(self):
         if self._key is not None:
hunk ./src/allmydata/immutable/upload.py 1161
     def get_encryption_key(self):
         if self.convergence is not None:
             return self._get_encryption_key_convergent()
+        elif self.chosen_key is not None:
+            return defer.succeed(self.chosen_key)
         else:
             return self._get_encryption_key_random()
hunk ./src/allmydata/immutable/upload.py 1183
 class FileName(FileHandle):
-    def __init__(self, filename, convergence):
+    def __init__(self, filename, convergence, key=None):
         Upload the data from the filename.  If convergence is None then a
         random encryption key will be used, else the plaintext will be hashed,
hunk ./src/allmydata/immutable/upload.py 1191
         "convergence" argument to form the encryption key.
         assert convergence is None or isinstance(convergence, str), (convergence, type(convergence))
-        FileHandle.__init__(self, open(filename, "rb"), convergence=convergence)
+        FileHandle.__init__(self, open(filename, "rb"), convergence=convergence, key=key)
     def close(self):
hunk ./src/allmydata/immutable/upload.py 1197
 class Data(FileHandle):
-    def __init__(self, data, convergence):
+    def __init__(self, data, convergence, key=None):
         Upload the data from the data argument.  If convergence is None then a
         random encryption key will be used, else the plaintext will be hashed,
hunk ./src/allmydata/immutable/upload.py 1205
         "convergence" argument to form the encryption key.
         assert convergence is None or isinstance(convergence, str), (convergence, type(convergence))
-        FileHandle.__init__(self, StringIO(data), convergence=convergence)
+        FileHandle.__init__(self, StringIO(data), convergence=convergence, key=key)
 class Uploader(service.MultiService, log.PrefixingLogMixin):
     """I am a service that allows file uploading. I am a service-child of the
hunk ./src/allmydata/test/test_system.py 1
-from base64 import b32encode
+from base64 import b32encode, b16encode
 import os, sys, time, re, simplejson, urllib
 from cStringIO import StringIO
 from zope.interface import implements
hunk ./src/allmydata/test/test_system.py 1183
         d.addCallback(lambda res: self.GET(public + "/subdir3/new.txt"))
         d.addCallback(self.failUnlessEqual, "NEWER contents")
+        # test unlinked PUT with specified key
+        key = 'd'*16
+        d.addCallback(lambda res: self.PUT("uri?key=hex-" + b16encode(key),
+                                           "data" * 100))
+        def _check_specified_key_uri(res):
+            u = uri.from_string_filenode(res)
+            self.failUnlessEqual(u.key, key)
+            return res
+        d.addCallback(_check_specified_key_uri)
+        # test unlinked PUT with content hash key
+        d.addCallback(lambda res: self.PUT("uri", "data" * 100))
+        def _check_CHK_key_uri(res):
+            u = uri.from_string_filenode(res)
+            self.failIfEqual(u.key, key)
+            return res
+        d.addCallback(_check_CHK_key_uri)
         # test unlinked POST
         d.addCallback(lambda res: self.POST("uri", t="upload",
                                             file=("new.txt", "data" * 10000)))
hunk ./src/allmydata/test/test_system.py 1305
         d.addCallback(lambda res: self.GET("statistics?t=json"))
         def _got_stats_json(res):
             data = simplejson.loads(res)
-            self.failUnlessEqual(data["counters"]["uploader.files_uploaded"], 5)
+            self.failUnlessEqual(data["counters"]["uploader.files_uploaded"], 7)
             self.failUnlessEqual(data["stats"]["chk_upload_helper.upload_need_upload"], 1)
hunk ./src/allmydata/test/test_upload.py 33
         self.failUnlessEqual(s, expected)
     def test_filehandle_random_key(self):
-        return self._test_filehandle(convergence=None)
+        return self._test_filehandle(convergence=None, key=None)
+    def test_filehandle_specified_key(self):
+        return self._test_filehandle(convergence=None, key='a'*16)
     def test_filehandle_convergent_encryption(self):
hunk ./src/allmydata/test/test_upload.py 39
-        return self._test_filehandle(convergence="some convergence string")
+        return self._test_filehandle(convergence="some convergence string", key=None)
hunk ./src/allmydata/test/test_upload.py 41
-    def _test_filehandle(self, convergence):
+    def _test_filehandle(self, convergence, key):
         s = StringIO("a"*41)
hunk ./src/allmydata/test/test_upload.py 43
-        u = upload.FileHandle(s, convergence=convergence)
+        u = upload.FileHandle(s, convergence=convergence, key=key)
         d = u.get_size()
         d.addCallback(self.failUnlessEqual, 41)
         d.addCallback(lambda res: u.read(1))
hunk ./src/allmydata/test/test_upload.py 220
-def upload_data(uploader, data):
-    u = upload.Data(data, convergence=None)
+def upload_data(uploader, data, key=None):
+    u = upload.Data(data, convergence=None, key=key)
     return uploader.upload(u)
 def upload_filename(uploader, filename):
     u = upload.FileName(filename, convergence=None)
hunk ./src/allmydata/test/test_upload.py 259
         self.failUnlessEqual(len(u.key), 16)
         self.failUnlessEqual(u.size, size)
+    def _check_provided_key(self, newuri, size):
+        self._check_large(newuri, size)
+        u = IFileURI(newuri)
+        self.failUnlessEqual(u.key, 'b'*16)
     def get_data(self, size):
         return DATA[:size]
hunk ./src/allmydata/test/test_upload.py 309
         d.addCallback(self._check_large, SIZE_LARGE)
         return d
+    def test_specified_key(self):
+        data = self.get_data(SIZE_LARGE)
+        d = upload_data(self.u, data, 'b'*16)
+        d.addCallback(extract_uri)
+        d.addCallback(self._check_provided_key, SIZE_LARGE)
+        return d
     def test_data_large_odd_segments(self):
         data = self.get_data(SIZE_LARGE)
         segsize = int(SIZE_LARGE / 2.5)
hunk ./src/allmydata/test/test_upload.py 581
         eu = upload.EncryptAnUploadable(u)
         d1salt1a = eu.get_storage_index()
+        # and if we specify a custom encryption key it should be different again
+        key = '\x01' * 16
+        u = upload.Data(DATA, convergence=None, key=key)
+        eu = upload.EncryptAnUploadable(u)
+        k1 = eu.get_storage_index()
         # and if we change the encoding parameters, it should be different (from the same convergence string with different encoding parameters)
         u = upload.Data(DATA, convergence="")
         u.encoding_param_k = u.default_encoding_param_k + 1
hunk ./src/allmydata/test/test_upload.py 602
         eu = upload.EncryptAnUploadable(u)
         d4 = eu.get_storage_index()
-        d = DeferredListShouldSucceed([d1,d1a,d1salt1,d1salt2,d1salt1a,d2,d3,d4])
+        d = DeferredListShouldSucceed([d1,d1a,d1salt1,d1salt2,d1salt1a,k1,d2,d3,d4])
         def _done(res):
hunk ./src/allmydata/test/test_upload.py 604
-            si1, si1a, si1salt1, si1salt2, si1salt1a, si2, si3, si4 = res
+            si1, si1a, si1salt1, si1salt2, si1salt1a, sik1, si2, si3, si4 = res
             self.failUnlessEqual(si1, si1a)
             self.failIfEqual(si1, si2)
             self.failIfEqual(si1, si3)
hunk ./src/allmydata/test/test_upload.py 614
             self.failIfEqual(si1salt1, si1salt2)
             self.failIfEqual(si1salt2, si1)
             self.failUnlessEqual(si1salt1, si1salt1a)
+            self.failIfEqual(sik1, si1)
+            self.failIfEqual(sik1, si1a)
         return d
hunk ./src/allmydata/test/test_web.py 1
-import os.path, re, urllib
+import os.path, re, urllib, base64
 import simplejson
 from twisted.application import service
 from twisted.trial import unittest
hunk ./src/allmydata/test/test_web.py 733
         return d
+    def PUT_URI_specified_key(self, key, encoding, encoder, data):
+        return self.PUT("/uri?key=" + encoding + '-' + encoder(key), data)
+    def test_PUT_URI_random_key(self):
+        d = self.PUT("/uri?key=random", self.NEWFILE_CONTENTS)
+        return d
+    def test_PUT_URI_specified_key_hex(self):
+        return self.PUT_URI_specified_key('0'*16, 'hex', base64.b16encode,
+                                          self.NEWFILE_CONTENTS)
+    def test_PUT_URI_specified_key_base16(self):
+        return self.PUT_URI_specified_key('1'*16, 'base16', base64.b16encode,
+                                          self.NEWFILE_CONTENTS)
+    def test_PUT_URI_specified_key_base32(self):
+        return self.PUT_URI_specified_key('2'*16, 'base32', base32.b2a,
+                                          self.NEWFILE_CONTENTS)
+    def test_PUT_URI_specified_key_invalid_format(self):
+        key_str = base32.b2a('3'*16)
+        d = self.PUT("/uri?key=" + key_str, self.NEWFILE_CONTENTS)
+        return self.failUnlessFailure(d, error.Error)
+    def test_PUT_URI_specified_key_incorrect_encoding(self):
+        d = self.PUT_URI_specified_key('4'*16, 'hex', base32.b2a,
+                                       self.NEWFILE_CONTENTS)
+        return self.failUnlessFailure(d, error.Error)
+    def test_PUT_URI_specified_key_incorrect_length(self):
+        d = self.PUT_URI_specified_key('5'*16, 'base32', base64.b16encode,
+                                       self.NEWFILE_CONTENTS)
+        return self.failUnlessFailure(d, error.Error)
+    def test_PUT_NEWFILEURL_specified_key(self):
+        key = '6' * 16
+        key_str = 'base32-'+base32.b2a(key)
+        d = self.PUT(self.public_url + "/foo/new.txt?key=" + key_str,
+                     self.NEWFILE_CONTENTS)
+        # TODO: we lose the response code, so we can't check this
+        #self.failUnlessEqual(responsecode, 201)
+        d.addCallback(self.failUnlessURIMatchesChild, self._foo_node, u"new.txt")
+        d.addCallback(lambda res:
+                      self.failUnlessChildContentsAre(self._foo_node, u"new.txt",
+                                                      self.NEWFILE_CONTENTS))
+        return d
     def test_PUT_NEWFILEURL_range_bad(self):
         headers = {"content-range": "bytes 1-10/%d" % len(self.NEWFILE_CONTENTS)}
         target = self.public_url + "/foo/new.txt"
hunk ./src/allmydata/test/test_web.py 1294
         return d
+    def test_POST_upload_specified_key(self):
+        key = '\x27' * 16
+        key_str = 'base32-' + base32.b2a(key)
+        d = self.POST(self.public_url + "/foo", t="upload",
+                      file=("new.txt", self.NEWFILE_CONTENTS),
+                      key=key_str)
+        fn = self._foo_node
+        d.addCallback(self.failUnlessURIMatchesChild, fn, u"new.txt")
+        d.addCallback(lambda res:
+                      self.failUnlessChildContentsAre(fn, u"new.txt",
+                                                      self.NEWFILE_CONTENTS))
+        return d
     def test_POST_upload_unicode(self):
         filename = u"n\u00e9wer.txt" # n e-acute w e r . t x t
         d = self.POST(self.public_url + "/foo", t="upload",
hunk ./src/allmydata/web/common.py 9
 from nevow.util import resource_filename
 from allmydata.interfaces import ExistingChildError, NoSuchChildError, \
      FileTooLargeError, NotEnoughSharesError
+from allmydata.util import base32
+import base64
 class IClient(Interface):
hunk ./src/allmydata/web/common.py 51
         return results[0]
     return default
+def get_key_arg(ctx_or_req):
+    """
+    Extract the 'key' argument from the query args.  If not found,
+    return None.  If the argument is "random", return "random".
+    Otherwise, the argument should be of the form "encoding-value",
+    where encoding is one of 'hex', 'base16', or 'base32'.  Parse it
+    and return the value as a binary string, which must be 16 bytes in
+    length.
+    """
+    req = IRequest(ctx_or_req)
+    key_str = get_arg(req, "key", "").strip()
+    if key_str == "":
+        return None
+    elif key_str == "random":
+        return key_str
+    try:
+        encoding, value = key_str.split('-', 1)
+        if encoding == 'base32':
+            key = base32.a2b(value)
+        elif encoding in ('hex', 'base16'):
+            key = base64.b16decode(value)
+        else:
+            raise WebError('Unknown key format ' + encoding)
+    except WebError:
+        # let the more specific "unknown format" error through
+        raise
+    except Exception:
+        # missing '-' separator or a value that does not decode
+        raise WebError('Invalid key format')
+    if len(key) != 16:
+        raise WebError("Key must be 16 bytes in length")
+    return key
 def abbreviate_time(data):
     # 1.23s, 790ms, 132us
     if data is None:
hunk ./src/allmydata/web/unlinked.py 8
 from nevow import rend, url, tags as T
 from nevow.inevow import IRequest
 from allmydata.immutable.upload import FileHandle
-from allmydata.web.common import IClient, getxmlfile, get_arg, boolean_of_arg
+from allmydata.web.common import IClient, getxmlfile, get_arg, boolean_of_arg, WebError, get_key_arg
 from allmydata.web import status
 def PUTUnlinkedCHK(ctx):
hunk ./src/allmydata/web/unlinked.py 15
     req = IRequest(ctx)
     # "PUT /uri", to create an unlinked file.
     client = IClient(ctx)
-    uploadable = FileHandle(req.content, client.convergence)
+    key = get_key_arg(req)
+    if key is not None:
+        convergence = None
+        if key == "random":
+            key = None
+    else:
+        convergence = client.convergence
+    uploadable = FileHandle(req.content, convergence=convergence, key=key)
     d = client.upload(uploadable)
     d.addCallback(lambda results: results.uri)
     # that fires with the URI of the new file
hunk ./src/allmydata/web/unlinked.py 51
     req = IRequest(ctx)
     client = IClient(ctx)
     fileobj = req.fields["file"].file
-    uploadable = FileHandle(fileobj, client.convergence)
+    key = get_key_arg(req)
+    if key is not None:
+        convergence = None
+        if key == "random":
+            key = None
+    else:
+        convergence = client.convergence
+    uploadable = FileHandle(fileobj, convergence=convergence, key=key)
     d = client.upload(uploadable)
     when_done = get_arg(req, "when_done", None)
     if when_done:


warner at allmydata.com**20081202014946] 
[#542 'tahoe create-key-generator': fix the .tac file this creates to be compatible with modern code, add a test
warner at allmydata.com**20081201234721] 
