[tahoe-dev] P2P file-sharing protocol ideas

Tue Mar 24 15:19:03 UTC 2009

Following a brief discussion with Zooko in IRC, he suggested I post a
few of my ideas regarding a new, secure, p2p protocol here in order to
see what people think and to see what could be learned from Tahoe as
well as what Tahoe could gain (if anything). It's important to note that
my ideas aren't particularly fleshed out, and many of my solutions have
problems themselves which are in need of further solutions. I also hope
to be able to apply lessons learned through the likes of BitTorrent,
Gnutella/FastTrack, Freenet, and OneSwarm. If you have any comments,
opinions, whatever, please share them.

Why "Yet Another P2P File-Sharing Protocol (TM)"?

Since this is the question that pops into most of our minds when we see
projects or proposals such as this, it only makes sense for me to begin
by *attempting* to provide an answer. Most current p2p protocols weren't
designed with security in mind, except for a few such as Freenet, Wuala
(the apparent commercial clone of Freenet), GNUnet, and OneSwarm. All of
these have caveats which deter
*my* use of them:

Freenet functions in a similar fashion to Tahoe. Nodes store encrypted
portions of files distributed throughout the network. The caveat here is
that I do not know what is stored on my node, and I have absolutely no
control over this content. I haven't been following the project
particularly closely, so I am unaware if Freenet's security has improved
since Dhamija's analysis of it [1].

My only problem with GNUnet is its lack of usability (my opinion) and
apparent complexity (about 100,000 lines of C, 5 times more than
libtorrent). Provided ACTIVE_MIGRATION is set to YES, it can
behave in a manner similar to Freenet, caching encrypted DBlocks on your
machine to help propagate them through the network.

OneSwarm is a project by the University of Washington to create a
friend-to-friend (F2F) darknet, utilizing existing social networks (such
as GTalk) for bootstrapping. This introduces some new and interesting
concepts, but while it is backwards compatible with BitTorrent, OneSwarm
was most certainly not designed to replace it.

Overview

Everything I have come up with so far has been based on the operation of
private BitTorrent trackers. That is, there's a web site which acts as a
gateway, whereby users can download metadata files which contain various
information about the files they wish to obtain (hashes of chunks, a
hash of the completed file, directory structure, etc). Peers use a
client to parse these files and download the content from the network.
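
As a rough illustration of the kind of metadata file described above, here
is a minimal Python sketch. The field names, the JSON-friendly layout, the
use of SHA-256, and the 256 KiB chunk size are all assumptions made for the
example; nothing in this proposal fixes them yet.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # assumed chunk size; the proposal does not specify one


def build_metadata(name: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Describe a file by its per-chunk hashes and a hash of the whole file."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return {
        "name": name,
        "size": len(data),
        "chunk_size": chunk_size,
        "chunk_hashes": [hashlib.sha256(c).hexdigest() for c in chunks],
        "file_hash": hashlib.sha256(data).hexdigest(),
    }


def verify_chunk(meta: dict, index: int, chunk: bytes) -> bool:
    """Check a downloaded chunk against the hash listed in the metadata."""
    return hashlib.sha256(chunk).hexdigest() == meta["chunk_hashes"][index]
```

A client would fetch such a file from the gateway site, request chunks from
peers, and call something like `verify_chunk` on each one before accepting
it.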

Security Goals and Methods

    * Confidentiality (encryption)
    * Authentication (PKC)
    * Plausible deniability (malleability)
    * Anonymity (routing?)

Confidentiality is provided in the sense that, if Alice and Bob are
exchanging messages, an eavesdropper, Eve, is incapable of understanding
their conversation. This is the textbook situation for the use of
encryption. Various other techniques may be applied to strengthen the
encryption, such as an AONT being performed on "chunks" (a part of the
message) prior to being transmitted, so that the entire "chunk" is
required before even Bob can discern that part of the message. GNUnet
also uses this term to cover techniques which obscure what kind of
messages are being sent, attempting to make all the traffic look "the
same" and generating traffic so it's impossible to tell which traffic is
genuine. The use of TLS/SSL may be worth investigating as a means of
bypassing some traffic shaping. It's unlikely ISPs would purposefully
slow down the TLS traffic that the information superhighway's shopping
malls depend on, but that still leaves encrypted traffic vulnerable to
traffic shaping that targets high-entropy data.
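
To make the all-or-nothing idea mentioned above a little more concrete,
here is a toy Python sketch of an OAEP-style package transform. The
construction (SHA-256 in counter mode as a mask generator, a random 32-byte
key hidden behind a hash of the masked body) is an assumption chosen for
illustration, not a reviewed design, and an AONT on its own is not
encryption: anyone holding the whole package can undo it.

```python
import hashlib
import os


def _mask(seed: bytes, length: int) -> bytes:
    """Expand seed into `length` pseudo-random bytes (SHA-256, counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def aont_package(chunk: bytes) -> bytes:
    """Transform a chunk so that no part of it is readable without all of it."""
    key = os.urandom(32)
    body = _xor(chunk, _mask(key, len(chunk)))
    # Hide the key behind a hash of the entire masked body.
    tail = _xor(key, hashlib.sha256(body).digest())
    return body + tail


def aont_unpackage(package: bytes) -> bytes:
    """Recover the chunk; this only works if every byte of the package arrived."""
    body, tail = package[:-32], package[-32:]
    key = _xor(tail, hashlib.sha256(body).digest())
    return _xor(body, _mask(key, len(body)))
```

Because the key can only be recovered from a hash over the full body, a
receiver missing even one byte of the package cannot unmask any of it.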

Authentication would be provided in two senses. In the first sense, I
mean the textbook definition of an authenticated exchange: If Alice
sends a message to Bob, Bob knows with overwhelming probability that the
message, and all subsequent messages, did come from Alice. In the second
sense, I refer to a subset of authentication: authorization. Bob also
knows that Alice is authorized to be on and use the network. Both of
these may be achieved through the use of public key cryptography. This
is where I've run into a couple of snags.

So far the plan for the network would be to have a couple of supernodes
which grant access to the network by signing a user-provided public key
with the network's private key. In the case of a private tracker-style
site, the supernode would be the site itself, or in the case of a F2F
network, the supernodes could be the two or three friends who initially
founded the network. The user's key pair can be generated by the
supernode or provided by the user; it doesn't matter. Every peer on the
network would verify that other peers they attempt to connect to are
authorized to be on said network prior to exchanging information with
them. This authorization is done by mutual verification: each peer
checks the supernode's digital signature on the other peer's public key
using the network's public key (which is provided to every peer on the
network). The only identifying piece of information for each peer is
their signed public key, which they can change and have the supernode
sign (using the network's private key) at any time, given that the
supernode allows them access.
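
The signing and mutual verification steps above can be sketched with
textbook RSA in plain Python (3.8 or later, for the modular inverse). This
is strictly a toy: the modulus is built from two well-known Mersenne
primes, there is no signature padding, and every name here
(`supernode_sign`, `is_authorized`, `mutual_check`) is hypothetical. A real
implementation would use a vetted signature scheme from an established
crypto library.

```python
import hashlib

# TOY parameters, for illustration only: the primes are public knowledge and
# the modulus is far too small for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                           # network public key (modulus)
E = 65537                           # network public key (exponent)
D = pow(E, -1, (P - 1) * (Q - 1))   # network private key, held by the supernode


def _digest(peer_public_key: bytes) -> int:
    """Hash the peer's public key down to an integer below N."""
    return int.from_bytes(hashlib.sha256(peer_public_key).digest(), "big") % N


def supernode_sign(peer_public_key: bytes) -> int:
    """Supernode side: sign a peer's public key with the network's private key."""
    return pow(_digest(peer_public_key), D, N)


def is_authorized(peer_public_key: bytes, signature: int) -> bool:
    """Peer side: check the supernode's signature with the network's public key."""
    return pow(signature, E, N) == _digest(peer_public_key)


def mutual_check(alice: tuple, bob: tuple) -> bool:
    """Both peers must present a validly signed key before exchanging data."""
    return is_authorized(*alice) and is_authorized(*bob)
```

Each peer is represented here as a `(public_key_bytes, signature)` pair.
Note that this check only shows that a key was signed by the supernode; it
does nothing to bind the key to the party actually on the other end of the
connection.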

The security-minded of you probably have already noticed two problems
with this scheme thus far. First, if Alice is initially connecting to
Bob and sends her signed public key, Mallory can intercept it and send
Bob her own signed public key instead. This is a classic
man-in-the-middle (MITM) attack. It requires Mallory to already be
authenticated to use
the network (i.e. Mallory is a traitor within the circle of trust), and
to know either what file Alice is trying to get from Bob, or what files
Bob is sharing, as the hash of the requested file (or perhaps chunk)
Alice/Mallory is trying to obtain from Bob will be used as a shared
secret later. The second problem: I haven't quite figured out a way to
ban a user from the network immediately under a scheme like the one
described above. The supernode can require the key
pair to expire in the near future (a month or two) when the user
presents the key pair for signing, or generate key pairs which expire in
the near future. Then to remove a user from the network, you simply
remove their capability to have their key signed by the network's
private key (i.e. ban them from the private tracker-style site), but
that would probably take a month or more depending on the network's
configuration. Certificate revocation lists could be used, but on large
networks I could see them growing immensely, being updated infrequently,
and just becoming a pain in the ass.
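
The expiry-based approach and its main drawback, the delay before a ban
takes effect, can be sketched as follows. The `SignedKey` record and its
fields are hypothetical, and the signature check is reduced to a boolean
stand-in so the example stays focused on expiry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SignedKey:
    public_key: bytes
    expires_at: datetime   # assumed to be covered by the supernode's signature
    signature_ok: bool     # stand-in for a real signature check


def is_accepted(key: SignedKey, now: datetime) -> bool:
    """Peers accept a key only if it is validly signed and not yet expired."""
    return key.signature_ok and now < key.expires_at


def access_after_ban(key: SignedKey, banned_at: datetime) -> timedelta:
    """How long a banned user can keep using the last key they had signed."""
    return max(key.expires_at - banned_at, timedelta(0))
```

With a 30-day lifetime, a user banned the day after renewing keeps access
for roughly 29 more days; shorter lifetimes shrink that window at the cost
of more frequent re-signing.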

My hope for plausible deniability would be to utilize it in the same
sense OTR [2] does. That is, we allow for malleability after every
session, such that any portion of an exchange between two peers could be
forged later. This would allow a peer to later deny ever having
exchanged a file (which would be more convincing if they changed their
key pair prior to doing so).
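
One way OTR gets this property is by authenticating messages with a
per-session MAC key and publishing that key once the session is over. The
sketch below shows only that idea; the `Session` class is hypothetical, and
the random key stands in for one that would really be derived from a key
exchange between the two peers.

```python
import hashlib
import hmac
import os


def tag(mac_key: bytes, message: bytes) -> bytes:
    """Authenticate a message with the session's shared MAC key."""
    return hmac.new(mac_key, message, hashlib.sha256).digest()


def verify(mac_key: bytes, message: bytes, t: bytes) -> bool:
    return hmac.compare_digest(tag(mac_key, message), t)


class Session:
    def __init__(self) -> None:
        # Assumption: in practice both peers derive this from a key exchange.
        self.mac_key = os.urandom(32)
        self.transcript = []

    def send(self, message: bytes) -> bytes:
        t = tag(self.mac_key, message)
        self.transcript.append((message, t))
        return t

    def close(self) -> bytes:
        """End the session and publish the MAC key so anyone can forge tags."""
        return self.mac_key
```

During the session only the two peers hold the key, so each can trust the
other's tags. Once `close()` publishes it, anyone can compute a valid tag
for any message, so a recorded transcript no longer proves who sent what.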

I'm still debating different techniques to achieve better anonymity. The
implementation would probably consist of routing messages through a
mesh, utilizing ordinary nodes as relays. If so, precautions would have
to be taken to prevent the network from being abused to DDoS specific
peers.

1. http://people.ischool.berkeley.edu/~rachna/courses/cs261/paper.html
2. http://www.cypherpunks.ca/otr/

Again, feel free to give your opinions, suggestions, ideas, etc.

Thanks for taking your time,
Tom