[tahoe-dev] server selection

Zooko O'Whielacronx zookog at gmail.com
Wed Apr 22 21:12:21 UTC 2009


Yeah, I don't know if we're on the same page yet, Troy.

One problem is: how does a client get connections to some servers?
The current behavior in Tahoe v1.4 is: there is a single centralized
"introducer" node which has no authority over anything except the
simple job of giving every client the contact info for every server.
Since it tells all clients about all servers, there is no
convenient way to have one client know about a different set of
servers than another client knows about.  Well, I guess you could run
two introducers for that, and each server would have to run two
instances, one announcing itself to each introducer, or something.
But basically, the only thing that anybody currently does with Tahoe
is to have a full "biclique" [1], where for a given Tahoe grid, every
client that uses that grid has a connection to every server that makes
up that grid.  (Except of course that some clients are always
transiently or even persistently unable to connect to some servers
because of the vagaries of networking, which is another wrinkle that
you have to be aware of.)
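
To make the "biclique" point concrete, here is a toy model of the
introducer behavior described above.  It is only a sketch: the
Introducer class and its announce/subscribe/is_biclique methods are
hypothetical names made up for illustration, not Tahoe's actual code,
and it ignores the networking failures mentioned in the parenthetical.

```python
class Introducer:
    """Toy model (hypothetical, not Tahoe's API): tells every client
    about every server and has no other authority."""

    def __init__(self):
        self.servers = set()   # contact info of all announced servers
        self.clients = {}      # client name -> set of servers it knows

    def announce(self, server):
        # A server announces itself; every subscribed client learns of it.
        self.servers.add(server)
        for known in self.clients.values():
            known.add(server)

    def subscribe(self, client):
        # A new client is handed the full current server list.
        self.clients[client] = set(self.servers)

    def is_biclique(self):
        # True when every client knows about every server.
        return all(known == self.servers for known in self.clients.values())


intro = Introducer()
intro.announce("server-1")
intro.subscribe("client-a")
intro.announce("server-2")
intro.subscribe("client-b")
```

With a single introducer like this, there is no call that would give
client-a a different server set than client-b, which is the limitation
described above.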

But this is not the problem we are discussing here.  The problem we
are discussing here is: given that a client has connections to a whole
bunch of servers, and now it wants to upload some shares, M different
shares, to servers, which servers should it upload to?  This is the
problem that a lot of people have different answers to, including
"make sure we get enough of the fastest servers", "make sure we get
enough servers hiding in the Tor network", "make sure we evenly
distribute shares around our different clusters/colos/continents",
"make sure the servers with the most available space get the most
shares", etc.
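
One way to picture the different answers is as interchangeable ranking
policies over the servers a client is already connected to.  The
sketch below is an illustration under stated assumptions, not how
Tahoe v1.4 selects servers: the Server tuple, its fields (latency_ms,
free_bytes, site), the policy functions, and place_shares are all
hypothetical names invented here.

```python
from collections import namedtuple

# Hypothetical description of a connected server (fields are assumptions).
Server = namedtuple("Server", "name latency_ms free_bytes site")


def fastest_first(servers):
    # "make sure we get enough of the fastest servers"
    return sorted(servers, key=lambda s: s.latency_ms)


def most_space_first(servers):
    # "make sure the servers with the most available space get the most shares"
    return sorted(servers, key=lambda s: -s.free_bytes)


def spread_across_sites(servers):
    # "evenly distribute shares around our different clusters/colos/continents":
    # take one server from each site in turn (round-robin over sites).
    by_site = {}
    for s in servers:
        by_site.setdefault(s.site, []).append(s)
    queues = [by_site[site] for site in sorted(by_site)]
    order = []
    while any(queues):
        for q in queues:
            if q:
                order.append(q.pop(0))
    return order


def place_shares(servers, m, policy):
    # Assign share numbers 0..m-1 to servers in the order the policy prefers,
    # wrapping around if there are more shares than servers.
    ranked = policy(list(servers))
    if not ranked:
        raise ValueError("no servers connected")
    return {share: ranked[share % len(ranked)].name for share in range(m)}


servers = [
    Server("a", 10, 100, "us"),
    Server("b", 50, 900, "us"),
    Server("c", 30, 500, "eu"),
]
```

Calling place_shares(servers, 3, fastest_first) and
place_shares(servers, 3, most_space_first) on the same connected set
gives different placements, which is the point: the connections are
the same, only the policy differs.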

Regards,

Zooko

[1] http://en.wikipedia.org/wiki/Complete_bipartite_graph


