[tahoe-dev] An idea for share placement

Mon Aug 10 22:02:42 UTC 2009

One issues with Tahoe is that the "permuted list" method of selecting servers 
on which to place shares does not scale well.

In a stable grid (such as allmydata.com) it works very well, because the 
servers selected during upload will almost certainly be around and have the 
share later.  But a grid that is growing rapidly creates a problem, in that 
many of the new peers will show up higher in the permuted list than the 
selected peers, so during share retrieval the client may have to query a 
large subset of the larger grid.

The pathological case is that of a large grid and a file with less than K 
shares remaining.  In that case, the client will query every server in the 
grid trying to find its shares.

In small grids, there's no problem, because querying every server isn't a big 
deal.  It's a scalability problem.

One solution that has been suggested is to give each server holding a share a 
list of the other servers holding shares.  That way, if a client can find 
just one share, it will know exactly where to look for the rest.  That's 
good, but it doesn't address the pathological case.  For a large grid, the 
client needs a way to know when to stop looking.

It occurs to me that perhaps we could add a little bit of information to the 
URI which tells the client how far down the list it must go before it should 
give up, and do this in a way that is stable in the presence of new nodes.  
That is, a way that ensures the client looks far enough, regardless of how 
many new nodes are added.

I'm sure there are many ways to accomplish this, but the one that comes to my 
mind is simple hamming distance.  If the peer selection list is sorted by 
hamming distance from the SID, and the hamming distance of the last selected 
peer is added to the URI, then the client node trying to retrieve the file 
will know that there is no point in asking any servers whose HASH(SID+peerid) 
is further from SID than that cutoff.

In a small grid, the hamming distance will typically be pretty large, but 
that's okay because querying every server in a small grid isn't a problem.  
For large grids, however, the hamming distance of the furthest selected peer 
will typically be small, resulting in the client querying only a small 
portion of the servers before determining that the shares are not available.  
For a grid that is small when a file is stored, but grows large when the file 
is retrieved, the client will still have to query most of the grid.

The obvious downside to this sort of search limitation approach is that if the 
shares are moved by a rebalancer, the distance in the URI may be invalidated.  
In the pathological case, perhaps all of the shares will be moved to servers 
that are outside the limit.  One solution would be to place a share location 
list on several servers with low distances.

	Shawn.