[tahoe-dev] [tahoe-lafs] #778: "shares of happiness" is the wrong measure; "servers of happiness" is better

tahoe-lafs trac at allmydata.org
Wed Nov 4 12:08:51 UTC 2009

#778: "shares of happiness" is the wrong measure; "servers of happiness" is better
 Reporter:  zooko               |           Owner:  kevan
     Type:  defect              |          Status:  new  
 Priority:  critical            |       Milestone:  1.6.0
Component:  code-peerselection  |         Version:  1.4.1
 Keywords:  reliability         |   Launchpad_bug:       

Comment(by kevan):

 Yes, that's right -- I misread your comment, and the test you suggest does
 indeed fail (I'll update tests.txt with the new test).

 I think that the general solution I propose in comment:55 (but didn't use
 because, given my view of comment:53 at the time, the one I ended up
 implementing seemed easier and just as effective) would still work for
 that issue -- upon seeing that there are no homeless shares, but a too-
 small {{{S}}}, the Tahoe2PeerSelector would check to see if there are more
 uncontacted peers than {{{servers_of_happiness - S}}}, and, if so, stick
 {{{servers_of_happiness - S}}} shares back into homeless shares for the
 algorithm to try to distribute.
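
 A rough sketch of that idea (hypothetical code, not the actual
 Tahoe2PeerSelector implementation; the function name and data shapes are
 made up for illustration) might look like:

 {{{
#!python
# Hypothetical sketch of the comment:55 idea: if every share has a home
# but the number of distinct servers S is below servers_of_happiness,
# re-queue some shares so the selection loop can try to spread them
# onto uncontacted peers.

def requeue_for_happiness(homeless_shares, allocated, uncontacted_peers,
                          servers_of_happiness):
    """allocated maps server -> set of share numbers it holds."""
    s = len(allocated)  # S: distinct servers currently holding shares
    deficit = servers_of_happiness - s
    if not homeless_shares and deficit > 0 and \
       len(uncontacted_peers) >= deficit:
        # Pull `deficit` shares off the most loaded server and mark
        # them homeless again, so the algorithm tries them elsewhere.
        donor = max(allocated, key=lambda srv: len(allocated[srv]))
        for _ in range(deficit):
            if allocated[donor]:
                homeless_shares.append(allocated[donor].pop())
    return homeless_shares
 }}}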

 This is (at best) likely to be inefficient, though. If the
 Tahoe2PeerSelector has seen (as an example) {{{s_4}}} with all the shares,
 and there are {{{s_1, s_2, s_3}}} with {{{f_1, f_2, f_3}}} respectively,
 we'd want to take advantage of the existing shares on {{{s_1, s_2, s_3}}}
 and not allocate different shares to them. In this scenario, though,
 Tahoe2PeerSelector doesn't know anything about these servers other than
 the fact that it hasn't contacted them, so it can't do much more than
 guess. If the servers accept the shares, though, the upload will work as
 it should.

 In the worst case -- that of full {{{s_1, s_2, s_3}}} -- this will fail,
 too, because unless we happen to choose the right shares from those
 already allocated to {{{s_4}}} to put back into homeless shares, we'll end
 up with homeless shares. But I'm forgetting that the peer selection
 process has multiple phases. What I think will happen in that case is that
 when the Tahoe2PeerSelector attempts to store shares on those servers,
 it'll fail, but it will also discover that those servers have {{{f_1, f_2,
 f_3}}}, and record those in its mapping of already allocated shares. It
 will then ask {{{s_4}}} to store the shares that it wasn't able to store
 on {{{s_1, s_2, s_3}}} -- shares which {{{s_4}}} already has. So these
 shares will be removed from the list of homeless shares, the list of
 homeless shares will be empty, and the {{{servers_of_happiness}}} check
 should succeed.
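
 Put another way, the final check (a minimal sketch under my assumptions
 about the mappings involved, not the shipped code) only needs two things
 to hold:

 {{{
#!python
# Assumed shape of the final servers_of_happiness check: no share may be
# homeless, and the shares must have landed on enough distinct servers.

def happiness_check(allocated, homeless_shares, servers_of_happiness):
    """allocated maps server -> set of share numbers it holds."""
    servers_with_shares = sum(1 for shares in allocated.values()
                              if shares)
    return (not homeless_shares and
            servers_with_shares >= servers_of_happiness)
 }}}

 In the worst-case walkthrough above, {{{s_1, s_2, s_3}}} end up recorded
 as holding {{{f_1, f_2, f_3}}} and {{{s_4}}} holds the rest, so four
 servers have shares, the homeless list is empty, and the check passes.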

 Does that seem right?

 I think comment:53 generalizes to any instance where the first server (or
 really the first n servers, where n is less than servers_of_happiness)
 happens to store all of the shares associated with a storage index. To that
 end, I'm also adding a test where {{{s_1, s_2, s_3}}} are empty. I'm also
 adding a test for the "worst case" described above.

Ticket URL: <http://allmydata.org/trac/tahoe/ticket/778#comment:71>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

More information about the tahoe-dev mailing list