[tahoe-dev] Behavior when storage nodes go below shares.happy?

Sat Jul 10 00:58:03 UTC 2010

On Fri, 9 Jul 2010 14:08:19 -0600, "Zooko O'Whielacronx" <zooko at zooko.com>
wrote:
> QUESTION: is the current semantics of servers-of-happiness good for you?
> 
> Brian Warner expressed a lot of reservation about
> servers-of-happiness. I would be interested in all feedback, positive,
> negative, or neutral, about how servers-of-happiness works for people
> who use it.

Positive!  Love it.  With my configuration:
shares.needed = 2
shares.happy = 4
shares.total = 4
... I know that each node gets one share, so I know that I can lose *any*
two nodes and still have allmydata.  I don't have to worry about some
unlucky randomization putting too many shares on the same node, and being
vulnerable to data loss caused by a problem on that specific node.  Yay!
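
To illustrate the claim above, here is a minimal Python sketch that checks
every possible pair of failed nodes under this configuration. The node names
and the one-share-per-node placement are assumptions for illustration, not
output from a real grid.

```python
from itertools import combinations

NEEDED = 2
# Assumed placement: four hypothetical nodes, one distinct share each.
placement = {"node-a": {0}, "node-b": {1}, "node-c": {2}, "node-d": {3}}

def survives(placement, failed, needed):
    """True if the shares on the non-failed nodes are enough to recover."""
    remaining = set()
    for node, shares in placement.items():
        if node not in failed:
            remaining |= shares
    return len(remaining) >= needed

# Try every pair of simultaneously failed nodes.
all_pairs_ok = all(
    survives(placement, set(pair), NEEDED)
    for pair in combinations(placement, 2)
)
print(all_pairs_ok)
```

By contrast, if all four shares had landed on a single node, losing that one
node would already make the file unrecoverable.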

Ignoring the location awareness feature (which I think is important), I
can still imagine some improvement.

I'd like a configuration option to deliberately include or exclude the
local computer (or, more precisely, the uploading node, since one computer
may run multiple nodes) when considering happiness.

Beyond that, it seems to me that we have three tiers of "health" on a
grid:

  - shares.needed shares have been placed (anywhere)
    this is the minimum required to be able to retrieve my data

  - shares.happy has been satisfied
    this means I don't need to worry about losing my data

  - shares.total shares have been placed (anywhere)
    this means my data is optimally healthy and quickly accessible
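
The three tiers above could be sketched roughly as follows. This is only an
illustration: the Health enum, happiness() and classify() are hypothetical
names, not Tahoe-LAFS APIs. It assumes that happiness is measured as the size
of a maximum matching between servers and distinct shares, and that the tiers
are cumulative (each tier also requires the one before it).

```python
from enum import IntEnum

class Health(IntEnum):
    UNRECOVERABLE = 0  # fewer than shares.needed distinct shares exist
    RECOVERABLE = 1    # tier 1: shares.needed shares placed anywhere
    HAPPY = 2          # tier 2: shares.happy satisfied
    FULL = 3           # tier 3: shares.total shares placed as well

def happiness(placement):
    """Size of a maximum matching between servers and shares.

    placement maps a server name to the set of share numbers it holds.
    """
    match = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        # Look for an augmenting path starting at this server.
        for share in placement[server]:
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(1 for server in placement if try_assign(server, set()))

def classify(placement, needed, happy, total):
    """Return the highest tier reached (assumes cumulative tiers)."""
    distinct = set().union(*placement.values())
    if len(distinct) < needed:
        return Health.UNRECOVERABLE
    if happiness(placement) < happy:
        return Health.RECOVERABLE
    if len(distinct) < total:
        return Health.HAPPY
    return Health.FULL
```

For example, with needed=2, happy=4, total=4, one share on each of four
servers classifies as FULL, while all four shares on a single server only
reaches RECOVERABLE.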

I might want the ability to define what level of health I require for a
given operation.  Maybe I'd like an upload to succeed at the first tier,
but have a check-with-repair fail unless it can reach the second tier.
Those seem like sensible things that someone might want.  For example, if I
know one of my nodes is down temporarily, I might like for an upload to
succeed even though it can't reach shares.happy, and I'll leave it up to a
cron job to repair the file up to shares.happy a few days later when the
node is back online.
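
A per-operation health requirement like this could look roughly like the
sketch below. Everything here is hypothetical: the tier names, the REQUIRED
table and operation_succeeds() are invented for illustration and do not
correspond to any existing Tahoe-LAFS setting.

```python
# Tiers in increasing order of health (hypothetical names).
TIERS = ["unrecoverable", "recoverable", "happy", "full"]

# Hypothetical policy: minimum tier each operation must reach to succeed.
REQUIRED = {
    "upload": "recoverable",       # tier 1 is enough; repair later
    "check-and-repair": "happy",   # must reach shares.happy
}

def operation_succeeds(operation, achieved):
    """True if the achieved tier meets the operation's required tier."""
    return TIERS.index(achieved) >= TIERS.index(REQUIRED[operation])
```

Under this policy an upload that only reaches "recoverable" while a node is
down would succeed, and a later cron-driven check-and-repair would report
failure until the file reaches "happy".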

I'm not sure about the above; it's just something I was thinking about in
connection with that laptop unexpectedly going to sleep.

FYI, I won't have e-mail access over the weekend, so I won't be able to
participate in the weekend discussion about this.

-- 
Kyle Markley