[tahoe-dev] Why no error message when I have only one storage node?

Fri Aug 7 01:29:06 UTC 2009

gc20090728 at metcarob.com wrote:

> Thanks for this. Since it's going to be registered as an issue I
> thought I would provide some reasons for this.
> + I am trying to protect my data from hard disk failure - having 10
>   files all on the same HD doesn't achieve this.
> ...

Thanks for the use-cases! One of the challenges we've had with this
choose-where-your-shares-go feature is how to control it, i.e. what sort
of language you might use in the tahoe.cfg file to explain what exactly
qualifies as "good enough".

The so-called "peer-selection" algorithm runs at the start of upload and
is responsible for deciding upon a home for each share. It walks the
permuted ring of servers, asking each one to accept a single share, and
considers its job done when all N=10 shares have a home. The current
implementation does not attempt to make sure that they all go to
different homes, but given the design, it will only loop around and put
more than one share per server if you don't have at least N servers with
space free.
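
To make that loop concrete, here is a rough sketch of the behavior
described above. It is not the actual Tahoe code: place_shares, the
"ring" list, and the per-server "capacity" counts are all names I made
up for illustration, and real servers answer over the network rather
than from a dict.

```python
from typing import Dict, List


def place_shares(ring: List[str], capacity: Dict[str, int], n: int = 10) -> Dict[int, str]:
    """Walk the ring, one share per server per pass, until n shares have homes.

    `ring` is the already-permuted server list; `capacity` is a hypothetical
    count of how many more shares each server is willing to accept.
    """
    remaining = dict(capacity)  # copy so the caller's dict is untouched
    homes: Dict[int, str] = {}
    while len(homes) < n:
        placed_this_pass = False
        for server in ring:
            if len(homes) == n:
                break
            if remaining.get(server, 0) > 0:
                share = len(homes)       # next unplaced share number
                homes[share] = server
                remaining[server] -= 1
                placed_this_pass = True
        if not placed_this_pass:
            # A full pass placed nothing: nobody has space left.
            raise RuntimeError("no server will accept another share")
    return homes
```

With ten or more willing servers, the first pass places everything and
each share gets its own home. With three servers the loop wraps around
and you end up with a 4/3/3 split, and with one server all ten shares
land in the same place, with no error raised.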

When we first wrote this part of the code, we were thinking of the
allmydata.com environment, in which there are always more than 10
servers, so this algorithm would always achieve full diversity (one
share per server). Having it work for (num_servers<N) is a fallback
case: uploads work, but diversity suffers. In considering whether
reduced diversity should be treated as an error, we decided to punt.
It's too hard to quantify this "diversity" parameter to everybody's
satisfaction (what if the two servers share a power supply? that's bad,
right?), so we put it off, and made Tahoe let uploads succeed even if
they weren't as reliable as you'd like.

The "shares of happiness" value that was mentioned earlier (in the
ticket, I think) comes into play in the second phase of upload, after
peer-selection, as the shares are created and uploaded. Servers get
bounced on a regular basis (OS upgrades, maintenance, etc), so you might
lose some of the servers during this upload phase, but if you don't lose
too many, that might be ok. If at least "shares of happiness" shares
were uploaded successfully, we declare the upload to have succeeded. But
that's unrelated to server diversity, which is why "shares of happiness"
versus "servers of happiness" is a bit of a confused issue: we could
certainly use a parameter which enforces server diversity, but it needs
to be applied during peer-selection, whereas in the current code
shares-of-happiness is only used later in the upload process.
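
A tiny illustration of the difference, with made-up function names and
an example threshold of 7 (this is a sketch of the idea, not the upload
code itself):

```python
from typing import Dict


def upload_is_happy(uploaded: Dict[int, str], shares_of_happiness: int) -> bool:
    """Share-counting check: how many shares landed, not where they landed."""
    return len(uploaded) >= shares_of_happiness


def distinct_servers(uploaded: Dict[int, str]) -> int:
    """What a diversity-aware ("servers of happiness") check would count."""
    return len(set(uploaded.values()))


# Ten shares, all on the only storage server in the grid.
uploaded = {share: "server-1" for share in range(10)}
print(upload_is_happy(uploaded, shares_of_happiness=7))  # True
print(distinct_servers(uploaded))                        # 1
```

The share-counting check passes even though every share lives on one
machine, which is exactly the situation that started this thread.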

Another wrinkle is how to tolerate changes in the set of servers that
are available to you. If you only have three servers now, but you think
you might have more in the future, then maybe you should encode your
files into more than three shares (N>3), so that you can take advantage
of better diversity in the future without re-encoding everything.
(Re-encoding with different k/N encoding parameters will result in a
different filecap, meaning you'd have to update all your directories
too.) A file-repair operation will be able to move
10-shares-on-3-servers to 10-shares-on-10-servers without re-encoding or
needing to see the plaintext. In fact an even smaller tool could simply
move the share files from one server to another and even skip the
erasure-coding work. If Tahoe refused to place multiple shares on a
single server, you'd be forced to do the initial encoding at N=3, which
might be annoying later on.
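
The planning half of such a share-moving tool could be as simple as the
sketch below. No such tool exists yet as far as this message is
concerned; rebalance and its arguments are hypothetical, and the actual
copying of share files between servers is left out entirely.

```python
from collections import Counter
from typing import Dict, List


def rebalance(placements: Dict[int, str], servers: List[str]) -> Dict[int, str]:
    """Return a new share -> server map that spreads shares onto empty servers.

    Only shares on servers holding more than one share are moved, and each
    move targets a server that currently holds none. No re-encoding is
    involved: the shares themselves are unchanged.
    """
    result = dict(placements)
    held = Counter(result.values())
    empty = [s for s in servers if held[s] == 0]
    for share in sorted(result):
        if not empty:
            break
        source = result[share]
        if held[source] > 1:
            target = empty.pop(0)
            held[source] -= 1
            held[target] += 1
            result[share] = target
    return result
```

Starting from ten shares spread 4/3/3 over three servers, and given a
grid that has since grown to ten servers, this plan leaves one share on
each server while keeping the same N=10 encoding and the same filecap.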

But the biggest challenge is how to compress this peer-selection
information into the filecap. docs/specifications/outline.txt (section
3) has some details. Basically, you (as the uploader) have somehow told
your Tahoe node where to put those shares, but someone else (as a
downloader) only gets the filecap, and they need to be able to find
(enough of) those same selected servers in an efficient fashion. Servers
may have been added or removed in the process, and the file may have
been repaired, so shares might not be on exactly the same servers that
they started on. The current permuted-serverlist algorithm is carefully
designed to achieve good diversity (if you have enough equally-useful
servers that will accept shares), tolerate repair and server churn, and
compress all the necessary server-selection information into the same
filecap that contains the encryption key and the integrity information.
Other algorithms are possible, but we couldn't come up with any of them
at the time, and this one met the immediate needs.
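
The core trick can be sketched in a few lines. Treat the details here as
assumptions: I'm using SHA-256 over the storage index concatenated with
the server id purely for illustration, and the hash and input layout in
the real code may differ.

```python
import hashlib
from typing import List


def permuted_servers(storage_index: bytes, server_ids: List[bytes]) -> List[bytes]:
    """Order servers by a hash of (storage index + server id).

    The uploader and any later downloader compute the same order from the
    same storage index, so no per-file server list has to be stored.
    """
    return sorted(
        server_ids,
        key=lambda server_id: hashlib.sha256(storage_index + server_id).digest(),
    )
```

Because each server's position depends only on its own id and the
storage index, adding or removing one server leaves the relative order
of all the others alone. A downloader that starts at the front of its
own permuted list will therefore usually hit the servers holding shares
early, even after some churn.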

So, it'd be cool to have some sort of constraint-based "share placement
requirements" language in tahoe.cfg, where you could provide (or
subscribe to) attributes about each server (like which data center
they're in, and what failure modes they have in common), and the
uploading code would be responsible for finding a set of servers which
meet the criteria. It'd also be cool to have tahoe.cfg simply list the
servers that will be used, and how many shares are to go to each one
(this is what I'm working on next, for #573, because it meets some of my
own personal needs, and because it will eventually simplify a lot of the
internal unit-testing code).
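
As a strawman for the constraint-based version, here is about the
simplest rule I can think of: "no more than max_per_dc shares in any one
data center". Everything in it is hypothetical (the function, the
attribute map, the rule itself), and it places at most one share per
server; a real requirements language would need to express much more
than this.

```python
from collections import Counter
from typing import Dict, List


def select_servers(servers: List[str], datacenter: Dict[str, str],
                   n: int, max_per_dc: int) -> List[str]:
    """Greedily pick n servers, walking `servers` in order, while never
    taking more than max_per_dc servers from the same data center."""
    chosen: List[str] = []
    per_dc: Counter = Counter()
    for server in servers:
        dc = datacenter[server]
        if per_dc[dc] < max_per_dc:
            chosen.append(server)
            per_dc[dc] += 1
            if len(chosen) == n:
                return chosen
    raise ValueError("not enough servers satisfy the placement rule")
```

The interesting design question is what to do in the ValueError case:
refuse the upload, or fall back to a weaker placement and warn.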

But we'll need some significant changes on the Download side to permit
that sort of flexibility on the upload side, and this is a big unsolved
problem. The current Downloader code will fall back to asking every
server in the grid, so at the moment *any* upload-time server-selection
algorithm will work. But clearly that's not a good idea for 1000-server
grids, so we need a way to make this work properly in the long run.

cheers,
 -Brian