[tahoe-dev] best practice for wanting to setup multiple tahoe instances on a single node

Brian Warner warner at lothar.com
Wed Jan 11 02:49:04 UTC 2012


On 1/10/12 4:30 PM, Greg Troxel wrote:
> 
> Jimmy Tang <jcftang at gmail.com> writes:
> 
>> I was just wondering if there are best practice recommendations for
>> setting up storage nodes? As far as I understand the recommended way
>> is to set up one instance per node with one big partition on the node.
> 
> I think the key point is about the redundancy that you have vs the
> redundancy that tahoe perceives - it seems dangerous to have 20 nodes
> that appear independent but are all actually on the same box. If they
> are on 20 physical disks that are independent enough that if the box
> fails you can reconstitute all of the nodes, it might be ok, but
> it seems subject to correlated failures.

Yeah, our recommendation is one storage node per spindle, since it's
individual disks that usually live or die. Computers fail too (which
might take out multiple storage nodes), but you can usually move their
disks to a new computer, so I think of those as transient failures.
Whereas, when a disk fails, it's usually down for good.

If you think your disks are indestructible but your partitions are not,
then maybe multiple storage nodes per disk (one per partition) could
make sense. Seems dubious to me, though.

>> What about setting up multiple instances of tahoe storage nodes per
>> partition on one machine, in a possible scenario where I have 150tb
>> of space on a machine but I can only make a bunch of 16tb partitions.
>> I ask this because we have a few machines in work right now with this
>> kind of setup and I'm kinda pushing for using tahoe-lafs as a
>> possible backend storage system, possibly with irods sitting on top
>> to manage the data (yet to be decided).

If you need to aggregate multiple partitions into one big one, and you
don't have an OS-level way to do it (lvm, etc), then one cheap-and-dirty
approach is to make symlinks from the individual prefix directories.
Tahoe stores shares in:

 $NODEDIR/storage/shares/$PREFIX/$STORAGEINDEX/$SHARENUMBER

where $PREFIX is like "aa" or "7q": there are 1024 of them (first two
characters of the base32-encoded storage-index). Since files get mapped
to storage-index values effectively randomly, if you had two partitions,
you could build 512 symlinks (22 to gz) that point into one of them, and
have the other 512 symlinks (ha to zz) point to the second one. Or
256/256/256/256, etc. Nasty, but it'd work, as long as the Law of Large
Numbers holds up and the partitions fill at about the same rate.
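
For illustration, here's a rough Python sketch of that trick. The mount
points and shares path are made up, the base32 alphabet is my assumption,
and it round-robins prefixes across the partitions instead of splitting
them into contiguous halves (which works just as well, since
storage-indexes are effectively random). You'd run something like this
while the node is stopped, before shares have accumulated:

 # hypothetical helper, not a supported Tahoe tool
 import os
 from itertools import product

 ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"   # assumed base32 alphabet
 SHARES = "/var/tahoe/node/storage/shares"       # i.e. $NODEDIR/storage/shares
 PARTITIONS = ["/mnt/disk1/shares", "/mnt/disk2/shares"]

 prefixes = ["".join(pair) for pair in product(ALPHABET, repeat=2)]  # 1024
 for i, prefix in enumerate(prefixes):
     real_dir = os.path.join(PARTITIONS[i % len(PARTITIONS)], prefix)
     os.makedirs(real_dir, exist_ok=True)    # real directory on a partition
     link = os.path.join(SHARES, prefix)
     if not os.path.lexists(link):
         os.symlink(real_dir, link)          # node just sees a prefix dir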

>> As a side question, as we expand the number of nodes, I would
>> probably want to change the k-of-n settings. would the migration
>> method to newer k-of-n parameters be copy and delete within the grid
>> to rebalance data?

Yeah, unfortunately, there's no good way to re-encode a file short of
just downloading it and re-uploading it. The k-of-N settings for new
uploads are controlled by the client node's tahoe.cfg file (see
shares.needed and shares.total), but they're embedded in the filecap.
So you could set your tahoe.cfg to the new settings, use 'tahoe cp -r'
to copy a bunch of files out of tahoe into your local directory, then
'tahoe cp -r' again to re-upload them (with the new settings).
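
For example (just a sketch: "mystuff" and the "tahoe:" alias are
placeholders, 5-of-10 is an arbitrary choice, and the exact option names
should be checked against your version's configuration docs):

 # encoding parameters for new uploads, in the client's tahoe.cfg:
 [client]
 shares.needed = 5
 shares.total = 10

 # restart the node, then round-trip the files:
 tahoe restart
 tahoe cp -r tahoe:mystuff ./mystuff-local
 tahoe cp -r ./mystuff-local tahoe:mystuff-reencoded

Note the re-uploaded files get brand-new filecaps, so anything that
pointed at the old caps will need to be updated.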

BTW, we use "re-encode" to talk about changing a file's encoding
parameters, like 'k' and 'N': that generally means making entirely new
shares. When we say "rebalance", we're talking about moving shares
around without changing them, like when new servers are added, and we
want to move shares around to spread out the load more evenly. We don't
have automatic tools for either yet.


cheers,
 -Brian


