[tahoe-dev] Use Tahoe as a real-time distributed file system?

David-Sarah Hopwood david-sarah at jacaranda.org
Wed May 25 00:09:34 UTC 2011


On 24/05/11 16:55, Neil Aggarwal wrote:
> Hello all:
> 
> I have been reading the Tahoe docs and am a bit confused.
> 
> I am looking for a distributed real-time filesystem.
> 
> Does Tahoe allow me to access it just like a regular
> filesystem?  For example, do I cd to a directory and
> list the files?

Tahoe supports several "frontend" protocols for accessing
the filesystem:
 - HTTP(S), extended with some additional operations
 - FTP
 - SFTP (preferred over FTP)

There is a command-line client, the 'tahoe' command, that
connects to the gateway using HTTP(S). You can use a
web browser to access a subset of the HTTP(S) interface,
called the WUI. You can also use any FTP or SFTP client to
connect to those frontends.

It is possible to provide regular filesystem access from
Unix clients by using sshfs to connect to the SFTP frontend.
This should be considered a stable and supported interface
(unlike the experimental FUSE modules mentioned by Shawn
Willden). Note, however, that Tahoe is *much* slower than any
local filesystem. For other caveats, see
<http://tahoe-lafs.org/trac/tahoe-lafs/wiki/SftpFrontend>.

> My network looks like this:
> 
>   Colo 1          Colo 2
> Server 1          Server 5
> Server 2          Server 6
> Server 3          Server 7
> Server 4          Server 8
> 
> Colo 1 and 2 are separated by a large distance.
> The servers in each colo are on the same network.
> All servers will be running CentOS.
> 
> Here is what I need:
> 1. Any server may create or modify a file 
> 	and changes should be immediately available
> 	to the others.

This is possible subject to the limitations on
concurrent writes to the same directory or mutable file
that Shawn mentioned.

(You could avoid the concurrent write problem by accessing
files via a single gateway, but that wouldn't meet the
no-single-failure criterion for availability. It would
still meet this criterion for data preservation, and you
can always restart the gateway to restore availability,
but *automatic* fail-over to a second gateway is not
currently supported.)

> 2. I need to have no single points of failure.
> 
> Will Tahoe do this?
> Any suggestions on how to deploy it in this scenario?

The encoding parameters are:
 - the number of shares needed to reconstruct a file
   (K, also called shares.needed),
 - the number of shares that must be stored on distinct
   servers in order for an upload to succeed (H, also
   called shares.happy),
 - the number of shares generated for a file (N, also
   called shares.total).
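
A minimal sketch of how these three parameters might be written
down for a node, assuming they live in the [client] section of
the node's tahoe.cfg under the names given above. The helper name
render_client_section is hypothetical, and the values are the
ones suggested later in this message (K = 3, H = 7, N = 8).

```python
import configparser
import io


def render_client_section(needed: int, happy: int, total: int) -> str:
    """Return the text of a [client] section holding the encoding parameters.

    Hypothetical helper for illustration; it only formats text and does
    not talk to a Tahoe node.
    """
    if not (1 <= needed <= happy <= total):
        raise ValueError("expected 1 <= needed <= happy <= total")
    config = configparser.ConfigParser()
    config["client"] = {
        "shares.needed": str(needed),
        "shares.happy": str(happy),
        "shares.total": str(total),
    }
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


# K = 3, H = 7, N = 8, as suggested in this message.
rendered = render_client_section(3, 7, 8)
print(rendered)
```

The same values would need to be set on every node that uploads
files, since encoding happens on the uploading gateway.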

You have 8 servers, so let's choose N as 8. To avoid
single points of failure, you must assume that the number
of servers that can go down at once is 4 (all servers in a
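colo). A successful upload puts distinct shares on at least H
servers, so at least H - 4 of those shares survive, and K are
needed to read the file. This requires H - K to be at least 4.
One reasonable choice of parameters is therefore K = 3, H = 7,
N = 8.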
The expansion factor (amount of encoded ciphertext stored
for each unit of plaintext) will be 8/3.
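
The arithmetic above can be checked with a short sketch. It
assumes the simplified model used in this message: a successful
upload has placed distinct shares on at least H servers, and the
worst-case failure removes as many of those servers as possible.
The function names are illustrative only.

```python
from fractions import Fraction


def worst_case_surviving_shares(happy: int, servers_lost: int) -> int:
    """Distinct shares guaranteed to remain after `servers_lost` servers fail.

    Simplified model: each of `happy` servers holds one distinct share and
    every failed server is assumed to be one of them.
    """
    return max(happy - servers_lost, 0)


def file_survives(needed: int, happy: int, servers_lost: int) -> bool:
    """True if at least `needed` distinct shares are guaranteed to remain."""
    return worst_case_surviving_shares(happy, servers_lost) >= needed


def expansion_factor(needed: int, total: int) -> Fraction:
    """Encoded ciphertext stored per unit of plaintext, N / K."""
    return Fraction(total, needed)


K, H, N = 3, 7, 8
COLO_SIZE = 4  # servers lost if a whole colo goes down

print(file_survives(K, H, COLO_SIZE))
print(expansion_factor(K, N))
```

With K = 3, H = 7 the check passes because 7 - 4 = 3 shares are
guaranteed to remain, which is exactly K.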

This avoids single points of failure for data preservation.
You also preserve upload availability if a single server is
unavailable, since it is still possible to store distinct
shares on the remaining 7 servers. (That's why I didn't
suggest K = 4, H = 8, N = 8.) However, it isn't possible to
preserve upload availability while an entire colo is
unavailable, because only 4 servers remain and H = 7 of them
are required.
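
A small sketch of the upload-availability reasoning, under the
same simplified assumption that an upload succeeds only if at
least H servers are reachable to take distinct shares. The name
upload_possible is hypothetical.

```python
def upload_possible(happy: int, servers_up: int) -> bool:
    """True if enough servers are reachable to hold `happy` distinct shares."""
    return servers_up >= happy


TOTAL_SERVERS = 8
COLO_SIZE = 4

scenarios = {
    "all servers up": TOTAL_SERVERS,
    "one server down": TOTAL_SERVERS - 1,
    "one colo down": TOTAL_SERVERS - COLO_SIZE,
}

# Compare the suggested H = 7 with the rejected alternative H = 8.
for happy in (7, 8):
    for label, servers_up in scenarios.items():
        ok = upload_possible(happy, servers_up)
        print(f"H={happy}, {label}: upload possible = {ok}")
```

Under this model H = 8 blocks uploads as soon as any one server
is down, while H = 7 tolerates one server down but not a colo.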

(Strictly speaking it should be possible, albeit inefficient
in terms of expansion factor, to do that by using pure
replication, i.e. K = 1 and H = 2 or 3, but there is a bug
that would prevent that from working in this case:
<http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1293>.)

Some future version is likely to support "location awareness"
in order to guarantee that shares are spread roughly evenly
between colos. The choice of H = 7 above effectively achieves
that anyway without location awareness, but at the cost of
upload availability.

For instance, suppose that for your network it were possible
to configure a policy that distinct shares need to be stored
on 3 servers in each colo when both are up, or on 4 servers
(doubling up shares to allow them to be redistributed later)
when only one colo is up. Still using K = 3 and N = 8, that
would preserve the no-single-failure property, but it would
still allow uploads when an entire colo is down.
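
The policy described above does not exist in Tahoe; the sketch
below only illustrates the decision rule, with every name
invented for the purpose. It assumes a colo counts as "up" when
at least one of its servers is reachable.

```python
from typing import Sequence


def servers_required_per_colo(colos_up: int) -> int:
    """Hypothetical policy: distinct-share servers needed in each reachable colo."""
    if colos_up == 2:
        return 3  # 3 servers in each colo when both are up
    if colos_up == 1:
        return 4  # all 4 servers, doubling up shares, when one colo is up
    raise ValueError("policy is only defined for one or two reachable colos")


def upload_allowed(servers_up_per_colo: Sequence[int]) -> bool:
    """Apply the hypothetical policy to the reachable-server count of each colo."""
    reachable = [n for n in servers_up_per_colo if n > 0]
    if not reachable:
        return False
    required = servers_required_per_colo(len(reachable))
    return all(n >= required for n in reachable)


print(upload_allowed([4, 4]))  # both colos fully up
print(upload_allowed([3, 4]))  # one server down in colo 1
print(upload_allowed([4, 0]))  # colo 2 entirely down
print(upload_allowed([3, 0]))  # colo 2 down and one more server down
```

With both colos up, each colo holds at least 3 distinct shares,
so losing either colo still leaves K = 3 shares.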

-- 
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com
