[tahoe-dev] tahoe-lafs for archival process ability ?

David-Sarah Hopwood david-sarah at jacaranda.org
Sun Mar 11 03:59:59 UTC 2012

On 10/03/12 08:41, Frédéric Brégier wrote:
> From what I've read, I see there is the possibility to have "immutable"
> files, consistency across several storage nodes, security of access and
> consistency of contents. What I have not found yet (or am not sure about):
> - up to what volume this solution can get (how many files, how many TB
> or even PB)

I think this is only limited by the total amount of space on the storage
servers. Each server's storage directory must be on a single filesystem.
Note that Tahoe becomes less efficient when some servers can no longer
accept shares due to lack of space, so ideally the space used should be
limited to (number of storage servers * space available on smallest server).
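
As a rough illustration of that bound, here is a minimal sketch. It is not
part of Tahoe: the function names are made up for this example, and the
usable-data figure assumes the default 3-of-10 encoding, under which each
file occupies about 10/3 of its size in shares.

```python
def balanced_share_capacity(free_bytes_per_server):
    """Share space usable while every server can still accept shares.

    This is (number of storage servers * space on the smallest server).
    """
    if not free_bytes_per_server:
        return 0
    return len(free_bytes_per_server) * min(free_bytes_per_server)


def usable_data_capacity(free_bytes_per_server, k=3, n=10):
    """Approximate user data that fits in the balanced share capacity.

    Assumes k-of-n erasure coding (default 3-of-10), so stored shares
    take roughly n/k times the size of the original data. Per-share
    overhead is ignored.
    """
    return balanced_share_capacity(free_bytes_per_server) * k // n


if __name__ == "__main__":
    TB = 10**12
    servers = [4 * TB, 2 * TB, 2 * TB, 1 * TB]  # hypothetical grid
    print(balanced_share_capacity(servers) // TB, "TB of share space")
    print(usable_data_capacity(servers) / TB, "TB of user data (approx.)")
```

With the hypothetical grid above, the smallest server (1 TB) sets the bound,
so the extra space on the larger servers is not counted.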

In theory there could be hash collisions for a very large number of files,
but it would take an infeasible length of time to store that many files.

> - the ability to use multiple physical locations (again for security
> reasons), either through replication or as a built-in feature (from what
> I have read, I am not sure whether the storage servers need to be in the
> same datacenter)

Guaranteeing that shares are distributed across locations or organisational
units, to reduce the risk of correlated failures, is a frequently requested
feature. I hope we'll discuss designs for that at the upcoming Tahoe Hack Fest
on the weekend of March 30.

> - an existing project or attempt to use it for archiving

I don't know of any specific project that is using it in that way.

> - the interfaces available that could be integrated into application
> components (I quickly saw Java, (S)FTP, ...). For instance, would it be
> possible to integrate this into a Java application, both to retrieve
> documents and, perhaps more importantly, to store new ones?

The easiest way to access a Tahoe filesystem from a language other than Python
(or even from Python) is to use the HTTP web-API, which is documented at
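
For a sense of what that looks like from client code, here is a minimal
sketch in Python using only the standard library. It assumes a local gateway
node listening on http://127.0.0.1:3456 (adjust for your node), and uses
"PUT /uri" to upload an immutable file, which returns its capability, and
"GET /uri/<cap>" to fetch it back. The helper names are invented for this
example; the same two requests can be made from Java or any other language
with an HTTP client.

```python
import urllib.request
from urllib.parse import quote

# Assumption: a Tahoe gateway node is running locally on this address.
GATEWAY = "http://127.0.0.1:3456"


def build_upload_request(data, gateway=GATEWAY):
    """Build a PUT /uri request that uploads `data` as an immutable file."""
    return urllib.request.Request(gateway + "/uri", data=data, method="PUT")


def build_download_url(cap, gateway=GATEWAY):
    """Build the GET URL for a capability, percent-encoding its colons."""
    return gateway + "/uri/" + quote(cap, safe="")


def upload(data, gateway=GATEWAY):
    """Upload bytes and return the capability string from the response body."""
    with urllib.request.urlopen(build_upload_request(data, gateway)) as resp:
        return resp.read().decode("ascii").strip()


def download(cap, gateway=GATEWAY):
    """Fetch the contents of the file named by `cap`."""
    with urllib.request.urlopen(build_download_url(cap, gateway)) as resp:
        return resp.read()
```

Usage, given a running node: cap = upload(b"archived document"), then
download(cap) should return the same bytes. The capability is the only handle
to the file, so the application has to record it somewhere (for example in a
Tahoe directory).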

> - is there a way to link two files together (other than externally, of
> course)? Still thinking about archiving: one file for the main document
> and one for the associated reference XML file (containing the author,
> date, and other business information)

Only if you were to define a convention for doing that. User-defined
metadata can be associated with each directory entry, but it wouldn't
be efficient to store a large amount of information there. You could
store a capability to the metadata file there, or adopt some convention
for using related filenames.
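
To make the two options concrete, here is a small sketch of what such a
convention might look like on the application side. Everything in it is
hypothetical: the ".meta.xml" suffix, the "metadata-cap" key and the function
names are choices made for this example, not anything Tahoe defines.

```python
# Hypothetical convention: the metadata file for "report.pdf" is stored in
# the same directory under the name "report.pdf.meta.xml".
METADATA_SUFFIX = ".meta.xml"

# Hypothetical key under which a directory entry's user-defined metadata
# holds the capability of the associated metadata file.
METADATA_CAP_KEY = "metadata-cap"


def metadata_name_for(document_name):
    """Return the filename expected to hold the document's metadata."""
    return document_name + METADATA_SUFFIX


def pair_documents(child_names):
    """Map each document name to its metadata filename, or None if absent.

    `child_names` is the list of entry names in one directory.
    """
    names = set(child_names)
    pairs = {}
    for name in sorted(names):
        if name.endswith(METADATA_SUFFIX):
            continue  # this entry is itself a metadata file
        candidate = metadata_name_for(name)
        pairs[name] = candidate if candidate in names else None
    return pairs


def entry_metadata_with_link(metadata_cap, existing=None):
    """Return entry metadata that points at the metadata file by capability."""
    metadata = dict(existing or {})
    metadata[METADATA_CAP_KEY] = metadata_cap
    return metadata
```

The filename convention needs no extra lookups but breaks if one file is
renamed without the other; storing a capability in the entry metadata keeps
the link intact across renames, at the cost of the application having to
maintain that key itself.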

David-Sarah Hopwood ⚥
