[tahoe-dev] tahoe-lafs suitable for 500TB ~ 3PB cluster?
dieter at vimeo.com
Tue Apr 16 17:55:53 UTC 2013
I'm looking to build a large, highly available storage cluster.
I'm not really interested in the specific encryption/security features. So far I've been using OpenStack Swift
for a smaller cluster (70TB usable, 216TB raw) and I'm reasonably satisfied, but it has a lot of overhead in terms of CPU, network and storage, because it uses a simple replication strategy (in our case replication level 3) and because the design is just simple and inefficient.
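For a sense of what's at stake, here's a back-of-envelope comparison of the raw-to-usable expansion factor under 3x replication versus a k-of-n erasure code (the 10-of-16 parameters below are purely illustrative, not anything tahoe-specific):

```python
# Raw bytes needed per usable byte: 3x replication (our current swift
# setup) vs. a hypothetical k=10-of-n=16 erasure code.

def expansion(k, n):
    """Raw-to-usable expansion factor for a k-of-n erasure code."""
    return n / k

replication = expansion(1, 3)   # plain 3x replication is the k=1, n=3 case
erasure = expansion(10, 16)     # survives loss of any 6 of 16 shares

usable_tb = 500
print(f"3x replication:   {replication:.2f}x -> {usable_tb * replication:.0f} TB raw")
print(f"10-of-16 erasure: {erasure:.2f}x -> {usable_tb * erasure:.0f} TB raw")
```

So at 500TB usable, replication needs 1500TB raw while 10-of-16 erasure coding needs only 800TB, which is roughly the efficiency argument for looking beyond Swift.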
I'm looking to store video files (avg about 150MB, max size 5GB).
the size of the cluster will be between 500TB to 3PB (usable) space, it depends on how feasible it would be to implement.
at 500TB I would need 160Mbps output performance, at 1PB about 600Mbps, at 2PB about 1500Mbps, and at 3PB about 6Gbps.
output performance scales superlinearly with cluster size.
ingestion requirements are much less, and can be neglected.
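To make the scaling concrete, the per-TB output requirement implied by the figures above grows with cluster size:

```python
# Output bandwidth requirement per unit of usable capacity,
# from the figures above (usable TB -> total Mbps out).
reqs = {500: 160, 1000: 600, 2000: 1500, 3000: 6000}

for tb, mbps in reqs.items():
    print(f"{tb:>5} TB usable: {mbps:>5} Mbps total, {mbps / tb:.2f} Mbps per TB")
```

Per-TB demand rises from 0.32 Mbps/TB at 500TB to 2.0 Mbps/TB at 3PB, so the bigger builds are proportionally harder on the read path, not just absolutely.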
Someone pointed me to tahoe-lafs because apparently it's the only similar (open source) system that uses erasure codes. I'm still reading up on erasure codes, and trying to find out if it's appropriate for my use case, but I have some questions specific to tahoe-lafs:
* are there any known similar large deployments?
* what is the CPU, memory and network overhead of tahoe-lafs? does it have a lot of consistency-checking overhead? what does it have to do beyond (for ingest) splitting incoming files and sending the shares to servers to store on disk, and the reverse on output? I assume the erasure codes are not too expensive to compute and verify, since modern CPUs have instructions that accelerate this kind of arithmetic?
* could I run it on commodity hardware? (i.e. servers with, say, 24x4TB = 96TB of disk, a common 8-core CPU and 4-8 GB RAM)
* can you change the parameters on a production cluster? (i.e. extend the cluster with new servers (or permanently take servers out), and increase or decrease how many nodes you can lose)