Erasure Coding

David Vorick david.vorick at gmail.com
Sun Dec 1 05:10:12 UTC 2013


Hi, I've started working on a project that's similar to Tahoe-LAFS, in that
it's a distributed cluster of machines hosting a bunch of files.

Erasure coding is important, but I've had trouble learning enough about
the different types of erasure coding to be confident that I can pick the
best one for my needs. I was hoping you could point me to some links or
papers, or otherwise help out.

Ideally (in order of importance):

+ machines can participate in many clusters simultaneously (and many
clusters can exist simultaneously)
+ 100s - 1000s of machines per cluster
+ if a machine becomes corrupted or is otherwise lost, its portion of the
file can be replaced quickly and without using many network resources
+ low redundancy

Less important, but still desirable:

+ cluster size can expand or shrink dynamically
+ some machines only need to be online some of the time

I've looked at Reed-Solomon coding, which seems useful but not ideal
(replacing lost nodes is too expensive).
I've also looked at Raptor codes, which seem promising, but I don't
understand them, and there seem to be patent issues.
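
To make the repair-cost concern concrete, here is a rough sketch of the
arithmetic. It assumes a classic k-of-n Reed-Solomon layout with naive
repair, where rebuilding one lost share means fetching k surviving shares
and re-encoding. The function name and the example parameters (k=10, n=15,
a 1000 MB file) are illustrative assumptions only, not taken from
Tahoe-LAFS or any library.

```python
from fractions import Fraction

def rs_costs(k, n, file_size):
    """Storage and naive-repair costs for a k-of-n MDS code (illustrative).

    Assumes the file is split into k data shares, expanded to n shares,
    and that repairing one lost share downloads k surviving shares.
    """
    share_size = Fraction(file_size, k)
    return {
        "redundancy": Fraction(n, k),          # total stored / original size
        "share_size": share_size,              # what one machine holds
        "repair_traffic": k * share_size,      # downloaded to rebuild 1 share
        "repair_amplification": k,             # traffic / data actually lost
    }

# Hypothetical example: 1000 MB file, any 10 of 15 shares recover it.
costs = rs_costs(k=10, n=15, file_size=1000)
for name, value in costs.items():
    print(f"{name}: {float(value)}")
```

Under these assumptions, losing a single 100 MB share costs about 1000 MB
of network traffic to replace, which is the whole file. Plain replication
would move only the 100 MB that was lost, but at a much higher redundancy
than 1.5x, so the two goals in the list above pull against each other.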

In general, I've had little success finding resources for learning about
erasure codes, though persistence is slowly turning up useful material.

