Fwd: Erasure Coding

Natanael natanael.l at gmail.com
Sun Dec 1 19:37:12 UTC 2013


Can't you pretend to run more nodes than you are actually running in order
to "mine" more credits? What would prevent that?

- Sent from my phone
Den 1 dec 2013 17:25 skrev "David Vorick" <david.vorick at gmail.com>:

>
>
> ---------- Forwarded message ----------
> From: David Vorick <david.vorick at gmail.com>
> Date: Sun, Dec 1, 2013 at 11:25 AM
> Subject: Re: Erasure Coding
> To: Alex Elsayed <eternaleye at gmail.com>
>
>
> Alex, thanks for those resources. I will check them out later this week.
>
> I'm trying to create something that will function as a market for cloud
> storage. People can rent out storage to the network for credit (a
> cryptocurrency - not Bitcoin, but something heavily inspired by Bitcoin
> and the other altcoins), and then people who have credit (which can be
> obtained by trading on an exchange, or by renting storage to the network)
> can rent storage from the network.
>
> So the clusters will be spread out over large distances. With RAID5
> across 5 disks, the network needs to communicate 4 bits to recover each
> lost bit. That's really expensive. The computational cost is not the
> concern; the bandwidth cost is. (Though there are computational limits
> as well.)
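>
> As a rough illustration of that repair cost (the numbers below are
> assumptions, not measurements), a quick sketch in Python:
>
>     # RAID5-style repair: every surviving disk must ship its full share
>     # over the network to rebuild one lost disk.
>     disks = 5                 # disks in the RAID5 group
>     share_gb = 1000           # data held per disk, in GB (assumed)
>
>     repair_traffic_gb = (disks - 1) * share_gb
>     print(repair_traffic_gb)  # 4000 GB transferred to recover 1000 GB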
>
> When you buy storage, all of the redundancy and erasure coding happens
> behind the scenes. So a network that needs 3x redundancy will be 3x as
> expensive to rent storage from. To be competitive, this number should be as
> low as possible. If we had Reed-Solomon and infinite bandwidth, I think we
> could safely get the redundancy below 1.2. But with all the other
> requirements, I'm not sure what a reasonable minimum is.
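>
> For a sense of where a figure like 1.2 comes from, here is a hedged
> sketch with assumed Reed-Solomon parameters (k data pieces encoded into
> n total pieces):
>
>     # The parameters below are illustrative assumptions, not a design.
>     k = 10                    # data pieces
>     n = 12                    # total pieces stored on distinct nodes
>
>     expansion = n / k         # 1.2x raw storage overhead
>     tolerated_losses = n - k  # any 2 of the 12 pieces can be lost
>     print(expansion, tolerated_losses)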
>
> Since many people can be renting many different clusters, each machine on
> the network may (will) be participating in many clusters at once (probably
> in the hundreds to thousands). So the cost of handling a failure should be
> fairly low. I don't think this requirement is as extreme as it may sound,
> because if you are participating in 100 clusters each renting an average
> of 50 GB of storage, your overall expenses should be similar to
> participating in a few clusters each renting an average of 1 TB. The
> important part is that you can keep up with multiple simultaneous network
> failures, and that a single node is never a bottleneck in the repair
> process.
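>
> To make that comparison concrete (purely illustrative numbers, and the
> "few clusters" count is an assumption):
>
>     many_gb = 100 * 50        # 100 clusters x 50 GB each = 5000 GB
>     few_gb = 5 * 1000         # 5 clusters x 1 TB each = 5000 GB
>     print(many_gb == few_gb)  # overall storage commitment is the same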
>
> We need hundreds to thousands of machines in a single cluster for multiple
> reasons. The first is that it makes the cluster roughly as stable as the
> network as a whole. If you have 100 machines randomly selected from the
> network, and on average 1% of the machines on the network fail per day,
> your cluster shouldn't stray too far from 1% failures per day. Even more
> so if you have 300 or 1000 machines. But another reason is that the
> network is used to mine currency based on how much storage you are
> contributing to the network. If there is some way you can trick the
> network into thinking you are storing data when you aren't (or you can
> somehow lie about the volume), then you've broken the network. Having many
> nodes in every cluster is one of the ways cheating is prevented. (There
> are a few others too, but they're off-topic here.)
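>
> A quick sanity check on the stability claim, assuming failures are
> independent with a 1% daily rate (so daily failures in a cluster of n
> machines are roughly Binomial(n, 0.01)):
>
>     import math
>
>     p = 0.01                          # assumed daily failure rate
>     for n in (100, 300, 1000):        # cluster sizes from above
>         mean = n * p
>         stdev = math.sqrt(n * p * (1 - p))
>         print(n, mean, stdev / mean)  # relative spread shrinks with n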
>
> Cluster size should be dynamic (fountain codes?) to support a cluster
> that grows and shrinks with demand. Imagine if some of the files become
> public (for example, YouTube starts hosting videos over this network). If
> one video goes viral, the bandwidth demands are going to spike and
> overwhelm the network. But if the network can automatically expand and
> shrink as demand changes, you may be able to solve the 'Reddit hug'
> problem.
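>
> To show what "dynamic via fountain codes" might look like, here is a
> minimal LT-style encoder sketch: it can keep minting fresh coded blocks
> on demand, so a cluster can grow without re-planning a fixed (k, n)
> layout. The uniform degree distribution and block size are placeholders,
> not a tuned design.
>
>     import os, random
>
>     BLOCK = 1024                           # assumed block size in bytes
>
>     def lt_encode_block(source_blocks, rng):
>         # Produce one rateless coded block by XOR-ing a random subset
>         # of the source blocks (toy degree distribution).
>         degree = rng.randint(1, len(source_blocks))
>         chosen = rng.sample(range(len(source_blocks)), degree)
>         coded = bytearray(BLOCK)
>         for i in chosen:
>             for j, b in enumerate(source_blocks[i]):
>                 coded[j] ^= b
>         return chosen, bytes(coded)        # block indices + payload
>
>     # As demand spikes, mint more coded blocks for newly joined nodes:
>     data = [os.urandom(BLOCK) for _ in range(10)]
>     rng = random.Random(0)
>     extra = [lt_encode_block(data, rng) for _ in range(5)]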
>
> And finally, allowing machines to be online only some of the time gives
> the network a tolerance for things like power failures, without needing
> to immediately assume that a lost node is gone for good.
>
>
>
> _______________________________________________
> tahoe-dev mailing list
> tahoe-dev at tahoe-lafs.org
> https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
>
>