[tahoe-dev] erasure coding makes files more fragile, not less

Eugen Leitl eugen at leitl.org
Sat Mar 31 09:35:29 UTC 2012

On Fri, Mar 30, 2012 at 05:33:58PM -0400, Shawn Willden wrote:

> > Can you please tell more about the VG2 grid? I clean missed it.
> (Sorry I'm slow to respond -- I'm vacationing with my family and often have
> better things to do than read email :-) ).
> Volunteer Grid 2 is a Tahoe grid composed of volunteers all over the world.
> Learning from some problems that the first volunteer grid had, I suggested
> to early members that VG2 establish some clear and somewhat restrictive
> policies in order to ensure that the grid was useful for system backups.
> Two specific backup-driven requirements we had were high
> reliability/availability and relatively high capacity.  To that end, we
> established a 95% nominal up time requirement and a 500 GB minimum node
> capacity requirement.  We also avoid co-located nodes and disallow usage of
> more than min(storage_provided, 1 TB).  The limit on usage is to avoid
> having one user deploy, say, 10 TB and then try to consume that much from
> the grid, swamping the rest of the servers.

I would like to contribute ~TByte on a GBit/s link. I've passed on the
email so there might be other takers as well.

Is the process the one described at http://www.bigpig.org/ ?

> > > servers because the math of erasure coding works against you when the
> > > individual nodes are unreliable, and we ban co-located servers and prefer
> > > to minimize the number of servers owned and administered by a single person
> > > in order to ensure greater independence.
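[Illustration of the point quoted above that the math of erasure coding works against you when nodes are unreliable. This is a minimal sketch, not anything VG2 actually runs. It assumes every server is up independently with the same probability p, and that a file is retrievable whenever at least k of its n shares sit on servers that are up. The function name and the 7-of-10 encoding are hypothetical, chosen only for illustration.]

```python
from math import comb


def file_availability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n shares are reachable.

    Assumes one share per server and that each server is up
    independently with probability p (a simplifying assumption).
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Compare a hypothetical 7-of-10 encoding against one unencoded copy
    # stored on a single server with the same per-server uptime p.
    for p in (0.95, 0.90, 0.50):
        encoded = file_availability(7, 10, p)
        print(f"p={p:.2f}  7-of-10: {encoded:.6f}  single copy: {p:.2f}")
```

[Under these assumptions the encoded file is more available than a single copy when p is high, and less available when p is low, because needing 7 servers up at once becomes the harder condition. Real outages are not independent, which is part of why co-located servers are a problem.]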
> > >
> > > How has that worked out?  Well, it's definitely constrained the growth rate
> > > of the grid.  We're two years in and still haven't reached 20 nodes.  And
> >
> > It doesn't surprise me at all, since I've never heard a single squeak
> > about it in the usual channels. (And I'm moderately well-informed
> > in such matters).
> I'm surprised.   It was definitely announced here when it was created, and
> discussed occasionally since.

If it was just limited to tahoe-dev it definitely did not see wide circulation.
I realize you don't want ephemeral nodes; maybe there should be a probation
period (demonstrated uptime, etc).
> > > although our nodes have relatively high reliability, I'm not sure we've
> > > actually reached the 95% uptime target -- my node, for example, was down
> > > for over a month while I moved, and we recently had a couple of outages
> > > caused by security breaches.
> > >
> > > However, we do now have 15 solid, high-capacity, relatively available (90%,
> > > at least) nodes that are widely dispersed geographically (one in Russia,
> > > six in four countries in Europe, seven in six states in the US; not sure
> > > about the other).  So it's pretty good -- though we do need more nodes.
> >
> > How large is the total storage capacity? What about introducer nodes, is
> > there just one?
> Total storage capacity, as reported by the stats gatherer, is around 14 TB.
> That's disk used (~6 TB) plus disk available (~8 TB).  As near as I can
> tell by eyeballing the graph and summing my estimates, consumption
> grows by about 40 GB per day.  We have a helper but it's lightly used.
> Only one introducer.  The node on the slowest network connection has about

How are updates to new versions handled?

> 1 Mbps of bandwidth, two or three nodes are on gigabit links, most are 6-50
> Mbps, IIRC.  Hardware is similarly varied, with the low end being a small
> NAS box, the high end being some fairly powerful servers in data centers,
> and everything in between including some virtual servers.
> Upload performance, as measured from my machine (which has a 50 Mbps up,
> 100 Mbps down connection), averages about 300 KBps, before erasure
> coding, so with my settings I get around 100 KBps net upload rate.  I
> haven't done any download tests recently, but in the past they've been
> approximately the same as upload speeds, but without the erasure coding
> penalty.
> -- 
> Shawn
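
[Worked arithmetic for the upload figures quoted above. This is a sketch under one reading of them: that about 300 KBps is the rate at which encoded shares go out on the wire, and about 100 KBps is the rate of original file data. That implies an expansion factor n/k of roughly 3. The actual k and n settings are not stated in the message, so the values below are hypothetical.]

```python
def net_upload_rate(wire_rate_kbps: float, k: int, n: int) -> float:
    """Rate of original file data, given the rate of encoded shares sent.

    With k-of-n erasure coding every k bytes of file data become
    n bytes of shares, so useful throughput is wire_rate * k / n.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return wire_rate_kbps * k / n


if __name__ == "__main__":
    # Hypothetical 5-of-15 encoding: expansion factor 15 / 5 = 3.
    print(net_upload_rate(300.0, k=5, n=15))
```

[Downloads fetch only k shares, which is consistent with the remark that download speed does not pay the same expansion penalty.]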


Eugen* Leitl http://leitl.org
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
