[tahoe-dev] Low-effort repairs

Shawn Willden shawn-tahoe at willden.org
Fri Jan 16 06:42:20 UTC 2009


On Thursday 15 January 2009 03:43:41 pm Brian Warner wrote:
> > For example, the probability of losing a file with N=10, k=3, p=.95 and
> > four lost shares which have been replaced by duplicates of still-extant
> > shares is 9.9e-8, as compared to 1.6e-9 for a proper repair job. Not that
> > much worse.
>
> Neat! How did you compute that number?

First, the practice.  Here's how to do it using my statistics module:

# using the statistics module from the attached patch:
from allmydata.util.statistics import binomial_distribution_pmf, convolve
from functools import reduce   # a builtin in older Pythons

f1  = binomial_distribution_pmf(1, .95)   # one singly-stored share: [.05, .95]
tmp = binomial_distribution_pmf(2, .95)
f2  = [ tmp[0], tmp[1]+tmp[2] ]           # collapse "1 or 2 copies" into "at least 1"
f   = reduce(convolve, [f1]*2 + [f2]*4)   # two normal shares, four duplicated pairs
loss = sum(f[0:3])                        # Pr[K < 3] ~= 9.9e-8

Oh, while I'm at it, the computation for the replicate-all-six case is:

tmp = binomial_distribution_pmf(2, .95)
f2  = [ tmp[0], tmp[1]+tmp[2] ]
f   = reduce(convolve, [f2]*6)            # six duplicated pairs, 12 shares total
loss = sum(f[0:3])                        # Pr[K < 3] ~= 5.8e-10

Now for the theory:

The random variable K (the number of distinct shares that survive) can also
be written as sum(K_j for j in range(1, 7)), one term for each of the six
distinct shares.  Each K_j is a random variable that records whether distinct
share j survives -- either 0 or 1.

If K_1 and K_2 represent the two normal shares, then they each have the simple 
probability mass function which I'll call f1:

	Pr[K_1=i] = Pr[K_2=i] = f1(i) = [ 1-p, p ]

The PMF of the other four shares is more complex.  Each pair is basically a 
microcosm of the normal k-of-N problem, with N=2 and k=1.  Per the binomial 
distribution, the PMF of this case is

	Pr[K_j=i] = tmp(i) = [ (1-p)^2, 2p-2p^2, p^2 ]

(if it's not obvious, Pr[K_j=i] is in the ith position of the 0-based list)

But we don't care whether K_j=1 or K_j=2, just that K_j>0.  So, we sum those 
cases to produce the PMF:

	f2(i) = [ (1-p)^2, 2p-p^2 ]
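
Numerically, with p=.95 that works out to

	f2(i) = [ .0025, .9975 ]

which matches the f2 computed in the code above.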

Now we have PMFs for K_1 through K_6, so we just need to find the PMF of
their sum K, which lets us calculate Pr[K<k].

The PMF of a sum of mutually independent random variables is the convolution
of their individual PMFs, so with K = K_1 + K_2 + ... + K_6 it's the case
that

	Pr[K=k] = f(k) = (f1*f1*f2*f2*f2*f2)(k)

where '*' is the convolution operator.  Because convolution is associative, we 
can rewrite that as:

	f(k) = (((((f1*f1)*f2)*f2)*f2)*f2)(k)
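
(If you're wondering what convolve actually does: the discrete convolution of
two PMF lists is only a few lines of Python.  This is just a sketch; the
implementation in my statistics module may differ in details:)

def convolve(a, b):
    # PMF of X+Y, given the PMFs of independent X and Y:
    # out[k] = sum over all i+j == k of a[i]*b[j]
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out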

With Mathematica it's easy enough to do all of the convolutions symbolically, 
but they get rather long and nasty.  Here's the final f(k):

	k=0	(1-p)^10
	k=1	2 (1-p)^8 p (5 - 3p)
	k=2	(1-p)^6 p^2 (41 - 50p + 15p^2)
	k=3	4 (1-p)^4 p^3 (22 - 41p + 25p^2 - 5p^3)
	k=4	(1-p)^2 (2-p)^2 p^4 (26 - 40p + 15p^2)
	k=5	2 (2-p)^3 p^5 (4 - 7p + 3p^2)
	k=6	(2-p)^4 p^6

Those sum to 1, as a PMF must.  If you add them and do the algebra you can 
confirm it for yourself :)
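
Or let the machine do the algebra: a quick check with sympy (sympy is just a
convenient tool here, not something the patch depends on) confirms it:

from sympy import symbols, expand

p = symbols('p')
f = [ (1-p)**10,
      2*(1-p)**8*p*(5 - 3*p),
      (1-p)**6*p**2*(41 - 50*p + 15*p**2),
      4*(1-p)**4*p**3*(22 - 41*p + 25*p**2 - 5*p**3),
      (1-p)**2*(2-p)**2*p**4*(26 - 40*p + 15*p**2),
      2*(2-p)**3*p**5*(4 - 7*p + 3*p**2),
      (2-p)**4*p**6 ]
print(expand(sum(f)))    # -> 1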

Finally, to get the odds of losing the file, sum(f(i) for i in range(0,k)).

This approach is really powerful.  It even lets you handle situations like
multiple data centers full of heterogeneous servers (distinct p_i), where
each data center also has its own failure probability that affects every
server in it.  Assuming you can get good estimates for all of those
probabilities, you can calculate the PMF of the whole mess.  Toss in a bunch
of individual servers as well if you like.

Of course, actually obtaining good estimates of all of those failure 
probabilities is probably impossible, but to whatever degree you can get the 
estimates, you can build the composite PMF and find the answers.
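
As a sketch of how such a composite PMF might be assembled (the helper names
and all of the probabilities below are invented for illustration; convolve
and reduce are the same ones used above):

def server_pmf(p):
    return [1.0 - p, p]               # one server holding one share

def center_pmf(p_up, server_ps):
    # Survivors contributed by one data center: if the center itself
    # fails (probability 1 - p_up), nothing in it survives; otherwise
    # its servers fail independently.
    f = reduce(convolve, [server_pmf(p) for p in server_ps])
    f = [p_up * x for x in f]         # center stays up, servers vary
    f[0] += 1.0 - p_up                # center goes down, zero survivors
    return f

# e.g. two data centers plus one standalone server, with k=3:
f = reduce(convolve, [center_pmf(.999, [.95, .95, .90]),
                      center_pmf(.995, [.98, .92]),
                      server_pmf(.85)])
loss = sum(f[0:3])                    # Pr[K < 3]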
	
> > If there's storage to spare, the repairer could even direct all six peers
> > to duplicate their shares, achieving a file loss probability of 5.8e-10,
> > which is *better* than the nominal case, albeit at the expense of
> > consuming 12 shares of distributed storage rather than 10.
>
> Yeah, it's a big multivariable optimization/tradeoff problem: storage space
> consumed, CPU used, bandwidth used (on links of varying capacities, owned
> by different parties), reliability (against multiple sorts of failures).
> Very messy :).

Yep.  I'm simplifying the issue of CPU vs. bandwidth by pushing it off on 
someone else.  I have a notional cost function which computes the putative 
cost of a repair operation, taking into account whatever you want it to take 
into account.  The one I'm playing with ignores CPU and just counts 
bandwidth, weighting upstream bandwidth at 8 times the cost of downstream 
bandwidth since that's the up/down ratio on my cable modem.
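
For concreteness, a minimal sketch of that sort of cost function (the names
and weights here are illustrative, not the actual code in the patch):

UP_WEIGHT   = 8.0   # upstream bytes cost 8x downstream bytes on my link
DOWN_WEIGHT = 1.0

def repair_cost(bytes_up, bytes_down):
    # notional cost of one repair operation: ignore CPU, count only
    # the bandwidth consumed, weighted by direction
    return UP_WEIGHT * bytes_up + DOWN_WEIGHT * bytes_down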

That at least reduces the problem to a three-way optimization issue -- 
storage, repair cost and reliability.  I'm thinking I should be able to get 
it to a point where you can fix (or bound) two of the three parameters and 
optimize for the other.

The attached patch contains the current state of my code and my paper 
(docs/lossmodel.lyx).

	Shawn.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lossmodel.dpatch.zip
Type: application/x-zip
Size: 68676 bytes
Desc: not available
URL: <http://tahoe-lafs.org/pipermail/tahoe-dev/attachments/20090115/32c4836d/attachment.bin>

