[tahoe-dev] Estimating reliability

Shawn Willden shawn-tahoe at willden.org
Tue Jan 13 17:19:45 UTC 2009


On Tuesday 13 January 2009 12:59:17 am Brian Warner wrote:
> Right. Sometimes the numbers are close to zero, like the probability that
> we'll drop from 10 shares to 2 shares in a month. Sometimes they are close
> to one, like the probability that we'll keep all 10 shares in a month.

That's the point -- don't work with the probability that you'll keep all 10 
shares.

Also, just to pick nits, the probability that you'll keep all 10 shares in a 
month isn't all that close to 1.  Even if your shares have individual 
reliability of .9999, for N=10 the Pr[K=10] = .9990.  You have over 15 digits 
of mantissa to work with, so a value that breaks away from 1 in the fourth 
digit is not very close to 1 -- not unless you're adding it to a number whose 
significant digits are in the 10e-10 range.

Now, if you look at, say, Pr[K in range(6,11)] (with p=.99999), then you have a 
value that is indistinguishable from 1 (in a float).  However, you can find 
how far it is from 1 by calculating Pr[K in range(0,7)], which is 
2.0998992e-18.
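
A standalone sketch of that trick (not my attached module -- tail_below_k is 
just an illustrative name, and it uses modern Python's math.comb for brevity): 
compute the failure tail Pr[K < k] directly with exact rational arithmetic, 
rather than subtracting Pr[K >= k] from 1.

```python
# Compute the far-from-1 tail directly, never forming it as 1 - Pr.
from fractions import Fraction
from math import comb  # Python 3.8+

def tail_below_k(n, k, p):
    """Exact Pr[K < k] for K ~ B(n, p), using rational arithmetic."""
    p = Fraction(p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

tiny = float(tail_below_k(10, 7, Fraction(99999, 100000)))  # ~2.0999e-18

# Folding the tail into 1.0 rounds it away entirely, which is why the
# complement must be computed directly.
assert 1.0 - tiny == 1.0
```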

> Much of the time we'll be multiplying these two sorts of numbers together,

Multiplying is almost never a problem.  Where you lose accuracy is when you 
add, so it's in the summations where you need to worry about how large and 
small numbers combine, and the answer there is to start from the small 
values.
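
A tiny illustration of the ordering issue (not from the attached module):

```python
# Adding many tiny terms after a value near 1.0 loses them; summing
# smallest-first (or using math.fsum) keeps them.
import math

terms = [1.0] + [1e-18] * 1000

naive = 0.0
for t in terms:           # big term first: every 1e-18 is rounded away
    naive += t

careful = 0.0
for t in sorted(terms):   # tiny terms accumulate to ~1e-15 before the 1.0
    careful += t

assert naive == 1.0       # the tiny terms vanished
assert careful > 1.0      # the ~1e-15 survives
assert math.fsum(terms) > 1.0
```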

I've managed, without difficulty, to avoid any situations where I'd lose 
accuracy.  Cross-checking against Mathematica's arbitrary-precision arithmetic 
shows that all of my rounding errors are in the 14th and 15th decimal places.

I've attached my current code for you to fiddle with, if you like.  The most 
useful functions are:

pr_file_loss(p_list, k) -- returns the probability that a file whose shares 
are stored on servers with the reliabilities in p_list will have fewer than k 
shares available (i.e. will be lost).

survival_pmf(p_list) -- returns the probability mass function for the number 
of surviving shares in a set stored on servers with the reliabilities in 
p_list.
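
For anyone who doesn't want to open the attachment, the core of the approach 
fits in a few lines.  This is a simplified re-derivation, not the attached 
code itself -- the names mirror the attachment, but the argument checking is 
omitted:

```python
# Build the share-survival PMF by convolving per-share PMFs [1-p, p],
# then sum the low tail to get the file-loss probability.
from functools import reduce

def convolve(a, b):
    # Discrete convolution: PMF of the sum of two independent variables.
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def survival_pmf(p_list):
    # Element i of the result is Pr[exactly i shares survive].
    return reduce(convolve, ([1 - p, p] for p in p_list))

def pr_file_loss(p_list, k):
    # Probability that fewer than k shares survive.
    return sum(survival_pmf(p_list)[:k])

# Ten servers with reliability .5, k=3: Pr[K < 3] = 56/1024
assert pr_file_loss([0.5] * 10, 3) == 0.0546875
```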

	Shawn.
-------------- next part --------------
Tue Jan 13 10:01:17 MST 2009  Shawn Willden <shawn-tahoe at willden.org>
  * Statistics module
  
  Added a statistics module for calculating various facets of
  share survival statistics.

New patches:

[Statistics module
Shawn Willden <shawn-tahoe at willden.org>**20090113170117
 
 Added a statistics module for calculating various facets of
 share survival statistics.
] {
addfile ./docs/lossmodel.lyx
hunk ./docs/lossmodel.lyx 1
+#LyX 1.6.1 created this file. For more info see http://www.lyx.org/
+\lyxformat 345
+\begin_document
+\begin_header
+\textclass amsart
+\use_default_options true
+\begin_modules
+theorems-ams
+theorems-ams-extended
+\end_modules
+\language english
+\inputencoding auto
+\font_roman default
+\font_sans default
+\font_typewriter default
+\font_default_family default
+\font_sc false
+\font_osf false
+\font_sf_scale 100
+\font_tt_scale 100
+
+\graphics default
+\paperfontsize default
+\spacing single
+\use_hyperref false
+\papersize default
+\use_geometry false
+\use_amsmath 1
+\use_esint 1
+\cite_engine basic
+\use_bibtopic false
+\paperorientation portrait
+\secnumdepth 3
+\tocdepth 3
+\paragraph_separation indent
+\defskip medskip
+\quotes_language english
+\papercolumns 1
+\papersides 1
+\paperpagestyle default
+\tracking_changes false
+\output_changes false
+\author "" 
+\author "" 
+\end_header
+
+\begin_body
+
+\begin_layout Title
+Tahoe Distributed Filesharing System Loss Model
+\end_layout
+
+\begin_layout Author
+Shawn Willden
+\end_layout
+
+\begin_layout Email
+shawn at willden.org
+\end_layout
+
+\begin_layout Abstract
+The abstract goes here
+\end_layout
+
+\begin_layout Section
+Problem Statement
+\end_layout
+
+\begin_layout Standard
+The allmydata Tahoe distributed file system uses Reed-Solomon erasure coding
+ to split files into 
+\begin_inset Formula $N$
+\end_inset
+
+ shares, each of which is then delivered to a randomly-selected peer in
+ a distributed network.
+ The file can later be reassembled from any 
+\begin_inset Formula $k\leq N$
+\end_inset
+
+ of the shares, if they are available.
+\end_layout
+
+\begin_layout Standard
+Over time shares are lost for a variety of reasons.
+ Storage servers may crash, be destroyed or simply be removed from the network.
+ To mitigate such losses, Tahoe network clients employ a repair agent which
+ scans the peers once per time period 
+\begin_inset Formula $A$
+\end_inset
+
+ and determines how many of the shares remain.
+ If less than 
+\begin_inset Formula $R$
+\end_inset
+
+ (
+\begin_inset Formula $k\leq R\leq N$
+\end_inset
+
+) shares remain, then the repairer reconstructs the file shares and redistribute
+s the missing ones, bringing the availability back up to full.
+\end_layout
+
+\begin_layout Standard
+The question we're trying to answer is "What's the probability that we'll
+ be able to reassemble the file at some later time 
+\begin_inset Formula $T$
+\end_inset
+
+?".
+ We'd also like to be able to determine what values we should choose for
+ 
+\begin_inset Formula $k$
+\end_inset
+
+, 
+\begin_inset Formula $N$
+\end_inset
+
+, 
+\begin_inset Formula $A$
+\end_inset
+
+, and 
+\begin_inset Formula $R$
+\end_inset
+
+ in order to ensure 
+\begin_inset Formula $Pr[loss]\leq t$
+\end_inset
+
+ for some threshold probability 
+\begin_inset Formula $t$
+\end_inset
+
+.
+ This is an optimization problem because although we could obtain very low
+ 
+\begin_inset Formula $Pr[loss]$
+\end_inset
+
+ by choosing small 
+\begin_inset Formula $k,$
+\end_inset
+
+ large 
+\begin_inset Formula $N$
+\end_inset
+
+, small 
+\begin_inset Formula $A$
+\end_inset
+
+, and setting 
+\begin_inset Formula $R=N$
+\end_inset
+
+, these choices have costs.
+ The peer storage and bandwidth consumed by the share distribution process
+ are approximately 
+\begin_inset Formula $\nicefrac{N}{k}$
+\end_inset
+
+ times the size of the original file, so we would like to reduce this ratio
+ as far as possible consistent with 
+\begin_inset Formula $Pr[loss]\leq t$
+\end_inset
+
+.
+ Likewise, frequent and aggressive repair process can be used to ensure
+ that the number of shares available at any time is very close to 
+\begin_inset Formula $N,$
+\end_inset
+
+ but at a cost in bandwidth.
+\end_layout
+
+\begin_layout Section
+Reliability
+\end_layout
+
+\begin_layout Standard
+The probability that the file becomes unrecoverable is dependent upon the
+ probability that the peers to whom we send shares are able to return those
+ copies on demand.
+ Shares that are returned in corrupted form can be detected and discarded,
+ so there is no need to distinguish between corruption and loss.
+\end_layout
+
+\begin_layout Standard
+There are a large number of factors that affect share availability.
+ Availability can be temporarily interrupted by peer unavailability, due
+ to network outages, power failures or administrative shutdown, among other
+ reasons.
+ Availability can be permanently lost due to failure or corruption of storage
+ media, catastrophic damage to the peer system, administrative error, withdrawal
+ from the network, malicious corruption, etc.
+\end_layout
+
+\begin_layout Standard
+The existence of intermittent failure modes motivates the introduction of
+ a distinction between 
+\noun on
+availability
+\noun default
+ and 
+\noun on
+reliability
+\noun default
+.
+ Reliability is the probability that a share is retrievable assuming intermitten
+t failures can be waited out, so reliability considers only permanent failures.
+ Availability considers all failures, and is focused on the probability
+ of retrieval within some defined time frame.
+\end_layout
+
+\begin_layout Standard
+Another consideration is that some failures affect multiple shares.
+ If multiple shares of a file are stored on a single hard drive, for example,
+ failure of that drive may lose them all.
+ Catastrophic damage to a data center may destroy all shares on all peers
+ in that data center.
+\end_layout
+
+\begin_layout Standard
+While the types of failures that may occur are pretty consistent across
+ even very different peers, their probabilities differ dramatically.
+ A professionally-administered blade server with redundant storage, power
+ and Internet located in a carefully-monitored data center with automatic
+ fire suppression systems is much less likely to become either temporarily
+ or permanently unavailable than the typical virus and malware-ridden home
+ computer on a single cable modem connection.
+ A variety of situations in between exist as well, such as the case of the
+ author's home file server, which is administered by an IT professional
+ and uses RAID level 6 redundant storage, but runs on old, cobbled-together
+ equipment, and has a consumer-grade Internet connection.
+\end_layout
+
+\begin_layout Standard
+To begin with, let's use a simple definition of reliability:
+\end_layout
+
+\begin_layout Definition
+
+\noun on
+Reliability
+\noun default
+ is the probability 
+\begin_inset Formula $p_{i}$
+\end_inset
+
+ that a share 
+\begin_inset Formula $s_{i}$
+\end_inset
+
+ will survive to (be retrievable at) time 
+\begin_inset Formula $T=A$
+\end_inset
+
+, ignoring intermittent failures.
+ That is, the probability that the share will be retrievable at the end
+ of the current repair cycle, and therefore usable by the repairer to regenerate
+ any lost shares.
+\end_layout
+
+\begin_layout Definition
+Reliability is clearly dependent on 
+\begin_inset Formula $A$
+\end_inset
+
+.
+ Short repair cycles offer less time for shares to 
+\begin_inset Quotes eld
+\end_inset
+
+decay
+\begin_inset Quotes erd
+\end_inset
+
+ into unavailability.
+\end_layout
+
+\begin_layout Subsection
+Fixed Reliability
+\end_layout
+
+\begin_layout Standard
+In the simplest case, the peers holding the file shares all have the same
+ reliability 
+\begin_inset Formula $p$
+\end_inset
+
+, and are all independent from one another.
+ Let 
+\begin_inset Formula $K$
+\end_inset
+
+ be a random variable that represents the number of shares that survive
+ 
+\begin_inset Formula $A$
+\end_inset
+
+.
+ Each share's survival can be viewed as an independent Bernoulli trial with
+ a success probability of 
+\begin_inset Formula $p$
+\end_inset
+
+, which means that 
+\begin_inset Formula $K$
+\end_inset
+
+ follows the binomial distribution with parameters 
+\begin_inset Formula $N$
+\end_inset
+
+ and 
+\begin_inset Formula $p$
+\end_inset
+
+ (
+\begin_inset Formula $K\sim B(N,p)$
+\end_inset
+
+).
+ The probability mass function (PMF) of the binomial distribution is:
+\begin_inset Formula \begin{equation}
+Pr(K=i)=f(i;N,p)=\binom{N}{i}p^{i}(1-p)^{N-i}\label{eq:binomial-pdf}\end{equation}
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+A file survives if at least 
+\begin_inset Formula $k$
+\end_inset
+
+ of the 
+\begin_inset Formula $N$
+\end_inset
+
+ shares survive.
+ Equation 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:binomial-pdf"
+
+\end_inset
+
+ gives the probability that exactly 
+\begin_inset Formula $i$
+\end_inset
+
+ shares survive, so the probability that fewer than 
+\begin_inset Formula $k$
+\end_inset
+
+ survive is the sum of the probabilities that 
+\begin_inset Formula $0,1,2,\ldots,k-1$
+\end_inset
+
+ shares survive.
+ That is:
+\end_layout
+
+\begin_layout Standard
+\begin_inset Formula \begin{equation}
+Pr[failure]=\sum_{i=0}^{k-1}\binom{N}{i}p^{i}(1-p)^{N-i}\label{eq:simple-failure}\end{equation}
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Subsection
+Independent Reliability
+\end_layout
+
+\begin_layout Standard
+Equation 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:simple-failure"
+
+\end_inset
+
+ assumes that each share has the same probability of survival, but as explained
+ above, this is not typically true.
+ A more accurate model allows each share 
+\begin_inset Formula $s_{i}$
+\end_inset
+
+ an independent probability of survival 
+\begin_inset Formula $p_{i}$
+\end_inset
+
+.
+ Each share's survival can still be treated as an independent Bernoulli
+ trial, but with success probability 
+\begin_inset Formula $p_{i}$
+\end_inset
+
+.
+ Under this assumption, 
+\begin_inset Formula $K$
+\end_inset
+
+ follows a generalized distribution with parameters 
+\begin_inset Formula $N$
+\end_inset
+
+ and 
+\begin_inset Formula $p_{i},1\leq i\leq N$
+\end_inset
+
+.
+\end_layout
+
+\begin_layout Standard
+The PMF for this generalized 
+\begin_inset Formula $K$
+\end_inset
+
+ does not have a simple closed-form representation.
+ However, the PMFs for random variables representing individual share survival
+ do.
+ Let 
+\begin_inset Formula $S_{i}$
+\end_inset
+
+ be a random variable such that:
+\end_layout
+
+\begin_layout Standard
+\begin_inset Formula \[
+S_{i}=\begin{cases}
+1 & \textnormal{if }s_{i}\textnormal{ survives}\\
+0 & \textnormal{if }s_{i}\textnormal{ fails}\end{cases}\]
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+The PMF for 
+\begin_inset Formula $S_{i}$
+\end_inset
+
+ is very simple, 
+\begin_inset Formula $Pr(S_{i}=1)=p_{i}$
+\end_inset
+
+ and 
+\begin_inset Formula $Pr(S_{i}=0)=1-p_{i}$
+\end_inset
+
+.
+\end_layout
+
+\begin_layout Standard
+Observe that 
+\begin_inset Formula $\sum_{i=1}^{N}S_{i}=K$
+\end_inset
+
+.
+ Effectively, 
+\begin_inset Formula $K$
+\end_inset
+
+ has just been separated into the series of Bernoulli trials that make it
+ up.
+\end_layout
+
+\begin_layout Standard
+The discrete convolution theorem states that given random variables 
+\begin_inset Formula $X$
+\end_inset
+
+ and 
+\begin_inset Formula $Y$
+\end_inset
+
+ and their sum 
+\begin_inset Formula $Z=X+Y$
+\end_inset
+
+, if 
+\begin_inset Formula $Pr[X=x]=f(x)$
+\end_inset
+
+ and 
+\begin_inset Formula $Pr[Y=y]=g(y)$
+\end_inset
+
+ then 
+\begin_inset Formula $Pr[Z=z]=(f\star g)(z)$
+\end_inset
+
+ where 
+\begin_inset Formula $\star$
+\end_inset
+
+ denotes the convolution operation.
+ Stated in English, the probability mass function of the sum of two random
+ variables is the convolution of the probability mass functions of the two
+ random variables.
+\end_layout
+
+\begin_layout Standard
+Discrete convolution is defined as
+\end_layout
+
+\begin_layout Standard
+\begin_inset Formula \[
+(f\star g)(n)=\sum_{m=-\infty}^{\infty}f(m)\cdot g(n-m)\]
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+The infinite summation is no problem because the probability mass functions
+ we need to convolve are zero outside of a small range.
+\end_layout
+
+\begin_layout Standard
+According to the discrete convolution theorem, then, if 
+\begin_inset Formula $Pr[K=i]=f(i)$
+\end_inset
+
+ and 
+\begin_inset Formula $Pr[S_{i}=j]=g_{i}(j)$
+\end_inset
+
+, then 
+\begin_inset Formula $f=g_{1}\star g_{2}\star g_{3}\star\ldots\star g_{N}$
+\end_inset
+
+.
+ Since convolution is associative, this can also be written as 
+\begin_inset Formula \begin{equation}
+f=(((g_{1}\star g_{2})\star g_{3})\star\ldots)\star g_{N}\label{eq:convolution}\end{equation}
+
+\end_inset
+
+which enables 
+\begin_inset Formula $f$
+\end_inset
+
+ to be implemented as a sequence of convolution operations on the simple
+ PMFs of the random variables 
+\begin_inset Formula $S_{i}$
+\end_inset
+
+.
+ In fact, as values of 
+\begin_inset Formula $N$
+\end_inset
+
+ get large, equation 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:convolution"
+
+\end_inset
+
+ turns out to be a more effective means of computing the PMF of 
+\begin_inset Formula $K$
+\end_inset
+
+ even in the case of identical share reliabilities (the binomial distribution
+), primarily because
+ the binomial calculation in equation 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "eq:binomial-pdf"
+
+\end_inset
+
+ produces very large values that overflow unless arbitrary precision numeric
+ representations are used, or unless the binomial calculation is very cleverly
+ interleaved with the powers of 
+\begin_inset Formula $p$
+\end_inset
+
+ and 
+\begin_inset Formula $1-p$
+\end_inset
+
+ to keep the values manageable.
+\end_layout
+
+\begin_layout Standard
+Note also that it is not necessary to have very simple PMFs like those of
+ the 
+\begin_inset Formula $S_{i}$
+\end_inset
+
+.
+ Any share or set of shares that has a known PMF can be combined with any
+ other set with a known PMF by convolution, as long as the two share sets
+ are independent.
+ Since PMFs are easily represented as simple lists of probabilities, where
+ the 
+\begin_inset Formula $i$
+\end_inset
+
+th element in the list corresponds to 
+\begin_inset Formula $Pr[K=i]$
+\end_inset
+
+, these functions are easily managed in software, and computing the convolution
+ is both simple and efficient.
+\end_layout
+
+\begin_layout Subsection
+Multiple Failure Modes
+\end_layout
+
+\begin_layout Standard
+In modeling share survival probabilities, it's useful to be able to analyze
+ separately each of the various failure modes.
+ If reliable statistics for disk failure can be obtained, then a probability
+ mass function for that form of failure can be generated.
+ Similarly 
+\end_layout
+
+\end_body
+\end_document
hunk ./src/allmydata/test/test_util.py 4
-import os, time
+import os, time, random
hunk ./src/allmydata/test/test_util.py 12
+from allmydata.util import statistics
hunk ./src/allmydata/test/test_util.py 167
+    def test_round_sigfigs(self):
+        f = mathutil.round_sigfigs
+        self.failUnlessEqual(f(22.0/3, 4), 7.3330000000000002)
+
+class Statistics(unittest.TestCase):
+    def should_assert(self, msg, func, *args, **kwargs):
+        try:
+            func(*args, **kwargs)
+            self.fail(msg)
+        except AssertionError:
+            pass
+
+    def failUnlessListEqual(self, a, b, msg = None):
+        self.failUnlessEqual(len(a), len(b))
+        for i in range(len(a)):
+            self.failUnlessEqual(a[i], b[i], msg)
+
+    def failUnlessListAlmostEqual(self, a, b, places = 7, msg = None):
+        self.failUnlessEqual(len(a), len(b))
+        for i in range(len(a)):
+            self.failUnlessAlmostEqual(a[i], b[i], places, msg)
+
+    def test_binomial_coeff(self):
+        f = statistics.binomial_coeff
+        self.failUnlessEqual(f(20, 0), 1)
+        self.failUnlessEqual(f(20, 1), 20)
+        self.failUnlessEqual(f(20, 2), 190)
+        self.failUnlessEqual(f(20, 8), f(20, 12))
+        self.should_assert("Should assert if n < k", f, 2, 3)
+
+    def test_binomial_distribution_pmf(self):
+        f = statistics.binomial_distribution_pmf
+
+        pmf_comp = f(2, .1)
+        pmf_stat = [0.81, 0.18, 0.01]
+        self.failUnlessListAlmostEqual(pmf_comp, pmf_stat)
+        
+        # Summing across a PMF should give the total probability 1
+        self.failUnlessAlmostEqual(sum(pmf_comp), 1)
+        self.should_assert("Should assert if not 0<=p<=1", f, 1, -1)
+        self.should_assert("Should assert if n < 1", f, 0, .1)
+
+    def test_survival_pmf(self):
+        f = statistics.survival_pmf
+        # Cross-check binomial-distribution method against convolution
+        # method.
+        p_list = [.9999] * 100 + [.99] * 50 + [.8] * 20
+        pmf1 = statistics.survival_pmf_via_conv(p_list)
+        pmf2 = statistics.survival_pmf_via_bd(p_list)
+        self.failUnlessListAlmostEqual(pmf1, pmf2)
+        self.failUnless(statistics.valid_pmf(pmf1))
+        self.should_assert("Should assert if p_i > 1", f, [1.1])
+        self.should_assert("Should assert if p_i < 0", f, [-.1])
+        
+
+    def test_convolve(self):
+        f = statistics.convolve
+        v1 = [ 1, 2, 3 ]
+        v2 = [ 4, 5, 6 ]
+        v3 = [ 7, 8 ]
+        v1v2result = [ 4, 13, 28, 27, 18 ]
+        # Convolution is commutative
+        r1 = f(v1, v2)
+        r2 = f(v2, v1)
+        self.failUnlessListEqual(r1, r2, "Convolution should be commutative")
+        self.failUnlessListEqual(r1, v1v2result, "Didn't match known result")
+        # Convolution is associative
+        r1 = f(f(v1, v2), v3)
+        r2 = f(v1, f(v2, v3))
+        self.failUnlessListEqual(r1, r2, "Convolution should be associative")
+        # Convolution is distributive
+        r1 = f(v3, [ a + b for a, b in zip(v1, v2) ])
+        tmp1 = f(v3, v1)
+        tmp2 = f(v3, v2)
+        r2 = [ a + b for a, b in zip(tmp1, tmp2) ]
+        self.failUnlessListEqual(r1, r2, "Convolution should be distributive")
+        # Convolution is scalar multiplication associative
+        tmp1 = f(v1, v2)
+        r1 = [ a * 4 for a in tmp1 ]
+        tmp2 = [ a * 4 for a in v1 ]
+        r2 = f(tmp2, v2)
+        self.failUnlessListEqual(r1, r2, "Convolution should be scalar multiplication associative")
+
+    def test_find_k(self):
+        f = statistics.find_k
+        g = statistics.pr_file_loss
+        plist = [.9] * 10 + [.8] * 10
+        t = .0001
+        k = f(plist, t)
+        self.failUnlessEqual(k, 10)
+        self.failUnless(g(plist, k) < t)
+
+    def test_pr_file_loss(self):
+        f = statistics.pr_file_loss
+        plist = [.5] * 10
+        self.failUnlessEqual(f(plist, 3), .0546875)
+
+    def test_pr_backup_file_loss(self):
+        f = statistics.pr_backup_file_loss
+        plist = [.5] * 10
+        self.failUnlessEqual(f(plist, .5, 3), .02734375)
+
hunk ./src/allmydata/util/mathutil.py 77
+def round_sigfigs(f, n):
+    fmt = "%." + str(n-1) + "e"
+    return float(fmt % f)
+
addfile ./src/allmydata/util/statistics.py
hunk ./src/allmydata/util/statistics.py 1
+# Copyright (c) 2009 Shawn Willden
+# mailto:shawn at willden.org
+
+from __future__ import division
+from mathutil import round_sigfigs
+import math
+import array
+
+def pr_file_loss(p_list, k):
+    """
+    Probability of single-file loss for shares with reliabilities in
+    p_list.
+
+    Computes the probability that a single file will become
+    unrecoverable, based on the individual share survival
+    probabilities and k (number of shares needed for recovery).
+
+    Example: pr_file_loss([.9] * 5 + [.99] * 5, 3) returns the
+    probability that a file with k=3, N=10 and stored on five servers
+    with reliability .9 and five servers with reliability .99 is lost.
+
+    See survival_pmf docstring for important statistical assumptions.
+
+    """
+    assert 0 < k <= len(p_list)
+    assert valid_probability_list(p_list)
+
+    # Sum elements 0 through k-1 of the share set PMF to get the
+    # probability that less than k shares survived.
+    return sum(survival_pmf(p_list)[0:k])
+
+def survival_pmf(p_list):
+    """
+    Return the collective PMF of share survival count for a set of
+    shares with the individual survival probabilities in p_list.
+
+    Example: survival_pmf([.99] * 10 + [.8] * 6) returns the
+    probability mass function for the number of shares that will
+    survive from an initial set of 16, 10 with p=0.99 and 6 with
+    p=0.8.  The ith element of the resulting list is the probability
+    that exactly i shares will survive.
+
+    This calculation makes the following assumptions:
+
+    1.  p_list[i] is the probability that any individual share will
+    will survive during the time period in question (whatever that may
+    be).
+
+    2.  The share failures are "independent", in the statistical
+    sense.  Note that if a group of shares are stored on the same
+    machine or even in the same data center, they are NOT independent
+    and this calculation is therefore wrong.
+    """
+    assert valid_probability_list(p_list)
+
+    pmf = survival_pmf_via_conv(p_list)
+
+    assert valid_pmf(pmf)
+    return pmf
+
+def survival_pmf_via_bd(p_list):
+    """
+    Compute share survival PMF using the binomial distribution PMF as
+    much as possible.
+
+    This is more efficient than the convolution method below, but
+    doesn't work for large numbers of shares because the
+    binomial_coeff calculation blows up.  Since the efficiency gains
+    only matter in the case of large numbers of shares, it's pretty
+    much useless except for testing the convolution method.
+
+    Note that this function does little to no error checking and is
+    intended for internal use and testing only.
+    """
+    pmf_list = [ binomial_distribution_pmf(p_list.count(p), p) 
+                 for p in set(p_list) ]
+    return reduce(convolve, pmf_list)
+
+def survival_pmf_via_conv(p_list):
+    """
+    Compute share survival PMF using iterated convolution of trivial
+    PMFs.
+
+    Note that this function does little to no error checking and is
+    intended for internal use and testing only.
+    """
+    pmf_list = [ [1 - p, p] for p in p_list ]
+    return reduce(convolve, pmf_list)
+
+def print_pmf(pmf, n):
+    """
+    Print a PMF in a readable form, with values rounded to n
+    significant digits. 
+    """
+    for k, p in enumerate(pmf):
+        print "i=" + str(k) + ":", round_sigfigs(p, n)
+
+def pr_backup_file_loss(p_list, backup_p, k):
+    """
+    Probability of single-file loss in a backup context
+
+    Same as pr_file_loss, except it factors in the probability of
+    survival of the original source, specified as backup_p.  Because
+    that's a precondition to caring about the availability of the
+    backup, it's an independent event.
+    """
+    assert valid_probability_list(p_list)
+    assert 0 < backup_p <= 1
+    assert 0 < k <= len(p_list)
+
+    return pr_file_loss(p_list, k) * (1 - backup_p)
+
+
+def find_k(p_list, target_loss_prob):
+    """
+    Find the highest k value that achieves the targeted loss
+    probability, given the share reliabilities given in p_list.
+    """
+    assert valid_probability_list(p_list)
+    assert 0 < target_loss_prob < 1
+
+    pmf = survival_pmf(p_list)
+    return find_k_from_pmf(pmf, target_loss_prob)
+
+def find_k_from_pmf(pmf, target_loss_prob):
+    """
+    Find the highest k value that achieves the targeted loss 
+    probability, given the share survival PMF given in pmf.
+    """
+    assert valid_pmf(pmf)
+    assert 0 < target_loss_prob < 1
+
+    loss_prob = 0.0
+    for k, p_k in enumerate(pmf):
+        loss_prob += p_k
+        if loss_prob > target_loss_prob:
+            return k
+
+    k = len(pmf) - 1
+    return k
+
+def valid_pmf(pmf):
+    """
+    Validate that pmf looks like a proper discrete probability mass
+    function in list form.
+
+    Returns true if the elements of pmf sum to 1.
+    """
+    return round(sum(pmf),5) == 1.0
+
+def valid_probability_list(p_list):
+    """
+    Validate that p_list is a list of probabilities
+    """
+    for p in p_list:
+        if p < 0 or p > 1:
+            return False
+
+    return True
+
+def convolve(list_a, list_b):
+    """
+    Returns the discrete convolution of two lists.
+
+    Given two random variables X and Y, the convolution of their
+    probability mass functions Pr(X) and Pr(Y) is equal to the
+    Pr(X+Y).
+    """
+    n = len(list_a)
+    m = len(list_b)
+    
+    result = []
+    for i in range(n + m - 1):
+        sum = 0.0
+
+        lower = max(0, i - n + 1)
+        upper = min(m - 1, i)
+        
+        for j in range(lower, upper+1):
+            sum += list_a[i-j] * list_b[j]
+
+        result.append(sum)
+
+    return result
+
+def binomial_distribution_pmf(n, p):
+    """
+    Returns Pr(K), where K ~ B(n,p), as a list of values.
+
+    Returns the full probability mass function of a B(n, p) as a list
+    of values, where the kth element is Pr(K=k), or, in the Tahoe
+    context, the probability that exactly k copies of a file share
+    survive, when placed on n independent servers with survival
+    probability p.
+    """
+    assert p >= 0 and p <= 1, 'p=%s must be in the range [0,1]'%p
+    assert n > 0
+
+    result = []
+    for k in range(n+1):
+        result.append(math.pow(p    , k    ) * 
+                      math.pow(1 - p, n - k) * 
+                      binomial_coeff(n, k))
+
+    assert valid_pmf(result)
+    return result
+
+def binomial_coeff(n, k):
+    """
+    Returns the number of ways that k items can be chosen from a set
+    of n.
+    """
+    assert n >= k
+
+    if k > n/2:
+        k = n - k
+
+    # Use integer arithmetic; the running product C(n-k+i, i) is always
+    # an integer, so the floor division is exact at every step.
+    accum = 1
+    for i in range(1, k+1):
+        accum = (accum * (n - k + i)) // i
+
+    return accum
}

Context:

[upload: use WriteBucketProxy_v2 when uploading a large file (with shares larger than 4GiB). This finally closes #346. I think we can now handle immutable files up to 48EiB.
warner at allmydata.com**20090113021442] 
[deep-check-and-repair: improve results and their HTML representation
warner at allmydata.com**20090113005619] 
[test_repairer.py: hush pyflakes: remove duplicate/shadowed function name, by using the earlier definition (which is identical)
warner at allmydata.com**20090112214509] 
[hush pyflakes by removing unused imports
warner at allmydata.com**20090112214120] 
[immutable repairer
zooko at zooko.com**20090112170022
 Ignore-this: f17cb07b15a554b31fc5203cf4f64d81
 This implements an immutable repairer by marrying a CiphertextDownloader to a CHKUploader.  It extends the IDownloadTarget interface so that the downloader can provide some metadata that the uploader requires.
 The processing is incremental -- it uploads the first segments before it finishes downloading the whole file.  This is necessary so that you can repair large files without running out of RAM or using a temporary file on the repairer.
 It requires only a verifycap, not a readcap.  That is: it doesn't need or use the decryption key, only the integrity check codes.
 There are several tests marked TODO and several instances of XXX in the source code.  I intend to open tickets to document further improvements to functionality and testing, but the current version is probably good enough for Tahoe-1.3.0.
] 
[util: dictutil: add DictOfSets.union(key, values) and DictOfSets.update(otherdictofsets)
zooko at zooko.com**20090112165539
 Ignore-this: 84fb8a2793238b077a7a71aa03ae9d2
] 
[setup: update doc in setup.cfg
zooko at zooko.com**20090111151319
 Ignore-this: 296bfa1b9dbdac876f2d2c8e4e2b1294
] 
[setup: Point setuptools at a directory on the allmydata.org test grid to find dependencies.
zooko at zooko.com**20090111151126
 Ignore-this: f5b6a8f5ce3ba08fea2573b5c582aba8
 Don't include an unrouteable IP address in find_links (fixes #574).
] 
[immutable: separate tests of immutable upload/download from tests of immutable checking/repair
zooko at zooko.com**20090110210739
 Ignore-this: 9e668609d797ec86a618ed52602c111d
] 
[trivial: minor changes to in-line comments -- mark plaintext-hash-tree as obsolete
zooko at zooko.com**20090110205601
 Ignore-this: df286154e1acde469f28e9bd00bb1068
] 
[immutable: make the web display of upload results more human-friendly, like they were before my recent change to the meaning of the "sharemap"
zooko at zooko.com**20090110200209
 Ignore-this: 527d067334f982cb2d3e185f72272f60
] 
[immutable: fix edit-o in interfaces.py documentation introduced in recent patch
zooko at zooko.com**20090110185408
 Ignore-this: f255d09aa96907c402583fc182379391
] 
[immutable: redefine the "sharemap" member of the upload results to be a map from shnum to set of serverids
zooko at zooko.com**20090110174623
 Ignore-this: 10300a2333605bc26c4ee9c7ab7dae10
 It used to be a map from shnum to a string saying "placed this share on XYZ server".  The new definition is more in keeping with the "sharemap" object that results from immutable file checking and repair, and it is more useful to the repairer, which is a consumer of immutable upload results.
] 
[naming: finish renaming "CheckerResults" to "CheckResults"
zooko at zooko.com**20090110000052
 Ignore-this: b01bd1d066d56eff3a6322e0c3a9fbdc
] 
[storage.py : replace 4294967295 with 2**32-1: python does constant folding, I measured this statement as taking 50ns, versus the 400ns for the call to min(), or the 9us required for the 'assert not os.path.exists' syscall
warner at allmydata.com**20090110015222] 
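As an aside on the constant-folding claim in the storage.py entry above: a quick check with Python's compile()/dis machinery confirms that CPython folds `2**32-1` into a single constant at compile time, so no arithmetic happens at runtime. This is an illustrative sketch, not part of any patch:

```python
import dis

# The storage.py patch above relies on CPython folding the expression
# 2**32-1 into one constant at compile time, so it costs a single
# LOAD_CONST at runtime instead of two arithmetic operations.
code = compile("x = 2**32-1", "<example>", "exec")

# The folded value appears directly in the code object's constants.
assert 4294967295 in code.co_consts

# The disassembly shows LOAD_CONST 4294967295 and no BINARY_* opcodes.
dis.dis(code)
```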
[storage.py: announce a maximum-immutable-share-size based upon a 'df' of the disk. Fixes #569, and this should be the last requirement for #346 (remove 12GiB filesize limit)
warner at allmydata.com**20090110013736] 
[set bin/tahoe executable permissions and leave build_tahoe in sys.argv
cgalvan at mail.utexas.edu**20090109210640] 
[setup: merge relaxation of the version of setuptools that we require at runtime with an indentation change
zooko at zooko.com**20090109190949
 Ignore-this: eb396c71563b9917c8a485efc5bebb36
] 
[setup: remove custom Trial class inside our setup.py and use the setuptools_trial plugin
zooko at zooko.com**20081205232207
 Ignore-this: e0f68169e8ac1b5a54b796e8905c7b80
] 
[setup: we require pywin32 if building on Windows (plus some formatting and comment fixes)
zooko at zooko.com**20081205231911
 Ignore-this: c1d1966cfe458a6380bfd5dce09010ff
] 
[fix bin/tahoe executable for Windows
cgalvan at mail.utexas.edu**20090109184222] 
[use subprocess.call instead of os.execve in bin/tahoe
cgalvan at mail.utexas.edu**20090109180300] 
[setup: attempt to remove the custom setuptools-ish logic in setup.py -- the result works on my Windows box but doesn't yield a working ./bin/tahoe on Windows, and hasn't been tested yet on other platforms
zooko at zooko.com**20081205233054
 Ignore-this: 843e7514870d7a4e708646acaa7c9699
] 
[setup: integrate the bundled setuptools_trial plugin with Chris Galvan's patch to use that plugin
zooko at zooko.com**20081201174804
 Ignore-this: 5d03e936cf45f67a39f993704024788c
] 
[use_setuptools_trial.patch
cgalvan at mail.utexas.edu**20081121205759] 
[setup: bundle setuptools_trial in misc/dependencies/
zooko at zooko.com**20081201174438
 Ignore-this: f13a4a1af648f9ab9b3b3438cf94053f
] 
[test_helper: hush pyflakes by avoiding use of 'uri' as a variable, since it shadows an import of the same name
warner at allmydata.com**20090109025941] 
[immutable/checker: include a summary (with 'Healthy' or 'Not Healthy' and a count of shares) in the checker results
warner at allmydata.com**20090109020145] 
[webapi/deep-manifest t=JSON: don't return the (large) manifest/SI/verifycap lists unless the operation has completed, to avoid the considerable CPU+memory cost of creating the JSON (for 330k dirnodes, it could take two minutes to generate 275MB of JSON). That cost must be paid eventually, but not on every poll
warner at allmydata.com**20090109015932] 
[dirnode deep-traversal: remove use of Limiter, stick with strict depth-first-traversal, to reduce memory usage during very large (300k+ dirnode) traversals
warner at allmydata.com**20090109014116] 
[immutable: add a monitor API to CiphertextDownloader with which to tell it to stop its work
zooko at zooko.com**20090108204215
 Ignore-this: f96fc150fa68fc2cec46c943171a5d48
] 
[naming: Rename a few things which I touched or changed in the recent patch to download-without-decrypting.
zooko at zooko.com**20090108181307
 Ignore-this: 495ce8d8854c5db5a09b35b856809fba
 Rename "downloadable" to "target".
 Rename "u" to "v" in FileDownloader.__init__().
 Rename "_uri" to "_verifycap" in FileDownloader.
 Rename "_downloadable" to "_target" in FileDownloader.
 Rename "FileDownloader" to "CiphertextDownloader".
] 
[immutable: refactor download to do only download-and-decode, not decryption
zooko at zooko.com**20090108175349
 Ignore-this: 1e4f26f6390a67aa5714650017c4dca1
 FileDownloader takes a verify cap and produces ciphertext, instead of taking a read cap and producing plaintext.
 FileDownloader does all integrity checking including the mandatory ciphertext hash tree and the optional ciphertext flat hash, rather than expecting its target to do some of that checking.
 Rename immutable.download.Output to immutable.download.DecryptingOutput. An instance of DecryptingOutput can be passed to FileDownloader to use as the latter's target.  Text pushed to the DecryptingOutput is decrypted and then pushed to *its* target.
 DecryptingOutput satisfies the IConsumer interface, and if its target also satisfies IConsumer, then it forwards pause/unpause signals to its producer (which is the FileDownloader).
 This patch also changes some logging code to use the new logging mixin class.
 Check integrity of a segment and decrypt the segment one block-sized buffer at a time instead of copying the buffers together into one segment-sized buffer (reduces peak memory usage, I think, and is probably a tad faster/less CPU, depending on your encoding parameters).
 Refactor FileDownloader so that processing of segments and of the tail segment share as much code as possible.
 FileDownloader and FileNode take caps as instances of URI (Python objects), not as strings.
] 
[trivial: tiny changes to test code
zooko at zooko.com**20090108172048
 Ignore-this: b1a434cd40a87c3d027fef4ce609d25c
] 
[immutable: Make more parts of download use logging mixins and know what their "parent msg id" is.
zooko at zooko.com**20090108172530
 Ignore-this: a4296b5f9b75933d644fd222e1fba079
] 
[trivial: M-x whitespace-cleanup on src/immutable/download.py
zooko at zooko.com**20090108164901
 Ignore-this: bb62daf511e41a69860be657cde8df04
] 
[immutable: ValidatedExtendedURIProxy computes and stores the tail data size as a convenience to its caller.
zooko at zooko.com**20090108164139
 Ignore-this: 75c561d73b17418775faafa60fbbd45b
 The "tail data size" is how many of the bytes of the tail segment are data (as opposed to padding).
] 
[immutable: define a new interface IImmutableFileURI and declare that CHKFileURI and LiteralFileURI provide it
zooko at zooko.com**20090107182451
 Ignore-this: 12c256a0d20655cd73739d45fff0d4d8
] 
[util: log: allow empty msgs (because downloader is using the "format" alternative with no "msg" argument)
zooko at zooko.com**20090107175411
 Ignore-this: 832c333bf027a30a2fcf96e462297ac5
] 
['tahoe cp -r', upon encountering a dangling symlink, would assert out.
Larry Hosken <tahoe at lahosken.san-francisco.ca.us>**20090108055114
 Ignore-this: 46e75845339faa69ffb3addb7ce74f28
 This was somewhat sad; the assertion didn't say what path caused the
 error, or what went wrong.  So... silently skip over things that are
 neither dirs nor files.
] 
[immutable: fix error in validation of ciphertext hash tree and add test for that code
zooko at zooko.com**20090108054012
 Ignore-this: 3241ce66373ebc514ae6e6f086f6daa2
 pyflakes pointed out to me that I had committed some code that is untested, since it uses an undefined name.  This patch exercises that code -- the validation of the ciphertext hash tree -- by corrupting some of the share files in a very specific way, and also fixes the bug.
] 
[immutable: do not catch arbitrary exceptions/failures from the attempt to get a crypttext hash tree -- catch only ServerFailure, IntegrityCheckReject, LayoutInvalid, ShareVersionIncompatible, and DeadReferenceError
zooko at zooko.com**20090108042551
 Ignore-this: 35f208af1b9f8603df25ed69047360d1
 Once again I inserted a bug into the code, and once again it was hidden by something catching arbitrary exception/failure and assuming that it means the server failed to provide valid data.
] 
[download: make sure you really get all the crypttext hashes
zooko at zooko.com**20090108022638
 Ignore-this: c1d5ebb048e81f706b9098e26876e040
 We were not making sure that we really got all the crypttext hashes during download.  If a server were to return less than the complete set of crypttext hashes, then our subsequent attempt to verify the correctness of the ciphertext would fail.  (And it wouldn't be obvious without very careful debugging why it had failed.)
 This patch makes it so that you keep trying to get ciphertext hashes until you have a full set or you run out of servers to ask.
] 
[util: deferredutil: undo my recent patch to use our own implementation of gatherResults
zooko at zooko.com**20090107170005
 Ignore-this: c8c5421b47ab5a83c7ced8b08add80e8
 It seems to cause lots of failures on some builders.
] 
[util: deferredutil: implement our own gatherResults instead of using Twisted's
zooko at zooko.com**20090107163207
 Ignore-this: aa676b2b6cfb73bbca15827cb7c0a43e
 Because we want to maintain backwards compatibility to Twisted 2.4.0.
] 
[trivial: M-x whitespace-cleanup
zooko at zooko.com**20090107162528
 Ignore-this: 69bef4518477ca875785f0e0b8ab0000
] 
[util: deferredutil: add basic test for deferredutil.gatherResults
zooko at zooko.com**20090107141342
 Ignore-this: ad457053c8ee3a04921fdcdb639c03d
 Also I checked and Twisted 2.4.0 supports .subFailure and the other parts of the API that we require.
] 
[trivial: fix redefinition of name "log" in imports (pyflakes)
zooko at zooko.com**20090107040829
 Ignore-this: cdcf7ff84082323ebc022b186127e678
] 
[immutable: refactor uploader to do just encoding-and-uploading, not encryption
zooko at zooko.com**20090107034822
 Ignore-this: 681f3ad6827a93f1431d6e3f818840a9
 This makes Uploader take an EncryptedUploadable object instead of an Uploadable object.  I also changed it to return a verify cap instead of a tuple of the bits of data that one finds in a verify cap.
 This will facilitate hooking together an Uploader and a Downloader to make a Repairer.
 Also move offloaded.py into src/allmydata/immutable/.
] 
[trivial: whitespace and docstring tidyups
zooko at zooko.com**20090107034104
 Ignore-this: 34db3eec599efbb2088a87333abfb797
] 
[storage.py: explain what this large and hard-to-recognize 4294967295 number is
warner at allmydata.com**20090106195721] 
[rename "checker results" to "check results", because it is more parallel to "check-and-repair results"
zooko at zooko.com**20090106193703
 Ignore-this: d310e3d7f42a76df68536650c996aa49
] 
[immutable: tests: verifier doesn't always catch corrupted share hashes
zooko at zooko.com**20090106190449
 Ignore-this: a9be83b8e2350ae9af808476015fe0e4
 Maybe it already got one of the corrupted hashes from a different server and it doesn't double-check that the hash from every server is correct.  Or another problem.  But in any case I'm marking this as TODO because an even better (more picky) verifier is less urgent than repairer.
] 
[immutable: fix the writing of share data size into share file in case the share file is used by a < v1.3.0 storage server
zooko at zooko.com**20090106182404
 Ignore-this: 7d6025aba05fe8140bb712e71e89f1ba
 Brian noticed that the constant was wrong, and in fixing that I noticed that we should be saturating instead of modding.
 This code would never matter unless a server downgraded or a share migrated from Tahoe >= v1.3.0 to Tahoe < v1.3.0.  Even in that case, this bug would never matter unless the share size were exactly 4,294,967,296 bytes long.
 Brian, for good reason, wanted this to be spelled "2**32" instead of "4294967296", but I couldn't stand to see a couple more Python bytecodes interpreted in the middle of a core, frequent operation on the server like immutable share creation.
 
] 
[trivial: whitespace cleanup
zooko at zooko.com**20090106172058
 Ignore-this: 50ee40d42cc8d8f39d2f8ed15f6790d4
] 
[util: base32: require str-not-unicode inputs -- effectively rolls back [3306] and [3307]
zooko at zooko.com**20090106164122
 Ignore-this: 1030c2d37e636d194c99ec99707ae86f
] 
[trivial: fix a bunch of pyflakes complaints
zooko at zooko.com**20090106140054
 Ignore-this: 9a515a237248a148bcf8db68f70566d4
] 
[cli: make startstop_node wait 40 seconds instead of 20 for a process to go away after we signalled it to go away, before emitting a warning
zooko at zooko.com**20090106135106
 Ignore-this: 2da4b794b6a7e2e2ad6904cce61b0f10
 Because the unit tests on the VirtualZooko buildslave failed when it took 31 seconds for a process to go away.
 Perhaps getting a warning message after only 5 seconds instead of 40 seconds is desirable, and we should change the unit tests and set this back to 5, but I don't know exactly how to change the unit tests. Perhaps match this particular warning message about the shutdown taking a while and allow the code under test to pass if the only stderr that it emits is this warning.
] 
[immutable: new checker and verifier
zooko at zooko.com**20090106002818
 Ignore-this: 65441f8fdf0db8bcedeeb3fcbbd07d12
 New checker and verifier use the new download class.  They are robust against various sorts of failures or corruption.  They return detailed results explaining what they learned about your immutable files.  Some grotesque sorts of corruption are not properly handled yet, and those ones are marked as TODO or commented-out in the unit tests.
 There is also a repairer module in this patch with the beginnings of a repairer in it.  That repairer is mostly just the interface to the outside world -- the core operation of actually reconstructing the missing data blocks and uploading them is not in there yet.
 This patch also refactors the unit tests in test_immutable so that the handling of each kind of corruption is reported as passing or failing separately, can be separately TODO'ified, etc.  The unit tests are also improved in various ways to require more of the code under test or to stop requiring unreasonable things of it.  :-)
 
] 
[trivial: fix inline comment in test code
zooko at zooko.com**20090105235342
 Ignore-this: b3d79b9644052e6402b2f7d0125f678a
] 
[immutable: handle another form of share corruption with LayoutInvalid exception instead of AssertionError
zooko at zooko.com**20090105234645
 Ignore-this: fee5f6572efca5435ef54ed32552ca9d
] 
[trivial: remove unused import (pyflakes)
zooko at zooko.com**20090105233120
 Ignore-this: 47b6989ffa5b3a5733e45e8feb507959
] 
[immutable: skip the test of large files, because that is too hard on the host if it doesn't efficiently handle sparse files
zooko at zooko.com**20090105230727
 Ignore-this: 7d35a6cdb1ea6be2adf0e6dacefe01a7
] 
[immutable: raise a LayoutInvalid exception instead of an AssertionError if the share is corrupted so that the sharehashtree is the wrong size
zooko at zooko.com**20090105200114
 Ignore-this: b63d028c44dcd04ef424d6460b46e349
] 
[immutable: stop reading past the end of the sharefile in the process of optimizing download -- Tahoe storage servers < 1.3.0 return an error if you read past the end of the share file
zooko at zooko.com**20090105194057
 Ignore-this: 365e1f199235a55c0354ba6cb2b05a04
] 
[immutable: tidy up the notification of waiters for ReadBucketProxy
zooko at zooko.com**20090105193522
 Ignore-this: 6b93478dae3d627b9d3cbdd254afbe7e
] 
[immutable: refactor downloader to be more reusable for checker/verifier/repairer (and better)
zooko at zooko.com**20090105155145
 Ignore-this: 29a22b1eb4cb530d4b69c12aa0d00740
 
 The code for validating the share hash tree and the block hash tree has been rewritten to make sure it handles all cases, to share metadata about the file (such as the share hash tree, block hash trees, and UEB) among different share downloads, and not to require hashes to be stored on the server unnecessarily, such as the roots of the block hash trees (not needed since they are also the leaves of the share hash tree), and the root of the share hash tree (not needed since it is also included in the UEB).  It also passes the latest tests including handling corrupted shares well.
   
 ValidatedReadBucketProxy takes a share_hash_tree argument to its constructor, which is a reference to a share hash tree shared by all ValidatedReadBucketProxies for that immutable file download.
   
 ValidatedReadBucketProxy requires the block_size and share_size to be provided in its constructor, and it then uses those to compute the offsets and lengths of blocks when it needs them, instead of reading those values out of the share.  The user of ValidatedReadBucketProxy therefore has to have first used a ValidatedExtendedURIProxy to compute those two values from the validated contents of the URI.  This pleasingly simplifies safety analysis: the client knows which span of bytes corresponds to a given block from the validated URI data, rather than from the unvalidated data stored on the storage server.  It also simplifies unit testing of verifier/repairer, because now it doesn't care about the contents of the "share size" and "block size" fields in the share.  It does not relieve the need for share data v2 layout, because we still need to store and retrieve the offsets of the fields which come after the share data, therefore we still need to use share data v2 with its 8-byte fields if we want to store share data larger than about 2^32.
   
 Specify which subset of the block hashes and share hashes you need while downloading a particular share.  In the future this will hopefully be used to fetch only a subset, for network efficiency, but currently all of them are fetched, regardless of which subset you specify.
   
 ReadBucketProxy hides the question of whether it has "started" or not (sent a request to the server to get metadata) from its user.
 
 Download is optimized to do as few roundtrips and as few requests as possible, hopefully speeding up download a bit.
 
] 
[util: add gatherResults which is a deferred-list-like thing that doesn't wrap failures in a FirstError
zooko at zooko.com**20090104165202
 Ignore-this: a284fb8ab8a00a39416a67dc5d9a451e
] 
[immutable: fix think-o in previous patch which caused all reads to return "", and also optimize by not opening the file when the answer is going to be ""
zooko at zooko.com**20090103200245
 Ignore-this: 8ac4d0b0399cd74e8a424ffbcf3d9eb9
] 
[immutable: when storage server reads from immutable share, don't try to read past the end of the file (Python allocates space according to the amount of data requested, so if there is corruption and that number is huge it will do a huge memory allocation)
zooko at zooko.com**20090103192222
 Ignore-this: e533a65d74437676d5116369fd7c663b
] 
[immutable: mark a failing download test as "todo", because I think it is revealing a limitation of the current downloader's handling of corrupted shares
zooko at zooko.com**20090103190003
 Ignore-this: 1d429912dda92d986e2ee366d73e088c
] 
[docs: update install.html to recommend Python v2 instead of Python v2.5.2
zooko at zooko.com**20090103183100
 Ignore-this: 5dbea379c59e0d9be817cdd9c8393d65
] 
[trivial: remove unused import (pyflakes)
zooko at zooko.com**20090103182215
 Ignore-this: 4a29a14fa4580460a5e61fa0aa88b9b2
] 
[merge_install.patch
cgalvan at mail.utexas.edu**20090102164434
 Ignore-this: aa6d4c05d583a0724eb218fef04c3940
] 
[setup: new install doc -- doesn't require GNU make or a C++ compiler any more!
zooko at zooko.com**20081201180933
 Ignore-this: 753e8d1e6f32e2ddcd7a082050114725
] 
[immutable: fix test for truncated reads of URI extension block size
zooko at zooko.com**20090103174427
 Ignore-this: d9ff9dfff88b4cc7aa6751ce2e9088a6
] 
[immutable: further loosen the performance-regression test to allow up to 45 reads
zooko at zooko.com**20090103174109
 Ignore-this: 614f7dba9c0d310a220e74e45441f07
 This does raise the question of whether there is any point to this test, since I apparently don't know what the answer *should* be, and whenever one of the buildbots fails, I redefine success.
 
 But, I'm about to commit a bunch of patches to implement checker, verifier, and repairer as well as to refactor downloader, and I would really like to know if these patches *increase* the number of reads required even higher than it currently is.
 
] 
[trivial: another place where I accidentally committed a note-to-self about the lease fields in the server-side share file
zooko at zooko.com**20090103172941
 Ignore-this: c23c7095ffccdf5aa033ed434b50582b
] 
[immutable: fix detection of truncated shares to take into account the fieldsize -- either 4 or 8
zooko at zooko.com**20090103005745
 Ignore-this: 710184bd90f73dc18f3899d90ec6e972
] 
[immutable: raise LayoutInvalid instead of struct.error when a share is truncated
zooko at zooko.com**20090103004806
 Ignore-this: 346c779045f79725965a0f2d3eea41f9
 To fix this error from the Windows buildslave:
 
 [ERROR]: allmydata.test.test_immutable.Test.test_download_from_only_3_remaining_shares
 
 Traceback (most recent call last):
   File "C:\Documents and Settings\buildslave\windows-native-tahoe\windows\build\src\allmydata\immutable\download.py", line 135, in _bad
     raise NotEnoughSharesError("ran out of peers, last error was %s" % (f,))
 allmydata.interfaces.NotEnoughSharesError: ran out of peers, last error was [Failure instance: Traceback: <class 'struct.error'>: unpack requires a string argument of length 4
 c:\documents and settings\buildslave\windows-native-tahoe\windows\build\support\lib\site-packages\foolscap-0.3.2-py2.5.egg\foolscap\call.py:667:_done
 c:\documents and settings\buildslave\windows-native-tahoe\windows\build\support\lib\site-packages\foolscap-0.3.2-py2.5.egg\foolscap\call.py:53:complete
 c:\Python25\lib\site-packages\twisted\internet\defer.py:239:callback
 c:\Python25\lib\site-packages\twisted\internet\defer.py:304:_startRunCallbacks
 --- <exception caught here> ---
 c:\Python25\lib\site-packages\twisted\internet\defer.py:317:_runCallbacks
 C:\Documents and Settings\buildslave\windows-native-tahoe\windows\build\src\allmydata\immutable\layout.py:374:_got_length
 C:\Python25\lib\struct.py:87:unpack
 ]
 ===============================================================================
 
] 
[immutable: whoops, it actually takes up to 39 reads sometimes to download a corrupted file
zooko at zooko.com**20090102234302
 Ignore-this: ef009d179eb4f84a56559017b633d819
] 
[immutable: add more detailed tests of download, including testing the count of how many reads different sorts of downloads take
zooko at zooko.com**20090102225459
 Ignore-this: d248eb3982fdb05b43329142a723f5a1
] 
[trivial: a few improvements to in-line doc and code, and renaming of test/test_immutable_checker.py to test/test_immutable.py
zooko at zooko.com**20090102224941
 Ignore-this: 27b97a06c3edad1821f43876b4350f3
 That file currently tests checker and verifier and repairer, and will soon also test downloader.
] 
[immutable: fix name change from BadOrMissingShareHash to BadOrMissingHash
zooko at zooko.com**20090102192709
 Ignore-this: 3f22ca1ee045beabb11559512ba130f4
 One of the instances of the name accidentally didn't get changed, and pyflakes noticed.  The new downloader/checker/verifier/repairer unit tests would also have noticed, but those tests haven't been rolled into a patch and applied to this repo yet...
] 
[trivial: remove unused import -- thanks, pyflakes
zooko at zooko.com**20090102192128
 Ignore-this: d99c7349ba6f8db971e31cf8789883d5
] 
[immutable: download.py: Raise the appropriate type of exception to indicate the cause of failure, e.g. BadOrMissingHash, ServerFailure, IntegrityCheckReject (which is a supertype of BadOrMissingHash).  This helps users (such as verifier/repairer) catch certain classes of reasons for "why did this download not work".  The tests of verifier/repairer test this code and rely on this code.
zooko at zooko.com**20090102185858
 Ignore-this: 377bf621bbb6e360a98fd287bb1593f1
] 
[immutable: ReadBucketProxy defines classes of exception: LayoutInvalid and its two subtypes RidiculouslyLargeURIExtensionBlock and ShareVersionIncompatible.  This helps users (such as verifier/repairer) catch certain classes of reasons for "why did this download not work".  This code gets exercised by the verifier/repairer unit tests, which corrupt the shares on disk in order to trigger problems like these.
zooko at zooko.com**20090102181554
 Ignore-this: 2288262a59ee695f524859ed4b0b39d5
] 
[immutable: ValidatedExtendedURIProxy computes and stores block_size and share_size for the convenience of its users
zooko at zooko.com**20090102174317
 Ignore-this: 2bab64048fffc05dc6592d617aeb412f
] 
[remove_sumo_install.patch
cgalvan at mail.utexas.edu**20090102162347
 Ignore-this: f328570b1da1ccfbaebc770d40748046
] 
[doc: remove notes to self that I accidentally included in a recent patch
zooko at zooko.com**20090102041457
 Ignore-this: d0039512dbde09811fdec48a2e00dc4
] 
[docs: remove caveat about Nevow incompatibility with Python 2.6 since the latest version of Nevow has fixed it
zooko at zooko.com**20090102034135
 Ignore-this: 4cb2ceb41f53e07dab0f623e01044edc
] 
[immutable: make the test of large files more likely to work by requesting to allocate space for only one huge share, not three
zooko at zooko.com**20081231215942
 Ignore-this: d7073de4764506550e184f8fdc670962
] 
[trivial: "M-x whitespace-cleanup", and also remove an unused variable
zooko at zooko.com**20081231214233
 Ignore-this: 54c33c205aa88de8655e4232d07f083e
] 
[immutable: storage servers accept any size shares now
zooko at zooko.com**20081231214226
 Ignore-this: 28669d591dddaff69088cba4483da61a
 Nathan Wilcox observed that the storage server can rely on the size of the share file combined with the count of leases to unambiguously identify the location of the leases.  This means that it can hold any size share data, even though the field nominally used to hold the size of the share data is only 32 bits wide.
 
 With this patch, the storage server still writes the "size of the share data" field (just in case the server gets downgraded to an earlier version which requires that field, or the share file gets moved to another server which is of an earlier vintage), but it doesn't use it.  Also, with this patch, the server no longer rejects requests to write shares which are >= 2^32 bytes in size, and it no longer rejects attempts to read such shares.
 
 This fixes http://allmydata.org/trac/tahoe/ticket/346 (increase share-size field to 8 bytes, remove 12GiB filesize limit), although there remains open a question of how clients know that a given server can handle large shares (by using the new versioning scheme, probably).
 
 Note that share size is also limited by another factor -- how big of a file we can store on the local filesystem on the server.  Currently allmydata.com typically uses ext3 and I think we typically have block size = 4 KiB, which means that the largest file is about 2 TiB.  Also, the hard drives themselves are only 1 TB, so the largest share is definitely slightly less than 1 TB, which means (when K == 3), the largest file is less than 3 TB.
 
 This patch also refactors the creation of new sharefiles so that only a single fopen() is used.
 
 This patch also helps with the unit-testing of repairer, since formerly it was unclear what repairer should expect to find if the "share data size" field was corrupted (some corruptions would have no effect, others would cause failure to download).  Now it is clear that repairer is not required to notice if this field is corrupted since it has no effect on download.  :-)
 
] 
[trivial: "M-x whitespace-cleanup" on immutable/layout.py
zooko at zooko.com**20081231210702
 Ignore-this: 8be47d01cf40d1f81aeb0011a0a0caa
] 
[trivial: remove unused import -- thanks, pyflakes
zooko at zooko.com**20081231212556
 Ignore-this: a70cd39a7d633bde2bb5275dfd4d3781
] 
[rrefutil: generically wrap any errback from callRemote() in a ServerFailure instance
zooko at zooko.com**20081231202830
 Ignore-this: c949eaf8589ed4c3c232f17808fdce6a
 This facilitates client code to easily catch ServerFailures without also catching exceptions arising from client-side code.
 See also:
 http://foolscap.lothar.com/trac/ticket/105 # make it easy to distinguish server-side failures/exceptions from client-side
] 
[immutable: more detailed tests for checker/verifier/repairer
zooko at zooko.com**20081231201838
 Ignore-this: dd16beef604b0917f4493bc4ef35ab74
 There are a lot of different ways that a share could be corrupted, or that attempting to download it might fail.  These tests attempt to exercise many of those ways and require the checker/verifier/repairer to handle each kind of failure well.
] 
[docs: add note about non-ascii chars in cli to NEWS
zooko at zooko.com**20081230081728
 Ignore-this: c6f45a1d944af3c77942a4bf740ee24c
] 
[cli: make startstop_node wait 20 seconds instead of 5 for a process to go away after we signalled it to go away
zooko at zooko.com**20081230072022
 Ignore-this: 3b0d47649e32b01ff55a506245c674c6
 Because the unit tests on the VirtualZooko buildslave failed when it took 16 seconds for a process to go away.
 Perhaps getting a notification after only 5 seconds instead of 20 seconds is desirable, and we should change the unit tests and set this back to 5, but I don't know exactly how to change the unit tests.  Perhaps match this particular warning message about the shutdown taking a while and allow the code under test to pass if the only stderr that it emits is this warning.
] 
[docs: editing changes and updated news in known_issues.txt
zooko at zooko.com**20081230070116
 Ignore-this: e5dddc4446e3335a6c4eee7472e0670e
] 
[docs: split historical/historical_known_issues.txt out of known_issues.txt
zooko at zooko.com**20081230065226
 Ignore-this: 9b6d0d679294110deeb0ea18b4ad7ac8
 All issues which are relevant to users of v1.1, v1.2, or v1.3 go in known_issues.txt.  All issues which are relevant to users of v1.0 go in historical/historical_known_issues.txt.
] 
[doc: sundry amendments to docs and in-line code comments
zooko at zooko.com**20081228225954
 Ignore-this: a38057b9bf0f00afeea1c468b2237c36
] 
[doc: add mention of "tahoe create-alias" in the security-warning section of CLI.txt
zooko at zooko.com**20081224211646
 Ignore-this: 6bb0ab3af59a79e05ebccb800d9a12b0
] 
[doc: trivial: remove trailing whitespace
zooko at zooko.com**20081224211634
 Ignore-this: 6ff234bc7632c3ae4d4f2be2198bb97d
] 
[doc: warn that unicode might not work, in CLI.txt
zooko at zooko.com**20081224211618
 Ignore-this: 89355b53aab40af1d45a3746bb90ed10
] 
[doc: use the term "filesystem" rather than "virtual drive" in CLI.txt
zooko at zooko.com**20081224211614
 Ignore-this: c9541955201671c1a3a8c6ca7be4e7d
] 
[cli: mark unicode filenames as unsupported -- see #534 for details
zooko at zooko.com**20081224192802
 Ignore-this: b209ccbd838f633ec201e2e97156847c
] 
[cli: undo the effects of [http://allmydata.org/trac/tahoe/changeset/20081222235453-92b7f-f841e18afb94e1fd95e6dafb799a3d876dd85c69]
zooko at zooko.com**20081224155317
 Ignore-this: d34ee20d89221357e32872d721d7685f
 We're just going to mark unicode in the cli as unsupported for tahoe-lafs-1.3.0.  Unicode filenames on the command-line do actually work for some platforms and probably only if the platform encoding is utf-8, but I'm not sure, and in any case for it to be marked as "supported" it would have to work on all platforms, be thoroughly tested, and also we would have to understand why it worked.  :-)
 
] 
[test: extend timeout on the hotline file that prevents the client from stopping itself
zooko at zooko.com**20081222030629
 Ignore-this: 391f48caef9d6ad558e540ded56a8075
 The 20-second timeout was apparently tripped on my Powerbook G4 "draco".
] 
[cli: decode all cli arguments, assuming that they are utf-8 encoded
zooko at zooko.com**20081222235453
 Ignore-this: d92b4d146e1dc9848c6a4b6aaaa3d1e9
 Also encode all args to urllib as utf-8 because urllib doesn't handle unicode objects.
 I'm not sure if it is appropriate to *assume* utf-8 encoding of cli args.  Perhaps the Right thing to do is to detect the platform encoding.  Any ideas?
 This patch is mostly due to François Deppierraz.
 
] 
[util/base32: the identity trans table needn't have any contents -- we are using string.translate solely to delete known chars
zooko at zooko.com**20081222234808
 Ignore-this: 8fe03ec6571726f44425fc5905387b78
] 
[util/base32: allow unicode inputs to a2b() or could_be_base32_encoded(), and encode them with utf-8 before processing them
zooko at zooko.com**20081222234713
 Ignore-this: e1eb4caed2f78b2fef0df4bbf8bb26f7
] 
[util/base32: loosen the precondition forbidding unicode and requiring str -- now it requires either unicode or str
zooko at zooko.com**20081222222237
 Ignore-this: 3481d644bdc5345facbc199d33653f37
 Hopefully this will make it so that tests pass with François Deppierraz's patch to fix the tahoe cli's handling of unicode arguments.
] 
[immutable: don't catch all exception when downloading, catch only DeadReferenceError and IntegrityCheckReject
zooko at zooko.com**20081221234135
 Ignore-this: 1abe05c3a5910378abc3920961f19aee
] 
[immutable: invent download.BadOrMissingHashError, which is raised in place of either hashtree.BadHashError or hashtree.NotEnoughHashesError, and which is a subclass of IntegrityCheckReject
zooko at zooko.com**20081221234130
 Ignore-this: 1b04d7e9402ebfb2cd4c7648eb16af84
] 
[dirnode: don't check MAC on entries in dirnodes
zooko at zooko.com**20081221233518
 Ignore-this: efacb56d18259219c910cf5c84b17340
 In an ancient version of directories, we needed a MAC on each entry.  In modern times, the entire dirnode comes with a digital signature, so the MAC on each entry is redundant.
 With this patch, we no longer check those MACs when reading directories, but we still produce them so that older readers will accept directories that we write.
 
] 
[immutable, checker, and tests: improve docstrings, assertions, tests
zooko at zooko.com**20081221210752
 Ignore-this: 403ed5ca120d085d582cd5695d8371f
 No functional changes, but remove unused code, improve or fix docstrings, etc.
] 
[cli: if response code from wapi server is not 200 then stop instead of proceeding
zooko at zooko.com**20081220134918
 Ignore-this: 907481c941fc5696630b9c118137fb52
 Also, include the data that failed to json parse in an exception raised by the json parser.
] 
[immutable: when downloading an immutable file, use primary shares if they are available
zooko at zooko.com**20081220131456
 Ignore-this: f7b8b76fd7df032673ab072384eaa989
 Primary shares require no erasure decoding so the more primary shares you have, the less CPU is used.
] 
[trivial: remove unused import (thanks, pyflakes)
zooko at zooko.com**20081219194629
 Ignore-this: 96e15d6de43dd1204a8933171f194189
] 
[try to tidy up uri-as-string vs. uri-as-object
zooko at zooko.com**20081219143924
 Ignore-this: 4280727007c29f5b3e9273a34519893f
 I get confused about whether a given argument or return value is a uri-as-string or uri-as-object.  This patch adds a lot of assertions that it is one or the other, and also changes CheckerResults to take objects not strings.
 In the future, I hope that we generally use Python objects except when importing into or exporting from the Python interpreter e.g. over the wire, the UI, or a stored file.
] 
[immutable: remove the last bits of code (only test code or unused code) which did something with plaintext hashes or plaintext hash trees
zooko at zooko.com**20081219141807
 Ignore-this: d10d26b279794383f27fa59ec4a50219
] 
[immutable: use new logging mixins to simplify logging
zooko at zooko.com**20081217000450
 Ignore-this: 7d942905d1ea8f34753dbb997e1857f3
] 
[immutable: refactor ReadBucketProxy a little
zooko at zooko.com**20081216235325
 Ignore-this: b3733257769eff3b3e9625bd04643fd6
] 
[debug: pass empty optional arguments to ReadBucketProxy
zooko at zooko.com**20081216235145
 Ignore-this: 7132cdc6a52767fbbcca03b242a16982
 because those arguments are about to become non-optional (for code other than test/debug code)
] 
[uri: generalize regexp that recognizes tahoe URLs to work for any host and port
zooko at zooko.com**20081216234930
 Ignore-this: 4a7716b8034c8e5ed9698a99f1ec5cb4
] 
[util: logging: refactor some common logging behavior into mixins
zooko at zooko.com**20081216233807
 Ignore-this: d91408bc984d1cf1fae30134f6cddb13
] 
[pyutil: assertutil: copy in simplified assertutil from pyutil
zooko at zooko.com**20081216233745
 Ignore-this: cd4a33186c8c134104f07018ab448583
] 
[pyutil: assertutil: simplify handling of exception during formatting of precondition message, and reduce dependency to just the Python Standard Library's logging module
zooko at zooko.com**20081210131057
 Ignore-this: 4a7f1aa5b9f7ac60067347db9cdf5f28
] 
[client: add get_servers()
zooko at zooko.com**20081208230400
 Ignore-this: 1b9b3ff483849563342f467c39fdd15d
] 
[mutable publish: if we are surprised by shares that match what we would have written anyways, don't be surprised. This should fix one of the two #546 problems, in which we re-use a server and forget that we already sent them a share.
warner at allmydata.com**20081210044449] 
[NEWS: updated to most recent user-visible changes, including the 8123-to-3456 change
warner at allmydata.com**20081209231146] 
[immutable: remove unused code to produce plaintext hashes
zooko at zooko.com**20081209224546
 Ignore-this: 1ff9b6fa48e0617fea809998a0e3b6e
] 
[finish renaming 'subshare' to 'block' in immutable/encode.py and in docs/
zooko at zooko.com**20081209223318
 Ignore-this: 3d1b519f740c3d1030cb733f76fdae61
] 
[introducer: fix bug in recent simplification caught by Brian's sharp code-reviewing eye
zooko at zooko.com**20081208231634
 Ignore-this: 29854954577018d658be49142177edf2
] 
[introducer: simplify get_permuted_peers() implementation and add get_peers()
zooko at zooko.com**20081208225725
 Ignore-this: 8299c0dc187521f34187e54c72e57dc9
] 
[webapi.txt: minor edits
warner at allmydata.com**20081208213256] 
[rename "get_verifier()" to "get_verify_cap()"
zooko at zooko.com**20081208184411
 Ignore-this: 3ea4d7a78c802b23f628a37cc643c11a
] 
[setup: try depending on setuptools >= 0.6c6 instead of >= 0.6c7 at run-time, to be able to use the setuptools that came with Ubuntu Gutsy
zooko at zooko.com**20081208174725
 Ignore-this: 1cfefa8891f627c7ed46f1ff127eeee9
] 
[setup: loosen requirement on simplejson to >= 1.4
zooko at zooko.com**20081208143537
 Ignore-this: 2e4bec12f047f3f525caa6f234b58784
 That's the version of simplejson that comes with ubuntu feisty, and the one that we've required for most of our history.  Currently the Ubuntu dapper buildslave fails (see issue #534), and setting the simplejson requirement to be >= 2.0 would fix that failure, but I don't understand why.
] 
[setup: require simplejson >= 1.7.1
zooko at zooko.com**20081208043412
 Ignore-this: ab0e8ba82f0d10bc650bc80732bf3d0e
 That's the version that comes with gutsy, and we don't really understand why increasing the required version number helped with issue #553.
] 
[mutable: merge renaming with test patches
zooko at zooko.com**20081207144519
 Ignore-this: a922a8b231090fb35b9ef84d99e9dba3
] 
[mutable: rename mutable/node.py to mutable/filenode.py and mutable/repair.py to mutable/repairer.py
zooko at zooko.com**20081207142008
 Ignore-this: ecee635b01a21e6f866a11bb349712a3
 To be more consistent with the immutable layout that I am working on.
] 
[web/directory.py: really really fix #553. Unfortunately it's tricky to simulate the behavior of a browser's relative-url handling in a unit test.
warner at allmydata.com**20081206051412] 
[filenode.py: Fix partial HTTP Range header handling according to RFC2616
francois at ctrlaltdel.ch**20081118134135
 
 Tahoe webapi was failing on HTTP requests containing a partial Range header.
 This change allows movie players like mplayer to seek in movie files stored in
 tahoe.
 
 Associated tests for GET and HEAD methods are also included
] 
[mutable.modify(): after UCWE, publish even if the second invocation of the modifier didn't modify anything. For #551.
warner at allmydata.com**20081206044923] 
[dirnode.py: dirnode.delete which hits UCWE should not fail with NoSuchChildError. Fixes #550.
warner at allmydata.com**20081206040837] 
[MutableFileNode.modify: pass first_time= and servermap= to the modifier callback
warner at allmydata.com**20081206040710] 
[misc/cpu-watcher.tac: tolerate disk-full errors when writing the pickle, and pickle corruption from earlier disk-full errors
warner at allmydata.com**20081205215412] 
[web: fix more info links again
zooko at zooko.com**20081205213939
 Ignore-this: d51cf2c6393b5799dc615952680cd079
 Really, *really* closes #553.
] 
[web: fix moreinfo link
zooko at zooko.com**20081205212939
 Ignore-this: 89913601a159437a2c151dd3652e6a94
] 
[web: "More Info" link describes the same file that the "file" link points to, rather than to the file under the same name in this directory
zooko at zooko.com**20081205210502
 Ignore-this: 5017754e11749b376c7fa66d1acb2a58
 It's a subtle but real difference.
 Fixes #553 -- "More Info" link should point to a file/dir, not a dir+childname .
] 
[minor: fix unused imports -- thanks, pyflakes
zooko at zooko.com**20081205190723
 Ignore-this: 799f6a16360ac1aee8f6e0eb35a28a88
] 
[download: refactor handling of URI Extension Block and crypttext hash tree, simplify things
zooko at zooko.com**20081205141754
 Ignore-this: 51b9952ea2406b0eea60e8d72654fd99
 
 Refactor into a class the logic of asking each server in turn until one of them gives an answer 
 that validates.  It is called ValidatedThingObtainer.
 
 Refactor the downloading and verification of the URI Extension Block into a class named 
 ValidatedExtendedURIProxy.
 
 The new logic of validating UEBs is minimalist: it doesn't require the UEB to contain any 
 unnecessary information, but of course it still accepts such information for backwards 
 compatibility (so that this new download code is able to download files uploaded with old, and 
 for that matter with current, upload code).
 
 The new logic of validating UEBs follows the practice of doing all validation up front.  This 
 practice advises one to isolate the validation of incoming data into one place, so that all of 
 the rest of the code can assume only valid data.
 
 If any redundant information is present in the UEB+URI, the new code cross-checks and asserts 
 that it is all fully consistent.  This closes some issues where the uploader could have 
 uploaded inconsistent redundant data, which would probably have caused the old downloader to 
 simply reject that download after getting a Python exception, but perhaps could have caused 
 greater harm to the old downloader.
 
 I removed the notion of selecting an erasure codec from codec.py based on the string that was 
 passed in the UEB.  Currently "crs" is the only such string that works, so 
 "_assert(codec_name == 'crs')" is simpler and more explicit.  This is also in keeping with the 
 "validate up front" strategy -- now if someone sets a different string than "crs" in their UEB, 
 the downloader will reject the download in the "validate this UEB" function instead of in a 
 separate "select the codec instance" function.
 
 I removed the code to check plaintext hashes and plaintext Merkle Trees.  Uploaders do not 
 produce this information any more (since it potentially exposes confidential information about 
 the file), and the unit tests for it were disabled.  The downloader before this patch would 
 check that plaintext hash or plaintext merkle tree if they were present, but not complain if 
 they were absent.  The new downloader in this patch complains if they are present and doesn't 
 check them.  (We might in the future re-introduce such hashes over the plaintext, but encrypt 
 the hashes which are stored in the UEB to preserve confidentiality.  This would be a double-
 check on the correctness of our own source code -- the current Merkle Tree over the ciphertext 
 is already sufficient to guarantee the integrity of the download unless there is a bug in our 
 Merkle Tree or AES implementation.) 
 
 This patch increases the lines-of-code count by 8 (from 17,770 to 17,778), and reduces the 
 uncovered-by-tests lines-of-code count by 24 (from 1408 to 1384).  Those numbers would be more 
 meaningful if we omitted src/allmydata/util/ from the test-coverage statistics.
 
] 
[test_web: add get_permuted_peers, to unbreak recent checker_results change
warner at allmydata.com**20081205081210] 
[web checker_results: include a table of servers in permuted order, so you can see the places where new servers have been inserted
warner at allmydata.com**20081205080309] 
[test_system.py: assert less about the stats we get, since shares (and thus allocate() calls) are distributed randomly
warner at allmydata.com**20081204232704] 
[stats: don't return booleans: it violates the schema. Add a test.
warner at lothar.com**20081204210124] 
[test_system.py: don't ask the stats-gatherer to poll: it tolerates failures, so it isn't really giving us enough test coverage. Removing the call will make it more clear that we need to improve the tests later
warner at lothar.com**20081204210053] 
[confwiz.py - removing hardcoded version number
secorp at allmydata.com**20081203023831] 
[CLI: check for pre-existing aliases in 'tahoe create-alias' and 'tahoe add-alias'
warner at lothar.com**20081203022022] 
[test_cli: pass rc out of do_cli() too
warner at lothar.com**20081203020828] 
[setup: one more address to send release announcements to
zooko at zooko.com**20081203015040
 Ignore-this: 87cb7a9c3a1810ff0c87908548027ac5
] 
[setup: another note about the process of making a tahoe release: mail to duplicity-talk at nongnu.org
zooko at zooko.com**20081203014414
 Ignore-this: 77ffd6f7412cdc3283c1450cfde9fdf1
] 
[test_storage.py: more windows-vs-readonly-storage fixes
warner at lothar.com**20081203014102] 
[docs/webapi.txt: update helper section to discuss tahoe.cfg
warner at lothar.com**20081203010726] 
[docs/webapi.txt: update to discuss tahoe.cfg, not BASEDIR/webport
warner at lothar.com**20081203010612] 
[storage.py: oops, fix windows again, readonly_storage wasn't getting picked up properly
warner at lothar.com**20081203010317] 
[test_download.py: remove extra base32 import
warner at lothar.com**20081203003126] 
[test_download: test both mutable and immutable pre-generated shares
warner at lothar.com**20081203003007] 
[test_download.py: added 'known-answer-tests', to make sure current code can download a file that was created by earlier code
warner at lothar.com**20081203002208] 
[docs/configuration.txt: fix minor typo
warner at lothar.com**20081202215101] 
[storage.py: unbreak readonly_storage=True on windows
warner at allmydata.com**20081202014946] 
[#542 'tahoe create-key-generator': fix the .tac file this creates to be compatible with modern code, add a test
warner at allmydata.com**20081201234721] 
[storage.py: fix minor typo in comment
warner at lothar.com**20081201232540] 
[storage: replace sizelimit with reserved_space, make the stats 'disk_avail' number incorporate this reservation
warner at lothar.com**20081201232421] 
[util/abbreviate: add abbreviated-size parser
warner at lothar.com**20081201232412] 
[wui/wapi: change the default port number from 8123 to 3456 to avoid conflict with TorButton
zooko at zooko.com**20081125235737
 Ignore-this: 47ea30bafd5917a7e1dbc88aa0190f8e
 See ticket #536 for details.
] 
[setup: move the requirement on simplejson from setup.py to _auto_deps.py, and loosen it from >= 2.0.5 to > 1.8.1
zooko at zooko.com**20081125203751
 Ignore-this: 4403781ef878547ee09e7e010eb5b49a
 We'll see if this fixes the tests on all of our current buildslaves, and if it does then I'll be happy to leave it at "> 1.8.1" for now, even though I don't know exactly what versions of simplejson changed exactly what behavior that interacts with exactly what environment.  See http://allmydata.org/trac/tahoe/ticket/534 for uncertainties.
 
] 
[setup.py: Require simplejson version >= 2.0.5
francois at ctrlaltdel.ch**20081125171727] 
[mutable publish: reinstate the foolscap-reference-token-bug workaround, both for the original reasons and because of an apparent new foolscap bug that's triggered by reference tokens. See #541 for details.
warner at allmydata.com**20081125202735] 
[setup: fix missing import -- thanks, pyflakes
zooko at zooko.com**20081125155528
 Ignore-this: 1fc042da2882b7b2f71cde93eb234a47
] 
[setup: correctly detect Arch Linux in platform description
zooko at zooko.com**20081125155118
 Ignore-this: 37a7648f190679d3e973270a73133189
] 
[dirnode manifest: add verifycaps, both to internal API and to webapi. This will give the manual-GC tools more to work with, so they can estimate how much space will be freed.
warner at allmydata.com**20081124204046] 
[control.py: use get_buckets() instead of get_version() to measure ping time, because the latter changed recently
warner at lothar.com**20081123051323] 
[upload: when using a Helper, insist that it provide protocols/helper/v1 . Related to #538.
warner at allmydata.com**20081122022932] 
[upload: don't use servers which can't support the share size we need. This ought to avoid #439 problems. Some day we'll have a storage server which advertises support for a larger share size. No tests yet.
warner at allmydata.com**20081122022812] 
[#538: fetch version and attach to the rref. Make IntroducerClient demand v1 support.
warner at allmydata.com**20081122020727] 
[#538: add remote_get_version() to four main Referenceable objects: Introducer Service, Storage Server, Helper, CHK Upload Helper. Remove unused storage-server get_versions().
warner at allmydata.com**20081121234352] 
[setup: turn off --multi-version until I can figure out why it breaks test_runner
zooko at zooko.com**20081121043645
 Ignore-this: 36bf5db4122e6bc4e12588d9717a1e32
] 
[setup: require setuptools >= 0.6c7 to run
zooko at zooko.com**20081121043611
 Ignore-this: e92e07c7e8edbaadcd44db7e8f4a028
] 
[setup: use "setup.py develop --multi-version" so that if there is a too-old version of a dependency installed this doesn't prevent Tahoe's "develop" and run-in-place from working
zooko at zooko.com**20081120201545
 Ignore-this: 898f21fc1b16ae39c292fdd1ef42c446
] 
[setup: we require setuptools > 0.6a9 in order to parse requirements that have a dot in them such as "zope.interface"
zooko at zooko.com**20081120151503
 Ignore-this: a6304de8f1f44defc50438d72a13e58f
 In the near future we might start actually relying on setuptools's pkg_resources's "require()" function to make modules importable, so we can't just skip zope.interface.
] 
[test_dirnode: add an explainError call
warner at allmydata.com**20081119220212] 
[manifest: add storage-index strings to the json results
warner at allmydata.com**20081119220027] 
[manifest: include stats in results. webapi is unchanged.
warner at allmydata.com**20081119210347] 
[misc/spacetime/diskwatcher.tac: remove dead code
warner at allmydata.com**20081119200552] 
[mutable: respect the new tahoe.cfg 'shares.needed' and 'shares.total' settings
warner at allmydata.com**20081119200501] 
[oops, update tests to match 'tahoe stats' change
warner at allmydata.com**20081119023259] 
[cli: tahoe stats: abbreviate total sizes too
warner at allmydata.com**20081119022816] 
[cli: 'tahoe stats': add abbreviated size to the histogram. Not sure this actually improves things.
warner at allmydata.com**20081119021736] 
[util/abbreviate: little utility to abbreviate seconds and bytes
warner at allmydata.com**20081119021142] 
[cli: add 'tahoe check' and 'tahoe deep-check' commands, with primitive reporting code
warner at allmydata.com**20081119011210] 
[cli: factor out slow-http-operation to a separate module
warner at allmydata.com**20081119011113] 
[cli: tahoe stats/manifest: change --verbose to --raw, since I want -v for --verify for check/deep-check/repair
warner at allmydata.com**20081119003608] 
[test_system: make 'where' strings more helpful, to track down test failures better
warner at allmydata.com**20081119002950] 
[webapi: add 'summary' string to checker results JSON
warner at allmydata.com**20081119002826] 
[munin/tahoe_disktotal: add a 'disk used' line, since it will always be less than disktotal
warner at allmydata.com**20081118214431] 
[munin/tahoe_introstats: add line for distinct-storage-hosts (which counts machines instead of nodes)
warner at allmydata.com**20081118213238] 
[webapi: introducer stats: add 'announcement_distinct_hosts' to the t=json form, to show how many distinct hosts are providing e.g. storage services
warner at allmydata.com**20081118213015] 
['tahoe create-key-generator': fix help text
warner at allmydata.com**20081118074758] 
[#330: convert stats-gatherer into a .tac file service, add 'tahoe create-stats-gatherer'
warner at allmydata.com**20081118074620] 
[munin/tahoe_diskused: new plugin to show total disk space used across the grid
warner at allmydata.com**20081118072525] 
[munin/tahoe_disktotal: new plugin to show total disk space (used and unused) in the grid
warner at allmydata.com**20081118065101] 
[tahoe.cfg: add controls for k and N (and shares-of-happiness)
warner at allmydata.com**20081118062944] 
[cli: add tests for 'tahoe stats --verbose'
warner at allmydata.com**20081118041114] 
[cli: add --verbose to 'tahoe manifest', to show the raw JSON data
warner at allmydata.com**20081118040219] 
[diskwatcher: record total-space (the size of the disk as reported by df) in the db, report it to HTTP clients. This will involve a 50-item-per-second upgrade process when it is first used on old data
warner at allmydata.com**20081118034516] 
[dirnode manifest/stats: process more than one LIT file per tree; we were accidentally ignoring all but the first
warner at allmydata.com**20081115045049] 
[limiter.py: fix stack blowout by inserting an eventual-send between _done and maybe_start_task. This was causing failures during a 'tahoe manifest' of a large set of directories
warner at allmydata.com**20081115031144] 
[New credit file entry
francois at ctrlaltdel.ch**20081114140548] 
[test_cli.py: Ensure that we can read our uploaded files back
francois at ctrlaltdel.ch**20081114134458] 
[test_cli.py: use str objects instead of unicode ones
francois at ctrlaltdel.ch**20081114134137
 
 This will hopefully fix failing tests with LC_ALL=C
] 
[CLI: add 'tahoe stats', to run start-deep-stats and print the results
warner at allmydata.com**20081114014350] 
[test_system.py: fix new 'tahoe manifest' tests to not break on windows, by providing --node-directory instead of --node-url
warner at allmydata.com**20081113212748] 
[test for bug #534, unicode filenames
francois at ctrlaltdel.ch**20081113111951
 
 This test assures that uploading a file whose name contains unicode characters
 doesn't prevent further uploads in the same directory.
] 
[Fix a filename encoding issue with "tahoe cp"
francois at ctrlaltdel.ch**20081111200803] 
[web/info.py: use 128-bit ophandles instead of 64-bit
warner at allmydata.com**20081113021842] 
[CLI: add 'tahoe manifest', which takes a directory and returns a list of things you can reach from it
warner at allmydata.com**20081113021725] 
[create_node.py: also remove now-unused import of pkg_resources
warner at allmydata.com**20081113004716] 
[tahoe.cfg: add tub.location, to override the location hints we include in our FURL. This replaces advertised_ip_addresses, which isn't useful enough to retain. Helps with #517 (Tor).
warner at allmydata.com**20081113004458] 
[setup: remove pkg_resources.require() from create_node.py and add it to runner.py
zooko at zooko.com**20081112212503
 Ignore-this: 763324202456a59b833b14eb4027171
 Brian correctly points out that the latter is an entry point.
] 
[docs: fix cutnpasto in source:docs/logging.txt
zooko at zooko.com**19700105140422
 Ignore-this: de0f9ceb8e0ca4c158492ad2f9a6ba6f
] 
[tests: fix comment
zooko at zooko.com**19700105101055
 Ignore-this: fabedea917895568b1fca75a480111b9
] 
[tests: add tahoe_cp to the list of scripts that we don't actually have tests for yet
zooko at zooko.com**19700105100058
 Ignore-this: ac89583992fb1b48d9a4680344569d91
] 
[setup: the .tac files created by create_node.py call pkg_resources.require() so that they can load tahoe and twisted packages which were installed with setuptools multi-version mode
zooko at zooko.com**19700101235005
 Ignore-this: e1db03f86e0407a91087d8ada6b477fd
 Also the create_node.py script itself uses pkg_resources.require() for the same reason.
] 
[web/info: don't let an unrecoverable file break the page (show ? instead of a size)
warner at allmydata.com**20081107045117] 
[checker: add is_recoverable() to checker results, make our stub immutable-verifier not throw an exception on unrecoverable files, add tests
warner at allmydata.com**20081107043547] 
[monitor: update interface definition: get_status() can return a Failure
warner at allmydata.com**20081107035452] 
[web/operations.py: if the operation failed, render the Failure
warner at allmydata.com**20081107035309] 
[undoing test change for native_client.php
secorp at allmydata.com**20081106220310] 
[NEWS: more minor edits
warner at allmydata.com**20081106223517] 
[NEWS: minor edits
warner at allmydata.com**20081106223356] 
[NEWS: mention SFTP server
warner at allmydata.com**20081106014153] 
[client.py: oops, update FTP/SFTP config names to match current docs
warner at allmydata.com**20081106013442] 
[remove duplicate+old docs/NEWS. The top-level NEWS file is the canonical one.
warner at allmydata.com**20081106013224] 
[SFTP/FTP: merge user/account code, merge docs
warner at allmydata.com**20081106012558] 
[docs: move webapi/ftp/sftp into a new frontends/ directory
warner at allmydata.com**20081105233050] 
[ftp/sftp: move to a new frontends/ directory in preparation for factoring out password-auth component
warner at allmydata.com**20081105200733] 
[sftpd: minor debug-logging tweak
warner at allmydata.com**20081105194511] 
[confwiz.py - trying out a new configuration site
secorp at allmydata.com**20081105011830] 
[ftpd: include an (unused) avatar logout callback
warner at allmydata.com**20081105000104] 
[#531: implement an SFTP frontend. Mostly works, still lots of debug messages. Still needs tests and auth-by-pubkey in accounts.file
warner at allmydata.com**20081105000022] 
[docs/ftp.txt: correct Twisted dependency: we don't need VFS, we can use a release, as long as you apply the patch
warner at allmydata.com**20081104235840] 
[shebang: replace "/usr/bin/python" with "/usr/bin/env python"
zooko at zooko.com**20081105000306
 Ignore-this: 8ae33a8a7828fa7423422e252f2cfd74
] 
[misc/fixshebangs.py
zooko at zooko.com**20081105000130
 Ignore-this: 13b03ea2d2ed8982f8346a827b46bd2e
] 
[util: copy in pyutil.fileutil.ReopenableNamedTemporaryFile
zooko at zooko.com**20081104234715
 Ignore-this: f1131e9b8f249b5f10be4cba2aeb6118
] 
[immutable: tolerate filenode.read() with a size= that's too big, rather than hanging
warner at allmydata.com**20081104212919] 
[util: copy in nummedobj from pyutil
zooko at zooko.com**20081104195550] 
[util: copy in dictutil from pyutil
zooko at zooko.com**20081104195327] 
[rollback change... move allmydatacontextmenu registration to installer.tmpl in tahoe-w32-client\installer
booker at allmydata.com**20081103213647] 
[register the AllmydataContextMenu.dll for the context menu handler file sharing shell extension
booker at allmydata.com**20081103200027] 
[debug catalog-shares: tolerate even more errors on bad files/directories
warner at allmydata.com**20081030215447] 
[NEWS: update with all user-visible changes since the last update
warner at allmydata.com**20081030213604] 
[#527: expire the cached files that are used to support Range: headers, every hour, when the file is unused and older than an hour
warner at allmydata.com**20081030203909] 
[util/cachedir.py: add a cache-directory manager class, which expires+deletes unused files after a while
warner at allmydata.com**20081030200120] 
[test_cli: try to fix windows again
warner at allmydata.com**20081030193204] 
[debug/test_cli: fix error handling for catalog-shares, to make the test stop failing on windows
warner at allmydata.com**20081030190651] 
[web: add 'Repair' button to checker results when they indicate unhealthiness. Also add the object's uri to the CheckerResults instance.
warner at allmydata.com**20081030010917] 
[create_node.py: add 'web.static = public_html' to the initial tahoe.cfg
warner at allmydata.com**20081030001336] 
[webapi: serve the /static URL tree from /public_html (configurable)
warner at allmydata.com**20081029223431] 
[catalog-shares command: tolerate errors, log them to stderr, handle v2-immutable shares
warner at allmydata.com**20081029221010] 
[test_web.py: one more line of test coverage
warner at allmydata.com**20081029050015] 
[test_web: improve test coverage of PUT DIRURL t=uri replace=false
warner at allmydata.com**20081029045744] 
[web: test (and fix) PUT DIRURL t=uri, which replaces a directory in-place with some other cap
warner at allmydata.com**20081029045446] 
[web/directory.py: slight shuffle to improve test coverage
warner at allmydata.com**20081029045406] 
[test_client.py: improve test coverage a bit
warner at allmydata.com**20081029044335] 
[node.py: remove unused old_log() function
warner at allmydata.com**20081029043558] 
[node.py: remove support for the old BASEDIR/authorized_keys.PORT file
warner at allmydata.com**20081029043420] 
[move testutil into test/common_util.py, since it doesn't count as 'code under test' for our pyflakes numbers
warner at allmydata.com**20081029042831] 
[util: move PollMixin to a separate file (pollmixin.py), so testutil can be moved into test/
warner at allmydata.com**20081029041548] 
[control.py: removed unused testutil.PollMixin
warner at allmydata.com**20081029040359] 
[web/filenode: oops, fix test failures, not everything has a storage index
warner at allmydata.com**20081029011720] 
[web/filenode: add Accept-Ranges and ETag (for immutable files) headers to GET responses
warner at allmydata.com**20081029010103] 
[#527: respond to GETs with early ranges quickly, without waiting for the whole file to download. Fixes the alacrity problems with the earlier code. Still needs cache expiration.
warner at allmydata.com**20081029005618] 
[#527: support HTTP 'Range:' requests, using a cachefile. Adds filenode.read(consumer, offset, size) method. Still needs: cache expiration, reduced alacrity.
warner at lothar.com**20081028204104] 
[iputil.py: avoid a DNS lookup at startup (which may timeout tests when run on a partially-offline host) by using 198.41.0.4 instead of A.ROOT-SERVERS.NET
warner at lothar.com**20081028203646] 
[interfaces.py: promote immutable.encode.NotEnoughSharesError.. it isn't just for immutable files any more
warner at lothar.com**20081027203449] 
[interfaces.IMutableFileNode.download_best_version(): fix return value
warner at lothar.com**20081027202046] 
[dirnode lookup: use distinct NoSuchChildError instead of the generic KeyError when a child can't be found
warner at lothar.com**20081027201525] 
[storage: don't use colons in the corruption-advisory filename, since windows can't tolerate them
warner at lothar.com**20081026024633] 
[mutable: call remove_advise_corrupt_share when we see share corruption in mapupdate/download/check, tolerate servers that do not implement it
warner at lothar.com**20081024202128] 
[storage: add remote_advise_corrupt_share, for clients to tell storage servers about share corruption that they've discovered. The server logs the report.
warner at lothar.com**20081024185248] 
[mutable/servermap.py: fix needs_merge(), it was incorrectly claiming that mixed shares with distinct seqnums needed a merge, causing repair(force=False) to fail
warner at lothar.com**20081024040024] 
[test_web.test_POST_DIRURL_deepcheck: confirm that /operations/HANDLE/ works with or without the slash
warner at lothar.com**20081024021759] 
[web/checker_results.py: remove dead code
warner at lothar.com**20081024001717] 
[test_web: more test coverage
warner at lothar.com**20081024001118] 
[webapi: fix t=rename from==to, it used to delete the file
warner at lothar.com**20081023233236] 
[test_system: update test to match web checker results
warner at lothar.com**20081023233202] 
[webapi deep-check: show the root as <root>, rather than an empty path string
warner at lothar.com**20081023230359] 
[mutable/checker: announce the mapupdate op on the 'recent uploads+downloads' page
warner at lothar.com**20081023230319] 
[scripts/create_node.py: remove empty-string defaults for --introducer= and --nickname=
warner at lothar.com**20081023230235] 
[deep-check: add webapi links to detailed per-file/dir results
warner at lothar.com**20081023230031] 
[interface.py: fix typo
warner at lothar.com**20081023225936] 
[webapi: make the /operations/ 't=status' qualifier optional, remove it from examples
warner at lothar.com**20081023225658] 
[setup: require the latest version of the setuptools bootstrap egg
zooko at zooko.com**20081025152858
 Ignore-this: c0c9923ba3008f410d5cc56f2236edb9
] 
[setup: include _pkgutil.py in setuptools bootstrap egg so that it will work on Python 2.4
zooko at zooko.com**20081025152839
 Ignore-this: 38d81a037c1a3413d69d580ccb13fd67
] 
[setup: pretend that tahoe requires twisted to set up, so that twisted will be there for nevow
zooko at zooko.com**20081025135042
 Ignore-this: 4e6c7e580f7e30df571e2e63be663734
] 
[setup: require the SVN snapshot of setuptools to build
zooko at zooko.com**20081025134959
 Ignore-this: f68077dd10d85a71a1e06678365e6753
] 
[setup: remove old bundled setuptools-0.6c9
zooko at zooko.com**20081025134947
 Ignore-this: 3a95dd72346a60b39ffd6ddfadd1b3a8
] 
[setup: bundle an SVN snapshot of setuptools instead of the most recent stable release of setuptools
zooko at zooko.com**20081025134837
 Ignore-this: 9a0c9a34b186b972650cf9455edb0d28
 This SVN snapshot fixes a problem that prevents the setting up of nevow:
 http://bugs.python.org/setuptools/issue20
] 
[setup: reorder dependencies to be sort of increasing order of how much they depend on other stuff
zooko at zooko.com**20081025134739
 Ignore-this: 6d636aaf5deb37cbf18172824b0bbf87
 Not that the order makes any difference to how it gets installed, as far as I can tell.
] 
[docs: add a note that when you make a new tahoe release, you should send the announcement to fuse-devel at lists.sourceforge.net
zooko at zooko.com**20081023213658] 
[web/info.py: fix 'Check This Object' link, for files it was checking the parent directory by mistake
warner at lothar.com**20081022171056] 
[#514: add meta-refresh=60 tag to t=status page for incomplete operations
warner at lothar.com**20081022164842] 
[test_dirnode.py: oops, missed a Monitor(), unbreak tests
warner at lothar.com**20081022085054] 
[immutable/filenode.py: add TODO note about the #514 monitor to check(), rather than going through the checker/verifier code and adding it, since Zooko is currently working on that code
warner at lothar.com**20081022084237] 
[more #514: pass a Monitor to all checker operations, make mutable-checker honor the cancel flag
warner at lothar.com**20081022083818] 
[dirnode.py: check for cancel during deep-traverse operations, and don't initiate any new ones if we've been cancelled. Gets us closer to #514.
warner at lothar.com**20081022075552] 
[more #514 log-webop status/cancel: add handle-expiration, test coverage
warner at lothar.com**20081022051354] 
[webapi.txt: improve t=deep-size output docs
warner at lothar.com**20081022005331] 
[#514: improve test coverage
warner at lothar.com**20081022005256] 
[Change deep-size/stats/check/manifest to a start+poll model instead of a single long-running synchronous operation. No cancel or handle-expiration yet. #514.
warner at lothar.com**20081022000307] 
[setup: change ez_setup.py to install setuptools-0.6c9
zooko at zooko.com**20080930200502] 
[setup: bundle setuptools-0.6c9
zooko at zooko.com**20080930200448] 
[setup: remove bundled setuptools-0.6c8
zooko at zooko.com**20080930200336] 
[setup: remove the developer note about doing without GNU make (the GNU make requirement is about to hurt Peter if he tries to follow this doc, by the way)
zooko at zooko.com**20081021163200
 add classifiers showing with which versions of Python it is known to work.
] 
[* fuse/runtests: added --catch-up-pause option
robk-tahoe at allmydata.com**20081021002902
 
 On linux, write tests are failing because data written to fuse isn't showing
 up in tahoe by the time it's checked.  it's not clear where this is originating,
 since the fuse implementation [should be] waiting for completion of tahoe 
 operations before returning from its calls.  This adds an option to control the
 duration of a pause between the fuse write and the check of tahoe, which is by
 default set to 2s on linux, which - somewhat inexplicably - seems to 'fix' the 
 problem, in as far as it allows tests to complete.
 
] 
[fuse/runtests: include length in drepr() output
robk-tahoe at allmydata.com**20081021000159] 
[fuse/runtests: make exceptions in 'read_in_random_order' into TestFailures
robk-tahoe at allmydata.com**20081020235235] 
[fuse/blackmatch: added asynchronous (background) file download
robk-tahoe at allmydata.com**20081020233333
 
 previously, upon opening a file for reading, the open() call would block
 while the entire file was retrieved from tahoe into the cache directory.
 This change adds a DownloaderWithReadQueue class, and associated plumbing,
 such that an open() will return promptly with the download initiated 'in
 the background'.  Subsequent read() operations will block until enough
 data has been downloaded to satisfy that request.  This provides a behaviour
 similar to streaming, i.e. the client application will be able to read
 data from the fuse interface while the remainder of the file is still being
 downloaded.
 
] 
[fuse/runtests: added 'read_in_random_order' test
robk-tahoe at allmydata.com**20081020232427
 
 this test uploads a test file to tahoe, and then reads the file from fuse,
 but reads the blocks of the file in a random order; this is designed to
 exercise the asynchronous download feature of blackmatch - where the file
 is downloaded from tahoe asynchronously, and rather than blocking open()
 for the entirety of the download, instead individual read() calls are
 blocked until enough of the file has been downloaded to satisfy them
] 
[fuse/runtests: added a --no-cleanup option
robk-tahoe at allmydata.com**20081020155120
 
 the code had a 'fullcleanup' flag internally which controlled whether
 working directories were cleaned up.  this promotes that to a command
 line option (negated) '--no-cleanup' defaulting to False, i.e. do cleanup
] 
[fuse/runtests: truncate expected file contents in reported error message
robk-tahoe at allmydata.com**20081020144523
 
 this avoids dumping the repr of 1Mb of random data to stdout in the event
 of a test failure, but rather just dumps the start/end of the errant strings
 if the amount of data is > 200 chars repr'd
] 
[fuse/blackmatch: fix platform specific problems in repr_flags
robk-tahoe at allmydata.com**20081020143052
 
 the repr_flags debug/logging function had a list of fields from the os 
 module that might be passed into an open() call, but it included at 
 least one which was available on the mac but not on linux. symmetrically
 linux has numerous flags which are not present on the mac. the repr_flags
 function is now tolerant of flags not being present, and has an expanded
 list of flags
] 
[makefile: added 'fuse-test' target to makefile, to run 'runtests'
robk-tahoe at allmydata.com**20081019132518] 
[fuse/runtests: added a 'todo' flag, suppressing failure for implementations not expected to pass
robk-tahoe at allmydata.com**20081019131600
 
 since the current tests assume that the implementation responds to changes made
 to tahoe after mount, and impl_b prefetches and caches directory data, impl_b
 fails the current 'read' test suite.
 
 rather than reflect that problem in the overall failure of the runtests exit
 code, this adds a 'todo' flag to the implementations table, and sets the todo
 flag for impl_b.  Thus errors therein will be reported in output, but will not
 cause a failing exit code.
] 
[fuse/runtests: made runtests exit code depend on success
robk-tahoe at allmydata.com**20081017180058
 
 return an exit code of 0 only if no tests failed, and 1 in the case of
 linkage error, test setup failure, or individual test case failure
 
] 
[storage.py: assert that immutable share size will fit in the 4-byte v1 container (see #346). The struct module in py2.4 raises an error on overflow, but py2.5 merely emits a warning
warner at lothar.com**20081020172208] 
[NEWS: update to summarize all changes since the last update
warner at lothar.com**20081020164047] 
[fuse/runtest: make removal of webport file soft
robk-tahoe at allmydata.com**20081017030154
 
 previously the runtests suite removed the webport file created by
 tahoe create-client in all but the first node.  now that the node config
 is in tahoe.cfg by default this file might not exist.
] 
[fuse/blackmatch: update json handling to support simplejson v2
robk-tahoe at allmydata.com**20081017025931
 
 simplejson v2 returns strings as either unicode or str, depending upon its
 mood.  thus the interpretation of the node's json repr of a directory, and
 the serialisation of strings in the json based rpc both exploded when built
 against simplejson v2.  this makes both of these places liberal in their
 acceptance of either str or unicode.
] 
[fuse/blackmatch: log exception in server startup
robk-tahoe at allmydata.com**20081017014650
 
 humphf.  my build runs the fuse stuff fine, but the build from the buildslave
 doesn't seem to start up properly.  hopefully this will elicit some useful info
] 
[fuse/blackmatch: add readability to some logging, fix a permissions problem
robk-tahoe at allmydata.com**20081017004421
 
 adds a couple of functions to unpack 'mode' and 'flags' for open() calls, to
 facilitate debugging.
 
 adds a fix to ensure that all tmp files created for writing are opened with
 permissions 0600 - one problem I had with testing with the Finder was that
 files were being opened write only (0200) and were then failing to upload
 to tahoe due to internal permission denied errors.
 
 there remain a variety of problems with finder access which I'm unable to
 comprehend at this time.  sometimes copies to tahoe will work fine, sometimes
 they yield "the finder cannot complete the operation because some data ...
 could not be read or written. (Error code -36)" sometimes "You may need to
 enter the name and password for an administrator on this computer to change
 the item" sometimes "The operation cannot be completed because an item with
 the name ... already exists." and sometimes "The operation cannot be completed
 because the item ... is locked."  What seems to be absent is rhyme or reason.
 
 unix operations (cp, mv) work fine, rsync works fine. 
 
] 
[fuse/blackmatch: fix linkage problems with daemonize
robk-tahoe at allmydata.com**20081016163637
 
 the daemonize() function imported from twisted was causing problems when 
 run from a frozen (py2app) build.  I simply copied the daemonize function
 into this file, and that fixes the problem.
 
 also removed a couple of lines of debugging spam that slipped through.
 
] 
[gui/macapp: minor bugfixes
robk-tahoe at allmydata.com**20081016163052
 
 though it seemed to work before, the 'fstype' of 'allmydata' passed to fuse was
 today throwing errors that len(fstype) must be at most 7.
 
 fixed a typo in changes to 'mount_filesystem()' args
 
 bumped the delay between mounting a filesystem and 'open'ing it in Finder to
 4s, as it seems to take a little longer to mount now the client and server
 fuse processes need to coordinate.
] 
[fuse/blackmatch: split into client/server (twisted server)
robk-tahoe at allmydata.com**20081016150846
 
 This implements a client/server split for blackmatch, where the client 
 implements the fuse_main bindings and a simple blocking rpc client mechanism.
 The server implements the other half of that rpc mechanism, and contains all
 the actual logic for interpreting fuse requests in the context of the on disk
 cache and requests to the tahoe node.  The server is based on a twisted reactor.
 
 The rpc mechanism implements a simple method dispatch including marshalling,
 using json, of basic inert data types, in a flat namespace (no objects).
 The client side is written in a blocking idiom, to interface with the threading
 model used by the fuse_main bindings, whereas the server side is written for a
 twisted reactor-based environment, intended to facilitate implementing more 
 sophisticated logic in that paradigm.  The two communicate over a unix domain
 socket, allocated within the nodedir.
 
 Command line usage is unchanged; the server is launched automatically by the
 client. The server daemonizes itself, to avoid preventing the original parent
 process (e.g. 'runtests') from waiting upon the server exiting.
 
 The client keeps open a 'keepalive' connection to the server; upon loss thereof
 the server will exit. This addresses the fact that the python-fuse bindings 
 provide no notification of exit of the client process upon unmount.
 
 The client thus provides a relatively thin 'shim' proxying requests from the
 fuse_main bindings across the rpc to the server process, which handles the 
 logic behind each request.  
 
 For the time being, a '--no-split' option is provided to suppress the splitting
 into client/server, yielding the prior behaviour.  Once the server logic gets
 more complex and more entrenched in a twisted idiom, this might be removed.
 The 'runtests' test harness currently tests both modes, as 'impl_c' and 
 'impl_c_no_split'
 
] 
[fuse/blackmatch: 'flatten' the fuse api implementation
robk-tahoe at allmydata.com**20081016143547
 
 the previous revision of blackmatch used a file_class to delegate all fuse
 api operations on files to a specific per-file class, which is an option
 given by the python-fuse bindings.
 
 this is a pre-cursor to the 'split' client/server version, which uses a
 simple, moreover flat, rpc mechanism to broker access to methods.
] 
[fuse/runtests: disable impl_a/impl_b on mac, as they don't actually work.
robk-tahoe at allmydata.com**20081016143232] 
[fuse/runtests: added write_partial_overwrite test
robk-tahoe at allmydata.com**20081016142926
 
 this tests opening a file for update, overwriting a small part of it, and
 ensuring that the end result constitutes an overwrite of the original file.
 This tests, e.g., that the implementation doesn't open a 'fresh' file but does
 in fact initialise the file to be uploaded with the contents of any extant
 file before applying updates
 
] 
[fuse/runtests: added --tests, renamed --suites
robk-tahoe at allmydata.com**20081016142836
 
 changed the --tests option to be --suites, as it takes a prefix, e.g. 'read'
 'write' (or 'all', the default) and runs those suites which are applicable to
 each implementation being tested.
 
 added a --tests option, which takes a list of tests, e.g. 'read_file_contents'
 'write_overlapping_large_writes' and runs all tests specified without regard
 to whether the implementation(s) under test are declared to support them.
 
 this is basically to allow a specific test or two to be run, saving time 
 during development and debugging by not running the entire suite
] 
[fuse/runtests: added 'random scatter' write test
robk-tahoe at allmydata.com**20081003233436
 
 this writes the test file in a randomised order, with randomly sized writes.
 also for each 'slice' of the file written, a randomly chosen overlapping
 write is also made to the file.  this ensures that the file will be written
 in its entirety in a thoroughly random order, with many overlapping writes.
] 
[fuse/runtests: add overlapping write tests
robk-tahoe at allmydata.com**20081003224833
 
 using both small and large blocksizes for writes, write a 1Mb file to fuse
 where every write overlaps another. 
 
 This serves a useful purpose - in manual testing of blackmatch some time ago
 most operations e.g. bulk copies, worked fine, but using rsync caused data 
 corruption on most files.  it turned out to be that rsync writes in 64K blocks,
 but rather than making the last block short, the last block instead overlaps
 the preceding (already written) block.  This revealed a problem where cache
 files were being opened 'append' rather than 'write' and hence the overlapping
 write to the fuse layer caused the overlapping portion of the file to be 
 duplicated in cache, leading to oversized and corrupt files being uploaded.
] 
[fuse/runtests: remove write small file test, as it's subsumed by the tiny_file test
robk-tahoe at allmydata.com**20081003223944] 
[fuse/runtests: added linear write tests for various block sizes
robk-tahoe at allmydata.com**20081003223550
 
 unit tests to test writing contiguous blocks linearly through the file,
 for a variety of block sizes;  'tiny_file' is an entire file fitting within
 a single io block / write operation.  'linear_{small,large}_writes' test
 a 1Mb file written with each write operation containing significantly less
 or more data, respectively, than fuse will pass into the implementation as
 a single operation (which on the mac at least is 64KiB)
] 
[fuse/runtests: add a very simple 'write' test
robk-tahoe at allmydata.com**20081003172044
 
 this performs a very simple write through the fuse layer and confirms that
 the file is stored correctly into the tahoe mesh.  ('simple' in the sense
 that the entire file body fits trivially in a single write() operation, 
 disk block etc)
] 
[fuse/runtests: added a --web-open option
robk-tahoe at allmydata.com**20081003172026
 
 similar to the --debug-wait option which causes the test harness to
 pause at various stages of the process to facilitate debugging, this
 option simplifies that debugging by automatically opening a web browser
 to the root dir of that implementation's tests when tests are commenced.
 
 in addition, if --web-open is specified but --debug-wait is not, the
 harness will still pause after running tests but before tearing down
 the tahoe grid - this allows all tests to run to completion, but
 provide a debugging hook to investigate the end state of the grid's
 contents thereafter.
] 
[fuse/impl_a: fix a suspected bug in caching
robk-tahoe at allmydata.com**20081003171309
 
 from my examination of the tahoe_fuse ('impl_a') code, it looks like 
 the intention is to cache the file contents in memory while it's open,
 since it does in fact do that.  however it looks like it also ignored
 that cache entirely, and made an individual tahoe webapi GET request
 for each and every read() operation regardless of the relative size of
 the read block and the file in question.
 
 this changes that to make read() use the data in memory rather than
 fetch the data over again.   if there's something more subtle going
 on, please let me know.
] 
[gui/macapp: slew of code cleanup; unmount filesystems on quit
robk-tahoe at allmydata.com**20080925233235
 
 a handful of code cleanup, renaming and refactoring.  basically consolidating
 'application logic' (mount/unmount fs) into the 'MacGuiApp' class (the wx.App)
 and cleaning up various scoping things around that.  renamed all references to
 'app' to refer more clearly to the 'AppContainer' or to the guiapp.
 
 globally renamed basedir -> nodedir
 
 also made the guiapp keep a note of each filesystem it mounts, and unmount
 them upon 'quit' so as to clean up the user's environment before the tahoe node
 vanishes out from underneath the orphaned tahoe fuse processes
 
] 
[gui/macapp: make submenu of aliases for 'webopen'
robk-tahoe at allmydata.com**20080925163919
 
 this changes the 'open webroot' menu item to be a submenu listing all aliases
 defined in ~/.tahoe.  Note that the dock menu does not support submenus, so it
 only offers a single 'open webroot' option for the default tahoe: alias.
 
 I had trouble with this at first and concluded that the submenus didn't work,
 and made it a distinct 'WebUI' menu in its own right.  on further inspection,
 there are still problems but they seem to be something like once the dock menu
 has been used, sometimes the app's main menubar menus will cease to function,
 and this happens regardless of whether submenus or plain simple menus are used.
 I have no idea what the problem is, but it's not submenu specific.
] 
[repairer: fix flaw in testutil.flip_one_bit() that Brian pointed out
zooko at zooko.com**20081016194848] 
[misc/incident-gatherer: add classify_tahoe.py: a foolscap incident-gatherer classification plugin
warner at allmydata.com**20081015220940] 
[repairer: test all different kinds of corruption that can happen to share files on disk
zooko at zooko.com**20081014230920] 
[util/time_format.py: accept space separator, add unit tests
warner at allmydata.com**20081013225258] 
[test_storage: use different filenames, poor stupid windows
warner at allmydata.com**20081010021139] 
[scripts/debug.py: emit the immutable-share version number, tolerate v2
warner at allmydata.com**20081010013422] 
[storage.py: improve some precondition() error messages
warner at allmydata.com**20081010011425] 
[storage: introduce v2 immutable shares, with 8-byte offsets fields, to remove two of the three size limitations in #346. This code handles v2 shares but does not generate them. We'll make a release with this v2-tolerance, wait a while, then make a second release that actually generates v2 shares, to avoid compatibility problems.
warner at allmydata.com**20081010011327] 
[debug.py: oops, add missing import for ReadBucketProxy
warner at allmydata.com**20081010002922] 
[storage: split WriteBucketProxy and ReadBucketProxy out into immutable/layout.py . No behavioral changes.
warner at allmydata.com**20081010000800] 
[interfaces: loosen a few max-size constraints which would limit us to a mere 1.09 TB maximum file size
zooko at zooko.com**20081009191357
 
 These constraints were originally intended to protect against attacks on the
 storage server protocol layer which exhaust memory in the peer.  However,
 defending against that sort of DoS is hard -- probably it isn't completely
 achieved -- and it costs development time to think about it, and it sometimes
 imposes limits on legitimate users which we don't necessarily want to impose.
 So, for now we forget about limiting the amount of RAM that a foolscap peer can
 cause you to start using.
 
] 
[util/limiter: add a repr
warner at allmydata.com**20081007201945] 
[dirnode.build_manifest: include node.list in the limiter, that's the most important thing to slow down
warner at allmydata.com**20081007201929] 
[web/directory: t=manifest output=html: make the caps into clickable hrefs
warner at allmydata.com**20081007201845] 
[web/directory: factor out the get_root function
warner at allmydata.com**20081007201742] 
[web/directory.py: remove unused imports
warner at allmydata.com**20081007194820] 
[test_web: deep-size is more variable than I thought, so assert less
warner at allmydata.com**20081007051147] 
[web: change t=manifest to return a list of (path,read/writecap) tuples, instead of a list of verifycaps. Add output=html,text,json.
warner at allmydata.com**20081007043618] 
[web: rewrite t=deep-size in terms of deep-stats, update test to match inclusion of directory sizes
warner at allmydata.com**20081007043539] 
[ftpd: hush pyflakes
warner at allmydata.com**20081007014513] 
[ftpd: make sure we're using a patched/fixed Twisted, to avoid confusion later
warner at allmydata.com**20081007011411] 
[ftp: change the twisted hack necessary for async-write-close, to one more agreeable to the twisted-dev folks, add a copy of the necessary patch to docs/ftp.txt
warner at allmydata.com**20081007010605] 
[ftpd: remove debug messages
warner at allmydata.com**20081006231620] 
[ftpd: add native_client.php -based HTTP authentication scheme
warner at allmydata.com**20081006231511] 
[ftpd: add ftp.accounts checker, remove InMemoryPasswordChecker
warner at allmydata.com**20081006225124] 
[test_system: add test coverage for immutable download.ConsumerAdapter, remove debug messages
warner at allmydata.com**20081006225037] 
[ftp server: initial implementation. Still needs unit tests, custom Twisted patches. For #512
warner at allmydata.com**20081006195236] 
[test_cli.py: remove unused imports
warner at allmydata.com**20081007004204] 
[CLI: remove 'tahoe admin generate-keypair', since the pycryptopp ecdsa API is about to change incompatibly. We'll undo this once pycryptopp is updated
warner at allmydata.com**20081007002320] 
[docs: update architecture.txt 's section on the vdrive a.k.a. filesystem layer
zooko at zooko.com**20081006210500
 Remove some obsolete parts (correct at the time, now incorrect), change terminology to reflect my preference: s/vdrive/filesystem/ and s/dirnode/directory/, and make a few other small changes.
] 
[dirnode: fix my remarkably-consistent 'metdadata' typo
warner at allmydata.com**20081003010845] 
[interfaces: fix minor typo
warner at allmydata.com**20081003005249] 
[dirnode: add get_child_and_metadata_at_path
warner at allmydata.com**20081003005203] 
[stop using 'as' as an identifier: as with 'with', 'as' has become a reserved word in python 2.6
warner at allmydata.com**20081003002749] 
[scripts/admin: split up generate_keypair code so that unit tests can use it more easily
warner at allmydata.com**20081001235238] 
[docs: add some notes about things to do for a Tahoe release on pypi, freshmeat, and launchpad
zooko at zooko.com**20081001210703] 
[misc/cpu-watcher.tac: use writeaside-and-rename for the history.pickle file
warner at allmydata.com**20081001003053] 
[misc/spacetime: use async polling so we can add a 60-second timeout, add an index to the 'url' Axiom column for 2x speedup
warner at allmydata.com**20080930233448] 
[#518: replace various BASEDIR/* config files with a single BASEDIR/tahoe.cfg, with backwards-compatibility of course
warner at allmydata.com**20080930232149] 
[tolerate simplejson-2.0.0 and newer, which frequently return bytestrings instead of unicode objects. Closes #523
warner at allmydata.com**20080930222106] 
[munin/tahoe_doomsday: oops, tolerate 'null' in the timeleft results, to unbreak the 2wk/4wk graphs
warner at allmydata.com**20080930202051] 
[test_node: improve coverage of advertised_ip_addresses a bit
warner at allmydata.com**20080930060816] 
[testutil.PollMixin: set default timeout (to 100s), emit a more helpful error when the timeout is hit
warner at allmydata.com**20080930052309] 
[repair: fix test to map from storage index to directory structure properly (thanks, cygwin buildbot, for being so kloodgey that you won't accept random binary filenames and thus making me notice this bug)
zooko at zooko.com**20080926224913] 
[repairer: assert that the test code isn't accidentally allowing the repairer code which is being tested to do impossible things
zooko at zooko.com**20080926222353] 
[repairer: enhance the repairer tests
zooko at zooko.com**20080926174719
 Make sure the file can actually be downloaded afterward, that it used one of the 
 deleted and then repaired shares to do so, and that it repairs from multiple 
 deletions at once (without using more than a reasonable amount of calls to 
 storage server allocate).
] 
[netstring: add required_trailer= argument
warner at allmydata.com**20080926165754] 
[test_netstring.py: move netstring tests to a separate file
warner at allmydata.com**20080926165526] 
[move netstring() and split_netstring() into a separate util.netstring module
warner at allmydata.com**20080926043824] 
[repairer: remove a test that doesn't apply to the repair-from-corruption case
zooko at zooko.com**20080925220954] 
[repairer: add a test that repairer fixes corrupted shares (in addition to the test that it fixes deleted shares)
zooko at zooko.com**20080925220712] 
[docs: proposed mutable file crypto design with ECDSA, 96-bit private keys, and semi-private keys (from http://allmydata.org/~zooko/lafs.pdf )
zooko at zooko.com**20080925213457] 
[docs: mutable file crypto design (from http://allmydata.org/~zooko/lafs.pdf )
zooko at zooko.com**20080925213433] 
[repairer: fix swapped docstrings; thanks Brian
zooko at zooko.com**20080925182436] 
[trivial: remove unused imports; thanks, pyflakes
zooko at zooko.com**20080925180422] 
[trivial: remove unused imports -- thanks, pyflakes
zooko at zooko.com**20080925173453] 
[repairer: add basic test of repairer, move tests of immutable checker/repairer from test_system to test_immutable_checker, remove obsolete test helper code from test_filenode
zooko at zooko.com**20080925171653
 Hm...  "Checker" ought to be renamed to "CheckerRepairer" or "Repairer" at some point...
] 
[setup: remove a few minimal unit tests from test_filenode which have been obviated by much better tests in test_mutable and test_system
zooko at zooko.com**20080925161544] 
[gui/macapp: rough cut of ui tweaks; configurability, auto-mount
robk-tahoe at allmydata.com**20080925141224
 
 chatting with peter, two things the mac gui needed were the ability to mount
 the 'allmydata drive' automatically upon launching the app, and open the
 Finder to reveal it.  (also a request to hide the debug 'open webroot' stuff)
 
 this (somewhat rough) patch implements all the above as default behaviour
 
 it also contains a quick configuration mechanism for the gui - rather than a 
 preferences gui, running with a more 'tahoe' styled mechanism, the contents
 of a few optional files can modify the default behaviour, specifically file
 in ~/.tahoe/gui.conf control behaviour as follows:
 
 auto-mount (bool): if set (the default) then the mac app will, upon launch,
 automatically mount the 'tahoe:' alias with the display name 'Allmydata'
 using a mountpoint of ~/.tahoe/mnt/__auto__
 
 auto-open (bool): if set (the default) then upon mounting a file system
 (including the auto-mount if set) finder will be opened to the mountpoint
 of the filesystem, which essentially reveals the newly mounted drive in a
 Finder window
 
 show-webopen (bool): if set (false by default) then the 'open webroot'
 action will be made available in both the dock and file menus of the app 
 
 daemon-timout (int): sets the daemon-timeout option passed into tahoe fuse
 when a filesystem is mounted. this defaults to 5 min
 
 files of type (int) must, naturally, contain a parsable int representation.
 files of type (bool) are considered true if their (case-insensitive) contents
 are any of ['y', 'yes', 'true', 'on', '1'] and considered false otherwise.
 
] 
[gui/macapp: improve 'about' box
robk-tahoe at allmydata.com**20080925135415
 
 adds exactly 1 metric dollop of professionalism to the previously
 rather amateurish looking about box.
] 
[fuse/impl_c: UNDO --auto-fsid option
robk-tahoe at allmydata.com**20080925134730
 
 rolling back:
 
 Thu Sep 25 14:42:23 BST 2008  robk-tahoe at allmydata.com
   * fuse/impl_c: add --auto-fsid option
   
   this was inspired by reading the fuse docs and discovering the 'fsid' option
   to fuse_main, and was _intended_ to support a sort of 'stability' to the 
   filesystem (specifically derived from the root-uri mounted, whether directly
   or via an alias) to support mac aliases across unmount/remount etc.
   
   some experimentation shows that that doesn't actually work, and that, at
   least for mac aliases in my testing, they're tied to path-to-mountpoint and
   not to the fsid - which seems to have no bearing.  perhaps the 'local' flag
   is causing weirdness therein.
   
   at any rate, I'm recording it simply for posterity, in case it turns out to
   be useful after all somewhere down the road.
   
 
     M ./contrib/fuse/impl_c/blackmatch.py +13
] 
[fuse/impl_c: add --auto-fsid option
robk-tahoe at allmydata.com**20080925134223
 
 this was inspired by reading the fuse docs and discovering the 'fsid' option
 to fuse_main, and was _intended_ to support a sort of 'stability' to the 
 filesystem (specifically derived from the root-uri mounted, whether directly
 or via an alias) to support mac aliases across unmount/remount etc.
 
 some experimentation shows that that doesn't actually work, and that, at
 least for mac aliases in my testing, they're tied to path-to-mountpoint and
 not to the fsid - which seems to have no bearing.  perhaps the 'local' flag
 is causing weirdness therein.
 
 at any rate, I'm recording it simply for posterity, in case it turns out to
 be useful after all somewhere down the road.
 
] 
[manhole: be more tolerant of authorized_keys. files in .tahoe
robk-tahoe at allmydata.com**20080925031149
 
 both peter and I independently tried to do the same thing to eliminate the
 authorized_keys file which was causing problems with the broken mac build
 (c.f. #522) namely mv authorized_keys.8223{,.bak}  but the node is, ahem,
 let's say 'intolerant' of the trailing .bak - rather than disable the
 manhole as one might expect, it instead causes the node to explode on
 startup.  this patch makes it skip over anything that doesn't pass the
 'parse this trailing stuff as an int' test.
] 
[fuse/impl_c: move mac tahoefuse impl out into contrib/fuse
robk-tahoe at allmydata.com**20080925014214
 
 For a variety of reasons, high amongst them the fact that many people 
 interested in fuse support for tahoe seem to have missed its existence,
 the existing fuse implementation for tahoe, previously 'mac/tahoefuse.py'
 has been renamed and moved.
 
 It was suggested that, even though the mac build depends upon it, the
 mac/tahoefuse implementation be moved into contrib/fuse along with
 the other fuse implementations.  The fact that it's not as extensively
 covered by unit tests as mainline tahoe was given as corroboration.
 
 In a bid to try and stem the confusion inherent in having tahoe_fuse,
 tfuse and tahoefuse jumbled together (not necessarily helped by 
 referring to them as impl_a, b and c respectively) I'm hereby renaming
 tahoefuse as 'blackmatch'  (black match is, per wikipedia "a type of 
 crude fuse" hey, I'm a punny guy)  Maybe one day it'll be promoted to
 be 'quickmatch' instead...
 
 Anyway, this patch moves mac/tahoefuse.py out to contrib/fuse/impl_c/
 as blackmatch.py, and makes appropriate changes to the mac build process
 to transclude blackmatch therein.  this leaves the extant fuse.py and
 fuseparts business in mac/ as-is and doesn't attempt to address such
 issues in contrib/fuse/impl_c.
 
 it is left as an exercise to the reader (or the reader of a message
 to follow) as to how to deal with the 'fuse' python module on the mac.
 
 as of this time, blackmatch should work on both mac and linux, and
 passes the four extant tests in runtests.  (fwiw neither impl_a nor
 impl_b have I managed to get working on the mac yet)
 
 since blackmatch supports a read-write and caching fuse interface to
 tahoe, some write tests obviously need to be added to runtests.
 
] 
[macapp: changes to support aliases, updated tahoefuse command line options
robk-tahoe at allmydata.com**20080925010128
 
 the tahoefuse command line options changed to support the runtests harness,
 and as part of that gained support for named aliases via --alias
 
 this changes the mac app's invocation of tahoefuse to match that, and also
 changes the gui to present the list of defined aliases as valid mounts
 
 this replaces the previous logic which examined the ~/.tahoe/private directory
 looking for files ending in '.cap' - an ad-hoc alias mechanism.
 
 if a file is found matching ~/.tahoe/private/ALIASNAME.icns then that will still
 be passed to tahoefuse as the icon to display for that filesystem. if no such
 file is found, the allmydata icon will be used by default.
 
 the '-olocal' option is passed to tahoefuse.  this is potentially contentious.
 specifically this is telling the OS that this is a 'local' filesystem, which is
 intended to be used for locally attached devices.  however leopard (OSX 10.5)
 will only display non-local filesystems in the Finder's side bar if they are of
 fs types specifically known by Finder to be network file systems (nfs, cifs, 
 webdav, afp)  hence the -olocal flag is the only way on leopard to cause finder
 to display the mounted filesystem in the sidebar, but it displays as a 'device'.
 there is a potential (i.e. the fuse docs carry warnings) that this may cause
 vague and unspecified undesirable behaviour.
 (c.f. http://code.google.com/p/macfuse/wiki/FAQ specifically Q4.3 and Q4.1)
 
 
] 
[fuse/impl_c: reworking of mac/tahoefuse, command line options, test integration
robk-tahoe at allmydata.com**20080925001535
 
 a handful of changes to the tahoefuse implementation used by the mac build, to 
 make command line option parsing more flexible and robust, and moreover to 
 facilitate integration of this implementation with the 'runtests' test harness
 used to test the other two implementations.
 
 this patch includes;
 - improvements to command line option parsing [ see below ]
 - support for 'aliases' akin to other tahoe tools
 - tweaks to support linux (ubuntu hardy)
 
 the linux support tweaks are, or at least seem to be, a result of the fact that
 hardy ships with fuse 0.2pre3, as opposed to the fuse0.2 that macfuse is based
 upon.  at least the versions I was working with have discrepancies in their
 interfaces, but on reflection this is probably a 'python-fuse' version issue
 rather than fuse per se.  At any rate, the fixes to handling the Stat objects
 should be safe against either version, it's just that the bindings on hardy
 lacked code that was in the 'fuse' python module on the mac...
 
 command line options:
 
 the need for more flexible invocation in support of the runtests harness led
 me to rework the argument parsing from some simple positional hacks with a
 pass-through of the remainder to the fuse binding's 'fuse_main' to a system
 using twisted.usage to parse arguments, and having just one option '-o' being
 explicitly a pass-through for -o options to fuse_main. the options are now:
 
 --node-directory NODEDIR : this is used to look up the node-url to connect
 to if that's not specified concretely on the command line, and also used to
 determine the location of the cache directory used by the implementation,
 specifically '_cache' within the nodedir.  default value: ~/.tahoe
 
 --node-url NODEURL : specify a node-url taking precedence over that found
 in the node.url file within the nodedir
 
 --alias ALIAS : specifies the named alias should be mounted. a lookup is
 performed in the alias table within 'nodedir' to find the root dir cap
 the named alias must exist in the alias table of the specified nodedir
 
 --root-uri ROOTURI : specifies that the given directory uri should be mounted
 
 at least one of --alias and --root-uri must be given (which directory to mount
 must be specified somehow)  if both are given --alias takes precedence.
 
 --cache-timeout TIMEOUTSECS : specifies the number of seconds that cached
 directory data should be considered valid for.  this tahoefuse implementation
 implements directory caching for a limited time; largely because the mac (i.e.
 the Finder in particular) tends to make a large number of requests in quick 
 succession when browsing the filesystem.  on the flip side, the 'runtests'
 unit tests fail in the face of such caching because the changes made to the
 underlying tahoe directories are not reflected in the fuse presentation.  by 
 specifying a cache-timeout of 0 seconds, runtests can force the fuse layer
 into refetching directory data upon each request.
 
 any number of -oname=value options may be specified on the command line,
 and they will all be passed into the underlying fuse_main call.
 
 a single non-optional argument, the mountpoint, must also be given.
 
 
 
] 
[fuse/tests: slew of changes to fuse 'runtests'
robk-tahoe at allmydata.com**20080924183601
 
 This patch makes a significant number of changes to the fuse 'runtests' script
 which stem from my efforts to integrate the third fuse implementation into this
 framework.  Perhaps not all were necessary to that end, and I beg nejucomo's
 forbearance if I got too carried away.
 
 - cleaned up the blank lines; imho blank lines should be empty
 
 - made the unmount command switch based on platform, since macfuse just uses
 'umount' not the 'fusermount' command (which doesn't exist)
 
 - made the expected working dir for runtests the contrib/fuse dir, not the 
 top-level tahoe source tree - see also discussion of --path-to-tahoe below
 
 - significantly reworked the ImplProcManager class.  rather than subclassing
 for each fuse implementation to be tested, the new version is based on 
 instantiating objects and providing relevant config info to the constructor.
 this was motivated by a desire to eliminate the duplication of similar but
 subtly different code between instances, framed by consideration of increasing
 the number of platforms and implementations involved. each implementation to
 test is thus reduced to the pertinent import and an entry in the 
 'implementations' table defining how to handle that implementation. this also
 provides a way to specify which sets of tests to run for each implementation,
 more on that below.
 
 
 - significantly reworked the command line options parsing, using twisted.usage;
 
 what used to be a single optional argument is now represented by the 
 --test-type option which allows one to choose between running unittests, the
 system tests, or both.
 
 the --implementations option allows for a specific (comma-separated) list of
 implementations to be tested, or the default 'all'
 
 the --tests option allows for a specific (comma-separated) list of tests sets
 to be run, or the default 'all'.  note that only the intersection of tests
 requested on the command line and tests relevant to each implementation will
 be run. see below for more on tests sets.
 
 the --path-to-tahoe option allows for the path to the 'tahoe' executable to be
 specified. it defaults to '../../bin/tahoe' which is the location of the tahoe
 script in the source tree relative to the contrib/fuse dir by default.
 
 the --tmp-dir option controls where temporary directories (and hence 
 mountpoints) are created during the test.  this defaults to /tmp - a change
 from the previous behaviour of using the system default dir for calls to 
 tempfile.mkdtemp(), a behaviour which can be obtained by providing an empty
 value, e.g. "--tmp-dir=" 
 
 the --debug-wait flag causes the test runner to pause waiting upon user
 input at various stages through the testing, which facilitates debugging e.g.
 by allowing the user to open a browser and explore or modify the contents of
 the ephemeral grid after it has been instantiated but before tests are run,
 or make environmental adjustments before actually triggering fuse mounts etc.
 note that the webapi url for the first client node is printed out upon its
 startup to facilitate this sort of debugging also.
 
 
 - the default tmp dir was changed, and made configurable. previously the 
 default behaviour of tempfile.mkdtemp() was used.  it turns out that, at least
 on the mac, that led to temporary directories being created in a location
 which ultimately led to mountpoint paths longer than could be handled by 
 macfuse - specifically mounted filesystems could not be unmounted and would
 'leak'. by changing the default location to be rooted at /tmp this leads to
 mountpoint paths short enough to be supported without problems.
 
 - tests are now grouped into 'sets' by method name prefix.  all the existing
 tests have been moved into the 'read' set, i.e. with method names starting
 'test_read_'. this is intended to facilitate the fact that some implementations
 are read-only, and some support write, so the applicability of tests will vary
 by implementation. the 'implementations' table, which governs the configuration
 of the ImplProcManager responsible for a given implementation, provides a list
 of 'test' (i.e test set names) which are applicable to that implementation.
 note no 'write' tests yet exist, this is merely laying the groundwork.
 
 - the 'expected output' of the tahoe command, which is checked for 'surprising'
 output by regex match, can be confused by spurious output from libraries.
 specifically, testing on the mac produced a warning message about zope interface
 resolution across multiple eggs.  the 'check_tahoe_output()' function now has
 a list of 'ignorable_lines' (each a regex) which will be discarded before the
 remainder of the output of the tahoe script is matched against expectation.
 
 - cleaned up a typo, and a few spurious imports caught by pyflakes
 
] 
[fuse/impl_{a,b}: improve node-url handling
robk-tahoe at allmydata.com**20080924182854
   
 specifically change the expectation of the code to be such that the node-url
 (self.url) always includes the trailing slash to be a correctly formed url
 
 moreover read the node-url from the 'node.url' file found in the node 'basedir'
 and only if that doesn't exist, then fall back to reading the 'webport' file
 from therein and assuming localhost.  This then supports the general tahoe 
 pattern that tools needing only a webapi server can be pointed at a directory
 containing the node.url file, which can optionally point to another server,
 rather than requiring a complete node dir and locally running node instance.
 
] 
[fuse/impl_b: tweaks from testing on hardy
robk-tahoe at allmydata.com**20080924180738
 
 from testing on linux (specifically ubuntu hardy) the libfuse dll has a
 different name, specifically libfuse.so.2. this patch tries libfuse.so
 and then falls back to trying .2 if the former fails.
 
 it also changes the unmount behaviour, to simply return from the handler's
 loop_forever() loop upon being unmounted, rather than raising an EOFError,
 since none of the client code I looked at actually handled that exception,
 but did seem to expect to fall off of main() when loop_forever() returned.
 Additionally, from my testing unmount typically led to an OSError from the
 fuse fd read, rather than an empty read, as the code seemed to expect.
 
 also removed a spurious import pyflakes quibbled about.
] 
[setup: fix site-dirs to find system installed twisted on mac.
robk-tahoe at allmydata.com**20080924174255
 
 zooko helped me unravel a build weirdness today.  somehow the system installed
 twisted (/System/Library) was pulling in parts of the other twisted (/Library)
 which had been installed by easy_install, and exploding. 
 
 getting rid of the latter helped, but it took this change to get the tahoe
 build to stop trying to rebuild twisted and instead use the one that was 
 already installed. c.f. tkt #229
] 
[CLI: rework webopen, and moreover its tests w.r.t. path handling
robk-tahoe at allmydata.com**20080924164523
 
 in the recent reconciliation of webopen patches, I wound up adjusting
 webopen to 'pass through' the state of the trailing slash on the given
 argument to the resultant url passed to the browser.  this change 
 removes the requirement that arguments must be directories, and allows
 webopen to be used with files.  it also broke the tests that assumed
 that webopen would always normalise the url to have a trailing slash.
 
 in fixing the tests, I realised that, IMHO, there's something deeply
 awry with the way tahoe handles paths; specifically in the combination
 of '/' being the name of the root path within an alias, but a leading
 slash on paths, e.g. 'alias:/path', is categorically incorrect. i.e.
  'tahoe:' == 'tahoe:/' == '/' 
 but 'tahoe:/foo' is an invalid path, and must be 'tahoe:foo'
 
 I wound up making the internals of webopen simply spot a 'path' of
 '/' and smash it to '', which 'fixes' webopen to match the behaviour
 of tahoe's path handling elsewhere, but that special case sort of
 points to the weirdness.
 
 (fwiw, I personally found the fact that the leading / in a path was
 disallowed to be weird - I'm just used to seeing paths qualified by
 the leading / I guess - so in a debate about normalising path handling
 I'd vote to include the /)
 
] 
[CLI: reconcile webopen changes
robk-tahoe at allmydata.com**20080924152002
 
 I think this is largely attributable to a cleanup patch I'd made
 which never got committed upstream somehow, but at any rate various
 conflicting changes to webopen had been made. This cleans up the
 conflicts therein, and hopefully brings 'tahoe webopen' in line with
 other cli commands.
] 
[cli: cleanup webopen command
robk-tahoe at allmydata.com**20080618201940
 
 moved the body of webopen out of cli.py into tahoe_webopen.py
 
 made its invocation consistent with the other cli commands, most
 notably replacing its 'vdrive path' with the same alias parsing,
 allowing usage such as 'tahoe webopen private:Pictures/xti'
] 
[macapp: changed to remove 'Tahoe' from .app name
robk-tahoe at allmydata.com**20080611003145
 
 Change the build product from 'Allmydata Tahoe' to 'Allmydata'
 more in keeping with the branding of the Allmydata product
] 
[add --syslog argument to 'tahoe start' and 'tahoe restart', used to pass --syslog to twistd for non-Tahoe nodes (like cpu-watcher)
warner at allmydata.com**20080925010302] 
[misc/make-canary-files.py: tool to create 'canary files', explained in the docstring
warner at allmydata.com**20080925004716] 
[webapi: survive slashes in filenames better: make t=info and t=delete to work, and let t=rename fix the problem
warner at allmydata.com**20080924203505] 
[setup: when detecting platform, ask the Python Standard Library's platform.dist() before executing lsb_release, and cache the result in global (module) variables
zooko at zooko.com**20080924180922
 This should make it sufficiently fast, while still giving a better answer on Ubuntu than platform.dist() currently does, and also falling back to lsb_release if platform.dist() says that it doesn't know.
] 
[node.py: add BASEDIR/keepalive_timeout and BASEDIR/disconnect_timeout, to set/enable the foolscap timers, for #521
warner at allmydata.com**20080924175112] 
[setup: stop catching EnvironmentError when attempting to copy ./_auto_deps.py to ./src/allmydata/_auto_deps.py
zooko at zooko.com**20080924000402
 It is no longer the case that we can run okay without _auto_deps.py being in place in ./src/allmydata, so if that cp fails then the build should fail.
] 
[immutable: remove unused imports (thanks, pyflakes)
zooko at zooko.com**20080923192610] 
[immutable: refactor immutable filenodes and comparison thereof
zooko at zooko.com**20080923185249
 * the two kinds of immutable filenode now have a common base class
 * they store only an instance of their URI, not both an instance and a string
 * they delegate comparison to that instance
] 
[setup: try parsing /etc/lsb-release first, then invoking lsb_release, because the latter takes half-a-second on my workstation, which is too long
zooko at zooko.com**20080923171431
 Also because in some cases the former will work and the latter won't.
 This patch also tightens the regexes so it won't match random junk.
] 
[setup: fix a cut-and-paste error in the fallback to parsing /etc/lsb-release
zooko at zooko.com**20080923165551] 
[setup: if executing lsb_release doesn't work, fall back to parsing /etc/lsb-release before falling back to platform.dist()
zooko at zooko.com**20080923162858
 An explanation of why we do it this way is in the docstring.
] 
[setup: if invoking lsb_release doesn't work (which it doesn't on our etch buildslave), then fall back to the Python Standard Library's platform.dist() function
zooko at zooko.com**20080923154820] 
[setup: fix bug in recent patch to use allmydata.get_package_versions() to tell the foolscap app-version-tracking what's what
zooko at zooko.com**20080923001347] 
[setup: when using the foolscap "what versions are here?" feature, use allmydata.get_package_versions() instead of specifically importing allmydata, pycryptopp, and zfec
zooko at zooko.com**20080923000351] 
[setup: simplify the implementation of allmydata.get_package_versions() and add "platform" which is a human-oriented summary of the underlying operating system and machine
zooko at zooko.com**20080922235354] 
[misc/make_umid: change docs, make elisp code easier to grab
warner at lothar.com**20080920183933] 
[use foolscap's new app_versions API, require foolscap-0.3.1
warner at lothar.com**20080920183853] 
[BASEDIR/nickname is now UTF-8 encoded
warner at lothar.com**20080920183713] 
[various: use util.log.err instead of twisted.log.err, so we get both Incidents and trial-test-flunking
warner at lothar.com**20080920173545] 
[logging.txt: explain how to put log.err at the end of Deferred chains, explain FLOGTOTWISTED=1
warner at lothar.com**20080920173500] 
[util.log: send log.err to Twisted too, so that Trial tests are flunked
warner at lothar.com**20080920173427] 
[setup.py trial: improve --verbose suggestion a bit
warner at lothar.com**20080919193922] 
[test_cli: disable generate-keypair test on OS-X, pycryptopp still has a bug
warner at lothar.com**20080919193855] 
[NEWS: finish editing for the upcoming 1.3.0 release
warner at lothar.com**20080919193053] 
[NEWS: more edits, almost done
warner at lothar.com**20080919010036] 
[NEWS: describe all changes since the last release. Still needs editing.
warner at lothar.com**20080919002755] 
[CLI: add 'tahoe admin generate-keypair' command
warner at lothar.com**20080919001133] 
[web: add 'more info' pages for files and directories, move URI/checker-buttons/deep-size/etc off to them
warner at lothar.com**20080918050041] 
[setup.py: remove unused 'Extension' import
warner at lothar.com**20080917230829] 
[setup.py,Makefile: move the 'chmod +x bin/tahoe' into setup.py
warner at lothar.com**20080917230756] 
[docs/install.html: reference InstallDetails instead of debian-specific stuff
warner at lothar.com**20080917225742] 
[Makefile,docs: tahoe-deps.tar.gz now lives in separate source/deps/ directory on http://allmydata.org
warner at lothar.com**20080917204452] 
[docs: mention -SUMO tarballs, point users at release tarballs instead of development ones
warner at lothar.com**20080917203631] 
[setup.py,Makefile: teach sdist --sumo about tahoe-deps/, use -SUMO suffix on tarballs, add sumo to 'make tarballs' target
warner at lothar.com**20080917200119] 
[.darcs-boringfile ignore tahoe-deps and tahoe-deps.tar.gz
warner at lothar.com**20080917195938] 
[docs: add a note about the process of making a new Tahoe release
zooko at zooko.com**20080917170839] 
[Makefile: pyutil from a dependent lib causes a #455-ish problem, the workaround is to run build-once *three* times
warner at lothar.com**20080917053643] 
[Makefile: desert-island: don't re-fetch tahoe-deps.tar.gz if it's already there, remove the tahoe-deps/ directory before untarring to avoid unpacking weirdness
warner at lothar.com**20080917052204] 
[misc/check-build.py: ignore the 'Downloading file:..' line that occurs for the setup_requires= -triggered handling of the setuptools egg
warner at lothar.com**20080917051725] 
[#249: add 'test-desert-island', to assert that a tahoe-deps.tar.gz -enabled build does not download anything
warner at lothar.com**20080917013702] 
[#249: get dependent libs from tahoe-deps and ../tahoe-deps
warner at lothar.com**20080917013627] 
[#249: move dependent libs out of misc/dependencies/, get them from tahoe-deps.tar.gz instead
warner at allmydata.com**20080917012545] 
[conf_wiz.py - updating version numbers in file, should really get these from a TAG or conf file
secorp at allmydata.com**20080917004547] 
[webish: add an extra newline to JSON output
warner at lothar.com**20080915204314] 
[windows/Makefile: fix dependencies: windows-installer must cause windows-exe to run
warner at allmydata.com**20080912052151] 
[Makefile: fix windows issues
warner at allmydata.com**20080912050919] 
[Makefile: use run_with_pythonpath, move windows targets into a separate Makefile
warner at allmydata.com**20080912044508] 
[setup.py: add 'setup.py run_with_pythonpath', to run other commands with PYTHONPATH set usefully
warner at allmydata.com**20080912044418] 
[Makefile: convert check-auto-deps target into 'setup.py check_auto_deps'
warner at allmydata.com**20080912035904] 
[startstop_node.py: find twistd in our supportlib if we had to build Twisted as a setuptools dependency. This is a form of cgalvan's #505 patch, simplified because now 'setup.py trial' takes care of sys.path and PYTHONPATH
warner at allmydata.com**20080912025138] 
[rewrite parts of the Makefile in setup.py. Add 'build_tahoe' and 'trial' subcommands.
warner at allmydata.com**20080912010321
 
 The 'make build' target now runs 'setup.py build_tahoe', which figures out
 where the target 'supportlib' directory should go, and invokes 'setup.py
 develop' with the appropriate arguments.
 
 The 'make test' target now runs 'setup.py trial', which manages sys.path and
 runs trial as a subroutine instead of spawning an external process. This
 simplifies the case where Twisted was built as a dependent library (and thus
 the 'trial' executable is not on PATH).
 
 setup.py now manages sys.path and PYTHONPATH for its internal subcommands, so
 the $(PP) prefix was removed from all Makefile targets that invoke setup.py .
 For the remaining ones, the 'setup.py -q show_pythonpath' subcommand was
 added to compute this prefix with python rather than with fragile
 shell/Makefile syntax.
 
 
] 
[bin/tahoe: reflow error messages
warner at allmydata.com**20080912010225] 
[mac/Makefile: remove the verbose hdiutil diagnostics now that we resolved the problem
warner at allmydata.com**20080912004622] 
[Makefile: give setup.py develop a '--site-dirs' arg to work around the #249 setuptools bug which causes us to unnecessarily rebuild pyopenssl and other support libs installed via debian's python-support. Should be harmless on other platforms.
warner at allmydata.com**20080910233432] 
[web: fix output=JSON, add buttons for repair/json to the 'run deep-check' form
warner at allmydata.com**20080910211137] 
[disallow deep-check on non-directories, simplifies the code a bit
warner at allmydata.com**20080910204458] 
[dirnode: refactor recursive-traversal methods, add stats to deep_check() method results and t=deep-check webapi
warner at lothar.com**20080910084504] 
[dirnode: cleanup, make get_verifier() always return a URI instance, not a string
warner at lothar.com**20080910083755] 
[test_system: check t=deep-stats too
warner at lothar.com**20080910065457] 
[test_system: add deep-check-JSON tests, fix a bug
warner at lothar.com**20080910061416] 
[test_system: oops, re-enable some tests that got bypassed
warner at lothar.com**20080910060245] 
[test_system: add deep-stats test
warner at lothar.com**20080910055634] 
[hush pyflakes
warner at allmydata.com**20080910025017] 
[checker results: add output=JSON to webapi, add tests, clean up APIs
warner at allmydata.com**20080910024517
 to make the internal ones use binary strings (nodeid, storage index) and
 the web/JSON ones use base32-encoded strings. The immutable verifier is
 still incomplete (it returns imaginary healthy results).
] 
[immutable verifier: provide some dummy results so deep-check works, make the tests ignore these results until we finish it off
warner at allmydata.com**20080910010827] 
[mutable checker: even more tests. Everything in ICheckerResults should be covered now, except for immutable-verify which is incomplete
warner at allmydata.com**20080910005706] 
[checker results: more tests, update interface docs
warner at allmydata.com**20080910003010] 
[mutable checker: oops, fix redefinition of 'healthy' (numshares < N, not numshares < k, which is 'recoverable' not 'healthy')
warner at allmydata.com**20080910002853] 
[checker results: more tests, more results. immutable verifier tests are disabled until they emit more complete results
warner at allmydata.com**20080910001546] 
[checker: add tests, add stub for immutable check_and_repair
warner at allmydata.com**20080909233449] 
[interfaces.py: minor improvement to IDirectoryNode.set_node
warner at allmydata.com**20080909233416] 
[mac/Makefile: upload the .dmg file with foolscap xfer-client.py instead of scp
warner at allmydata.com**20080908231943] 
[misc/xfer-client.py: small foolscap utility to transfer a file to a waiting server
warner at allmydata.com**20080908231903] 
[setup: add excited DEVELOPER NOTE to install.html
zooko at zooko.com**20080908215603
 It should be removed before 1.3.0 release, of course...
] 
[setup: edit the text of install.html
zooko at zooko.com**20080908215549] 
[setup: add link to the DownloadDebianPackages page
zooko at zooko.com**20080908215451
 Because I want that link off of the front page of the wiki...
] 
[setup: change URL from which to get source tarballs
zooko at zooko.com**20080908215409
 So that when you look at that directory you won't see distracting other things such as darcs repositories.
] 
[test_system: make log() tolerate the format= form
warner at lothar.com**20080908030336] 
[immutable/checker: make log() tolerate the format= form
warner at lothar.com**20080908030308] 
[checker: overhaul checker results, split check/check_and_repair into separate methods, improve web displays
warner at allmydata.com**20080907194456] 
[webapi.txt: explain that t=manifest gives verifycaps
warner at allmydata.com**20080907192950] 
[introducer: add get_nickname_for_peerid
warner at allmydata.com**20080906050700] 
[docs/logging.txt: explain tahoe/foolscap logging. Addresses #239.
warner at allmydata.com**20080904002531] 
[setup: don't assert that trial is present when the Makefile is evaluated
zooko at zooko.com**20080903171837
 This should fix #506, but it means that if (for some weird reason) Twisted can't be auto-installed and the find_trial.py script doesn't work, the user will get a weird failure message instead of a clean failure message explaining that trial couldn't be found.  Oh well.
 
 Chris Galvan is working on a much nicer fix to all these issues -- see #505.
 
] 
[testutil.PollMixin: use a custom exception (and convert it) to avoid the ugly 'stash' cycle
warner at allmydata.com**20080903033251] 
[mac/Makefile: more attempts to debug the buildslave failure
warner at allmydata.com**20080829220614] 
[mac: add -verbose to the hdiutil call, to figure out why it's failing on the buildslave
warner at allmydata.com**20080829205243] 
[setup: simplify parsing of python version number
zooko at zooko.com**20080829000045] 
[setup: emit the version of python in the list of versions
zooko at zooko.com**20080828220454] 
[munin: add tahoe_diskleft plugin, update spacetime/diskwatcher.tac to support it
warner at allmydata.com**20080828203236] 
[docs: how_to_make_a_tahoe_release.txt
zooko at zooko.com**20080828202109
 Just some cryptic notes to self, but if I get hit by a truck then someone else might be able to decode them.
] 
[debian: include misc/cpu-watcher.tac in the debian package
warner at allmydata.com**20080827223026] 
[munin/tahoe_doomsday: change the graph title, 'time predictor' is more accurate than 'space predictor'
warner at allmydata.com**20080827213013] 
[munin/tahoe_diskusage: clip the graph at zero, to prevent transient negative excursions (such as when a lot of old logfiles are deleted from a storage server's disk) from scaling the graph into unusability
warner at allmydata.com**20080827193543] 
[CREDITS: thanks to Chris Galvan
zooko at zooko.com**20080827183950] 
[setup: patch from Chris Galvan to build sdists with no deps in them normally, but include deps if --sumo
zooko at zooko.com**20080827182644] 
[servermap: don't log late arrivals, and don't log DeadReferenceError at log.WEIRD
warner at allmydata.com**20080827003729] 
[mutable: make mutable-repair work for non-verifier runs, add tests
warner at allmydata.com**20080826233454] 
[mutable: remove work-around for a flaw in an older version of foolscap
zooko at zooko.com**20080826155055
 We now require "foolscap[secure_connections] >= 0.3.0", per [source:_auto_deps.py].
] 
[docs: edit install.html a tad
zooko at zooko.com**20080826154929] 
[misc/make_umid: little script and elisp fragment to insert umid= arguments
warner at allmydata.com**20080826015918] 
[logging: add 'unique-message-ids' (or 'umids') to each WEIRD-or-higher log.msg call, to make it easier to correlate log message with source code
warner at allmydata.com**20080826015759] 
[logging cleanups: lower DeadReferenceError from WEIRD (which provokes Incidents) to merely UNUSUAL, don't pre-format Failures in others
warner at allmydata.com**20080826005155] 
[checker: make the log() function of SimpleCHKFileVerifier compatible with the log() function of its superclasses and subclasses
zooko at zooko.com**20080825214407] 
[docs: warn that the "garbage-collection and accounting" section of architecture.txt is out of date, and clarify that "deleted" therein means ciphertext getting garbage-collected
zooko at zooko.com**20080822154605] 
[docs/filesystem-notes.txt: add notes about enabling the 'directory index' feature on ext3 filesystems for storage server lookup speed
warner at allmydata.com**20080821205901] 
[setup: doc string describing what the require_auto_deps() function is for
zooko at zooko.com**20080815172234] 
[mutable/checker: log a WEIRD-level event when we see a hash failure, to trigger an Incident
warner at allmydata.com**20080813035020] 
[immutable checker: add a status_report field
warner at allmydata.com**20080813033530] 
[mutable/servermap: lower the priority of many log messages
warner at allmydata.com**20080813033506] 
[web/deep-check: show the webapi runtime at the bottom of the page
warner at allmydata.com**20080813033426] 
[CLI: tolerate blank lines in the aliases file
warner at allmydata.com**20080813025050] 
[test_web: workaround broken HEAD behavior in twisted-2.5.0 and earlier
warner at allmydata.com**20080813024520] 
[test_web: oops, actually use HEAD (instead of GET) in the HEAD test
warner at allmydata.com**20080813020451] 
[web: use get_size_of_best_version for HEAD requests, provide correct content-type
warner at allmydata.com**20080813020410] 
[mutable: add get_size_of_best_version to the interface, to simplify the web HEAD code, and tests
warner at allmydata.com**20080813020252] 
[CLI: add 'tahoe debug corrupt-share', and use it for deep-verify tests, and fix non-deep web checker API to pass verify=true into node
warner at allmydata.com**20080813000501] 
[IFilesystemNode: add get_storage_index(), it makes tests easier
warner at allmydata.com**20080812231407] 
[test_system: rename Checker to ImmutableChecker, to make room for a mutable one
warner at allmydata.com**20080812225932] 
['tahoe debug dump-share': add --offsets, to show section offsets
warner at allmydata.com**20080812214656] 
[test_cli: oops, fix tests after recent stdout/stderr cleanup
warner at allmydata.com**20080812214634] 
[scripts/debug: split out dump_immutable_share
warner at allmydata.com**20080812205517] 
[scripts/debug: clean up use of stdout/stderr
warner at allmydata.com**20080812205242] 
[CLI: move the 'repl' command to 'tahoe debug repl'
warner at allmydata.com**20080812204017] 
[CLI: move all debug commands (dump-share, dump-cap, find-shares, catalog-shares) into a 'debug' subcommand, and improve --help output
warner at allmydata.com**20080812203732] 
[hush a pyflakes warning
warner at allmydata.com**20080812042423] 
[web/directory: enable verify=true in t=deep-check
warner at allmydata.com**20080812042409] 
[dirnode: add some deep-check logging
warner at allmydata.com**20080812042338] 
[checker_results.problems: don't str the whole Failure, just extract the reason string
warner at allmydata.com**20080812042306] 
[checker: add information to results, add some deep-check tests, fix a bug in which unhealthy files were not counted
warner at allmydata.com**20080812040326] 
[mutable/checker: rearrange a bit, change checker-results to have a status_report string
warner at allmydata.com**20080812032033] 
[mutable/servermap: add summarize_version
warner at allmydata.com**20080812031930] 
[CLI: make 'tahoe webopen' command accept aliases like 'tahoe ls'
warner at allmydata.com**20080812012023] 
[munin diskusage/doomsday: oops, fix labels, everything was reported in the 1hr column
warner at allmydata.com**20080811203431] 
[munin/tahoe_overhead: don't emit nonsensical numbers
warner at lothar.com**20080807214008] 
[munin: add tahoe_overhead plugin, to measure effectiveness of GC and deleting data from inactive accounts
warner at lothar.com**20080807203925] 
[diskwatcher.tac: include total-bytes-used
warner at lothar.com**20080807201214] 
[setup: remove accidentally duplicated lines from Makefile
zooko at zooko.com**20080807193029] 
[misc/dependencies: remove the no-longer-useful foolscap-0.2.5 tarball
warner at lothar.com**20080807184546] 
[Makefile: avoid bare quotes, since the emacs syntax-highlighter gets confused by them
warner at lothar.com**20080807183001] 
[diskwatcher.tac: don't report negative timeleft
warner at lothar.com**20080807173433] 
[diskwatcher.tac: reduce the polling rate to once per hour
warner at lothar.com**20080807062021] 
[misc/spacetime: add munin plugins, add everything to .deb
warner at lothar.com**20080807060003] 
[diskwatcher.tac: hush pyflakes
warner at lothar.com**20080807050427] 
[diskwatcher.tac: add async-GET code, but leave it commented out: urlopen() seems to work better for now
warner at lothar.com**20080807050327] 
[cpu-watcher.tac: improve error message
warner at lothar.com**20080807043801] 
[disk-watcher: first draft of a daemon to use the HTTP stats interface and its new storage_server.disk_avail feature, to track changes in disk space over time
warner at lothar.com**20080807042222] 
[misc/cpu-watcher.tac: tolerate missing pidfiles, just skip over that sample
warner at lothar.com**20080807041705] 
[setup: don't attempt to escape quote marks, just delete them.  Ugly, but it works okay.
zooko at zooko.com**20080806232742] 
[setup: escape any double-quote chars in the PATH before using the PATH to find and invoke trial
zooko at zooko.com**20080806231143] 
[storage: include disk-free information in the stats-gatherer output
warner at lothar.com**20080806210602] 
[mutable: more repair tests, one with force=True to check out merging
warner at lothar.com**20080806190607] 
[test/common: add ShouldFailMixin
warner at lothar.com**20080806190552] 
[test_mutable: add comment about minimal-bandwidth repairer, comma lack of
warner at lothar.com**20080806173850] 
[test_mutable: factor out common setup code
warner at lothar.com**20080806173804] 
[mutable: start adding Repair tests, fix a simple bug
warner at lothar.com**20080806061239] 
[mutable.txt: add warning about out-of-date section
warner at lothar.com**20080806061219] 
[test_system: factor out find_shares/replace_shares to a common class, so they can be used by other tests
warner at lothar.com**20080806014958] 
[debian/control: update dependencies to match _auto_deps: foolscap-0.3.0, pycryptopp-0.5
warner at lothar.com**20080806013222] 
[bump foolscap dependency to 0.3.0, for the new incident-gathering interfaces
warner at lothar.com**20080805235828] 
[web: add 'report incident' button at the bottom of the welcome page
warner at lothar.com**20080805190921] 
[test_cli: more coverage for 'tahoe put' modifying a mutable file in-place, by filename, closes #441
warner at lothar.com**20080804202643] 
[check_grid.py: update to match new CLI: 'put - TARGET' instead of 'put TARGET'
warner at lothar.com**20080802024856] 
[test_cli: remove windows-worrying newlines from test data
warner at lothar.com**20080802024734] 
[test_cli.py: factor out CLITestMixin
warner at lothar.com**20080802022938] 
[CLI: change one-arg forms of 'tahoe put' to make an unlinked file, fix replace-mutable #441
warner at lothar.com**20080802022729] 
[CLI: add create-alias command, to merge mkdir and add-alias into a single (secure-from-argv-snooping) step
warner at lothar.com**20080802021041] 
[test_cli: add system-based tests for PUT, including a mutable put that fails/todo (#441)
warner at lothar.com**20080801221009] 
[tests: simplify CLI tests that use stdin, now that runner supports it
warner at lothar.com**20080801220514] 
[CLI: simplify argument-passing, use options= for everything, including stdout
warner at lothar.com**20080801184624] 
[tests: add test that verifier notices any (randomly chosen) bit flipped in the verifiable part of any (randomly chosen) share
zooko at zooko.com**20080731002015
 The current verifier doesn't (usually) pass this randomized test, hence the TODO.
] 
[tests: test that checker doesn't cause reads on the storage servers
zooko at zooko.com**20080730235420
 It would still pass the test if it noticed a corrupted share.  (It won't
 notice, of course.)  But it is required to do its work without causing storage
 servers to read blocks from the filesystem.
 
] 
[storage: make storage servers declare oldest supported version == 1.0, and storage clients declare oldest supported version == 1.0
zooko at zooko.com**20080730225107
 See comments in patch for intended semantics.
] 
[tests: use the handy dandy TestCase.mktemp() function from trial to give unique and nicely named directories for each testcase
zooko at zooko.com**20080730224920] 
[tests: don't use SignalMixin
zooko at zooko.com**20080730223536
 It seems like we no longer need it, and it screws up something internal in
 trial which causes trial's TestCase.mktemp() method to exhibit wrong behavior
 (always using a certain test method name instead of using the current test
 method name), and I wish to use TestCase.mktemp().
 
 Of course, it is possible that the buildbot is about to tell me that we do
 still require SignalMixin on some of our platforms...
 
] 
[setup: if the user passes a TRIALOPT env var then pass that on to trial
zooko at zooko.com**20080730205806
 This is useful for --reporter=bwverbose, for example.
] 
[setup: turn back on reactor=poll for cygwin trial (else it runs out of fds)
zooko at zooko.com**20080730181217] 
[setup: fix bug in Makefile -- ifeq, not ifneq -- so that now it sets poll reactor only if the user hasn't specified a REACTOR variable, instead of setting poll reactor only if the user has specified a REACTOR variable
zooko at zooko.com**20080730160429] 
[setup: whoops, really remove the default reactor=poll this time
zooko at zooko.com**20080730032358] 
[setup: instead of setting --reactor=poll for trial in all cases (which fails on platforms that don't have poll reactor, such as Windows and some Mac OS X), just set --reactor=poll for linux2.
zooko at zooko.com**20080730031656
 
] 
[setup: pass --reactor=poll to trial unless REACTOR variable is set, in which case pass --reactor=$(REACTOR)
zooko at zooko.com**20080730023906
 This hopefully works around the problem that Twisted v8.1.0 has a bug when
 used with pyOpenSSL v0.7 that causes some unit tests to spuriously fail --
 see known_issues.txt r2788:
 
 http://allmydata.org/trac/tahoe/browser/docs/known_issues.txt?rev=2788#L122
 
 Also it matches with the fact that --reactor=poll is required on cygwin.
 
] 
[setup: require secure_connections from foolscap
zooko at zooko.com**20080730021041
 This causes a problem on debian sid, since the pyOpenSSL v0.6 .deb doesn't come
 with .egg-info, so setuptools will not know that it is already installed and
 will try to install pyOpenSSL, and if it installs pyOpenSSL v0.7, then this
 will trigger the bug in Twisted v8.1.0 when used with pyOpenSSL v0.7.
 
 http://twistedmatrix.com/trac/ticket/3218
 
 Now the comments in twisted #3218 suggest that it happens only with the select
 reactor, so maybe using --reactor=poll will avoid it.
 
] 
[tests: add test_system.Checker which tests basic checking (without verification) functionality
zooko at zooko.com**20080728234317] 
[test: add testutil.flip_one_bit which flips a randomly chosen bit of the input string
zooko at zooko.com**20080728234217] 
[tests: make it so that you can use common.py's SystemTestMixin.set_up_nodes() more than once with the same introducer
zooko at zooko.com**20080728234029] 
[download.py: set up self._paused before registering the producer, since they might call pauseProducing right away
warner at lothar.com**20080728215731] 
[test/common.py: use pre-computed Tub certificates for the system-test mixin, to speed such tests up by maybe 15%. The goal is to encourage more full-grid tests.
warner at allmydata.com**20080728194421] 
[munin/tahoe_spacetime: show 2wk data even if 4wk data is unavailable
warner at allmydata.com**20080728194233] 
[web: add /status/?t=json, with active upload/download ops. Addresses #493.
warner at allmydata.com**20080726004110] 
[web: make t=json stats pages use text/plain, instead of leaving it at text/html
warner at allmydata.com**20080726002427] 
[test_system.py: factor SystemTestMixin out of SystemTest
warner at allmydata.com**20080725223349] 
[test_system.py: modify system-test setup code in preparation for merge with common.SystemTestMixin
warner at allmydata.com**20080725222931] 
[test_system.py: move SystemTestMixin out into common.py, where further improvements will occur
warner at allmydata.com**20080725221758] 
[test_system.py: create SystemTestMixin, with less cruft, for faster system-like tests
warner at allmydata.com**20080725221300] 
[TAG allmydata-tahoe-1.2.0
zooko at zooko.com**20080722014608] 
Patch bundle hash:
96526fa51e1f484e4aa3f25ee31fbc9ea7e829af

