[tahoe-dev] Recombinant Hashes
cmsdew at googlemail.com
Thu Feb 11 20:28:31 UTC 2010
I'm a bit confused by your reply. I hope you can expand on it.
There are 12 hashes created (numbered from 0, to confuse people). Any
number of hashes can be created; I chose 12 simply because it is a nice
number. There could have been 1000 hashes. I don't understand the
term 'expansion'. Is unlimited expansion what defines a fountain code,
or is the definition restricted to variants of the 'random linear
fountain'?
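For readers unfamiliar with the term, a minimal sketch of the 'random
linear fountain' idea mentioned above (this is NOT the algorithm being
discussed in this thread, just an illustration in Python): each "droplet"
is the XOR of a random subset of the k source blocks, tagged with a
bitmask of which blocks went in, and any set of droplets whose masks span
GF(2)^k recovers the data by Gaussian elimination.

```python
import random

def make_droplet(blocks, rng):
    """One fountain droplet: (bitmask of blocks used, XOR of those blocks)."""
    k = len(blocks)
    mask = rng.getrandbits(k) or 1        # avoid the empty combination
    payload = 0
    for i in range(k):
        if (mask >> i) & 1:
            payload ^= blocks[i]
    return mask, payload

def decode(droplets, k):
    """Recover the k blocks by Gaussian elimination over GF(2), or None."""
    pivots = [None] * k                   # pivots[i] has lowest set bit i
    for mask, payload in droplets:
        while mask:
            low = (mask & -mask).bit_length() - 1
            if pivots[low] is None:
                pivots[low] = (mask, payload)
                break
            pmask, ppayload = pivots[low]
            mask ^= pmask                 # reduce against existing pivot
            payload ^= ppayload
    if any(p is None for p in pivots):
        return None                       # not enough independent droplets
    blocks = [None] * k                   # back-substitute, highest bit first
    for i in range(k - 1, -1, -1):
        mask, payload = pivots[i]
        for j in range(i + 1, k):
            if (mask >> j) & 1:
                payload ^= blocks[j]
        blocks[i] = payload
    return blocks
```

On average only slightly more than k random droplets are needed before
the masks span the space, which is the "unlimited expansion" property.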
The efficiency of my algorithm increases as you increase the number of
hashes. I restricted the demo to a short block of data and a limited
number of hashes. In fact only 4 hashes and one bit from a fifth hash
are required to get back the data. That was too complicated for a
demo.
A more general implementation (which I'll port to C, clean up and
document) takes blocks of 119 uint16s and needs 120 uint16 hashes
to get back the original data. (120 was chosen as it's highly
divisible). That's an efficiency of over 99%.
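For reference, the efficiency figure above works out as follows (plain
arithmetic, no assumptions beyond the numbers quoted):

```python
# 120 uint16 hashes are needed to recover a block of 119 uint16s.
data_units = 119   # uint16s per data block
hash_units = 120   # uint16 hashes needed to recover the block
efficiency = data_units / hash_units
print(f"{efficiency:.2%}")  # prints "99.17%"
```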
Theoretically there's no limit to the efficiency, but the processing
overhead grows at a greater than linear rate when larger block sizes
are chosen. When I was using block sizes of 4KiB and 'drops' of 256
bytes, the processor became a bottleneck (compared with broadband
speeds).
On 11 February 2010 19:21, David-Sarah Hopwood
<david-sarah at jacaranda.org> wrote:
> Chris Dew wrote:
>> Hi, I just read about this project on Reddit, and wondered whether you
>> use an algorithm similar to that which I've developed.
>> It lets you make an unlimited amount of hashes from some data. So
>> long as the total length of the hashes you possess is greater than
>> the length of the original data, you can recreate the original data -
>> regardless of which combination of hashes you possess.
> Yes, that's a form of Forward Error Correction code (a block code,
> more specifically).
>> My use case for this algorithm is very similar to your own project, so
>> I thought I'd get in touch.
> In the demo, 5 of 11 hashes are needed to recover the data. But the
> maximum input size was 16 bytes, and the total size of the hashes was
> 11*4 = 44 bytes. That's an expansion of 44/16 = 2.75, when an expansion
> of 11/5 = 2.2 should be possible.
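[The expansion figures quoted above work out as follows (sizes in
bytes, taken directly from the demo numbers):]

```python
# Expansion = total size of hashes needed (or stored) / size of the data.
input_size = 16        # maximum input size in the demo
num_hashes = 11        # hashes produced
hash_size  = 4         # bytes per hash
needed     = 5         # hashes required to recover the data
actual_expansion = num_hashes * hash_size / input_size  # 44 / 16
ideal_expansion  = num_hashes / needed                  # 11 / 5
print(actual_expansion, ideal_expansion)  # prints "2.75 2.2"
```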
> The Reed-Solomon FEC code that Tahoe uses also has the property that
> the first few shares are just slices of the data, so less computation
> is needed to either generate or recombine them.
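[The "shares are slices of the data" property can be seen even in a toy
systematic code. The Python sketch below uses a single XOR parity share;
it is an illustration of that property only, not Tahoe's actual
Reed-Solomon (zfec) code:]

```python
from functools import reduce

def encode(data: bytes, k: int) -> list:
    """Split data into k equal slices and append one XOR parity share.

    Systematic: the first k shares ARE the data, so generating or
    reading back those shares needs no arithmetic at all.
    """
    assert len(data) % k == 0
    n = len(data) // k
    shares = [data[i * n:(i + 1) * n] for i in range(k)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*shares))
    return shares + [parity]

def recover(shares: list) -> list:
    """Rebuild the single missing share (marked None) from the others.

    The XOR of all k+1 shares is zero columnwise, so the missing one
    equals the XOR of those still present.
    """
    present = [s for s in shares if s is not None]
    rebuilt = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*present))
    i = shares.index(None)
    return shares[:i] + [rebuilt] + shares[i + 1:]
```

[This toy code only tolerates one lost share; Reed-Solomon generalizes
it to any k-of-n while keeping the first shares as plain slices.]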
> David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com
> tahoe-dev mailing list
> tahoe-dev at allmydata.org