[tahoe-dev] measure your convergence

Aleksandr Milewski zandr at allmydata.com
Fri Mar 21 05:39:18 UTC 2008

On Mar 20, 2008, at 1:38 PM, zooko wrote:
> Folks:
> Ever wondered how much storage space you would save if you and your
> friends coalesced all of your identical files?

In thinking about the implications of the loss of universal  
convergence, a few things come to mind. I was chatting with Zooko over  
IM about these, and he suggested I share them more broadly.

Though I haven't run dupfilefind on my own machines yet, intuition  
tells me that I don't really have a lot of duplication on a single  
system. However, convergence is still important, because while I'm not  
likely to have lots of local duplication, it *is* likely that  
something will have happened to cause me to forget that I've already  
uploaded something. Convergence accelerates these otherwise wasted
uploads.

Further, broadening the scope of the chk_secret to all of a user's  
machines will indeed offer some savings. .Mac is actively ensuring  
that there are a few GB of files that are duplicated between my  
notebook and desktop machines, and as they're both Macs, there are  
likely very large overlaps between the two boxes.

Also, having a single chk_secret per account paves over a number of  
opportunities for errors in restore-then-backup cases, where a machine  
recovered from the mesh would generate a new chk_secret and then re- 
upload its entire contents.
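
To make the restore-then-backup case concrete, here is a minimal Python
sketch. It assumes a hypothetical key derivation (SHA-256 over the
chk_secret followed by the file contents) and a plain set standing in for
the mesh; the names convergent_key and backup are invented for
illustration, and this is not necessarily how Tahoe derives its keys.

```python
import hashlib
import os


def convergent_key(chk_secret: bytes, data: bytes) -> str:
    """Hypothetical derivation: hash the secret together with the contents.

    The secret is length-prefixed so that the secret/data boundary is
    unambiguous. This is an illustrative assumption, not Tahoe's scheme.
    """
    h = hashlib.sha256()
    h.update(len(chk_secret).to_bytes(4, "big"))
    h.update(chk_secret)
    h.update(data)
    return h.hexdigest()


def backup(files, chk_secret, stored):
    """Upload only files whose key is not already in `stored`.

    `stored` is a set standing in for the mesh. Returns the number of
    files that actually had to be uploaded.
    """
    uploaded = 0
    for data in files:
        key = convergent_key(chk_secret, data)
        if key not in stored:
            stored.add(key)
            uploaded += 1
    return uploaded


files = [b"photo", b"song", b"document"]
account_secret = os.urandom(32)  # one chk_secret shared across the account
stored = set()

# Initial backup: nothing is in the mesh yet, so everything is uploaded.
first = backup(files, account_secret, stored)

# Machine restored, same per-account secret: every key already exists.
after_restore_same = backup(files, account_secret, stored)

# Machine restored, fresh per-machine secret: every key is new again.
after_restore_new = backup(files, os.urandom(32), stored)

print(first, after_restore_same, after_restore_new)
```

With the account-wide secret the second backup uploads nothing, while the
freshly generated secret causes all three files to be uploaded again.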

