[tahoe-dev] BackupDB proposal

Ben Laurie ben at links.org
Thu May 29 18:46:53 UTC 2008


Aleksandr Milewski wrote:
> On May 29, 2008, at 01:45 , Ben Laurie wrote:
>> Rather than messing around with a database, I would store hashes
>> alongside each file and check whether the hash has changed. Obviously
>> you incur the cost of rehashing the local file each time, but, well, who
>> cares?
> 
> Users care.
> 
> Rehashing the entire filesystem every time you're trying to run an  
> incremental backup is obnoxious. It will take a large amount of disk  
> IO and CPU time to do even a small backup.
> 
> Putting the hashes alongside each file is also obnoxious for a couple  
> of reasons, but most importantly because a backup is nominally a read  
> operation. Scribbling all over the filesystem you're supposed to be  
> backing up is a bad idea.

I didn't mean on the backed-up system; I meant on the backup.

> So, you could create a parallel tree with the file hashes, but if  
> you're going to do that, then a database is faster and easier. And,  
> FWIW, common practice in backup tools.

Yeah, including my favourite, bacula.

In fact, a Tahoe storage device for bacula would be cool.
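
For concreteness, here is a minimal sketch of the database approach discussed above: keep each file's size, mtime and hash in a local table, and only rehash when the metadata has changed. This is an illustration, not the actual BackupDB design; the table layout and the names open_db, sha256_of and needs_backup are all hypothetical, and SQLite and SHA-256 are just assumed choices.

```python
import hashlib
import os
import sqlite3


def open_db(path=":memory:"):
    # Hypothetical schema: one row per local file that has been seen.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, sha256 TEXT)"
    )
    return db


def sha256_of(path):
    # Hash the file in 1 MiB chunks so large files are not read into memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_backup(db, path):
    """Return True if the file looks new or changed, and record its state."""
    st = os.stat(path)
    row = db.execute(
        "SELECT size, mtime_ns, sha256 FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == st.st_size and row[1] == st.st_mtime_ns:
        return False  # metadata unchanged: skip the expensive rehash
    digest = sha256_of(path)
    changed = row is None or row[2] != digest
    db.execute(
        "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
        (path, st.st_size, st.st_mtime_ns, digest),
    )
    db.commit()
    return changed
```

The trade-off is the usual one: an unchanged file costs a stat() and one indexed lookup rather than a full read, but a change that preserves both size and mtime goes unnoticed.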

-- 
http://www.apache-ssl.org/ben.html           http://www.links.org/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


