[tahoe-dev] [tahoe-lafs] #200: writing of shares is fragile and "tahoe stop" is unnecessarily harsh

tahoe-lafs trac at allmydata.org
Mon Nov 2 16:08:09 UTC 2009

#200: writing of shares is fragile and "tahoe stop" is unnecessarily harsh
 Reporter:  zooko         |           Owner:  warner    
     Type:  enhancement   |          Status:  new       
 Priority:  major         |       Milestone:  eventually
Component:  code-storage  |         Version:  0.6.1     
 Keywords:  reliability   |   Launchpad_bug:            

Comment(by zooko):

 I'm sorry if this topic makes you feel unhappy.  For what it is worth, I
 am satisfied with the current behavior: dumb writes, stupid shutdown,
 simple startup.  :-)  This scores highest on simplicity, highest on
 performance, and not so great on preserving mutable shares.

 This seems okay to me, because I consider shares to be expendable -- files
 are what we care about, and those are preserved by verification and repair
 at the Tahoe-LAFS layer rather than by having high quality storage at the
 storage layer.  allmydata.com uses cheap commodity PC kit, such as a 2
 TB hard drive for a mere $200.  Enterprise storage people consider it to
 be completely irresponsible and wrong to use such kit for "enterprise"
 purposes.  They buy "enterprise" SCSI drives from their big equipment
 provider (Sun, HP, IBM) with something like 300 GB capacity for something
 like $500.  Then they add RAID-5 or RAID-6 or RAID-Z, redundant power
 supplies, yadda yadda yadda.

 So anyway, allmydata.com buys these commodity PCs -- basically the same
 hardware you can buy retail at Fry's or Newegg -- which are quite
 inexpensive and suffer a correspondingly higher failure rate.  In one
 memorable incident, one of these 1U servers from SuperMicro failed in
 such a way that all four of the commodity 1 TB hard drives in it were
 destroyed.  This means lots of mutable shares -- maybe something on the
 order of 10,000 mutable shares -- were destroyed in an instant!  But none
 of the allmydata.com customer files were harmed.

 The hard shutdown behavior that is currently in Tahoe-LAFS would have to
 be exercised quite a lot while under high load before it would come close
 to destroying that many mutable shares.  :-)

 I would accept changing it to do robust writes such as the simple "write-
 new-then-relink-into-place".  (My guess is that this will not cause a
 noticeable performance degradation.)
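 The "write-new-then-relink-into-place" pattern I mean is the standard
 unixy one; a minimal sketch in Python (function name and fsync policy
 are my own illustration, not code from the Tahoe-LAFS tree):

```python
import os
import tempfile

def write_share_atomically(path, data):
    """Write data to a temp file in the same directory, fsync it, then
    atomically rename it into place.  A hard shutdown at any point
    leaves either the old share or the new one, never a torn write."""
    dirname = os.path.dirname(path) or "."
    fd, tmppath = tempfile.mkstemp(dir=dirname, prefix=".share-tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmppath, path)  # atomic rename on POSIX
    except BaseException:
        os.unlink(tmppath)
        raise
```

 The temp file must live in the same directory as the destination so
 that the rename stays within one filesystem and remains atomic.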

 I would accept changing it to do traditional unixy two-phase graceful
 shutdown as you describe, with misgivings, as I think I've already made
 clear to you in personal conversation and in comment:ticket:181:8.
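 By "two-phase graceful shutdown" I mean: first stop accepting new work,
 then drain whatever operations are in flight before exiting.  A rough
 sketch of that idea (the class and its names are hypothetical, not an
 existing Tahoe-LAFS API):

```python
import threading

class GracefulShutdown:
    """Phase 1: refuse new operations once a stop is requested.
    Phase 2: wait for in-flight operations to drain, then exit."""

    def __init__(self):
        self.stopping = threading.Event()
        self.idle = threading.Event()
        self.idle.set()          # no operations in flight yet
        self.in_flight = 0
        self.lock = threading.Lock()

    def begin_op(self):
        """Returns False if shutdown has begun; callers must not start."""
        with self.lock:
            if self.stopping.is_set():
                return False
            self.in_flight += 1
            self.idle.clear()
        return True

    def end_op(self):
        with self.lock:
            self.in_flight -= 1
            if self.in_flight == 0:
                self.idle.set()

    def wait_for_drain(self, timeout=None):
        """Request stop and block until all in-flight work completes."""
        self.stopping.set()
        return self.idle.wait(timeout)
```

 A SIGTERM handler would call {{{wait_for_drain()}}} and then exit; the
 current {{{tahoe stop}}} instead kills the process outright.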

 To sum up my misgivings: 1. our handling of hard shutdown (e.g. power
 off, out of disk space, kernel crash) is not thereby improved, and 2. if
 we come to rely on "graceful shutdown" then our "robust startup" muscles
 will atrophy from disuse.

 Consider this: we currently have no automated tests of what happens when
 servers get shut down in the middle of their work.  So we should worry
 that as the code evolves, someone could commit a patch which causes bad
 behavior in that case and we wouldn't notice.

 However, we do know that every time anyone runs {{{tahoe stop}}} or
 {{{tahoe restart}}}, it exercises the hard shutdown case.  The fact
 that allmydata.com has hundreds of servers with this behavior and has had
 for years gives me increased confidence the current code doesn't do
 anything catastrophically wrong in this case.

 If we improved {{{tahoe stop}}} to be a graceful shutdown instead of a
 hard shutdown, then of course the current version of Tahoe-LAFS would
 still be just as good as ever, but as time went on and the code evolved I
 would start worrying more and more about how tahoe servers handle the hard
 shutdown case.  Maybe this means we need automated tests of that case.
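 Such a test need not be elaborate: run a writer process, SIGKILL it
 mid-write, and check that the surviving share is intact.  A sketch of
 what I have in mind, assuming shares are written with the atomic
 write-then-rename pattern (all names here are illustrative):

```python
import os
import signal
import subprocess
import sys
import time

# A stand-in writer: rewrites one share in a loop, atomically.
CHILD = r'''
import os, sys, tempfile
d = sys.argv[1]
i = 0
while True:
    data = (b"%08d" % i) * 1024          # 8 KiB of one repeated record
    fd, tmp = tempfile.mkstemp(dir=d)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, os.path.join(d, "share"))
    i += 1
'''

def check_hard_shutdown(workdir):
    """Kill a writer mid-stream (like "tahoe stop" does) and verify the
    share on disk is always one complete, consistent version."""
    p = subprocess.Popen([sys.executable, "-c", CHILD, workdir])
    time.sleep(1.0)                       # let it write for a while
    p.send_signal(signal.SIGKILL)         # hard shutdown, no cleanup
    p.wait()
    with open(os.path.join(workdir, "share"), "rb") as f:
        data = f.read()
    # an intact share is 8 KiB of a single repeated 8-byte record
    records = {data[i:i + 8] for i in range(0, len(data), 8)}
    return len(data) == 8 * 1024 and len(records) == 1
```

 Run repeatedly, this would catch a future patch that quietly replaced
 the robust write with an in-place overwrite.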

Ticket URL: <http://allmydata.org/trac/tahoe/ticket/200#comment:6>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid
