[tahoe-dev] [tahoe-lafs] #755: if there is an unrecoverable subdirectory, the deep-check report (both WUI and CLI) loses other information

tahoe-lafs trac at tahoe-lafs.org
Fri Jan 7 05:31:20 UTC 2011

#755: if there is an unrecoverable subdirectory, the deep-check report (both WUI
and CLI) loses other information
     Reporter:  zooko          |       Owner:  francois                                     
         Type:  defect         |      Status:  new                                          
     Priority:  critical       |   Milestone:  1.8.2                                        
    Component:  code-dirnodes  |     Version:  1.4.1                                        
   Resolution:                 |    Keywords:  usability error tahoe-check wui verify repair
Launchpad Bug:                 |  
Changes (by warner):

  * keywords:  usability error tahoe-check wui verify repair review-needed
               => usability error tahoe-check wui verify
  * owner:  davidsarah => francois
  * status:  assigned => new


 Good patch! I like the approach of making filenode.check_and_repair()
 signal inability to repair by returning
 {{{CheckAndRepairResults.repair_successful}}}=False instead of by
 throwing an exception. A few things I'd like to see changed:

 * we usually repair files that are unhealthy but recoverable. If repair
   fails, the file should still be recoverable. The post-repair-results
   are pessimistically being set to healthy=False recoverable=False
   needs_rebalancing=False, when it's probably (and sometimes certainly)
   more accurate to copy these values from the pre-repair-results. In
   particular, we shouldn't scare users into thinking that repair
   failures of "scratched" files (unhealthy but recoverable) indicate
   unrecoverable files: this makes benign things like
   {{{UnhappinessError}}} look like data loss. This should be fixed in
   both mutable and immutable files.

 * the newly-enabled test in {{{test_repairer.Repairer.test_harness}}}
   (which previously got a {{{self.shouldFail()}}}) should be slightly
   enhanced to check the return value of {{{check_and_repair()}}}. We
   should verify that it has {{{crr.repair_attempted=True}}},
   {{{crr.repair_successful=False}}}, and

 * we should add a similar test for mutable files that have had 8 shares
   deleted. There's something awfully close in
   {{{test_mutable.Repair.test_unrepairable_1share}}} .. it should be
   changed to use {{{self._fn.check_and_repair()}}} instead of
   {{{self._fn.repair()}}} . To be honest, I'm not sure why that test was
   passing before, because from what I can tell it should have been
   behaving the same way as immutable repair on an unrecoverable file.
  * it's probably worth checking the code coverage when we exercise
    {{{test_mutable}}} and make sure the new code is getting run

 * do we have any tests that confirm deep-repair on a tree with an
   unrecoverable file (or directory) makes it through to the end without
   an errback? We probably do but I'd like to be sure.. probably
   something in {{{test_deepcheck}}} exercises this.

 * I see {{{test_deepcheck.py:DeepCheckWebBad.do_deepcheck_broken()}}}
   asserts that an unrecoverable dirnode causes the traversal to halt. Is
   this what we want? Is this ticket about making sure an unrecoverable
   *file* doesn't halt a deep-repair, or about an unrecoverable
   *dirnode*? (broken dirnodes are more significant than files, because
   it means you've probably lost access to even more data). We certainly
   want the deep-traversal to keep going and repair more things, but we
   also need to make sure the user learns about the dead dirnode.

 Otherwise, looks great! With those few changes we can land this one for

Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/755#comment:22>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

More information about the tahoe-dev mailing list