[tahoe-dev] state of testgrid

Greg Troxel gdt at ir.bbn.com
Fri Jul 30 00:18:10 UTC 2010

The testgrid does not seem to be in good shape.  I realize it's a
volunteer/test grid, so I don't mean to complain, but it presents an
interesting set of error cases and marginal cases.  Things that would be
nice:

  some way to test whether servers will actually accept and return
  shares, plus some status on the welcome web page showing how this has
  gone (perhaps just store errors).

  some clarity on the NAT situation.  I'm not sure what to ask for, but
  I notice that a client behind a NAT finds 8 servers while a server
  with a global address finds 12 servers.  So it seems there are 4
  testgrid servers behind NATs and not configured for NAT passthrough.
  Maybe this is just how it is, and it's all fine.

  My real issue may be that, of the 12 servers, perhaps only 6 will
  accept shares.  tahoe check --repair seems to keep incrementing the
  version number instead of trying to replace shares of the highest
  version (which is reconstructable).  It takes 8-12 minutes to run,
  which suggests it is still trying to contact servers that should
  already have been declared non-responsive.

This is making me want to stop playing with the testgrid and just find
resources to run a private grid that will be more reliable.  But I think
it would be good for the project if participating in and using the
testgrid were a better experience, so I thought I'd explain my trouble.
