[tahoe-dev] Dev Chat summary

Brian Warner warner at lothar.com
Wed Aug 7 16:22:47 UTC 2013


Tahoe-LAFS Weekly Dev Chat, 07-Aug-2013

In attendance: Brian, Daira, Zooko, Mark, Nathan, Oleksandr(?)

We started by investigating the severe (2x) slowdown of the unit test
suite on the 1819-cloud-merge branch (ticket #1870). We're pretty sure
it involves the new leasedb operations. Brian specifically suspects the
DB writes, since he's heard that SQLite tries to be honest about
durability and does an fsync() after every db.commit().
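
One rough way to check this hypothesis outside the test suite is to
compare a commit after every write against a single commit at the end,
on a file-backed database. The sketch below uses Python's standard
sqlite3 module. The "leases" table, its columns, and the
timed_inserts() helper are made up for illustration and are not the
real leasedb schema or code.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(path, rows, commit_each):
    """Insert `rows` rows into a hypothetical table and return elapsed seconds.

    With commit_each=True every INSERT is followed by its own commit();
    otherwise all rows are committed together at the end.
    """
    conn = sqlite3.connect(path)
    # Hypothetical schema, only for this measurement.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leases (share_id INTEGER, expiry INTEGER)"
    )
    conn.commit()
    start = time.perf_counter()
    for i in range(rows):
        conn.execute("INSERT INTO leases VALUES (?, ?)", (i, i + 1000))
        if commit_each:
            conn.commit()
    conn.commit()  # flushes anything still pending (no-op if nothing is)
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        per_row = timed_inserts(os.path.join(tmp, "a.sqlite"), 200, True)
        batched = timed_inserts(os.path.join(tmp, "b.sqlite"), 200, False)
        print("commit per row: %.3fs, single commit: %.3fs" % (per_row, batched))
```

The absolute numbers depend heavily on the disk and filesystem, so only
the ratio between the two runs is meaningful.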

Daira and Zooko have done some analysis to count DB ops and tempfile
generation. Brian noticed that the specific test in question
(test_cli.Cp.test_copy_using_filecap) is mistakenly copying the entire
base directory, *including the servers and their shares*, into the
virtual filesystem, creating 174 files instead of four (and making it
remarkable that it ever terminates, since it's copying shares that are
created by the process of copying the other shares). Fixing that reduces
the non-leasedb runtime from 24s to 1s. The buggy test snuck past code
review last November.

The larger performance problem still remains. Daira will add some
instrumentation to measure time spent in db.commit() specifically, to
see if the slowdown can be attributed to it. If so, we need to look at
the DB writes and see if there's any clean way to consolidate them
(reducing the number of writes to one-per-new-share). If that's
insufficient, we may need to tell SQLite to stop fsyncing (i.e. reduce
durability) during unit tests. Daira is hopeful we won't have to do
that.
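
As a minimal sketch of both ideas (not the instrumentation Daira will
actually write), the standard sqlite3 module lets a Connection subclass
wrap commit() to accumulate the time spent there, and SQLite's
"PRAGMA synchronous = OFF" tells it to hand writes to the OS without
syncing. The TimedConnection and open_db names, and the "durable"
flag, are assumptions made for this example.

```python
import sqlite3
import time


class TimedConnection(sqlite3.Connection):
    """sqlite3 connection that records wall-clock time spent in commit()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.commit_count = 0
        self.commit_seconds = 0.0

    def commit(self):
        start = time.perf_counter()
        try:
            super().commit()
        finally:
            # Count the call and its duration even if commit() raises.
            self.commit_count += 1
            self.commit_seconds += time.perf_counter() - start


def open_db(path, durable=True):
    """Open a database with commit timing (hypothetical helper).

    durable=False trades crash safety for speed, which is only
    reasonable for throwaway databases such as those in unit tests.
    """
    conn = sqlite3.connect(path, factory=TimedConnection)
    if not durable:
        # Issued before any transaction starts; SQLite will not change
        # this setting in the middle of a transaction.
        conn.execute("PRAGMA synchronous = OFF")
    return conn
```

After a test run, comparing conn.commit_seconds with the test's total
runtime would show how much of the slowdown commit() accounts for, and
conn.commit_count would show how many writes there are to consolidate.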

We also talked about moving the Buildbot config into a git repo where it
would be more accessible (Zooko has started on this), adding a buildbot
tool to warn us when a single test's runtime suddenly increases (to spot
things like the buggy test above), and reviewing other patches.
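
The runtime warning could be as simple as comparing per-test durations
between two builds. The sketch below is not a Buildbot API: the
find_slowdowns() function, its thresholds, and the idea that each build
yields a dict mapping test names to seconds are all assumptions for
illustration.

```python
def find_slowdowns(previous, current, factor=2.0, min_seconds=1.0):
    """Return (test, old_seconds, new_seconds) for tests that slowed down.

    `previous` and `current` map test names to runtimes in seconds.
    A test is reported when its new runtime is at least `min_seconds`
    and more than `factor` times its old runtime.
    """
    slow = []
    for name, new in sorted(current.items()):
        old = previous.get(name)
        if old is None:
            continue  # new test, no baseline to compare against
        if new >= min_seconds and new > old * factor:
            slow.append((name, old, new))
    return slow


if __name__ == "__main__":
    # Made-up durations, purely to show the call.
    before = {"test_cli.Cp.test_copy_using_filecap": 1.0, "test_other": 0.1}
    after = {"test_cli.Cp.test_copy_using_filecap": 24.0, "test_other": 0.15}
    for name, old, new in find_slowdowns(before, after):
        print("WARNING: %s went from %.1fs to %.1fs" % (name, old, new))
```

The min_seconds floor is there so that noise in very short tests does
not trigger warnings.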

Please join us in the Dev Chat next week!

cheers,
  -Brian


