[tahoe-dev] [tahoe-lafs] #327: performance measurement of directories

tahoe-lafs trac at allmydata.org
Sun Jul 5 22:39:40 UTC 2009


#327: performance measurement of directories
------------------------------------------+---------------------------------
 Reporter:  zooko                         |           Owner:  zooko     
     Type:  enhancement                   |          Status:  assigned  
 Priority:  major                         |       Milestone:  eventually
Component:  code-dirnodes                 |         Version:  0.8.0     
 Keywords:  test performance scalability  |   Launchpad_bug:            
------------------------------------------+---------------------------------

Comment(by kevan):

 (this is in response to the first iteration of the benchmarking script;
 the issue is addressed in r3965)

 Cool!

 I've run it on my machine, and noticed that it actually shows slower
 results for the optimized code. I think that's a matter of methodology,
 though.

 From what I understand (having never used benchutil), to test
 pack_contents you're building a list of (name, child) tuples, feeding that
 into a dict, and then feeding the dict to pack_contents, and you're
 testing how long that takes for increasing numbers of tuples. To test
 unpack_contents, you're doing that, but saving the result of pack_contents
 in a global variable, then feeding that to unpack_contents to see how long
 it takes.

 If I'm right, we aren't seeing any speed improvements because the
 benchmark isn't actually testing the optimizations. In order to do that,
 we need to feed pack_contents a dictionary that was actually output from
 unpack_contents (or else built with set_both_items instead of __setitem__)
 -- that way, the set_both_items method of the dict wrapper will have been
 used to set the serialized value, and pack_contents will find and use that
 value, thus (ideally) making it faster.
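
 A minimal sketch of why the input dict matters, under loud assumptions:
 CachingDict, slow_serialize, toy_pack_contents and toy_unpack_contents are
 hypothetical stand-ins, not Tahoe's actual code, and the argument order of
 set_both_items here is a guess. Only the idea (a dict wrapper whose
 set_both_items stores a serialized value that packing can reuse) comes
 from the description above.

```python
class CachingDict(dict):
    """Toy dict wrapper that can remember an entry's serialized form."""

    def __init__(self):
        super().__init__()
        self._serialized = {}  # name -> cached serialized entry

    def __setitem__(self, name, value):
        # A plain assignment has no serialized form, so drop any stale one.
        self._serialized.pop(name, None)
        super().__setitem__(name, value)

    def set_both_items(self, name, serialized, value):
        # Store the value and its already-serialized form together.
        super().__setitem__(name, value)
        self._serialized[name] = serialized

    def get_serialized(self, name):
        return self._serialized.get(name)


slow_calls = {"count": 0}  # counts how often the slow path runs


def slow_serialize(name, value):
    # Placeholder for the expensive per-child work done when packing.
    slow_calls["count"] += 1
    return "%s=%s" % (name, value)


def toy_pack_contents(children):
    entries = []
    for name in sorted(children):
        entry = children.get_serialized(name)
        if entry is None:
            # No cached form: this is the path a __setitem__-built dict takes.
            entry = slow_serialize(name, children[name])
        entries.append(entry)
    return "\n".join(entries)


def toy_unpack_contents(packed):
    children = CachingDict()
    for entry in packed.split("\n"):
        if not entry:
            continue
        name, value = entry.split("=", 1)
        # Keep the serialized entry so a later pack can reuse it.
        children.set_both_items(name, entry, value)
    return children
```

 In this toy, packing a dict filled through __setitem__ runs slow_serialize
 once per child, while packing the output of toy_unpack_contents never runs
 it, which is the difference the benchmark would need to exercise.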

 One way to do this might be to move the child-creation logic into a more
 general setup method: one that, when called, generates the list of
 children, packs it, and returns the result of pack_contents. The init
 method for the unpack test could then store that value in the global
 packstr variable and work as it does now. The init method for the pack
 test could unpack that value and feed the resulting dictionary into
 pack_contents again, at which point the optimizations would start working.
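
 The restructuring suggested above could look roughly like the following.
 This is a sketch only: it uses the standard-library timeit module rather
 than benchutil, takes the pack and unpack functions as parameters instead
 of calling the real pack_contents and unpack_contents, and every name in
 it (make_children, setup_packed, init_unpack_test, init_pack_test, bench)
 is hypothetical.

```python
import timeit


def make_children(n):
    # Build n (name, child) tuples; the child values are placeholders.
    return [("child%d" % i, "value%d" % i) for i in range(n)]


def setup_packed(n, pack):
    # Shared setup: generate the children, pack them, return the result.
    return pack(dict(make_children(n)))


def init_unpack_test(n, pack):
    # The unpack test only needs the packed form.
    return setup_packed(n, pack)


def init_pack_test(n, pack, unpack):
    # The pack test needs a dict that came out of unpack, so that any
    # serialized values cached during unpacking are available to pack.
    return unpack(setup_packed(n, pack))


def bench(n, pack, unpack, repeats=5):
    packed = init_unpack_test(n, pack)
    children = init_pack_test(n, pack, unpack)
    # Take the best of several single runs to reduce timing noise.
    t_unpack = min(timeit.repeat(lambda: unpack(packed),
                                 number=1, repeat=repeats))
    t_pack = min(timeit.repeat(lambda: pack(children),
                               number=1, repeat=repeats))
    return t_pack, t_unpack
```

 Calling bench for increasing n, with pack and unpack bound to the code
 under test, would then time pack on a dictionary that actually carries
 the cached values.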

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/327#comment:10>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

