[tahoe-dev] [tahoe-lafs] #694: remove hard limit on mutable file size (was: tahoe put returns "error, got 413 request entity too large")

tahoe-lafs trac at allmydata.org
Mon May 4 17:15:05 UTC 2009


#694: remove hard limit on mutable file size
------------------------+---------------------------------------------------
 Reporter:  sigmonsays  |           Owner:  nobody   
     Type:  defect      |          Status:  new      
 Priority:  major       |       Milestone:  undecided
Component:  unknown     |         Version:  1.4.1    
 Keywords:              |   Launchpad_bug:           
------------------------+---------------------------------------------------

Comment(by zooko):

 Thank you for the bug report, sigmonsays.  There is a hardcoded limit on
 the maximum size of mutable files, and directories are stored inside
 mutable files.  Adding this link would grow the directory into which you
 are linking your new file beyond that limit.

 I believe the first thing to do is to remove the hardcoded limit, and I'm
 accordingly changing the title of this ticket to "remove hard limit on
 mutable file size".  The line of code in question is
 [source:src/allmydata/mutable/publish.py at 20090222233056-4233b-
 171d02bfd1df45fff4af7a4f64863755379e855a#L145 publish.py line 145].
 Someone go fix it!  Just remove the {{{MAX_SEGMENT_SIZE}}} hardcoded
 parameter and both places where it is used.
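
 For anyone picking this up, here is a rough sketch of the kind of check
 that would go away.  This is illustrative only; apart from
 {{{MAX_SEGMENT_SIZE}}} the names are not a verbatim excerpt from
 publish.py:

 {{{
 #!python
 # Illustrative sketch, not the actual publish.py code.
 MAX_SEGMENT_SIZE = 3500000   # the hardcoded cap this ticket asks to remove

 class FileTooLargeError(Exception):
     pass

 def publish(newdata):
     # This guard, and the MAX_SEGMENT_SIZE constant it relies on, is the
     # kind of code to delete:
     if len(newdata) > MAX_SEGMENT_SIZE:
         raise FileTooLargeError("mutable files are limited to %d bytes"
                                 % MAX_SEGMENT_SIZE)
     # ... encode newdata as a single SDMF segment and upload it ...
 }}}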

 There is already a unit test in
 [source:src/allmydata/test/test_mutable.py at 20090218222301-4233b-
 49132283585996c7cee159d1ce2a9133bdd00aa7#L359 test_mutable.py line 359]
 that makes sure that Tahoe raises a failure when you try to create a
 mutable file that is bigger than 3,500,000 bytes.  Change that test to
 make sure that Tahoe ''doesn't'' raise a failure and that the file is
 created instead.
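
 A hedged sketch of what that test change could look like follows; the
 method and attribute names here (e.g. {{{create_mutable_file}}}) are
 illustrative stand-ins rather than the exact code in test_mutable.py:

 {{{
 #!python
 # Illustrative Twisted-trial-style sketch, not the real test_mutable.py.
 def test_create_mutable_file_bigger_than_old_limit(self):
     BIG = "a" * 4000000   # bigger than the old 3,500,000-byte cap
     d = self.client.create_mutable_file(BIG)
     # Instead of expecting a failure such as FileTooLargeError, expect
     # success and check that the contents round-trip.
     d.addCallback(lambda node: node.download_best_version())
     d.addCallback(lambda contents: self.failUnlessEqual(contents, BIG))
     return d
 }}}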

 After that, however, you might start to learn why we put that limit in --
 it is because modifying a mutable file requires downloading and re-
 uploading the entirety of that mutable file, and storing the entirety of
 it in RAM while changing it.  So the more links you keep in that directory
 of yours, the slower it is going to be to read the directory or to change
 it, and the more RAM will be used.
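
 To make that cost concrete, here is a toy illustration of the
 read-modify-write cycle, using json as a stand-in for the real directory
 serialization (this is not the actual dirnode code):

 {{{
 #!python
 import json

 # Toy model of an SDMF directory update: every change is a full
 # read-modify-write, so time and RAM grow with the size of the whole
 # directory, not with the size of the change.
 def add_link(serialized_dir, name, cap):
     entries = json.loads(serialized_dir)  # fetch and hold the whole dir in RAM
     entries[name] = cap                   # the only part that actually changed
     return json.dumps(entries)            # re-serialize and re-upload everything
 }}}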

 Ultimately we need to implement efficient modification of mutable files
 without downloading and re-uploading the whole file -- that is the subject
 of #393 (mutable: implement MDMF).

 In the meantime, there are also some tickets about optimizing the CPU
 usage when processing large directories.  Fixing those would not remove
 the need to download and re-upload the entire directory, but they might
 also be important: #327 (performance measurement
 of directories), #329
 (dirnodes could cache encrypted/serialized entries for speed), #383 (large
 directories take a long time to modify), #414 (profiling on directory
 unpacking).

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/694#comment:2>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

