[tahoe-dev] Recommendations for minimal RAM usage ?

Mon Mar 5 19:24:08 UTC 2012

On Mon, Mar 5, 2012 at 11:44 AM, Brian Warner <warner at lothar.com> wrote:
>
> The storage-server side of Tahoe doesn't really do that much, so I'd be
> pleased as punch if it could fit into 10 or 20MB. To do that, I think
> we'd need to avoid loading all of the code when we're not running a
> client (I'm thinking zfec, pycryptopp, twisted.web, conch, ftp/sftp).
> Switching away from foolscap (to HTTP) might help too (or might hurt, if
> the crypto needed to replace it is actually bigger).

Wouldn't the code (.py's, .pyc's, or .so's) tend to be swapped out if
it is unused? Or maybe unused code gets interspersed with code that is
needed, so it ends up bloating the memory usage?

Sounds like the sort of thing that needs to be measured. There was a
very interesting tool presented at PyCon last year by Dave Malcolm:

http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-dude-where-s-my-ram-a-deep-dive-into-how-python-uses-memory-4896725

http://dmalcolm.livejournal.com/5782.html

The thing that was interesting about this tool is that it was
"bottom-up" -- it starts by observing, from the operating system's
perspective, what memory your process is using and then starts
classifying it and breaking it down to help you understand why the
process is using that memory. This is in contrast to tools like Meliae
(https://launchpad.net/meliae) which are "top-down" -- they start by
observing, from the Python interpreter's perspective, what objects
your Python code is using and then start estimating the memory usage
of objects and aggregating objects into groups to help you understand
why the Python code is using that much memory.
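
To make the "top-down" idea concrete, here is a rough sketch that uses
only the Python standard library. It is not Meliae's API, and the
function names are made up for illustration. It walks the objects the
interpreter knows about and totals their estimated sizes by type:

```python
import gc
import sys
from collections import defaultdict


def summarize_by_type(objects):
    """Group objects by type name.

    Returns (type_name, count, total_bytes) rows, largest total first.
    """
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for obj in objects:
        name = type(obj).__name__
        counts[name] += 1
        # sys.getsizeof() is a shallow estimate: it does not follow
        # references, so a list's size excludes the items it holds.
        sizes[name] += sys.getsizeof(obj, 0)
    rows = [(name, counts[name], sizes[name]) for name in counts]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


def top_types(limit=10):
    """Summarize the objects tracked by the garbage collector."""
    # gc.get_objects() only returns objects tracked by the cycle
    # collector, so plain ints and strings are not counted here.
    return summarize_by_type(gc.get_objects())[:limit]
```

Note that nothing in this approach can see memory that is not a Python
object (mapped .so files, allocator overhead, and so on), which is
exactly the gap a bottom-up tool fills.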

A bottom-up tool would probably be better to answer the question of
how much of the memory usage is due to code. I would probably want to
start with a bottom-up tool, because I've used those top-down tools
(oriented around Python objects) before, and the results are usually
just a vast forest of Python objects that gives me no clear idea of
where the problem or problems lie.

I wouldn't be surprised if you needed to use both styles of tool
together in order to really get a clear picture.

I also wouldn't be surprised if you had to port each of these tools to
ARM before you could use them. On the other hand, if someone used them
on a (32-bit) x86 server, the results and the resulting optimizations
might apply to the ARM device.

Oh, I guess it would be somewhat straightforward to test Brian's
hypothesis about unused code by hacking a server to not import any
modules it doesn't need and then running it and measuring the
resulting (resident) memory usage.
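
A cheaper first approximation, short of hacking the server itself,
might be to import the suspect modules one at a time in a fresh
interpreter and watch resident memory grow. The sketch below assumes
Linux (it reads VmRSS from /proc/self/status); the function names are
hypothetical, and the module names you pass in are up to you:

```python
import importlib


def parse_vmrss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def current_rss_kb():
    """Resident set size of this process in kB (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_vmrss_kb(f.read())


def rss_cost_of_imports(module_names):
    """Import each module in order; return (name, rss_delta_kb) pairs.

    The delta is None if the module could not be imported.
    """
    results = []
    for name in module_names:
        before = current_rss_kb()
        try:
            importlib.import_module(name)
        except ImportError:
            results.append((name, None))
            continue
        results.append((name, current_rss_kb() - before))
    return results
```

For example, rss_cost_of_imports(["zfec", "pycryptopp", "twisted.web"])
would give one delta per module. The numbers depend on import order,
since a module that was already pulled in by an earlier one costs
nothing the second time, so this is only a rough guide to what a
trimmed-down server would save.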

Here's another "bottom-up" tool that tries to figure out how much of
your process's resident memory usage is due to sharing libraries with
other processes:

http://www.selenic.com/smem/
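
For a feel of the kind of accounting involved, here is a minimal
sketch that totals the per-mapping counters the Linux kernel exposes
in /proc/<pid>/smaps. This is not smem's code, just an illustration
with a made-up function name. Pss is the "proportional" figure, where
each shared page is divided among the processes that map it:

```python
FIELDS = (
    "Rss",
    "Pss",
    "Shared_Clean",
    "Shared_Dirty",
    "Private_Clean",
    "Private_Dirty",
)


def smaps_totals(smaps_text):
    """Sum selected counters (in kB) over every mapping in smaps text."""
    totals = dict.fromkeys(FIELDS, 0)
    for line in smaps_text.splitlines():
        key, sep, rest = line.partition(":")
        # Mapping header lines and other fields are skipped because
        # their text before the first colon is not one of FIELDS.
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    return totals
```

Feeding it the contents of /proc/self/smaps (or another process's
smaps file, given permission) shows how much of Rss is shared versus
private; Private_Clean plus Private_Dirty is the memory that would
actually be freed if the process exited.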

Just to be explicit about this, I'm not going to start doing these
experiments myself anytime soon. I would love for Tahoe-LAFS to get
slimmer on memory usage, and to continue to fit into more and more
cute little ARM devices. (I just ordered my Raspberry Pi!) But, I have
higher priorities in Tahoe-LAFS development right now. I'll offer
encouragement, advice, feedback, and patch review.

Regards,

Zooko