[tahoe-dev] packaging for Fedora and Debian blocked on C++ issues (was: plans for tahoe-1.3.0 release)

Mon Feb 9 19:46:47 UTC 2009

Hi Andrej Falout:

On Feb 7, 2009, at 18:28 PM, Andrej Falout wrote:

>    * pycryptopp improvements — link against system libcryptopp.so  
> for Debian and Fedora packagers, add new improved ECDSA, build out  
> more buildbots, etc.
>
> I'm assuming "Fedora packages" refers to .rpm? Where can they be  
> found?

They don't yet exist -- Ruben Kerkhof of the Fedora project kindly  
offered to create them for us (and Micah Anderson of the Debian  
project kindly offered to build .deb's), but there is a problem  
linking the standard python interpreter to the standard libcryptopp.so.

Here are details:

http://allmydata.org/trac/pycryptopp/ticket/9

I can think of a few ways to fix this:

1.  Modify the python interpreter to pass the RTLD_GLOBAL flag to  
dlopen().

This probably won't happen, because the upstream Python maintainers  
have already rejected this, and although I don't understand their  
reasons, I assume that they are good reasons, and that therefore the  
Fedora Python maintainers will not patch the Fedora version of Python  
to do this.
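
For comparison, a process can request the same dlopen() behaviour for itself without a patched interpreter, by changing the flags Python uses when it loads extension modules. The following is only a minimal sketch, assuming a Unix platform and a Python new enough to have os.RTLD_GLOBAL; the helper name global_dlopen_flags is made up for illustration, and I have not verified that this makes the pycryptopp failure go away.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_dlopen_flags():
    """Temporarily add RTLD_GLOBAL to the flags used to load extension modules.

    Hypothetical helper, Unix only: sys.getdlopenflags() and
    sys.setdlopenflags() are not available on Windows.
    """
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so later imports are unaffected.
        sys.setdlopenflags(old_flags)


# Usage sketch (the import is commented out so the block runs anywhere):
# with global_dlopen_flags():
#     import pycryptopp
```

This only affects extension modules imported inside the "with" block, and every program that uses the module would have to do it, which is part of why it is a workaround rather than a fix.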

2.  Use the gcc exported-symbol map file (with GNU tools this is a  
linker version script, passed through gcc to ld) to specify that the  
relevant symbols are exported *only* from libcryptopp.so, not also  
from the pycryptopp code that uses those symbols.

This is the cleanest solution, if it works.  I've never used the gcc  
exported-symbols map file myself.
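
As a rough illustration of what option 2 might involve, here is a sketch that generates a GNU ld version script which exports one symbol and hides everything else, plus the extra link argument that would hand it to the linker. The function names and the symbol name init_pycryptopp are assumptions made for the example; whether hiding the symbols this way actually resolves the problem is untested.

```python
def version_script(exported_symbols):
    """Return GNU ld version-script text exporting only the given symbols.

    Everything not listed under 'global' is made local to the shared
    object, so it is not exported a second time from the extension module.
    """
    lines = ["{", "  global:"]
    lines += ["    %s;" % name for name in exported_symbols]
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines) + "\n"


def link_args(script_path):
    """Return the gcc argument that passes the version script to GNU ld."""
    return ["-Wl,--version-script=%s" % script_path]


# 'init_pycryptopp' is an assumed name for the module init function.
script = version_script(["init_pycryptopp"])
print(script)
print(link_args("pycryptopp.map"))
```

The generated text would be written to a file, and the list from link_args() would be added to the extension's extra link arguments in setup.py.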

3.  Remove the features of pycryptopp that rely on comparing symbols  
that were defined in Crypto++ and used in pycryptopp.

The features in question are:

feature 1: defining hexdigest() by attaching an instance of  
HexEncoder to an instance of ArraySink:

http://allmydata.org/trac/pycryptopp/browser/pycryptopp/hash/sha256module.cpp?rev=576#L69

This uses some runtime-typing features of Crypto++ to decide exactly  
how these two things ought to be attached, and that runtime-typing  
depends on comparing symbols compiled in libcryptopp.so with symbols  
compiled in pycryptopp.so.

Note, you can currently avoid this runtime failure by not  
invoking .hexdigest().  For example, you can invoke .digest() instead  
and then hex-encode the result yourself.
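
That workaround can be sketched as follows. The helper name hexdigest_via_digest is made up for illustration, and hashlib.sha256 is used only as a stand-in so that the example runs without pycryptopp installed; the assumption is that the pycryptopp hash object offers a compatible .digest() method.

```python
import binascii
import hashlib


def hexdigest_via_digest(hasher):
    """Hex-encode the raw digest in Python instead of calling .hexdigest().

    'hasher' is any object with a .digest() method that returns bytes.
    """
    return binascii.hexlify(hasher.digest()).decode("ascii")


# hashlib stands in for the pycryptopp hash object here.
h = hashlib.sha256(b"hello")
print(hexdigest_via_digest(h))
```

The hex encoding then happens entirely on the Python side, so the HexEncoder and ArraySink attachment is never exercised.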

So we could either remove .hexdigest() entirely or rewrite it to not  
depend on the automatic detection of how to attach a HexEncoder to an  
ArraySink.  I'm a little vague on exactly how that latter feature  
works and why it compares symbols, but it shouldn't be too hard to  
just "hard-wire" it so that it doesn't need to compare symbols.

feature 2: raising exceptions from Crypto++ and catching them in  
pycryptopp.

This is a more convenient and safer way to handle problems such as  
invalid inputs (e.g. a key of a size that the algorithm doesn't  
accept) than the alternative approach of having pycryptopp  
pre-validate the inputs so that it is sure Crypto++ won't raise an  
exception on those inputs.  However, it requires comparing symbols.  
So one way to stop relying on this is just to add pre-checking to  
pycryptopp to validate those inputs.
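
For the simple cases such pre-checking is easy. Here is a minimal sketch for key sizes, assuming AES with its standard 16, 24 or 32 byte keys; the helper name check_aes_key is hypothetical and is not existing pycryptopp code.

```python
AES_KEY_SIZES = (16, 24, 32)  # AES-128, AES-192 and AES-256, in bytes


def check_aes_key(key):
    """Reject a bad key before it is ever handed to Crypto++.

    Hypothetical pre-check: raising here means the wrapper does not rely
    on catching a C++ exception thrown across the shared-library boundary.
    """
    if not isinstance(key, bytes):
        raise TypeError("key must be bytes")
    if len(key) not in AES_KEY_SIZES:
        raise ValueError(
            "key must be 16, 24 or 32 bytes long, not %d" % len(key))
    return key


check_aes_key(b"\x00" * 32)  # accepted
```

A length check like this covers only the trivial cases.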

Hm...  On the other hand, currently pycryptopp has extensive unit  
tests which assert that arbitrary corruption of ECDSA keys will be  
handled gracefully at runtime (if you accidentally pass a sample of  
/dev/urandom in place of an ECDSA key, I mean).  I would have to delve  
into some elliptic curve math to replicate all the Crypto++ validity  
checks in pycryptopp.  This strategy seems like a bad idea.

Okay, so if anyone out there wants to contribute a fix to this, that  
would be great!  Replicating the problem is very simple:  
"python ./setup.py build --disable-embedded-cryptopp && python ./setup.py test".

Regards,

Zooko