[tahoe-dev] #534: "tahoe cp" command encoding issue

Shawn Willden shawn-tahoe at willden.org
Sun Mar 1 05:08:31 UTC 2009


On Friday 27 February 2009 10:45:24 am Brian Warner wrote:
> One limitation to keep in mind is that JSON cannot represent arbitrary
> binary data without application-visible encoding, and that both the
> webapi GET $dircap?t=json and the dirnode-format metadata dict use
> JSON. So any "store the original bytes and let the reader sort it out"
> approach must e.g. base32-encode those bytes on the way in and base32-
> decode them on the way out, in the CLI tool on the user side of the
> HTTP connection.

You don't need to use base 32.  simplejson can output arbitrary Unicode
strings; it just emits non-ASCII characters in \uXXXX format.
This is more convenient and often more compact than base 32, since ASCII
characters pass through unchanged.

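In order for that to work, though, you first have to get the bytes into
Unicode.  The latin1 codec can decode any byte string to Unicode, because it
maps each byte value to the code point with the same number, and encoding
with latin1 on the way out recovers the original bytes.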
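
As a rough sketch of that round trip (not Tahoe code): the example below uses
the standard-library json module in Python 3, which is assumed here to behave
like simplejson for this purpose. The helper names bytes_to_json and
json_to_bytes and the "name" key are made up for illustration.

```python
import json


def bytes_to_json(raw: bytes) -> str:
    """Wrap arbitrary bytes in a JSON document via a latin1 decode."""
    # latin1 maps byte 0xNN to code point U+00NN, so this never fails.
    text = raw.decode("latin-1")
    # The default ensure_ascii=True writes non-ASCII characters as \uXXXX.
    return json.dumps({"name": text})


def json_to_bytes(doc: str) -> bytes:
    """Recover the original bytes by encoding back to latin1."""
    text = json.loads(doc)["name"]
    return text.encode("latin-1")


# A filename containing bytes that are not valid UTF-8.
raw = b"caf\xe9 \xff\x00.txt"
doc = bytes_to_json(raw)
print(doc)  # {"name": "caf\u00e9 \u00ff\u0000.txt"}
assert json_to_bytes(doc) == raw
```

The reader has to know that the string is latin1-wrapped bytes rather than
real text, so both sides would need to agree on that convention.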

	Shawn.


