[tahoe-dev] Unicode issues review

Shawn Willden shawn-tahoe at willden.org
Tue Feb 17 21:18:32 UTC 2009

On Tuesday 17 February 2009 11:52:10 am Shawn Willden wrote:
> See the attached screenshot.  This is from my machine.  The name would be
> meaningless to me even if I knew what the encoding was, because it's
> Korean. 

For fun I tried all of the Python-provided codecs listed for Korean on that 
file name, and two of the five could decode it.  I tried a bunch of other 
codecs more or less at random, and several of those also appeared to be able 
to decode it, but produced wrong values (strings that, when converted to UTF-8 
and then rendered, contained accented Latin characters, math symbols, etc.).
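
A minimal sketch of that kind of experiment, written for Python 3.  The 
sample bytes, the helper name try_codecs, and the codec list are all 
assumptions for illustration; the real file name from the screenshot is not 
reproduced here.

```python
def try_codecs(raw, codec_names):
    """Return {codec name: decoded text} for every codec that accepts raw."""
    results = {}
    for name in codec_names:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This codec rejects the byte sequence outright.
            pass
    return results


# Hypothetical stand-in for the mystery file name: "한글" encoded as EUC-KR.
sample = "한글".encode("euc_kr")

# Some Korean codecs plus a couple of unrelated ones (an assumed list).
candidates = ["cp949", "euc_kr", "iso2022_kr", "johab", "latin_1", "utf_8"]

decoded = try_codecs(sample, candidates)
for name, text in decoded.items():
    print(name, repr(text))
```

In this sketch latin_1 "succeeds" because it maps every possible byte to some 
character, which is why a successful decode by itself does not tell you the 
encoding was the right one.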

I found it very cool that after I converted the original filename to UTF-8, my 
terminal window shows the Korean characters when I run 'ls'.  I still have no 
idea what it says, of course, but it's pleasant that it works.  I can even 
paste a character or two into a command and then use tab completion.
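
One way such a conversion could be done, as a sketch only: it assumes the 
source encoding is already known and that the file system accepts arbitrary 
byte strings as names.  The helper names utf8_name and rename_to_utf8 are 
hypothetical, and this is not necessarily how the rename was actually done.

```python
import os


def utf8_name(raw_name, source_codec):
    """Re-encode a file name given as bytes from source_codec to UTF-8."""
    return raw_name.decode(source_codec).encode("utf-8")


def rename_to_utf8(directory, raw_name, source_codec):
    """Rename directory/raw_name to its UTF-8 form; all paths are bytes."""
    new_name = utf8_name(raw_name, source_codec)
    if new_name != raw_name:
        os.rename(os.path.join(directory, raw_name),
                  os.path.join(directory, new_name))
    return new_name
```

Working with bytes paths throughout keeps Python from guessing at the 
encoding of the old name while the rename is in progress.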


