Thursday, 15 August 2013

Changing unicode encodings in Python 2.7

I am writing a simple module that needs to handle German letters, e.g. ä
(code 132 in code page 437, often loosely called "extended ASCII"). I have
read most of the advice given on this site and others on how to handle
Unicode and encodings in Python 2.x. However, things do not work out for
me. Example:
>>> import sys,unicodedata
>>> x='a'
>>> u=unicode(x,'utf-8')
>>> unicodedata.category(u)
'Ll'
>>> y=u.encode('latin-1') # to turn the string into bytes
>>> y=y.decode('utf-8') # to turn the bytes back into a string, but encoded as utf-8
>>> unicodedata.category(y)
'Ll'
What am I doing wrong? Why can't I change the encoding to utf8?
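
A minimal sketch of what may be going on here, assuming the goal is to see a difference between the latin-1 and utf-8 forms. The names `text`, `latin1_bytes`, `utf8_bytes` and `mismatch_failed` are illustrative and not part of the session above. The letter 'a' is plain ASCII, so its bytes are identical in both codecs and the round trip cannot show any change. With ä the byte strings differ, while the decoded unicode object (and therefore its category) stays the same, because the encoding is a property of the bytes and not of the unicode object.

```python
import unicodedata

# The letter ä as a unicode object: a single code point, U+00E4.
text = u'\xe4'

# Encoding produces bytes, and the bytes depend on the codec.
latin1_bytes = text.encode('latin-1')  # one byte, 0xE4
utf8_bytes = text.encode('utf-8')      # two bytes, 0xC3 0xA4

# For a pure-ASCII letter such as 'a', both codecs give the same byte,
# which is why the session in the question shows no difference.
ascii_same = u'a'.encode('latin-1') == u'a'.encode('utf-8')

# Decoding each byte string with its own codec returns the same text.
round_trip_equal = latin1_bytes.decode('latin-1') == utf8_bytes.decode('utf-8')

# unicodedata works on code points, so the category never depends on
# which codec the bytes once used.
category = unicodedata.category(text)

# Decoding latin-1 bytes as utf-8 is a mismatch and fails for ä.
try:
    latin1_bytes.decode('utf-8')
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True
```

In other words, there is nothing to "change" on the decoded value; the codec only matters at the moment bytes are read or written.
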
By the way, I had copied a file called sitecustomize.py, which was supposed
to set the default encoding to utf-8. sys.getdefaultencoding() does in fact
show utf-8 as the default encoding, believe it or not.
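
As a hedged illustration of why the default encoding probably does not matter for the session above: explicit `encode` and `decode` calls name their codec, so their result should not depend on what `sys.getdefaultencoding()` reports. A stock Python 2 reports 'ascii' and Python 3 reports 'utf-8'. The variable names below are illustrative only.

```python
import sys

# Whatever this reports, it is not consulted by the explicit calls below.
default_codec = sys.getdefaultencoding()

text = u'\xe4'  # the letter ä

# Explicit conversions always use the codec that is named in the call.
explicit_utf8 = text.encode('utf-8')
explicit_latin1 = text.encode('latin-1')

# Decoding with the matching codec recovers the original text.
restored = explicit_utf8.decode('utf-8')
```

Naming the codec at every conversion keeps the code independent of any sitecustomize.py setting.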
