Sunday, December 7, 2008

See how Python encodes a string internally

Fun fact: you can see how Python stores a string internally by encoding a unicode object with the 'unicode_internal' codec.

Example:


>>> u'\N{SNOWMAN}'.encode('utf16')
'\xfe\xff&\x03'
>>> u'\N{SNOWMAN}'.encode('unicode_internal')
'&\x03'
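
As you can see, on this build it's UTF-16 without a BOM, in the machine's native byte order (big-endian here, as the '\xfe\xff' BOM in the first result shows). A wide (UCS-4) build of Python would give four bytes per character instead.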
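
The 'unicode_internal' codec only exists in older Pythons (it was deprecated in Python 3.3 and removed in 3.8), so the comparison cannot be rerun as-is on a current interpreter. As a rough sketch, assuming Python 3, the same "UTF-16 with and without a BOM" relationship can be shown with the standard 'utf-16' and 'utf-16-le'/'utf-16-be' codecs. The variable names here are illustrative only, and this shows the encodings, not how Python 3 actually stores strings in memory.

```python
import sys

snowman = "\N{SNOWMAN}"  # U+2603

# The plain 'utf-16' codec writes a BOM first, then the code units
# in the machine's native byte order.
with_bom = snowman.encode("utf-16")

# Choose the BOM-less codec that matches the native byte order.
native_codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
without_bom = snowman.encode(native_codec)

print(repr(with_bom))     # BOM followed by the two bytes of U+2603
print(repr(without_bom))  # just the two bytes of U+2603

# Dropping the 2-byte BOM leaves exactly the BOM-less encoding.
assert with_bom[2:] == without_bom
```

On a little-endian machine the two bytes come out in the opposite order from the transcript above ('\x03&' rather than '&\x03'), because both results follow the native byte order.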
