Python - How to interpret the result of format(n, 'c') beyond ASCII?
Consider the following example:
format(97, 'c')
format(6211, 'c')
The first outputs 'a', which is correct; however, the second outputs 'C', and I don't understand why.
The string format specification states that:
'c': Character. Converts the integer to the corresponding unicode character before printing.
So shouldn't 6211 be mapped to the Unicode character 我 in Chinese?
Related sysinfo: CPython 2.7.10, on Fedora 22.
You are seeing issue 7267 - format method: c presentation type broken in 2.7.
The issue is that format(int, 'c') internally calls int.__format__('c'), which returns a str value (bytes in Python 2.x), and hence is limited to the range [0, 256). A value of 256 therefore wraps around to 0. Example -
>>> format(256,'c')
'\x00'
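To make the wrap-around concrete, here is a minimal sketch. It assumes the Python 2.7 str result simply keeps the low byte of the integer (i.e. the value modulo 256), which matches the format(256, 'c') result shown here. The helper name low_byte_char is made up for illustration and is not part of any library.

```python
def low_byte_char(n):
    """Hypothetical helper: mimic the assumed truncation to a single byte."""
    # Assumption: only the low 8 bits of the integer survive.
    return chr(n % 256)

print(repr(low_byte_char(97)))    # 'a'    (97 fits in one byte)
print(repr(low_byte_char(256)))   # '\x00' (256 % 256 == 0)
print(repr(low_byte_char(6211)))  # 'C'    (6211 % 256 == 67)
```

Under that assumption, 6211 ends up as 67, which is the ASCII code for 'C'.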
According to the issue, the fix is to use Python 3, where strings are Unicode, so the issue is not there in Python 3.x.
The only workaround I can think of is to use unichr() instead -
>>> unichr(0x6211)
u'\u6211'
>>> print(unichr(0x6211))
我
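If the same code has to run on both Python 2 and Python 3, one possible approach is a small compatibility shim. This is only a sketch: code_point_to_char is a hypothetical name, and it relies on unichr() existing only in Python 2 while chr() already returns a Unicode string in Python 3.

```python
try:
    unichr            # Python 2: the builtin exists
except NameError:
    unichr = chr      # Python 3: chr() handles the full Unicode range

def code_point_to_char(n):
    """Hypothetical helper: return the Unicode character for code point n."""
    return unichr(n)

ch = code_point_to_char(0x6211)
print(ch == u'\u6211')  # True
```

The character is compared against an escape sequence rather than printed, so the example does not depend on the terminal encoding.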
Though please note, 6211 is a decimal integer, not the Unicode code point you are looking for; it maps to 0x1843. What you are looking for is 0x6211, the hexadecimal value, which maps to 我, i.e. format(0x6211, 'c') in Python 3.x.
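A quick way to check the decimal versus hexadecimal mix-up, as a sketch for Python 3 only (where the 'c' presentation type returns a Unicode string):

```python
# Python 3 only.
print(hex(6211))        # 0x1843 : decimal 6211 is code point U+1843
print(int('6211', 16))  # 25105  : U+6211 written as a decimal integer

# Decimal 6211 yields a different character (U+1843), not U+6211.
print(format(6211, 'c') == '\u1843')     # True
# The hexadecimal literal (or its decimal equivalent) yields U+6211.
print(format(0x6211, 'c') == '\u6211')   # True
print(format(25105, 'c') == '\u6211')    # True
```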