linux - Python 2.7 Unicode confusion again
I've read this:
Setting the correct encoding when piping stdout in Python
and I'm trying to stick to the rule of thumb: "Always use Unicode internally. Decode what you receive, and encode what you send."
So here's my main file:
    # coding: utf-8
    import os
    import sys
    from myplugin import myplugin

    if __name__ == '__main__':
        c = myplugin()
        a = unicode(open('myfile.txt').read().decode('utf8'))
        print(c.generate(a).encode('utf8'))
What's getting on my nerves is that:
- I read in a UTF-8 file, so I decode it.
- Then I force-convert it to unicode, which gives unicode(open('myfile.txt').read().decode('utf8'))
- Then I try to output it to a terminal.
- On my Linux shell I need to re-encode it to UTF-8, and I guess this is normal: I'm working on a unicode string all the time, so to output it I have to re-encode it as UTF-8 (correct me if I'm wrong here).
- When I run it in PyCharm under Windows, it's UTF-8 encoded twice, which gives me things like
agréable, déjÃ
. If I remove encode('utf8') (which changes the last line to print(c.generate(a))), then it works in PyCharm but no longer works on Linux, where I get: 'ascii' codec can't encode character u'\xe9' in position blabla... you know the problem.
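The garbled PyCharm output is consistent with UTF-8 bytes being displayed as windows-1252. Here is a minimal sketch of that effect, assuming the console decodes whatever bytes it receives as cp1252 (the sample word and the variable names are only for illustration):

```python
# -*- coding: utf-8 -*-
# Sketch: UTF-8 bytes interpreted as windows-1252 (cp1252) produce mojibake.
# Assumption: the console decodes the bytes it receives as cp1252.
text = u'agr\xe9able'                 # the unicode string u'agréable'
utf8_bytes = text.encode('utf-8')     # what print(x.encode('utf8')) sends out
shown = utf8_bytes.decode('cp1252')   # what a cp1252 console would display
```

Here the single character é becomes the two characters Ã©, which matches the kind of output described above.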
If I try in the command line:
- Linux/shell over SSH:
    import sys
    sys.stdout.encoding
    'utf-8'
- Linux/shell, in my code:
    import sys
    sys.stdout.encoding
    None
WTF??
- Windows/PyCharm:
    import sys
    sys.stdout.encoding
    'windows-1252'
What is the best way to write this code so that it works in both environments?
Your philosophy is correct, but you're overcomplicating things and making your code brittle.
Open files in text mode so that they are automatically converted to Unicode for you. Then print without encoding - print is supposed to work out the correct encoding.
If your Linux environment isn't set up correctly, then setting PYTHONIOENCODING=utf-8 in your Linux environment variables (export PYTHONIOENCODING=utf-8) will fix any issues during print. Alternatively, you should consider setting your locale to a UTF-8 variation such as en_GB.UTF-8 to avoid having to define PYTHONIOENCODING.
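To check what PYTHONIOENCODING does when stdout is a pipe rather than a terminal, something like the following sketch may help. It starts a child interpreter with its stdout captured and asks it for sys.stdout.encoding. The helper name child_stdout_encoding is made up for this example.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env=None):
    """Return sys.stdout.encoding as reported by a child Python whose stdout is a pipe.

    Hypothetical helper, for illustration only.
    """
    env = dict(os.environ)          # start from the current environment
    env.update(extra_env or {})     # add or override variables for the child
    out = subprocess.check_output(
        [sys.executable, '-c', 'import sys; print(sys.stdout.encoding)'],
        env=env,
    )
    return out.decode('ascii').strip()


# With the variable set, the child should report UTF-8 even though stdout is piped.
forced = child_stdout_encoding({'PYTHONIOENCODING': 'utf-8'})
```

Calling child_stdout_encoding() with no arguments shows what the child would pick up by default in the current environment, which is a quick way to tell whether the locale is already doing the job.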
PyCharm should work without any modification.
Your code should look like:
    import os
    import sys
    import io
    from myplugin import myplugin

    if __name__ == '__main__':
        c = myplugin()
        # 't' is the default
        with io.open('myfile.txt', 'rt', encoding='utf-8') as myfile:
            # a is now a unicode string
            a = myfile.read()
        result = c.generate(a)
        print result
If you're using Python 3.x, drop import io and the io. prefix from io.open(), and call print as a function: print(result).
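For code that has to run unchanged on both 2.7 and 3.x, keeping io.open should also work, since it exists in both. The sketch below assumes that and uses a made-up shout() function as a stand-in for the plugin's generate(), plus a temporary file so that it is self-contained:

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io
import os
import tempfile


def shout(text):
    """Hypothetical stand-in for the plugin's generate(); handles unicode text only."""
    return text.upper()


# Create a small UTF-8 sample file so the sketch does not depend on myfile.txt.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(path, 'w', encoding='utf-8') as f:
    f.write('d\xe9j\xe0 vu\n')          # 'déjà vu' written as UTF-8 bytes on disk

# Decode once, at the boundary: io.open hands back unicode text on 2.7 and 3.x.
with io.open(path, 'rt', encoding='utf-8') as f:
    a = f.read()
os.remove(path)

result = shout(a)                        # still unicode; no manual encode/decode
```

Printing result is then left to print and the environment (locale or PYTHONIOENCODING), as described above.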