
On Thu, Aug 29, 2019 at 11:41 AM Yedidyah Bar David <didi@redhat.com> wrote:
Hi all,
This is in a sense a continuation of the thread "Why filetransaction needs to encode the content to utf-8?", but I decided that a new thread is better.
I started to systematically convert the code to use a unicode sandwich. I admit it was harder than I expected, and made me think somewhat differently about the move to python3, and about how reasonable (or not) it is to develop in the common subset of python2 and python3 vs ditching python2 and moving fully to python3. It seems like at least parts of our (integration team) code will still have to run in python2 also in oVirt 4.4, so I guess we'll not have much choice :-)
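For readers who have not met the term, here is a minimal sketch of a unicode sandwich that runs on both python2 and python3. It is not taken from the otopi or engine-setup patches; the helper names (read_text, write_text, process, roundtrip_demo) are hypothetical and 'utf-8' is only an assumed default. The idea is to decode bytes to text at the input boundary, work only on text inside, and encode back to bytes at the output boundary.

```python
import io
import os
import tempfile


def read_text(path, encoding='utf-8'):
    # Input boundary: io.open() decodes bytes to text on both python2 and python3.
    with io.open(path, 'r', encoding=encoding) as f:
        return f.read()


def write_text(path, text, encoding='utf-8'):
    # Output boundary: text is encoded to bytes only when it leaves the program.
    with io.open(path, 'w', encoding=encoding) as f:
        f.write(text)


def process(text):
    # The filling: this code only ever sees text (unicode), never bytes.
    return text.upper()


def roundtrip_demo():
    # Hypothetical demo: write non-ASCII text, read it back, process it.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_text(path, u'caf\u00e9 \u05e9\u05dc\u05d5\u05dd')
        return process(read_text(path))
    finally:
        os.remove(path)
```

Using io.open() rather than the builtin open() is what keeps the same code working on python2, where the builtin returns bytes.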
Current patches are only for otopi and engine-setup, and are by no means thorough - I didn't check each and every open() call and similar ones. But it's enough to get engine-setup to finish successfully on both python2 and python3 (EL7 and Fedora 29), with some utf-8 inserted in relevant places of the input (for the plugins already handled).
I didn't bother trying non-utf-8 encodings. Perhaps I should, but it's not completely clear to me what's the best approach [2].
A universal solution for sys.argv, which could contain file paths/names in various languages, would be to use sys.getfilesystemencoding() as the encoding instead of a hard-coded 'utf-8' [3]. We've done something similar in the sanlock Python C API for converting file-system paths into bytes; although that code is in C, the principle of using the file-system default encoding applies there as well [4].
[3] https://stackoverflow.com/a/5113874
[4] https://pagure.io/sanlock/blob/master/f/python/sanlock.c#_76
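A rough sketch of what this could look like in Python code shared between python2 and python3. The helper names to_text and to_bytes are hypothetical, and the 'utf-8' fallback is an assumption, added because sys.getfilesystemencoding() may return None on python2 on some systems. On python3, sys.argv is already text, so to_text() simply returns it unchanged.

```python
import sys


def _fs_encoding():
    # Assumption: fall back to utf-8 if the interpreter cannot tell us
    # the file-system encoding (possible on python2).
    return sys.getfilesystemencoding() or 'utf-8'


def to_text(value, encoding=None):
    # Decode bytes (python2 argv items, raw paths) to text; pass text through.
    if isinstance(value, bytes):
        return value.decode(encoding or _fs_encoding())
    return value


def to_bytes(value, encoding=None):
    # Encode text to bytes for APIs that need byte paths; pass bytes through.
    if isinstance(value, bytes):
        return value
    return value.encode(encoding or _fs_encoding())


def text_argv(argv=None):
    # Hypothetical helper: return the command-line arguments as text.
    return [to_text(arg) for arg in (sys.argv if argv is None else argv)]
```

On python3 only, os.fsencode() and os.fsdecode() do the same job and also handle undecodable bytes, but they are not available on python2.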
Currently, you must have both otopi and engine updated to get things working. If there is demand, I might spend some time splitting/rebasing/etc to make it possible to update just one of them and only later the other, but not sure it's worth it.
I don't mind splitting/squashing if it makes reviews simpler, but I think the patches are ok as-is. These are the bottom patches of each stack:
otopi: https://gerrit.ovirt.org/102085
engine-setup: https://gerrit.ovirt.org/102934
[1] http://python-future.org/unicode_literals.html
[2] https://stackoverflow.com/questions/4012571/python-which-encoding-is-used-fo...
Thanks and best regards, -- Didi