
On Wed, Sep 4, 2019 at 5:21 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Wed, Sep 4, 2019 at 11:50 AM Amit Bawer <abawer@redhat.com> wrote:
On Wed, Sep 4, 2019 at 10:25 AM Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Aug 29, 2019 at 10:00 PM Amit Bawer <abawer@redhat.com> wrote:
On Thu, Aug 29, 2019 at 11:41 AM Yedidyah Bar David <didi@redhat.com> wrote:
Hi all,
This is in a sense a continuation of the thread "Why filetransaction needs to encode the content to utf-8?", but I decided that a new thread is better.
I started to systematically convert the code to use a unicode sandwich. I admit it was harder than I expected, and made me think somewhat differently about the move to python3, and about how reasonable (or not) it is to develop in the common subset of python2 and python3 vs ditching python2 and moving fully to python3. It seems like at least parts of our (integration team) code will still have to run in python2 also in oVirt 4.4, so I guess we'll not have much choice :-)
Current patches are only for otopi and engine-setup, and are by no means thorough - I didn't check each and every open() call and similar ones. But it's enough for getting engine-setup finish successfully on both python2 and python3 (EL7 and Fedora 29), with some utf-8 inserted in relevant places of the input (for the plugins already handled).
I didn't bother trying non-utf-8 encodings. Perhaps I should, but it's not completely clear to me what's the best approach [2].
A universal solution when dealing with sys.argv which could contain file paths/names in various languages, would be selecting sys.getfilesystemencoding() for the encoding scheme instead of a hard coded 'utf-8' [3]. We've done something similar in sanlock python-c API for converting file-system paths into bytes, although it's in C, the principle of using the file-system default encoding applies there as well [4].
Thanks for the hint. Looked at this and thought a bit, and I tend to ignore/postpone until a need arises. We already have "utf-8" hard-coded in otopi 27 times, not sure it makes sense now to go after each and every one of them and analyze the more-general function (or expression, or even more complex) to replace it with. I guess this is only relevant for Windows, and I do not think anyone is going to try to port otopi to Windows soon.
Searching for relevant keywords in google finds mostly results from around 2009-2012, which I guess was the time around which most systems converted their non-utf-8 file collections to utf-8. A somewhat newer example (2016):
http://beets.io/blog/paths.html
So I am going to ignore this. If you think that's a bad choice, please open a bug, and I'll handle it later. Thanks!
Default Linux locale encoding is UTF-8, so I don't think its a bad choice.
Thanks, Amit.
Now rebased and triggered an OST run:
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests...
If it passes, I'll merge both stacks soon. I'll then send a different email, to notify developers to update otopi on their machines.
Was a bit hesitant to merge without verifying on fedora, but fc30 is still not fully building/built, so gave up and verified only on fc30 in dev env. Rebased patches, going to merge them soon if they pass CI.
Best regards,
For now, my top priority is to get otopi+engine-setup+host-deploy work well enough for:
1. Developers that use fedora for everything, or mix fedora and RHEL7/8 (e.g. engine on one, host on another).
2. RHV 4.4, with hosts being RHEL8.
Best regards,
[3] https://stackoverflow.com/a/5113874 [4] https://pagure.io/sanlock/blob/master/f/python/sanlock.c#_76
Currently, you must have both otopi and engine updated to get things working. If there is demand, I might spend some time splitting/rebasing/etc to make it possible to update just one of them and only later the other, but not sure it's worth it.
I don't mind splitting/squashing if it makes reviews simpler, but I think the patches are ok as-is. These are the bottom patches of each stack:
otopi: https://gerrit.ovirt.org/102085
engine-setup: https://gerrit.ovirt.org/102934
[1] http://python-future.org/unicode_literals.html
[2] https://stackoverflow.com/questions/4012571/python-which-encoding-is-used-fo...
Thanks and best regards, -- Didi
-- Didi
-- Didi
-- Didi