
On Sun, Sep 1, 2019 at 2:34 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Sep 1, 2019 at 1:20 PM Amit Bawer <abawer@redhat.com> wrote:
On Sun, Sep 1, 2019 at 10:28 AM Yedidyah Bar David <didi@redhat.com> wrote:
Hi all,
That's a "sub-thread" of "unicode sandwich in otopi/engine-setup".
I was advised to prefer six.text_type() over the u'' prefix. I did read [1], and eventually decided that my own preference is to just add the "u" prefix. The reasoning is in [1].
Do people have different preferences/reasoning they want to share?
Do people think we should have project-wide policy re this?
Since our code is currently transitioning from py2 to py2/py3, and not from py3 to py3/py2, it would be fair to assume that most already existing string literals in it contain ascii symbols, unless explicitly stated otherwise; so IMO it would only make sense to enforce 'u' over newly added literals which involve non-ascii symbols as long as py2 is still alive.
Not exactly.
Suppose (mostly correctly) that the code didn't employ the "unicode sandwich" technique so far. Meaning, much was handled as python2 str objects containing utf-8-encoded strings, and converted to unicode objects mainly as needed/noted/considered. Suppose that x is a variable that used to contain such an str, usually ascii-only, but sometimes perhaps utf-8. Now, this:
'x: {}'.format(x)
would work, replacing {} with the contents of x and returning a python2 str, utf-8-encoded if x is utf-8. But if x now contains a unicode object (because we decided to follow the sandwich approach, and decode all utf-8 during input), it would fail if x is not ascii-only. Adding u to 'x: {}' solves this.
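To make the case above concrete, here is a minimal sketch that runs on both py2 and py3. The value of x and the helper name format_with_plain_literal are made up for illustration; they are not taken from otopi or engine-setup.

```python
# -*- coding: utf-8 -*-
# Illustrative value: text that was already decoded on input (sandwich approach).
x = u'caf\xe9'

# A u-prefixed template returns text on both Python 2 and Python 3.
message = u'x: {}'.format(x)


def format_with_plain_literal(value):
    """Hypothetical helper: format with an unprefixed literal.

    Returns None when the interpreter refuses to do it.
    """
    try:
        return 'x: {}'.format(value)
    except UnicodeEncodeError:
        # Python 2 only: str.format() implicitly encodes the unicode
        # argument with the ASCII codec, which fails for non-ascii text.
        return None


plain = format_with_plain_literal(x)
```

On py3 both calls give the same text. On py2 with the default ascii codec, the unprefixed variant hits the UnicodeEncodeError branch, which is the failure described above.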
utf-8 is an ascii extension, meaning that the first 128 ordinals agree for both encodings, so the unicode sandwich has no negative effect on your example. It would only be a problem if the input for x originally had a non-ascii character in it, but that should have been an issue for py2 in the first place, regardless of py3 sandwiches.
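A small sketch of the "ascii extension" point, with sample strings chosen only for illustration: ascii-only text encodes to identical bytes under both codecs, and only non-ascii text differs.

```python
# -*- coding: utf-8 -*-
ascii_only = u'x: value'
non_ascii = u'caf\xe9'

# For ascii-only text the two encodings produce the same bytes.
same_bytes = ascii_only.encode('ascii') == ascii_only.encode('utf-8')

# Non-ascii text encodes to multi-byte sequences in utf-8...
utf8_bytes = non_ascii.encode('utf-8')

# ...and cannot be encoded as ascii at all.
try:
    non_ascii.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```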
So I have to handle also all existing such literals, at least those that would now require handling unicode vars.
Personally, I do not see the big advantage of adding "six.text_type()" (15 chars) instead of a single "u". I do see where it can be useful, but not as a very long replacement, IMO, for "u", or for unicode_literals.
Once py2 will be officially terminated, probably neither option mentioned above would be meaningful as unicode is py3's default string encoding; however IMO for literals it seems that an explicit 'u' is a more native approach, and provides clarity about the intentions of the programmer compared to a global switch button in the form of import unicode_literals. Using six.text_type() is probably a good solution nowadays for variables and not literals, and would probably have to die off some day after py2 does the same.
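As a rough illustration of the literal/variable split: the u prefix marks a literal as text, while six.text_type() converts a value held in a variable. The names label and count are invented for this sketch, and the fallback branch is an assumption for environments where six is not installed.

```python
try:
    # six.text_type is str on Python 3 and unicode on Python 2.
    from six import text_type
except ImportError:
    # Assumed fallback when six is unavailable: the type of a text literal.
    text_type = type(u'')

label = u'retries: '  # literal: the u prefix states the intent directly
count = 3             # variable holding a non-text value

# Convert the variable to text explicitly before joining it to the literal.
line = label + text_type(count)
```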
Thanks and best regards,
[1] http://python-future.org/unicode_literals.html
-- Didi