On Sun, Sep 1, 2019 at 2:34 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Sep 1, 2019 at 1:20 PM Amit Bawer <abawer@redhat.com> wrote:
>
>
>
> On Sun, Sep 1, 2019 at 10:28 AM Yedidyah Bar David <didi@redhat.com> wrote:
>>
>> Hi all,
>>
>> This is a "sub-thread" of "unicode sandwich in otopi/engine-setup".
>>
>> I was advised to use six.text_type() over the "u" prefix. I did read [1],
>> and eventually decided that my own preference is to just add the "u"
>> prefix. Reasoning is inside [1].
>>
>> Do people have different preferences/reasoning they want to share?
>>
>> Do people think we should have project-wide policy re this?
>
>
> Since our code is currently transitioning from py2 to py2/py3, and not from py3 to py3/py2, it would be fair to assume that most
> already existing string literals in it contain only ascii characters, unless explicitly stated otherwise;
> so IMO it would only make sense to enforce 'u' on newly added literals that involve non-ascii characters, as long as py2 is still alive.

Not exactly.

Suppose (mostly correctly) that the code didn't employ the "unicode
sandwich" technique so far. Meaning, much was handled as python2 str
objects containing utf-8-encoded strings, and converted to unicode
objects mainly as needed/noted/considered. Suppose that x is a
variable that used to contain such an str, usually ascii-only, but
sometimes perhaps utf-8. Now, this:

'x: {}'.format(x)

would work, and replace {} with the contents of x, and return a
python2 str, utf-8-encoded if x is utf-8. But if x now contains a
unicode object (because we decided to follow the sandwich approach,
and decode all utf-8 during input), it would fail on python2 with a
UnicodeEncodeError if x is not ascii-only. Adding u to 'x: {}' solves this.
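
A minimal sketch of the case described above, written to run on both py2 and py3. The value of x is made up for illustration. The explicit encode('ascii') call only mimics the implicit conversion that py2 performs when a byte-str template is formatted with a unicode argument; it is not the actual otopi code.

```python
from __future__ import print_function

# Hypothetical value: a text (unicode) object with one non-ascii character.
x = u'caf\u00e9'

# A u-prefixed template keeps the whole operation in text, on py2 and py3.
result = u'x: {}'.format(x)
print(repr(result))

# On py2, a plain 'x: {}' template is a byte str, so format() has to turn x
# into bytes using the default ascii codec. The explicit call below mimics
# that implicit step, and fails the same way on either python version.
try:
    x.encode('ascii')
    implicit_encode_ok = True
except UnicodeEncodeError:
    implicit_encode_ok = False
print(implicit_encode_ok)
```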

utf-8 is an ascii extension, meaning that the first 128 code points are encoded identically in both, so the unicode sandwich has no negative effect on your example.
It would only be a problem if the input for x originally had a non-ascii character in it, but that should have been an issue for py2 in the first place, regardless of py3 sandwiches.
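
To make the ascii/utf-8 relationship concrete, here is a small sketch with made-up sample values. It only shows that ascii-only bytes decode identically under both codecs, and that utf-8 bytes for a non-ascii character cannot be decoded as ascii.

```python
# Hypothetical ascii-only input: both codecs yield the same text.
ascii_bytes = b'hostname-01'
same_text = ascii_bytes.decode('ascii') == ascii_bytes.decode('utf-8')

# Hypothetical non-ascii input: u'\u00e9' takes two bytes in utf-8.
utf8_bytes = u'caf\u00e9'.encode('utf-8')
text = utf8_bytes.decode('utf-8')

# Decoding those same bytes as ascii fails, on py2 and py3 alike.
try:
    utf8_bytes.decode('ascii')
    ascii_decodable = True
except UnicodeDecodeError:
    ascii_decodable = False
```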


So I also have to handle all such existing literals, at least those
that would now need to handle unicode vars.

>
>>
>>
>> Personally, I do not see the big advantage of adding "six.text_type()"
>> (15 chars) instead of a single "u". I do see where it can be useful,
>> but not as a very long replacement, IMO, for "u", or for
>> unicode_literals.
>
>
> Once py2 is officially terminated, probably neither option mentioned above will be meaningful, as py3 strings are unicode by default;
> however, IMO for literals an explicit 'u' is the more native approach, and it shows the programmer's intentions more clearly than
> a global switch in the form of "from __future__ import unicode_literals". Using six.text_type() is probably a good solution nowadays for variables rather than literals,
> and it would probably have to die off some day after py2 does.
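
As an illustration of the "variables, not literals" point, here is a hedged sketch of converting a variable to text. The helper name to_text and its encoding parameter are hypothetical, not something from otopi or six. The fallback branch is only there so the sketch runs where six is not installed.

```python
import sys

try:
    from six import text_type
except ImportError:
    # Fallback for the sketch only: unicode on py2, str on py3.
    text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821


def to_text(value, encoding='utf-8'):
    """Return value as text, decoding byte strings with the given encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


# Literals get a plain u prefix; variables go through the helper.
label = u'x: {}'.format(to_text(b'caf\xc3\xa9'))
```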
>
>>
>>
>> Thanks and best regards,
>>
>> [1] http://python-future.org/unicode_literals.html
>> --
>> Didi
>> List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/SW3P4VOGBP43N54CQEH3YURN6X5ZMWIX/



--
Didi