On Tue, Jul 23, 2019 at 1:11 PM Yedidyah Bar David <didi@redhat.com> wrote:
Hi Nir and all,

In [1] you added line 151, to encode the contents to utf-8. Do you
remember why you needed that? What happens if I remove this line?

I added it because installation on Fedora 28 was broken without it. I don't
remember what the error was, but the content was probably unicode instead of bytes.

I am working on [2]. It fails on that line, because the current
content, if organization name is unicode, has a UTF-8 encoded string
already, but is a python str (not unicode).

If the content is not unicode, why does it go to the unicode branch?

        if binary:
            self._content = content
        else:
            if isinstance(content, list) or isinstance(content, tuple):
                self._content = '\n'.join([common.toStr(i) for i in content])
                if content:
                    self._content += '\n'
            else:
                self._content = common.toStr(content)
                if not self._content.endswith('\n'):
                    self._content += '\n'
            self._content = self._content.encode("utf-8")

It looks like the binary flag value is wrong, or common.toStr() is encoding
the values to bytes.

The common error in Python 2 is:

>>> text = u"\u05d0"
>>> encoded = text.encode("utf-8")
>>> encoded.encode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)

But:

>>> text = u"ascii"
>>> encoded = text.encode("utf-8")
>>> encoded.encode("utf-8")
'ascii'

The failing case seems to be what happens in [2].
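
The reason is that in Python 2, calling encode() on a str first decodes it
implicitly with the default ASCII codec, and only then encodes the result.
A small sketch of that hidden step, written so it also runs on Python 3
(implicit_decode_fails is a made-up name for illustration, not otopi code):

```python
def implicit_decode_fails(data):
    """Return True if the ASCII decode that Python 2 performs
    implicitly before str.encode() would fail for these bytes."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


# Non-ASCII text that was already encoded to UTF-8 bytes.
encoded = u"\u05d0".encode("utf-8")

print(implicit_decode_fails(encoded))    # the failing case
print(implicit_decode_fails(b"ascii"))   # the case that happens to work
```

So double encoding goes unnoticed as long as the data is pure ASCII, and
breaks only when a non-ASCII value shows up.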

So if common.toStr() is encoding the values, there is no need to encode the values
again in line 151.
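
If we cannot be sure what type we get, one option is to encode only when the
value is really text. This is a rough sketch, not a proposed patch; to_bytes
and text_type are hypothetical names and do not exist in otopi:

```python
# unicode on Python 2, str on Python 3
text_type = type(u"")


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it only if it is text."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, bytes):
        # Already encoded. Encoding again would trigger the implicit
        # ASCII decode on Python 2 and fail for non-ASCII data.
        return value
    raise TypeError("expected text or bytes, got %r" % type(value))


print(repr(to_bytes(u"\u05d0")))                  # text is encoded once
print(repr(to_bytes(u"\u05d0".encode("utf-8"))))  # bytes pass through
```

This hides the problem rather than fixing it, so I would still prefer to find
where the value is encoded too early.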

Tried patching otopi [3],
did a few attempts (some of them also pushed there, check the
different patchsets), but none worked. So I am going to patch
postinstall file generation instead [4], but I don't like this.

Any hints are welcome. Thanks and best regards,

[1] https://gerrit.ovirt.org/#/c/92435/1/src/otopi/filetransaction.py

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1729511

[3] https://gerrit.ovirt.org/102085

I don't understand what you are trying to do there; however, codecs.open() is not
needed to read utf-8 data.

    with open(path, "rb") as f:
        text = f.read().decode("utf-8")
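
If you prefer to let the file object do the decoding, io.open() accepts an
encoding argument on both Python 2 and Python 3. A minimal sketch (read_text
is a made-up helper name, only for illustration):

```python
import io


def read_text(path):
    """Read a UTF-8 encoded file and return its content as text."""
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()
```

Either way the result is unicode text, and it should be encoded exactly once,
when it is written back.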


[4] https://gerrit.ovirt.org/102089

I agree that this is not the way.

Nir