
On Tue, Jul 23, 2019 at 1:11 PM Yedidyah Bar David <didi@redhat.com> wrote:
Hi Nir and all,
In [1] you added line 151, to encode the contents to utf-8. Do you remember why you needed that? What happens if I remove this line?
I added it because installation on Fedora 28 was broken without it. I don't remember what the error was, but the content was probably unicode instead of bytes.

I am working on [2]. It fails on that line because the current content, if the organization name is unicode, already holds a UTF-8 encoded string, but as a Python str (not unicode).
If the content is not unicode, why does it go to the unicode branch?

    if binary:
        self._content = content
    else:
        if isinstance(content, list) or isinstance(content, tuple):
            self._content = '\n'.join([common.toStr(i) for i in content])
            if content:
                self._content += '\n'
        else:
            self._content = common.toStr(content)
            if not self._content.endswith('\n'):
                self._content += '\n'
        self._content = self._content.encode("utf-8")

It looks like the binary flag value is wrong, or common.toStr() is encoding the values to bytes. The common error in Python 2 is:

    >>> text = u"\u05d0"
    >>> encoded = text.encode("utf-8")
    >>> encoded.encode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)

But:

    >>> text = u"ascii"
    >>> encoded = text.encode("utf-8")
    >>> encoded.encode("utf-8")
    'ascii'

Which seems to be the case in [2]. So if common.toStr() is already encoding the values, there is no need to encode them again in line 151.

I tried patching otopi [3] and made a few attempts (some of them were also pushed there; check the different patchsets), but none worked. So I am going to patch the postinstall file generation instead [4], but I don't like this.
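
For reference, here is a minimal sketch of why the second encode() fails only for non-ASCII data, and of one possible guard. It assumes nothing about otopi itself: py2_style_encode() and to_utf8_bytes() are hypothetical helpers written for illustration, not otopi functions. The first mimics the implicit ASCII decode that Python 2 performs when encode() is called on a byte string; the second encodes only when the value is still text.

```python
def py2_style_encode(data):
    """Mimic Python 2's str.encode("utf-8") on a byte string.

    Python 2 first decodes the bytes implicitly with the ASCII codec,
    then encodes the resulting unicode object.
    """
    return data.decode("ascii").encode("utf-8")


def to_utf8_bytes(value):
    """Hypothetical guard: encode text, leave already-encoded bytes alone."""
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


hebrew = u"\u05d0".encode("utf-8")  # non-ASCII content, already encoded
plain = u"ascii".encode("utf-8")    # ASCII-only content, already encoded

# ASCII-only bytes survive the implicit decode, so nothing fails.
assert py2_style_encode(plain) == b"ascii"

# Non-ASCII bytes fail in the implicit ASCII decode step.
try:
    py2_style_encode(hebrew)
    failed = False
except UnicodeDecodeError:
    failed = True
assert failed

# The guard accepts both text and bytes and never encodes twice.
assert to_utf8_bytes(hebrew) == hebrew
assert to_utf8_bytes(u"\u05d0") == hebrew
```

Whether such a guard belongs in filetransaction.py or in the caller depends on what common.toStr() is supposed to return, which this sketch does not settle.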
Any hints are welcome. Thanks and best regards,
[1] https://gerrit.ovirt.org/#/c/92435/1/src/otopi/filetransaction.py
I don't understand what you are trying to do there; however, codecs.open() is not needed to read UTF-8 data:

    with open(path, "rb") as f:
        text = f.read().decode("utf-8")

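
A small self-contained sketch of the same idea, runnable on both Python 2 and Python 3. The file content and the helper names read_utf8() and read_utf8_io() are made up for illustration. It also shows io.open() with an explicit encoding, which is an alternative I am adding here, not something the patch uses.

```python
import io
import os
import tempfile


def read_utf8(path):
    """Read raw bytes and decode them explicitly."""
    with open(path, "rb") as f:
        return f.read().decode("utf-8")


def read_utf8_io(path):
    """Let io.open() do the decoding while reading."""
    with io.open(path, encoding="utf-8") as f:
        return f.read()


# Write a temporary file holding UTF-8 encoded, non-ASCII content.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(u"org: \u05d0\n".encode("utf-8"))

    # Both helpers return the same unicode text.
    assert read_utf8(path) == u"org: \u05d0\n"
    assert read_utf8_io(path) == u"org: \u05d0\n"
finally:
    os.remove(path)
```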
I agree that this is not the way.

Nir