On Wed, Sep 30, 2020 at 7:08 AM Ritesh Chikatwar <rchikatw(a)redhat.com> wrote:
On Wed, Sep 30, 2020 at 1:54 AM Nir Soffer <nsoffer(a)redhat.com> wrote:
>
> On Tue, Sep 29, 2020 at 11:32 AM Michal Skrivanek
> <michal.skrivanek(a)redhat.com> wrote:
> >
> > [adding devel list]
> >
> > On 29 Sep 2020, at 09:51, Ritesh Chikatwar <rchikatw(a)redhat.com> wrote:
> >
> > Hello,
> >
> > i am new to VDSM codebase.
> >
> > There is one minor bug in gluster and they don't have any near plan to fix
this. So i need to fix this in vdsm.
> >
> > The bug is when i run command
> > [root@dhcp35-237 ~]# gluster v geo-replication status
> > No active geo-replication sessions
> > geo-replication command failed
> > [root@dhcp35-237 ~]#
> >
> > So this engine is piling up with error. i need handle this in vdsm the code at
this place
> >
> >
https://github.com/oVirt/vdsm/blob/master/lib/vdsm/gluster/cli.py#L1231
> >
> > If i run the above command with xml then i will get
> >
> > [root@dhcp35-237 ~]# gluster v geo-replication status --xml
> > geo-replication command failed
> > [root@dhcp35-237 ~]#
> >
> > so i have to run one more command before executing this and check in exception
that it contains No active geo-replication sessions this string. for that i did like this
> >
> >
> > maybe it goes to stderr?
> >
> > other than that, no idea, sorry
> >
> > Thanks,
> > michal
> >
> >
> > try:
> > xmltree = _execGluster(command)
> > except ge.GlusterCmdFailedException as e:
> > if str(e).find("No active geo-replication sessions") != -1:
> > return []
>
> The issue is that _execGluster is not implemented correctly, dropping
> the stdout, stderror, and return the format string:
>
> "Command {self.cmd} failed with rc={self.rc} out={self.out!r}
> err={self.err!r}"
>
> As the error, inside a list.
>
> 109 def _execGluster(cmd):
> 110 try:
> 111 return commands.run(cmd)
> 112 except cmdutils.Error as e:
> 113 raise ge.GlusterCmdFailedException(rc=e.rc, err=[e.msg])
>
> After fixing this issue you will be able to search in the error message
> of the original command, like this:
>
> if "needle" in e.err:
>
> But searching for text in an error message is fragile. It will break
> if vdsm is running
> with a different locale and the error message is localized, or if the
> error message
> is modified in a future gluster version.
Yes true. Because of this engine messages(1858236) are piling up so thought of doing in
this way if there is another of handling this in vdsm let me know.
>
>
> The right thing is to fix the gluster tool so it it succeeds and
> passes the error
> code/message in a way that can be consumed by another program.
>
> Looking at _getTree(), this is how it already behaves:
>
> 116 def _getTree(out):
> 117 try:
> 118 tree = etree.fromstring(out)
> 119 rv = int(tree.find('opRet').text)
> 120 msg = tree.find('opErrstr').text
> 121 errNo = int(tree.find('opErrno').text)
> 122 except _etreeExceptions: # pylint: disable=catching-non-exception
> 123 raise ge.GlusterXmlErrorException(err=out)
> 124 if rv == 0:
> 125 return tree
> 126 else:
> 127 if errNo != 0:
> 128 rv = errNo
> 129 raise ge.GlusterCmdFailedException(rc=rv, err=[msg])
>
> Did you file a gluster bug for this?
There is a bug already created for this #1367
I see that there is a patch for fixing the root cause in gluster.
So I would not add a hack to detect these errors in vdsm.
The engine bug (
) should move to vdsm
or to gluster, since vdsm is reporting the errors from gluster as it should.
Regardless, we should fix vdsm error handling so it exposes the real error
instead of the error format string.