Hi,
I'm not sure whether this is the same issue, but I opened a thread about
guestfs_launch failing not long ago:
https://www.redhat.com/archives/libguestfs/2020-August/msg00352.html
Since we use our own tool, we ended up retrying guestfs_launch (and also
guestfs_open, since the documentation suggests not reusing handles). We
only introduced the retry recently, so I can't yet say whether it solved
the issue.
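
For illustration, here is a minimal sketch of that kind of retry in Python. It is not our actual tool: open_handle is a hypothetical factory that returns a new, configured handle (for example one that calls guestfs.GuestFS() and adds the drives), and the attempts and delay defaults are arbitrary. It assumes launch failures surface as RuntimeError, as they do in the libguestfs Python bindings.

```python
import time


def launch_with_retry(open_handle, attempts=3, delay=1.0):
    """Open a fresh handle and launch it, retrying when launch fails.

    open_handle is a hypothetical zero-argument factory returning an
    object with launch() and close() methods. A new handle is created
    for every attempt, so a handle whose launch failed is never reused.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        handle = open_handle()
        try:
            handle.launch()
            return handle
        except RuntimeError as error:
            last_error = error
            handle.close()  # discard the failed handle
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: surface the last launch error to the caller.
    raise last_error
```

The caller owns the returned handle and is responsible for closing it.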
Sam
On Mon, Nov 9, 2020 at 9:57 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
>
> On Wed, Oct 14, 2020 at 8:54 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 8:40 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 07:56:29PM +0300, Nir Soffer wrote:
> > > > On Tue, Oct 13, 2020 at 7:15 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > > > >
> > > > > On Tue, Oct 13, 2020 at 06:45:42PM +0300, Nir Soffer wrote:
> > > > > > I think this is the right solution - when virt-something tool
> > > > > > fails, it should log the reason for the failure - the error
> > > > > > that caused the tool to fail. I'm not sure this is easy to do
> > > > > > as the failing code run inside a special VM. Maybe the code
> > > > > > running in the VM should log the output in a machine readable
> > > > > > way, so once an error is detected virt-something can report
> > > > > > the error as the reason, without running in debug mode.
> > > > >
> > > > > All the virt-* tools that I've written have a non-zero exit code
> > > > > and print an error message on stderr when they fail. Errors from
> > > > > inside the appliance are propagated to the library and thence to
> > > > > the tool correctly.
> > > > >
> > > > > I think the best thing to do is:
> > > > >
> > > > > - spool up stdout + stderr from the tool
> > > > >
> > > > > - if the exit code != 0, save the spooled output for analysis
> > > > >
> > > > > - if the exit code == 0, discard it (or keep it if you like)
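
The three steps above can be sketched in Python roughly as follows. This only illustrates the spool-then-keep-on-failure idea and is not how vdsm runs commands; the function name, the keep_dir parameter and the log file prefix are made up for the example.

```python
import shutil
import subprocess
import tempfile


def run_spooled(cmd, keep_dir=None):
    """Run cmd with stdout and stderr spooled to one temporary file.

    Returns (exit_code, saved_path). saved_path is None when the tool
    succeeded and the spooled output was discarded.
    """
    with tempfile.TemporaryFile() as spool:
        # stderr=STDOUT sends both streams to the same spool file.
        rc = subprocess.run(cmd, stdout=spool, stderr=subprocess.STDOUT).returncode
        if rc == 0:
            return rc, None  # the spool is deleted when the with-block exits
        # Failure: copy the spooled output to a file that outlives this call.
        spool.seek(0)
        with tempfile.NamedTemporaryFile(
            prefix="tool-output-", suffix=".log", dir=keep_dir, delete=False
        ) as saved:
            shutil.copyfileobj(spool, saved)
        return rc, saved.name
```

A caller would pass the tool's argv list as cmd and, on a non-zero exit code, attach the file at saved_path to whatever it logs about the failure.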
> > > >
> > > > This is what we already do, and the result is not helpful. If you
> > > > look at the log message in the previous message, basically the only
> > > > info about the error is:
> > > >
> > > > libguestfs error: guestfs_launch failed
> > > >
> > > > I don't see what we can do with this error message.
> > >
> > > Right, so in this particular instance the error message would tell us
> > > that you should run libguestfs-test-tool because your qemu/kernel/etc
> > > is broken in some way :-/
> > >
> > > There's not a particularly good answer here if you don't want to ever
> > > use LIBGUESTFS_DEBUG/LIBGUESTFS_TRACE, but perhaps you could run
> > > libguestfs-test-tool if you see any error which matches the substring
> > > /guestfs_launch/ ?
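
A rough Python sketch of that suggestion: match the tool's stderr against guestfs_launch and, only on a match, run libguestfs-test-tool and capture its output. The function name and the injectable run parameter are assumptions made for the example (the latter so the sketch can be exercised without libguestfs installed); libguestfs-test-tool is invoked here without arguments.

```python
import re
import subprocess

# The error text arrives as bytes, as in the vdsm log (<err> = b"...").
LAUNCH_ERROR = re.compile(rb"guestfs_launch")


def diagnose_launch_failure(stderr, run=subprocess.run):
    """Return libguestfs-test-tool output if stderr mentions guestfs_launch.

    Returns None when the error does not look like a launch failure, in
    which case the test tool is not run at all.
    """
    if not LAUNCH_ERROR.search(stderr):
        return None
    result = run(
        ["libguestfs-test-tool"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep the tool's diagnostics in one stream
    )
    return result.stdout
```

The captured output could then be logged next to the original error, so the expensive diagnostics are collected only when a launch failure actually happened.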
> >
> > Another (orthogonal?) option:
> >
> > Make LIBGUESTFS_DEBUG/LIBGUESTFS_TRACE log elsewhere, not to stdout/err
> > (e.g. some other file descriptor, or to a file passed via env or whatever).
> > This way, it might make sense for vdsm to always pass these vars, continue
> > logging all stdout/err, and log/keep debug/trace logs only on errors.
>
> This now happened again:
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
>
> 2020-11-09 01:05:42,031-0500 INFO (jsonrpc/4) [api.host] FINISH
> getAllVmIoTunePolicies return={'status': {'code': 0, 'message':
> 'Done'}, 'io_tune_policies_dict':
> {'c189ecb3-8f2e-4726-8766-7d2d9b514687': {'policy': [],
> 'current_values': [{'name': 'vda', 'path':
> '/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/1d093232-d41e-483f-a915-62f8db3c972f/images/e7ee6417-b319-4d84-81a5-5d77cbce2385/710d2c10-e6b7-4d16-bd37-50a9d4e14a80',
> 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0,
> 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0,
> 'read_iops_sec': 0}}]}}} from=::1,34002 (api:54)
> 2020-11-09 01:05:42,038-0500 DEBUG (jsonrpc/4) [jsonrpc.JsonRpcServer]
> Return 'Host.getAllVmIoTunePolicies' in bridge with
> {'c189ecb3-8f2e-4726-8766-7d2d9b514687': {'policy': [],
> 'current_values': [{'name': 'vda', 'path':
> '/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/1d093232-d41e-483f-a915-62f8db3c972f/images/e7ee6417-b319-4d84-81a5-5d77cbce2385/710d2c10-e6b7-4d16-bd37-50a9d4e14a80',
> 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0,
> 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0,
> 'read_iops_sec': 0}}]}} (__init__:360)
> 2020-11-09 01:05:42,435-0500 DEBUG (tasks/3) [common.commands] FAILED:
> <err> = b"virt-sparsify: error: libguestfs error: guestfs_launch
> failed.\nThis usually means the libguestfs appliance failed to start
> or crashed.\nDo:\n export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\nand
> run the command again. For further information, read:\n
> http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou can
> also run 'libguestfs-test-tool' and post the *complete* output\ninto a
> bug report or message to the libguestfs mailing list.\n\nIf reporting
> bugs, run virt-sparsify with debugging enabled and include the
> \ncomplete output:\n\n virt-sparsify -v -x [...]\n"; <rc> = 1
> (commands:98)
>
> I suggest that if we have come to a dead-end and no-one has any clue, then
> we either patch something (vdsm?) to allow getting more information if this
> happens again, or open a bug for further discussion/prioritization.
>
> Best regards,
> --
> Didi