[ovirt-devel] [OST Failure Report] [oVirt Master] [002_bootstrap] [20/08/17]

Nir Soffer nsoffer at redhat.com
Wed Aug 23 07:21:43 UTC 2017


On Tue, Aug 22, 2017 at 1:48 PM Dan Kenigsberg <danken at redhat.com> wrote:

> This seems to be my fault, https://gerrit.ovirt.org/80908 should fix it.
>

This fixes the actual error, but we still have bad logging.

Piotr, can you fix the error handling so we get something like:

    Error configuring "foobar": actual error...
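
Something along these lines would do (a minimal sketch only, not the
actual vdsm-tool code; the "configurators" list and the .name/.configure()
interface are assumptions for illustration):

    import logging

    def configure_all(configurators):
        """Run every configurator, reporting failures per module."""
        failed = []
        for c in configurators:
            try:
                c.configure()
            except Exception as e:
                # Keep the module name next to the underlying error, so
                # the host-deploy log shows which module failed and why.
                logging.exception("Error configuring %r: %s", c.name, e)
                failed.append(c.name)
        if failed:
            raise RuntimeError(
                "Could not configure modules: %s" % ", ".join(failed))

With something like this the host-deploy log would name the failing
module instead of the bare "Cannot configure vdsm" warning in the log
below.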


>
> On Tue, Aug 22, 2017 at 1:14 PM, Nir Soffer <nsoffer at redhat.com> wrote:
> >
> >
> > On Tue, Aug 22, 2017, 12:57 Yedidyah Bar David <didi at redhat.com>
> > wrote:
> >>
> >> On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <amarchuk at redhat.com>
> >> wrote:
> >> > Hello All.
> >> >
> >> > Any news on this? I see the latest failure for vdsm is the same [1]
> >> > and the job is still not working for it.
> >> >
> >> > [1]
> >> >
> >> >
> >> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
> >>
> >> This log has:
> >>
> >> 2017-08-22 03:51:28,272-0400 DEBUG otopi.context
> >> context._executeMethod:128 Stage closeup METHOD
> >> otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
> >> 2017-08-22 03:51:28,272-0400 DEBUG
> >> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813
> >> execute: ('/bin/vdsm-tool', 'configure', '--force'),
> >> executable='None', cwd='None', env=None
> >> 2017-08-22 03:51:30,687-0400 DEBUG
> >> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863
> >> execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
> >> 2017-08-22 03:51:30,688-0400 DEBUG
> >> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921
> >> execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
> >>
> >> Checking configuration status...
> >>
> >> abrt is not configured for vdsm
> >> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
> >> on vdsm configuration
> >> lvm requires configuration
> >> libvirt is not configured for vdsm yet
> >> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
> >> vdsm.conf with ssl=True requires the following changes:
> >> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
> >> qemu.conf: spice_tls=1.
> >> multipath requires configuration
> >>
> >> Running configure...
> >> Reconfiguration of abrt is done.
> >> Reconfiguration of passwd is done.
> >> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
> >> on vdsm configuration
> >> Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
> >> Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
> >> Units need configuration: {'lvm2-lvmetad.service': {'LoadState':
> >> 'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket':
> >> {'LoadState': 'loaded', 'ActiveState': 'active'}}
> >> Reconfiguration of lvm is done.
> >> Reconfiguration of sebool is done.
> >>
> >> 2017-08-22 03:51:30,688-0400 DEBUG
> >> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926
> >> execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
> >> Error:  ServiceNotExistError: Tried all alternatives but failed:
> >> ServiceNotExistError: dev-hugepages1G.mount is not native systemctl
> >> service
> >> ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
> >>
> >>
> >> 2017-08-22 03:51:30,689-0400 WARNING
> >> otopi.plugins.ovirt_host_deploy.vdsm.packages
> >> packages._reconfigure:155 Cannot configure vdsm
> >>
> >> Nir, any idea?
> >
> >
> > Looks like some configurator has failed after sebool, but we don't have
> > a proper error message with the name of the configurator.
> >
> > Piotr, can you take a look?
> >
> >
> >>
> >> >
> >> >
> >> >
> >> >> On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <nsoffer at redhat.com>
> >> >> wrote:
> >> >>
> >> >> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken at redhat.com>
> >> >> wrote:
> >> >>>
> >> >>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul at redhat.com>
> >> >>> wrote:
> >> >>> >
> >> >>> >
> >> >>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky
> >> >>> > <dbelenky at redhat.com>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> Failed test: basic_suite_master/002_bootstrap
> >> >>> >> Version: oVirt Master
> >> >>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
> >> >>> >> Link to logs (Jenkins): test logs
> >> >>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
> >> >>> >>
> >> >>> >> From what I was able to find, it seems that for some reason VDSM
> >> >>> >> failed to start on host 1. The VDSM log is empty, and the only error
> >> >>> >> I could find in supervdsm.log is that the start of LLDP failed (not
> >> >>> >> sure if it's related).
> >> >>> >
> >> >>> >
> >> >>> > Can you check the networking on the hosts? Something's very strange
> >> >>> > there.
> >> >>> > For example:
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]:
> >> >>> > <info>
> >> >>> > [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device
> >> >>> > (/org/freedesktop/NetworkManager/Devices/17)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting xmit hash policy to layer2+3 (2)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting xmit hash policy to encap2+3 (3)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting xmit hash policy to encap3+4 (4)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > option xmit_hash_policy: invalid value (5)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting primary_reselect to always (0)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting primary_reselect to better (1)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting primary_reselect to failure (2)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > option primary_reselect: invalid value (3)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting arp_all_targets to any (0)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > Setting arp_all_targets to all (1)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
> >> >>> > e7NZWeNDXwIjQia:
> >> >>> > option arp_all_targets: invalid value (2)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding:
> >> >>> > e7NZWeNDXwIjQia is being deleted...
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event
> >> >>> > interface): No buffer space available
> >> >>> >
> >> >>> > Y.
> >> >>>
> >> >>>
> >> >>>
> >> >>> The post-boot noise with funny-looking bonds is due to our calling of
> >> >>> `vdsm-tool dump-bonding-options` every boot, in order to find the
> >> >>> bonding defaults for the current kernel.
> >> >>>
> >> >>> >
> >> >>> >>
> >> >>> >> From host-deploy log:
> >> >>> >>
> >> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG
> >> >>> >> otopi.plugins.otopi.services.systemd
> >> >>> >> systemd.state:130 starting service vdsmd
> >> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG
> >> >>> >> otopi.plugins.otopi.services.systemd
> >> >>> >> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
> >> >>> >> 'vdsmd.service'),
> >> >>> >> executable='None', cwd='None', env=None
> >> >>> >> 2017-08-19 16:38:44,628-0400 DEBUG
> >> >>> >> otopi.plugins.otopi.services.systemd
> >> >>> >> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
> >> >>> >> 'vdsmd.service'), rc=1
> >> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG
> >> >>> >> otopi.plugins.otopi.services.systemd
> >> >>> >> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
> >> >>> >> 'vdsmd.service') stdout:
> >> >>> >>
> >> >>> >>
> >> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG
> >> >>> >> otopi.plugins.otopi.services.systemd
> >> >>> >> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
> >> >>> >> 'vdsmd.service') stderr:
> >> >>> >> Job for vdsmd.service failed because the control process exited with
> >> >>> >> error code. See "systemctl status vdsmd.service" and "journalctl -xe"
> >> >>> >> for details.
> >> >>> >>
> >> >>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context
> >> >>> >> context._executeMethod:142 method exception
> >> >>> >> Traceback (most recent call last):
> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132,
> >> >>> >> in _executeMethod
> >> >>> >>     method['method']()
> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
> >> >>> >> line 224, in _start
> >> >>> >>     self.services.state('vdsmd', True)
> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py",
> >> >>> >> line 141, in state
> >> >>> >>     service=name,
> >> >>> >> RuntimeError: Failed to start service 'vdsmd'
> >> >>> >>
> >> >>> >>
> >> >>> >> From /var/log/messages:
> >> >>> >>
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of the modules is not configured to work with VDSM.
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To configure the module use the following:
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure [--module module-name]'.
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all modules are not configured try to use:
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure --force'
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The force flag will stop the module's service and start it
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: afterwards automatically to load the new configuration.)
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt is already configured for vdsm
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is configured for vdsm
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: libvirt is already configured for vdsm
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: multipath requires configuration
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Modules sanlock, multipath are not configured
> >> >>
> >> >>
> >> >> This means the host was not deployed correctly. When deploying vdsm
> >> >> host deploy must run "vdsm-tool configure --force", which configures
> >> >> multipath and sanlock.
> >> >>
> >> >> We did not change anything in the multipath and sanlock configurators
> >> >> lately.
> >> >>
> >> >> Didi, can you check this?
> >> >>
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Anton Marchukov
> >> > Team Lead - Release Management - RHV DevOps - Red Hat
> >> >
> >>
> >>
> >>
> >> --
> >> Didi
> >
> >
>