[ovirt-devel] [OST Failure Report] [oVirt Master] [002_bootstrap] [20/08/17]
Nir Soffer
nsoffer at redhat.com
Sun Aug 20 10:39:24 UTC 2017
On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken at redhat.com> wrote:
> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
> >
> >
> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky <dbelenky at redhat.com>
> wrote:
> >>
> >> Failed test: basic_suite_master/002_bootstrap
> >> Version: oVirt Master
> >> Link to failed job: ovirt-master_change-queue-tester/1860/
> >> Link to logs (Jenkins): test logs
> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
> >>
> >> From what I was able to find, it seems that for some reason VDSM failed
> >> to start on host 1. The VDSM log is empty, and the only error I could
> >> find in supervdsm.log is that starting LLDP failed (not sure if it's
> >> related).
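
For reference, these observations can be reproduced on the host with a few
commands; this is a sketch only, assuming vdsm's default log locations:

    ls -l /var/log/vdsm/vdsm.log               # zero length: vdsm never got far enough to log
    grep -i error /var/log/vdsm/supervdsm.log  # shows the LLDP start failure
    journalctl -u vdsmd.service                # why systemd considers the start failed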
> >
> >
> > Can you check the networking on the hosts? Something's very strange
> there.
> > For example:
> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]: <info> [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device (/org/freedesktop/NetworkManager/Devices/17)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to layer2+3 (2)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap2+3 (3)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap3+4 (4)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option xmit_hash_policy: invalid value (5)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to always (0)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to better (1)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to failure (2)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option primary_reselect: invalid value (3)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to any (0)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to all (1)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option arp_all_targets: invalid value (2)
> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding: e7NZWeNDXwIjQia is being deleted...
> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event interface): No buffer space available
> >
> > Y.
>
>
>
> The post-boot noise with the funny-looking bonds is due to our calling
> `vdsm-tool dump-bonding-options` on every boot, in order to find the
> bonding defaults for the current kernel.
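
That probing is also why a randomly named bond shows up, has option values
written to it until one is rejected ("invalid value"), and is then deleted.
A minimal sketch of the idea, assuming the kernel's standard sysfs bonding
interface; the scratch name is made up here, and this is not vdsm-tool's
actual code, which may differ:

    # create a scratch bond (requires root); the kernel logs a new bond device
    echo +scratch0 > /sys/class/net/bonding_masters
    # read the kernel's default value for every bonding option
    for opt in /sys/class/net/scratch0/bonding/*; do
        printf '%s = %s\n' "$(basename "$opt")" "$(cat "$opt")"
    done
    # delete the scratch bond again ("is being deleted..." in the log)
    echo -scratch0 > /sys/class/net/bonding_masters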
>
> >
> >>
> >> From host-deploy log:
> >>
> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd
> >> systemd.state:130 starting service vdsmd
> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd
> >> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
> >> 'vdsmd.service'), executable='None', cwd='None', env=None
> >> 2017-08-19 16:38:44,628-0400 DEBUG otopi.plugins.otopi.services.systemd
> >> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
> >> 'vdsmd.service'), rc=1
> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd
> >> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
> >> 'vdsmd.service') stdout:
> >>
> >>
> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd
> >> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
> >> 'vdsmd.service') stderr:
> >> Job for vdsmd.service failed because the control process exited with
> >> error code. See "systemctl status vdsmd.service" and "journalctl -xe"
> >> for details.
> >>
> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context
> >> context._executeMethod:142 method exception
> >> Traceback (most recent call last):
> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132, in _executeMethod
> >>     method['method']()
> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 224, in _start
> >>     self.services.state('vdsmd', True)
> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py", line 141, in state
> >>     service=name,
> >> RuntimeError: Failed to start service 'vdsmd'
> >>
> >>
> >> From /var/log/messages:
> >>
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of the modules is not configured to work with VDSM.
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To configure the module use the following:
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure [--module module-name]'.
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all modules are not configured try to use:
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure --force'
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The force flag will stop the module's service and start it
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: afterwards automatically to load the new configuration.)
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt is already configured for vdsm
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is configured for vdsm
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: libvirt is already configured for vdsm
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: multipath requires configuration
> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Modules sanlock, multipath are not configured
>
This means the host was not deployed correctly: when deploying vdsm,
host-deploy must run "vdsm-tool configure --force", which configures
multipath and sanlock.
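
A minimal manual check on the failed host would be the following; every
command here is taken from the logs and error messages quoted above:

    vdsm-tool configure --force     # configures all modules, including sanlock and multipath
    systemctl start vdsmd.service
    systemctl status vdsmd.service  # confirm the service came up this time
    journalctl -xe                  # if not, look for the failing configurator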
We did not change anything in the multipath and sanlock configurators lately.
Didi, can you check this?