[ovirt-devel] [OST Failure Report] [oVirt Master] [002_bootstrap] [20/08/17]

Anton Marchukov amarchuk at redhat.com
Tue Aug 22 09:52:53 UTC 2017


Hello All.

Any news on this? I see that the latest failures for vdsm are the same [1] and
the job is still failing for it.

[1]
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log



On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <nsoffer at redhat.com> wrote:

> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken at redhat.com> wrote:
>
>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
>> >
>> >
>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky <dbelenky at redhat.com>
>> wrote:
>> >>
>> >> Failed test: basic_suite_master/002_bootstrap
>> >> Version: oVirt Master
>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
>> >> Link to logs (Jenkins): test logs
>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
>> >>
>> >> From what I was able to find, it seems that for some reason VDSM
>> >> failed to start on host 1. The VDSM log is empty, and the only error I
>> >> could find in supervdsm.log is that starting LLDP failed (not sure if
>> >> it's related).
>> >
>> >
>> > Can you check the networking on the hosts? Something's very strange there.
>> > For example:
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]: <info>  [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device (/org/freedesktop/NetworkManager/Devices/17)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to layer2+3 (2)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap2+3 (3)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap3+4 (4)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option xmit_hash_policy: invalid value (5)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to always (0)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to better (1)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to failure (2)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option primary_reselect: invalid value (3)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to any (0)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to all (1)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option arp_all_targets: invalid value (2)
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding: e7NZWeNDXwIjQia is being deleted...
>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event interface): No buffer space available
>> >
>> > Y.
>>
>>
>>
>> The post-boot noise with the funny-looking bond names is due to our running
>> `vdsm-tool dump-bonding-options` on every boot, in order to find the
>> bonding defaults for the current kernel.
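>>
>> For what it's worth, a rough sketch (not vdsm's actual code) of how such a
>> probe can look: create a throwaway bond, read the defaults the kernel gives
>> it from sysfs, then delete it again. That is why the journal shows a
>> randomly named bond briefly appearing and then being removed; the kernel
>> lines about setting each option and hitting an "invalid value" suggest the
>> tool also walks the possible values of each option while it is at it.
>> Roughly, as root with the bonding module loaded (the bond name here is
>> arbitrary):
>>
>>     bond="tmpbond$RANDOM"
>>     echo "+$bond" > /sys/class/net/bonding_masters           # create a temporary bond
>>     grep -H . "/sys/class/net/$bond/bonding/"* 2>/dev/null   # dump the kernel's defaults
>>     echo "-$bond" > /sys/class/net/bonding_masters           # and remove it again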
>>
>> >
>> >>
>> >> From host-deploy log:
>> >>
>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service vdsmd
>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'vdsmd.service'), executable='None', cwd='None', env=None
>> >> 2017-08-19 16:38:44,628-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'vdsmd.service'), rc=1
>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stdout:
>> >>
>> >>
>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stderr:
>> >> Job for vdsmd.service failed because the control process exited with error code. See "systemctl status vdsmd.service" and "journalctl -xe" for details.
>> >>
>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context context._executeMethod:142 method exception
>> >> Traceback (most recent call last):
>> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132, in _executeMethod
>> >>     method['method']()
>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 224, in _start
>> >>     self.services.state('vdsmd', True)
>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py", line 141, in state
>> >>     service=name,
>> >> RuntimeError: Failed to start service 'vdsmd'
>> >>
>> >>
>> >> From /var/log/messages:
>> >>
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of the modules is not configured to work with VDSM.
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To configure the module use the following:
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure [--module module-name]'.
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all modules are not configured try to use:
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure --force'
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The force flag will stop the module's service and start it
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: afterwards automatically to load the new configuration.)
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt is already configured for vdsm
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is configured for vdsm
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: libvirt is already configured for vdsm
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: multipath requires configuration
>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Modules sanlock, multipath are not configured
>>
>
> This means the host was not deployed correctly. When deploying vdsm,
> host deploy must run "vdsm-tool configure --force", which configures
> multipath and sanlock.
>
> We did not change anything in the multipath and sanlock configurators lately.
>
> Didi, can you check this?
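>
> For reference, a minimal sketch of the manual recovery one would try on the
> failed host ("vdsm-tool configure --force" and the systemctl start command
> are the ones from the logs above; the is-configured call is just an extra
> sanity check):
>
>     vdsm-tool configure --force      # (re)configures sanlock, multipath and the rest
>     vdsm-tool is-configured          # should no longer report unconfigured modules
>     systemctl start vdsmd.service
>     systemctl status vdsmd.service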
>



-- 
Anton Marchukov
Team Lead - Release Management - RHV DevOps - Red Hat