[ovirt-devel] [ ovirt-devel ] [OST Failure Report ] [ oVirt Master ] [ 002_bootstrap ] [ 20/08/17 ]

Dan Kenigsberg danken at redhat.com
Wed Aug 23 07:26:22 UTC 2017


Oh, and I'd like to thank OST & team for catching my blunder.

On Tue, Aug 22, 2017 at 1:48 PM, Dan Kenigsberg <danken at redhat.com> wrote:
> This seems to be my fault, https://gerrit.ovirt.org/80908 should fix it.
>
> On Tue, Aug 22, 2017 at 1:14 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>>
>>
>> On יום ג׳, 22 באוג׳ 2017, 12:57 Yedidyah Bar David <didi at redhat.com> wrote:
>>>
>>> On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <amarchuk at redhat.com>
>>> wrote:
>>> > Hello All.
>>> >
>>> > Any news on this? I see that the latest failure for vdsm is the same
>>> > [1] and the job is still not working for it.
>>> >
>>> > [1]
>>> >
>>> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
>>>
>>> This log has:
>>>
>>> 2017-08-22 03:51:28,272-0400 DEBUG otopi.context
>>> context._executeMethod:128 Stage closeup METHOD
>>> otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
>>> 2017-08-22 03:51:28,272-0400 DEBUG
>>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813
>>> execute: ('/bin/vdsm-tool', 'configure', '--force'),
>>> executable='None', cwd='None', env=None
>>> 2017-08-22 03:51:30,687-0400 DEBUG
>>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863
>>> execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
>>> 2017-08-22 03:51:30,688-0400 DEBUG
>>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921
>>> execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
>>>
>>> Checking configuration status...
>>>
>>> abrt is not configured for vdsm
>>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
>>> on vdsm configuration
>>> lvm requires configuration
>>> libvirt is not configured for vdsm yet
>>> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
>>> vdsm.conf with ssl=True requires the following changes:
>>> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
>>> qemu.conf: spice_tls=1.
>>> multipath requires configuration
>>>
>>> Running configure...
>>> Reconfiguration of abrt is done.
>>> Reconfiguration of passwd is done.
>>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
>>> on vdsm configuration
>>> Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
>>> Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
>>> Units need configuration: {'lvm2-lvmetad.service': {'LoadState':
>>> 'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket':
>>> {'LoadState': 'loaded', 'ActiveState': 'active'}}
>>> Reconfiguration of lvm is done.
>>> Reconfiguration of sebool is done.
>>>
>>> 2017-08-22 03:51:30,688-0400 DEBUG
>>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926
>>> execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
>>> Error:  ServiceNotExistError: Tried all alternatives but failed:
>>> ServiceNotExistError: dev-hugepages1G.mount is not native systemctl
>>> service
>>> ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
>>>
>>>
>>> 2017-08-22 03:51:30,689-0400 WARNING
>>> otopi.plugins.ovirt_host_deploy.vdsm.packages
>>> packages._reconfigure:155 Cannot configure vdsm
>>>
>>> Nir, any idea?
>>
>>
>> Looks like some configurator failed after sebool, but we don't have a
>> proper error message naming the failing configurator.
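>>
>> To illustrate what I mean, here is a minimal sketch of a configurator
>> runner that keeps the failing module's name. This is not the actual
>> vdsm-tool code; the function and attribute names are made up for
>> illustration only:
>>
>>     def configure_all(configurators):
>>         """Run every configurator and report which ones failed."""
>>         failures = []
>>         for c in configurators:
>>             try:
>>                 c.configure()
>>                 print("Reconfiguration of %s is done." % c.name)
>>             except Exception as e:
>>                 # remember who failed and why, instead of losing the name
>>                 failures.append((c.name, e))
>>                 print("FAILED to configure %s: %s" % (c.name, e))
>>         if failures:
>>             raise RuntimeError(
>>                 "Configuration failed for: %s"
>>                 % ", ".join(name for name, _ in failures))
>>
>> With something like this, the host-deploy log would at least tell us
>> which configurator raised the ServiceNotExistError.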
>>
>> Piotr, can you take a look?
>>
>>
>>>
>>> >
>>> >
>>> >
>>> > On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>>> >>
>>> >> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken at redhat.com>
>>> >> wrote:
>>> >>>
>>> >>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>> >>> >
>>> >>> >
>>> >>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky
>>> >>> > <dbelenky at redhat.com>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> Failed test: basic_suite_master/002_bootstrap
>>> >>> >> Version: oVirt Master
>>> >>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
>>> >>> >> Link to logs (Jenkins): test logs
>>> >>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
>>> >>> >>
>>> >>> >> From what I was able to find, it seems that for some reason VDSM
>>> >>> >> failed to start on host 1. The VDSM log is empty, and the only error
>>> >>> >> I could find in supervdsm.log is that starting LLDP failed (not sure
>>> >>> >> if it's related).
>>> >>> >
>>> >>> >
>>> >>> > Can you check the networking on the hosts? Something's very strange
>>> >>> > there.
>>> >>> > For example:
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]:
>>> >>> > <info>
>>> >>> > [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device
>>> >>> > (/org/freedesktop/NetworkManager/Devices/17)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting xmit hash policy to layer2+3 (2)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting xmit hash policy to encap2+3 (3)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting xmit hash policy to encap3+4 (4)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > option xmit_hash_policy: invalid value (5)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting primary_reselect to always (0)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting primary_reselect to better (1)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting primary_reselect to failure (2)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > option primary_reselect: invalid value (3)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting arp_all_targets to any (0)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > Setting arp_all_targets to all (1)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel:
>>> >>> > e7NZWeNDXwIjQia:
>>> >>> > option arp_all_targets: invalid value (2)
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding:
>>> >>> > e7NZWeNDXwIjQia is being deleted...
>>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event
>>> >>> > interface): No buffer space available
>>> >>> >
>>> >>> > Y.
>>> >>>
>>> >>>
>>> >>>
>>> >>> The post-boot noise with funny-looking bonds is due to our calling
>>> >>> `vdsm-tool dump-bonding-options` on every boot, in order to find the
>>> >>> bonding defaults for the current kernel.
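>>> >>>
>>> >>> Roughly, the tool creates a short-lived bond with a random name and
>>> >>> reads its sysfs defaults before removing it again. A simplified sketch
>>> >>> of the idea (not the actual vdsm code; the real tool also probes each
>>> >>> option's possible values, which is presumably where the "invalid
>>> >>> value" kernel lines come from):
>>> >>>
>>> >>>     import os
>>> >>>     import random
>>> >>>     import string
>>> >>>
>>> >>>     BONDING_MASTERS = '/sys/class/net/bonding_masters'
>>> >>>
>>> >>>     def dump_bond_defaults():
>>> >>>         # needs root and the bonding module loaded
>>> >>>         name = ''.join(
>>> >>>             random.choice(string.ascii_letters) for _ in range(15))
>>> >>>         with open(BONDING_MASTERS, 'w') as f:
>>> >>>             f.write('+%s' % name)  # create the temporary bond
>>> >>>         try:
>>> >>>             opts_dir = '/sys/class/net/%s/bonding' % name
>>> >>>             defaults = {}
>>> >>>             for opt in os.listdir(opts_dir):
>>> >>>                 try:
>>> >>>                     with open(os.path.join(opts_dir, opt)) as f:
>>> >>>                         defaults[opt] = f.read().strip()
>>> >>>                 except IOError:
>>> >>>                     # some attributes are unreadable in this mode
>>> >>>                     continue
>>> >>>             return defaults
>>> >>>         finally:
>>> >>>             with open(BONDING_MASTERS, 'w') as f:
>>> >>>                 f.write('-%s' % name)  # delete the temporary bond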
>>> >>>
>>> >>> >
>>> >>> >>
>>> >>> >> From host-deploy log:
>>> >>> >>
>>> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG
>>> >>> >> otopi.plugins.otopi.services.systemd
>>> >>> >> systemd.state:130 starting service vdsmd
>>> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG
>>> >>> >> otopi.plugins.otopi.services.systemd
>>> >>> >> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
>>> >>> >> 'vdsmd.service'),
>>> >>> >> executable='None', cwd='None', env=None
>>> >>> >> 2017-08-19 16:38:44,628-0400 DEBUG
>>> >>> >> otopi.plugins.otopi.services.systemd
>>> >>> >> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
>>> >>> >> 'vdsmd.service'), rc=1
>>> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG
>>> >>> >> otopi.plugins.otopi.services.systemd
>>> >>> >> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
>>> >>> >> 'vdsmd.service') stdout:
>>> >>> >>
>>> >>> >>
>>> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG
>>> >>> >> otopi.plugins.otopi.services.systemd
>>> >>> >> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
>>> >>> >> 'vdsmd.service') stderr:
>>> >>> >> Job for vdsmd.service failed because the control process exited
>>> >>> >> with
>>> >>> >> error
>>> >>> >> code. See "systemctl status vdsmd.service" and "journalctl -xe" for
>>> >>> >> details.
>>> >>> >>
>>> >>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context
>>> >>> >> context._executeMethod:142 method exception
>>> >>> >> Traceback (most recent call last):
>>> >>> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line
>>> >>> >> 132,
>>> >>> >> in
>>> >>> >> _executeMethod
>>> >>> >>     method['method']()
>>> >>> >>   File
>>> >>> >>
>>> >>> >>
>>> >>> >> "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
>>> >>> >> line 224, in _start
>>> >>> >>     self.services.state('vdsmd', True)
>>> >>> >>   File
>>> >>> >> "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py",
>>> >>> >> line 141, in state
>>> >>> >>     service=name,
>>> >>> >> RuntimeError: Failed to start service 'vdsmd'
>>> >>> >>
>>> >>> >>
>>> >>> >> From /var/log/messages:
>>> >>> >>
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> Error:
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> One of
>>> >>> >> the modules is not configured to work with VDSM.
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> To
>>> >>> >> configure the module use the following:
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> 'vdsm-tool configure [--module module-name]'.
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> If
>>> >>> >> all
>>> >>> >> modules are not configured try to use:
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> 'vdsm-tool configure --force'
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> (The
>>> >>> >> force flag will stop the module's service and start it
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> afterwards automatically to load the new configuration.)
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> abrt
>>> >>> >> is already configured for vdsm
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> lvm is
>>> >>> >> configured for vdsm
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> libvirt is already configured for vdsm
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> multipath requires configuration
>>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >>> >> Modules sanlock, multipath are not configured
>>> >>
>>> >>
>>> >> This means the host was not deployed correctly. When deploying vdsm,
>>> >> host deploy must run "vdsm-tool configure --force", which configures
>>> >> multipath and sanlock.
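>>> >>
>>> >> Something like this hypothetical snippet could be used on the host to
>>> >> see which modules vdsm-tool still considers unconfigured and then force
>>> >> the configuration, assuming the is-configured verb behaves as I
>>> >> remember; adjust as needed:
>>> >>
>>> >>     import subprocess
>>> >>
>>> >>     def check_and_configure(modules=('multipath', 'sanlock')):
>>> >>         # report modules that vdsm-tool considers unconfigured
>>> >>         broken = [m for m in modules if subprocess.call(
>>> >>             ['/bin/vdsm-tool', 'is-configured', '--module', m]) != 0]
>>> >>         if broken:
>>> >>             # same command that host-deploy runs (see the log above)
>>> >>             subprocess.check_call(
>>> >>                 ['/bin/vdsm-tool', 'configure', '--force'])
>>> >>         return broken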
>>> >>
>>> >> We did not change anything in the multipath and sanlock configurators
>>> >> lately.
>>> >>
>>> >> Didi, can you check this?
>>> >>
>>> >> _______________________________________________
>>> >> Devel mailing list
>>> >> Devel at ovirt.org
>>> >> http://lists.ovirt.org/mailman/listinfo/devel
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Anton Marchukov
>>> > Team Lead - Release Management - RHV DevOps - Red Hat
>>> >
>>>
>>>
>>>
>>> --
>>> Didi
>>
>>
>> _______________________________________________
>> Devel mailing list
>> Devel at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel

