[ovirt-devel] [ ovirt-devel ] [OST Failure Report ] [ oVirt Master ] [ 002_bootstrap ] [ 20/08/17 ]

Piotr Kliczewski pkliczew at redhat.com
Wed Aug 23 07:30:18 UTC 2017


On Wed, Aug 23, 2017 at 9:21 AM, Nir Soffer <nsoffer at redhat.com> wrote:

> On Tue, Aug 22, 2017 at 1:48 PM Dan Kenigsberg <danken at redhat.com> wrote:
>
>> This seems to be my fault, https://gerrit.ovirt.org/80908 should fix it.
>>
>
> This fixes the actual error, but we still have bad logging.
>
> Piotr, can you fix error handling so we get something like:
>
>     Error configuring "foobar": actual error...
>

Thank you for your suggestion.

Yes, we will improve the logging.
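
Roughly what I have in mind, against a simplified configure loop (just a
sketch, not the actual vdsm-tool code; the names are illustrative):

    import logging
    import sys

    def configure_all(configurators):
        """Run every configurator, reporting failures with the module name."""
        failed = []
        for configurator in configurators:
            try:
                configurator.configure()
            except Exception as e:
                # Name the module that failed, e.g.:
                #   Error configuring "foobar": actual error...
                logging.exception(
                    'Error configuring "%s": %s', configurator.name, e)
                failed.append(configurator.name)
        if failed:
            sys.stderr.write(
                'Error: failed to configure modules: %s\n' % ', '.join(failed))
            return 1
        return 0

That way the host-deploy log would show which module failed instead of only
the generic "Cannot configure vdsm" warning.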


>
>
>>
>> On Tue, Aug 22, 2017 at 1:14 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>> >
>> >
>> > On Tue, Aug 22, 2017, 12:57 Yedidyah Bar David <didi at redhat.com> wrote:
>> >>
>> >> On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <amarchuk at redhat.com> wrote:
>> >> > Hello All.
>> >> >
>> >> > Any news on this? I see the latest failure for vdsm is the same [1]
>> >> > and the job is still not working for it.
>> >> >
>> >> > [1]
>> >> >
>> >> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
>> >>
>> >> This log has:
>> >>
>> >> 2017-08-22 03:51:28,272-0400 DEBUG otopi.context context._executeMethod:128 Stage closeup METHOD otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
>> >> 2017-08-22 03:51:28,272-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813 execute: ('/bin/vdsm-tool', 'configure', '--force'), executable='None', cwd='None', env=None
>> >> 2017-08-22 03:51:30,687-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863 execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
>> >> 2017-08-22 03:51:30,688-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
>> >>
>> >> Checking configuration status...
>> >>
>> >> abrt is not configured for vdsm
>> >> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
>> >> lvm requires configuration
>> >> libvirt is not configured for vdsm yet
>> >> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
>> >> vdsm.conf with ssl=True requires the following changes:
>> >> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
>> >> qemu.conf: spice_tls=1.
>> >> multipath requires configuration
>> >>
>> >> Running configure...
>> >> Reconfiguration of abrt is done.
>> >> Reconfiguration of passwd is done.
>> >> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
>> >> Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
>> >> Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
>> >> Units need configuration: {'lvm2-lvmetad.service': {'LoadState': 'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket': {'LoadState': 'loaded', 'ActiveState': 'active'}}
>> >> Reconfiguration of lvm is done.
>> >> Reconfiguration of sebool is done.
>> >>
>> >> 2017-08-22 03:51:30,688-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
>> >> Error:  ServiceNotExistError: Tried all alternatives but failed:
>> >> ServiceNotExistError: dev-hugepages1G.mount is not native systemctl service
>> >> ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
>> >>
>> >>
>> >> 2017-08-22 03:51:30,689-0400 WARNING otopi.plugins.ovirt_host_deploy.vdsm.packages packages._reconfigure:155 Cannot configure vdsm
>> >>
>> >> Nir, any idea?
>> >
>> >
>> > Looks like some configurator has failed after sebool, but we don't have a
>> > proper error message with the name of the configurator.
>> >
>> > Piotr, can you take a look?
>> >
>> >
>> >>
>> >> >
>> >> >
>> >> >
>> >> >> On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>> >> >>
>> >> >> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken at redhat.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
>> >> >>> >
>> >> >>> >
>> >> >>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky <dbelenky at redhat.com> wrote:
>> >> >>> >>
>> >> >>> >> Failed test: basic_suite_master/002_bootstrap
>> >> >>> >> Version: oVirt Master
>> >> >>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
>> >> >>> >> Link to logs (Jenkins): test logs
>> >> >>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
>> >> >>> >>
>> >> >>> >> From what I was able to find, it seems that for some reason VDSM failed
>> >> >>> >> to start on host 1. The VDSM log is empty, and the only error I could
>> >> >>> >> find in supervdsm.log is that the start of LLDP failed (not sure if it's
>> >> >>> >> related).
>> >> >>> >
>> >> >>> >
>> >> >>> > Can you check the networking on the hosts? Something's very strange
>> >> >>> > there.
>> >> >>> > For example:
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]: <info> [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device (/org/freedesktop/NetworkManager/Devices/17)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to layer2+3 (2)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap2+3 (3)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap3+4 (4)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option xmit_hash_policy: invalid value (5)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to always (0)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to better (1)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to failure (2)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option primary_reselect: invalid value (3)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to any (0)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to all (1)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option arp_all_targets: invalid value (2)
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding: e7NZWeNDXwIjQia is being deleted...
>> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event interface): No buffer space available
>> >> >>> >
>> >> >>> > Y.
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> The post-boot noise with funny-looking bonds is due to our calling of
>> >> >>> `vdsm-tool dump-bonding-options` every boot, in order to find the
>> >> >>> bonding defaults for the current kernel.
>> >> >>>
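>> >> >>> Roughly, the dump works by creating a throwaway bond with a random name,
>> >> >>> reading the option defaults from sysfs, and deleting the bond again,
>> >> >>> which is the create/delete churn in the journal above. An illustrative
>> >> >>> sketch only, not the actual vdsm code (the real tool does more, e.g. it
>> >> >>> collects the defaults per bonding mode):
>> >> >>>
>> >> >>>     import os
>> >> >>>     import random
>> >> >>>     import string
>> >> >>>
>> >> >>>     BONDING_MASTERS = '/sys/class/net/bonding_masters'
>> >> >>>
>> >> >>>     def dump_bond_defaults():
>> >> >>>         # Needs root and the bonding kernel module loaded.
>> >> >>>         name = ''.join(random.choice(string.ascii_letters)
>> >> >>>                        for _ in range(15))
>> >> >>>         with open(BONDING_MASTERS, 'w') as f:
>> >> >>>             f.write('+%s' % name)   # "new Bond device" in the journal
>> >> >>>         try:
>> >> >>>             opts_dir = '/sys/class/net/%s/bonding' % name
>> >> >>>             defaults = {}
>> >> >>>             for opt in os.listdir(opts_dir):
>> >> >>>                 with open(os.path.join(opts_dir, opt)) as opt_file:
>> >> >>>                     defaults[opt] = opt_file.read().strip()
>> >> >>>             return defaults
>> >> >>>         finally:
>> >> >>>             with open(BONDING_MASTERS, 'w') as f:
>> >> >>>                 f.write('-%s' % name)   # "is being deleted" in the journal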
>> >> >>> >
>> >> >>> >>
>> >> >>> >> From host-deploy log:
>> >> >>> >>
>> >> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service vdsmd
>> >> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'vdsmd.service'), executable='None', cwd='None', env=None
>> >> >>> >> 2017-08-19 16:38:44,628-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'vdsmd.service'), rc=1
>> >> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stdout:
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stderr:
>> >> >>> >> Job for vdsmd.service failed because the control process exited with error code. See "systemctl status vdsmd.service" and "journalctl -xe" for details.
>> >> >>> >>
>> >> >>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context context._executeMethod:142 method exception
>> >> >>> >> Traceback (most recent call last):
>> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132, in _executeMethod
>> >> >>> >>     method['method']()
>> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 224, in _start
>> >> >>> >>     self.services.state('vdsmd', True)
>> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py", line 141, in state
>> >> >>> >>     service=name,
>> >> >>> >> RuntimeError: Failed to start service 'vdsmd'
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> From /var/log/messages:
>> >> >>> >>
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of the modules is not configured to work with VDSM.
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To configure the module use the following:
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure [--module module-name]'.
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all modules are not configured try to use:
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure --force'
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The force flag will stop the module's service and start it
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: afterwards automatically to load the new configuration.)
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt is already configured for vdsm
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is configured for vdsm
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: libvirt is already configured for vdsm
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: multipath requires configuration
>> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Modules sanlock, multipath are not configured
>> >> >>
>> >> >>
>> >> >> This means the host was not deployed correctly. When deploying vdsm,
>> >> >> host deploy must run "vdsm-tool configure --force", which configures
>> >> >> multipath and sanlock.
>> >> >>
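>> >> >> For reference, this is roughly what that host-deploy step boils down to
>> >> >> (a simplified sketch, not the actual otopi plugin code):
>> >> >>
>> >> >>     import subprocess
>> >> >>
>> >> >>     def reconfigure_vdsm():
>> >> >>         # Same command the host-deploy log shows being executed.
>> >> >>         proc = subprocess.Popen(
>> >> >>             ('/bin/vdsm-tool', 'configure', '--force'),
>> >> >>             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>> >> >>         out, err = proc.communicate()
>> >> >>         if proc.returncode != 0:
>> >> >>             raise RuntimeError(
>> >> >>                 'Cannot configure vdsm (rc=%d): %s' % (proc.returncode, err))
>> >> >>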
>> >> >> We did not change anything in multipath and sanlock configurators
>> >> >> lately.
>> >> >>
>> >> >> Didi, can you check this?
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Anton Marchukov
>> >> > Team Lead - Release Management - RHV DevOps - Red Hat
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Didi
>> >
>> >
>>
>