On Tue, Aug 22, 2017, 12:57 Yedidyah Bar David <didi@redhat.com> wrote:
On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <amarchuk@redhat.com> wrote:
> Hello All.
>
> Any news on this? I see the latest failures for vdsm is the same [1] and
> the job is still not working for it.
>
> [1]
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
This log has:
2017-08-22 03:51:28,272-0400 DEBUG otopi.context
context._executeMethod:128 Stage closeup METHOD
otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
2017-08-22 03:51:28,272-0400 DEBUG
otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813
execute: ('/bin/vdsm-tool', 'configure', '--force'),
executable='None', cwd='None', env=None
2017-08-22 03:51:30,687-0400 DEBUG
otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863
execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
2017-08-22 03:51:30,688-0400 DEBUG
otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921
execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
abrt is not configured for vdsm
WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
on vdsm configuration
lvm requires configuration
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
multipath requires configuration
Running configure...
Reconfiguration of abrt is done.
Reconfiguration of passwd is done.
WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
on vdsm configuration
Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
Units need configuration: {'lvm2-lvmetad.service': {'LoadState':
'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket':
{'LoadState': 'loaded', 'ActiveState': 'active'}}
Reconfiguration of lvm is done.
Reconfiguration of sebool is done.
2017-08-22 03:51:30,688-0400 DEBUG
otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926
execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
Error: ServiceNotExistError: Tried all alternatives but failed:
ServiceNotExistError: dev-hugepages1G.mount is not native systemctl service
ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
2017-08-22 03:51:30,689-0400 WARNING
otopi.plugins.ovirt_host_deploy.vdsm.packages
packages._reconfigure:155 Cannot configure vdsm
Nir, any idea?

Looks like some configurator failed after sebool, but we don't have a proper error message with the name of the configurator.

Piotr, can you take a look?
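
Until the error message names the configurator, the stdout and stderr quoted above can at least be mined for what did finish and which unit was missing. A rough sketch, assuming only the line formats visible in this log ("Reconfiguration of X is done." and "ServiceNotExistError: <unit> is not ..."); the function names are made up for illustration and are not part of vdsm-tool:

```python
import re

# Matches stdout lines such as "Reconfiguration of sebool is done."
DONE_RE = re.compile(r"^Reconfiguration of (\S+) is done\.$")
# Matches stderr lines such as
# "ServiceNotExistError: dev-hugepages1G.mount is not a SysV service"
UNIT_RE = re.compile(r"ServiceNotExistError: (\S+) is not ")


def completed_configurators(stdout):
    """Return configurator names in the order they reported completion."""
    names = []
    for line in stdout.splitlines():
        match = DONE_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def missing_units(stderr):
    """Return the distinct unit names named in ServiceNotExistError lines."""
    units = []
    for unit in UNIT_RE.findall(stderr):
        if unit not in units:
            units.append(unit)
    return units
```

Fed the output quoted above, this should report abrt, passwd, lvm and sebool as completed and dev-hugepages1G.mount as the missing unit. It only narrows the failure down to whatever runs after sebool; it does not identify the configurator itself.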
>
>
>
> On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <nsoffer@redhat.com> wrote:
>>
>> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken@redhat.com> wrote:
>>>
>>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
>>> >
>>> >
>>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky <dbelenky@redhat.com>
>>> > wrote:
>>> >>
>>> >> Failed test: basic_suite_master/002_bootstrap
>>> >> Version: oVirt Master
>>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
>>> >> Link to logs (Jenkins): test logs
>>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
>>> >>
>>> >> From what I was able to find, it seems that for some reason VDSM failed
>>> >> to start on host 1. The VDSM log is empty, and the only error I could
>>> >> find in supervdsm.log is that the start of LLDP failed (not sure if it's
>>> >> related).
>>> >
>>> >
>>> > Can you check the networking on the hosts? Something's very strange
>>> > there.
>>> > For example:
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]:
>>> > <info>
>>> > [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device
>>> > (/org/freedesktop/NetworkManager/Devices/17)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting xmit hash policy to layer2+3 (2)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting xmit hash policy to encap2+3 (3)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting xmit hash policy to encap3+4 (4)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > option xmit_hash_policy: invalid value (5)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting primary_reselect to always (0)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting primary_reselect to better (1)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting primary_reselect to failure (2)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > option primary_reselect: invalid value (3)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting arp_all_targets to any (0)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > Setting arp_all_targets to all (1)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>>> > option arp_all_targets: invalid value (2)
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding:
>>> > e7NZWeNDXwIjQia is being deleted...
>>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event
>>> > interface): No buffer space available
>>> >
>>> > Y.
>>>
>>>
>>>
>>> The post-boot noise with funny-looking bonds is due to our calling of
>>> `vdsm-tool dump-bonding-options` every boot, in order to find the
>>> bonding defaults for the current kernel.
>>>
>>> >
>>> >>
>>> >> From host-deploy log:
>>> >>
>>> >> 2017-08-19 16:38:41,476-0400 DEBUG
>>> >> otopi.plugins.otopi.services.systemd
>>> >> systemd.state:130 starting service vdsmd
>>> >> 2017-08-19 16:38:41,476-0400 DEBUG
>>> >> otopi.plugins.otopi.services.systemd
>>> >> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
>>> >> 'vdsmd.service'),
>>> >> executable='None', cwd='None', env=None
>>> >> 2017-08-19 16:38:44,628-0400 DEBUG
>>> >> otopi.plugins.otopi.services.systemd
>>> >> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
>>> >> 'vdsmd.service'), rc=1
>>> >> 2017-08-19 16:38:44,630-0400 DEBUG
>>> >> otopi.plugins.otopi.services.systemd
>>> >> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
>>> >> 'vdsmd.service') stdout:
>>> >>
>>> >>
>>> >> 2017-08-19 16:38:44,630-0400 DEBUG
>>> >> otopi.plugins.otopi.services.systemd
>>> >> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
>>> >> 'vdsmd.service') stderr:
>>> >> Job for vdsmd.service failed because the control process exited with
>>> >> error
>>> >> code. See "systemctl status vdsmd.service" and "journalctl -xe" for
>>> >> details.
>>> >>
>>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context
>>> >> context._executeMethod:142 method exception
>>> >> Traceback (most recent call last):
>>> >> File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132,
>>> >> in
>>> >> _executeMethod
>>> >> method['method']()
>>> >> File
>>> >>
>>> >> "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
>>> >> line 224, in _start
>>> >> self.services.state('vdsmd', True)
>>> >> File
>>> >> "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py",
>>> >> line 141, in state
>>> >> service=name,
>>> >> RuntimeError: Failed to start service 'vdsmd'
>>> >>
>>> >>
>>> >> From /var/log/messages:
>>> >>
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> Error:
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> One of
>>> >> the modules is not configured to work with VDSM.
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To
>>> >> configure the module use the following:
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> 'vdsm-tool configure [--module module-name]'.
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If
>>> >> all
>>> >> modules are not configured try to use:
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> 'vdsm-tool configure --force'
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> (The
>>> >> force flag will stop the module's service and start it
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> afterwards automatically to load the new configuration.)
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> abrt
>>> >> is already configured for vdsm
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> lvm is
>>> >> configured for vdsm
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> libvirt is already configured for vdsm
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> multipath requires configuration
>>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>>> >> Modules sanlock, multipath are not configured
>>
>>
>> This means the host was not deployed correctly. When deploying vdsm,
>> host-deploy must run "vdsm-tool configure --force", which configures
>> multipath and sanlock.
>>
>> We did not change anything in multipath and sanlock configurators lately.
>>
>> Didi, can you check this?
>>
>
>
>
>
> --
> Anton Marchukov
> Team Lead - Release Management - RHV DevOps - Red Hat
>
--
Didi