<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 23, 2017 at 9:21 AM, Nir Soffer <span dir="ltr"><<a href="mailto:nsoffer@redhat.com" target="_blank">nsoffer@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span class="gmail-"><div dir="ltr">On Tue, Aug 22, 2017 at 1:48 PM Dan Kenigsberg <<a href="mailto:danken@redhat.com" target="_blank">danken@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This seems to be my fault, <a href="https://gerrit.ovirt.org/80908" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/80908</a> should fix it.<br></blockquote><div><br></div></span><div>This fix the actual error, but we still have bad logging.</div><div><br></div><div>Piotr, can you fix error handling so we get something like:</div><div><br></div><div> Error configuring "foobar": actual error...</div></div></div></blockquote><div><br></div><div>Thank you for your suggestion</div><div><br></div><div>Yes, we will improve the logging. <br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div><div class="gmail-h5"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
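Roughly along these lines (a sketch only; `configurators` and the `name`
attribute are illustrative, not vdsm-tool's exact internals):

    # Sketch: run every configurator, collect failures together with
    # the module name, and report them all at the end, e.g.
    #     Error configuring "sebool": actual error...
    def configure_all(configurators):
        errors = []
        for c in configurators:
            try:
                c.configure()
            except Exception as e:
                errors.append('Error configuring "%s": %s' % (c.name, e))
        if errors:
            raise RuntimeError('\n'.join(errors))

That way a failure in one module no longer hides which module raised it.
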
On Tue, Aug 22, 2017 at 1:14 PM, Nir Soffer <nsoffer@redhat.com> wrote:
>
>
> On Tue, Aug 22, 2017, 12:57 Yedidyah Bar David <didi@redhat.com> wrote:
>>
>> On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <amarchuk@redhat.com>
>> wrote:
>> > Hello All.
>> >
>> > Any news on this? I see the latest failure for vdsm is the same [1],
>> > and the job is still not working for it.
>> >
>> > [1]
>> >
>> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
>>
>> This log has:
>>
>> 2017-08-22 03:51:28,272-0400 DEBUG otopi.context
>> context._executeMethod:128 Stage closeup METHOD
>> otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
>> 2017-08-22 03:51:28,272-0400 DEBUG
>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813
>> execute: ('/bin/vdsm-tool', 'configure', '--force'),
>> executable='None', cwd='None', env=None
>> 2017-08-22 03:51:30,687-0400 DEBUG
>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863
>> execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
>> 2017-08-22 03:51:30,688-0400 DEBUG
>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921
>> execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
>>
>> Checking configuration status...
>>
>> abrt is not configured for vdsm
>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
>> on vdsm configuration
>> lvm requires configuration
>> libvirt is not configured for vdsm yet
>> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
>> vdsm.conf with ssl=True requires the following changes:
>> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
>> qemu.conf: spice_tls=1.
>> multipath requires configuration
>>
>> Running configure...
>> Reconfiguration of abrt is done.
>> Reconfiguration of passwd is done.
>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
>> on vdsm configuration
>> Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
>> Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
>> Units need configuration: {'lvm2-lvmetad.service': {'LoadState':
>> 'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket':
>> {'LoadState': 'loaded', 'ActiveState': 'active'}}
>> Reconfiguration of lvm is done.
>> Reconfiguration of sebool is done.
>>
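For reference, the TLS settings that the FAILED check above demands map
to these lines (assuming the stock libvirt config paths):

    # /etc/libvirt/libvirtd.conf  (assumed stock path)
    listen_tcp = 0
    auth_tcp = "sasl"
    listen_tls = 1

    # /etc/libvirt/qemu.conf  (assumed stock path)
    spice_tls = 1
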
>> 2017-08-22 03:51:30,688-0400 DEBUG
>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926
>> execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
>> Error: ServiceNotExistError: Tried all alternatives but failed:
>> ServiceNotExistError: dev-hugepages1G.mount is not native systemctl
>> service
>> ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
>>
>>
>> 2017-08-22 03:51:30,689-0400 WARNING
>> otopi.plugins.ovirt_host_deploy.vdsm.packages
>> packages._reconfigure:155 Cannot configure vdsm
>>
>> Nir, any idea?
>
>
> Looks like some configurator failed after sebool, but we don't have a
> proper error message with the name of the configurator.
>
> Piotr, can you take a look?
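For context, the "Tried all alternatives but failed" wording in the
stderr above comes from trying each service backend in turn. An assumed
sketch of that dispatch (not vdsm's exact code):

    # Each backend (native systemctl, then SysV) is tried in turn; if
    # none recognizes the unit, the per-backend errors are joined into
    # a single ServiceNotExistError.
    class ServiceNotExistError(Exception):
        pass

    def run_alternatives(backends, unit, verb):
        errors = []
        for backend in backends:
            try:
                return backend(unit, verb)
            except ServiceNotExistError as e:
                errors.append('ServiceNotExistError: %s' % e)
        raise ServiceNotExistError(
            'Tried all alternatives but failed:\n' + '\n'.join(errors))

Here neither backend recognizes dev-hugepages1G.mount, hence the two
nested errors in the log.
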
>
>
>>
>> >
>> >
>> >
>> > On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <nsoffer@redhat.com> wrote:
>> >>
>> >> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <danken@redhat.com>
>> >> wrote:
>> >>>
>> >>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
>> >>> >
>> >>> >
>> >>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky
>> >>> > <dbelenky@redhat.com> wrote:
>> >>> >>
>> >>> >> Failed test: basic_suite_master/002_bootstrap
>> >>> >> Version: oVirt Master
>> >>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
>> >>> >> Link to logs (Jenkins): test logs
>> >>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
>> >>> >>
>> >>> >> From what I was able to find, it seems that for some reason VDSM
>> >>> >> failed to start on host 1. The VDSM log is empty, and the only
>> >>> >> error I could find in supervdsm.log is that the start of LLDP
>> >>> >> failed (not sure if it's related).
>> >>> >
>> >>> >
>> >>> > Can you check the networking on the hosts? Something's very strange
>> >>> > there. For example:
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]:
>> >>> > <info> [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device
>> >>> > (/org/freedesktop/NetworkManager/Devices/17)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting xmit hash policy to layer2+3 (2)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting xmit hash policy to encap2+3 (3)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting xmit hash policy to encap3+4 (4)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > option xmit_hash_policy: invalid value (5)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting primary_reselect to always (0)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting primary_reselect to better (1)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting primary_reselect to failure (2)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > option primary_reselect: invalid value (3)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting arp_all_targets to any (0)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > Setting arp_all_targets to all (1)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia:
>> >>> > option arp_all_targets: invalid value (2)
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding:
>> >>> > e7NZWeNDXwIjQia is being deleted...
>> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event
>> >>> > interface): No buffer space available
>> >>> >
>> >>> > Y.
>> >>>
>> >>>
>> >>>
>> >>> The post-boot noise with the funny-looking bonds is due to our calling
>> >>> `vdsm-tool dump-bonding-options` on every boot, in order to find the
>> >>> bonding defaults for the current kernel.
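A simplified sketch of what that probing presumably does (an assumption
about its shape, not vdsm's exact code): create a throwaway bond via
sysfs, read its option defaults, and delete it again, which explains the
short-lived, randomly named bond. The real tool also writes candidate
option values, which is what produces the "Setting .../invalid value"
kernel lines above.

    # Assumed sketch of probing kernel bonding defaults via sysfs.
    import os
    import random
    import string

    BONDING_MASTERS = '/sys/class/net/bonding_masters'

    def read_option(path):
        with open(path) as f:
            return f.read().strip()

    def dump_bonding_defaults():
        # Throwaway bond with a random name, like e7NZWeNDXwIjQia above.
        name = ''.join(random.choice(string.ascii_letters)
                       for _ in range(15))
        with open(BONDING_MASTERS, 'w') as f:
            f.write('+%s' % name)  # create the scratch bond
        try:
            opts = '/sys/class/net/%s/bonding' % name
            return dict((opt, read_option(os.path.join(opts, opt)))
                        for opt in os.listdir(opts))
        finally:
            with open(BONDING_MASTERS, 'w') as f:
                f.write('-%s' % name)  # delete it again
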
>> >>>
>> >>> >
>> >>> >>
>> >>> >> From host-deploy log:
>> >>> >>
>> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd
>> >>> >> systemd.state:130 starting service vdsmd
>> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd
>> >>> >> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
>> >>> >> 'vdsmd.service'), executable='None', cwd='None', env=None
>> >>> >> 2017-08-19 16:38:44,628-0400 DEBUG otopi.plugins.otopi.services.systemd
>> >>> >> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
>> >>> >> 'vdsmd.service'), rc=1
>> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd
>> >>> >> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
>> >>> >> 'vdsmd.service') stdout:
>> >>> >>
>> >>> >>
>> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd
>> >>> >> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
>> >>> >> 'vdsmd.service') stderr:
>> >>> >> Job for vdsmd.service failed because the control process exited with
>> >>> >> error code. See "systemctl status vdsmd.service" and "journalctl -xe"
>> >>> >> for details.
>> >>> >>
>> >>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context
>> >>> >> context._executeMethod:142 method exception
>> >>> >> Traceback (most recent call last):
>> >>> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132,
>> >>> >> in _executeMethod
>> >>> >>     method['method']()
>> >>> >>   File
>> >>> >> "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
>> >>> >> line 224, in _start
>> >>> >>     self.services.state('vdsmd', True)
>> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py",
>> >>> >> line 141, in state
>> >>> >>     service=name,
>> >>> >> RuntimeError: Failed to start service 'vdsmd'
>> >>> >>
>> >>> >>
>> >>> >> From /var/log/messages:
>> >>> >>
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of
>> >>> >> the modules is not configured to work with VDSM.
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To
>> >>> >> configure the module use the following:
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>> >>> >> 'vdsm-tool configure [--module module-name]'.
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all
>> >>> >> modules are not configured try to use:
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>> >>> >> 'vdsm-tool configure --force'
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The
>> >>> >> force flag will stop the module's service and start it
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>> >>> >> afterwards automatically to load the new configuration.)
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt
>> >>> >> is already configured for vdsm
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is
>> >>> >> configured for vdsm
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>> >>> >> libvirt is already configured for vdsm
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>> >>> >> multipath requires configuration
>> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh:
>> >>> >> Modules sanlock, multipath are not configured
>> >>
>> >>
>> >> This means the host was not deployed correctly. When deploying vdsm,
>> >> host-deploy must run "vdsm-tool configure --force", which configures
>> >> multipath and sanlock.
>> >>
>> >> We did not change anything in the multipath and sanlock configurators
>> >> lately.
>> >>
>> >> Didi, can you check this?
>> >>
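To reproduce this outside host-deploy, one can run the same command the
otopi plugin executes (per the execute: lines in the logs above); a
minimal sketch:

    # Re-run the command host-deploy runs and surface its stderr, which
    # is where the configurator errors end up.
    import subprocess

    proc = subprocess.Popen(
        ['/bin/vdsm-tool', 'configure', '--force'],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    print(out)
    if proc.returncode != 0:
        raise RuntimeError('Cannot configure vdsm: %s' % err)
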
>> >
>> > --
>> > Anton Marchukov
>> > Team Lead - Release Management - RHV DevOps - Red Hat
>>
>> --
>> Didi
>