On Tue, Jun 26, 2018 at 11:39 AM fsoyer <fsoyer(a)systea.fr> wrote:
Well,
unfortunately, it was a "false positive". This morning I tried again, with
the idea that at the moment the deploy asks for the final destination
of the engine, I would restart bond0 + gluster + the engine volume.
I re-launched the deploy on the second "fresh" host (yesterday's errors
left the first one in a doubtful state) with em2 and gluster+bond0 off:
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group
default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether e0:db:55:15:f0:f0 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.227/8 brd 10.255.255.255 scope global em1
valid_lft forever preferred_lft forever
inet6 fe80::e2db:55ff:fe15:f0f0/64 scope link
valid_lft forever preferred_lft forever
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
qlen 1000
link/ether e0:db:55:15:f0:f1 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
qlen 1000
link/ether e0:db:55:15:f0:f2 brd ff:ff:ff:ff:ff:ff
5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
qlen 1000
link/ether e0:db:55:15:f0:f3 brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 9000 qdisc noqueue state DOWN
group default qlen 1000
link/ether 3a:ab:a2:f2:38:5c brd ff:ff:ff:ff:ff:ff
# ip r
default via 10.0.1.254 dev em1
10.0.0.0/8 dev em1 proto kernel scope link src 10.0.0.227
169.254.0.0/16 dev em1 scope link metric 1002
... and it does NOT work this morning:
[ INFO ] TASK [Get local VM IP]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true,
"cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:01:c6:32 | awk '{ print $5 }' | cut -f1 -d'/'",
"delta": "0:00:00.083587", "end": "2018-06-26 11:26:07.581706", "rc": 0,
"start": "2018-06-26 11:26:07.498119", "stderr": "", "stderr_lines": [],
"stdout": "", "stdout_lines": []}
I'm sure that the network was the same yesterday, when my attempt finally
got past "Get local VM IP". Why not today?
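For reference, the failing task only filters the libvirt lease table by the VM's MAC address. An empty "stdout" with "rc": 0 means that no lease line matched, which suggests the VM never obtained an address on the "default" network. Below is a minimal sketch of the same filter as a shell function that reads the table from stdin, so it can be tried on saved output. The function name lease_ip_for_mac and the sample lease line are made up for illustration and are not taken from the logs.

```shell
# Hypothetical helper: same filter as the "Get local VM IP" task,
# but reading the lease table from stdin instead of calling virsh.
lease_ip_for_mac() {
    # $1 = MAC address to look for (matched case-insensitively)
    grep -i "$1" | awk '{ print $5 }' | cut -f1 -d'/'
}

# Made-up sample in the column layout printed by "virsh net-dhcp-leases";
# the date and the time count as two fields, so the address is field 5.
sample=' Expiry Time          MAC address        Protocol  IP address          Hostname  Client ID or DUID
-------------------------------------------------------------------------------------------------
 2018-06-26 12:26:07  00:16:3e:01:c6:32  ipv4      192.168.122.50/24   engine    -'

printf '%s\n' "$sample" | lease_ip_for_mac 00:16:3E:01:C6:32   # prints 192.168.122.50
```

On a host, the same function can be fed with the real table: virsh -r net-dhcp-leases default | lease_ip_for_mac 00:16:3e:01:c6:32. If that prints nothing, the lease itself is missing, not the parsing.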
After the error, the network was :
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group
default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether e0:db:55:15:f0:f0 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.227/8 brd 10.255.255.255 scope global em1
valid_lft forever preferred_lft forever
inet6 fe80::e2db:55ff:fe15:f0f0/64 scope link
valid_lft forever preferred_lft forever
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
qlen 1000
link/ether e0:db:55:15:f0:f1 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
qlen 1000
link/ether e0:db:55:15:f0:f2 brd ff:ff:ff:ff:ff:ff
5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
qlen 1000
link/ether e0:db:55:15:f0:f3 brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 9000 qdisc noqueue state DOWN
group default qlen 1000
link/ether 3a:ab:a2:f2:38:5c brd ff:ff:ff:ff:ff:ff
7: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state
UP group default qlen 1000
link/ether 52:54:00:ae:8d:93 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master
virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:ae:8d:93 brd ff:ff:ff:ff:ff:ff
9: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
master virbr0 state UNKNOWN group default qlen 1000
link/ether fe:16:3e:01:c6:32 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc16:3eff:fe01:c632/64 scope link
valid_lft forever preferred_lft forever
# ip r
default via 10.0.1.254 dev em1
10.0.0.0/8 dev em1 proto kernel scope link src 10.0.0.227
169.254.0.0/16 dev em1 scope link metric 1002
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1
So, in the end, I have no idea why this happens :(((
Can you please attach /var/log/messages and /var/log/libvirt/qemu/* ?
On Tuesday, June 26, 2018 09:21 CEST, Simone Tiraboschi <stirabos(a)redhat.com>
wrote:
On Mon, Jun 25, 2018 at 6:32 PM fsoyer <fsoyer(a)systea.fr> wrote:
> Well, replying to myself with more information.
> Thinking that the network was part of the problem, I tried to stop the
> gluster volumes, stop gluster on the host, and stop bond0.
> So the host now had just em1 with one IP.
> And... the winner is... Yes: the install passed "[Get local VM IP]"
> and continued!!
>
> I hit ctrl-c, restarted bond0, restarted the deploy: it crashed. So it seems
> that having more than one network is the problem. But! How do I install the
> engine on gluster on a separate, bonded, jumbo-frame network in this case???
>
> Can you reproduce this on your side ?
Can you please attach the output of 'ip a' in both cases?
>
> Frank
>
>
>
>
> On Monday, June 25, 2018 16:50 CEST, "fsoyer" <fsoyer(a)systea.fr> wrote:
>
>
>
>
> Hi staff,
> Installing a fresh oVirt on CentOS 7.5.1804, up to date; oVirt versions:
> # rpm -qa | grep ovirt
> ovirt-hosted-engine-ha-2.2.11-1.el7.centos.noarch
> ovirt-imageio-common-1.3.1.2-0.el7.centos.noarch
> ovirt-host-dependencies-4.2.2-2.el7.centos.x86_64
> ovirt-vmconsole-1.0.5-4.el7.centos.noarch
> ovirt-provider-ovn-driver-1.2.10-1.el7.centos.noarch
> ovirt-hosted-engine-setup-2.2.20-1.el7.centos.noarch
> ovirt-engine-appliance-4.2-20180504.1.el7.centos.noarch
> python-ovirt-engine-sdk4-4.2.6-2.el7.centos.x86_64
> ovirt-host-deploy-1.7.3-1.el7.centos.noarch
> ovirt-release42-4.2.3.1-1.el7.noarch
> ovirt-vmconsole-host-1.0.5-4.el7.centos.noarch
> cockpit-ovirt-dashboard-0.11.24-1.el7.centos.noarch
> ovirt-setup-lib-1.1.4-1.el7.centos.noarch
> ovirt-imageio-daemon-1.3.1.2-0.el7.centos.noarch
> ovirt-host-4.2.2-2.el7.centos.x86_64
> ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
>
> ON PHYSICAL SERVERS (not on VMware, why should I be ?? ;) I got exactly
> the same error :
> [ INFO ] TASK [Get local VM IP]
> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true,
> "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:69:3a:c6 | awk '{ print $5 }' | cut -f1 -d'/'",
> "delta": "0:00:00.073313", "end": "2018-06-25 16:11:36.025277", "rc": 0,
> "start": "2018-06-25 16:11:35.951964", "stderr": "", "stderr_lines": [],
> "stdout": "", "stdout_lines": []}
> [ INFO ] TASK [include_tasks]
> [ INFO ] ok: [localhost]
> [ INFO ] TASK [Remove local vm dir]
> [ INFO ] changed: [localhost]
> [ INFO ] TASK [Notify the user about a failure]
> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The
> system may not be provisioned according to the playbook results: please
> check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> [ INFO ] Stage: Clean up
>
>
> I have 4 NICs:
> em1 10.0.0.230/8 is for ovirtmgmt; it has the gateway
> em2 10.0.0.229/8 is for a VM network
> em3+em4 in bond0 192.168.0.30 are for gluster with jumbo frames; the volumes
> (ENGINE, ISO, EXPORT, DATA) are up and operational.
>
> I tried to disable em2 (ONBOOT=no and a network restart), so the network is
> currently:
> # ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group
> default qlen 1000
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> inet 127.0.0.1/8 scope host lo
> valid_lft forever preferred_lft forever
> inet6 ::1/128 scope host
> valid_lft forever preferred_lft forever
> 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
> group default qlen 1000
> link/ether e0:db:55:15:eb:70 brd ff:ff:ff:ff:ff:ff
> inet 10.0.0.230/8 brd 10.255.255.255 scope global em1
> valid_lft forever preferred_lft forever
> inet6 fe80::e2db:55ff:fe15:eb70/64 scope link
> valid_lft forever preferred_lft forever
> 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
> qlen 1000
> link/ether e0:db:55:15:eb:71 brd ff:ff:ff:ff:ff:ff
> 4: em3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master
> bond0 state UP group default qlen 1000
> link/ether e0:db:55:15:eb:72 brd ff:ff:ff:ff:ff:ff
> 5: em4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master
> bond0 state UP group default qlen 1000
> link/ether e0:db:55:15:eb:72 brd ff:ff:ff:ff:ff:ff
> 6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue
> state UP group default qlen 1000
> link/ether e0:db:55:15:eb:72 brd ff:ff:ff:ff:ff:ff
> inet 192.168.0.30/24 brd 192.168.0.255 scope global bond0
> valid_lft forever preferred_lft forever
> inet6 fe80::e2db:55ff:fe15:eb72/64 scope link
> valid_lft forever preferred_lft forever
>
> # ip r
> default via 10.0.1.254 dev em1
> 10.0.0.0/8 dev em1 proto kernel scope link src 10.0.0.230
> 169.254.0.0/16 dev em1 scope link metric 1002
> 169.254.0.0/16 dev bond0 scope link metric 1006
> 192.168.0.0/24 dev bond0 proto kernel scope link src 192.168.0.30
>
> but same issue, after "/usr/sbin/ovirt-hosted-engine-cleanup" and
> restarting the deployment.
>
> NetworkManager was stopped and disabled at the node install, and it is
> still stopped.
> After the error, the network shows this after device 6 (bond0) :
> 7: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state
> UP group default qlen 1000
> link/ether 52:54:00:38:e0:5a brd ff:ff:ff:ff:ff:ff
> inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
> valid_lft forever preferred_lft forever
> 8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master
> virbr0 state DOWN group default qlen 1000
> link/ether 52:54:00:38:e0:5a brd ff:ff:ff:ff:ff:ff
> 11: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> master virbr0 state UNKNOWN group default qlen 1000
> link/ether fe:16:3e:69:3a:c6 brd ff:ff:ff:ff:ff:ff
> inet6 fe80::fc16:3eff:fe69:3ac6/64 scope link
> valid_lft forever preferred_lft forever
>
> I do not see ovirtmgmt... And I don't know if I can access the engine VM,
> as I don't have its IP :(
> I tried to ping the addresses after 192.168.122.1, but none is reachable,
> so I stopped at 122.10. The VM seems to be up (kvm process), with the
> qemu-kvm process taking 150% of CPU in "top"...
>
> I pasted the log here: https://pastebin.com/Ebzh1uEh
>
> PLEASE! This issue seems to have been recurring since the beginning of 2018
> (see the messages here on the list:
> Jamie Lawrence in February, suporte(a)logicworks.pt in April,
> shamilkpm(a)gmail.com and Yaniv Kaul in May,
> florentl on June 1...). Can anyone give us a way to solve this?
> --
>
> Regards,
>
> *Frank Soyer *
>
>
> On Monday, June 04, 2018 16:07 CEST, Simone Tiraboschi
> <stirabos(a)redhat.com> wrote:
>
>
>
>
> On Mon, Jun 4, 2018 at 2:20 PM, Phillip Bailey <phbailey(a)redhat.com>
> wrote:
>>
>> Hi Florent,
>>
>> Could you please provide the log for the stage in which the wizard is
>> failing? Logs can be found in /var/log/ovirt-hosted-engine-setup.
>>
>> Thanks!
>>
>> -Phillip Bailey
>>
>> On Fri, Jun 1, 2018 at 7:57 AM, florentl <florentl.info(a)laposte.net>
>> wrote:
>>>
>>> Hi all,
>>> I am trying to install hosted-engine on a node: ovirt-node-ng-4.2.3-0.20180518.
>>> Every time, I get stuck on:
>>> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true,
>>> "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:6c:5a:91 | awk '{ print $5 }' | cut -f1 -d'/'",
>>> "delta": "0:00:00.108872", "end": "2018-06-01 11:17:34.421769", "rc": 0,
>>> "start": "2018-06-01 11:17:34.312897", "stderr": "", "stderr_lines": [],
>>> "stdout": "", "stdout_lines": []}
>>> I tried with a static IP address and with DHCP, but both failed.
>>>
>>> To be more specific, I installed three nodes and deployed glusterfs with
>>> the wizard. I'm in a nested virtualization environment for this lab
>>> (VMware ESXi hypervisor).
>>
>>
> Unfortunately I think that the issue is trying to run a nested env over
> ESXi.
> AFAIK nesting KVM VMs over ESXi is still problematic.
>
> I'd suggest repeating the experiment nesting over KVM on L0.
>
>
>
>>> My node IP is 192.168.176.40, and I want the hosted-engine VM to have
>>> 192.168.176.43.
>>>
>>> Thanks,
>>> Florent
>>> _______________________________________________
>>> Users mailing list -- users(a)ovirt.org
>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/F3BNUQ2T434...
>>
>>