[ovirt-users] More 4.1 Networking Questions
Charles Tassell
charles at islandadmin.ca
Mon Apr 10 10:59:30 UTC 2017
Ah, spoke too soon. 30 seconds later the network went down with IPv6
disabled. So it does appear to be a host forwarding problem, not a VM
problem. I have an oVirt 4.0 cluster on the same network that doesn't
have these issues, so it must be a configuration issue somewhere. Here
is a dump of the IP configuration on the host:
[07:57:26]root at ovirt730-01 ~ # ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmNet state UP qlen 1000
link/ether 18:66:da:eb:8f:c0 brd ff:ff:ff:ff:ff:ff
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 18:66:da:eb:8f:c1 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 18:66:da:eb:8f:c2 brd ff:ff:ff:ff:ff:ff
5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 18:66:da:eb:8f:c3 brd ff:ff:ff:ff:ff:ff
6: p5p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP qlen 1000
link/ether f4:e9:d4:a9:7a:f0 brd ff:ff:ff:ff:ff:ff
7: p5p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether f4:e9:d4:a9:7a:f2 brd ff:ff:ff:ff:ff:ff
8: vmNet: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
link/ether 18:66:da:eb:8f:c0 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.180/24 brd 192.168.1.255 scope global vmNet
valid_lft forever preferred_lft forever
10: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
link/ether f4:e9:d4:a9:7a:f0 brd ff:ff:ff:ff:ff:ff
inet 192.168.130.180/24 brd 192.168.130.255 scope global ovirtmgmt
valid_lft forever preferred_lft forever
11: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether fa:f3:48:35:76:8d brd ff:ff:ff:ff:ff:ff
14: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovirtmgmt state UNKNOWN qlen 1000
link/ether fe:16:3e:3f:fb:ec brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc16:3eff:fe3f:fbec/64 scope link
valid_lft forever preferred_lft forever
15: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmNet state UNKNOWN qlen 1000
link/ether fe:1a:4a:16:01:51 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc1a:4aff:fe16:151/64 scope link
valid_lft forever preferred_lft forever
16: vnet2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmNet state UNKNOWN qlen 1000
link/ether fe:1a:4a:16:01:52 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc1a:4aff:fe16:152/64 scope link
valid_lft forever preferred_lft forever
[07:57:50]root at ovirt730-01 ~ # ip route show
default via 192.168.1.254 dev vmNet
192.168.1.0/24 dev vmNet proto kernel scope link src 192.168.1.180
169.254.0.0/16 dev vmNet scope link metric 1008
169.254.0.0/16 dev ovirtmgmt scope link metric 1010
192.168.130.0/24 dev ovirtmgmt proto kernel scope link src 192.168.130.180
[07:57:53]root at ovirt730-01 ~ # ip rule show
0: from all lookup local
32760: from all to 192.168.130.0/24 iif ovirtmgmt lookup 3232268980
32761: from 192.168.130.0/24 lookup 3232268980
32762: from all to 192.168.1.0/24 iif vmNet lookup 2308294836
32763: from 192.168.1.0/24 lookup 2308294836
32766: from all lookup main
32767: from all lookup default
[07:57:58]root at ovirt730-01 ~ #
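
In case it helps narrow things down, this is roughly what I plan to
check next on the host: whether the two bridges are still picking up
IPv6 router advertisements, and what happens if I turn IPv6 off on
them entirely (interface names are the ones from the output above):

# current IPv6/RA settings on the two bridges
sysctl net.ipv6.conf.vmNet.disable_ipv6 net.ipv6.conf.vmNet.accept_ra
sysctl net.ipv6.conf.ovirtmgmt.disable_ipv6 net.ipv6.conf.ovirtmgmt.accept_ra

# temporarily disable IPv6 on both bridges (doesn't survive a reboot)
sysctl -w net.ipv6.conf.vmNet.disable_ipv6=1
sysctl -w net.ipv6.conf.ovirtmgmt.disable_ipv6=1
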
On 2017-04-10 07:54 AM, Charles Tassell wrote:
>
> Hi Everyone,
>
> Just an update: I installed a new Ubuntu guest VM and it was doing
> the same thing with the network going down. Then I disabled IPv6 and
> it has been fine for the past 10-15 minutes. So the issue seems to be
> IPv6 related, and since I don't need IPv6 I can just turn it off. The
> eth1 NIC disappearing is still worrisome, though.
>
>
> On 2017-04-10 07:13 AM, Charles Tassell wrote:
>> Hi Everyone,
>>
>> Thanks for the help, answers below.
>>
>> On 2017-04-10 05:27 AM, Sandro Bonazzola wrote:
>>> Adding Simone and Martin, replying inline.
>>>
>>> On Mon, Apr 10, 2017 at 10:16 AM, Ondrej Svoboda
>>> <osvoboda at redhat.com> wrote:
>>>
>>> Hello Charles,
>>>
>>> First, can you give us more information regarding the duplicated
>>> IPv6 addresses? Since you are going to reinstall the hosted
>>> engine, could you make sure that NetworkManager is disabled
>>> before adding the second vNIC (and perhaps even disable IPv6 and
>>> reboot as well, so we have a solid base and see what makes the
>>> difference)?
>>>
>> I disabled NetworkManager on the hosts (systemctl disable
>> NetworkManager ; service NetworkManager stop) before doing the oVirt
>> setup and rebooted to make sure that it didn't come back up. Or are
>> you referring to the hosted-engine VM? I just removed and re-added
>> the eth1 NIC in the hosted engine, and this is what showed up in
>> dmesg:
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: [1af4:1000] type 00 class 0x020000
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x10: [io 0x0000-0x001f]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x14: [mem 0x00000000-0x00000fff]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 6: assigned [mem 0xc0000000-0xc003ffff pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 4: assigned [mem 0xc0040000-0xc0043fff 64bit pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 1: assigned [mem 0xc0044000-0xc0044fff]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 0: assigned [io 0x1000-0x101f]
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: enabling device (0000 -> 0003)
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 35 for MSI/MSI-X
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 36 for MSI/MSI-X
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 37 for MSI/MSI-X
>> [Mon Apr 10 06:46:43 2017] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
>> [Mon Apr 10 06:46:43 2017] IPv6: eth1: IPv6 duplicate address fe80::21a:4aff:fe16:151 detected!
>>
>> Then when the network dropped I started getting these:
>>
>> [Mon Apr 10 06:48:00 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>> [Mon Apr 10 06:48:00 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>> [Mon Apr 10 06:49:51 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>> [Mon Apr 10 06:51:40 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>>
>> The network on eth1 would go down for a few seconds and then come
>> back up, but networking stayed solid on eth0. I disabled
>> NetworkManager on the HE VM as well to see if that makes a
>> difference, and I also disabled IPv6 with sysctl to see if that
>> helps. I'll install an Ubuntu VM on the cluster later today and see
>> if it has a similar issue.
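>>
>> For reference, this is roughly how I'm turning IPv6 off on the HE VM
>> (the same two settings, without "sysctl -w", also go into
>> /etc/sysctl.conf so they survive a reboot):
>>
>> sysctl -w net.ipv6.conf.eth0.disable_ipv6=1
>> sysctl -w net.ipv6.conf.eth1.disable_ipv6=1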
>>
>>
>>>
>>> What kind of documentation did you follow to install the hosted
>>> engine? Was it this page?
>>> https://www.ovirt.org/documentation/how-to/hosted-engine/ If
>>> so, could you file a bug against VDSM networking and attach
>>> /var/log/vdsm/vdsm.log and supervdsm.log, and make sure they
>>> include the time period from adding the second vNIC to rebooting?
>>>
>>> Second, even the vNIC going missing after reboot looks like a
>>> bug to me. Even though eth1 does not exist in the VM, can you
>>> see it defined for the VM in the engine web GUI?
>>>
>>>
>>> If the HE VM configuration wasn't flushed to the OVF_STORE yet, it
>>> makes sense that it disappeared on restart.
>>>
>> The docs I used were
>> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/chap-deploying_self-hosted_engine#Deploying_Self-Hosted_Engine_on_RHEL
>> which someone on the list pointed me to last week as being more
>> up-to-date than what is on the website (the docs on the website
>> don't seem to mention that you need to put the HE on its own
>> datastore, and they look more geared towards a bare-metal engine
>> than the self-hosted VM option).
>>
>> When I went back into the GUI and looked at the hosted-engine
>> config, the second NIC was listed there, but it wasn't showing up in
>> lspci on the VM. I removed the NIC in the GUI and re-added it, and
>> the device appeared again on the VM. What is the proper way to
>> "save" the state of the VM so that the OVF_STORE gets updated? When
>> I do anything on the HE VM that I want to test I just type "reboot",
>> but that powers down the VM. I then log in to my host and run
>> "hosted-engine --vm-start", which restarts it, but of course the
>> last time I did that it restarted without the second NIC.
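>>
>> For the next test I'll try a cleaner restart from the host rather
>> than typing "reboot" inside the VM; roughly this, if I have the
>> hosted-engine options right (I'm still not sure whether this is
>> what actually triggers the OVF_STORE update):
>>
>> hosted-engine --set-maintenance --mode=global
>> hosted-engine --vm-shutdown
>> hosted-engine --vm-status    # wait until the VM shows as down
>> hosted-engine --vm-start
>> hosted-engine --set-maintenance --mode=none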
>>
>>>
>>> The steps you took to install the hosted engine with regard to
>>> networking look good to me, but I believe Sandro (CC'ed) would
>>> be able to give more advice.
>>>
>>> Sandro, since we want to configure bonding, would you recommend
>>> installing the engine physically first, moving it to a VM
>>> according to the following method, and only then reconfiguring
>>> networking?
>>> https://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/
>>>
>>>
>>>
>>> I don't see why a direct HE deployment couldn't be done. Simone,
>>> Martin, can you help here?
>>>
>>>
>>>
>>> Thank you,
>>> Ondra
>>>
>>> On Mon, Apr 10, 2017 at 8:51 AM, Charles Tassell
>>> <ctassell at gmail.com> wrote:
>>>
>>> Hi Everyone,
>>>
>>> Okay, I'm again having problems getting basic networking set
>>> up with oVirt 4.1. Here is my situation: I have two servers I
>>> want to use to create an oVirt cluster, with two different
>>> networks. My "public" network is a 1G link on device em1
>>> connected to my Internet feed, and my "storage" network is a
>>> 10G link on device p5p1 connected to my file server. Since I
>>> need to connect to my storage network in order to do the
>>> install, I selected p5p1 as the ovirtmgmt interface when
>>> installing the hosted engine. That worked fine and I got
>>> everything installed, so I used some ssh-proxy magic to
>>> connect to the web console and completed the install (set up
>>> a storage domain, created a new network vmNet for VM
>>> networking, and added em1 to it).
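>>>
>>> (The ssh-proxy magic was just a local port forward from my
>>> workstation, something along these lines, with the engine
>>> VM's storage-network address filled in for ENGINE_ADDR:
>>>
>>> ssh -L 8443:ENGINE_ADDR:443 root@192.168.130.180
>>>
>>> and then pointing a browser at https://localhost:8443/
>>> locally.)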
>>>
>>> The problem was that when I added a second network device
>>> to the HostedEngine VM (so that I could connect to it from my
>>> public network), the network would intermittently go down. I
>>> did some digging and found IPv6 errors in dmesg ("IPv6: eth1:
>>> IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151
>>> detected!"), so I disabled IPv6 on both eth0 and eth1 in the
>>> HostedEngine VM and rebooted it. The problem is that when I
>>> restarted the VM, the eth1 device was missing.
>>>
>>> So, my question is: Can I add a second NIC to the
>>> HostedEngine VM and make it stick, or will it be deleted
>>> whenever the engine VM is restarted?
>>>
>>>
>>> When you change something in the HE VM using the web UI, it also
>>> has to be saved to the OVF_STORE to make it permanent across
>>> reboots. Martin, can you please elaborate here?
>>>
>>>
>>> Is there a better way to do what I'm trying to do? I.e.,
>>> should I set up ovirtmgmt on the public em1 interface and
>>> then create the "storage" network after the fact for
>>> connecting to the datastores and such? Is that even
>>> possible, or required? I was thinking it would be better
>>> for migrations and other management functions to happen on
>>> the faster 10G network, but if the HostedEngine doesn't
>>> need to be able to connect to the storage network, maybe
>>> it's not worth the effort?
>>>
>>> Eventually I want to set up LACP on the storage network,
>>> but the last time I tried that I had to wipe the servers and
>>> reinstall from scratch. I was thinking that was because I
>>> set up the bonding before installing oVirt, so I didn't do
>>> that this time.
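>>>
>>> For reference, the kind of bonding setup I mean is the
>>> standard CentOS ifcfg one on the two 10G ports, roughly
>>> like this (with the switch ports set to 802.3ad to match);
>>> the ovirtmgmt bridge would then sit on top of bond0 instead
>>> of p5p1 directly:
>>>
>>> ifcfg-bond0:
>>> ----------------
>>> DEVICE=bond0
>>> TYPE=Bond
>>> BONDING_MASTER=yes
>>> BONDING_OPTS="mode=802.3ad miimon=100"
>>> BOOTPROTO=none
>>> ONBOOT=yes
>>>
>>> ifcfg-p5p1 (and likewise for p5p2):
>>> ----------------
>>> DEVICE=p5p1
>>> MASTER=bond0
>>> SLAVE=yes
>>> BOOTPROTO=none
>>> ONBOOT=yes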
>>>
>>> Here are my /etc/sysconfig/network-scripts/ifcfg-* files
>>> in case I did something wrong there (I'm more familiar with
>>> Debian/Ubuntu network setup than CentOS):
>>>
>>> ifcfg-eth0: (ovirtmgmt aka storage)
>>> ----------------
>>> BROADCAST=192.168.130.255
>>> NETMASK=255.255.255.0
>>> BOOTPROTO=static
>>> DEVICE=eth0
>>> IPADDR=192.168.130.179
>>> ONBOOT=yes
>>> DOMAIN=public.net
>>> ZONE=public
>>> IPV6INIT=no
>>>
>>>
>>> ifcfg-eth1: (vmNet aka Internet)
>>> ----------------
>>> BROADCAST=192.168.1.255
>>> NETMASK=255.255.255.0
>>> BOOTPROTO=static
>>> DEVICE=eth1
>>> IPADDR=192.168.1.179
>>> GATEWAY=192.168.1.254
>>> ONBOOT=yes
>>> DNS1=192.168.1.1
>>> DNS2=192.168.1.2
>>> DOMAIN=public.net
>>> ZONE=public
>>> IPV6INIT=no
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> SANDRO BONAZZOLA
>>>
>>> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>>>
>>> Red Hat EMEA <https://www.redhat.com/>
>>>
>>
>