[ovirt-users] More 4.1 Networking Questions
Charles Tassell
charles at islandadmin.ca
Mon Apr 10 10:59:30 UTC 2017
Ah, spoke too soon. 30 seconds later the network went down with IPv6
disabled. So it does appear to be a host forwarding problem, not a VM
problem. I have an oVirt 4.0 cluster on the same network that doesn't
have these issues, so it must be a configuration issue somewhere. Here
is a dump of the IP configuration on the host:
[07:57:26]root at ovirt730-01 ~ # ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmNet state UP qlen 1000
link/ether 18:66:da:eb:8f:c0 brd ff:ff:ff:ff:ff:ff
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 18:66:da:eb:8f:c1 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 18:66:da:eb:8f:c2 brd ff:ff:ff:ff:ff:ff
5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 18:66:da:eb:8f:c3 brd ff:ff:ff:ff:ff:ff
6: p5p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP qlen 1000
link/ether f4:e9:d4:a9:7a:f0 brd ff:ff:ff:ff:ff:ff
7: p5p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether f4:e9:d4:a9:7a:f2 brd ff:ff:ff:ff:ff:ff
8: vmNet: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
link/ether 18:66:da:eb:8f:c0 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.180/24 brd 192.168.1.255 scope global vmNet
valid_lft forever preferred_lft forever
10: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
link/ether f4:e9:d4:a9:7a:f0 brd ff:ff:ff:ff:ff:ff
inet 192.168.130.180/24 brd 192.168.130.255 scope global ovirtmgmt
valid_lft forever preferred_lft forever
11: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether fa:f3:48:35:76:8d brd ff:ff:ff:ff:ff:ff
14: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovirtmgmt state UNKNOWN qlen 1000
link/ether fe:16:3e:3f:fb:ec brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc16:3eff:fe3f:fbec/64 scope link
valid_lft forever preferred_lft forever
15: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmNet state UNKNOWN qlen 1000
link/ether fe:1a:4a:16:01:51 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc1a:4aff:fe16:151/64 scope link
valid_lft forever preferred_lft forever
16: vnet2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmNet state UNKNOWN qlen 1000
link/ether fe:1a:4a:16:01:52 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc1a:4aff:fe16:152/64 scope link
valid_lft forever preferred_lft forever
[07:57:50]root at ovirt730-01 ~ # ip route show
default via 192.168.1.254 dev vmNet
192.168.1.0/24 dev vmNet proto kernel scope link src 192.168.1.180
169.254.0.0/16 dev vmNet scope link metric 1008
169.254.0.0/16 dev ovirtmgmt scope link metric 1010
192.168.130.0/24 dev ovirtmgmt proto kernel scope link src 192.168.130.180
[07:57:53]root at ovirt730-01 ~ # ip rule show
0: from all lookup local
32760: from all to 192.168.130.0/24 iif ovirtmgmt lookup 3232268980
32761: from 192.168.130.0/24 lookup 3232268980
32762: from all to 192.168.1.0/24 iif vmNet lookup 2308294836
32763: from 192.168.1.0/24 lookup 2308294836
32766: from all lookup main
32767: from all lookup default
[07:57:58]root at ovirt730-01 ~ #
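
In case it helps narrow things down, this is roughly what I plan to
check next on the host: whether the two bridges are still picking up
IPv6 router advertisements, and what happens if I turn IPv6 off on
them entirely (interface names are the ones from the output above):

# current IPv6/RA settings on the two bridges
sysctl net.ipv6.conf.vmNet.disable_ipv6 net.ipv6.conf.vmNet.accept_ra
sysctl net.ipv6.conf.ovirtmgmt.disable_ipv6 net.ipv6.conf.ovirtmgmt.accept_ra

# temporarily disable IPv6 on both bridges (doesn't survive a reboot)
sysctl -w net.ipv6.conf.vmNet.disable_ipv6=1
sysctl -w net.ipv6.conf.ovirtmgmt.disable_ipv6=1
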
On 2017-04-10 07:54 AM, Charles Tassell wrote:
>
> Hi Everyone,
>
> Just an update: I installed a new Ubuntu guest VM and it was doing
> the same thing with the network going down. Then I disabled IPv6 and
> it has been fine for the past 10-15 minutes. So the issue seems to be
> IPv6 related, and since I don't need IPv6 I can just turn it off. The
> eth1 NIC disappearing is still worrisome, though.
>
>
> On 2017-04-10 07:13 AM, Charles Tassell wrote:
>> Hi Everyone,
>>
>> Thanks for the help, answers below.
>>
>> On 2017-04-10 05:27 AM, Sandro Bonazzola wrote:
>>> Adding Simone and Martin, replying inline.
>>>
>>> On Mon, Apr 10, 2017 at 10:16 AM, Ondrej Svoboda
>>> <osvoboda at redhat.com> wrote:
>>>
>>> Hello Charles,
>>>
>>> First, can you give us more information regarding the duplicated
>>> IPv6 addresses? Since you are going to reinstall the hosted
>>> engine, could you make sure that NetworkManager is disabled
>>> before adding the second vNIC (and perhaps even disable IPv6 and
>>> reboot as well, so we have a solid base and see what makes the
>>> difference)?
>>>
>> I disabled NetworkManager on the hosts (systemctl disable
>> NetworkManager ; service NetworkManager stop) before doing the oVirt
>> setup and rebooted to make sure that it didn't come back up. Or are
>> you referring to the hosted-engine VM? I just removed and re-added
>> the eth1 NIC in the hosted engine, and this is what showed up in
>> dmesg:
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: [1af4:1000] type 00 class 0x020000
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x10: [io 0x0000-0x001f]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x14: [mem 0x00000000-0x00000fff]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 6: assigned [mem 0xc0000000-0xc003ffff pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 4: assigned [mem 0xc0040000-0xc0043fff 64bit pref]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 1: assigned [mem 0xc0044000-0xc0044fff]
>> [Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 0: assigned [io 0x1000-0x101f]
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: enabling device (0000 -> 0003)
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 35 for MSI/MSI-X
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 36 for MSI/MSI-X
>> [Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 37 for MSI/MSI-X
>> [Mon Apr 10 06:46:43 2017] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
>> [Mon Apr 10 06:46:43 2017] IPv6: eth1: IPv6 duplicate address fe80::21a:4aff:fe16:151 detected!
>>
>> Then when the network dropped I started getting these:
>>
>> [Mon Apr 10 06:48:00 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>> [Mon Apr 10 06:48:00 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>> [Mon Apr 10 06:49:51 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>> [Mon Apr 10 06:51:40 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
>>
>> The network on eth1 would go down for a few seconds and then come
>> back up, but networking stayed solid on eth0. I disabled
>> NetworkManager on the HE VM as well to see if that makes a
>> difference, and I also disabled IPv6 with sysctl to see if that
>> helps. I'll install an Ubuntu VM on the cluster later today and see
>> if it has a similar issue.
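>>
>> For reference, this is roughly how I'm turning IPv6 off on the HE VM
>> (the same two settings, without "sysctl -w", also go into
>> /etc/sysctl.conf so they survive a reboot):
>>
>> sysctl -w net.ipv6.conf.eth0.disable_ipv6=1
>> sysctl -w net.ipv6.conf.eth1.disable_ipv6=1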
>>
>>
>>>
>>> What kind of documentation did you follow to install the hosted
>>> engine? Was it this page?
>>> https://www.ovirt.org/documentation/how-to/hosted-engine/ If
>>> so, could you file a bug against VDSM networking and attach
>>> /var/log/vdsm/vdsm.log and supervdsm.log, and make sure they
>>> include the time period from adding the second vNIC to rebooting?
>>>
>>> Second, even the vNIC going missing after reboot looks like a
>>> bug to me. Even though eth1 does not exist in the VM, can you
>>> see it defined for the VM in the engine web GUI?
>>>
>>>
>>> If the HE VM configuration wasn't flushed to the OVF_STORE yet, it
>>> makes sense that it disappeared on restart.
>>>
>> The docs I used were
>> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/chap-deploying_self-hosted_engine#Deploying_Self-Hosted_Engine_on_RHEL
>> which someone on the list pointed me to last week as being more
>> up-to-date than what is on the website (the docs on the website
>> don't seem to mention that you need to put the HE on its own
>> datastore, and they look more geared towards a bare-metal engine
>> than the self-hosted VM option).
>>
>> When I went back into the GUI and looked at the hosted-engine
>> config, the second NIC was listed there, but it wasn't showing up in
>> lspci on the VM. I removed the NIC in the GUI and re-added it, and
>> the device appeared again on the VM. What is the proper way to
>> "save" the state of the VM so that the OVF_STORE gets updated? When
>> I do anything on the HE VM that I want to test I just type "reboot",
>> but that powers down the VM. I then log in to my host and run
>> "hosted-engine --vm-start", which restarts it, but of course the
>> last time I did that it restarted without the second NIC.
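>>
>> For the next test I'll try a cleaner restart from the host rather
>> than typing "reboot" inside the VM; roughly this, if I have the
>> hosted-engine options right (I'm still not sure whether this is
>> what actually triggers the OVF_STORE update):
>>
>> hosted-engine --set-maintenance --mode=global
>> hosted-engine --vm-shutdown
>> hosted-engine --vm-status    # wait until the VM shows as down
>> hosted-engine --vm-start
>> hosted-engine --set-maintenance --mode=none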
>>
>>>
>>> The steps you took to install the hosted engine with regard to
>>> networking look good to me, but I believe Sandro (CC'ed) would
>>> be able to give more advice.
>>>
>>> Sandro, since we want to configure bonding, would you recommend
>>> installing the engine physically first, moving it to a VM
>>> according to the following method, and only then reconfiguring
>>> networking?
>>> https://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/
>>>
>>>
>>>
>>> I don't see why a direct HE deployment couldn't be done. Simone,
>>> Martin, can you help here?
>>>
>>>
>>>
>>> Thank you,
>>> Ondra
>>>
>>> On Mon, Apr 10, 2017 at 8:51 AM, Charles Tassell
>>> <ctassell at gmail.com> wrote:
>>>
>>> Hi Everyone,
>>>
>>> Okay, I'm again having problems getting basic networking set
>>> up with oVirt 4.1. Here is my situation: I have two servers I
>>> want to use to create an oVirt cluster, with two different
>>> networks. My "public" network is a 1G link on device em1
>>> connected to my Internet feed, and my "storage" network is a
>>> 10G link on device p5p1 connected to my file server. Since I
>>> need to connect to my storage network in order to do the
>>> install, I selected p5p1 as the ovirtmgmt interface when
>>> installing the hosted engine. That worked fine and I got
>>> everything installed, so I used some ssh-proxy magic to
>>> connect to the web console and completed the install (set up
>>> a storage domain, created a new network vmNet for VM
>>> networking, and added em1 to it).
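>>>
>>> (The ssh-proxy magic was just a local port forward from my
>>> workstation, something along these lines, with the engine
>>> VM's storage-network address filled in for ENGINE_ADDR:
>>>
>>> ssh -L 8443:ENGINE_ADDR:443 root@192.168.130.180
>>>
>>> and then pointing a browser at https://localhost:8443/
>>> locally.)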
>>>
>>> The problem was that when I added a second network device
>>> to the HostedEngine VM (so that I could connect to it from my
>>> public network), the network would intermittently go down. I
>>> did some digging and found IPv6 errors in dmesg ("IPv6: eth1:
>>> IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151
>>> detected!"), so I disabled IPv6 on both eth0 and eth1 in the
>>> HostedEngine VM and rebooted it. The problem is that when I
>>> restarted the VM, the eth1 device was missing.
>>>
>>> So, my question is: Can I add a second NIC to the
>>> HostedEngine VM and make it stick, or will it be deleted
>>> whenever the engine VM is restarted?
>>>
>>>
>>> When you change something in the HE VM using the web UI, it also
>>> has to be saved to the OVF_STORE to make it permanent across
>>> reboots. Martin, can you please elaborate here?
>>>
>>>
>>> Is there a better way to do what I'm trying to do? I.e.,
>>> should I set up ovirtmgmt on the public em1 interface and
>>> then create the "storage" network after the fact for
>>> connecting to the datastores and such? Is that even
>>> possible, or required? I was thinking it would be better
>>> for migrations and other management functions to happen on
>>> the faster 10G network, but if the HostedEngine doesn't
>>> need to be able to connect to the storage network, maybe
>>> it's not worth the effort?
>>>
>>> Eventually I want to set up LACP on the storage network,
>>> but the last time I tried that I had to wipe the servers and
>>> reinstall from scratch. I was thinking that was because I
>>> set up the bonding before installing oVirt, so I didn't do
>>> that this time.
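>>>
>>> For reference, the kind of bonding setup I mean is the
>>> standard CentOS ifcfg one on the two 10G ports, roughly
>>> like this (with the switch ports set to 802.3ad to match);
>>> the ovirtmgmt bridge would then sit on top of bond0 instead
>>> of p5p1 directly:
>>>
>>> ifcfg-bond0:
>>> ----------------
>>> DEVICE=bond0
>>> TYPE=Bond
>>> BONDING_MASTER=yes
>>> BONDING_OPTS="mode=802.3ad miimon=100"
>>> BOOTPROTO=none
>>> ONBOOT=yes
>>>
>>> ifcfg-p5p1 (and likewise for p5p2):
>>> ----------------
>>> DEVICE=p5p1
>>> MASTER=bond0
>>> SLAVE=yes
>>> BOOTPROTO=none
>>> ONBOOT=yes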
>>>
>>> Here are my /etc/sysconfig/network-scripts/ifcfg-* files
>>> in case I did something wrong there (I'm more familiar with
>>> Debian/Ubuntu network setup than CentOS):
>>>
>>> ifcfg-eth0: (ovirtmgmt aka storage)
>>> ----------------
>>> BROADCAST=192.168.130.255
>>> NETMASK=255.255.255.0
>>> BOOTPROTO=static
>>> DEVICE=eth0
>>> IPADDR=192.168.130.179
>>> ONBOOT=yes
>>> DOMAIN=public.net
>>> ZONE=public
>>> IPV6INIT=no
>>>
>>>
>>> ifcfg-eth1: (vmNet aka Internet)
>>> ----------------
>>> BROADCAST=192.168.1.255
>>> NETMASK=255.255.255.0
>>> BOOTPROTO=static
>>> DEVICE=eth1
>>> IPADDR=192.168.1.179
>>> GATEWAY=192.168.1.254
>>> ONBOOT=yes
>>> DNS1=192.168.1.1
>>> DNS2=192.168.1.2
>>> DOMAIN=public.net
>>> ZONE=public
>>> IPV6INIT=no
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> SANDRO BONAZZOLA
>>>
>>> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>>>
>>> Red Hat EMEA <https://www.redhat.com/>
>>>
>>
>