Hi Yevgeny,
Thanks for your help.
No luck yet!
Q: IIUC, ovirt-engine and its host are vSphere VMs. Then, a kind of no-macspoof should be applied from the vSphere side.
A: I'm not sure exactly which setting that would be on the VMware side, but we have made sure the following settings are enabled: Promiscuous Mode, MAC Address Changes, and Forged Transmits.
Q: BTW, are both of them on the same vSphere host?
A: Yes, they are both on the same vSphere host.
Q: Is DHCP server another VM on that host?
A: Yes, the DHCP server is another VM on that vSphere host as well.
Q: Where/how did you "turn on Port Mirroring"?
A: I turned on Port Mirroring in the vNIC Profile called ovirtmgmt. It's off for most of my tests; however, when I turn it on, everything works.
Q: I'd start the troubleshooting by using tcpdump utility in order to pinpoint the component that blocks the traffic.
A: I have done this, and I can see all DHCP messages, including the DHCPACK, on the bridge named 'ovirtmgmt'. But when I run tcpdump on the guest VM's network device 'eth0' (which is 'vnet0' on the host), I never see the DHCPACK. The same thing happens with ARP replies from the gateway.
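For anyone repeating the hop-by-hop capture, here is a minimal Python sketch that builds the same tcpdump command line for each capture point, so the output can be compared side by side. The interface names are the ones that appear in this thread; the filter string and the helper name `build_capture_cmd` are assumptions made for this example, not something taken from the original captures.

```python
import shlex

# BPF filter covering the DHCP exchange (UDP ports 67/68) and ARP.
# This filter is an assumption; the thread does not say which one was used.
CAPTURE_FILTER = "udp port 67 or udp port 68 or arp"


def build_capture_cmd(interface, capture_filter=CAPTURE_FILTER):
    """Return the tcpdump argv for one capture point.

    -n   do not resolve addresses to names
    -e   print the link-level (MAC) header of each frame
    -vv  decode DHCP options verbosely
    -i   interface to listen on
    """
    return ["tcpdump", "-n", "-e", "-vv", "-i", interface, capture_filter]


# Capture points along the path, from the uplink towards the guest.
HOPS = ["ens32", "ovirtmgmt", "vnet0"]

for hop in HOPS:
    # Print a copy-and-paste friendly command; running it needs root.
    print(shlex.join(build_capture_cmd(hop)))
```

Running the printed commands in three terminals shows the last hop at which the DHCPACK or ARP reply is still visible.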
Here is an ACK message that is returned from the DHCP server to the 'ovirtmgmt' bridge (I don't know if this is helpful, but thought I'd include it, just in case).
16:35:22.958616 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
10.255.233.125.67 > 10.255.233.185.68: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x8d5c8e3d, Flags [none] (0x0000)
Your-IP 10.255.233.185
Client-Ethernet-Address 00:1a:4a:16:01:55
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: ACK
Server-ID Option 54, length 4: 10.255.233.125
Lease-Time Option 51, length 4: 600
Subnet-Mask Option 1, length 4: 255.255.255.0
BR Option 28, length 4: 10.255.233.255
Domain-Name-Server Option 6, length 4: 10.255.233.102
Default-Gateway Option 3, length 4: 10.255.233.1
Q: Did you try assigning a static IP instead of DHCP and then check connectivity? If that works, then the problem is probably on the DHCP server side.
A: Yes, with a static IP on the guest, everything works fine. The problem could be on the DHCP server side, but in my naive understanding that seems unlikely, because the ACK makes it all the way back to the bridge. Again, I could be wrong.
Q: If you do not see any requests in the DHCP server log, then I guess, "dhclient -B" wouldn't help.
A: I see the whole DHCP process in my DHCP Server log:
Jul 5 21:21:12 mydomain dhcpd: DHCPDISCOVER from 00:1a:4a:16:01:55 via eth0
Jul 5 21:21:12 mydomain dhcpd: DHCPOFFER on 10.255.233.185 to 00:1a:4a:16:01:55 via eth0
Jul 5 21:21:12 mydomain dhcpd: DHCPREQUEST for 10.255.233.185 (10.255.233.125) from 00:1a:4a:16:01:55 via eth0
Jul 5 21:21:12 mydomain dhcpd: DHCPACK on 10.255.233.185 to 00:1a:4a:16:01:55 via eth0
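As a quick sanity check on the server side, the log lines above can be scanned for the full DISCOVER/OFFER/REQUEST/ACK exchange for one MAC address. This is only a sketch: the regular expression is written against the four lines pasted above, and the function name `messages_for` is made up for the example.

```python
import re

# The four dhcpd log lines from this thread.
LOG = """\
Jul 5 21:21:12 mydomain dhcpd: DHCPDISCOVER from 00:1a:4a:16:01:55 via eth0
Jul 5 21:21:12 mydomain dhcpd: DHCPOFFER on 10.255.233.185 to 00:1a:4a:16:01:55 via eth0
Jul 5 21:21:12 mydomain dhcpd: DHCPREQUEST for 10.255.233.185 (10.255.233.125) from 00:1a:4a:16:01:55 via eth0
Jul 5 21:21:12 mydomain dhcpd: DHCPACK on 10.255.233.185 to 00:1a:4a:16:01:55 via eth0
"""

# Message type followed (somewhere later on the line) by a MAC address.
MESSAGE_RE = re.compile(
    r"dhcpd: (DHCP[A-Z]+) .*?\b((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\b"
)

EXPECTED = ["DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK"]


def messages_for(log_text, mac):
    """Return the DHCP message types logged for `mac`, in log order."""
    seen = []
    for line in log_text.splitlines():
        match = MESSAGE_RE.search(line)
        if match and match.group(2) == mac:
            seen.append(match.group(1))
    return seen


seen = messages_for(LOG, "00:1a:4a:16:01:55")
print("complete exchange logged:", seen == EXPECTED)
```

A complete sequence here only shows that the server sent the ACK; it says nothing about whether the guest received it.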
In addition to "dhclient -B" allowing the guest to successfully obtain an IP address from the DHCP server, I recently found that running the command "brctl setageing ovirtmgmt 0" also works. I read that this turns the bridge into a hub, so I think it is only a clue and not a solution. It seems that the Forwarding Database changes that occur when I run that command are what allow the ACK message to pass to the guest.
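To illustrate why an ageing time of 0 makes a bridge behave like a hub, here is a toy Python model of a learning bridge. It is not the Linux bridge implementation. The scenario it plays out (a copy of the guest's own frame arriving back on the uplink, so that the guest's MAC is learned on the wrong port) is purely a hypothetical way such a drop could happen; nothing in the captures above confirms it. The server MAC is also invented.

```python
class ToyBridge:
    """Very small model of a learning bridge (not the Linux implementation)."""

    def __init__(self, ports, ageing):
        self.ports = list(ports)
        self.ageing = ageing  # seconds a learned entry stays valid; 0 = never
        self.fdb = {}         # mac -> (port, time the entry was learned)

    def receive(self, src, dst, in_port, now):
        """Learn src on in_port, then return the ports the frame is sent to."""
        self.fdb[src] = (in_port, now)
        entry = self.fdb.get(dst)
        if entry is not None and now - entry[1] < self.ageing:
            out_port = entry[0]
            # Destination learned on the ingress port: the frame is dropped.
            return [] if out_port == in_port else [out_port]
        # Unknown or expired destination: flood to every other port.
        return [p for p in self.ports if p != in_port]


GUEST = "00:1a:4a:16:01:55"    # guest MAC from this thread
SERVER = "02:00:00:00:00:01"   # hypothetical DHCP server MAC
BROADCAST = "ff:ff:ff:ff:ff:ff"


def ack_delivery(ageing):
    """Return the ports a unicast reply to the guest is forwarded to."""
    bridge = ToyBridge(["ens32", "vnet0"], ageing)
    bridge.receive(GUEST, BROADCAST, "vnet0", now=0)  # guest sends a broadcast
    # Hypothetical: a copy of that frame comes back in on the uplink.
    bridge.receive(GUEST, BROADCAST, "ens32", now=1)
    return bridge.receive(SERVER, GUEST, "ens32", now=2)


print("ageing 300:", ack_delivery(300))
print("ageing 0:  ", ack_delivery(0))
```

In this model the reply is dropped with ageing 300 because the guest's MAC was last learned on the port the reply arrives on, while with ageing 0 no learned entry is ever valid, so the reply is flooded and reaches vnet0.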
Here is my Forwarding Database when the Ageing is set to 300:
01:00:5e:00:00:01 dev ens32 self permanent
33:33:00:00:02:02 dev ens32 self permanent
33:33:00:00:00:01 dev ens32 self permanent
33:33:00:00:00:01 dev bond0 self permanent
fe:4e:66:bf:cf:48 dev ;vdsmdummy; vlan 1 master ;vdsmdummy; permanent
00:50:56:8e:aa:2a dev ens32 master ovirtmgmt
e4:d3:f1:d1:99:c4 dev ens32 master ovirtmgmt
00:50:56:8e:d6:a2 dev ens32 master ovirtmgmt
00:50:56:8e:d2:cd dev ens32 master ovirtmgmt
00:50:56:8e:be:ca dev ens32 master ovirtmgmt permanent
00:50:56:8e:17:59 dev ens32 master ovirtmgmt
fe:1a:4a:16:01:55 dev vnet0 master ovirtmgmt permanent
fe:1a:4a:16:01:55 dev vnet0 vlan 1 master ovirtmgmt permanent
00:50:56:8e:be:ca dev ens32 vlan 1 master ovirtmgmt permanent
00:50:56:8e:3d:f1 dev ens32 master ovirtmgmt
e4:d3:f1:d1:99:8c dev ens32 master ovirtmgmt
e4:d3:f1:d1:99:8b dev ens32 master ovirtmgmt
76:f6:65:58:fe:f5 dev ovirtmgmt vlan 1 master ovirtmgmt permanent
33:33:00:00:00:01 dev vnet0 self permanent
01:00:5e:00:00:01 dev vnet0 self permanent
33:33:ff:16:01:55 dev vnet0 self permanent
Here is the Forwarding Database (fdb) after setting the Ageing to 0:
01:00:5e:00:00:01 dev ens32 self permanent
33:33:00:00:02:02 dev ens32 self permanent
33:33:00:00:00:01 dev ens32 self permanent
33:33:00:00:00:01 dev bond0 self permanent
fe:4e:66:bf:cf:48 dev ;vdsmdummy; vlan 1 master ;vdsmdummy; permanent
00:50:56:8e:be:ca dev ens32 master ovirtmgmt permanent
fe:1a:4a:16:01:55 dev vnet0 master ovirtmgmt permanent
fe:1a:4a:16:01:55 dev vnet0 vlan 1 master ovirtmgmt permanent
00:50:56:8e:be:ca dev ens32 vlan 1 master ovirtmgmt permanent
76:f6:65:58:fe:f5 dev ovirtmgmt vlan 1 master ovirtmgmt permanent
33:33:00:00:00:01 dev vnet0 self permanent
01:00:5e:00:00:01 dev vnet0 self permanent
33:33:ff:16:01:55 dev vnet0 self permanent
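To make the difference between the two listings easier to see, here is a small Python sketch that diffs them. The data is copied from the two listings above; the helper name `parse_fdb` is made up for this example.

```python
FDB_AGEING_300 = """\
01:00:5e:00:00:01 dev ens32 self permanent
33:33:00:00:02:02 dev ens32 self permanent
33:33:00:00:00:01 dev ens32 self permanent
33:33:00:00:00:01 dev bond0 self permanent
fe:4e:66:bf:cf:48 dev ;vdsmdummy; vlan 1 master ;vdsmdummy; permanent
00:50:56:8e:aa:2a dev ens32 master ovirtmgmt
e4:d3:f1:d1:99:c4 dev ens32 master ovirtmgmt
00:50:56:8e:d6:a2 dev ens32 master ovirtmgmt
00:50:56:8e:d2:cd dev ens32 master ovirtmgmt
00:50:56:8e:be:ca dev ens32 master ovirtmgmt permanent
00:50:56:8e:17:59 dev ens32 master ovirtmgmt
fe:1a:4a:16:01:55 dev vnet0 master ovirtmgmt permanent
fe:1a:4a:16:01:55 dev vnet0 vlan 1 master ovirtmgmt permanent
00:50:56:8e:be:ca dev ens32 vlan 1 master ovirtmgmt permanent
00:50:56:8e:3d:f1 dev ens32 master ovirtmgmt
e4:d3:f1:d1:99:8c dev ens32 master ovirtmgmt
e4:d3:f1:d1:99:8b dev ens32 master ovirtmgmt
76:f6:65:58:fe:f5 dev ovirtmgmt vlan 1 master ovirtmgmt permanent
33:33:00:00:00:01 dev vnet0 self permanent
01:00:5e:00:00:01 dev vnet0 self permanent
33:33:ff:16:01:55 dev vnet0 self permanent
"""

FDB_AGEING_0 = """\
01:00:5e:00:00:01 dev ens32 self permanent
33:33:00:00:02:02 dev ens32 self permanent
33:33:00:00:00:01 dev ens32 self permanent
33:33:00:00:00:01 dev bond0 self permanent
fe:4e:66:bf:cf:48 dev ;vdsmdummy; vlan 1 master ;vdsmdummy; permanent
00:50:56:8e:be:ca dev ens32 master ovirtmgmt permanent
fe:1a:4a:16:01:55 dev vnet0 master ovirtmgmt permanent
fe:1a:4a:16:01:55 dev vnet0 vlan 1 master ovirtmgmt permanent
00:50:56:8e:be:ca dev ens32 vlan 1 master ovirtmgmt permanent
76:f6:65:58:fe:f5 dev ovirtmgmt vlan 1 master ovirtmgmt permanent
33:33:00:00:00:01 dev vnet0 self permanent
01:00:5e:00:00:01 dev vnet0 self permanent
33:33:ff:16:01:55 dev vnet0 self permanent
"""


def parse_fdb(text):
    """Return the set of (mac, rest-of-line) entries in a pasted listing."""
    entries = set()
    for line in text.splitlines():
        parts = line.split()
        if parts:
            entries.add((parts[0], " ".join(parts[1:])))
    return entries


before = parse_fdb(FDB_AGEING_300)
after = parse_fdb(FDB_AGEING_0)

removed = sorted(before - after)  # present with ageing 300 only
added = sorted(after - before)    # present with ageing 0 only

for mac, rest in removed:
    print("removed:", mac, rest)
print(len(removed), "entries removed,", len(added), "entries added")
```

The entries that disappear are exactly the ones without the `permanent` flag, all on ens32, which fits the idea that learned entries are no longer kept once ageing is 0.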
This is the result of 'brctl showmacs ovirtmgmt' with Ageing set to 300:
[root@ovirthost2 ~]# brctl showmacs ovirtmgmt
port no  mac addr           is local?  ageing timer
  1      00:50:56:8e:17:59  no           0.10
  1      00:50:56:8e:2b:d7  no          22.30
  1      00:50:56:8e:2e:ec  no           3.66
  1      00:50:56:8e:3d:f1  no           1.16
  1      00:50:56:8e:a4:66  no          15.34
  1      00:50:56:8e:a4:cc  no          11.39
  1      00:50:56:8e:aa:2a  no           3.66
  1      00:50:56:8e:be:ca  yes          0.00
  1      00:50:56:8e:be:ca  yes          0.00
  1      00:50:56:8e:d2:cd  no           0.10
  1      00:50:56:8e:d6:a2  no           3.70
  1      e4:d3:f1:d1:99:8b  no           0.26
  1      e4:d3:f1:d1:99:8c  no           0.26
  1      e4:d3:f1:d1:99:c4  no           0.00
  2      fe:1a:4a:16:01:55  yes          0.00
  2      fe:1a:4a:16:01:55  yes          0.00
And here is the result of 'brctl showmacs ovirtmgmt' with Ageing set to 0:
[root@ovirthost2 ~]# brctl showmacs ovirtmgmt
port no  mac addr           is local?  ageing timer
  1      00:50:56:8e:be:ca  yes          0.00
  1      00:50:56:8e:be:ca  yes          0.00
  2      fe:1a:4a:16:01:55  yes          0.00
  2      fe:1a:4a:16:01:55  yes          0.00
Maybe this will give more clues as to what's going on.
Q: Please turn iptables/firewalld off.
A: I have tried this ('service iptables stop') and it doesn't seem to make a difference.
Thanks again for your help,
Clint