On Tue, Sep 4, 2018 at 9:02 AM Edward Haas <ehaas(a)redhat.com> wrote:
> Hello Florian,
> Thanks for checking the patch and posting the bug.
> You need to restart vdsmd and supervdsmd.
> It should not affect running VM/s, but you always have a risk that
> something unexpected can happen. Perhaps try it on a host and then proceed
> with others.
> Thanks,
> Edy.

I'm having a similar problem in a 3-host oVirt test cluster, with these
notifications every day on 1 Gbit adapters.
I have bond0 on em1 and em2, and then the VLANs bond0.65, bond0.68 and
bond0.167 defined for the VMs.
I get warnings like this:
Message:Host ov300 has network interface which exceeded the defined
threshold [95%] (em1: transmit rate[98%], receive rate [0%])
when actually I think the 3 VMs running on this host generate only a few
MB/s of traffic.
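
As a rough sanity check of that expectation, here is a minimal sketch of how
a utilization percentage could be derived from byte counters and link speed.
The function name, the sample numbers and the assumption that the rate is
throughput divided by the reported link speed are illustrative only; this is
not taken from the vdsm code:

```python
def utilization_percent(bytes_delta, interval_s, speed_mbps):
    """Return link utilization in percent.

    bytes_delta: bytes transferred during the sampling interval
    interval_s:  length of the sampling interval in seconds
    speed_mbps:  link speed in Mbit/s (1000 for a 1 Gbit adapter)
    """
    if interval_s <= 0 or speed_mbps <= 0:
        raise ValueError("interval and speed must be positive")
    bits_per_second = bytes_delta * 8 / interval_s
    return 100.0 * bits_per_second / (speed_mbps * 1_000_000)


# Hypothetical sample: 5 MB/s sustained for 15 seconds on a 1 Gbit link.
five_mb_per_s = utilization_percent(5_000_000 * 15, 15, 1000)
print(f"{five_mb_per_s:.1f}%")  # 4.0%

# Throughput needed to reach the 98% from the warning on a 1 Gbit link.
needed_mb_per_s = 0.98 * 1000 * 1_000_000 / 8 / 1_000_000
print(f"{needed_mb_per_s:.1f} MB/s")  # 122.5 MB/s
```

Under these assumptions a few MB/s is only a few percent of a 1 Gbit link,
while a genuine 98% transmit rate would need roughly 122.5 MB/s.
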
I applied the changes to the 3 hosts.
I notice that, due to dependencies, it is sufficient to restart supervdsmd
and vdsmd will then be restarted automatically, correct?
In my case, on each of the 3 hosts, after restarting supervdsmd I got
messages like these, but without any impact on running VMs:
VDSM ov300 command GetStatsAsyncVDS failed: Broken pipe 9/4/18 9:07:52 AM
Host ov300 is not responding. It will stay in Connecting state for a grace
period of 61 seconds and after that an attempt to fence the host will be
issued. 9/4/18 9:07:52 AM
No faulty multipath paths on host ov300 9/4/18 9:07:58 AM
Executing power management status on Host ov300 using Proxy Host ov200 and
Fence Agent ipmilan:10.10.193.103. 9/4/18 9:07:58 AM
Status of host ov300 was set to Up. 9/4/18 9:07:58 AM
Host ov300 power management was verified successfully. 9/4/18 9:07:58 AM
Please note that when doing this on the SPM host you could also get these:
VDSM ov301 command SpmStatusVDS failed: Broken pipe 9/4/18 9:10:00 AM
Host ov301 is not responding. It will stay in Connecting state for a grace
period of 81 seconds and after that an attempt to fence the host will be
issued. 9/4/18 9:10:00 AM
Invalid status on Data Center MYDC. Setting Data Center status to Non
Responsive (On host ov301, Error: Network error during communication with
the Host.). 9/4/18 9:10:00 AM
with reassignment of the SPM role:
VDSM command GetStoragePoolInfoVDS failed: Heartbeat exceeded 9/4/18
9:10:12 AM
Storage Pool Manager runs on Host ov200 (Address: ov200), Data Center MYDC.
9/4/18 9:10:14 AM
It is probably safer to manually move the SPM role to another host before
restarting supervdsmd on that host.
Let's see this evening whether I get any messages about thresholds.
BTW: one question. I see in the code iface.Type.NIC and now
also iface.Type.BOND. Don't you think that you should also handle the
network teaming option available in RHEL 7, as described here:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/...
?
This applies only if using the new network teaming implementation is
supported in oVirt, and I'm not sure about that...
Thanks,
Gianluca