Hello Florian,

Thanks for checking the patch and posting the bug.

You need to restart vdsmd and supervdsmd.
It should not affect running VMs, but there is always a risk that something unexpected happens. Perhaps try it on one host first and then proceed with the others.

Thanks,
Edy.


On Tue, Sep 4, 2018 at 9:45 AM, Florian Schmid <fschmid@ubimet.com> wrote:
Hello Edward,

I have applied the patch and it looks very good!
vdsm-client Host getStats ->
...
        "enp9s0.88": {
            "rxErrors": "0", 
            "name": "enp9s0.88", 
            "tx": "1226", 
            "txDropped": "0", 
            "sampleTime": 1536043097.701361, 
            "rx": "98642", 
            "txErrors": "0", 
            "state": "up", 
            "speed": "10000", 
            "rxDropped": "0"
        }, 
...

Bridge devices are still reported with a speed of only 1000:
        "vm-int-dev": {
            "rxErrors": "0", 
            "name": "vm-int-dev", 
            "tx": "578", 
            "txDropped": "0", 
            "sampleTime": 1536043097.701361, 
            "rx": "27843284", 
            "txErrors": "0", 
            "state": "up", 
            "speed": "1000", 
            "rxDropped": "0"
        }, 

One important question:
I want to apply this patch without upgrading all hosts, because upgrading is a huge task.
If I apply the patch only to this particular file, which services do I need to restart?
I have now restarted all three vdsm services, but I assume I can't do that while VMs are running on the hosts, can I?

Kind regards, Florian


From: "Florian Schmid" <fschmid@ubimet.com>
To: "edwardh" <edwardh@redhat.com>
CC: "users" <users@ovirt.org>
Sent: Tuesday, 4 September 2018 08:32:09
Subject: [ovirt-users] Re: Wrong network threshold limit warnings on 4.2.5

Hello Edward,


I will try your patch.

Kind regards, Florian


From: "Edward Haas" <ehaas@redhat.com>
To: "p staniforth" <P.Staniforth@leedsbeckett.ac.uk>, "Florian Schmid" <fschmid@ubimet.com>
CC: "users" <users@ovirt.org>
Sent: Monday, 3 September 2018 16:42:25
Subject: Re: Wrong network threshold limit warnings on 4.2.5

Indeed, this looks like a nasty bug.
Could you please open a bug on this? https://tinyurl.com/ya7crjhf

If you can, could you also verify the fix? https://gerrit.ovirt.org/#/c/94132/

Thanks,
Edy.



On Mon, Sep 3, 2018 at 2:32 PM, Staniforth, Paul <P.Staniforth@leedsbeckett.ac.uk> wrote:

Hello Edward,

I am also seeing this problem; it's on our ovirtmgmt.


cat /sys/class/net/eno49/speed
10000

 

cat /sys/class/net/eno49.20/speed
10000

 

cat /sys/class/net/ovirtmgmt/speed
cat: /sys/class/net/ovirtmgmt/speed: Invalid argument


vdsm-client Host getStats ->
...

        "eno49": {
            "rxErrors": "0",
            "name": "eno49",
            "tx": "3456777",
            "txDropped": "0",
            "sampleTime": 1535974190.687987,
            "rx": "121362321",
            "txErrors": "0",
            "state": "up",
            "speed": "10000",
            "rxDropped": "2"
        },

        "eno49.20": {
            "rxErrors": "0",
            "name": "eno49.20",
            "tx": "3384452",
            "txDropped": "0",
            "sampleTime": 1535974190.687987,
            "rx": "115884579",
            "txErrors": "0",
            "state": "up",
            "speed": "1000",
            "rxDropped": "0"
        },

        "ovirtmgmt": {
            "rxErrors": "0",
            "name": "ovirtmgmt",
            "tx": "3383804",
            "txDropped": "0",
            "sampleTime": 1535974190.687987,
            "rx": "115710919",
            "txErrors": "0",
            "state": "up",
            "speed": "1000",
            "rxDropped": "0"
        },


Regards,

Paul S.


From: Florian Schmid <fschmid@ubimet.com>
Sent: 03 September 2018 11:44
To: edwardh@redhat.com
Cc: users
Subject: [ovirt-users] Re: Wrong network threshold limit warnings on 4.2.5
 
Hi Edward,

I got some alarms from a server today and ran your command there (though not at the time the issue occurred).
The hosts are on the latest patch level of CentOS 7.5 and oVirt 4.2.5.

Example:
cat /sys/class/net/enp9s0/speed
10000

cat /sys/class/net/enp9s0.80/speed
10000

cat /sys/class/net/vm-int-nfs/speed
cat: /sys/class/net/vm-int-nfs/speed: invalid argument     <- this is the bridge for the VMs

vdsm-client Host getStats ->
...
        "enp9s0": {
            "rxErrors": "0", 
            "name": "enp9s0", 
            "tx": "3335325754762", 
            "txDropped": "0", 
            "sampleTime": 1535970960.602359, 
            "rx": "5916567956502", 
            "txErrors": "0", 
            "state": "up", 
            "speed": "10000", 
            "rxDropped": "0"
        }, 
...
        "enp9s0.80": {
            "rxErrors": "0", 
            "name": "enp9s0.80", 
            "tx": "3180024039398", 
            "txDropped": "0", 
            "sampleTime": 1535970960.602359, 
            "rx": "5669421065686", 
            "txErrors": "0", 
            "state": "up", 
            "speed": "1000", 
            "rxDropped": "0"
        }, 
...
        "vm-int-nfs": {
            "rxErrors": "0", 
            "name": "vm-int-nfs", 
            "tx": "508", 
            "txDropped": "0", 
            "sampleTime": 1535970960.602359, 
            "rx": "4428568", 
            "txErrors": "0", 
            "state": "up", 
            "speed": "1000", 
            "rxDropped": "0"
        }, 
...

As you can see, vdsm is reporting the wrong speed for the VLAN devices.
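
The mismatch above can also be checked programmatically. The following is a minimal Python sketch, not part of vdsm: the helper names read_sysfs_speed and find_speed_mismatches are hypothetical, and it assumes the caller passes the per-interface part of the getStats output as a dict keyed by device name. Devices for which the kernel reports no speed (such as the bridge, where reading fails with "Invalid argument") are skipped.

```python
import os


def read_sysfs_speed(device, sysfs_root="/sys/class/net"):
    """Return the kernel-reported link speed in Mbps, or None if unavailable.

    Reading the file fails for devices such as bridges ("Invalid argument"),
    which surfaces in Python as an OSError.
    """
    path = os.path.join(sysfs_root, device, "speed")
    try:
        with open(path) as speed_file:
            return int(speed_file.read().strip())
    except (OSError, ValueError):
        return None


def find_speed_mismatches(reported_stats, sysfs_speeds):
    """Compare reported speeds with sysfs speeds.

    reported_stats: dict of device name -> stats dict with a "speed" string.
    sysfs_speeds:   dict of device name -> int speed in Mbps, or None.
    Returns a dict of device name -> (reported, sysfs) for differing devices.
    """
    mismatches = {}
    for name, entry in reported_stats.items():
        actual = sysfs_speeds.get(name)
        if actual is None:
            continue  # no kernel-reported speed (e.g. a bridge), nothing to compare
        reported = int(entry["speed"])
        if reported != actual:
            mismatches[name] = (reported, actual)
    return mismatches


if __name__ == "__main__":
    # Values taken from the output quoted in this mail.
    reported = {
        "enp9s0": {"speed": "10000"},
        "enp9s0.80": {"speed": "1000"},
        "vm-int-nfs": {"speed": "1000"},
    }
    sysfs = {"enp9s0": 10000, "enp9s0.80": 10000, "vm-int-nfs": None}
    print(find_speed_mismatches(reported, sysfs))
```

On a real host, the sysfs dict could be built with read_sysfs_speed(name) for each device name instead of being typed in by hand.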

BR Florian Schmid



From: "Edward Haas" <ehaas@redhat.com>
To: "Jayme" <jaymef@gmail.com>, "Florian Schmid" <fschmid@ubimet.com>
CC: "users" <users@ovirt.org>, "Alona Kaplan" <alkaplan@redhat.com>
Sent: Monday, 3 September 2018 11:38:25
Subject: Re: [ovirt-users] Re: Wrong network threshold limit warnings on 4.2.5

If you manage to recreate this, please collect a few samples from what the hypervisor reports back:
Run the command: vdsm-client Host getStats

The Engine calculates the rate based on this information
(and the agent collects it from /sys/class/net/<device>/statistics/).
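
As a rough illustration of the calculation described above, here is a minimal Python sketch of how a utilization percentage could be derived from two getStats samples of the same interface. It is not the Engine's actual code: it assumes that "tx" and "rx" are cumulative byte counters, that "speed" is in Mbps and that "sampleTime" is in seconds. The function name utilization_percent and the sample values are made up for illustration.

```python
def utilization_percent(prev, curr, counter="tx"):
    """Estimate link utilization (in percent) between two samples.

    prev, curr: per-interface dicts as printed by `vdsm-client Host getStats`,
                with string counters, a string "speed" and a float "sampleTime".
    counter:    "tx" or "rx" (assumed to be cumulative byte counters).
    Returns None when no meaningful rate can be computed.
    """
    elapsed = curr["sampleTime"] - prev["sampleTime"]
    speed_mbps = int(curr["speed"])
    delta_bytes = int(curr[counter]) - int(prev[counter])
    if elapsed <= 0 or speed_mbps <= 0 or delta_bytes < 0:
        return None  # same sample, unknown speed, or a counter reset
    bits_per_second = delta_bytes * 8 / elapsed
    return 100.0 * bits_per_second / (speed_mbps * 1_000_000)


if __name__ == "__main__":
    # Hypothetical samples: 187,500,000 bytes sent in 10 s, i.e. 150 Mbit/s.
    prev = {"sampleTime": 0.0, "tx": "0", "speed": "10000"}
    curr = {"sampleTime": 10.0, "tx": "187500000", "speed": "10000"}
    print(utilization_percent(prev, curr))                       # 1.5
    print(utilization_percent(prev, dict(curr, speed="1000")))   # 15.0
```

Under these assumptions, the same traffic yields a ten times higher percentage when the reported speed is 1000 instead of 10000, so the "speed" value in each sample is worth checking alongside the counters.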

Please also mention which OS the hosts are running.

Thanks,
Edy


On Fri, Aug 31, 2018 at 5:35 PM, Jayme <jaymef@gmail.com> wrote:
I've been seeing these warnings myself, on a 1Gb ovirtmanagement network (GlusterFS is on a 10GbE backend). I haven't correlated them with network graphs yet, but I don't know what could be happening on my management network that would exhaust a 1Gb link.

On Fri, Aug 31, 2018 at 3:27 AM Florian Schmid <fschmid@ubimet.com> wrote:
Good morning,

Since we upgraded to version 4.2.5, we have been getting a lot of warnings about network interfaces exceeding the defined threshold limits.

For example:
Aug 31, 2018, 7:54:05 AM
Host xxx has network interface which exceeded the defined threshold [95%] (enp9s0.80: transmit rate[100%], receive rate [12%])

This is a 10 Gbit interface, and according to our monitoring software, which collects network statistics every 10 s, TX bandwidth peaked at 150 Mbit at that time, far from 100%.

Could it be that the engine detected the wrong interface speed, or that there is a calculation error?
In the engine, this host shows 10000 Mbps for all interfaces.

I have now checked all of these warnings on our different hosts, and they occur every time we go over 100 Mbit, which certainly happens quite often.

Can I disable these warnings, since our monitoring software already covers this?

If you need any logs, please ask.

BR Florian Schmid
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/2NFL3O66IN4Z6HUK45WQFXRUBMQDUY7P/
