
Hi Folks,

I did a minor upgrade on the first host in my cluster and now it is reporting "Non Operational".

This is what yum showed as updatable. However, I did the update through the ovirt-engine web interface.

    ovirt-node-ng-image-update.noarch                 4.4.9-1.el8    ovirt-4.4
    Obsoleting Packages
    ovirt-node-ng-image-update.noarch                 4.4.9-1.el8    ovirt-4.4
        ovirt-node-ng-image-update.noarch             4.4.8.3-1.el8  @System
    ovirt-node-ng-image-update.noarch                 4.4.9-1.el8    ovirt-4.4
        ovirt-node-ng-image-update-placeholder.noarch 4.4.8.3-1.el8  @System

How do I start to debug this issue?

Also, it looks like the vmstore brick is not mounting on that host. I only see the engine mounted.

Broken server:

    [root@ovirt1.dgi log]# mount | grep storage
    ovirt1-storage.dgi:/engine on /rhev/data-center/mnt/glusterSD/ovirt1-storage.dgi:_engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

Working server:

    [root@ovirt2.dgi ~]# mount | grep storage
    ovirt1-storage.dgi:/engine on /rhev/data-center/mnt/glusterSD/ovirt1-storage.dgi:_engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
    ovirt1-storage.dgi:/vmstore on /rhev/data-center/mnt/glusterSD/ovirt1-storage.dgi:_vmstore type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

I tried putting the server into maintenance mode and running a reinstall on it. No change. I'd really appreciate some help sorting this out.

Cheers,
Gervais

On Tuesday, 23 November 2021 03:36:07 CET Gervais de Montbrun wrote:
> Hi Folks,
>
> I did a minor upgrade on the first host in my cluster and now it is reporting "Non Operational".
> This is what yum showed as updatable. However, I did the update through the ovirt-engine web interface.
>
>     ovirt-node-ng-image-update.noarch                 4.4.9-1.el8    ovirt-4.4
>     Obsoleting Packages
>     ovirt-node-ng-image-update.noarch                 4.4.9-1.el8    ovirt-4.4
>         ovirt-node-ng-image-update.noarch             4.4.8.3-1.el8  @System
>     ovirt-node-ng-image-update.noarch                 4.4.9-1.el8    ovirt-4.4
>         ovirt-node-ng-image-update-placeholder.noarch 4.4.8.3-1.el8  @System
>
> How do I start to debug this issue?

Check the engine log in /var/log/ovirt-engine/engine.log on the machine where the engine runs.

> Also, it looks like the vmstore brick is not mounting on that host. I only see the engine mounted.

Could you also attach the relevant part of the vdsm log (/var/log/vdsm/vdsm.log) from the machine where the mount failed? You should see some mount-related error there. This could also be a reason why the host became non-operational.

Thanks,
Vojta
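
A minimal sketch of the log check suggested above. The helper name `scan_log` and the search pattern are assumptions chosen for illustration, not part of oVirt or vdsm; only the two log paths come from this thread.

```shell
# Hypothetical helper: print the last N lines of a log file that mention
# errors, warnings, mounts, or stale handles (case-insensitive).
scan_log() {
    local logfile="$1"
    local count="${2:-20}"   # number of matching lines to show, default 20
    grep -iE 'error|warn|mount|stale' -- "$logfile" | tail -n "$count"
}
```

On the affected host this could be run as `scan_log /var/log/vdsm/vdsm.log`, and on the engine machine as `scan_log /var/log/ovirt-engine/engine.log 50`. The pattern is deliberately broad, so expect some noise; narrow it once you know what the failure looks like.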

Hi Vojta,

Thanks for the help.

I tried to activate my server this morning and captured the logs from vdsm.log and engine.log. They are attached.

Something went awry with my gluster (I think), as it is showing that the bricks on the affected server (ovirt1) are not mounted:

[three inline images, not preserved]

The networking looks fine.

Cheers,
Gervais
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/S6C7R6LUTJXFMG...

On Tuesday, 23 November 2021 14:42:31 CET Gervais de Montbrun wrote:
> Hi Vojta,
>
> Thanks for the help.
>
> I tried to activate my server this morning and captured the logs from vdsm.log and engine.log. They are attached.
>
> Something went awry with my gluster (I think) as it is showing that the bricks on the affected server (ovirt1) are not mounted:

It seems not to be available, so vdsm fails with "OSError: [Errno 116] Stale file handle" and cannot mount it. I'd suggest investigating what's happening with your Gluster storage, and possibly trying to mount it manually from the affected machine: if you are able to mount it manually, vdsm should be able to mount it as well.

Given the many warnings in the engine log such as "Could not associate brick 'ovirt1-storage.dgi:/gluster_bricks/vmstore/vmstore' of volume '2670ff29-8d43-4610-a437-c6ec2c235753' with correct network as no gluster network found in cluster '404c8d14-73c1-11eb-8755-00163e5907f6'", I'd probably first take a look at the network.

Vojta
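
The manual mount test described above could be sketched roughly as follows. The function name `try_gluster_mount` is made up for this example, and the scratch mount point is an assumption; it uses a temporary directory so that it does not touch the /rhev/data-center/mnt/glusterSD/ paths that vdsm manages. It needs to run as root on the affected host.

```shell
# Hypothetical helper: try a one-off FUSE mount of a Gluster volume on a
# scratch directory, report the result, and clean up afterwards.
try_gluster_mount() {
    local server="$1"
    local volume="$2"
    local mnt rc=0
    mnt=$(mktemp -d) || return 1
    if mount -t glusterfs "${server}:/${volume}" "$mnt"; then
        echo "manual mount of ${server}:/${volume} succeeded"
        ls "$mnt" >/dev/null   # touch the mount to confirm it is readable
        umount "$mnt"
    else
        echo "manual mount of ${server}:/${volume} failed" >&2
        rc=1
    fi
    rmdir "$mnt"
    return "$rc"
}
```

For the volume in this thread the call would be `try_gluster_mount ovirt1-storage.dgi vmstore`. If it fails, `gluster volume status vmstore` and `gluster peer status` on one of the storage nodes are reasonable next places to look.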

Hello Gervais,

Is the brick mounted on ovirt1? Can you mount it using the settings in /etc/fstab?

The hostname is not using an FQDN for ovirt1. Assuming you have a storage network for the gluster nodes, the engine needs to be able to resolve the host addresses

    ovirt1-storage.dgi
    ovirt2-storage.dgi
    ovirt3-storage.dgi

so that it can assign them to the correct network.

When the volume is showing yellow, you can force restart it again from the GUI.

Regards,
Paul S.
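
A small sketch of the name-resolution check, assuming it is run on the engine machine. The function name `check_resolution` is hypothetical; `getent hosts` consults the system resolver configuration (/etc/hosts and DNS), which may or may not match exactly how the engine looks names up.

```shell
# Hypothetical helper: report which of the given host names resolve
# through the system resolver. Returns non-zero if any name is missing.
check_resolution() {
    local host rc=0
    for host in "$@"; do
        if getent hosts "$host" >/dev/null; then
            echo "OK      $host"
        else
            echo "MISSING $host"
            rc=1
        fi
    done
    return "$rc"
}
```

With the storage names from this thread: `check_resolution ovirt1-storage.dgi ovirt2-storage.dgi ovirt3-storage.dgi`. Any `MISSING` line points at a name that needs a DNS record or an /etc/hosts entry.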
participants (3)
- Gervais de Montbrun
- Staniforth, Paul
- Vojtech Juranek