Just to add, after we updated to 4.3 our Gluster went south. Thankfully Gluster is only secondary storage for us, and our primary storage is an iSCSI SAN. We migrated everything we could over to the SAN, but a few VMs got corrupted by Gluster (data was gone). Right now we have Gluster turned off and set to maintenance, because the connectivity issues were causing our main cluster to continuously migrate VMs.

Looking at the Gluster hosts themselves, I noticed that heal info would often report one brick down, even when oVirt didn't. Checking the status of glusterd would also show failed health checks:

Feb  6 15:36:37 vmc3h1 glusterfs-virtstore[17036]: [2019-02-06 23:36:37.937041] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-VirstStore-posix: health-check failed, going down
Feb  6 15:36:37 vmc3h1 glusterfs-virtstore[17036]: [2019-02-06 23:36:37.937561] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-VirstStore-posix: still alive! -> SIGTERM

I think the health check is failing (maybe erroneously), which then kills the brick. When this happens it causes a continuous cycle of the brick going up and down and healing, and in turn connectivity issues.
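
For anyone who wants to check whether their own bricks are caught in the same cycle, here is a rough Python sketch that counts "health-check failed" events per host. It assumes the lines look like the two syslog entries quoted above; the regex and the function names (`health_check_failures`, `failures_per_host`) are my own illustration, not part of any Gluster or oVirt tooling, and other log formats would need a different pattern.

```python
import re
from collections import Counter

# Assumed format: syslog prefix, then the brick log line, as in the
# two entries quoted above. Adjust the pattern for other layouts.
HEALTH_FAIL = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+\S+\[\d+\]:\s+"
    r"\[(?P<ts>[\d-]+ [\d:.]+)\].*?"
    r"\s(?P<xlator>\S+-posix): health-check failed"
)


def health_check_failures(lines):
    """Return (host, timestamp, translator) for each health-check failure line."""
    events = []
    for line in lines:
        match = HEALTH_FAIL.search(line)
        if match:
            events.append(
                (match.group("host"), match.group("ts"), match.group("xlator"))
            )
    return events


def failures_per_host(lines):
    """Count health-check failures per host name taken from the syslog prefix."""
    return Counter(host for host, _, _ in health_check_failures(lines))
```

Feeding it the lines from /var/log/messages (or wherever syslog lands on your hosts) gives a quick picture of how often each brick is being taken down. Many events close together on one host would fit the up-down cycle described above.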

This is our second time running into issues with Gluster, so I think we are going to sideline it for a while.

-Ryan

On Thu, Feb 14, 2019 at 12:47 PM Darryl Scott <dscott@umbctraining.com> wrote:

I do believe something went wrong after fully updating everything last Friday. I updated all the oVirt compute nodes on Friday and gluster/engine on Saturday. I have been experiencing these issues ever since. I have pored over engine.log and it seems to be a storage connection issue.



From: Jayme <jaymef@gmail.com>
Sent: Thursday, February 14, 2019 1:52:59 AM
To: Darryl Scott
Cc: users
Subject: Re: [ovirt-users] Ovirt Cluster completely unstable
 
I have a three-node HCI Gluster cluster which was previously running 4.2 with zero problems. I just upgraded it yesterday. I ran into a few bugs right away with the upgrade process, but aside from that I also discovered other users with severe GlusterFS problems since the upgrade to the new GlusterFS version. It is less than 24 hours since I upgraded my cluster and I just got a notice that one of my GlusterFS bricks is offline. There does appear to be a very real and serious issue here with the latest updates.


On Wed, Feb 13, 2019 at 7:26 PM <dscott@umbctraining.com> wrote:
I'm abandoning my production oVirt cluster due to instability. I have a 7-host cluster running about 300 VMs and have been for over a year. It has become unstable over the past three days. I have random hosts, both compute and storage, disconnecting, AND many VMs disconnecting and becoming unusable.

The 7 hosts are 4 compute hosts running oVirt 4.2.8 and three GlusterFS hosts running 3.12.5. I submitted a Bugzilla bug and they immediately assigned it to the storage people, but they have not responded with any meaningful information. I have submitted several logs.

I have found some discussion of instability problems with Gluster 3.12.5. I would be willing to upgrade my Gluster to a more stable version if that's the culprit. I installed Gluster using the oVirt GUI and this is the version the oVirt GUI installed.

Is there an oVirt health monitor available? Where should I be looking to get a resolution to the problems I'm facing?
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/BL4M3JQA3IEXCQUY4IGQXOAALRUQ7TVB/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/IMUKFFANNJXLKXNVGMMJ6Y7MOLW2CQE3/