Just to add, after we updated to 4.3 our gluster storage went south.
Thankfully gluster is only secondary storage for us, and our primary
storage is an iSCSI SAN. We migrated everything we could over to the SAN,
but a few VMs got corrupted by gluster (data was gone). Right now we
have gluster turned off and set to maintenance because the connectivity
issues were causing our main cluster to continuously migrate VMs.
Looking at the gluster hosts themselves, I noticed that heal info would
often report one brick down, even if oVirt didn't. Checking the status of
glusterd would also show that health checks had failed.
Feb 6 15:36:37 vmc3h1 glusterfs-virtstore[17036]: [2019-02-06 23:36:37.937041] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-VirstStore-posix: health-check failed, going down
Feb 6 15:36:37 vmc3h1 glusterfs-virtstore[17036]: [2019-02-06 23:36:37.937561] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-VirstStore-posix: still alive! -> SIGTERM
I think the health-check is failing (maybe erroneously), which then
kills the brick. When this happens it just causes a continuous cycle of
the brick going up and down and healing, and in turn connectivity issues.
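For anyone who wants to see how often this is happening on their own
hosts, here is a rough Python sketch that counts these health-check
messages in a saved log. It assumes the log text looks like the two
entries quoted above (MSGID 113075 from posix_health_check_thread_proc);
the function names and the sample string are only for illustration, and
other gluster versions may word the messages differently.

```python
import re
from collections import Counter

# Pattern for the two brick health-check messages quoted in this thread.
# Assumption: the entries carry MSGID 113075 and come from
# posix_health_check_thread_proc, as in the excerpt above.
HEALTH_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\] \w "
    r"\[MSGID: 113075\] \[[^\]]*posix_health_check_thread_proc\] "
    r"(?P<xlator>\S+): "
    r"(?P<msg>health-check failed, going down|still alive! -> SIGTERM)"
)


def health_check_events(log_text):
    """Return (timestamp, translator, message) for each health-check entry."""
    # Collapse all whitespace first so entries that were wrapped across
    # several lines (as in an email) still match.
    flat = " ".join(log_text.split())
    return [(m["ts"], m["xlator"], m["msg"]) for m in HEALTH_RE.finditer(flat)]


def failures_per_brick(log_text):
    """Count 'health-check failed' entries per posix translator name."""
    return Counter(
        xlator
        for _, xlator, msg in health_check_events(log_text)
        if msg.startswith("health-check failed")
    )


# Sample input: the two entries from the message above, wrapped as received.
SAMPLE = """Feb 6 15:36:37 vmc3h1 glusterfs-virtstore[17036]: [2019-02-06
23:36:37.937041] M [MSGID: 113075]
[posix-helpers.c:1957:posix_health_check_thread_proc] 0-VirstStore-posix:
health-check failed, going down
Feb 6 15:36:37 vmc3h1 glusterfs-virtstore[17036]: [2019-02-06
23:36:37.937561] M [MSGID: 113075]
[posix-helpers.c:1975:posix_health_check_thread_proc] 0-VirstStore-posix:
still alive! -> SIGTERM
"""

if __name__ == "__main__":
    for ts, xlator, msg in health_check_events(SAMPLE):
        print(ts, xlator, msg)
    print(dict(failures_per_brick(SAMPLE)))
```

A failure count that keeps climbing for the same brick over a few hours
would be consistent with the up-down cycle described above, though it
would not say whether the health-check itself is wrong.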
This is our second time running into issues with gluster, so I think we are
going to sideline it for a while.
-Ryan
On Thu, Feb 14, 2019 at 12:47 PM Darryl Scott <dscott(a)umbctraining.com>
wrote:
I do believe something went wrong after fully updating everything last
Friday. I updated all the oVirt compute nodes on Friday and gluster/engine
on Saturday. I have been experiencing these issues ever since. I have
pored over engine.log and it seems to be a storage connection issue.
------------------------------
*From:* Jayme <jaymef(a)gmail.com>
*Sent:* Thursday, February 14, 2019 1:52:59 AM
*To:* Darryl Scott
*Cc:* users
*Subject:* Re: [ovirt-users] Ovirt Cluster completely unstable
I have a three-node HCI gluster cluster which was previously running 4.2
with zero problems. I just upgraded it yesterday. I ran into a few bugs
right away with the upgrade process, but aside from that I also discovered
other users with severe GlusterFS problems since the upgrade to the new
GlusterFS version. It has been less than 24 hours since I upgraded my
cluster and I just got a notice that one of my GlusterFS bricks is offline.
There does appear to be a very real and serious issue here with the latest
updates.
On Wed, Feb 13, 2019 at 7:26 PM <dscott(a)umbctraining.com> wrote:
I'm abandoning my production oVirt cluster due to instability. I have a
7-host cluster that has been running about 300 VMs for over a year. It has
become unstable over the past three days. I have random hosts, both
compute and storage, disconnecting, and many VMs disconnecting and becoming
unusable.
The 7 hosts are 4 compute hosts running oVirt 4.2.8 and three glusterfs
hosts running 3.12.5. I submitted a Bugzilla bug and it was immediately
assigned to the storage people, but they have not responded with any
meaningful information. I have submitted several logs.
I have found some discussion of instability problems with gluster
3.12.5. I would be willing to upgrade my gluster to a more stable version
if that's the culprit. I installed gluster using the oVirt GUI and this is
the version the oVirt GUI installed.
Is there an oVirt health monitor available? Where should I be looking to
get a resolution to the problems I'm facing?
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement:
https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BL4M3JQA3IE...