[ovirt-users] Re: Ovirt cluster unstable; gluster to blame (again)

Friday, 6 July 2018

Load like that is mostly I/O-based: either the machine is swapping or the
network is too slow. Check I/O wait in top.
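
If you would rather log I/O wait than watch it in top, a rough sketch like the one below computes the same percentage from two samples of the aggregate "cpu" line in /proc/stat. The function names are made up for illustration, and it assumes a Linux host with the usual field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    first, second = cpu_times(before), cpu_times(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Only the first eight counters are summed: guest and guest_nice are
    # already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is the iowait counter.
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return iowait in percent."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval)
    with open(path) as f:
        after = f.readline()
    return iowait_percent(before, after)
```

Calling sample_iowait() on each host during a heal gives a number you can compare across the three servers. Values that stay high alongside a high load average would point at storage or network waits rather than CPU.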

As for the OOM killer killing off gluster: does that mean you don't monitor
RAM usage on the servers? Either gluster is eating all your RAM, swap gets
really I/O intensive, and then the process is killed off, or you have the
wrong swap settings in sysctl.conf. There are tons of broken guides that
recommend setting swappiness to 0, but that disables swap on newer kernels.
The proper swappiness for swapping only when necessary is 1 or a
sufficiently low number like 10; the default is 60.
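
To see which case applies, a small sketch like this one reads the value from sysctl.conf-style text and compares it with the advice above. The helper names are hypothetical, and it only looks at the text you hand it, so settings in other files (for example under /etc/sysctl.d) are not considered.

```python
def configured_swappiness(conf_text):
    """Return the last vm.swappiness value set in sysctl.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "vm.swappiness":
            value = int(val.strip())  # a later assignment overrides an earlier one
    return value


def describe(value):
    """Classify a swappiness value according to the advice in this message."""
    if value is None:
        return "not set in this text; the kernel default of 60 applies"
    if value == 0:
        return "0: the setting this message warns against"
    if value <= 10:
        return "low: swap only when necessary"
    return "higher than the 1-10 range suggested here"


def live_swappiness(path="/proc/sys/vm/swappiness"):
    """Read the value the running kernel is actually using (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

For example, describe(configured_swappiness(open("/etc/sysctl.conf").read())) reports on the file, while describe(live_swappiness()) reports on the running kernel. The two can differ if the file was edited without reloading it.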

Moving to NFS will not improve things. You will get more memory back since
gluster isn't running, and that is good. But you will have a single node
that can fail and take all your storage with it, and it would still be on
1 gigabit only, which your three-node cluster would easily saturate.
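
The arithmetic behind that is simple enough to sketch. This is a back-of-the-envelope calculation in decimal units that ignores protocol overhead, so real throughput would be lower; the function names are only illustrative.

```python
def link_mb_per_s(gbit_per_s):
    """Raw line rate in megabytes per second (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


def per_client_share(gbit_per_s, clients):
    """Even split of one storage link among several hosts using it at once."""
    return link_mb_per_s(gbit_per_s) / clients


# A single 1 gigabit link tops out at 125 MB/s before overhead.
one_gig = link_mb_per_s(1)

# Three hosts hitting one NFS server over that link get about 41.7 MB/s each.
share = per_client_share(1, 3)

# The same split on a 10 gigabit link, for comparison.
ten_gig_share = per_client_share(10, 3)

print(f"1G link: {one_gig:.0f} MB/s total, {share:.1f} MB/s per host")
print(f"10G link: {ten_gig_share:.1f} MB/s per host")
```

The 50MB/s peak reported in the original message is already a large fraction of what one such link can carry, before any VM traffic from the other two hosts is added.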

On July 7, 2018 04:13:13 Jim Kusznir <jim(a)palousetech.com> wrote:
...
 So far it does not appear to be helping much. I'm still getting VMs
 locking up and all kinds of notices from the oVirt engine about
 non-responsive hosts.  I'm still seeing load averages in the 20-30 range.

 Jim

 On Fri, Jul 6, 2018, 3:13 PM Jim Kusznir <jim(a)palousetech.com> wrote:
 Thank you for the advice and help

 I do plan on going 10Gbps networking; haven't quite jumped off that cliff 
 yet, though.

 I did put my data-hdd (main VM storage volume) onto a dedicated 1Gbps
 network, and I've watched throughput on that and never seen more than
 60MB/s achieved (as reported by bwm-ng).  I have a separate 1Gbps network
 for communication and ovirt migration, but I wanted to break that up
 further (separate out VM traffic from migration/mgmt traffic).  My three
 SSD-backed gluster volumes run on the main network too, as I haven't been
 able to get them to move to the new network (which I was trying to use as
 an all-gluster network).  I tried bonding, but that seemed to reduce
 performance rather than improve it.

 --Jim

 On Fri, Jul 6, 2018 at 2:52 PM, Jamie Lawrence <jlawrence(a)squaretrade.com>
 wrote:

 Hi Jim,

 I don't have any targeted suggestions, because there isn't much to latch on
 to. I can say that Gluster replica three (no arbiters) on dedicated servers
 serving a couple of oVirt VM clusters here has not had these sorts of issues.

 I suspect your long heal times (and the resultant long periods of high 
 load) are at least partly related to 1G networking. That is just a matter 
 of IO - heals of VMs involve moving a lot of bits. My cluster uses 10G 
 bonded NICs on the gluster and ovirt boxes for storage traffic and separate 
 bonded 1G for ovirtmgmt and communication with other machines/people, and 
 we're occasionally hitting the bandwidth ceiling on the storage network. 
 I'm starting to think about 40/100G, different ways of splitting up 
 intensive systems, and considering iSCSI for specific volumes, although I 
 really don't want to go there.

 I don't run FreeNAS[1], but I do run FreeBSD as storage servers for their 
 excellent ZFS implementation, mostly for backups. ZFS will make your `heal` 
 problem go away, but not your bandwidth problems, which become worse 
 (because of fewer NICs pushing traffic). 10G hardware is not exactly in the
 impulse-buy territory, but if you can, I'd recommend doing some testing 
 using it. I think at least some of your problems are related.

 If that's not possible, my next stops would be optimizing everything I 
 could about sharding, healing and optimizing for serving the shard size to 
 squeeze as much performance out of 1G as I could, but that will only go so far.

 -j

 [1] FreeNAS is just a storage-tuned FreeBSD with a GUI.

> On Jul 6, 2018, at 1:19 PM, Jim Kusznir <jim(a)palousetech.com> wrote:
>
> hi all:
>
> Once again my production ovirt cluster is collapsing in on itself.  My 
> servers are intermittently unavailable or degrading, customers are noticing 
> and calling in.  This seems to be yet another gluster failure that I 
> haven't been able to pin down.
>
> I posted about this a while ago, but didn't get anywhere (no replies that I 
> found).  The problem started out as a glusterfsd process consuming large 
> amounts of ram (up to the point where ram and swap were exhausted and the 
> kernel OOM killer killed off the glusterfsd process).  For reasons not 
> clear to me at this time, that resulted in any VMs running on that host and
> that gluster volume being paused with an I/O error (the glusterfs process is
> usually unharmed; why it didn't continue I/O with other servers is 
> confusing to me).
>
> I have 3 servers and a total of 4 gluster volumes (engine, iso, data, and 
> data-hdd).  The first 3 are replica 2+arb; the 4th (data-hdd) is replica 3. 
>  The first 3 are backed by an LVM partition (some thin provisioned) on an 
> SSD; the 4th is on a seagate hybrid disk (hdd + some internal flash for 
> acceleration).  data-hdd is the only thing on the disk.  Servers are Dell 
> R610 with the PERC/6i raid card, with the disks individually passed through 
> to the OS (no raid enabled).
>
> The above RAM usage issue came from the data-hdd volume.  Yesterday, I
> caught one of the glusterfsd high RAM usage episodes before the OOM killer had to
> run.  I was able to migrate the VMs off the machine and for good measure, 
> reboot the entire machine (after taking this opportunity to run the 
> software updates that ovirt said were pending).  Upon booting back up, the 
> necessary volume healing began.  However, this time, the healing caused all 
> three servers to go to very, very high load averages (I saw just under 200 
> on one server; typically they've been 40-70) with top reporting IO Wait at 
> 7-20%.  Network for this volume is a dedicated gig network.  According to 
> bwm-ng, initially the network bandwidth would hit 50MB/s (yes, bytes), but 
> tailed off to mostly in the kB/s for a while.  All machines' load averages 
> were still 40+ and gluster volume heal data-hdd info reported 5 items 
> needing healing.  Servers were intermittently experiencing IO issues, even
> on the 3 gluster volumes that appeared largely unaffected.  Even the OS
> activities on the hosts themselves (logging in, running commands) would often
> be very delayed.  The ovirt engine was seemingly randomly throwing engine 
> down / engine up / engine failed notifications.  Responsiveness on ANY VM 
> was horrific most of the time, with random VMs being inaccessible.
>
> I let the gluster heal run overnight.  By morning, there were still 5 items 
> needing healing, all three servers were still experiencing high load, and 
> servers were still largely unstable.
>
> I've noticed that all of my ovirt outages (and I've had a lot, way more 
> than is acceptable for a production cluster) have come from gluster.  I 
> still have 3 VMs whose hard disk images have become corrupted by my last
> gluster crash that I haven't had time to repair / rebuild yet (I believe 
> this crash was caused by the OOM issue previously mentioned, but I didn't 
> know it at the time).
>
> Is gluster really ready for production yet?  It seems so unstable to me.... 
>  I'm looking at replacing gluster with a dedicated NFS server, likely
> FreeNAS.  Any suggestions?  What is the "right" way to do production 
> storage on this (3 node cluster)?  Can I get this gluster volume stable 
> enough to get my VMs to run reliably again until I can deploy another 
> storage solution?
>
> --Jim
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
>
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YQX3LQFQQPW...
