
On Wed, Feb 17, 2016 at 7:57 AM, Sahina Bose <sabose@redhat.com> wrote:
On 02/17/2016 05:08 AM, Bill James wrote:
I have an oVirt cluster with a system running ovirt-engine 3.6.2.6-1 on CentOS 7.2 and 3 hardware nodes running glusterfs 3.7.6-1 on CentOS 7.2. I created a gluster volume using the gluster CLI and then went to add a storage domain.
I created it with the Path ovirt1-ks.test:/gv1 and all works fine until ovirt1 goes down. Then ALL VMs pause until ovirt1 comes back up. Do I have to list all nodes in the path for this to work?
Path: ovirt1-ks.test:/gv1 ovirt2-ks.test:/gv1 ovirt3-ks.test:/gv1
Or how do I prevent ovirt1 from being a single point of failure?
Which type of storage domain have you created - NFS or GlusterFS? With NFS this could be a problem, as the NFS server running on ovirt1 becomes a single point of failure. To work around this, you will need to set up HA with CTDB or pacemaker/corosync, depending on the version of NFS you're using.
If you're using GlusterFS, did the hypervisor node go down as well when ovirt1 went down? The server given in the path (ovirt1) is only needed while mounting the storage domain (on activation of the hypervisor host or reboot of the host); after that, the gluster client talks to all the bricks directly. You can provide backup-volfile-servers=ovirt2-ks:ovirt3-ks in the Mount Options while creating the storage domain, so the mount can fall back to the other servers.
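For reference, a minimal sketch (Python, not actual vdsm code) of the kind of mount command the hypervisor ends up running when that option is set. The server names and volume are the ones from this thread, and the mount target is only the typical vdsm location, so treat the details as illustrative:

primary = "ovirt1-ks.test"                       # server from the storage domain path
volume = "gv1"
backups = ["ovirt2-ks.test", "ovirt3-ks.test"]   # the other brick servers

mount_opts = "backup-volfile-servers=" + ":".join(backups)
mount_cmd = [
    "mount", "-t", "glusterfs",
    "-o", mount_opts,
    "%s:/%s" % (primary, volume),
    "/rhev/data-center/mnt/glusterSD/%s:_%s" % (primary, volume),
]
# Only this initial mount needs 'primary' to be reachable; afterwards the
# gluster client talks to all bricks and can fail over to the backup servers.
print(" ".join(mount_cmd))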
oVirt automatically detects the available bricks and generates the backup-volfile-servers option when connecting to a gluster storage domain. You can see whether this worked in vdsm.log; the backup-volfile-servers option should appear in the mount command. Please attach vdsm.log covering the time you connected to the gluster storage domain and the time of the failure. Also attach /var/log/sanlock.log - sanlock reads and writes to all storage domains frequently, so its log is the best place to detect storage access issues.
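If it helps, here is a rough sketch (just an illustration, not an oVirt tool) of checking vdsm.log for the gluster mount command and whether backup-volfile-servers was passed; the log path is the default vdsm location:

LOG = "/var/log/vdsm/vdsm.log"    # default vdsm log location

with open(LOG) as f:
    for line in f:
        # the mount command should show up around the time the domain was connected
        if "glusterfs" in line and "mount" in line:
            flag = "OK" if "backup-volfile-servers" in line else "MISSING"
            print(flag, line.strip())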
Please also provide the output of gluster volume info. (I'm assuming the bricks on the remaining servers were online.)
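A small helper along the same lines (purely illustrative) to collect the requested output - gluster volume info for gv1 plus any error lines from sanlock.log - so it can all be attached in one reply:

import subprocess

volume = "gv1"

# volume layout and brick list, as requested above
print(subprocess.check_output(["gluster", "volume", "info", volume]).decode())

# sanlock logs lease renewal and I/O failures when a storage domain stops responding
with open("/var/log/sanlock.log") as f:
    for line in f:
        if "error" in line.lower() or "fail" in line.lower():
            print(line.strip())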
Nir