On Wed, Feb 17, 2016 at 7:57 AM, Sahina Bose <sabose(a)redhat.com> wrote:
> On 02/17/2016 05:08 AM, Bill James wrote:
> >
> > I have an ovirt cluster with a system running ovirt-engine 3.6.2.6-1 on
> > centos7.2, and 3 hardware nodes running glusterfs 3.7.6-1 and centos7.2.
> > I created a gluster volume using the gluster cli and then went to add a
> > storage domain.
> >
> > I created it with Path ovirt1-ks.test:/gv1 and all works fine, until
> > ovirt1 goes down.
> > Then ALL VMs pause till ovirt1 comes back up.
> > Do I have to list all nodes in the path for this to work?
> >
> > Path: ovirt1-ks.test:/gv1 ovirt2-ks.test:/gv1 ovirt3-ks.test:/gv1
> >
> > Or how do I prevent ovirt1 from being a single point of failure?
> Which type of storage domain have you created - NFS or GlusterFS?
>
> With NFS, this could be a problem, as the NFS server running on ovirt1 can
> become a single point of failure. To work around this, you will need to set
> up HA with CTDB or pacemaker/corosync, depending on the version of NFS
> you're using.
>
> If you're using GlusterFS, did the hypervisor node go down as well when
> ovirt1 went down? The server named in the path (ovirt1) is only required to
> access the volume while the storage domain is being mounted (i.e. when the
> hypervisor host is activated or rebooted). You can provide
> backup-volfile-servers=ovirt2-ks:ovirt3-ks in the Mount options field while
> creating the storage domain.
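For reference, on the hypervisor side that option ends up as an ordinary
glusterfs fuse mount, roughly equivalent to the following sketch (the mount
point under /rhev/data-center/mnt/glusterSD is the usual vdsm location; the
exact path on your hosts may differ):

    mount -t glusterfs \
        -o backup-volfile-servers=ovirt2-ks.test:ovirt3-ks.test \
        ovirt1-ks.test:/gv1 \
        /rhev/data-center/mnt/glusterSD/ovirt1-ks.test:_gv1

You can also check how the domain is currently mounted (fuse.glusterfs vs.
nfs) with "grep gv1 /proc/mounts" on the hypervisor.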
oVirt automatically detects the available bricks and generates the
backup-volfile-servers option when connecting to a gluster storage domain.
You can see whether this worked in vdsm.log: the backup-volfile-servers
option should appear in the mount command.
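For example, something like this should show it (assuming the default vdsm
log location):

    grep backup-volfile-servers /var/log/vdsm/vdsm.log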
Please attach vdsm.log showing the time you connected to the gluster
storage domain, and the time of the failure.
Please also attach /var/log/sanlock.log - this is the best place to detect
issues accessing storage, since sanlock reads from and writes to all storage
domains frequently.
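If it helps, collecting both files from the hypervisor into one archive is
enough; a minimal sketch, assuming the default log locations (the archive
name is just an example):

    tar czf ovirt1-logs.tar.gz -C /var/log vdsm/vdsm.log sanlock.log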
Please also provide the output of gluster volume info. (I'm assuming the
bricks on the remaining servers were online.)
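For completeness, run these on any of the gluster nodes (gv1 is the volume
name from your path; adjust if different):

    gluster volume info gv1
    gluster volume status gv1

"volume status" shows whether each brick process is online, which is the
quickest way to confirm that the bricks on ovirt2 and ovirt3 stayed up.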
Nir