
Hi,

I've set up 4 oVirt nodes with Gluster storage to provide highly available virtual machines. The Gluster volumes are Distributed-Replicate with a replica count of 2.

The extra volume options are configured:

cat /var/lib/glusterd/groups/virt
quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server

Volume for the self-hosted engine:

gluster volume info engine

Volume Name: engine
Type: Distributed-Replicate
Volume ID: 9e7a3265-1e91-46e1-a0ba-09c5cc1fc1c1
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster004:/gluster/engine/004
Brick2: gluster005:/gluster/engine/005
Brick3: gluster006:/gluster/engine/006
Brick4: gluster007:/gluster/engine/007
Options Reconfigured:
cluster.quorum-type: auto
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
network.ping-timeout: 10

Volume for the virtual machines:

gluster volume info data

Volume Name: data
Type: Distributed-Replicate
Volume ID: 896db323-7ac4-4023-82a6-a8815a4d06b4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster004:/gluster/data/004
Brick2: gluster005:/gluster/data/005
Brick3: gluster006:/gluster/data/006
Brick4: gluster007:/gluster/data/007
Options Reconfigured:
cluster.quorum-type: auto
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
storage.owner-uid: 36
storage.owner-gid: 36
cluster.server-quorum-type: server
network.ping-timeout: 10

Everything seems to be working fine. However, when I stop the storage network on gluster004 or gluster006, client-quorum is lost. Client-quorum isn't lost when the storage network is stopped on gluster005 or gluster007.

[2015-02-16 07:05:58.541531] W [MSGID: 108001] [afr-common.c:3635:afr_notify] 0-data-replicate-1: Client-quorum is not met
[2015-02-16 07:05:58.541579] W [MSGID: 108001] [afr-common.c:3635:afr_notify] 0-engine-replicate-1: Client-quorum is not met

As a result, the volumes become read-only and the VMs are paused.

I've added a "dummy" Gluster node for quorum purposes (no bricks, only running glusterd), but that didn't help.

gluster peer status
Number of Peers: 4

Hostname: gluster005
Uuid: 6c5253b4-b1c6-4d0a-9e6b-1f3efc1e8086
State: Peer in Cluster (Connected)

Hostname: gluster006
Uuid: 4b3d15c4-2de0-4d2e-aa4c-3981e47dadbd
State: Peer in Cluster (Connected)

Hostname: gluster007
Uuid: 165e9ada-addb-496e-abf7-4a4efda4d5d3
State: Peer in Cluster (Connected)

Hostname: glusterdummy
Uuid: 3ef8177b-2394-429b-a58e-ecf0f6ce79a0
State: Peer in Cluster (Connected)

The 4 nodes are running CentOS 7, with the following oVirt / Gluster packages:

glusterfs-3.6.2-1.el7.x86_64
glusterfs-api-3.6.2-1.el7.x86_64
glusterfs-cli-3.6.2-1.el7.x86_64
glusterfs-fuse-3.6.2-1.el7.x86_64
glusterfs-libs-3.6.2-1.el7.x86_64
glusterfs-rdma-3.6.2-1.el7.x86_64
glusterfs-server-3.6.2-1.el7.x86_64
ovirt-engine-sdk-python-3.5.1.0-1.el7.centos.noarch
ovirt-host-deploy-1.3.1-1.el7.noarch
ovirt-hosted-engine-ha-1.2.5-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.2.2-1.el7.centos.noarch
vdsm-gluster-4.16.10-8.gitc937927.el7.noarch

The self-hosted engine is running CentOS 6 with ovirt-engine-3.5.1-1.el6.noarch.

Regards,

Wesley
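P.S. For completeness, the volume options shown above were applied via the stock virt option group plus a few explicit settings. Roughly like this, reconstructed from the "Options Reconfigured" output, so treat it as a sketch rather than a verbatim command history:

    # apply /var/lib/glusterd/groups/virt to both volumes
    gluster volume set engine group virt
    gluster volume set data group virt
    # vdsm/kvm ownership and a short ping timeout, per volume
    gluster volume set engine storage.owner-uid 36
    gluster volume set engine storage.owner-gid 36
    gluster volume set engine network.ping-timeout 10
    gluster volume set data storage.owner-uid 36
    gluster volume set data storage.owner-gid 36
    gluster volume set data network.ping-timeout 10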
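P.P.S. In case it matters, this is how I reproduce it (eth1 just stands in for the storage-network interface on the node, and the log path is the generic glusterfs client log location, not a specific file from my setup):

    # on gluster004 (or gluster006): take the storage network down
    ifdown eth1
    # on a hypervisor mounting the volumes: watch the FUSE client logs
    grep -i "Client-quorum" /var/log/glusterfs/*.log

Shortly after the interface goes down (within the 10-second network.ping-timeout configured above), the "Client-quorum is not met" warnings quoted earlier appear and the VMs pause.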