[ovirt-users] Not able to resume a VM which was paused because of gluster quorum issue
Ramesh Nachimuthu
rnachimu at redhat.com
Thu Sep 24 06:06:11 UTC 2015
On 09/24/2015 11:28 AM, Nir Soffer wrote:
> On Thu, Sep 24, 2015 at 7:37 AM, Ramesh Nachimuthu
> <rnachimu at redhat.com> wrote:
>
>
>
> On 09/24/2015 02:38 AM, Darrell Budic wrote:
>> This is a known issue in oVirt 3.5.x and below. It’s been solved
>> in the upcoming oVirt 3.6.
>>
>> Related to https://bugzilla.redhat.com/show_bug.cgi?id=1172905,
>> the fix involved setting up a special cgroup for the mount, but I
>> can’t find the exact details atm.
>>
>
> I have vdsm 4.17.6-0.el7.centos already installed on the hosts, so
> I am not sure that the fix for bug 1172905
> <https://bugzilla.redhat.com/show_bug.cgi?id=1172905> covers this
> case.
>
>
> I think the root cause is the same - qemu cannot recover from
> glusterfs unmount, and the only way to resume the vm is to restart it
> with a fresh mount.
>
> The mentioned bug handles the case where stopping vdsm kills the
> glusterfs mount helper. This issue is fixed in 3.6.
>
> The issue here seems different. I suggest you open a bug so gluster
> guys can investigate this.
>
It looks like I am hitting the issue reported in
https://bugzilla.redhat.com/show_bug.cgi?id=1171261.
Regards,
Ramesh
> Nir
>
>
>
> Regards,
> Ramesh
>
>
>>
>>> On Sep 23, 2015, at 7:38 AM, Ramesh Nachimuthu
>>> <rnachimu at redhat.com> wrote:
>>>
>>>
>>>
>>> On 09/22/2015 05:57 PM, Alastair Neil wrote:
>>>> You need to set cluster.server-quorum-ratio to 51%
>>>>
>>>
>>> I did that, but I am still facing the same issue. The VMs get paused
>>> when I do some I/O using fio on disks backed by gluster, and I
>>> am not able to resume them afterwards. The only way now is to
>>> bring down the VM and run it again. It then runs successfully on the
>>> same host without any issue.
>>>
>>> Regards,
>>> Ramesh
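
A minimal sketch of the suggestion above (raising the server quorum ratio to 51%), written as Python that only builds the gluster CLI argument lists and does not run them. It assumes the option's real name is cluster.server-quorum-ratio and that it is set pool-wide through the "all" target; the volume name "vmstore" is a guess taken from the "vmstore-replicate-0" prefix in the mount log quoted further down, and quorum_commands is a hypothetical helper.

```python
def quorum_commands(volume, ratio_percent=51):
    """Return gluster CLI invocations as argument lists (nothing is executed)."""
    return [
        # Assumption: the ratio is a pool-wide option, hence the "all" target.
        ["gluster", "volume", "set", "all",
         "cluster.server-quorum-ratio", f"{ratio_percent}%"],
        # Show the volume's options afterwards to confirm what is set.
        ["gluster", "volume", "info", volume],
    ]


# "vmstore" is an assumed volume name, not confirmed in the thread.
for cmd in quorum_commands("vmstore"):
    print(" ".join(cmd))
```

Passing each list to subprocess.run on a gluster node would apply the setting; that step is left out here because it changes cluster state.
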
>>>
>>>> On 22 September 2015 at 08:25, Ramesh Nachimuthu
>>>> <rnachimu at redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> On 09/22/2015 05:43 PM, Alastair Neil wrote:
>>>>> What are the cluster.quorum-type
>>>>> and cluster.server-quorum-ratio settings on the volume?
>>>>>
>>>>
>>>> cluster.server-quorum-type: server
>>>> cluster.quorum-type: auto
>>>> cluster.server-quorum-ratio is not set.
>>>>
>>>> One brick process was deliberately killed, but the remaining two
>>>> bricks are up and running.
>>>>
>>>> Regards,
>>>> Ramesh
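
To make the client-quorum arithmetic behind these settings concrete, here is a small sketch of how cluster.quorum-type is generally understood to behave: with "auto", writes are allowed when more than half of the bricks in the replica set are up, or exactly half including the first brick. The function client_quorum_met and its parameters are hypothetical names for illustration, not part of gluster.

```python
def client_quorum_met(bricks_up, quorum_type="auto", quorum_count=None):
    """bricks_up: one boolean per brick in the replica set, in volume order."""
    total = len(bricks_up)
    up = sum(bricks_up)
    if quorum_type == "fixed":
        # "fixed": at least quorum_count bricks must be reachable.
        return up >= quorum_count
    if quorum_type == "auto":
        if 2 * up > total:
            return True
        # Exactly half up: quorum holds only if the first brick is among them.
        return 2 * up == total and bricks_up[0]
    # "none": no client-side quorum; any reachable brick accepts writes.
    return up > 0


print(client_quorum_met([True, True, False]))   # True: 2 of 3 bricks up
print(client_quorum_met([True, False, False]))  # False: 1 of 3 bricks up
```

Under this reading, a replica 3 volume with one brick killed still has quorum, so a "quorum is not met" failure would imply that the client lost contact with a second brick as well.
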
>>>>
>>>>> On 22 September 2015 at 06:24, Ramesh Nachimuthu
>>>>> <rnachimu at redhat.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am not able to resume a VM which was paused
>>>>> because of a gluster client quorum issue. Here is what
>>>>> happened in my setup.
>>>>>
>>>>> 1. Created a gluster storage domain which is backed by
>>>>> a gluster volume with replica 3.
>>>>> 2. Killed one brick process, so only two bricks are
>>>>> running in the replica 3 setup.
>>>>> 3. Created two VMs
>>>>> 4. Started some IO using fio on both of the VMs
>>>>> 5. After some time I got the following errors in the gluster
>>>>> mount log and the VMs moved to the paused state.
>>>>> "server 10.70.45.17:49217 has not responded in the
>>>>> last 42 seconds, disconnecting."
>>>>> "vmstore-replicate-0:
>>>>> e16d1e40-2b6e-4f19-977d-e099f465dfc6: Failing WRITE as
>>>>> quorum is not met"
>>>>> More gluster mount logs at
>>>>> http://pastebin.com/UmiUQq0F
>>>>> 6. After some time gluster quorum is active again and I am
>>>>> able to write to the gluster file system.
>>>>> 7. When I try to resume the VM it doesn't work, and I
>>>>> get the following error in the vdsm log:
>>>>> http://pastebin.com/aXiamY15
>>>>>
>>>>>
>>>>> Regards,
>>>>> Ramesh
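
The two mount-log messages quoted in step 5 can be picked out of a longer log with a short script. This is only a sketch: the regular expressions are derived from the two messages as quoted in this report, the real log lines carry extra prefixes (timestamps, translator names) that are not shown here, and scan_mount_log is a hypothetical helper name.

```python
import re

# Patterns based only on the two messages quoted in the report.
DISCONNECT_RE = re.compile(
    r"server (?P<addr>\d+\.\d+\.\d+\.\d+:\d+) has not responded in the last "
    r"(?P<secs>\d+) seconds, disconnecting"
)
QUORUM_RE = re.compile(r"Failing (?P<op>[A-Z]+) as quorum is not met")


def scan_mount_log(lines):
    """Return disconnect and quorum-loss events in the order they appear."""
    events = []
    for line in lines:
        m = DISCONNECT_RE.search(line)
        if m:
            events.append(("disconnect", m.group("addr"), int(m.group("secs"))))
            continue
        m = QUORUM_RE.search(line)
        if m:
            events.append(("quorum_lost", m.group("op")))
    return events


sample = [
    "server 10.70.45.17:49217 has not responded in the last 42 seconds, disconnecting.",
    "vmstore-replicate-0: e16d1e40-2b6e-4f19-977d-e099f465dfc6: "
    "Failing WRITE as quorum is not met",
]
print(scan_mount_log(sample))
```

Seeing a disconnect event shortly before the first quorum_lost event would support the idea that the pause was triggered by losing a second brick connection rather than by the brick that was killed on purpose.
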
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>