[ovirt-users] Not able to resume a VM which was paused because of gluster quorum issue

Nir Soffer nsoffer at redhat.com
Thu Sep 24 06:27:35 UTC 2015


On Thu, Sep 24, 2015 at 9:06 AM, Ramesh Nachimuthu <rnachimu at redhat.com>
wrote:

>
>
> On 09/24/2015 11:28 AM, Nir Soffer wrote:
>
> On Thu, Sep 24, 2015 at 7:37 AM, Ramesh Nachimuthu <rnachimu at redhat.com>
> wrote:
>
>>
>>
>> On 09/24/2015 02:38 AM, Darrell Budic wrote:
>>
>> This is a known issue in ovirt 3.5.x and below. It’s been solved in the
>> upcoming ovirt 3.6.
>>
>> Related to https://bugzilla.redhat.com/show_bug.cgi?id=1172905. The fix
>> involved setting up a special cgroup for the mount, but I can’t find the
>> exact details at the moment.
>>
>>
>> I already have vdsm 4.17.6-0.el7.centos installed on the hosts, so I am
>> not sure that the fix for bug 1172905 covers this case.
>>
>
> I think the root cause is the same - qemu cannot recover from glusterfs
> unmount, and the only way to resume the vm is to restart it with a fresh
> mount.
>
> The mentioned bug handles the case where stopping vdsm kills the glusterfs
> mount helper. This issue is fixed in 3.6.
>
> The issue here seems different. I suggest you open a bug so gluster guys
> can investigate this.
>
>
> Seems like I am hitting the issue reported in bz
> https://bugzilla.redhat.com/show_bug.cgi?id=1171261.
>

Indeed.

I would open an ovirt bug anyway and make it depend on the glusterfs bug.

We need a way to track this issue, and having no ovirt/rhev bug hides it.


>
> Regards,
> Ramesh
>
>
> Nir
>
>
>
>
>>
>> Regards,
>> Ramesh
>>
>>
>>
>> On Sep 23, 2015, at 7:38 AM, Ramesh Nachimuthu <rnachimu at redhat.com>
>> wrote:
>>
>>
>>
>> On 09/22/2015 05:57 PM, Alastair Neil wrote:
>>
>> You need to set cluster.server-quorum-ratio to 51%.
>>
>>
>> I did that, but I am still facing the same issue. The VM gets paused when I
>> do some I/O using fio on disks backed by gluster, and I am not able to
>> resume it afterwards. The only way now is to shut down the VM and start it
>> again. It then runs successfully on the same host without any issue.
>>
>> Regards,
>> Ramesh
>>
>> On 22 September 2015 at 08:25, Ramesh Nachimuthu <rnachimu at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On 09/22/2015 05:43 PM, Alastair Neil wrote:
>>>
>>> What are the cluster.quorum-type and cluster.server-quorum-ratio
>>> settings on the volume?
>>>
>>>
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-ratio is not set.
>>>
>>> One brick process was purposefully killed, but the remaining two bricks
>>> are up and running.
>>>
>>> Regards,
>>> Ramesh
>>>
>>> On 22 September 2015 at 06:24, Ramesh Nachimuthu <rnachimu at redhat.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am not able to resume a VM which was paused because of a gluster
>>>> client quorum issue. Here is what happened in my setup.
>>>>
>>>> 1. Created a gluster storage domain backed by a gluster volume
>>>> with replica 3.
>>>> 2. Killed one brick process, so only two bricks are running in the
>>>> replica 3 setup.
>>>> 3. Created two VMs.
>>>> 4. Started some I/O using fio on both VMs.
>>>> 5. After some time, got the following errors in the gluster mount log,
>>>> and the VMs moved to the paused state:
>>>>       "server 10.70.45.17:49217 has not responded in the last 42
>>>> seconds, disconnecting."
>>>>       "vmstore-replicate-0: e16d1e40-2b6e-4f19-977d-e099f465dfc6:
>>>> Failing WRITE as quorum is not met"
>>>>       More gluster mount logs at http://pastebin.com/UmiUQq0F
>>>> 6. After some time, gluster quorum is active again and I am able to
>>>> write to the gluster file system.
>>>> 7. When I try to resume the VM, it doesn't work, and I get the following
>>>> error in the vdsm log:
>>>>       http://pastebin.com/aXiamY15
>>>>
>>>>
>>>> Regards,
>>>> Ramesh
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>>
>>
>
>
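For reference, here is a rough sketch of the quorum arithmetic discussed in this thread. It assumes the commonly documented behaviour: with cluster.quorum-type set to auto, writes are allowed when more than half of the bricks in a replica set are reachable (or exactly half, if that half includes the first brick), and server quorum is met when the percentage of reachable servers is at least cluster.server-quorum-ratio. The function names are made up for illustration, and this is a simplified model, not gluster's actual code.

```python
from typing import Sequence


def client_quorum_met(bricks_up: Sequence[bool]) -> bool:
    """Simplified model of cluster.quorum-type=auto for one replica set.

    bricks_up[i] is True when brick i is reachable from the client.
    Assumption: quorum needs more than half of the bricks, or exactly
    half when the first brick is among the reachable ones.
    """
    total = len(bricks_up)
    up = sum(1 for brick in bricks_up if brick)
    if 2 * up > total:
        return True
    return 2 * up == total and bool(bricks_up[0])


def server_quorum_met(servers_up: int, servers_total: int,
                      ratio_percent: float = 51.0) -> bool:
    """Simplified model of cluster.server-quorum-ratio.

    Assumption: quorum is met when the share of reachable servers in the
    trusted pool is at least ratio_percent.
    """
    return servers_up * 100 >= ratio_percent * servers_total


if __name__ == "__main__":
    # Replica 3 with one brick killed, as in step 2 of the report.
    print(client_quorum_met([False, True, True]))
    # The same replica set after the client loses a second brick.
    print(client_quorum_met([False, False, True]))
    # Two of three servers reachable with the ratio set to 51%.
    print(server_quorum_met(2, 3, 51.0))
```

Under this model, replica 3 with a single brick down still meets client quorum. The "quorum is not met" message in step 5 would therefore be consistent with the client also losing contact with a second brick, which matches the 42 second disconnect in the mount log.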