This is a multi-part message in MIME format.
--------------030503030503080006060300
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
On 09/24/2015 11:28 AM, Nir Soffer wrote:
On Thu, Sep 24, 2015 at 7:37 AM, Ramesh Nachimuthu
<rnachimu@redhat.com> wrote:
On 09/24/2015 02:38 AM, Darrell Budic wrote:
> This is a known issue in oVirt 3.5.x and below. It’s been solved
> in the upcoming oVirt 3.6.
>
> Related to https://bugzilla.redhat.com/show_bug.cgi?id=1172905,
> the fix involved setting up a special cgroup for the mount, but I
> can’t find the exact details atm.
>
I have vdsm 4.17.6-0.el7.centos already installed on the hosts, so
I am not sure the above bug 1172905
<https://bugzilla.redhat.com/show_bug.cgi?id=1172905> fixes this
correctly.
I think the root cause is the same - qemu cannot recover from
glusterfs unmount, and the only way to resume the vm is to restart it
with a fresh mount.
The mentioned bug handles the case where stopping vdsm kills the
glusterfs mount helper. This issue is fixed in 3.6.
The issue here seems different. I suggest you open a bug so gluster
guys can investigate this.
Seems like I am hitting the issue reported in bz
https://bugzilla.redhat.com/show_bug.cgi?id=1171261.
Regards,
Ramesh
Nir
Regards,
Ramesh
>
>> On Sep 23, 2015, at 7:38 AM, Ramesh Nachimuthu
>> <rnachimu@redhat.com> wrote:
>>
>>
>>
>> On 09/22/2015 05:57 PM, Alastair Neil wrote:
>>> You need to set the cluster.server-quorum-ratio to 51%
>>>
>>
>> I did that. But still I am facing the same issue. The VM gets paused
>> when I do some I/O using fio on some disks backed by gluster. I
>> am not able to resume the VM after this. Now the only way is to
>> bring down the VM and run it again. It runs successfully on the
>> same host without any issue.
>>
>> Regards,
>> Ramesh
>>
>>> On 22 September 2015 at 08:25, Ramesh Nachimuthu
>>> <rnachimu@redhat.com> wrote:
>>>
>>>
>>>
>>> On 09/22/2015 05:43 PM, Alastair Neil wrote:
>>>> What are the cluster.quorum-type
>>>> and cluster.server-quorum-ratio settings on the volume?
>>>>
>>>
>>> *cluster.server-quorum-type*:server
>>> *cluster.quorum-type*:auto
>>> *cluster.server-quorum-ratio is not set.*
>>>
>>> One brick process was purposely killed, but the remaining two
>>> bricks are up and running.
>>>
>>> Regards,
>>> Ramesh
>>>
>>>> On 22 September 2015 at 06:24, Ramesh Nachimuthu
>>>> <rnachimu@redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am not able to resume a VM which was paused
>>>> because of a gluster client quorum issue. Here is what
>>>> happened in my setup.
>>>>
>>>> 1. Created a gluster storage domain which is backed by a
>>>> gluster volume with replica 3.
>>>> 2. Killed one brick process, so only two bricks are
>>>> running in the replica 3 setup.
>>>> 3. Created two VMs.
>>>> 4. Started some I/O using fio on both of the VMs.
>>>> 5. After some time, got the following errors in the gluster
>>>> mount log, and the VMs moved to the paused state:
>>>> "server 10.70.45.17:49217 has not responded in the
>>>> last 42 seconds, disconnecting."
>>>> "vmstore-replicate-0:
>>>> e16d1e40-2b6e-4f19-977d-e099f465dfc6: Failing WRITE as
>>>> quorum is not met"
>>>> More gluster mount logs at http://pastebin.com/UmiUQq0F
>>>> 6. After some time, gluster quorum is active and I am
>>>> able to write to the gluster file system.
>>>> 7. When I try to resume the VM, it doesn't work and I
>>>> got the following error in the vdsm log:
>>>> http://pastebin.com/aXiamY15
>>>>
>>>>
>>>> Regards,
>>>> Ramesh
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>>
>>
>
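For readers following the quorum discussion in this thread, here is a
minimal Python sketch of the client-quorum rule, assuming the documented
behaviour of cluster.quorum-type=auto: writes are allowed while more than
half of the bricks in a replica set are reachable, and when exactly half
are reachable only if the first brick is among them. The function name
client_quorum_met and its parameters are hypothetical, chosen for
illustration; this is not GlusterFS code.

```python
def client_quorum_met(up_bricks, replica_count, first_brick_up=True):
    """Return True if writes would be allowed under the assumed
    cluster.quorum-type=auto rule (illustrative model only)."""
    if replica_count < 1 or not 0 <= up_bricks <= replica_count:
        raise ValueError("up_bricks must be between 0 and replica_count")
    # More than half of the bricks reachable: quorum is met.
    if 2 * up_bricks > replica_count:
        return True
    # Exactly half reachable (even replica counts only): the first
    # brick breaks the tie.
    return 2 * up_bricks == replica_count and first_brick_up


if __name__ == "__main__":
    # Replica 3, as in the setup described in this thread.
    for up in (3, 2, 1):
        print(f"{up} of 3 bricks up -> quorum met: {client_quorum_met(up, 3)}")
```

Under this model, a replica 3 volume with one brick killed still has
quorum. Writes would fail only once a second brick becomes unreachable,
which is consistent with the "has not responded in the last 42 seconds"
message appearing just before "Failing WRITE as quorum is not met".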
--------------030503030503080006060300
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 8bit
<html>
<head>
<meta content="text/html; charset=utf-8"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 09/24/2015 11:28 AM, Nir Soffer
wrote:<br>
</div>
<blockquote
cite="mid:CAMRbyyvi0mtEVyae4TJxq-67B7ZKwnEVwUu5Dp6DrChO1Y2_Yw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Thu, Sep 24, 2015 at 7:37 AM,
Ramesh Nachimuthu <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rnachimu@redhat.com"
target="_blank">rnachimu@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span
class=""> <br>
<br>
<div>On 09/24/2015 02:38 AM, Darrell Budic wrote:<br>
</div>
<blockquote type="cite"> This is a known issue in
oVirt 3.5.x and below. It’s been solved in the
upcoming oVirt 3.6.
<div><br>
</div>
<div>Related to <a moz-do-not-send="true"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1172905"
target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1172905</a>,
the fix involved setting up a special cgroup for
the mount, but I can’t find the exact details atm.</div>
<div><br>
</div>
</blockquote>
<br>
</span> I have vdsm 4.17.6-0.el7.centos already
installed on the hosts, so I am not sure the above bug <a
moz-do-not-send="true"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1172905"
target="_blank">1172905</a> fixes this
correctly.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I think the root cause is the same - qemu cannot
recover from glusterfs unmount, and the only way to resume
the vm is to restart it with a fresh mount.</div>
<div><br>
</div>
<div>The mentioned bug handles the case where stopping vdsm
kills the glusterfs mount helper. This issue is fixed in
3.6. </div>
<div><br>
</div>
<div>The issue here seems different. I suggest you open a
bug so gluster guys can investigate this.</div>
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Seems like I am hitting the issue reported in bz
<a class="moz-txt-link-freetext"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1171261">https://bugzilla.redhat.com/show_bug.cgi?id=1171261</a>.
<br>
<br>
Regards,<br>
Ramesh<br>
<br>
<blockquote
cite="mid:CAMRbyyvi0mtEVyae4TJxq-67B7ZKwnEVwUu5Dp6DrChO1Y2_Yw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Nir</div>
<div><br>
</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
Regards,<br>
Ramesh
<div>
<div class="h5"><br>
<br>
<blockquote type="cite">
<div><br>
<div>
<blockquote type="cite">
<div>On Sep 23, 2015, at 7:38 AM, Ramesh
Nachimuthu &lt;<a moz-do-not-send="true"
href="mailto:rnachimu@redhat.com"
target="_blank">rnachimu@redhat.com</a>&gt;
wrote:</div>
<br>
<div>
<div text="#000000"
bgcolor="#FFFFFF"> <br>
<br>
<div>On 09/22/2015 05:57 PM, Alastair
Neil wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">You need to set the
cluster.server-quorum-ratio to 51%</div>
<div class="gmail_extra"><br>
</div>
</blockquote>
<br>
I did that. But still I am facing the
same issue. The VM gets paused when I do some
I/O using fio on some disks backed by
gluster. I am not able to resume the VM
after this. Now the only way is to bring
down the VM and run it again. It runs
successfully on the same host without
any issue.<br>
<br>
Regards,<br>
Ramesh<br>
<br>
<blockquote type="cite">
<div class="gmail_extra">
<div class="gmail_quote">On 22
September 2015 at 08:25, Ramesh
Nachimuthu <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rnachimu@redhat.com"
target="_blank">rnachimu@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF"><span>
<br>
<br>
<div>On 09/22/2015 05:43 PM,
Alastair Neil wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">What are
the cluster.quorum-type
and cluster.server-quorum-ratio
settings on the volume?</div>
<div
class="gmail_extra"><br>
</div>
</blockquote>
<br>
</span>
<div
style="outline-style:none">
<div
style="overflow:hidden;text-overflow:ellipsis;white-space:nowrap"><b>cluster.server-quorum-type</b>:server<br>
<div title=""
style="outline-style:none">
<div
style="overflow:hidden;text-overflow:ellipsis;white-space:nowrap"><b>cluster.quorum-type</b>:auto<br>
<b>cluster.server-quorum-ratio
is not set.</b><br>
<br>
</div>
</div>
One brick process was
purposely killed, but the
remaining two bricks are
up and running.<br>
<br>
Regards,<br>
Ramesh<br>
</div>
</div>
<span> <br>
<blockquote type="cite">
<div class="gmail_extra">
<div
class="gmail_quote">On
22 September 2015 at
06:24, Ramesh
Nachimuthu <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rnachimu@redhat.com" target="_blank">rnachimu@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div text="#000000"
bgcolor="#FFFFFF">
Hi,<br>
<br>
I am not able
to resume a VM
which was paused
because of a gluster
client quorum
issue. Here is
what happened in
my setup. <br>
<br>
1. Created a gluster storage domain which is backed by a
gluster volume with replica 3. <br>
2. Killed one brick process, so only two bricks
are running in the replica 3 setup.<br>
3. Created two VMs.<br>
4. Started some I/O using fio on both of the VMs.<br>
5. After some time, got the following errors in the gluster
mount log, and the VMs moved to the paused state:<br>
"server <a
moz-do-not-send="true"
href="http://10.70.45.17:49217/"
target="_blank">10.70.45.17:49217</a>
has not responded in the last 42 seconds,
disconnecting."<br>
"vmstore-replicate-0:
e16d1e40-2b6e-4f19-977d-e099f465dfc6:
Failing WRITE as quorum is not met"<br>
More gluster mount logs at <a
moz-do-not-send="true"
href="http://pastebin.com/UmiUQq0F" target="_blank">http://pastebin.com/UmiUQq0F</a><br>
6. After some time, gluster quorum is active
and I am able to write to the gluster file
system.<br>
7. When I try to
resume the VM, it
doesn't work and I
got the following
error in the vdsm log:<br>
<a
moz-do-not-send="true"
href="http://pastebin.com/aXiamY15" target="_blank">http://pastebin.com/aXiamY15</a>
<br>
<br>
Regards,<br>
Ramesh<br>
<br>
</div>
<br>
_______________________________________________<br>
Users mailing list<br>
<a
moz-do-not-send="true"
href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br>
<a
moz-do-not-send="true"
href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer"
target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</span></div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</body>
</html>
--------------030503030503080006060300--