Re: [ovirt-users] ovirt with glusterfs - big test - unwanted results

11 Apr 2016

I looked at the vdsm logs, but since multiple tests were run it was 
difficult to isolate which error to track down. You mentioned the test 
ran between 14:00-14:30 CET, but the attached gluster logs end at 
11:29 UTC.
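
To make the gap concrete, here is a minimal sketch of the comparison. It 
assumes the reported test time is either literal CET (UTC+1) or local 
summer time (UTC+2); I do not know which one the hosts used, so both are 
checked. The helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets; which one the hosts actually used is an assumption.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

# Last timestamp covered by the attached gluster logs (from this thread).
GLUSTER_LOG_END = datetime(2016, 3, 31, 11, 29, tzinfo=timezone.utc)


def to_utc(local_str, tz):
    """Interpret a 'YYYY-MM-DD HH:MM' wall-clock string in tz, return UTC."""
    naive = datetime.strptime(local_str, "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)


def window_covers_test_start(tz):
    """True if the reported test start falls inside the gluster log window."""
    return to_utc("2016-03-31 14:00", tz) <= GLUSTER_LOG_END


for tz in (CET, CEST):
    start = to_utc("2016-03-31 14:00", tz)
    print(tz.tzname(None), start.strftime("%H:%M UTC"),
          window_covers_test_start(tz))
```

Under either offset the test starts after 11:29 UTC, so the attached 
gluster logs cannot cover it.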

I tracked down the errors from when the master domain (gluster volume 
1HP12-R3A1P1) went inactive, limited to the period for which the 
corresponding gluster volume log was available. They all seem to 
correspond to an issue where gluster volume quorum was not met.

Can you confirm whether this was the test you performed, or provide logs 
from the correct time period? Both vdsm and gluster mount logs are 
required, from the hypervisors where the master domain is mounted.

For master domain:
On 1hp1:
vdsm.log
Thread-35::ERROR::2016-03-31 13:21:27,225::monitor::276::Storage.Monitor::(_monitorDomain) Error monitoring domain 14995860-1127-4dc4-b8c8-b540b89f9313
Traceback (most recent call last):
...
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 454, in statvfs
    resdict = self._sendCommand("statvfs", {"path": path}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 427, in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 107] Transport endpoint is not connected
Thread-35::INFO::2016-03-31 13:21:27,267::monitor::299::Storage.Monitor::(_notifyStatusChanges) Domain 14995860-1127-4dc4-b8c8-b540b89f9313 became INVALID

-- And I see a corresponding:
[2016-03-31 11:21:16.027090] W [MSGID: 108001] [afr-common.c:4093:afr_notify] 0-1HP12-R3A1P1-replicate-0: Client-quorum is not met
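
The two entries line up once both are expressed in UTC. A minimal sketch 
of that correlation follows. It assumes vdsm.log uses local time at UTC+2 
(inferred only from the two-hour gap between the logs) and that the 
gluster log is in UTC; the regexes and helper names are illustrative, not 
part of vdsm or gluster.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: vdsm.log timestamps are local time at UTC+2.
VDSM_OFFSET = timezone(timedelta(hours=2))

VDSM_TS = re.compile(r"::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::")
GLUSTER_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]")


def vdsm_time_utc(line):
    """Extract the timestamp from a vdsm log line and convert it to UTC."""
    stamp = VDSM_TS.search(line).group(1)
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
    return local.replace(tzinfo=VDSM_OFFSET).astimezone(timezone.utc)


def gluster_time_utc(line):
    """Extract the timestamp from a gluster log line (taken to be UTC)."""
    stamp = GLUSTER_TS.search(line).group(1)
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


vdsm_line = ("Thread-35::ERROR::2016-03-31 13:21:27,225::monitor::276::"
             "Storage.Monitor::(_monitorDomain) Error monitoring domain "
             "14995860-1127-4dc4-b8c8-b540b89f9313")
gluster_line = ("[2016-03-31 11:21:16.027090] W [MSGID: 108001] "
                "[afr-common.c:4093:afr_notify] 0-1HP12-R3A1P1-replicate-0: "
                "Client-quorum is not met")

# Seconds between the quorum warning and the vdsm monitoring error.
delta = vdsm_time_utc(vdsm_line) - gluster_time_utc(gluster_line)
print(delta.total_seconds())
```

With that offset, the vdsm error follows the quorum warning by about 11 
seconds.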

jsonrpc.Executor/0::DEBUG::2016-03-31 
13:23:34,110::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) 
Return 'GlusterVolume.status' in bridge with {'volumeStatus': {'bricks': 
[{'status': 'OFFLINE', 'hostuuid': 
'f6568a3b-3d65-4f4f-be9f-14a5935e37a4', 'pid': '-1', 'rdma_port': 'N/A', 
'brick': '1hp1:/STORAGES/P1/GFS', 'port': 'N/A'}, {'status': 'OFFLINE', 
'hostuuid': '8e87cf18-8958-41b7-8d24-7ee420a1ef9f', 'pid': '-1', 
'rdma_port': 'N/A', 'brick': '1hp2:/STORAGES/P1/GFS', 'port': 'N/A'}], 
'nfs': [{'status': 'OFFLINE', 'hostuuid': 
'f6568a3b-3d65-4f4f-be9f-14a5935e37a4', 'hostname': '172.16.5.151/24', 
'pid': '-1', 'rdma_port': 'N/A', 'port': 'N/A'}, {'status': 'OFFLINE', 
'hostuuid': '8e87cf18-8958-41b7-8d24-7ee420a1ef9f', 'hostname': '1hp2', 
'pid': '-1', 'rdma_port': 'N/A', 'port': 'N/A'}], 'shd': [{'status': 
'ONLINE', 'hostname': '172.16.5.151/24', 'pid': '2148', 'hostuuid': 
'f6568a3b-3d65-4f4f-be9f-14a5935e37a4'}, {'status': 'ONLINE', 
'hostname': '1hp2', 'pid': '2146', 'hostuuid': 
'8e87cf18-8958-41b7-8d24-7ee420a1ef9f'}], 'name': '1HP12-R3A1P1'}}

-- 2 bricks were offline. I think the arbiter brick is not reported in 
the xml output - this is a bug.
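
A minimal sketch of the check behind this observation, using a trimmed 
copy of the 'bricks' list from the GlusterVolume.status reply above. The 
expected brick count of 3 follows from the volume being replica 3 with 
arbiter 1; the helper name and the trimmed dict are mine, for 
illustration only.

```python
# Trimmed copy of the 'volumeStatus' payload logged by vdsm above.
volume_status = {
    "name": "1HP12-R3A1P1",
    "bricks": [
        {"brick": "1hp1:/STORAGES/P1/GFS", "status": "OFFLINE"},
        {"brick": "1hp2:/STORAGES/P1/GFS", "status": "OFFLINE"},
    ],
}

# Replica 3 with arbiter 1: two data bricks plus one arbiter brick.
EXPECTED_BRICKS = 3


def summarize_bricks(status, expected):
    """Return (bricks not ONLINE, number of expected bricks not reported)."""
    bricks = status["bricks"]
    offline = [b["brick"] for b in bricks if b["status"] != "ONLINE"]
    unreported = expected - len(bricks)
    return offline, unreported


offline, unreported = summarize_bricks(volume_status, EXPECTED_BRICKS)
print("offline:", offline)
print("bricks missing from the report:", unreported)
```

Two bricks are reported and both are offline, while one expected brick 
(the arbiter) is missing from the reply entirely.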

Similarly on 1hp2:
Thread-35::ERROR::2016-03-31 13:21:14,284::monitor::276::Storage.Monitor::(_monitorDomain) Error monitoring domain 14995860-1127-4dc4-b8c8-b540b89f9313
Traceback (most recent call last):
   ...
     raise OSError(errcode, errstr)
OSError: [Errno 2] No such file or directory
Thread-35::INFO::2016-03-31 13:21:14,285::monitor::299::Storage.Monitor::(_notifyStatusChanges) Domain 14995860-1127-4dc4-b8c8-b540b89f9313 became INVALID

Corresponding gluster mount log -
[2016-03-31 11:21:16.027640] W [MSGID: 108001] [afr-common.c:4093:afr_notify] 0-1HP12-R3A1P1-replicate-0: Client-quorum is not met

On 04/05/2016 07:02 PM, paf1@email.cz wrote:
...
Hello Sahina,
Please see the attached logs that you requested.
Regards,
Pavel
On 5.4.2016 14:07, Sahina Bose wrote:
...
On 03/31/2016 06:41 PM, paf1@email.cz wrote:
...
Hi,
rest of logs:
http://www.uschovna.cz/en/zasilka/HYGXR57CNHM3TP39-L3W
The test is the last big event in the logs.
TEST TIME: about 14:00-14:30 CET
Thank you Pavel for the interesting test report and sharing the logs.
You are right - the master domain should not go down if 2 of 3 bricks 
are available from volume A (1HP12-R3A1P1).
I notice that host kvmarbiter was not responsive at 2016-03-31 
13:27:19, but the ConnectStorageServerVDSCommand executed on the 
kvmarbiter node returned success at 2016-03-31 13:27:26.
Could you also share the vdsm logs from the 1hp1, 1hp2 and kvmarbiter 
nodes for this period?
Ravi, Krutika - could you take a look at the gluster logs?
...
Regards, Pavel
On 31.3.2016 14:30, Yaniv Kaul wrote:
...
Hi Pavel,
Thanks for the report. Can you begin with a more accurate 
description of your environment?
Begin with host, oVirt and Gluster versions. Then continue with the 
exact setup (what are 'A', 'B', 'C' - domains? Volumes? What is the 
mapping between domains and volumes?).
Are there any logs you can share with us?
I'm sure with more information, we'd be happy to look at the issue.
Y.
On Thu, Mar 31, 2016 at 3:09 PM, paf1@email.cz wrote:
Hello,
    we ran the following test, with unwanted results.
input:
    5-node gluster
    A = replica 3 with arbiter 1 ( node1 + node2 + arbiter on node 5 )
    B = replica 3 with arbiter 1 ( node3 + node4 + arbiter on node 5 )
    C = distributed replica 3 arbiter 1 ( node1 + node2,
    node3 + node4, each arbiter on node 5 )
    node 5 holds only arbiter bricks ( 4x )
TEST:
    1)  directly reboot one node - OK ( it does not matter which one,
    data node or arbiter node )
    2)  directly reboot two nodes - OK ( if the nodes are not from the
    same replica )
    3)  directly reboot three nodes - this is the main problem, and it
    raises questions:
        - rebooted all three nodes of replica "B" ( not very likely,
    but who knows ... )
        - all VMs with data on this replica were paused ( no data
    access ) - OK
        - all VMs running on replica "B" nodes were lost ( started
    manually later; their data is on other replicas ) - acceptable
    BUT
        - all oVirt domains went down, although the master domain is on
    replica "A", which lost only one member of three
        so we did not expect all domains to go down,
    especially the master with 2 live members.
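
The expectation in this test can be sketched as a majority check per 
replica set. This is a simplified model, not gluster's actual 
client-quorum implementation: it assumes a replica set stays usable while 
more than half of its bricks are up, and it ignores the extra rules that 
apply to arbiter bricks. Node names follow the "input" section above; the 
function names are made up.

```python
# Brick placement per volume, one inner list per replica set,
# as described in the "input" section of this message.
VOLUMES = {
    "A": [["node1", "node2", "node5"]],
    "B": [["node3", "node4", "node5"]],
    "C": [["node1", "node2", "node5"], ["node3", "node4", "node5"]],
}


def quorum_met(replica_set, down):
    """Simplified rule (assumption): more than half of the bricks are up."""
    up = sum(1 for node in replica_set if node not in down)
    return up > len(replica_set) / 2


def volume_state(down):
    """Map each volume to the quorum state of each of its replica sets."""
    return {name: [quorum_met(rs, down) for rs in sets]
            for name, sets in VOLUMES.items()}


# Step 3 of the test: all three nodes of replica "B" are rebooted.
rebooted = {"node3", "node4", "node5"}
for name, states in volume_state(rebooted).items():
    print(name, states)
```

Under this model volume A keeps quorum with node1 and node2 up, B loses 
it, and C loses it on one of its two replica sets, which matches the 
expectation that the master domain on A should have stayed up.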
Results:
        - the whole cluster was unreachable until all domains were up,
    which depends on all nodes being up
        - all paused VMs resumed - OK
        - the rest of the VMs rebooted and are running - OK
Questions:
        1) why did all domains go down if the master domain ( on
    replica "A" ) has two running members ( 2 of 3 )?
        2) how can we recover from that collapse without waiting for
    all nodes to come up ( in the worst case a node has a HW error,
    for example )?
        3) which oVirt cluster policy can prevent that situation
    ( if any )?
regs.
    Pavel
_______________________________________________
    Users mailing list
    Users@ovirt.org <mailto:Users@ovirt.org>
    http://lists.ovirt.org/mailman/listinfo/users