<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 03/31/2016 06:41 PM, <a class="moz-txt-link-abbreviated" href="mailto:paf1@email.cz">paf1@email.cz</a>
wrote:<br>
</div>
<blockquote cite="mid:56FD221F.30707@email.cz" type="cite">
Hi,<br>
the rest of the logs:<br>
<a moz-do-not-send="true"
href="http://www.uschovna.cz/en/zasilka/HYGXR57CNHM3TP39-L3W"
style="text-decoration:none;color:#ff9c00;">www.uschovna.cz/en/zasilka/HYGXR57CNHM3TP39-L3W</a><br>
<br>
The test is the last big event in the logs.<br>
Test time: about 14:00-14:30 CET<br>
</blockquote>
<br>
Thank you, Pavel, for the interesting test report and for sharing the
logs.<br>
<br>
You are right - the master domain should not go down if 2 of the 3
bricks of volume A (1HP12-R3A1P1) are available.<br>
<br>
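For completeness, a quick way to rule out client-quorum enforcement on
that volume (the volume name below is taken from your report; "volume
get" requires a reasonably recent gluster release):<br>
<pre>
# show quorum-related options in effect on volume A (run on any gluster node)
gluster volume info 1HP12-R3A1P1
gluster volume get 1HP12-R3A1P1 all | grep quorum
</pre>
<br>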
I notice that host kvmarbiter was not responsive at 2016-03-31
13:27:19, but the ConnectStorageServerVDSCommand executed on the
kvmarbiter node returned success at 2016-03-31 13:27:26.<br>
<br>
Could you also share the vdsm logs from the 1hp1, 1hp2 and kvmarbiter
nodes for this time window?<br>
<br>
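As a minimal sketch (assuming the default vdsm log location; adjust the
path if your logs are relocated), something like this on each of the
three nodes would capture what we need:<br>
<pre>
# run on 1hp1, 1hp2 and kvmarbiter; /var/log/vdsm is the default path
tar czf vdsm-logs-$(hostname -s).tar.gz /var/log/vdsm/vdsm.log*
</pre>
<br>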
Ravi, Krutika - could you take a look at the gluster logs? <br>
<br>
<blockquote cite="mid:56FD221F.30707@email.cz" type="cite"> <br>
regs.<br>
Pavel<br>
<br>
<div class="moz-cite-prefix">On 31.3.2016 14:30, Yaniv Kaul wrote:<br>
</div>
<blockquote
cite="mid:CAJgorsaOUQ_42GUSPh-H1vGUgJ114JYcUHR8vHwvmcWR+w8Jmw@mail.gmail.com"
type="cite">
<div dir="ltr">Hi Pavel,
<div><br>
</div>
<div>Thanks for the report. Can you begin with a more accurate
description of your environment?</div>
<div>Begin with host, oVirt and Gluster versions. Then
continue with the exact setup (what are 'A', 'B', 'C' -
domains? Volumes? What is the mapping between domains and
volumes?).</div>
<div><br>
</div>
<div>Are there any logs you can share with us?</div>
<div><br>
</div>
<div>I'm sure with more information, we'd be happy to look at
the issue.</div>
<div>Y.</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Mar 31, 2016 at 3:09 PM, <a
moz-do-not-send="true" href="mailto:paf1@email.cz"><a class="moz-txt-link-abbreviated" href="mailto:paf1@email.cz">paf1@email.cz</a></a>
<span dir="ltr"><<a moz-do-not-send="true"
href="mailto:paf1@email.cz" target="_blank">paf1@email.cz</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000066" bgcolor="#FFFFFF"> Hello, <br>
we tried the following test - with unwanted results<br>
<br>
input:<br>
5 node gluster<br>
A = replica 3 with arbiter 1 ( node1+node2+arbiter on
node 5 )<br>
B = replica 3 with arbiter 1 ( node3+node4+arbiter on
node 5 )<br>
C = distributed replica 3 arbiter 1 ( node1+node2,
node3+node4, each arbiter on node 5)<br>
node 5 has only arbiter replica ( 4x )<br>
<br>
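For clarity, the layout corresponds roughly to create commands like
these (volume names and brick paths are illustrative, not our real
ones):<br>
<pre>
# A: replica 3 with arbiter, data on node1+node2, arbiter brick on node5
gluster volume create A replica 3 arbiter 1 \
    node1:/bricks/a node2:/bricks/a node5:/bricks/a-arb

# B: same layout, data on node3+node4
gluster volume create B replica 3 arbiter 1 \
    node3:/bricks/b node4:/bricks/b node5:/bricks/b-arb

# C: distributed replica (2 x (2+1)), both arbiter bricks on node5
gluster volume create C replica 3 arbiter 1 \
    node1:/bricks/c node2:/bricks/c node5:/bricks/c-arb1 \
    node3:/bricks/c node4:/bricks/c node5:/bricks/c-arb2
</pre>
<br>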
TEST:<br>
1) directly reboot one node - OK (it does not matter which one:
data node or arbiter node)<br>
2) directly reboot two nodes - OK (if the nodes are not from
the same replica set)<br>
3) directly reboot three nodes - yes, this is the main problem
and the source of our questions ...<br>
    - rebooted all three nodes of replica "B" (not very
likely, but who knows ...)<br>
    - all VMs with data on this replica were paused (no
data access) - OK<br>
    - all VMs running on the replica "B" nodes were lost
(started manually later; their data is on other replicas) -
acceptable<br>
BUT<br>
    - !!! all oVirt domains went down !!! - the master domain
is on replica "A", which lost only one member of three !!!<br>
    so we did not expect that all domains would go down,
especially the master with 2 live members.<br>
<br>
Results: <br>
    - the whole cluster was unreachable until all domains came
back up, which depended on ALL nodes being up !!!<br>
    - all paused VMs resumed - OK<br>
    - the rest of the VMs were rebooted and are running - OK<br>
<br>
Questions:<br>
    1) why did all domains go down when the master domain (on
replica "A") still had two running members (2 of 3)?<br>
    2) how can we recover from such a collapse without waiting
for all nodes to come up? (in the worst case a node may have a
HW error, for example)<br>
    3) which oVirt cluster policy can prevent this situation,
if any? (the quorum options we believe are involved are
sketched below)<br>
<br>
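(For context, the gluster volume options we believe are involved
in this behavior - VOLNAME is a placeholder and the values are
only illustrative:)<br>
<pre>
# client-side quorum: allow writes only while a majority of each
# replica set is reachable (the default for replica-3/arbiter volumes)
gluster volume set VOLNAME cluster.quorum-type auto

# server-side quorum: bricks are stopped when the trusted pool loses
# quorum; the ratio is a pool-wide setting (applies to all volumes)
gluster volume set VOLNAME cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51
</pre>
<br>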
regs.<br>
Pavel<br>
<br>
<br>
</div>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
<br>
<br>
<pre wrap="">_______________________________________________
Users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Users@ovirt.org">Users@ovirt.org</a>
<a class="moz-txt-link-freetext" href="http://lists.ovirt.org/mailman/listinfo/users">http://lists.ovirt.org/mailman/listinfo/users</a>
</pre>
</blockquote>
<br>
</body>
</html>