<div dir="ltr">I see an ERROR on stopMonitoringDomain there, but I cannot see the corresponding startMonitoringDomain call; could you please look for it?</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Feb 3, 2017 at 1:16 PM, Ralf Schenk <span dir="ltr">&lt;<a href="mailto:rs@databay.de" target="_blank">rs@databay.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <p>Hello,</p>
    <p>attached is the vdsm.log from the host whose hosted-engine HA
      stopped working, covering the time frame of the agent timeout.
      The host itself works in oVirt and is active; only engine HA
      has not worked since the update.</p>
    <p>At 2017-02-02 19:25:34,248 you&#39;ll find an error corresponding to
      the agent timeout error.</p>
    <p>Bye<br>
    </p><div><div class="h5">
    <p><br>
    </p>
    <br>
    <div class="m_-5371711976759655950moz-cite-prefix">On 03.02.2017 at 11:28, Simone
      Tiraboschi wrote:<br>
    </div>
    <blockquote type="cite">
      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote"><span>
                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div bgcolor="#FFFFFF" text="#000000">
                    <p>3. Three of my hosts have the hosted engine
                      deployed for HA. At first all three were marked by a
                      crown (the running one was gold and the others were silver).
                      After upgrading the 3 hosts, the deployed hosted engine
                      HA is not active anymore.</p>
                    <p>I can&#39;t get this host back with a working
                      ovirt-ha-agent/broker. I already rebooted and
                      manually restarted the services, but it isn&#39;t able
                      to get the cluster state according to <br>
                      &quot;hosted-engine --vm-status&quot;. The other hosts report
                      this host&#39;s status as &quot;unknown stale-data&quot;.</p>
                    <p>I already shut down all agents on all hosts and
                      issued a &quot;hosted-engine --reinitialize-lockspace&quot;
                      but that didn&#39;t help.<br>
                    </p>
                    <p>The agent stops working after a timeout error,
                      according to the log:</p>
                    <p><tt>MainThread::INFO::2017-02-02
                        19:24:52,040::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:24:59,185::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:06,333::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:13,554::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:20,710::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:27,865::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::ERROR::2017-02-02
                        19:25:27,866::hosted_engine::8<wbr>15::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_domain_monitor)
                        Failed to start monitoring domain
                        (sd_uuid=7c8deaa8-be02-4aaf-b9<wbr>b4-ddc8da99ad96,
                        host_id=3): timeout during domain acquisition</tt><tt><br>
                      </tt><tt>MainThread::WARNING::2017-02-0<wbr>2
                        19:25:27,866::hosted_engine::4<wbr>69::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring)
                        Error while monitoring engine: Failed to start
                        monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9<wbr>b4-ddc8da99ad96,
                        host_id=3): timeout during domain acquisition</tt><tt><br>
                      </tt><tt>MainThread::WARNING::2017-02-0<wbr>2
                        19:25:27,866::hosted_engine::4<wbr>72::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring)
                        Unexpected error</tt><tt><br>
                      </tt><tt>Traceback (most recent call last):</tt><tt><br>
                      </tt><tt>  File
                        &quot;/usr/lib/python2.7/site-packa<wbr>ges/ovirt_hosted_engine_ha/age<wbr>nt/hosted_engine.py&quot;,
                        line 443, in start_monitoring</tt><tt><br>
                      </tt><tt>    self._initialize_domain_monito<wbr>r()</tt><tt><br>
                      </tt><tt>  File
                        &quot;/usr/lib/python2.7/site-packa<wbr>ges/ovirt_hosted_engine_ha/age<wbr>nt/hosted_engine.py&quot;,
                        line 816, in _initialize_domain_monitor</tt><tt><br>
                      </tt><tt>    raise Exception(msg)</tt><tt><br>
                      </tt><tt>Exception: Failed to start monitoring
                        domain (sd_uuid=7c8deaa8-be02-4aaf-b9<wbr>b4-ddc8da99ad96,
                        host_id=3): timeout during domain acquisition</tt><tt><br>
                      </tt><tt>MainThread::ERROR::2017-02-02
                        19:25:27,866::hosted_engine::4<wbr>85::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring)
                        Shutting down the agent because of 3 failures in
                        a row!</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:32,087::hosted_engine::8<wbr>41::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_domain_monitor_status)
                        VDSM domain monitor status: PENDING</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:34,250::hosted_engine::7<wbr>69::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_stop_domain_monitor)
                        Failed to stop monitoring domain
                        (sd_uuid=7c8deaa8-be02-4aaf-b9<wbr>b4-ddc8da99ad96):
                        Storage domain is member of pool:
                        u&#39;domain=7c8deaa8-be02-4aaf-b9<wbr>b4-ddc8da99ad96&#39;</tt><tt><br>
                      </tt><tt>MainThread::INFO::2017-02-02
                        19:25:34,254::agent::143::ovir<wbr>t_hosted_engine_ha.agent.agent<wbr>.Agent::(run)
                        Agent shutting down</tt></p>
                  </div>
                </blockquote>
              </span>
              <div>Simone, Martin, can you please follow up on this?</div>
            </div>
          </div>
        </div>
      </blockquote>
      <div><br>
      </div>
      <div>Ralf, could you please attach the vdsm logs from one of your
        hosts for the relevant time frame?</div>
    </blockquote>
    <br>
    </div></div><span class=""><div class="m_-5371711976759655950moz-signature">-- <br>
      <p>
      </p>
      <table border="0" cellpadding="0" cellspacing="0">
        <tbody>
          <tr>
            <td valign="top"> <font face="Verdana, Arial, sans-serif" size="-1"><br>
                <b>Ralf Schenk</b><br>
                phone <a href="tel:+49%202405%20408370" value="+492405408370" target="_blank">+49 (0) 24 05 / 40 83 70</a><br>
                fax <a href="tel:+49%202405%204083759" value="+4924054083759" target="_blank">+49 (0) 24 05 / 40 83 759</a><br>
                mail <a href="mailto:rs@databay.de" target="_blank"><font color="#FF0000"><b>rs@databay.de</b></font></a><br>
              </font> </td>
            <td width="30"> </td>
            <td valign="top"> <font face="Verdana, Arial, sans-serif" size="-1"><br>
                <b>Databay AG</b><br>
                Jens-Otto-Krag-Straße 11<br>
                D-52146 Würselen<br>
                <a href="http://www.databay.de" target="_blank"><font color="#FF0000"><b>www.databay.de</b></font></a>
              </font> </td>
          </tr>
          <tr>
            <td colspan="3" valign="top"> <font face="Verdana, Arial,
                sans-serif" size="1"><br>
                Registered office/District Court: Aachen • HRB: 8437 • VAT ID: DE
                210844202<br>
                Executive Board: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch
                Yavari, Dipl.-Kfm. Philipp Hermanns<br>
                Chairman of the Supervisory Board: Wilhelm Dohmen </font> </td>
          </tr>
        </tbody>
      </table>
      <hr color="#000000" noshade size="1" width="100%">
    </div>
  </span></div>

</blockquote></div><br></div>
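<div dir="ltr">To check whether the vdsm log contains a startMonitoringDomain call that pairs with the failing stopMonitoringDomain, something like the sketch below may help. It is only a plain substring search: the two verb names come from this thread, while the function name <tt>find_monitoring_calls</tt> is hypothetical and the log path in the usage note is an assumption about a default install, so adjust it to wherever your vdsm.log actually lives.</div>

```python
from typing import Iterable, List, Tuple

# Verb names as mentioned in this thread; nothing here parses the log format,
# it only looks for these substrings.
VERBS = ("startMonitoringDomain", "stopMonitoringDomain")


def find_monitoring_calls(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, verb, line) for every line that mentions a verb.

    Line numbers are 1-based. `find_monitoring_calls` is a hypothetical helper
    for illustration, not part of vdsm or ovirt-hosted-engine-ha.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for verb in VERBS:
            if verb in line:
                hits.append((number, verb, line.rstrip("\n")))
    return hits
```

<div dir="ltr">Usage, assuming the log is at /var/log/vdsm/vdsm.log: open the file with <tt>open(path, errors="replace")</tt> and pass the file object to <tt>find_monitoring_calls</tt>, then compare the timestamps of the start and stop hits around 19:25:34.</div>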