<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    .recovery file contents before removing it:<br>
    p298<br>
    sS'status'<br>
    p299<br>
    S'Paused'<br>
    p300<br>
    <br>
    <br>
    <br>
    After removing the .recovery file, shutting down, and restarting:<br>
    V0<br>
    sS'status'<br>
    p51<br>
    S'Up'<br>
    p52<br>
    <br>
    <br>
    So far it looks good; the GUI shows the VM as Up.<br>
    <br>
    <br>
    Another host was:<br>
    p318<br>
    sS'status'<br>
    p319<br>
    S'Paused'<br>
    p320<br>
    <br>
    After moving the .recovery file and restarting:<br>
    V0<br>
    sS'status'<br>
    p51<br>
    S'Up'<br>
    <br>
    <br>
    Thanks.<br>
    <br>
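    A minimal sketch for reading the status out of a .recovery file instead of paging through the raw dump. It assumes the file is a Python pickle (the p298 / sS'status' / S'Paused' lines above look like pickle protocol 0) that unpickles to a dict with a top-level 'status' key. The function name read_recovery_status is hypothetical, and the layout may differ between vdsm versions. Only unpickle files you trust, since unpickling can execute code.<br>
    <pre>
```python
import pickle


def read_recovery_status(path):
    """Return the 'status' value from a pickled recovery file, or None.

    Assumption: the file unpickles to a dict with a top-level 'status' key.
    """
    with open(path, "rb") as f:
        # latin1 lets Python 3 decode str objects pickled by Python 2.
        data = pickle.load(f, encoding="latin1")
    if isinstance(data, dict):
        return data.get("status")
    return None
```
</pre>
    For example, read_recovery_status("/run/vdsm/&lt;vmid&gt;.recovery") would be expected to return 'Paused' or 'Up' for files like the ones shown above, if the assumption about the layout holds.<br>
    <br>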
    <div class="moz-cite-prefix">On 04/29/2016 02:36 PM, Nir Soffer
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAMRbyyt8v2KnW74a066bBdtkhPd6TeaeoPop7t9oNqjyJi4efA@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <div dir="ltr">/run/vdsm/&lt;vmid&gt;.recovery</div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Fri, Apr 29, 2016 at 10:59 PM, Bill
          James <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:bill.james@j2.com" target="_blank">bill.james@j2.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> where do I find the
              recovery files?<br>
              <br>
              [root@ovirt1 test vdsm]# pwd<br>
              /var/lib/vdsm<br>
              [root@ovirt1 test vdsm]# ls -la<br>
              total 16<br>
              drwxr-xr-x   6 vdsm kvm    100 Mar 17 16:33 .<br>
              drwxr-xr-x. 45 root root  4096 Apr 29 12:01 ..<br>
              -rw-r--r--   1 vdsm kvm  10170 Jan 19 05:04
              bonding-defaults.json<br>
              drwxr-xr-x   2 vdsm root     6 Apr 19 11:34 netconfback<br>
              drwxr-xr-x   3 vdsm kvm     54 Apr 19 11:35 persistence<br>
              drwxr-x---.  2 vdsm kvm      6 Mar 17 16:33 transient<br>
              drwxr-xr-x   2 vdsm kvm     40 Mar 17 16:33 upgrade<br>
              <br>
              <div>
                <div class="h5"> <br>
                  <br>
                  <div>On 4/29/16 10:02 AM, Michal Skrivanek wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div><br>
                    </div>
                    <div><br>
                      On 29 Apr 2016, at 18:26, Bill James &lt;<a
                        moz-do-not-send="true"
                        href="mailto:bill.james@j2.com" target="_blank">bill.james@j2.com</a>&gt;
                      wrote:<br>
                      <br>
                    </div>
                    <blockquote type="cite">
                      <div> Yes, they are still showing the "paused" state.<br>
                        No, bouncing libvirt didn't help.<br>
                      </div>
                    </blockquote>
                    <div><br>
                    </div>
                    Then my suspicion of vm recovery gets closer to a
                    certainty:)
                    <div>Can you get the .recovery file of one of the paused
                      VMs from /var/lib/vdsm and check whether it says Paused
                      there? It's worth a shot to remove that file and
                      restart vdsm, then check the logs and that VM's
                      status. It should recover "good enough" from
                      libvirt only. </div>
                    <div>Try it with one first<br>
                      <div>
                        <div><br>
                          <blockquote type="cite">
                            <div> I noticed the errors about the ISO
                              domain. Didn't think that was related.<br>
                              I have been migrating a lot of VMs to
                              ovirt lately, and recently added another
                              node.<br>
                              Also had some problems with /etc/exports
                              for a while, but I think those issues are
                              all resolved.<br>
                              <br>
                              <br>
                              Last "unresponsive" message in vdsm.log
                              was:<br>
                              <br>
vdsm.log.49.xz:jsonrpc.Executor/0::WARNING::<b>2016-04-21</b>
                              11:00:54,703::vm::5067::virt.vm::(_setUnresponsiveIfTimeout)
                              vmId=`b6a13808-9552-401b-840b-4f7022e8293d`::monitor

                              become unresponsive (command timeout,
                              age=310323.97)<br>
                              vdsm.log.49.xz:jsonrpc.Executor/0::WARNING::2016-04-21

                              11:00:54,703::vm::5067::virt.vm::(_setUnresponsiveIfTimeout)

                              vmId=`5bfb140a-a971-4c9c-82c6-277929eb45d4`::monitor

                              become unresponsive (command timeout,
                              age=310323.97)<br>
                              <br>
                              <br>
                              <br>
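                              The two warnings above share one format, so a small parser can pull out the VM ID and age when scanning many rotated logs. This is only a sketch, assuming the line layout matches the lines quoted here; parse_unresponsive is a hypothetical helper, not part of vdsm, and other vdsm versions may log differently.<br>
                              <pre>
```python
import re

# Pattern derived from the two WARNING lines quoted above (an assumption
# about the format, not a documented vdsm log specification).
UNRESPONSIVE = re.compile(
    r"WARNING::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::.*"
    r"vmId=`([0-9a-f-]+)`::monitor become unresponsive "
    r"\(command timeout, age=([0-9.]+)\)"
)


def parse_unresponsive(line):
    """Return (timestamp, vm_id, age) or None if the line does not match."""
    m = UNRESPONSIVE.search(line)
    if m is None:
        return None
    return m.group(1), m.group(2), float(m.group(3))
```
</pre>
                              <br>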
                              Thanks.<br>
                              <br>
                              <br>
                              <br>
                              <div>On 4/29/16 1:40 AM, Michal Skrivanek
                                wrote:<br>
                              </div>
                              <blockquote type="cite"> <br>
                                <div>
                                  <blockquote type="cite">
                                    <div>On 28 Apr 2016, at 19:40, Bill
                                      James &lt;<a
                                        moz-do-not-send="true"
                                        href="mailto:bill.james@j2.com"
                                        target="_blank">bill.james@j2.com</a>&gt;
                                      wrote:</div>
                                    <br>
                                    <div>
                                      <div bgcolor="#FFFFFF"
                                        text="#000000"> Thank you for the
                                        response.<br>
                                        I bolded the ones that are
                                        listed as "paused".<br>
                                        <br>
                                        <br>
                                        [root@ovirt1 test vdsm]# virsh
                                        -r list --all<br>
                                         Id   
                                        Name                          
                                        State<br>
----------------------------------------------------<br>
                                      </div>
                                    </div>
                                  </blockquote>
                                </div>
                                <div><br>
                                </div>
                                <div>
                                  <blockquote type="cite">
                                    <div>
                                      <div bgcolor="#FFFFFF"
                                        text="#000000"> <br>
                                        <br>
                                        Looks like problem started
                                        around 2016-04-17 20:19:34,822,
                                        based on engine.log attached.<br>
                                      </div>
                                    </div>
                                  </blockquote>
                                  <div><br>
                                  </div>
                                  <div>yes, that time looks correct. Any
                                    idea what might have been a trigger?
                                    Anything interesting happened at
                                    that time (power outage of some
                                    host, some maintenance action,
                                    anything)? </div>
                                  <div>The logs indicate a problem when vdsm
                                    talks to libvirt (all those "monitor
                                    become unresponsive" warnings).</div>
                                  <div><br>
                                  </div>
                                  <div>It does seem that at that time
                                    you started to have some storage
                                    connectivity issues, the first one
                                    at 2016-04-17 20:06:53,929. And it
                                    doesn’t look temporary, because
                                    such errors are still there a couple of
                                    hours later (in the most recent file
                                    you attached I can see one at 23:00:54).</div>
                                  <div>When I/O gets blocked, the VMs may
                                    experience issues (then the VM gets
                                    Paused), or their qemu process gets
                                    stuck (resulting in libvirt either
                                    reporting an error or getting stuck as
                                    well, which vdsm then
                                    sees as “monitor unresponsive”).</div>
                                  <div><br>
                                  </div>
                                  <div>Since you now bounced libvirtd -
                                    did it help? Do you still see wrong
                                    status for those VMs and still those
                                    "monitor unresponsive" errors in
                                    vdsm.log?</div>
                                  <div>If not… then I would suspect the
                                    “vm recovery” code is not working
                                    correctly. Milan is looking at that.</div>
                                  <div><br>
                                  </div>
                                  <div>Thanks,</div>
                                  <div>michal</div>
                                  <div>
                                    <div><br>
                                    </div>
                                  </div>
                                  <div><br>
                                  </div>
                                  <blockquote type="cite">
                                    <div>
                                      <div bgcolor="#FFFFFF"
                                        text="#000000"> There's a lot of
                                        vdsm logs!<br>
                                        <br>
                                        FYI, the storage domain for
                                        these VMs is a "local" NFS
                                        share,
                                        7e566f55-e060-47b7-bfa4-ac3c48d70dda.<br>
                                        <br>
                                        attached more logs.<br>
                                        <br>
                                        <br>
                                        <div>On 04/28/2016 12:53 AM,
                                          Michal Skrivanek wrote:<br>
                                        </div>
                                        <blockquote type="cite">
                                          <blockquote type="cite">
                                            <pre>On 27 Apr 2016, at 19:16, Bill James <a moz-do-not-send="true" href="mailto:bill.james@j2.com" target="_blank">&lt;bill.james@j2.com&gt;</a> wrote:

virsh # list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

</pre>
                                          </blockquote>
                                          <pre>You need to run virsh in read-only mode:
virsh -r list --all

</pre>
                                          <blockquote type="cite">
                                            <pre>[root@ovirt1 test vdsm]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
  Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
 Drop-In: /etc/systemd/system/libvirtd.service.d
          └─unlimited-core.conf
  Active: active (running) since Thu 2016-04-21 16:00:03 PDT; 5 days ago


tried systemctl restart libvirtd.
No change.

Attached vdsm.log and supervdsm.log.


[root@ovirt1 test vdsm]# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
  Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
  Active: active (running) since Wed 2016-04-27 10:09:14 PDT; 3min 46s ago


vdsm-4.17.18-0.el7.centos.noarch
</pre>
                                          </blockquote>
                                          <pre>The attached vdsm.log is good, but it covers too short an interval; it only shows the recovery (vdsm restart) phase, when the VMs are identified as paused. Can you add earlier logs? Did you restart vdsm yourself or did it crash?


</pre>
                                          <blockquote type="cite">
                                            <pre>libvirt-daemon-1.2.17-13.el7_2.4.x86_64


Thanks.


On 04/26/2016 11:35 PM, Michal Skrivanek wrote:
</pre>
                                            <blockquote type="cite">
                                              <blockquote type="cite">
                                                <pre>On 27 Apr 2016, at 02:04, Nir Soffer <a moz-do-not-send="true" href="mailto:nsoffer@redhat.com" target="_blank">&lt;nsoffer@redhat.com&gt;</a> wrote:

On Wed, Apr 27, 2016 at 2:03 AM, Bill James <a moz-do-not-send="true" href="mailto:bill.james@j2.com" target="_blank">&lt;bill.james@j2.com&gt;</a> wrote:
</pre>
                                                <blockquote type="cite">
                                                  <pre>I have a hardware node that has 26 VMs.
9 are listed as "running", 17 are listed as "paused".

In truth all VMs are up and running fine.

I tried telling the db they are up:

engine=&gt; update vm_dynamic set status = 1 where vm_guid =(select
vm_guid from vm_static where vm_name = '<a moz-do-not-send="true" href="http://api1.test.j2noc.com" target="_blank">api1.test.j2noc.com</a>');

GUI then shows it up for a short while,

then puts it back in paused state.

2016-04-26 15:16:46,095 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-16) [157cc21e] VM '242ca0af-4ab2-4dd6-b515-5d435e6452c4'(<a moz-do-not-send="true" href="http://api1.test.j2noc.com" target="_blank">api1.test.j2noc.com</a>) moved from 'Up' --&gt; 'Paused'
2016-04-26 15:16:46,221 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-16) [157cc21e] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM api1.<a moz-do-not-send="true" href="http://test.j2noc.com" target="_blank">test.j2noc.com</a> has been paused.


Why does the engine think the VMs are paused?
Attached engine.log.

I can fix the problem by powering off the VM then starting it back up.
But the VM is working fine! How do I get ovirt to realize that?
</pre>
                                                </blockquote>
                                                <pre>If this is an issue in engine, restarting engine may fix this.
But since only one node has this problem, I don't think this is the issue.

If this is an issue in vdsm, restarting vdsm may fix this.

If this does not help, maybe this is libvirt issue? did you try to check vm
status using virsh?
</pre>
                                              </blockquote>
                                              <pre>this looks more likely as it seems such status is being reported
logs would help, vdsm.log at the very least.

</pre>
                                              <blockquote type="cite">
                                                <pre>If virsh thinks that the vms are paused, you can try to restart libvirtd.

Please file a bug about this in any case with engine and vdsm logs.

Adding Michal in case he has better idea how to proceed.

Nir
</pre>
                                              </blockquote>
                                            </blockquote>
                                            <pre><a moz-do-not-send="true" href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a>
<a moz-do-not-send="true" href="http://lists.ovirt.org/mailman/listinfo/users" target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a>
</pre>
                                          </blockquote>
                                        </blockquote>
                                        <br>
                                      </div>
                                      <span>&lt;engine.log-20160421.gz&gt;</span><span>&lt;vdsm.logs.tar.gz&gt;</span></div>
                                  </blockquote>
                                </div>
                                <br>
                              </blockquote>
                              <br>
                              <p><a moz-do-not-send="true"
href="http://www.j2.com/?utm_source=j2global&amp;utm_medium=xsell-referral&amp;utm_campaign=employeeemail"
                                  target="_blank"><span
                                    style="color:windowtext;text-decoration:none"><img
                                      moz-do-not-send="true"
src="http://home.j2.com/j2_Global_Cloud_Services/j2_Global_Email_Footer.jpg"
                                      alt="www.j2.com" border="0"
                                      height="46" width="391"></span></a></p>
                              <p><span
style="font-size:8.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:gray">This

                                  email, its contents and attachments
                                  contain information from <a
                                    moz-do-not-send="true"
href="http://www.j2.com/?utm_source=j2global&amp;utm_medium=xsell-referral&amp;utm_campaign=employemail"
                                    target="_blank">j2 Global, Inc</a>.
                                  and/or its affiliates which may be
                                  privileged, confidential or otherwise
                                  protected from disclosure. The
                                  information is intended to be for the
                                  addressee(s) only. If you are not an
                                  addressee, any disclosure, copy,
                                  distribution, or use of the contents
                                  of this message is prohibited. If you
                                  have received this email in error
                                  please notify the sender by reply
                                  e-mail and delete the original message
                                  and any copies. © 2015 <a
                                    moz-do-not-send="true"
                                    href="http://www.j2.com/"
                                    target="_blank">j2 Global, Inc</a>.
                                  All rights reserved. <a
                                    moz-do-not-send="true"
                                    href="http://www.efax.com/"
                                    target="_blank">eFax ®</a>, <a
                                    moz-do-not-send="true"
                                    href="http://www.evoice.com/"
                                    target="_blank">eVoice ®</a>, <a
                                    moz-do-not-send="true"
                                    href="http://www.campaigner.com/"
                                    target="_blank">Campaigner ®</a>, <a
                                    moz-do-not-send="true"
                                    href="http://www.fusemail.com/"
                                    target="_blank">FuseMail ®</a>, <a
                                    moz-do-not-send="true"
                                    href="http://www.keepitsafe.com/"
                                    target="_blank">KeepItSafe ®</a> and
                                  <a moz-do-not-send="true"
                                    href="http://www.onebox.com/"
                                    target="_blank">Onebox ®</a> are
                                  registered trademarks of <a
                                    moz-do-not-send="true"
                                    href="http://www.j2.com/"
                                    target="_blank">j2 Global, Inc</a>.
                                  and its affiliates.</span></p>
                            </div>
                          </blockquote>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
            <br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  
  </body>
</html>