--------------090600060303000004030204
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
Where do I find the recovery files?
[root@ovirt1 test vdsm]# pwd
/var/lib/vdsm
[root@ovirt1 test vdsm]# ls -la
total 16
drwxr-xr-x 6 vdsm kvm 100 Mar 17 16:33 .
drwxr-xr-x. 45 root root 4096 Apr 29 12:01 ..
-rw-r--r-- 1 vdsm kvm 10170 Jan 19 05:04 bonding-defaults.json
drwxr-xr-x 2 vdsm root 6 Apr 19 11:34 netconfback
drwxr-xr-x 3 vdsm kvm 54 Apr 19 11:35 persistence
drwxr-x---. 2 vdsm kvm 6 Mar 17 16:33 transient
drwxr-xr-x 2 vdsm kvm 40 Mar 17 16:33 upgrade
[root@ovirt1 test vdsm]# locate recovery
/opt/hp/hpdiags/en/tcstorage.ldinterimrecovery.htm
/opt/hp/hpdiags/en/tcstorage.ldrecoveryready.htm
/usr/share/doc/postgresql-9.2.15/html/archive-recovery-settings.html
/usr/share/doc/postgresql-9.2.15/html/recovery-config.html
/usr/share/doc/postgresql-9.2.15/html/recovery-target-settings.html
/usr/share/pgsql/recovery.conf.sample
/var/lib/nfs/v4recovery
[root@ovirt1 test vdsm]# locate 757a5 (disk id)
/ovirt-store/nfs1/7e566f55-e060-47b7-bfa4-ac3c48d70dda/images/757a5e69-a791-4391-9d7d-9516bf7f2118
/ovirt-store/nfs1/7e566f55-e060-47b7-bfa4-ac3c48d70dda/images/757a5e69-a791-4391-9d7d-9516bf7f2118/211581dc-fa98-41be-a0b9-ace236149bc2
/ovirt-store/nfs1/7e566f55-e060-47b7-bfa4-ac3c48d70dda/images/757a5e69-a791-4391-9d7d-9516bf7f2118/211581dc-fa98-41be-a0b9-ace236149bc2.lease
/ovirt-store/nfs1/7e566f55-e060-47b7-bfa4-ac3c48d70dda/images/757a5e69-a791-4391-9d7d-9516bf7f2118/211581dc-fa98-41be-a0b9-ace236149bc2.meta
[root@ovirt1 test vdsm]# locate 5bfb140 (vm id)
/var/lib/libvirt/qemu/channels/5bfb140a-a971-4c9c-82c6-277929eb45d4.com.redhat.rhevm.vdsm
/var/lib/libvirt/qemu/channels/5bfb140a-a971-4c9c-82c6-277929eb45d4.org.qemu.guest_agent.0
On 4/29/16 10:02 AM, Michal Skrivanek wrote:
On 29 Apr 2016, at 18:26, Bill James <bill.james(a)j2.com> wrote:
> Yes, they are still showing the "paused" state.
> No, bouncing libvirt didn't help.
Then my suspicion about VM recovery gets closer to a certainty :)
Can you get the .recovery file of one of the paused VMs from /var/lib/vdsm
and check whether it says Paused there? It's worth a shot to remove
that file and restart vdsm, then check the logs and that VM's status. It
should recover "good enough" from libvirt alone.
Try it with one VM first.
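
The check suggested above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names are made up for illustration, the only directory named in this thread is /var/lib/vdsm (the second directory in the example is a guess to try if nothing turns up there), and the on-disk format of a .recovery file is not described here, so the "Paused" test is a crude search of the raw bytes rather than a real parse.

```python
import glob
import os


def find_recovery_files(directories, vm_id=None):
    """Return sorted paths of *.recovery files found in the given directories.

    If vm_id is given, only <vm_id>.recovery is matched.
    """
    pattern = (vm_id or "*") + ".recovery"
    found = []
    for directory in directories:
        # glob simply returns nothing for directories that do not exist
        found.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(found)


def mentions_paused(path):
    """Crude check: True if the raw file bytes contain the word 'Paused'."""
    with open(path, "rb") as handle:
        return b"Paused" in handle.read()


if __name__ == "__main__":
    # Candidate locations only (assumption): /var/lib/vdsm is the one named
    # in this thread, /var/run/vdsm is a guess.
    for path in find_recovery_files(["/var/lib/vdsm", "/var/run/vdsm"]):
        state = "mentions Paused" if mentions_paused(path) else "no 'Paused' string"
        print(path, state)
```

This only reads files. Removing one recovery file and restarting vdsm, as suggested above, is still a manual step to try on a single VM first.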
> I noticed the errors about the ISO domain. Didn't think that was related.
> I have been migrating a lot of VMs to ovirt lately, and recently
> added another node.
> Also had some problems with /etc/exports for a while, but I think
> those issues are all resolved.
>
>
> Last "unresponsive" message in vdsm.log was:
>
> vdsm.log.49.xz:jsonrpc.Executor/0::WARNING::2016-04-21
> 11:00:54,703::vm::5067::virt.vm::(_setUnresponsiveIfTimeout)
> vmId=`b6a13808-9552-401b-840b-4f7022e8293d`::monitor become
> unresponsive (command timeout, age=310323.97)
> vdsm.log.49.xz:jsonrpc.Executor/0::WARNING::2016-04-21
> 11:00:54,703::vm::5067::virt.vm::(_setUnresponsiveIfTimeout)
> vmId=`5bfb140a-a971-4c9c-82c6-277929eb45d4`::monitor become
> unresponsive (command timeout, age=310323.97)
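
For reading warnings like the two quoted above, a small parser along these lines may help. It is a sketch, not part of vdsm: the regular expression is written against the line layout shown in this thread only (with the wrapped line rejoined), and treating "age" as the number of seconds since the monitor last responded is an assumption.

```python
import re
from datetime import datetime, timedelta

# Matches the "monitor become unresponsive" warning as it appears in this thread.
UNRESPONSIVE = re.compile(
    r"::WARNING::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r".*?vmId=`(?P<vm_id>[0-9a-f-]+)`::monitor become unresponsive "
    r"\(command timeout, age=(?P<age>[\d.]+)\)"
)


def parse_unresponsive(line):
    """Return (vm_id, logged_at, logged_at - age) or None if the line does not match."""
    match = UNRESPONSIVE.search(line)
    if match is None:
        return None
    logged_at = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    # Assumption: age is in seconds and counts back from the log timestamp.
    age = timedelta(seconds=float(match.group("age")))
    return match.group("vm_id"), logged_at, logged_at - age


if __name__ == "__main__":
    sample = (
        "vdsm.log.49.xz:jsonrpc.Executor/0::WARNING::2016-04-21 "
        "11:00:54,703::vm::5067::virt.vm::(_setUnresponsiveIfTimeout) "
        "vmId=`5bfb140a-a971-4c9c-82c6-277929eb45d4`::monitor become "
        "unresponsive (command timeout, age=310323.97)"
    )
    print(parse_unresponsive(sample))
```

Under that assumption, age=310323.97 is roughly 3.6 days, so the third value for the sample line falls on the evening of 2016-04-17.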
>
>
>
> Thanks.
>
>
>
> On 4/29/16 1:40 AM, Michal Skrivanek wrote:
>>
>>> On 28 Apr 2016, at 19:40, Bill James <bill.james(a)j2.com> wrote:
>>>
>>> Thank you for the response.
>>> I bolded the ones that are listed as "paused".
>>>
>>>
>>> [root@ovirt1 test vdsm]# virsh -r list --all
>>>  Id    Name                           State
>>> ----------------------------------------------------
>>
>>>
>>>
>>> Looks like problem started around 2016-04-17 20:19:34,822, based on
>>> engine.log attached.
>>
>> Yes, that time looks correct. Any idea what might have been the
>> trigger? Did anything interesting happen at that time (power outage of
>> some host, some maintenance action, anything)?
>> The logs indicate a problem when vdsm talks to libvirt (all those
>> "monitor become unresponsive" messages).
>>
>> It does seem that at that time you started to have some storage
>> connectivity issues - the first one at 2016-04-17 20:06:53,929. And it
>> doesn't look temporary, because such errors are still there a couple of
>> hours later (in the most recent file you attached I can see one at 23:00:54).
>> When I/O gets blocked, the VMs may experience issues (then the VM gets
>> Paused), or their qemu process gets stuck (resulting in libvirt
>> either reporting an error or getting stuck as well, which is what
>> vdsm sees as "monitor unresponsive").
>>
>> Since you have now bounced libvirtd - did it help? Do you still see the wrong
>> status for those VMs and still those "monitor unresponsive" errors
>> in vdsm.log?
>> If not... then I would suspect the "vm recovery" code is not working
>> correctly. Milan is looking at that.
>>
>> Thanks,
>> michal
>>
>>
>>> There's a lot of vdsm logs!
>>>
>>> FYI, the storage domain for these VMs is a "local" NFS share,
>>> 7e566f55-e060-47b7-bfa4-ac3c48d70dda.
>>>
>>> attached more logs.
>>>
>>>
>>> On 04/28/2016 12:53 AM, Michal Skrivanek wrote:
>>>>> On 27 Apr 2016, at 19:16, Bill James <bill.james(a)j2.com> wrote:
>>>>>
>>>>> virsh # list --all
>>>>> error: failed to connect to the hypervisor
>>>>> error: no valid connection
>>>>> error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
>>>>>
>>>> you need to run virsh in read-only mode
>>>> virsh -r list --all
>>>>
>>>>> [root@ovirt1 test vdsm]# systemctl status libvirtd
>>>>> ● libvirtd.service - Virtualization daemon
>>>>>    Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
>>>>>   Drop-In: /etc/systemd/system/libvirtd.service.d
>>>>>            └─unlimited-core.conf
>>>>>    Active: active (running) since Thu 2016-04-21 16:00:03 PDT; 5 days ago
>>>>>
>>>>>
>>>>> tried systemctl restart libvirtd.
>>>>> No change.
>>>>>
>>>>> Attached vdsm.log and supervdsm.log.
>>>>>
>>>>>
>>>>> [root@ovirt1 test vdsm]# systemctl status vdsmd
>>>>> ● vdsmd.service - Virtual Desktop Server Manager
>>>>>    Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
>>>>>    Active: active (running) since Wed 2016-04-27 10:09:14 PDT; 3min 46s ago
>>>>>
>>>>>
>>>>> vdsm-4.17.18-0.el7.centos.noarch
>>>> The attached vdsm.log is good, but it covers too short an interval; it only
>>>> shows the recovery (vdsm restart) phase, when the VMs are identified as paused.
>>>> Can you add earlier logs? Did you restart vdsm yourself, or did it crash?
>>>>
>>>>
>>>>> libvirt-daemon-1.2.17-13.el7_2.4.x86_64
>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> On 04/26/2016 11:35 PM, Michal Skrivanek wrote:
>>>>>>> On 27 Apr 2016, at 02:04, Nir Soffer <nsoffer(a)redhat.com> wrote:
>>>>>>>
>>>>>>> On Wed, Apr 27, 2016 at 2:03 AM, Bill James <bill.james(a)j2.com> wrote:
>>>>>>>> I have a hardware node that has 26 VMs.
>>>>>>>> 9 are listed as "running", 17 are listed as "paused".
>>>>>>>>
>>>>>>>> In truth all VMs are up and running fine.
>>>>>>>>
>>>>>>>> I tried telling the db they are up:
>>>>>>>>
>>>>>>>> engine=> update vm_dynamic set status = 1 where vm_guid = (select
>>>>>>>> vm_guid from vm_static where vm_name = 'api1.test.j2noc.com');
>>>>>>>>
>>>>>>>> GUI then shows it up for a short while,
>>>>>>>>
>>>>>>>> then puts it back in paused state.
>>>>>>>>
>>>>>>>> 2016-04-26 15:16:46,095 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-16) [157cc21e] VM '242ca0af-4ab2-4dd6-b515-5d435e6452c4'(api1.test.j2noc.com) moved from 'Up' --> 'Paused'
>>>>>>>> 2016-04-26 15:16:46,221 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-16) [157cc21e] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM api1.test.j2noc.com has been paused.
>>>>>>>>
>>>>>>>>
>>>>>>>> Why does the engine think the VMs are paused?
>>>>>>>> Attached engine.log.
>>>>>>>>
>>>>>>>> I can fix the problem by powering off the VM and then starting it back up.
>>>>>>>> But the VM is working fine! How do I get ovirt to realize that?
>>>>>>> If this is an issue in engine, restarting engine may fix this.
>>>>>>> But since you have this problem with only one node, I don't think this is the issue.
>>>>>>>
>>>>>>> If this is an issue in vdsm, restarting vdsm may fix this.
>>>>>>>
>>>>>>> If this does not help, maybe this is a libvirt issue? Did you try to check the VM
>>>>>>> status using virsh?
>>>>>> This looks more likely, as it seems such a status is being reported.
>>>>>> Logs would help, vdsm.log at the very least.
>>>>>>
>>>>>>> If virsh thinks that the VMs are paused, you can try to restart libvirtd.
>>>>>>>
>>>>>>> Please file a bug about this in any case, with engine and vdsm logs.
>>>>>>>
>>>>>>> Adding Michal in case he has a better idea of how to proceed.
>>>>>>>
>>>>>>>
>>>>>>> Nir
>>>>> Users(a)ovirt.org
>>>>>
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>> <engine.log-20160421.gz><vdsm.logs.tar.gz>
>>
Cloud Services for Business
www.j2.com
j2 | eFax | eVoice | FuseMail | Campaigner | KeepItSafe | Onebox
This email, its contents and attachments contain information from j2 Global, Inc. and/or
its affiliates which may be privileged, confidential or otherwise protected from
disclosure. The information is intended to be for the addressee(s) only. If you are not an
addressee, any disclosure, copy, distribution, or use of the contents of this message is
prohibited. If you have received this email in error please notify the sender by reply
e-mail and delete the original message and any copies. (c) 2015 j2 Global, Inc. All rights
reserved. eFax, eVoice, Campaigner, FuseMail, KeepItSafe, and Onebox are registered
trademarks of j2 Global, Inc. and its affiliates.
--------------090600060303000004030204--