<html>


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


</head>


<body>


<p dir="ltr">Are you able to install qemu-debug? If so, I would try to get a backtrace.</p>


<p dir="ltr">gdb -p &lt;qemu-pid&gt;<br>
(gdb) bt</p>
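<p dir="ltr">If an interactive session is inconvenient, the same backtrace can probably be collected in one shot. The following is only a sketch: the function name qemu_backtrace is made up for illustration, and it assumes gdb is installed and that you run it as root (or as the user owning the qemu-kvm process). Note that the process is paused while gdb is attached.</p>
<pre>
```shell
# Hypothetical helper (name is an assumption): print backtraces of all
# threads of a running process, then let gdb detach and exit.
qemu_backtrace() {
    pid="$1"
    # Refuse anything that is not a plain decimal PID.
    case "$pid" in
        ''|*[!0-9]*) return 2 ;;
    esac
    # -batch exits after running the -ex commands; "thread apply all bt"
    # prints a backtrace for every thread, not only the current one.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}
```
</pre>
<p dir="ltr">Usage would be something like qemu_backtrace &lt;qemu-pid&gt;, with the output redirected to a file that can be attached to the thread.</p>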


<p dir="ltr">Markus</p>


<div class="gmail_quote">On 15.09.2015 at 4:15 PM, Daniel Helgenberger &lt;daniel.helgenberger@m-box.de&gt; wrote:<br type="attribution">


<blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div>


<div>Hello,<br>


<br>


I do not want to hijack the thread but maybe my issue is related?<br>


<br>


It might have started with oVirt 3.5.3, but I cannot tell for sure.<br>


<br>


For me, one VM (foreman) is affected, now for the second time in 14 days. I can confirm the symptoms, as I also lose all network connectivity to the VM and the ability to connect to a console.<br>


Also, the only thing that 'fixes' the issue right now is 'kill -9 &lt;pid of qemu-kvm process&gt;'.<br>


<br>


As far as I can tell, the VM became unresponsive at around Sep 15 12:30:01; the engine logged this at 12:34. There is nothing obvious in the VDSM logs (see attached).<br>


<br>


The relevant part of engine.log is below.<br>


<br>


Versions:<br>


ovirt-engine-3.5.4.2-1.el7.centos.noarch<br>


<br>


vdsm-4.16.26-0.el7.centos<br>


libvirt-1.2.8-16.el7_1.3<br>


<br>


engine.log (12:00 - 13:00):<br>


2015-09-15 12:03:47,949 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-56) [264d502a] HA reservation status for cluster Default is OK<br>
2015-09-15 12:08:02,708 INFO&nbsp; [org.ovirt.engine.core.bll.OvfDataUpdater] (DefaultQuartzScheduler_Worker-89) [2e7bf56e] Attempting to update VMs/Templates Ovf.<br>
2015-09-15 12:08:02,709 INFO&nbsp; [org.ovirt.engine.core.bll.ProcessOvfUpdateForStoragePoolCommand] (DefaultQuartzScheduler_Worker-89) [5e9f4ba6] Running command: ProcessOvfUpdateForStoragePoolCommand internal: true. Entities affected :&nbsp; ID: 00000002-0002-0002-0002-000000000088 Type: l<br>
2015-09-15 12:08:02,780 INFO&nbsp; [org.ovirt.engine.core.bll.ProcessOvfUpdateForStoragePoolCommand] (DefaultQuartzScheduler_Worker-89) [5e9f4ba6] Lock freed to object EngineLock [exclusiveLocks= key: 00000002-0002-0002-0002-000000000088 value: OVF_UPDATE<br>
2015-09-15 12:08:47,997 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-21) [3fc854a2] HA reservation status for cluster Default is OK<br>
2015-09-15 12:13:06,998 INFO&nbsp; [org.ovirt.engine.core.vdsbroker.vdsbroker.GetFileStatsVDSCommand] (org.ovirt.thread.pool-8-thread-48) [50221cdc] START, GetFileStatsVDSCommand( storagePoolId = 00000002-0002-0002-0002-000000000088, ignoreFailoverLimit = false), log id: 1503968<br>
2015-09-15 12:13:07,137 INFO&nbsp; [org.ovirt.engine.core.vdsbroker.vdsbroker.GetFileStatsVDSCommand] (org.ovirt.thread.pool-8-thread-48) [50221cdc] FINISH, GetFileStatsVDSCommand, return: {pfSense-2.0-RELEASE-i386.iso={status=0, ctime=1432286887.0, size=115709952}, Fedora-15-i686-Live8<br>
2015-09-15 12:13:07,178 INFO&nbsp; [org.ovirt.engine.core.bll.IsoDomainListSyncronizer] (org.ovirt.thread.pool-8-thread-48) [50221cdc] Finished automatic refresh process for ISO file type with success, for storage domain id 84dcb2fc-fb63-442f-aa77-3e84dc7d5a72.<br>
2015-09-15 12:13:48,043 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-87) [4fa1bb16] HA reservation status for cluster Default is OK<br>
2015-09-15 12:18:48,088 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-44) [6345e698] HA reservation status for cluster Default is OK<br>
2015-09-15 12:23:48,137 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-13) HA reservation status for cluster Default is OK<br>
2015-09-15 12:28:48,183 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-76) [154c91d5] HA reservation status for cluster Default is OK<br>
2015-09-15 12:33:48,229 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-36) [27c73ac6] HA reservation status for cluster Default is OK<br>
2015-09-15 12:34:49,432 INFO&nbsp; [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-41) [5f2a4b68] VM foreman 8b57ff1d-2800-48ad-b267-fd8e9e2f6fb2 moved from Up --&gt; NotResponding<br>
2015-09-15 12:34:49,578 WARN&nbsp; [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-41) [5f2a4b68] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM foreman is not responding.<br>
2015-09-15 12:38:48,273 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-10) [7a800766] HA reservation status for cluster Default is OK<br>
2015-09-15 12:43:48,320 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-42) [440f1c40] HA reservation status for cluster Default is OK<br>
2015-09-15 12:48:48,366 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-70) HA reservation status for cluster Default is OK<br>
2015-09-15 12:53:48,412 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-12) [50221cdc] HA reservation status for cluster Default is OK<br>
2015-09-15 12:58:48,459 INFO&nbsp; [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler_Worker-3) HA reservation status for cluster Default is OK<br>


<br>


<br>


<br>


On 29.08.2015 22:48, Christian Hailer wrote:<br>


&gt; Hello,<br>


&gt; <br>


&gt; last Wednesday I wanted to update my oVirt 3.5 hypervisor. It is a single CentOS 7 server, so I started by suspending the VMs in order to set the oVirt engine host to maintenance mode. While the VMs were being suspended, the server crashed with a kernel panic…<br>


&gt; <br>


&gt; After restarting the server I installed the updates via yum and restarted the server again. Afterwards, all the VMs could be started again. Some hours later my monitoring system registered some unresponsive hosts. I had a look in the oVirt interface: 3 of the VMs were in the state “not responding”, marked by a question mark.<br>


&gt; <br>


&gt; I tried to shut down the VMs, but oVirt wasn’t able to do so. I tried to reset the status in the database with the SQL statement<br>
&gt; <br>
&gt; update vm_dynamic set status = 0 where vm_guid = (select vm_guid from vm_static where vm_name = 'MYVMNAME');<br>


&gt; <br>


&gt; but that didn’t help, either. Only rebooting the whole hypervisor helped… <br>


&gt; afterwards everything worked again. But only for a few hours, then one of the <br>


&gt; VMs entered the “not responding” state again… again only a reboot helped. <br>


&gt; Yesterday it happened again:<br>


&gt; <br>


&gt; 2015-08-28 17:44:22,664 INFO&nbsp; [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-60) [4ef90b12] VM DC 0f3d1f06-e516-48ce-aa6f-7273c33d3491 moved from Up --&gt; NotResponding<br>
&gt; <br>
&gt; 2015-08-28 17:44:22,692 WARN&nbsp; [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-60) [4ef90b12] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM DC is not responding.<br>


&gt; <br>


&gt; Does anybody know what I can do? Where should I have a look? Hints are greatly <br>


&gt; appreciated!<br>


&gt; <br>


&gt; Thanks,<br>


&gt; <br>


&gt; Christian<br>


&gt; <br>


<br>


-- <br>


Daniel Helgenberger<br>


m box bewegtbild GmbH<br>


<br>


P: &#43;49/30/2408781-22<br>


F: &#43;49/30/2408781-10<br>


<br>


ACKERSTR. 19<br>


D-10115 BERLIN<br>


<br>


<br>


<a href="http://www.m-box.de">www.m-box.de</a>&nbsp; <a href="http://www.monkeymen.tv">www.monkeymen.tv</a><br>


<br>


Managing Directors: Martin Retschitzegger / Michaela Göllner<br>
Commercial register: Amtsgericht Charlottenburg / HRB 112767<br>


</div>


</div>


</blockquote>


</div>


</body>


</html>