According to the VDSM log [1], there was a timeout error during the snapshot
operation.
This could be a duplicate of an issue [2] that was already resolved in the
latest version.
Could you please provide the versions of the following components for further
investigation: engine / vdsm / qemu-kvm-rhev / libvirt. Also, please attach
the libvirt/qemu logs.
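For reference, the vdsm log is typically /var/log/vdsm/vdsm.log and the per-VM
qemu log is typically /var/log/libvirt/qemu/<vm-name>.log on the host. The
package versions can be read with rpm -q ovirt-engine (on the engine machine)
and rpm -q vdsm libvirt qemu-kvm-rhev (on the host); alternatively, a minimal
libvirt-python sketch along these lines (assuming the default qemu:///system
URI) reports the versions libvirtd itself sees:

    import libvirt

    # Print the libvirt and QEMU versions as seen by libvirtd on the host.
    # Assumes libvirt-python is installed and the default qemu:///system URI.
    conn = libvirt.open('qemu:///system')
    # Both values are encoded as major * 1000000 + minor * 1000 + release.
    print('libvirt: %d' % conn.getLibVersion())
    print('qemu: %d' % conn.getVersion())
    conn.close()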
[1] jsonrpc.Executor/6::ERROR::2016-05-23
13:09:46,790::vm::3311::virt.vm::(snapshot)
vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
Traceback (most recent call last):
File "/usr/share/vdsm/virt/vm.py", line 3309, in snapshot
self._dom.snapshotCreateXML(snapxml, snapFlags)
File "/usr/share/vdsm/virt/virdomain.py", line 76, in f
raise toe
TimeoutError: Timed out during operation: cannot acquire state change lock
(held by remoteDispatchDomainSnapshotCreateXML)
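The "cannot acquire state change lock" part of [1] means libvirt still
considered an earlier snapshot call (remoteDispatchDomainSnapshotCreateXML) to
be in progress for that domain when vdsm retried. As a rough check on the
host, a small libvirt-python sketch like the one below (the UUID is taken from
the log above; it assumes the default qemu:///system URI) shows whether a
long-running job, such as a memory snapshot, is still active on the domain:

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByUUIDString('040609f6-cfe0-4763-8b32-08ffad158c93')
    # jobInfo() returns [type, timeElapsed(ms), ...]; type 0 (VIR_DOMAIN_JOB_NONE)
    # means no job is active, anything else means an operation is still running.
    info = dom.jobInfo()
    print('job type: %d, elapsed: %d ms' % (info[0], info[1]))
    conn.close()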
[2]
On Tue, Apr 12, 2016 at 11:13 PM, Kevin Hrpcek <khrpcek(a)gmail.com> wrote:
> Hello,
>
> I'm running into a problem with live snapshots not working when using
> cinder/ceph disks. The failures differ depending on whether memory is
> included, but in each case cinder/ceph creates a new snapshot that can be
> seen in both cinder and ceph. When doing a memory/disk snapshot the VM ends
> up in a paused state and I need to kill -9 the qemu process to be able to
> boot the VM again. The engine seems to lose its connection to the vdsm
> process on the VM host after freezing the guest's filesystems. The guest
> never receives the thaw command and it fails in the logs. I am pasting in
> some log snippets.
>
> 2016-04-12 19:24:58,851 INFO
> [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand]
> (org.ovirt.thread.pool-8-thread-27) [5c4493e] Ending command
> 'org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand' successfully.
> 2016-04-12 19:27:56,873 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Correlation ID: null, Call
> Stack: null, Custom Event ID: -1, Message: VDSM OVCL1A command failed:
> Message timeout which can be caused by communication issues
> 2016-04-12 19:27:56,873 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
> 'org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand' return value
> 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=5022,
> message=Message timeout which can be caused by communication issues]]'
> 2016-04-12 19:27:56,874 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] HostName = OVCL1A
> 2016-04-12 19:27:56,874 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
> 'SnapshotVDSCommand(HostName = OVCL1A,
> SnapshotVDSCommandParameters:{runAsync='true',
> hostId='9bdfaedc-34a8-4a08-ad8a-c117835a6094',
> vmId='040609f6-cfe0-4763-8b32-08ffad158c93'})' execution failed:
> VDSGenericException: VDSNetworkException: Message timeout which can be
> caused by communication issues
> 2016-04-12 19:27:56,875 WARN
> [org.ovirt.engine.core.vdsbroker.VdsManager]
> (org.ovirt.thread.pool-8-thread-16) [4d97ca06] Host 'OVCL1A' is not
> responding.
>
> Disk-only live snapshots freeze the guest filesystems and the VM receives
> the thaw command, but the VM is no longer responsive. The VM pings on the
> network, but it is hung and also needs a kill -9 of the qemu process so
> that it can be booted again.
>
> jsonrpc.Executor/0::DEBUG::2016-04-12
> 19:41:58,342::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.snapshot' in bridge with {u'frozen': True, u'vmID':
> u'040609f6-cfe0-4763-8b32-08ffad158c93', u'snapDrives': []}
> jsonrpc.Executor/0::INFO::2016-04-12
> 19:41:58,343::vm::3237::virt.vm::(snapshot)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::<domainsnapshot>
> <disks/>
> </domainsnapshot>
>
> jsonrpc.Executor/0::ERROR::2016-04-12
> 19:41:58,346::vm::3252::virt.vm::(snapshot)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
> Traceback (most recent call last):
> File "/usr/share/vdsm/virt/vm.py", line 3250, in snapshot
> self._dom.snapshotCreateXML(snapxml, snapFlags)
> File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
> ret = attr(*args, **kwargs)
> File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line
> 124, in wrapper
> ret = f(*args, **kwargs)
> File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in
> wrapper
> return func(inst, *args, **kwargs)
> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2581, in
> snapshotCreateXML
> if ret is None:raise libvirtError('virDomainSnapshotCreateXML()
> failed', dom=self)
> libvirtError: unsupported configuration: nothing selected for snapshot
> jsonrpc.Executor/7::DEBUG::2016-04-12
> 19:41:58,391::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.thaw' in bridge with {u'vmID':
> u'040609f6-cfe0-4763-8b32-08ffad158c93'}
> jsonrpc.Executor/7::INFO::2016-04-12
> 19:41:58,391::vm::3041::virt.vm::(thaw)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Thawing guest filesystems
> jsonrpc.Executor/7::INFO::2016-04-12
> 19:41:58,396::vm::3056::virt.vm::(thaw)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::6 guest filesystems thawed
>
It could be an issue with the guest agent. Please make sure that
ovirt-guest-agent and qemu-guest-agent are installed and running in the VM.
Further details are available at:
http://www.ovirt.org/documentation/internal/guest-agent/understanding-gue...
In addition, could you please attach the full engine/vdsm logs?
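As a quick check from the host side, the qemu guest agent can also be pinged
through libvirt. A minimal sketch, assuming libvirt-python with the
libvirt_qemu module is available and using the VM UUID from the logs above:

    import libvirt
    import libvirt_qemu

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByUUIDString('040609f6-cfe0-4763-8b32-08ffad158c93')
    # guest-ping returns '{"return": {}}' when qemu-guest-agent answers inside
    # the guest; a timeout or error here points at the agent or its channel.
    reply = libvirt_qemu.qemuAgentCommand(dom, '{"execute": "guest-ping"}', 5, 0)
    print(reply)
    conn.close()

Inside the guest, systemctl status qemu-guest-agent ovirt-guest-agent (on a
systemd-based distribution) shows whether both services are running.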
>
> Everything else is working well with cinder (creating disks, running VMs,
> live migration, etc.). I was able to take live snapshots when using a
> CephFS POSIX storage domain.
>
> Versions..
> Ceph 9.2.0
> oVirt Latest
> CentOS 7.2
> Cinder 7.0.1-1.el7
>
> Any help would be appreciated.
>
> Thanks,
> Kevin
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
>
> http://lists.ovirt.org/mailman/listinfo/users
>
>