[ovirt-users] Cinder Snapshot Issues

Daniel Erez derez at redhat.com
Wed May 25 12:59:59 UTC 2016


According to the VDSM log [1], the snapshot operation failed with a timeout
error.
This could be a duplicate of the bugs in [2], which are already resolved in the
latest version.
Can you please provide the versions of the following components for further
investigation:
engine / vdsm / qemu-kvm-rhev / libvirt. Also, please attach the libvirt/qemu
logs.
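
If it helps, the versions can be gathered on an RPM-based machine with a small
script along these lines. This is only a sketch: the package names in PACKAGES
are assumptions (the qemu package may be named differently on CentOS, and the
engine package is normally installed on the engine machine rather than on the
host), and the helper names are made up for illustration.

```python
import subprocess

# Assumed package names; adjust them to what is installed on your machines.
PACKAGES = ["ovirt-engine", "vdsm", "qemu-kvm-rhev", "libvirt"]


def interpret_rpm_query(returncode, output):
    """Turn the result of `rpm -q <pkg>` into a list of installed builds.

    rpm exits non-zero when the package is not installed, in which case
    None is returned instead of a list.
    """
    if returncode != 0:
        return None
    return [line.strip() for line in output.splitlines() if line.strip()]


def query_version(package):
    """Run `rpm -q` for a single package and interpret the result."""
    proc = subprocess.run(
        ["rpm", "-q", package],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return interpret_rpm_query(proc.returncode, proc.stdout)


def collect_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its installed builds or None."""
    return {pkg: query_version(pkg) for pkg in packages}
```

Calling collect_versions() on the host and on the engine machine and pasting
the two results into the reply should be enough for a first comparison against
the bug reports.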

[1] jsonrpc.Executor/6::ERROR::2016-05-23
13:09:46,790::vm::3311::virt.vm::(snapshot)
vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3309, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/virt/virdomain.py", line 76, in f
    raise toe
TimeoutError: Timed out during operation: cannot acquire state change lock
(held by remoteDispatchDomainSnapshotCreateXML)

[2]
https://bugzilla.redhat.com/show_bug.cgi?id=1261980
https://bugzilla.redhat.com/show_bug.cgi?id=1250839
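
To find similar failures in a full vdsm.log, the header line shown in [1] can
be split on "::". The sketch below assumes the field layout seen in the
excerpts in this thread (thread, level, timestamp, module, line number, logger,
message); it is inferred from those lines rather than taken from the VDSM
logging configuration, and the function names are hypothetical.

```python
from collections import namedtuple

# Field layout inferred from the log excerpts in this thread (an assumption).
VdsmLogRecord = namedtuple(
    "VdsmLogRecord",
    "thread level timestamp module lineno logger message",
)


def parse_vdsm_line(line):
    """Split a '::'-separated VDSM log line into its leading fields.

    Returns None for lines that do not look like a log header, such as
    traceback lines.
    """
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) < 7 or not parts[4].isdigit():
        return None
    thread, level, timestamp, module, lineno, logger, message = parts
    return VdsmLogRecord(
        thread, level, timestamp, module, int(lineno), logger, message
    )


def snapshot_errors(lines):
    """Yield ERROR records whose message starts with the (snapshot) marker."""
    for line in lines:
        record = parse_vdsm_line(line)
        if (
            record is not None
            and record.level == "ERROR"
            and record.message.startswith("(snapshot)")
        ):
            yield record
```

Passing an open vdsm.log file object to snapshot_errors() yields one record per
failed snapshot attempt, and the timestamps can then be matched against the
engine log.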

On Mon, May 23, 2016 at 11:31 AM, Daniel Erez <derez at redhat.com> wrote:

>
>
> On Tue, Apr 12, 2016 at 11:13 PM, Kevin Hrpcek <khrpcek at gmail.com> wrote:
>
>> Hello,
>>
>> I'm running into a problem with live snapshots not working when using
>> cinder/ceph disks. There are different failures for including and not
>> including memory, but in each case cinder/ceph creates a new snapshot that
>> can be seen in cinder and ceph. When doing a memory/disk snapshot the VM
>> ends up in a paused state and I need to kill -9 the qemu process to be able
>> to boot the vm again. The engine seems to be losing connection with the
>> vdsm process on the VM host after freezing the guest's filesystems. The
>> guest never receives the thaw command and it fails in the logs. I am
>> pasting in some log snippets.
>>
>> 2016-04-12 19:24:58,851 INFO
>> [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand]
>> (org.ovirt.thread.pool-8-thread-27) [5c4493e] Ending command
>> 'org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand' successfully.
>> 2016-04-12 19:27:56,873 ERROR
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Correlation ID: null, Call
>> Stack: null, Custom Event ID: -1, Message: VDSM OVCL1A command failed:
>> Message timeout which can be caused by communication issues
>> 2016-04-12 19:27:56,873 INFO
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
>> 'org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand' return value
>> 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=5022,
>> message=Message timeout which can be caused by communication issues]]'
>> 2016-04-12 19:27:56,874 INFO
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] HostName = OVCL1A
>> 2016-04-12 19:27:56,874 ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
>> 'SnapshotVDSCommand(HostName = OVCL1A,
>> SnapshotVDSCommandParameters:{runAsync='true',
>> hostId='9bdfaedc-34a8-4a08-ad8a-c117835a6094',
>> vmId='040609f6-cfe0-4763-8b32-08ffad158c93'})' execution failed:
>> VDSGenericException: VDSNetworkException: Message timeout which can be
>> caused by communication issues
>> 2016-04-12 19:27:56,875 WARN
>> [org.ovirt.engine.core.vdsbroker.VdsManager]
>> (org.ovirt.thread.pool-8-thread-16) [4d97ca06] Host 'OVCL1A' is not
>> responding.
>>
>> Disk-only live snapshots freeze the guest file systems and the VM receives
>> the thaw command, but the VM is no longer responsive afterwards. It still
>> answers pings on the network, but it is hung and also needs a kill -9 of the
>> qemu process before it can be booted again.
>>
>> jsonrpc.Executor/0::DEBUG::2016-04-12
>> 19:41:58,342::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
>> 'VM.snapshot' in bridge with {u'frozen': True, u'vmID':
>> u'040609f6-cfe0-4763-8b32-08ffad158c93', u'snapDrives': []}
>> jsonrpc.Executor/0::INFO::2016-04-12
>> 19:41:58,343::vm::3237::virt.vm::(snapshot)
>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::<domainsnapshot>
>>         <disks/>
>> </domainsnapshot>
>>
>> jsonrpc.Executor/0::ERROR::2016-04-12
>> 19:41:58,346::vm::3252::virt.vm::(snapshot)
>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/virt/vm.py", line 3250, in snapshot
>>     self._dom.snapshotCreateXML(snapxml, snapFlags)
>>   File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
>>     ret = attr(*args, **kwargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line
>> 124, in wrapper
>>     ret = f(*args, **kwargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in
>> wrapper
>>     return func(inst, *args, **kwargs)
>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2581, in
>> snapshotCreateXML
>>     if ret is None:raise libvirtError('virDomainSnapshotCreateXML()
>> failed', dom=self)
>> libvirtError: unsupported configuration: nothing selected for snapshot
>> jsonrpc.Executor/7::DEBUG::2016-04-12
>> 19:41:58,391::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
>> 'VM.thaw' in bridge with {u'vmID': u'040609f6-cfe0-4763-8b32-08ffad158c93'}
>> jsonrpc.Executor/7::INFO::2016-04-12
>> 19:41:58,391::vm::3041::virt.vm::(thaw)
>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Thawing guest filesystems
>> jsonrpc.Executor/7::INFO::2016-04-12
>> 19:41:58,396::vm::3056::virt.vm::(thaw)
>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::6 guest filesystems thawed
>>
>
> It could be a guest agent issue. Please make sure the
> ovirt-guest-agent and qemu-guest-agent are installed and running in the VM.
> Further details are available at:
> http://www.ovirt.org/documentation/internal/guest-agent/understanding-guest-agents-and-other-tools/
> In addition, can you please attach the full engine/vdsm logs?
>
>
>>
>> Everything else is working well with cinder for running VMs (making
>> disks, running VMs, live migration, etc...). I was able to get live
>> snapshots when using a CephFS Posix storage domain.
>>
>> Versions..
>> Ceph 9.2.0
>> oVirt Latest
>> CentOS 7.2
>> Cinder 7.0.1-1.el7
>>
>> Any help would be appreciated.
>>
>> Thanks,
>> Kevin
>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>