[ovirt-users] Cinder Snapshot Issues

Kevin Hrpcek khrpcek at gmail.com
Wed May 25 14:05:49 EDT 2016


The qemu log for a Cinder-based VM is attached.

Current versions:
ovirt-engine-3.6.5.3-1.el7.centos.noarch
vdsm-4.17.26-0.el7.centos.noarch
qemu-kvm-ev-2.3.0-31.el7_2.10.1.x86_64
libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-1.2.17-13.el7_2.4.x86_64
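
In case it helps anyone collect the same information, here is a minimal sketch of gathering these versions with `rpm -q`. The package names are the ones listed above; the helper names `parse_rpm_query` and `query_versions` are made up for illustration. It assumes it is run on each machine separately, since ovirt-engine lives on the engine host and the rest on the hypervisor.

```python
import subprocess

# Package names taken from the version list above.
PACKAGES = ["ovirt-engine", "vdsm", "qemu-kvm-ev",
            "libvirt-daemon-kvm", "libvirt-daemon"]


def parse_rpm_query(output):
    """Split `rpm -q` output into installed and not-installed lines."""
    installed, missing = [], []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # rpm reports absent packages as "package NAME is not installed".
        if line.startswith("package ") and line.endswith(" is not installed"):
            missing.append(line)
        else:
            installed.append(line)
    return installed, missing


def query_versions(packages=PACKAGES):
    """Run `rpm -q` for the given packages and parse what it prints."""
    # rpm exits non-zero if any package is missing, so the return code
    # is deliberately not checked here.
    result = subprocess.run(["rpm", "-q"] + list(packages),
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return parse_rpm_query(result.stdout)


if __name__ == "__main__":
    installed, missing = query_versions()
    for line in installed + missing:
        print(line)
```

On a hypervisor the ovirt-engine entry would be expected to show up as not installed, and the other way round on the engine host.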


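A note on the disk-only failure in the quoted VDSM log further down: `VM.snapshot` is called with an empty `snapDrives` list, the logged snapshot XML has an empty `<disks/>` element, and libvirt then reports "nothing selected for snapshot". The following is only a sketch of checking a snapshot XML for that condition; the function name `selected_disks` and the second XML string with a `vda` disk are made up for illustration and do not come from the logs.

```python
import xml.etree.ElementTree as ET


def selected_disks(snapshot_xml):
    """Return the <disk> elements listed under <disks> in a domainsnapshot XML."""
    root = ET.fromstring(snapshot_xml)
    disks = root.find("disks")
    # Compare against None explicitly: an element without children is falsy.
    if disks is None:
        return []
    return disks.findall("disk")


# XML as logged by VDSM for the failing disk-only snapshot.
logged_xml = "<domainsnapshot><disks/></domainsnapshot>"

# Hypothetical XML that does select one disk, for comparison only.
example_xml = (
    "<domainsnapshot><disks>"
    '<disk name="vda" snapshot="external"/>'
    "</disks></domainsnapshot>"
)

for label, xml_text in (("logged", logged_xml), ("example", example_xml)):
    print(label, len(selected_disks(xml_text)))
```

If the list comes back empty, the request selects no disks, which appears to be the situation libvirt rejects in the traceback.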

On Wed, May 25, 2016 at 7:59 AM, Daniel Erez <derez at redhat.com> wrote:

> According to the VDSM log [1], there was a timeout error during the snapshot
> operation.
> This could be a duplicate of the bugs in [2], which are already resolved in
> the latest version.
> Can you please provide the versions of the following components for
> further investigation:
> engine / vdsm / qemu-kvm-rhev / libvirt. Also, please attach the libvirt/qemu
> logs.
>
> [1] jsonrpc.Executor/6::ERROR::2016-05-23
> 13:09:46,790::vm::3311::virt.vm::(snapshot)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
> Traceback (most recent call last):
>   File "/usr/share/vdsm/virt/vm.py", line 3309, in snapshot
>     self._dom.snapshotCreateXML(snapxml, snapFlags)
>   File "/usr/share/vdsm/virt/virdomain.py", line 76, in f
>     raise toe
> TimeoutError: Timed out during operation: cannot acquire state change lock
> (held by remoteDispatchDomainSnapshotCreateXML)
>
> [2]
> https://bugzilla.redhat.com/show_bug.cgi?id=1261980
> https://bugzilla.redhat.com/show_bug.cgi?id=1250839
>
> On Mon, May 23, 2016 at 11:31 AM, Daniel Erez <derez at redhat.com> wrote:
>
>>
>>
>> On Tue, Apr 12, 2016 at 11:13 PM, Kevin Hrpcek <khrpcek at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm running into a problem with live snapshots not working when using
>>> cinder/ceph disks. The failure differs depending on whether the snapshot
>>> includes memory, but in each case cinder/ceph creates a new snapshot that
>>> can be seen in cinder and ceph. When doing a memory/disk snapshot the VM
>>> ends up in a paused state and I need to kill -9 the qemu process to be able
>>> to boot the VM again. The engine seems to lose its connection to the
>>> vdsm process on the VM host after freezing the guest's filesystems. The
>>> guest never receives the thaw command and the operation is logged as
>>> failed. I am pasting in some log snippets.
>>>
>>> 2016-04-12 19:24:58,851 INFO
>>> [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand]
>>> (org.ovirt.thread.pool-8-thread-27) [5c4493e] Ending command
>>> 'org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand' successfully.
>>> 2016-04-12 19:27:56,873 ERROR
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Correlation ID: null, Call
>>> Stack: null, Custom Event ID: -1, Message: VDSM OVCL1A command failed:
>>> Message timeout which can be caused by communication issues
>>> 2016-04-12 19:27:56,873 INFO
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
>>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
>>> 'org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand' return value
>>> 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=5022,
>>> message=Message timeout which can be caused by communication issues]]'
>>> 2016-04-12 19:27:56,874 INFO
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
>>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] HostName = OVCL1A
>>> 2016-04-12 19:27:56,874 ERROR
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
>>> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
>>> 'SnapshotVDSCommand(HostName = OVCL1A,
>>> SnapshotVDSCommandParameters:{runAsync='true',
>>> hostId='9bdfaedc-34a8-4a08-ad8a-c117835a6094',
>>> vmId='040609f6-cfe0-4763-8b32-08ffad158c93'})' execution failed:
>>> VDSGenericException: VDSNetworkException: Message timeout which can be
>>> caused by communication issues
>>> 2016-04-12 19:27:56,875 WARN
>>> [org.ovirt.engine.core.vdsbroker.VdsManager]
>>> (org.ovirt.thread.pool-8-thread-16) [4d97ca06] Host 'OVCL1A' is not
>>> responding.
>>>
>>> Disk-only live snapshots freeze the guest filesystems and the VM receives
>>> the thaw command, but the VM is no longer responsive afterwards. It still
>>> answers pings on the network but is hung, and it also needs a kill -9 of
>>> the qemu process before it can be booted again.
>>>
>>> jsonrpc.Executor/0::DEBUG::2016-04-12
>>> 19:41:58,342::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
>>> 'VM.snapshot' in bridge with {u'frozen': True, u'vmID':
>>> u'040609f6-cfe0-4763-8b32-08ffad158c93', u'snapDrives': []}
>>> jsonrpc.Executor/0::INFO::2016-04-12
>>> 19:41:58,343::vm::3237::virt.vm::(snapshot)
>>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::<domainsnapshot>
>>>         <disks/>
>>> </domainsnapshot>
>>>
>>> jsonrpc.Executor/0::ERROR::2016-04-12
>>> 19:41:58,346::vm::3252::virt.vm::(snapshot)
>>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/virt/vm.py", line 3250, in snapshot
>>>     self._dom.snapshotCreateXML(snapxml, snapFlags)
>>>   File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
>>>     ret = attr(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py",
>>> line 124, in wrapper
>>>     ret = f(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in
>>> wrapper
>>>     return func(inst, *args, **kwargs)
>>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2581, in
>>> snapshotCreateXML
>>>     if ret is None:raise libvirtError('virDomainSnapshotCreateXML()
>>> failed', dom=self)
>>> libvirtError: unsupported configuration: nothing selected for snapshot
>>> jsonrpc.Executor/7::DEBUG::2016-04-12
>>> 19:41:58,391::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
>>> 'VM.thaw' in bridge with {u'vmID': u'040609f6-cfe0-4763-8b32-08ffad158c93'}
>>> jsonrpc.Executor/7::INFO::2016-04-12
>>> 19:41:58,391::vm::3041::virt.vm::(thaw)
>>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Thawing guest filesystems
>>> jsonrpc.Executor/7::INFO::2016-04-12
>>> 19:41:58,396::vm::3056::virt.vm::(thaw)
>>> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::6 guest filesystems thawed
>>>
>>
>> It could be a guest agent issue. Please make sure the
>> ovirt-guest-agent and qemu-guest-agent are installed and running in the VM.
>> Further details are available at:
>> http://www.ovirt.org/documentation/internal/guest-agent/understanding-guest-agents-and-other-tools/
>> In addition, can you please attach the full engine/vdsm logs?
>>
>>
>>>
>>> Everything else works well with cinder (creating disks, running VMs,
>>> live migration, etc.). Live snapshots did work when I used a CephFS
>>> POSIX storage domain.
>>>
>>> Versions:
>>> Ceph 9.2.0
>>> oVirt Latest
>>> CentOS 7.2
>>> Cinder 7.0.1-1.el7
>>>
>>> Any help would be appreciated.
>>>
>>> Thanks,
>>> Kevin
>>>
>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ov1.log.gz
Type: application/x-gzip
Size: 2636 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20160525/07585088/attachment.gz>