[ovirt-users] Cinder Snapshot Issues

Daniel Erez derez at redhat.com
Mon May 23 08:31:25 UTC 2016


On Tue, Apr 12, 2016 at 11:13 PM, Kevin Hrpcek <khrpcek at gmail.com> wrote:

> Hello,
>
> I'm running into a problem with live snapshots not working when using
> cinder/ceph disks. The failures differ depending on whether memory is
> included, but in each case cinder/ceph creates a new snapshot that
> can be seen in cinder and ceph. When doing a memory/disk snapshot the VM
> ends up in a paused state and I need to kill -9 the qemu process to be able
> to boot the VM again. The engine seems to be losing connection with the
> vdsm process on the VM host after freezing the guest's filesystems. The
> guest never receives the thaw command and the operation fails, as the
> logs show. I am pasting in some log snippets.
>
> 2016-04-12 19:24:58,851 INFO
> [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand]
> (org.ovirt.thread.pool-8-thread-27) [5c4493e] Ending command
> 'org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand' successfully.
> 2016-04-12 19:27:56,873 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Correlation ID: null, Call
> Stack: null, Custom Event ID: -1, Message: VDSM OVCL1A command failed:
> Message timeout which can be caused by communication issues
> 2016-04-12 19:27:56,873 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
> 'org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand' return value
> 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=5022,
> message=Message timeout which can be caused by communication issues]]'
> 2016-04-12 19:27:56,874 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] HostName = OVCL1A
> 2016-04-12 19:27:56,874 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (DefaultQuartzScheduler_Worker-27) [4d97ca06] Command
> 'SnapshotVDSCommand(HostName = OVCL1A,
> SnapshotVDSCommandParameters:{runAsync='true',
> hostId='9bdfaedc-34a8-4a08-ad8a-c117835a6094',
> vmId='040609f6-cfe0-4763-8b32-08ffad158c93'})' execution failed:
> VDSGenericException: VDSNetworkException: Message timeout which can be
> caused by communication issues
> 2016-04-12 19:27:56,875 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager]
> (org.ovirt.thread.pool-8-thread-16) [4d97ca06] Host 'OVCL1A' is not
> responding.
>
> With disk-only live snapshots the guest file systems are frozen and the
> VM receives the thaw command, but the VM is no longer responsive. It still
> answers pings on the network but is hung, and it also needs a kill -9 of
> the qemu process before it can be booted again.
>
> jsonrpc.Executor/0::DEBUG::2016-04-12
> 19:41:58,342::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.snapshot' in bridge with {u'frozen': True, u'vmID':
> u'040609f6-cfe0-4763-8b32-08ffad158c93', u'snapDrives': []}
> jsonrpc.Executor/0::INFO::2016-04-12
> 19:41:58,343::vm::3237::virt.vm::(snapshot)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::<domainsnapshot>
>         <disks/>
> </domainsnapshot>
>
> jsonrpc.Executor/0::ERROR::2016-04-12
> 19:41:58,346::vm::3252::virt.vm::(snapshot)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Unable to take snapshot
> Traceback (most recent call last):
>   File "/usr/share/vdsm/virt/vm.py", line 3250, in snapshot
>     self._dom.snapshotCreateXML(snapxml, snapFlags)
>   File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
>     ret = attr(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line
> 124, in wrapper
>     ret = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in
> wrapper
>     return func(inst, *args, **kwargs)
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2581, in
> snapshotCreateXML
>     if ret is None:raise libvirtError('virDomainSnapshotCreateXML()
> failed', dom=self)
> libvirtError: unsupported configuration: nothing selected for snapshot
> jsonrpc.Executor/7::DEBUG::2016-04-12
> 19:41:58,391::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.thaw' in bridge with {u'vmID': u'040609f6-cfe0-4763-8b32-08ffad158c93'}
> jsonrpc.Executor/7::INFO::2016-04-12
> 19:41:58,391::vm::3041::virt.vm::(thaw)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::Thawing guest filesystems
> jsonrpc.Executor/7::INFO::2016-04-12
> 19:41:58,396::vm::3056::virt.vm::(thaw)
> vmId=`040609f6-cfe0-4763-8b32-08ffad158c93`::6 guest filesystems thawed
>
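
Note that the VM.snapshot call in the vdsm log above was sent with an empty
snapDrives list, and the logged <domainsnapshot> contains an empty <disks/>
element, which is consistent with the libvirt error "nothing selected for
snapshot". A minimal sketch in Python for checking a snapshot XML for this
condition; the helper name selected_disks is hypothetical and is not part of
vdsm or libvirt:

```python
import xml.etree.ElementTree as ET


def selected_disks(snapshot_xml):
    """Return the names of the <disk> elements listed under <disks>.

    Hypothetical helper for illustration only; it is not part of vdsm
    or libvirt. An empty list means the XML selects no disk at all.
    """
    root = ET.fromstring(snapshot_xml)
    return [disk.get("name") for disk in root.findall("./disks/disk")]


# XML as logged by vdsm for the failing disk-only snapshot.
logged_xml = "<domainsnapshot><disks/></domainsnapshot>"
print(selected_disks(logged_xml))
```

An empty list here only confirms what the log already shows; it does not by
itself explain why the VM hangs afterwards.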

This could be a guest agent issue. Please make sure that both
ovirt-guest-agent and qemu-guest-agent are installed and running in the VM.
Further details are available at:
http://www.ovirt.org/documentation/internal/guest-agent/understanding-guest-agents-and-other-tools/
In addition, could you please attach the full engine and vdsm logs?
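
As a rough sketch of the guest agent check, something like the following
Python snippet could be run inside the guest. It assumes a systemd-based
guest in which the services are named ovirt-guest-agent and
qemu-guest-agent (unit names may differ between distributions), and the
helpers agent_status and report are hypothetical:

```python
import subprocess

# Assumed systemd unit names; they may differ between distributions.
AGENT_UNITS = ("ovirt-guest-agent", "qemu-guest-agent")


def agent_status(units=AGENT_UNITS, run=subprocess.call):
    """Map each unit name to True if `systemctl is-active` exits with 0.

    `run` is injectable so the check can be exercised without systemd.
    """
    return {
        unit: run(["systemctl", "is-active", "--quiet", unit]) == 0
        for unit in units
    }


def report(status):
    """Format the status mapping as one line per unit, sorted by name."""
    return "\n".join(
        "%s: %s" % (unit, "running" if status[unit] else "NOT running")
        for unit in sorted(status)
    )
```

Running print(report(agent_status())) inside the guest lists each agent and
whether systemd considers it active. It only shows that the services are up,
not that the host can actually reach them.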


>
> Everything else is working well with cinder (creating disks, running
> VMs, live migration, etc.). I was able to take live snapshots when
> using a CephFS POSIX storage domain.
>
> Versions:
> Ceph 9.2.0
> oVirt Latest
> CentOS 7.2
> Cinder 7.0.1-1.el7
>
> Any help would be appreciated.
>
> Thanks,
> Kevin
>