[ovirt-users] trouble when creating VM snapshots including memory

Matthias Leopold matthias.leopold at meduniwien.ac.at
Mon Jun 12 09:06:09 UTC 2017



On 2017-06-09 at 21:48, Karli Sjöberg wrote:
> 
> 
> On 9 June 2017 at 21:40, Matthias Leopold
> <matthias.leopold at meduniwien.ac.at> wrote:
> 
>     hi,
> 
>     i'm having trouble creating VM snapshots that include memory in my
>     oVirt 4.1 test environment. when i do this the VM gets paused, and
>     shortly afterwards (20-30s) i'm seeing messages in engine.log about
>     both iSCSI storage domains (the master storage domain and the data
>     storage domain where the VM resides) experiencing high latency. this
>     quickly worsens from the engine's point of view: the VM is
>     unresponsive, the host is unresponsive, and the engine wants to fence
>     the host (impossible because it's the only host in the test
>     cluster). in the end there is an EngineException:
> 
>     EngineException:
>     org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
>     VDSGenericException: VDSNetworkException: Message timeout which can be
>     caused by communication issues (Failed with error VDS_NETWORK_ERROR and
>     code 5022)
> 
>     the snapshot fails and is left in an inconsistent state. the situation
>     has to be resolved manually with unlock_entity.sh and maybe lvm
>     commands. this happened twice in exactly the same manner. VM snapshots
>     without memory for this VM are not a problem.
> 
>     VM guest OS is CentOS7 installed from one of the ovirt-image-repository
>     images. it has the oVirt guest agent running.
> 
>     what could be wrong?
> 
> 
> It seems to me that the snapshot operation, where the host needs to save
> all of the VM memory, chokes the storage pipe; the host becomes
> "unresponsive" from the engine's point of view and it all goes up shit creek.
> How is the hypervisor connected to the storage, in more detail?
> 
> /K

i shot myself in the foot by also playing around with network QoS and
forgetting about it... no wonder the network chokes when i tell it to
do so. without randomly applied QoS profiles, snapshots work perfectly ;-)
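
a rough sketch of why a forgotten QoS cap hurts here: a snapshot with memory has to write the whole guest RAM to storage, so the time it takes scales with RAM size divided by the allowed rate. the 4 GiB guest size and the rate caps below are made-up illustration values, not measurements from this setup, and dump_seconds is just a hypothetical helper name.

```python
def dump_seconds(memory_gib: float, rate_mbit_per_s: float) -> float:
    """Seconds needed to push `memory_gib` GiB through a link capped at
    `rate_mbit_per_s` Mbit/s (ignores protocol overhead and compression)."""
    bits = memory_gib * 1024**3 * 8              # GiB -> bits
    return bits / (rate_mbit_per_s * 1_000_000)  # Mbit/s -> bit/s


if __name__ == "__main__":
    guest_ram_gib = 4  # hypothetical guest RAM size
    for rate in (10, 100, 1000):  # hypothetical QoS caps in Mbit/s
        secs = dump_seconds(guest_ram_gib, rate)
        print(f"{rate:>5} Mbit/s cap: about {secs:.0f} s to write {guest_ram_gib} GiB")
```

the point is only the order of magnitude: with a tight cap the memory dump can occupy the storage link for far longer than the engine is willing to wait for the host to answer.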

thx
matthias



