Hi all,
Since we upgraded our Ovirt nodes to CentOS 7 a vm (not a specific one but never more then
one) will sometimes pause suddenly with the error "VM ... has paused due to unknown
storage error". It happens now two times in a month.
The Ovirt node uses san storage for the vm's running on it. When a specific vm is
pausing with an error the other vm's keeps running without problems.
The vm runs without problems after unpausing it.
Versions:
CentOS Linux release 7.1.1503
vdsm-4.14.17-0
libvirt-daemon-1.2.8-16
vdsm.log:
VM Channels Listener::DEBUG::2015-10-25
07:43:54,382::vmChannels::95::vds::(_handle_timeouts) Timeout on fileno 78.
libvirtEventLoop::INFO::2015-10-25 07:43:56,177::vm::4602::vm.Vm::(_onIOError)
vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error
eother
libvirtEventLoop::DEBUG::2015-10-25
07:43:56,178::vm::5204::vm.Vm::(_onLibvirtLifecycleEvent)
vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::event Suspended detail 2 opaque None
libvirtEventLoop::INFO::2015-10-25 07:43:56,178::vm::4602::vm.Vm::(_onIOError)
vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error
eother
...........
libvirtEventLoop::INFO::2015-10-25 07:43:56,180::vm::4602::vm.Vm::(_onIOError)
vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error
eother
specific error part in libvirt vm log:
block I/O error in device 'drive-virtio-disk0': Unknown error 32758 (32758)
...........
block I/O error in device 'drive-virtio-disk0': Unknown error 32758 (32758)
engine.log:
2015-10-25 07:44:48,945 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-40) [a43dcc8] VM diataal-prod-cas1
77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb moved from
Up --> Paused
2015-10-25 07:44:49,003 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-40) [a43dcc8] Correlation ID: null, Call Stack: null,
Custom Event
ID: -1, Message: VM diataal-prod-cas1 has paused due to unknown storage error.
Has anyone experienced the same problem or knows a way to solve this?
Kind regards,
Jasper