[ovirt-users] vm freezes when using yum update

Yaniv Kaul ykaul at redhat.com
Mon Jul 3 07:00:33 UTC 2017


On Mon, Jul 3, 2017 at 6:49 AM, M Mahboubian <m_mahboubian at yahoo.com> wrote:

> Hi Yaniv,
>
> Thanks for your reply. Apologies for my late reply; we had a long holiday
> here.
>
> To answer you:
>
> Yes, the guest VM becomes completely frozen and unresponsive as soon as
> its disk has any activity, for example when we shut down or do a yum update.
>
>
> Versions of all the components involved - guest OS, host OS (qemu-kvm
> version), how do you run the VM (vdsm log would be helpful here), exact
> storage specification (1Gb or 10Gb link? What is the NFS version? What is
> it hosted on? etc.)
>  Y.
>
> Some facts about our environment:
>
> 1) Previously, this environment was running Xen with raw disks, and we changed
> it to oVirt (oVirt was able to read the VMs' disks without any
> conversion).
>

Interesting - what interface are they using?
Is that raw or raw sparse? How did you perform the conversion? (or no
conversion - just copied the disks over?)


> 2) The issue we are facing is not happening for any of the existing VMs.
> *3) This issue only happens for new VMs.*
>

New VMs from blank, or from a template (as a snapshot over the previous
VMs)?


> 4) Guest (kernel v3.10) and host (kernel v4.1) OSes are both CentOS 7
> minimal installation.
>

Kernel 4.1? From where?


> *5) NFS version 4* and using oVirt 4.1
> 6) *The network speed is 1 Gb.*
>

That might be very slow (but should not cause such an issue unless
severely overloaded).
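
As a rough back-of-the-envelope check on the link speed, the sketch below converts a nominal link rate into payload throughput and a transfer time. The `efficiency` factor and the 10 GiB example size are assumptions chosen for illustration, not measurements from this environment, and real NFS throughput will be lower than the nominal figure.

```python
def link_throughput_mb_per_s(link_gbit: float, efficiency: float = 1.0) -> float:
    """Return throughput in MB/s (10**6 bytes) for a link rate given in Gbit/s.

    `efficiency` is a hypothetical fudge factor for protocol overhead
    (1.0 means the full nominal rate is available for payload).
    """
    bits_per_second = link_gbit * 1_000_000_000
    return bits_per_second * efficiency / 8 / 1_000_000


def transfer_seconds(size_bytes: int, link_gbit: float, efficiency: float = 1.0) -> float:
    """Return the time needed to move `size_bytes` over the link."""
    bytes_per_second = link_throughput_mb_per_s(link_gbit, efficiency) * 1_000_000
    return size_bytes / bytes_per_second


if __name__ == "__main__":
    example_size = 10 * 1024**3  # hypothetical 10 GiB of disk I/O
    for gbit in (1, 10):
        rate = link_throughput_mb_per_s(gbit)
        secs = transfer_seconds(example_size, gbit)
        print(f"{gbit} Gb link: at most {rate:.0f} MB/s, {secs:.1f} s for 10 GiB")
```

A 1 Gb link therefore tops out at about 125 MB/s shared by every VM on the host, which is slow for virtual disks but by itself would not explain I/O errors.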

> 7) The output of rpm -qa | grep qemu-kvm shows:
> *     qemu-kvm-common-ev-2.6.0-28.el7_3.6.1.x86_64*
> *     qemu-kvm-tools-ev-2.6.0-28.el7_3.6.1.x86_64*
> *     qemu-kvm-ev-2.6.0-28.el7_3.6.1.x86_64*
>

That's good - that's almost the latest-greatest.


> *8) *The storage is from a SAN device which is connected to the NFS
> server using Fibre Channel.
>
> For example, it also froze during shutdown, and the events section showed
> something like this:
>
> *VM ILMU_WEB has been paused due to storage I/O problem.*
>

We might need to get libvirt debug logs (and perhaps journal output of the
host).
Y.


>
>
> More information:
>
> VDSM log at the time of this issue (The issue happened at Jul 3, 2017
> 9:50:43 AM):
>
> 2017-07-03 09:50:37,113+0800 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:515)
> 2017-07-03 09:50:37,897+0800 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.02 seconds (__init__:515)
> 2017-07-03 09:50:42,510+0800 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:515)
> *2017-07-03 09:50:43,548+0800 INFO  (jsonrpc/3) [dispatcher] Run and protect: repoStats(options=None) (logUtils:51)
> 2017-07-03 09:50:43,548+0800 INFO  (jsonrpc/3) [dispatcher] Run and protect: repoStats, Return response: {u'e01186c1-7e44-4808-b551-4722f0f8e84b': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000144822', 'lastCheck': '8.9', 'valid': True}, u'721b5233-b0ba-4722-8a7d-ba2a372190a0': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000327909', 'lastCheck': '8.9', 'valid': True}, u'94775bd3-3244-45b4-8a06-37eff8856afa': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000256425', 'lastCheck': '8.9', 'valid': True}, u'731bb771-5b73-4b5c-ac46-56499df97721': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000238159', 'lastCheck': '8.9', 'valid': True}, u'f620781f-93d4-4410-8697-eb41045cacd6': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.00022004', 'lastCheck': '8.9', 'valid': True}, u'a1a7d0a4-e3b6-4bd5-862b-96e70dae3f29': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000298581', 'lastCheck': '8.8', 'valid': True}} (logUtils:54)
> *2017-07-03 09:50:43,563+0800 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.getStats succeeded in 0.01 seconds (__init__:515)
> 2017-07-03 09:50:46,737+0800 INFO  (periodic/3) [dispatcher] Run and protect: getVolumeSize(sdUUID=u'721b5233-b0ba-4722-8a7d-ba2a372190a0', spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d', imgUUID=u'3c26476e-1dae-44d7-9208-531b91ae5ae1', volUUID=u'a7e789fb-6646-4d0a-9b51-f5ab8242c8d5', options=None) (logUtils:51)
> 2017-07-03 09:50:46,738+0800 INFO  (periodic/0) [dispatcher] Run and protect: getVolumeSize(sdUUID=u'f620781f-93d4-4410-8697-eb41045cacd6', spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d', imgUUID=u'2158fdae-54e1-413d-a844-73da5d1bb4ca', volUUID=u'6ee0b0eb-0bba-4e18-9c00-c1539b632e8a', options=None) (logUtils:51)
> 2017-07-03 09:50:46,740+0800 INFO  (periodic/2) [dispatcher] Run and protect: getVolumeSize(sdUUID=u'f620781f-93d4-4410-8697-eb41045cacd6', spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d', imgUUID=u'a967016d-a56b-41e8-b7a2-57903cbd2825', volUUID=u'784514cb-2b33-431c-b193-045f23c596d8', options=None) (logUtils:51)
> 2017-07-03 09:50:46,741+0800 INFO  (periodic/1) [dispatcher] Run and protect: getVolumeSize(sdUUID=u'721b5233-b0ba-4722-8a7d-ba2a372190a0', spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d', imgUUID=u'bb35c163-f068-4f08-a1c2-28c4cb1b76d9', volUUID=u'fce7e0a0-7411-4d8c-b72c-2f46c4b4db1e', options=None) (logUtils:51)
> 2017-07-03 09:50:46,743+0800 INFO  (periodic/0) [dispatcher] Run and protect: getVolumeSize, Return response: {'truesize': '6361276416', 'apparentsize': '107374182400'} (logUtils:54)
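
The repoStats response in the excerpt above reports a `delay` and `lastCheck` value for each storage domain, and all of them look healthy at 09:50:43. A minimal sketch for checking those values across a whole log follows. The thresholds are hypothetical values picked for illustration, not oVirt defaults, and the parsing assumes the response is logged as a Python dict literal exactly as shown in the excerpt.

```python
import ast
import re

# Captures the dict that follows "Return response:" on a repoStats log line.
REPOSTATS_RE = re.compile(r"repoStats, Return response: (\{.*\}) \(logUtils:\d+\)")


def parse_repo_stats(line):
    """Return the repoStats dict from a vdsm.log line, or None if the line does not match."""
    match = REPOSTATS_RE.search(line)
    if match is None:
        return None
    # literal_eval parses the logged dict (including u'' strings) without executing code.
    return ast.literal_eval(match.group(1))


def slow_domains(stats, max_delay=0.1, max_last_check=30.0):
    """Return storage domain UUIDs that are invalid or exceed the (hypothetical) thresholds."""
    flagged = []
    for sd_uuid, info in stats.items():
        too_slow = float(info["delay"]) > max_delay
        too_stale = float(info["lastCheck"]) > max_last_check
        if not info["valid"] or too_slow or too_stale:
            flagged.append(sd_uuid)
    return sorted(flagged)


# One domain taken from the log excerpt above, shortened to a single entry.
sample = (
    "2017-07-03 09:50:43,548+0800 INFO  (jsonrpc/3) [dispatcher] Run and protect: "
    "repoStats, Return response: {u'e01186c1-7e44-4808-b551-4722f0f8e84b': "
    "{'code': 0, 'actual': True, 'version': 4, 'acquired': True, "
    "'delay': '0.000144822', 'lastCheck': '8.9', 'valid': True}} (logUtils:54)"
)

if __name__ == "__main__":
    stats = parse_repo_stats(sample)
    print("flagged domains:", slow_domains(stats))
```

Running this over the lines logged around 09:52:16 would show whether the host itself saw the storage domains degrade when the VM was paused.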
>
>
> ......
>
> ......
>
>
> *2017-07-03 09:52:16,941+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0 error eio (vm:4112)
> 2017-07-03 09:52:16,941+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError (vm:4997)
> 2017-07-03 09:52:16,942+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onSuspend (vm:4997)
> 2017-07-03 09:52:16,942+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0 error eio (vm:4112)
> 2017-07-03 09:52:16,943+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError (vm:4997)
> 2017-07-03 09:52:16,943+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0 error eio (vm:4112)
> 2017-07-03 09:52:16,944+0800 INFO  (libvirt/events) [virt.vm] (vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError*
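
The lines above show the VM being stopped with an `eio` error on device `scsi0-0-0-0`. To see how often and on which devices this happens, the log can be scanned with a short script. This is only a sketch: the regular expression is derived from the log lines quoted above and may need adjusting for other VDSM versions, and the usual log location (`/var/log/vdsm/vdsm.log`) is assumed.

```python
import re
from collections import Counter

# Pattern derived from the "abnormal vm stop" lines quoted in this thread.
ABNORMAL_STOP_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) .*"
    r"\(vmId='(?P<vm_id>[0-9a-f-]+)'\) "
    r"abnormal vm stop device (?P<device>\S+) error (?P<error>\S+)"
)


def find_abnormal_stops(lines):
    """Return a list of dicts (ts, vm_id, device, error) for each abnormal stop event."""
    events = []
    for line in lines:
        # Drop mail quoting and emphasis markers ("> *") if the log was pasted from an email.
        match = ABNORMAL_STOP_RE.match(line.lstrip("> *"))
        if match:
            events.append(match.groupdict())
    return events


def summarize(events):
    """Count events per (vm_id, device, error) combination."""
    return Counter((e["vm_id"], e["device"], e["error"]) for e in events)


# Two lines copied from the excerpt above; only the first is an abnormal stop.
sample_lines = [
    "> *2017-07-03 09:52:16,941+0800 INFO  (libvirt/events) [virt.vm] "
    "(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device "
    "scsi0-0-0-0 error eio (vm:4112)",
    "> 2017-07-03 09:52:16,941+0800 INFO  (libvirt/events) [virt.vm] "
    "(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError (vm:4997)",
]

if __name__ == "__main__":
    for key, count in summarize(find_abnormal_stops(sample_lines)).items():
        print(key, count)
```

To run it against a real host, replace `sample_lines` with the lines of the VDSM log file, for example `open("/var/log/vdsm/vdsm.log")`.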
>
>
>
>
>
>
> On Thursday, June 22, 2017, 2:48 PM, Yaniv Kaul <ykaul at redhat.com>
> wrote:
>
>
>
> On Thu, Jun 22, 2017 at 5:07 AM, M Mahboubian <m_mahboubian at yahoo.com>
> wrote:
>
> Dear all,
> I would appreciate it if anybody could help with the issue I am facing.
>
> In our environment we have 2 hosts, 1 NFS server, and 1 oVirt engine server.
> The NFS server provides storage to the VMs in the hosts.
>
> I can create new VMs and install the OS, but once I do something like yum
> update the VM freezes. I can reproduce this every single time I do yum
> update.
>
>
> Is it paused, or completely frozen?
>
>
>
> What information/log files should I provide to troubleshoot this?
>
>
> Versions of all the components involved - guest OS, host OS (qemu-kvm
> version), how do you run the VM (vdsm log would be helpful here), exact
> storage specification (1Gb or 10Gb link? What is the NFS version? What is
> it hosted on? etc.)
>  Y.
>
>
>  Regards
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>