On Mon, Jul 3, 2017 at 6:49 AM, M Mahboubian <m_mahboubian(a)yahoo.com> wrote:
Hi Yaniv,
Thanks for your reply. Apologies for the late response; we had a long holiday
here.
To answer you:
Yes, the guest VM becomes completely frozen and unresponsive as soon as
its disk has any activity, for example when we shut down or run a yum update.
Versions of all the components involved - guest OS, host OS (qemu-kvm
version), how do you run the VM (vdsm log would be helpful here), exact
storage specification (1Gb or 10Gb link? What is the NFS version? What is
it hosted on? etc.)
Y.
Some facts about our environment:
1) Previously, this environment ran Xen with raw disks, and we changed
it to oVirt (oVirt was able to read the VMs' disks without any
conversion).
Interesting - what interface are they using?
Is that raw or raw sparse? How did you perform the conversion? (or no
conversion - just copied the disks over?)
2) The issue we are facing does not happen for any of the existing
VMs.
*3) This issue only happens for new VMs.*
New VMs from blank, or from a template (as a snapshot over the previous
VMs) ?
4) Guest (kernel v3.10) and host (kernel v4.1) OSes are both CentOS 7
minimal installations.
Kernel 4.1? From where?
*5) NFS version 4* and using oVirt 4.1
6) *The network speed is 1 Gb/s.*
That might be very slow (but should not cause such an issue, unless
severely overloaded).
7) The output for rpm -qa | grep qemu-kvm shows:
*qemu-kvm-common-ev-2.6.0-28.el7_3.6.1.x86_64*
*qemu-kvm-tools-ev-2.6.0-28.el7_3.6.1.x86_64*
*qemu-kvm-ev-2.6.0-28.el7_3.6.1.x86_64*
That's good - that's almost the latest-greatest.
8) The storage is on a SAN device which is connected to the NFS
server using Fibre Channel.
For example, it also froze during shutdown, and the events section showed
something like this:
*VM ILMU_WEB has been paused due to storage I/O problem.*
We might need to get libvirt debug logs (and perhaps journal output of the
host).
Y.
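On the raw versus raw sparse question above: one way to tell is to compare a file's apparent size with the space actually allocated for it. Below is a minimal Python sketch, assuming a Linux host. The function names are made up for illustration and are not part of oVirt or vdsm. The example numbers are the 'truesize' and 'apparentsize' values from the getVolumeSize response in the vdsm log further down.

```python
import os


def stat_sizes(path):
    """Return (apparent_size, allocated_size) in bytes for a file.

    On Linux, st_blocks is counted in 512-byte units regardless of the
    filesystem block size.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def allocation_ratio(apparent_size, allocated_size):
    """Fraction of the apparent size that is actually allocated on disk."""
    if apparent_size == 0:
        return 0.0
    return allocated_size / apparent_size


def looks_sparse(apparent_size, allocated_size):
    """A file whose allocated space is below its apparent size is likely sparse."""
    return allocated_size < apparent_size


if __name__ == "__main__":
    # Values taken from the getVolumeSize response in the vdsm log.
    apparent = 107374182400   # 'apparentsize'
    allocated = 6361276416    # 'truesize'
    print("ratio: %.3f" % allocation_ratio(apparent, allocated))
    print("sparse:", looks_sparse(apparent, allocated))
```

With those two values only about 6% of the 100 GiB volume is allocated, which would suggest a sparse image. Running stat_sizes() against the image file on the NFS export should give the same kind of answer for any other disk.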
More information:
VDSM log at the time of this issue (The issue happened at Jul 3, 2017
9:50:43 AM):
2017-07-03 09:50:37,113+0800 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call
Host.getAllVmStats succeeded in 0.00 seconds (__init__:515)
2017-07-03 09:50:37,897+0800 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call
Host.getAllVmIoTunePolicies succeeded in 0.02 seconds (__init__:515)
2017-07-03 09:50:42,510+0800 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call
Host.getAllVmStats succeeded in 0.00 seconds (__init__:515)
*2017-07-03 09:50:43,548+0800 INFO (jsonrpc/3) [dispatcher] Run and protect:
repoStats(options=None) (logUtils:51)
2017-07-03 09:50:43,548+0800 INFO (jsonrpc/3) [dispatcher] Run and protect: repoStats,
Return response: {u'e01186c1-7e44-4808-b551-4722f0f8e84b': {'code': 0,
'actual': True, 'version': 4, 'acquired': True, 'delay':
'0.000144822', 'lastCheck': '8.9', 'valid': True},
u'721b5233-b0ba-4722-8a7d-ba2a372190a0': {'code': 0, 'actual':
True, 'version': 4, 'acquired': True, 'delay':
'0.000327909', 'lastCheck': '8.9', 'valid': True},
u'94775bd3-3244-45b4-8a06-37eff8856afa': {'code': 0, 'actual':
True, 'version': 4, 'acquired': True, 'delay':
'0.000256425', 'lastCheck': '8.9', 'valid': True},
u'731bb771-5b73-4b5c-ac46-56499df97721': {'code': 0, 'actual':
True, 'version': 0, 'acquired': True, 'delay':
'0.000238159', 'lastCheck': '8.9', 'valid': True},
u'f620781f-93d4-4410-8697-eb41045cacd6': {'code': 0, 'actual':
True, 'version': 4, 'acquired': True, 'delay':
'0.00022004', 'lastCheck': '8.9', 'valid': True},
u'a1a7d0a4-e3b6-4bd5-862b-96e70dae3f29': {'code': 0, 'actual':
True, 'version': 0, 'acquired': True, 'delay':
'0.000298581', 'lastCheck': '8.8', 'valid': True}}
(logUtils:54)*
2017-07-03 09:50:43,563+0800 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call
Host.getStats succeeded in 0.01 seconds (__init__:515)
2017-07-03 09:50:46,737+0800 INFO (periodic/3) [dispatcher] Run and protect:
getVolumeSize(sdUUID=u'721b5233-b0ba-4722-8a7d-ba2a372190a0',
spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d',
imgUUID=u'3c26476e-1dae-44d7-9208-531b91ae5ae1',
volUUID=u'a7e789fb-6646-4d0a-9b51-f5ab8242c8d5', options=None) (logUtils:51)
2017-07-03 09:50:46,738+0800 INFO (periodic/0) [dispatcher] Run and protect:
getVolumeSize(sdUUID=u'f620781f-93d4-4410-8697-eb41045cacd6',
spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d',
imgUUID=u'2158fdae-54e1-413d-a844-73da5d1bb4ca',
volUUID=u'6ee0b0eb-0bba-4e18-9c00-c1539b632e8a', options=None) (logUtils:51)
2017-07-03 09:50:46,740+0800 INFO (periodic/2) [dispatcher] Run and protect:
getVolumeSize(sdUUID=u'f620781f-93d4-4410-8697-eb41045cacd6',
spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d',
imgUUID=u'a967016d-a56b-41e8-b7a2-57903cbd2825',
volUUID=u'784514cb-2b33-431c-b193-045f23c596d8', options=None) (logUtils:51)
2017-07-03 09:50:46,741+0800 INFO (periodic/1) [dispatcher] Run and protect:
getVolumeSize(sdUUID=u'721b5233-b0ba-4722-8a7d-ba2a372190a0',
spUUID=u'b04ca6e4-2660-4eaa-acdb-c1dae4e21f2d',
imgUUID=u'bb35c163-f068-4f08-a1c2-28c4cb1b76d9',
volUUID=u'fce7e0a0-7411-4d8c-b72c-2f46c4b4db1e', options=None) (logUtils:51)
2017-07-03 09:50:46,743+0800 INFO (periodic/0) [dispatcher] Run and protect:
getVolumeSize, Return response: {'truesize': '6361276416',
'apparentsize': '107374182400'} (logUtils:54)
......
......
*2017-07-03 09:52:16,941+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0
error eio (vm:4112)
2017-07-03 09:52:16,941+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError (vm:4997)
2017-07-03 09:52:16,942+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onSuspend (vm:4997)
2017-07-03 09:52:16,942+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0
error eio (vm:4112)
2017-07-03 09:52:16,943+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError (vm:4997)
2017-07-03 09:52:16,943+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0
error eio (vm:4112)
2017-07-03 09:52:16,944+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') CPU stopped: onIOError*
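The "abnormal vm stop ... error eio" entries above are the ones that line up with the "paused due to storage I/O problem" event. As a rough sketch, assuming the log format shown in this excerpt, something like the following Python could pull those entries out of a longer vdsm.log so they can be matched against NFS server or journal timestamps. The function names and the regular expression are illustrative only, not part of vdsm.

```python
import re
from collections import Counter

# Matches a vdsm "abnormal vm stop" entry after wrapped lines have been
# joined with single spaces (the pattern is based on the excerpt above).
STOP_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"INFO \(libvirt/events\) \[virt\.vm\] "
    r"\(vmId='(?P<vm>[0-9a-f-]+)'\) "
    r"abnormal vm stop device (?P<dev>\S+) error (?P<err>\w+)"
)


def find_io_stops(log_text):
    """Return one dict per 'abnormal vm stop' entry found in log_text."""
    # Collapse all whitespace so entries wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in STOP_RE.finditer(flat)]


def count_by_device(stops):
    """Count stops per (vm id, device, error) triple."""
    return Counter((s["vm"], s["dev"], s["err"]) for s in stops)


if __name__ == "__main__":
    sample = """2017-07-03 09:52:16,941+0800 INFO (libvirt/events) [virt.vm]
(vmId='c84f519e-398d-40a3-85b2-b7e53f3d7f67') abnormal vm stop device scsi0-0-0-0
error eio (vm:4112)"""
    for stop in find_io_stops(sample):
        print(stop["ts"], stop["vm"], stop["dev"], stop["err"])
```

Collapsing whitespace first is a deliberate choice: it lets the same pattern work both on the original one-line-per-entry vdsm.log and on a copy that has been re-wrapped in an email.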
On Thursday, June 22, 2017, 2:48 PM, Yaniv Kaul <ykaul(a)redhat.com>
wrote:
On Thu, Jun 22, 2017 at 5:07 AM, M Mahboubian <m_mahboubian(a)yahoo.com>
wrote:
Dear all,
I would appreciate it if anybody could help with the issue I am facing.
In our environment we have 2 hosts, 1 NFS server and 1 oVirt engine server.
The NFS server provides storage to the VMs in the hosts.
I can create new VMs and install the OS, but once I do something like yum
update, the VM freezes. I can reproduce this every single time I run yum
update.
Is it paused, or completely frozen?
What information/log files should I provide to troubleshoot this?
Versions of all the components involved - guest OS, host OS (qemu-kvm
version), how do you run the VM (vdsm log would be helpful here), exact
storage specification (1Gb or 10Gb link? What is the NFS version? What is
it hosted on? etc.)
Y.
Regards
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users