On Fri, Nov 02, 2012 at 10:42:49AM +0100, Adam Tkáč wrote:
Hello all,
I'm experiencing a strange issue on one of the nodes. There is only one
VM on the node, but it generates _a lot_ of disk I/O.
Sometimes vdsm writes the following errors to the system log:
Nov 1 14:45:17 virtualizer respawn: slave '/usr/share/vdsm/vdsm'
died, respawning slave
Nov 1 14:45:20 virtualizer vdsm vds ERROR Unable to load the rest
server module. Please make sure it is installed.
Nov 1 14:45:20 virtualizer kernel: [105149.428697] ata1: hard resetting link
Nov 1 14:45:21 virtualizer kernel: [105149.758448] ata1: SATA link
down (SStatus 0 SControl 300)
Nov 1 14:45:21 virtualizer kernel: [105149.784996] ata1: EH complete
Nov 1 14:45:21 virtualizer vdsm vm.Vm WARNING
vmId=`6c6fc6f7-1b61-484b-9e44-0e9bb57ac1c9`::Unknown type found,
device: '{'device': u'unix', 'alias': u'channel0', 'type': u'channel',
'address': {u'bus': u'0', u'controller': u'0', u'type': u'virtio-serial',
u'port': u'1'}}' found
Nov 1 14:45:21 virtualizer kernel: [105150.159060] ata2: EH complete
Nov 1 14:45:21 virtualizer kernel: [105150.181694] ata3: hard resetting link
Nov 1 14:45:21 virtualizer kernel: [105150.510474] ata3: SATA link
down (SStatus 0 SControl 300)
<repeated for every ata* link...>
Nov 1 14:45:25 virtualizer vdsm Storage.LVM WARNING lvm vgs failed: 5
[] [' Volume group "752f4ef6-2e3a-4cec-aad9-848a8b8b9e80" not found']
Nov 1 14:55:41 virtualizer vdsm vds ERROR connection to libvirt
broken. taking vdsm down.
Nov 1 14:55:41 virtualizer vdsm root ERROR client ('192.168.30.140', 34435)
Nov 1 14:55:42 virtualizer vdsm vm.Vm ERROR
vmId=`6c6fc6f7-1b61-484b-9e44-0e9bb57ac1c9`::Stats function failed:
<AdvancedStatsFunction _sampleNet at 0x24fc1e0>
Nov 1 14:55:42 virtualizer vdsm Storage.LVM WARNING lvm vgs failed: 5
[] [' Volume group "035e2517-c12b-4c7f-b80e-695fec96757e" not found']
Nov 1 14:55:42 virtualizer vdsm Storage.LVM WARNING lvm vgs failed: 5
[] [' Volume group "035e2517-c12b-4c7f-b80e-695fec96757e" not found']
The "752f4ef6-2e3a-4cec-aad9-848a8b8b9e80" is LOCALFS storage and the VM
image is stored there.
After that, the VM becomes unresponsive and I have to kill the qemu-kvm
process manually. The VM's image is stored on an XFS filesystem, and after
the SATA reset it takes some time before XFS responds. During this time
the libvirtd daemon is unresponsive and vdsm times out.
Is it possible to configure vdsm not to reset the SATA link every time
it starts? And if not, is it possible to increase the timeout for how
long vdsm waits for a libvirtd response?
I do not have an answer to your questions yet, so I'll add questions of
my own: which versions of libvirt and vdsm do you use?
Recently, libvirt introduced a keepalive mechanism, which you can
control with the keepalive_interval and keepalive_count settings in
/etc/libvirt/libvirtd.conf.
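As a rough sketch only, a more tolerant keepalive setup in
/etc/libvirt/libvirtd.conf might look like the fragment below. The
values are illustrative assumptions, not tested recommendations, and I
have not confirmed that keepalive is what breaks the connection in your
case.

```
# /etc/libvirt/libvirtd.conf
# Illustrative values only (assumption): tune them to how long the
# storage stall actually lasts on your host.

# Seconds of inactivity before libvirtd sends a keepalive probe.
# Setting this to -1 disables keepalive probes from libvirtd.
keepalive_interval = 10

# Number of unanswered probes tolerated before libvirtd closes
# the connection.
keepalive_count = 10
```

If I read the libvirtd.conf comments correctly, a silent connection is
closed after roughly keepalive_interval * (keepalive_count + 1) seconds,
so these values would tolerate a stall of about 110 seconds. libvirtd
has to be restarted for the change to take effect.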
Dan