[Users] Host Non-Operational from sanlock and VM fails to migrate

I have a 2 node oVirt 3.3.2 cluster setup and am evaluating the setup for production use on our HPC system for managing our VM infrastructure. Currently I'm trying to utilize our DDR InfiniBand fabric for the storage domains in oVirt using NFS over RDMA. I've noticed some unstable behavior and it seems in every case to begin with sanlock.
The ovirt web admin interface shows the following message as first sign of trouble on 2014-Feb-03 07:45.
"Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error during communication with the Host.).".
The single VM I had running is stuck in the "Migrating From" state. virsh shows the VM paused on the crashed host and the one it attempted to migrate to.
Right now I have a few concerns.
1) The cause of the sanlock (or other instability) and if it's related to a bug or an issue using NFSoRDMA.
2) Why the VM failed to migrate if the second host had no issues. If the first host is down should the VM be considered offline and booted on the second host after first is fenced?
Attached are logs from the failed host (vm01) and the healthy host (vm02) as well as engine. The failed host's /var/log/message is also attached (vm01_message.log).
Thanks - Trey

On 02/03/2014 06:58 PM, Trey Dockendorf wrote:
I have a 2 node oVirt 3.3.2 cluster setup and am evaluating the setup for production use on our HPC system for managing our VM infrastructure. Currently I'm trying to utilize our DDR InfiniBand fabric for the storage domains in oVirt using NFS over RDMA. I've noticed some unstable behavior and it seems in every case to begin with sanlock.
The ovirt web admin interface shows the following message as first sign of trouble on 2014-Feb-03 07:45.
"Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error during communication with the Host.).".
The single VM I had running is stuck in the "Migrating From" state. virsh shows the VM paused on the crashed host and the one it attempted to migrate to.
Right now I have a few concerns.
1) The cause of the sanlock (or other instability) and if it's related to a bug or an issue using NFSoRDMA. 2) Why the VM failed to migrate if the second host had no issues. If the first host is down should the VM be considered offline and booted on the second host after first is fenced?
Attached are logs from the failed host (vm01) and the healthy host (vm02) as well as engine. The failed host's /var/log/message is also attached (vm01_message.log).
Thanks - Trey
was this resolved?

No, in fact I just had the issue arise again after trying to figure out what about my setup causes this crash. So far it only seems to occur if both nodes are running NFS over RDMA, but I'm unsure if it's VM traffic or the host being SPM that causes it to misbehave.

vm02 was running a single VM and was SPM. The crash was on vm02: "Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm02, Error: Network error during communication with the Host).". SPM successfully switched to vm01 but the VM is stuck in migration and unresponsive. Both engine and nodes are using oVirt 3.3.3.

vm01 and vm02 both have the following in vdsm.conf:

[addresses]
management_port = 54321

[vars]
ssl = true

[irs]
nfs_mount_options = rdma,port=20049

These are the oVirt NFS mount lines in /proc/mounts for each:

vm01:
192.168.211.245:/tank/ovirt/import_export /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/iso /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/data /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0

vm02:
192.168.211.245:/tank/ovirt/import_export /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/iso /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/data /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0

The NFS server had these 2 log entries in /var/log/messages around the time vm02 went Non-operational:

Feb 9 17:27:59 vmstore1 kernel: svcrdma: Error fast registering memory for xprt ffff882014683400
Feb 9 17:28:21 vmstore1 kernel: svcrdma: Error fast registering memory for xprt ffff882025bf1400

Attached is tar of the logs from vm01, vm02 and the engine server. vm01 & vm02 folders contain files from '/var/log/messages /var/log/sanlock.log /var/log/vdsm/*.log'; engine from '/var/log/messages /var/log/ovirt-engine/*.log'.

Thanks - Trey

On Sun, Feb 9, 2014 at 4:15 PM, Itamar Heim <iheim@redhat.com> wrote:
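A quick way to check that the nfs_mount_options above actually take effect on each host is to inspect /proc/mounts. The following is a minimal Python sketch (illustrative only, not part of oVirt or VDSM; it only assumes the /rhev/data-center/mnt prefix shown in the /proc/mounts output above) that lists the storage-domain NFS mounts and whether each one negotiated proto=rdma on port 20049:

#!/usr/bin/env python
# Illustrative check: list oVirt storage-domain NFS mounts from /proc/mounts
# and flag whether each one is using NFS over RDMA on port 20049.
MOUNT_PREFIX = "/rhev/data-center/mnt/"  # where VDSM mounts storage domains

def ovirt_nfs_mounts(path="/proc/mounts"):
    with open(path) as mounts:
        for line in mounts:
            source, target, fstype, options = line.split()[:4]
            if fstype.startswith("nfs") and target.startswith(MOUNT_PREFIX):
                yield source, target, options.split(",")

if __name__ == "__main__":
    for source, target, opts in ovirt_nfs_mounts():
        rdma = "proto=rdma" in opts and "port=20049" in opts
        print("%-45s rdma=%s" % (source, rdma))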
On 02/03/2014 06:58 PM, Trey Dockendorf wrote:
I have a 2 node oVirt 3.3.2 cluster setup and am evaluating the setup for production use on our HPC system for managing our VM infrastructure. Currently I'm trying to utilize our DDR InfiniBand fabric for the storage domains in oVirt using NFS over RDMA. I've noticed some unstable behavior and it seems in every case to begin with sanlock.
The ovirt web admin interface shows the following message as first sign of trouble on 2014-Feb-03 07:45.
"Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error during communication with the Host.).".
The single VM I had running is stuck in the "Migrating From" state. virsh shows the VM paused on the crashed host and the one it attempted to migrate to.
Right now I have a few concerns.
1) The cause of the sanlock (or other instability) and if it's related to a bug or an issue using NFSoRDMA. 2) Why the VM failed to migrate if the second host had no issues. If the first host is down should the VM be considered offline and booted on the second host after first is fenced?
Attached are logs from the failed host (vm01) and the healthy host (vm02) as well as engine. The failed host's /var/log/message is also attached (vm01_message.log).
Thanks - Trey
was this resolved?

----- Original Message -----
From: "Trey Dockendorf" <treydock@gmail.com> To: "Itamar Heim" <iheim@redhat.com> Cc: "users" <users@ovirt.org> Sent: Monday, February 10, 2014 3:03:05 AM Subject: Re: [Users] Host Non-Operational from sanlock and VM fails to migrate
No, in fact I just had the issue arise again after trying to figure out what about my setup causes this crash. So far it only seems to occur if both nodes are running NFS over RDMA, but I'm unsure if it's VM traffic or the host being SPM that causes it to misbehave.
vm02 was running a single VM and was SPM. The crash was on vm02: "Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm02, Error: Network error during communication with the Host).". SPM successfully switched to vm01 but the VM is stuck in migration and unresponsive. Both engine and nodes are using oVirt 3.3.3.
vm01 and vm02 both have the following in vdsm.conf
[addresses]
management_port = 54321

[vars]
ssl = true

[irs]
nfs_mount_options = rdma,port=20049
These are the oVirt NFS mount lines in /proc/mounts for each:
vm01:
192.168.211.245:/tank/ovirt/import_export /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/iso /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/data /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
vm02:
192.168.211.245:/tank/ovirt/import_export /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/iso /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/data /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
The NFS server had these 2 log entries in /var/log/messages around the time vm02 went Non-operational.
Feb 9 17:27:59 vmstore1 kernel: svcrdma: Error fast registering memory for xprt ffff882014683400
Feb 9 17:28:21 vmstore1 kernel: svcrdma: Error fast registering memory for xprt ffff882025bf1400
This looks like the root cause - failure on the storage server.

This leads to failure in the hosts connected to this storage:

Feb 2 13:37:11 vm01 kernel: rpcrdma: connection to 192.168.211.245:20049 closed (-103)
...
Feb 3 07:44:31 vm01 kernel: ------------[ cut here ]------------
Feb 3 07:44:31 vm01 kernel: WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0() (Not tainted)
Feb 3 07:44:31 vm01 kernel: Hardware name: H8DMT-IBX
Feb 3 07:44:31 vm01 kernel: Modules linked in: ebt_arp xprtrdma nfs fscache auth_rpcgss nfs_acl bonding ebtable_nat ebtables softdog lockd sunrpc powernow_k8 freq_table mperf 8021q garp bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_round_robin dm_multipath vhost_net macvtap macvlan tun kvm_amd kvm microcode serio_raw k10temp amd64_edac_mod edac_core edac_mce_amd igb dca i2c_algo_bit ptp pps_core mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core sg i2c_nforce2 i2c_core ext4 jbd2 mbcache raid1 sd_mod crc_t10dif sata_nv ata_generic pata_acpi pata_amd dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Feb 3 07:44:31 vm01 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-431.3.1.el6.x86_64 #1
Feb 3 07:44:31 vm01 kernel: Call Trace:
Feb 3 07:44:31 vm01 kernel: <IRQ> [<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
Feb 3 07:44:31 vm01 kernel: [<ffffffff81071e7a>] ? warn_slowpath_null+0x1a/0x20
Feb 3 07:44:31 vm01 kernel: [<ffffffff8107a3ed>] ? local_bh_enable_ip+0x7d/0xb0
Feb 3 07:44:31 vm01 kernel: [<ffffffff8152a4fb>] ? _spin_unlock_bh+0x1b/0x20
Feb 3 07:44:31 vm01 kernel: [<ffffffffa044c4f0>] ? rpc_wake_up_status+0x70/0x80 [sunrpc]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa044579c>] ? xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa02782fc>] ? rpcrdma_conn_func+0x9c/0xb0 [xprtrdma]
Feb 3 07:44:31 vm01 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
Feb 3 07:44:31 vm01 kernel: [<ffffffffa027b450>] ? rpcrdma_qp_async_error_upcall+0x40/0x80 [xprtrdma]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa019a1cb>] ? mlx4_ib_qp_event+0x8b/0x100 [mlx4_ib]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa0138c54>] ? mlx4_qp_event+0x74/0xf0 [mlx4_core]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa01bd396>] ? igb_poll+0xb66/0x1020 [igb]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa0126057>] ? mlx4_eq_int+0x557/0xcb0 [mlx4_core]
Feb 3 07:44:31 vm01 kernel: [<ffffffff8109a9a0>] ? posix_timer_fn+0x0/0xe0
Feb 3 07:44:31 vm01 kernel: [<ffffffff8109a982>] ? posix_timer_event+0x42/0x60
Feb 3 07:44:31 vm01 kernel: [<ffffffff810a7159>] ? ktime_get+0x69/0xf0
Feb 3 07:44:31 vm01 kernel: [<ffffffffa01267c4>] ? mlx4_msi_x_interrupt+0x14/0x20 [mlx4_core]
Feb 3 07:44:31 vm01 kernel: [<ffffffff810e6ed0>] ? handle_IRQ_event+0x60/0x170
Feb 3 07:44:31 vm01 kernel: [<ffffffff810e982e>] ? handle_edge_irq+0xde/0x180
Feb 3 07:44:31 vm01 kernel: [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0
Feb 3 07:44:31 vm01 kernel: [<ffffffff81530fec>] ? do_IRQ+0x6c/0xf0
Feb 3 07:44:31 vm01 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
Feb 3 07:44:31 vm01 kernel: <EOI> [<ffffffff8103eacb>] ? native_safe_halt+0xb/0x10
Feb 3 07:44:31 vm01 kernel: [<ffffffff810167bd>] ? default_idle+0x4d/0xb0
Feb 3 07:44:31 vm01 kernel: [<ffffffff810168bd>] ? c1e_idle+0x9d/0x120
Feb 3 07:44:31 vm01 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
Feb 3 07:44:31 vm01 kernel: [<ffffffff81520e2c>] ? start_secondary+0x2ac/0x2ef
Feb 3 07:44:31 vm01 kernel: ---[ end trace 9d97672873a67a1f ]---

This looks like an error in the kernel. You should consult kernel/rpcrdma folks with this error.

Which causes sanlock to fail to update the lease on the storage (expected):

Feb 3 07:44:58 vm01 sanlock[2536]: 2014-02-03 07:44:58-0600 240689 [2536]: s1 check_our_lease failed 80

Then sanlock tries to kill vdsm, the owner of the lease:

Feb 3 07:44:58 vm01 sanlock[2536]: 2014-02-03 07:44:58-0600 240689 [2536]: s1 kill 3030 sig 15 count 1
...
Feb 3 07:45:06 vm01 sanlock[2536]: 2014-02-03 07:45:06-0600 240698 [2536]: s1 kill 3030 sig 15 count 10
Feb 3 07:45:07 vm01 sanlock[2536]: 2014-02-03 07:45:07-0600 240698 [2536]: dead 3030 ci 3 count 10

This makes the host Non-Responsive (expected).

Now vdsm is restarted, which will make it responsive again:

Feb 3 07:45:07 vm01 respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died, respawning slave

But since there is no access to storage, the host is Non Operational (expected).

The vm was starting a migration to the other host:

Thread-26::DEBUG::2014-02-03 07:49:18,067::BindingXMLRPC::965::vds::(wrapper) client [192.168.202.99]::call vmMigrate with ({'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'},) {} flowID [7829ae2a]
Thread-26::DEBUG::2014-02-03 07:49:18,067::API::463::vds::(migrate) {'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'}
Thread-26::DEBUG::2014-02-03 07:49:18,068::BindingXMLRPC::972::vds::(wrapper) return vmMigrate with {'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0}

The migration was almost complete after 20 seconds:

Thread-29::INFO::2014-02-03 07:49:38,329::vm::815::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration Progress: 20 seconds elapsed, 99% of data processed, 99% of mem processed

But it never completed:

Thread-29::WARNING::2014-02-03 07:54:38,383::vm::792::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration is stuck: Hasn't progressed in 300.054134846 seconds. Aborting.

CCing Michal to inspect why the migration has failed.
Attached is tar of the logs from vm01, vm02 and the engine server.
vm01 & vm02 folders contain files from '/var/log/messages /var/log/sanlock.log /var/log/vdsm/*.log' engine from '/var/log/messages /var/log/ovirt-engine/*.log'
Thanks - Trey
On Sun, Feb 9, 2014 at 4:15 PM, Itamar Heim <iheim@redhat.com> wrote:
On 02/03/2014 06:58 PM, Trey Dockendorf wrote:
I have a 2 node oVirt 3.3.2 cluster setup and am evaluating the setup for production use on our HPC system for managing our VM infrastructure. Currently I'm trying to utilize our DDR InfiniBand fabric for the storage domains in oVirt using NFS over RDMA. I've noticed some unstable behavior and it seems in every case to begin with sanlock.
The ovirt web admin interface shows the following message as first sign of trouble on 2014-Feb-03 07:45.
"Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error during communication with the Host.).".
The single VM I had running is stuck in the "Migrating From" state. virsh shows the VM paused on the crashed host and the one it attempted to migrate to.
Right now I have a few concerns.
1) The cause of the sanlock (or other instability) and if it's related to a bug or an issue using NFSoRDMA.
vdsm and sanlock seem to behave as they should when storage is not accessible.
2) Why the VM failed to migrate if the second host had no issues. If
Virt team will have to answer this.
the first host is down should the VM be considered offline and booted on the second host after first is fenced?
The host was not fenced, and was not down. It was up and the vm was still running, possibly accessing the storage.
Attached are logs from the failed host (vm01) and the healthy host (vm02) as well as engine. The failed host's /var/log/message is also attached (vm01_message.log).
Thanks - Trey
was this resolved?
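For readers hitting the same failure mode, the sequence described above (lease renewal fails, sanlock escalates SIGTERM against the lease owner, then declares it dead) can be pulled out of the logs with a small illustrative Python sketch; it assumes the sanlock message format quoted in this thread and the default /var/log/sanlock.log location, and is not part of sanlock or oVirt:

#!/usr/bin/env python
# Illustrative sketch: print a timeline of the sanlock events discussed above
# ("check_our_lease failed", "kill <pid> sig <n> count <n>", "dead <pid> ...").
# The message format is assumed to match the lines quoted in this thread.
import re
import sys

EVENT = re.compile(
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\S*).*?"
    r"(?P<event>check_our_lease failed \d+"
    r"|kill \d+ sig \d+ count \d+"
    r"|dead \d+ ci \d+ count \d+)"
)

def timeline(path):
    with open(path) as log:
        for line in log:
            match = EVENT.search(line)
            if match:
                yield match.group("stamp"), match.group("event")

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/sanlock.log"
    for stamp, event in timeline(path):
        print("%s  %s" % (stamp, event))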

----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com> To: "Trey Dockendorf" <treydock@gmail.com> Cc: "users" <users@ovirt.org>, "Michal Skrivanek" <mskrivan@redhat.com> Sent: Wednesday, February 12, 2014 10:04:04 AM Subject: Re: [Users] Host Non-Operational from sanlock and VM fails to migrate
[...]
The vm was starting a migration to the other host:
Thread-26::DEBUG::2014-02-03 07:49:18,067::BindingXMLRPC::965::vds::(wrapper) client [192.168.202.99]::call vmMigrate with ({'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'},) {} flowID [7829ae2a]
Thread-26::DEBUG::2014-02-03 07:49:18,067::API::463::vds::(migrate) {'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'}
Thread-26::DEBUG::2014-02-03 07:49:18,068::BindingXMLRPC::972::vds::(wrapper) return vmMigrate with {'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0}
The migration was almost complete after 20 seconds:
Thread-29::INFO::2014-02-03 07:49:38,329::vm::815::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration Progress: 20 seconds elapsed, 99% of data processed, 99% of mem processed
But it never completed:
Thread-29::WARNING::2014-02-03 07:54:38,383::vm::792::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration is stuck: Hasn't progressed in 300.054134846 seconds. Aborting.
CCing Michal to inspect why the migration has failed.
Hi,

I had a look at the logs, and this looks like another libvirt/QEMU I/O-related issue.

If QEMU on the source host cannot reliably access storage, migration may get stuck; this seems to be the case given the information provided. VDSM detected that the migration was not progressing and aborted it.

libvirt has an option (which we already use) to detect those scenarios, the VIR_MIGRATE_ABORT_ON_ERROR flag, but unfortunately this is not 100% reliable yet, for reasons outlined below in this mail. We are aware of this issue and are actively working to improve the handling of such scenarios, but most of this work is actually on the QEMU side.

The core issue here is that when we use NFS (or iSCSI) and there is an I/O error, QEMU can get blocked inside the kernel, waiting for the faulty I/O operation to complete, and thus fail to report an I/O error. It really depends on which specific operation fails, and there are many possible cases and error scenarios. Of course, if QEMU is blocked and fails to report the I/O error, libvirt can do nothing to report or recover from the error, so VDSM can do even less. This is known and acknowledged by both the libvirt and QEMU developers.

But there is some good news, because newer versions of QEMU have improvements in this area: QEMU recently gained native block device drivers[1], which, among other things, will make it more robust in the presence of I/O errors and should improve error reporting as well. RHEL7 should have a version of QEMU with native iSCSI; hopefully NFS will follow soon enough.

HTH,

[1] for example, iSCSI, recently merged: http://comments.gmane.org/gmane.comp.emulators.qemu/92599 - work on NFS is ongoing.

--
Francesco Romani
Red Hat Engineering Virtualization R&D
IRC: fromani
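To make the "no progress for 300 seconds, aborting" behaviour above concrete, here is a simplified Python sketch of a migration progress watchdog. It illustrates the idea only and is not VDSM's actual monitor; get_progress and abort are hypothetical callables the caller would supply (for example, thin wrappers around libvirt's job-info and abort-job calls):

import time

def watch_migration(get_progress, abort, stall_timeout=300, poll_interval=10):
    """Abort a migration whose progress value stops changing for too long.

    get_progress() -> number (e.g. percent of data processed); abort() cancels
    the migration. Both are hypothetical hooks supplied by the caller.
    """
    last_progress = None
    last_change = time.time()
    while True:
        progress = get_progress()
        if progress >= 100:
            return True                       # migration completed
        if progress != last_progress:
            last_progress = progress          # progress moved; reset the clock
            last_change = time.time()
        elif time.time() - last_change > stall_timeout:
            abort()                           # stuck: no progress within the timeout
            return False
        time.sleep(poll_interval)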

Thanks for the thorough response!
This looks like an error in the kernel. You should consult kernel/rpcrdma folks with this error.
Mind pointing me in the direction of how to get in contact with the appropriate mailing list to begin a dialog with the kernel/rpcrdma folks?

Thanks - Trey

On Wed, Feb 12, 2014 at 3:04 AM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
From: "Trey Dockendorf" <treydock@gmail.com> To: "Itamar Heim" <iheim@redhat.com> Cc: "users" <users@ovirt.org> Sent: Monday, February 10, 2014 3:03:05 AM Subject: Re: [Users] Host Non-Operational from sanlock and VM fails to migrate
No, in fact I just had the issue arise again after trying to figure out what about my setup causes this crash. So far it only seems to occur if both nodes are running NFS over RDMA, but I'm unsure if it's VM traffic or the host being SPM that causes it to misbehave.
vm02 was running a single VM and was SPM. The crash was on vm02: "Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm02, Error: Network error during communication with the Host).". SPM successfully switched to vm01 but the VM is stuck in migration and unresponsive. Both engine and nodes are using oVirt 3.3.3.
vm01 and vm02 both have the following in vdsm.conf
[addresses]
management_port = 54321

[vars]
ssl = true

[irs]
nfs_mount_options = rdma,port=20049
These are the oVirt NFS mount lines in /proc/mounts for each:
vm01:
192.168.211.245:/tank/ovirt/import_export /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/iso /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/data /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
vm02:
192.168.211.245:/tank/ovirt/import_export /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/iso /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
192.168.211.245:/tank/ovirt/data /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245 0 0
The NFS server had these 2 log entries in /var/log/messages around the time vm02 went Non-operational.
Feb 9 17:27:59 vmstore1 kernel: svcrdma: Error fast registering memory for xprt ffff882014683400
Feb 9 17:28:21 vmstore1 kernel: svcrdma: Error fast registering memory for xprt ffff882025bf1400
This looks like the root cause - failure on the storage server
This leads to failure in the hosts connected to this storage:
Feb 2 13:37:11 vm01 kernel: rpcrdma: connection to 192.168.211.245:20049 closed (-103)
...
Feb 3 07:44:31 vm01 kernel: ------------[ cut here ]------------
Feb 3 07:44:31 vm01 kernel: WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0() (Not tainted)
Feb 3 07:44:31 vm01 kernel: Hardware name: H8DMT-IBX
Feb 3 07:44:31 vm01 kernel: Modules linked in: ebt_arp xprtrdma nfs fscache auth_rpcgss nfs_acl bonding ebtable_nat ebtables softdog lockd sunrpc powernow_k8 freq_table mperf 8021q garp bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_round_robin dm_multipath vhost_net macvtap macvlan tun kvm_amd kvm microcode serio_raw k10temp amd64_edac_mod edac_core edac_mce_amd igb dca i2c_algo_bit ptp pps_core mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core sg i2c_nforce2 i2c_core ext4 jbd2 mbcache raid1 sd_mod crc_t10dif sata_nv ata_generic pata_acpi pata_amd dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Feb 3 07:44:31 vm01 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-431.3.1.el6.x86_64 #1
Feb 3 07:44:31 vm01 kernel: Call Trace:
Feb 3 07:44:31 vm01 kernel: <IRQ> [<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
Feb 3 07:44:31 vm01 kernel: [<ffffffff81071e7a>] ? warn_slowpath_null+0x1a/0x20
Feb 3 07:44:31 vm01 kernel: [<ffffffff8107a3ed>] ? local_bh_enable_ip+0x7d/0xb0
Feb 3 07:44:31 vm01 kernel: [<ffffffff8152a4fb>] ? _spin_unlock_bh+0x1b/0x20
Feb 3 07:44:31 vm01 kernel: [<ffffffffa044c4f0>] ? rpc_wake_up_status+0x70/0x80 [sunrpc]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa044579c>] ? xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa02782fc>] ? rpcrdma_conn_func+0x9c/0xb0 [xprtrdma]
Feb 3 07:44:31 vm01 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
Feb 3 07:44:31 vm01 kernel: [<ffffffffa027b450>] ? rpcrdma_qp_async_error_upcall+0x40/0x80 [xprtrdma]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa019a1cb>] ? mlx4_ib_qp_event+0x8b/0x100 [mlx4_ib]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa0138c54>] ? mlx4_qp_event+0x74/0xf0 [mlx4_core]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa01bd396>] ? igb_poll+0xb66/0x1020 [igb]
Feb 3 07:44:31 vm01 kernel: [<ffffffffa0126057>] ? mlx4_eq_int+0x557/0xcb0 [mlx4_core]
Feb 3 07:44:31 vm01 kernel: [<ffffffff8109a9a0>] ? posix_timer_fn+0x0/0xe0
Feb 3 07:44:31 vm01 kernel: [<ffffffff8109a982>] ? posix_timer_event+0x42/0x60
Feb 3 07:44:31 vm01 kernel: [<ffffffff810a7159>] ? ktime_get+0x69/0xf0
Feb 3 07:44:31 vm01 kernel: [<ffffffffa01267c4>] ? mlx4_msi_x_interrupt+0x14/0x20 [mlx4_core]
Feb 3 07:44:31 vm01 kernel: [<ffffffff810e6ed0>] ? handle_IRQ_event+0x60/0x170
Feb 3 07:44:31 vm01 kernel: [<ffffffff810e982e>] ? handle_edge_irq+0xde/0x180
Feb 3 07:44:31 vm01 kernel: [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0
Feb 3 07:44:31 vm01 kernel: [<ffffffff81530fec>] ? do_IRQ+0x6c/0xf0
Feb 3 07:44:31 vm01 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
Feb 3 07:44:31 vm01 kernel: <EOI> [<ffffffff8103eacb>] ? native_safe_halt+0xb/0x10
Feb 3 07:44:31 vm01 kernel: [<ffffffff810167bd>] ? default_idle+0x4d/0xb0
Feb 3 07:44:31 vm01 kernel: [<ffffffff810168bd>] ? c1e_idle+0x9d/0x120
Feb 3 07:44:31 vm01 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
Feb 3 07:44:31 vm01 kernel: [<ffffffff81520e2c>] ? start_secondary+0x2ac/0x2ef
Feb 3 07:44:31 vm01 kernel: ---[ end trace 9d97672873a67a1f ]---
This looks like an error in the kernel. You should consult kernel/rpcrdma folks with this error.
Which causes sanlock to fail to update the lease on the storage (expected):
Feb 3 07:44:58 vm01 sanlock[2536]: 2014-02-03 07:44:58-0600 240689 [2536]: s1 check_our_lease failed 80
Then sanlock tries to kill vdsm, the owner of the lease:
Feb 3 07:44:58 vm01 sanlock[2536]: 2014-02-03 07:44:58-0600 240689 [2536]: s1 kill 3030 sig 15 count 1
...
Feb 3 07:45:06 vm01 sanlock[2536]: 2014-02-03 07:45:06-0600 240698 [2536]: s1 kill 3030 sig 15 count 10
Feb 3 07:45:07 vm01 sanlock[2536]: 2014-02-03 07:45:07-0600 240698 [2536]: dead 3030 ci 3 count 10
This makes the host Non-Responsive (expected).
Now vdsm is restarted, which will make it responsive again:
Feb 3 07:45:07 vm01 respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died, respawning slave
But since there is no access to storage, the host is Non Operational (expected).
The vm was starting a migration to the other host:
Thread-26::DEBUG::2014-02-03 07:49:18,067::BindingXMLRPC::965::vds::(wrapper) client [192.168.202.99]::call vmMigrate with ({'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'},) {} flowID [7829ae2a]
Thread-26::DEBUG::2014-02-03 07:49:18,067::API::463::vds::(migrate) {'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'}
Thread-26::DEBUG::2014-02-03 07:49:18,068::BindingXMLRPC::972::vds::(wrapper) return vmMigrate with {'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0}
The migration was almost complete after 20 seconds:
Thread-29::INFO::2014-02-03 07:49:38,329::vm::815::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration Progress: 20 seconds elapsed, 99% of data processed, 99% of mem processed
But it never completed:
Thread-29::WARNING::2014-02-03 07:54:38,383::vm::792::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration is stuck: Hasn't progressed in 300.054134846 seconds. Aborting.
CCing Michal to inspect why the migration has failed.
Attached is tar of the logs from vm01, vm02 and the engine server.
vm01 & vm02 folders contain files from '/var/log/messages /var/log/sanlock.log /var/log/vdsm/*.log' engine from '/var/log/messages /var/log/ovirt-engine/*.log'
Thanks - Trey
On Sun, Feb 9, 2014 at 4:15 PM, Itamar Heim <iheim@redhat.com> wrote:
On 02/03/2014 06:58 PM, Trey Dockendorf wrote:
I have a 2 node oVirt 3.3.2 cluster setup and am evaluating the setup for production use on our HPC system for managing our VM infrastructure. Currently I'm trying to utilize our DDR InfiniBand fabric for the storage domains in oVirt using NFS over RDMA. I've noticed some unstable behavior and it seems in every case to begin with sanlock.
The ovirt web admin interface shows the following message as first sign of trouble on 2014-Feb-03 07:45.
"Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error during communication with the Host.).".
The single VM I had running is stuck in the "Migrating From" state. virsh shows the VM paused on the crashed host and the one it attempted to migrate to.
Right now I have a few concerns.
1) The cause of the sanlock (or other instability) and if it's related to a bug or an issue using NFSoRDMA.
vdsm and sanlock seem to behave as they should when storage is not accessible.
2) Why the VM failed to migrate if the second host had no issues. If
Virt team will have to answer this.
the first host is down should the VM be considered offline and booted on the second host after first is fenced?
The host was not fenced, and was not down. It was up and the vm was still running, possibly accessing the storage.
Attached are logs from the failed host (vm01) and the healthy host (vm02) as well as engine. The failed host's /var/log/message is also attached (vm01_message.log).
Thanks - Trey
was this resolved?

----- Original Message -----
From: "Trey Dockendorf" <treydock@gmail.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org>, "Michal Skrivanek" <mskrivan@redhat.com>, "Ayal Baron" <abaron@redhat.com>, "Itamar Heim" <iheim@redhat.com> Sent: Wednesday, February 19, 2014 2:20:52 AM Subject: Re: [Users] Host Non-Operational from sanlock and VM fails to migrate
Thanks for the thorough response!
This looks like an error in the kernel. You should consult kernel/rpcrdma folks with this error.
Mind pointing me in the direction of how to get in contact with the appropriate mailing list to begin a dialog with the kernel/rpcrdma folks?
Probably via the kernel mailing list, IRC channel, or bug-tracking system of your distribution.

Nir
participants (4)
- Francesco Romani
- Itamar Heim
- Nir Soffer
- Trey Dockendorf