[Users] Host Non-Operational from sanlock and VM fails to migrate

Trey Dockendorf treydock at gmail.com
Wed Feb 19 00:20:52 UTC 2014


Thanks for the thorough response!

> This looks like an error in the kernel. You should consult kernel/rpcrdma folks with this error.

Mind pointing me to the appropriate mailing list so I can start a
dialog with the kernel/rpcrdma folks?

Thanks
- Trey

On Wed, Feb 12, 2014 at 3:04 AM, Nir Soffer <nsoffer at redhat.com> wrote:
> ----- Original Message -----
>> From: "Trey Dockendorf" <treydock at gmail.com>
>> To: "Itamar Heim" <iheim at redhat.com>
>> Cc: "users" <users at ovirt.org>
>> Sent: Monday, February 10, 2014 3:03:05 AM
>> Subject: Re: [Users] Host Non-Operational from sanlock and VM fails to migrate
>>
>> No, in fact I just had the issue arise again after trying to figure
>> out what about my setup causes this crash.  So far it only seems to
>> occur if both nodes are running NFS over RDMA, but I'm unsure if it's
>> VM traffic or the host being SPM that causes it to misbehave.
>>
>> vm02 was running a single VM and was SPM.  The crash was on vm02:
>> "Invalid status on Data Center Default.  Setting Data Center status to
>> Non Responsive (On host vm02, Error: Network error during
>> communication with the Host).".  SPM successfully switched to vm01, but
>> the VM is stuck in migration and unresponsive.  Both the engine and the
>> nodes are using oVirt 3.3.3.
>>
>> vm01 and vm02 both have the following in vdsm.conf
>>
>> [addresses]
>> management_port = 54321
>>
>> [vars]
>> ssl = true
>>
>>
>> [irs]
>> nfs_mount_options = rdma,port=20049
>>
>> These are the oVirt NFS mount lines from /proc/mounts on each host:
>>
>> vm01:
>>
>> 192.168.211.245:/tank/ovirt/import_export
>> /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs
>> rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245
>> 0 0
>> 192.168.211.245:/tank/ovirt/iso
>> /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs
>> rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245
>> 0 0
>> 192.168.211.245:/tank/ovirt/data
>> /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs
>> rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245
>> 0 0
>>
>> vm02:
>>
>> 192.168.211.245:/tank/ovirt/import_export
>> /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_import__export nfs
>> rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245
>> 0 0
>> 192.168.211.245:/tank/ovirt/iso
>> /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_iso nfs
>> rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245
>> 0 0
>> 192.168.211.245:/tank/ovirt/data
>> /rhev/data-center/mnt/192.168.211.245:_tank_ovirt_data nfs
>> rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,mountaddr=192.168.211.245,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.211.245
>> 0 0
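[A quick way to double-check mounts like the ones above. The snippet below is my own sketch, not anything shipped with oVirt; it just flags any oVirt storage-domain NFS mount whose options did not end up with proto=rdma, i.e. where the mount silently fell back to TCP.]

```python
# Flag oVirt storage-domain NFS mounts that did not negotiate RDMA.
def check_mounts(lines):
    """Return (status, mountpoint) for each oVirt storage-domain NFS mount.

    Each line is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    results = []
    for line in lines:
        fields = line.split()
        if (len(fields) >= 4 and fields[2] == "nfs"
                and fields[1].startswith("/rhev/data-center/mnt/")):
            opts = fields[3].split(",")
            status = "rdma" if "proto=rdma" in opts else "NOT-rdma"
            results.append((status, fields[1]))
    return results

if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for status, mountpoint in check_mounts(f):
            print(status, mountpoint)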
>>
>> The NFS server had these 2 log entries in /var/log/messages around the
>> time vm02 went Non-operational.
>>
>> Feb  9 17:27:59 vmstore1 kernel: svcrdma: Error fast registering
>> memory for xprt ffff882014683400
>> Feb  9 17:28:21 vmstore1 kernel: svcrdma: Error fast registering
>> memory for xprt ffff882025bf1400
>
> This looks like the root cause: a failure on the storage server.
>
> This leads to failures on the hosts connected to this storage:
>
> Feb  2 13:37:11 vm01 kernel: rpcrdma: connection to 192.168.211.245:20049 closed (-103)
> ...
> Feb  3 07:44:31 vm01 kernel: ------------[ cut here ]------------
> Feb  3 07:44:31 vm01 kernel: WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0() (Not tainted)
> Feb  3 07:44:31 vm01 kernel: Hardware name: H8DMT-IBX
> Feb  3 07:44:31 vm01 kernel: Modules linked in: ebt_arp xprtrdma nfs fscache auth_rpcgss nfs_acl bonding ebtable_nat ebtables softdog lockd sunrpc powernow_k8 freq_table mperf
> 8021q garp bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_
> conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_round_robin dm_multipath vhost_net macvtap macvlan tun kvm_a
> md kvm microcode serio_raw k10temp amd64_edac_mod edac_core edac_mce_amd igb dca i2c_algo_bit ptp pps_core mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core sg i2c_nforce2 i2c_cor
> e ext4 jbd2 mbcache raid1 sd_mod crc_t10dif sata_nv ata_generic pata_acpi pata_amd dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
> Feb  3 07:44:31 vm01 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-431.3.1.el6.x86_64 #1
> Feb  3 07:44:31 vm01 kernel: Call Trace:
> Feb  3 07:44:31 vm01 kernel: <IRQ>  [<ffffffff81071e27>] ? warn_slowpath_common+0x87/0xc0
> Feb  3 07:44:31 vm01 kernel: [<ffffffff81071e7a>] ? warn_slowpath_null+0x1a/0x20
> Feb  3 07:44:31 vm01 kernel: [<ffffffff8107a3ed>] ? local_bh_enable_ip+0x7d/0xb0
> Feb  3 07:44:31 vm01 kernel: [<ffffffff8152a4fb>] ? _spin_unlock_bh+0x1b/0x20
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa044c4f0>] ? rpc_wake_up_status+0x70/0x80 [sunrpc]
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa044579c>] ? xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa02782fc>] ? rpcrdma_conn_func+0x9c/0xb0 [xprtrdma]
> Feb  3 07:44:31 vm01 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa027b450>] ? rpcrdma_qp_async_error_upcall+0x40/0x80 [xprtrdma]
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa019a1cb>] ? mlx4_ib_qp_event+0x8b/0x100 [mlx4_ib]
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa0138c54>] ? mlx4_qp_event+0x74/0xf0 [mlx4_core]
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa01bd396>] ? igb_poll+0xb66/0x1020 [igb]
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa0126057>] ? mlx4_eq_int+0x557/0xcb0 [mlx4_core]
> Feb  3 07:44:31 vm01 kernel: [<ffffffff8109a9a0>] ? posix_timer_fn+0x0/0xe0
> Feb  3 07:44:31 vm01 kernel: [<ffffffff8109a982>] ? posix_timer_event+0x42/0x60
> Feb  3 07:44:31 vm01 kernel: [<ffffffff810a7159>] ? ktime_get+0x69/0xf0
> Feb  3 07:44:31 vm01 kernel: [<ffffffffa01267c4>] ? mlx4_msi_x_interrupt+0x14/0x20 [mlx4_core]
> Feb  3 07:44:31 vm01 kernel: [<ffffffff810e6ed0>] ? handle_IRQ_event+0x60/0x170
> Feb  3 07:44:31 vm01 kernel: [<ffffffff810e982e>] ? handle_edge_irq+0xde/0x180
> Feb  3 07:44:31 vm01 kernel: [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0
> Feb  3 07:44:31 vm01 kernel: [<ffffffff81530fec>] ? do_IRQ+0x6c/0xf0
> Feb  3 07:44:31 vm01 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
> Feb  3 07:44:31 vm01 kernel: <EOI>  [<ffffffff8103eacb>] ? native_safe_halt+0xb/0x10
> Feb  3 07:44:31 vm01 kernel: [<ffffffff810167bd>] ? default_idle+0x4d/0xb0
> Feb  3 07:44:31 vm01 kernel: [<ffffffff810168bd>] ? c1e_idle+0x9d/0x120
> Feb  3 07:44:31 vm01 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
> Feb  3 07:44:31 vm01 kernel: [<ffffffff81520e2c>] ? start_secondary+0x2ac/0x2ef
> Feb  3 07:44:31 vm01 kernel: ---[ end trace 9d97672873a67a1f ]---
>
> This looks like an error in the kernel. You should consult kernel/rpcrdma folks with this error.
>
> This causes sanlock to fail to update the lease on the storage (expected):
>
> Feb  3 07:44:58 vm01 sanlock[2536]: 2014-02-03 07:44:58-0600 240689 [2536]: s1 check_our_lease failed 80
>
> Then sanlock tries to kill vdsm, the owner of the lease:
>
> Feb  3 07:44:58 vm01 sanlock[2536]: 2014-02-03 07:44:58-0600 240689 [2536]: s1 kill 3030 sig 15 count 1
> ...
> Feb  3 07:45:06 vm01 sanlock[2536]: 2014-02-03 07:45:06-0600 240698 [2536]: s1 kill 3030 sig 15 count 10
> Feb  3 07:45:07 vm01 sanlock[2536]: 2014-02-03 07:45:07-0600 240698 [2536]: dead 3030 ci 3 count 10
>
> This makes the host Non-Responsive (expected).
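[The escalation visible in those sanlock lines can be sketched roughly as below. This is my own simplification of the behaviour shown in the log, not sanlock's actual code; pid 3030 is vdsm, the lease owner.]

```python
import os
import signal
import time

def escalate(pid, max_sigterm=10, interval=1.0):
    """Send SIGTERM to the lease owner once per interval, mirroring the
    's1 kill 3030 sig 15 count N' log lines; stop early once the process
    is gone (the 'dead 3030' line). Returns the number of attempts made."""
    for count in range(1, max_sigterm + 1):
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            return count  # process already exited -> declared dead
        time.sleep(interval)
    return max_sigterm  # budget exhausted; real sanlock escalates further
```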
>
> Now vdsm is restarted, which will make the host Responsive again:
>
> Feb  3 07:45:07 vm01 respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died, respawning slave
>
> But since there is still no access to storage, the host goes Non Operational (expected).
>
> The VM was starting a migration to the other host:
>
> Thread-26::DEBUG::2014-02-03 07:49:18,067::BindingXMLRPC::965::vds::(wrapper) client [192.168.202.99]::call vmMigrate with ({'tunneled': 'false', 'dstqemu': '192.168.202.103',
> 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'},) {} flowID [7829ae2a]
> Thread-26::DEBUG::2014-02-03 07:49:18,067::API::463::vds::(migrate) {'tunneled': 'false', 'dstqemu': '192.168.202.103', 'src': 'vm01.brazos.tamu.edu', 'dst': 'vm02.brazos.tamu.
> edu:54321', 'vmId': '741f9811-db68-4dc4-a88a-7cb9be576e57', 'method': 'online'}
> Thread-26::DEBUG::2014-02-03 07:49:18,068::BindingXMLRPC::972::vds::(wrapper) return vmMigrate with {'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0}
>
> The migration was almost complete after 20 seconds:
>
> Thread-29::INFO::2014-02-03 07:49:38,329::vm::815::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration Progress: 20 seconds elapsed, 99% of data processed, 99% of mem processed
>
> But it never completed:
>
> Thread-29::WARNING::2014-02-03 07:54:38,383::vm::792::vm.Vm::(run) vmId=`741f9811-db68-4dc4-a88a-7cb9be576e57`::Migration is stuck: Hasn't progressed in 300.054134846 seconds. Aborting.
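[That abort decision reads like a progress-timeout check. A rough sketch of the idea follows; it is my own simplification, not vdsm's actual implementation, with the 300 s default matching the message above.]

```python
import time

class MigrationMonitor:
    """Abort a migration if the data remaining has not shrunk within
    progress_timeout seconds (cf. "Hasn't progressed in 300... Aborting")."""

    def __init__(self, progress_timeout=300):
        self.progress_timeout = progress_timeout
        self.lowest_remaining = None
        self.last_progress_ts = time.monotonic()

    def check(self, data_remaining):
        """Return True to keep migrating, False to abort as stuck."""
        now = time.monotonic()
        if self.lowest_remaining is None or data_remaining < self.lowest_remaining:
            # New low-water mark: the migration made real progress.
            self.lowest_remaining = data_remaining
            self.last_progress_ts = now
        return (now - self.last_progress_ts) < self.progress_timeout
```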
>
> CCing Michal to inspect why the migration has failed.
>
>>
>> Attached is tar of the logs from vm01, vm02 and the engine server.
>>
>> vm01 & vm02 folders contain files from '/var/log/messages
>> /var/log/sanlock.log /var/log/vdsm/*.log'
>> engine from '/var/log/messages /var/log/ovirt-engine/*.log'
>>
>> Thanks
>> - Trey
>>
>> On Sun, Feb 9, 2014 at 4:15 PM, Itamar Heim <iheim at redhat.com> wrote:
>> > On 02/03/2014 06:58 PM, Trey Dockendorf wrote:
>> >>
>> >> I have a 2 node oVirt 3.3.2 cluster setup and am evaluating the setup
>> >> for production use on our HPC system for managing our VM
>> >> infrastructure.  Currently I'm trying to utilize our DDR InfiniBand
>> >> fabric for the storage domains in oVirt using NFS over RDMA.  I've
>> >> noticed some unstable behavior and it seems in every case to begin
>> >> with sanlock.
>> >>
>> >> The ovirt web admin interface shows the following message as first
>> >> sign of trouble on 2014-Feb-03 07:45.
>> >>
>> >> "Invalid status on Data Center Default. Setting Data Center status to
>> >> Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error
>> >> during communication with the Host.).".
>> >>
>> >> The single VM I had running is stuck in the "Migrating From" state.
>> >> virsh shows the VM paused on the crashed host and the one it attempted
>> >> to migrate to.
>> >>
>> >> Right now I have a few concerns.
>> >>
>> >> 1) The cause of the sanlock failure (or other instability), and whether
>> >> it's related to a bug or to an issue with NFSoRDMA.
>
> vdsm and sanlock seem to behave as they should when storage is not accessible.
>
>> >> 2) Why the VM failed to migrate if the second host had no issues.  If
>
> Virt team will have to answer this.
>
>> >> the first host is down, should the VM be considered offline and booted
>> >> on the second host after the first is fenced?
>
> The host was not fenced, and was not down. It was up, and the VM was still
> running, possibly accessing the storage.
>
>> >>
>> >> Attached are logs from the failed host (vm01) and the healthy host
>> >> (vm02) as well as engine.  The failed host's /var/log/message is also
>> >> attached (vm01_message.log).
>> >>
>> >> Thanks
>> >> - Trey
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Users mailing list
>> >> Users at ovirt.org
>> >> http://lists.ovirt.org/mailman/listinfo/users
>> >>
>> >
>> > was this resolved?


