[ovirt-users] Sanlock issues after upgrading to 3.4

Jairo Rizzo jairo.rizzo at gmail.com
Sat Jun 7 16:22:46 EDT 2014


Hello,

I have a small 2-node cluster running GlusterFS in replicated mode:

CentOS v6.5
kernel-2.6.32-431.17.1.el6.x86_64
vdsm-4.14.6-0.el6.x86_64
ovirt-engine-3.4.0-1.el6.noarch  (on 1 node)

I had been running ovirt-engine 3.3 without problems for months. Two days
ago I upgraded to the latest 3.3.x release and could no longer join the
nodes to the cluster because of a version mismatch, as described here:
https://www.mail-archive.com/users@ovirt.org/msg17241.html . While trying
to correct that problem I ended up upgrading to 3.4, which created a new and
challenging problem for me. Every couple of hours I get error messages like
this:

Jun  7 13:40:01 hv1 sanlock[2341]: 2014-06-07 13:40:01-0400 19647 [2341]:
s3 check_our_lease warning 70 last_success 19577
Jun  7 13:40:02 hv1 sanlock[2341]: 2014-06-07 13:40:02-0400 19648 [2341]:
s3 check_our_lease warning 71 last_success 19577
Jun  7 13:40:03 hv1 sanlock[2341]: 2014-06-07 13:40:03-0400 19649 [2341]:
s3 check_our_lease warning 72 last_success 19577
Jun  7 13:40:04 hv1 sanlock[2341]: 2014-06-07 13:40:04-0400 19650 [2341]:
s3 check_our_lease warning 73 last_success 19577
Jun  7 13:40:05 hv1 sanlock[2341]: 2014-06-07 13:40:05-0400 19651 [2341]:
s3 check_our_lease warning 74 last_success 19577
Jun  7 13:40:06 hv1 sanlock[2341]: 2014-06-07 13:40:06-0400 19652 [2341]:
s3 check_our_lease warning 75 last_success 19577
Jun  7 13:40:07 hv1 sanlock[2341]: 2014-06-07 13:40:07-0400 19653 [2341]:
s3 check_our_lease warning 76 last_success 19577
Jun  7 13:40:08 hv1 sanlock[2341]: 2014-06-07 13:40:08-0400 19654 [2341]:
s3 check_our_lease warning 77 last_success 19577
Jun  7 13:40:09 hv1 wdmd[2330]: test warning now 19654 ping 19644 close 0
renewal 19577 expire 19657 client 2341
sanlock_1e8615b0-7876-4a03-bdb0-352087fad0f3:1
Jun  7 13:40:09 hv1 wdmd[2330]: /dev/watchdog closed unclean
Jun  7 13:40:09 hv1 kernel: SoftDog: Unexpected close, not stopping
watchdog!
Jun  7 13:40:09 hv1 sanlock[2341]: 2014-06-07 13:40:09-0400 19655 [2341]:
s3 check_our_lease warning 78 last_success 19577
Jun  7 13:40:10 hv1 wdmd[2330]: test warning now 19655 ping 19644 close
19654 renewal 19577 expire 19657 client 2341
sanlock_1e8615b0-7876-4a03-bdb0-352087fad0f3:1
Jun  7 13:40:10 hv1 sanlock[2341]: 2014-06-07 13:40:10-0400 19656 [2341]:
s3 check_our_lease warning 79 last_success 19577
Jun  7 13:40:11 hv1 wdmd[2330]: test warning now 19656 ping 19644 close
19654 renewal 19577 expire 19657 client 2341
sanlock_1e8615b0-7876-4a03-bdb0-352087fad0f3:1
Jun  7 13:40:11 hv1 sanlock[2341]: 2014-06-07 13:40:11-0400 19657 [2341]:
s3 check_our_lease failed 80
Jun  7 13:40:11 hv1 sanlock[2341]: 2014-06-07 13:40:11-0400 19657 [2341]:
s3 all pids clear

Jun  7 13:40:11 hv1 wdmd[2330]: /dev/watchdog reopen
Jun  7 13:41:32 hv1 sanlock[2341]: 2014-06-07 13:41:32-0400 19738 [5050]:
s3 delta_renew write error -202
Jun  7 13:41:32 hv1 sanlock[2341]: 2014-06-07 13:41:32-0400 19738 [5050]:
s3 renewal error -202 delta_length 140 last_success 19577
Jun  7 13:41:42 hv1 sanlock[2341]: 2014-06-07 13:41:42-0400 19748 [5050]:
1e8615b0 close_task_aio 0 0x7fd3040008c0 busy
Jun  7 13:41:52 hv1 sanlock[2341]: 2014-06-07 13:41:52-0400 19758 [5050]:
1e8615b0 close_task_aio 0 0x7fd3040008c0 busy
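
Reading the log above, the number after "warning" looks like the seconds
elapsed since last_success (19647 - 19577 = 70), and the lease is declared
failed once that reaches 80. A minimal Python sketch for pulling those values
out of the syslog, assuming the wrapped lines have first been joined back
into one line per message (the function name and the regular expression are
my own, not part of sanlock):

```python
import re

# Matches the sanlock "check_our_lease" messages shown in the log above.
# Assumption: each syslog message has been re-joined onto a single line.
LEASE_RE = re.compile(
    r"(?P<now>\d+) \[\d+\]: (?P<space>s\d+) check_our_lease "
    r"(?P<state>warning|failed) (?P<age>\d+)"
    r"(?: last_success (?P<last>\d+))?"
)


def parse_lease_events(lines):
    """Return one dict per check_our_lease message found in `lines`."""
    events = []
    for line in lines:
        m = LEASE_RE.search(line)
        if not m:
            continue  # not a check_our_lease message
        last = m.group("last")
        events.append({
            "now": int(m.group("now")),      # sanlock's monotonic timestamp
            "space": m.group("space"),       # lockspace label, e.g. "s3"
            "state": m.group("state"),       # "warning" or "failed"
            "age": int(m.group("age")),      # seconds since last renewal
            "last_success": int(last) if last is not None else None,
        })
    return events


sample = [
    "Jun  7 13:40:01 hv1 sanlock[2341]: 2014-06-07 13:40:01-0400 19647 "
    "[2341]: s3 check_our_lease warning 70 last_success 19577",
    "Jun  7 13:40:11 hv1 sanlock[2341]: 2014-06-07 13:40:11-0400 19657 "
    "[2341]: s3 check_our_lease failed 80",
]

for ev in parse_lease_events(sample):
    print(ev["space"], ev["state"], ev["age"])
```

Running this over a joined copy of /var/log/messages should make it easier to
see how often the renewal stalls and for how long.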

This leaves one of the nodes unable to see the storage, and all of its VMs
are paused or stopped. Could you provide some advice?
Thank you

--Rizzo