Hi Jairo,
Can you please open a bug for this at [1]?
Could you also attach the sanlock, vdsm, and engine logs to the bug?
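For reference, the default log locations (assuming a stock EL6 install; adjust if anything was relocated) and a quick way to bundle them for attaching:

    # sanlock:  /var/log/sanlock.log            (on each node)
    # vdsm:     /var/log/vdsm/vdsm.log          (on each node)
    # engine:   /var/log/ovirt-engine/engine.log  (on the engine host)
    tar czf ovirt-bug-logs.tar.gz \
        /var/log/sanlock.log \
        /var/log/vdsm/vdsm.log* \
        /var/log/ovirt-engine/engine.log*

If possible, the output of "sanlock client status" and "gluster volume status" captured while the renewals are failing would also help.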
[1]
https://bugzilla.redhat.com/enter_bug.cgi?product=oVirt
Thanks,
Maor
On 06/07/2014 11:22 PM, Jairo Rizzo wrote:
Hello,
I have a small two-node cluster running GlusterFS in replication mode:
CentOS v6.5
kernel-2.6.32-431.17.1.el6.x86_64
vdsm-4.14.6-0.el6.x86_64
ovirt-engine-3.4.0-1.el6.noarch (on 1 node)
I had been running ovirt-engine 3.3 without problems for months. Two days
ago I upgraded to the latest 3.3.x release, after which the nodes could not
join the cluster due to a version mismatch, essentially this issue:
https://www.mail-archive.com/users@ovirt.org/msg17241.html . While trying
to fix that, I ended up upgrading to 3.4, which created a new and more
challenging problem. Every couple of hours I get error messages like this:
Jun 7 13:40:01 hv1 sanlock[2341]: 2014-06-07 13:40:01-0400 19647 [2341]: s3 check_our_lease warning 70 last_success 19577
Jun 7 13:40:02 hv1 sanlock[2341]: 2014-06-07 13:40:02-0400 19648 [2341]: s3 check_our_lease warning 71 last_success 19577
Jun 7 13:40:03 hv1 sanlock[2341]: 2014-06-07 13:40:03-0400 19649 [2341]: s3 check_our_lease warning 72 last_success 19577
Jun 7 13:40:04 hv1 sanlock[2341]: 2014-06-07 13:40:04-0400 19650 [2341]: s3 check_our_lease warning 73 last_success 19577
Jun 7 13:40:05 hv1 sanlock[2341]: 2014-06-07 13:40:05-0400 19651 [2341]: s3 check_our_lease warning 74 last_success 19577
Jun 7 13:40:06 hv1 sanlock[2341]: 2014-06-07 13:40:06-0400 19652 [2341]: s3 check_our_lease warning 75 last_success 19577
Jun 7 13:40:07 hv1 sanlock[2341]: 2014-06-07 13:40:07-0400 19653 [2341]: s3 check_our_lease warning 76 last_success 19577
Jun 7 13:40:08 hv1 sanlock[2341]: 2014-06-07 13:40:08-0400 19654 [2341]: s3 check_our_lease warning 77 last_success 19577
Jun 7 13:40:09 hv1 wdmd[2330]: test warning now 19654 ping 19644 close 0 renewal 19577 expire 19657 client 2341 sanlock_1e8615b0-7876-4a03-bdb0-352087fad0f3:1
Jun 7 13:40:09 hv1 wdmd[2330]: /dev/watchdog closed unclean
Jun 7 13:40:09 hv1 kernel: SoftDog: Unexpected close, not stopping watchdog!
Jun 7 13:40:09 hv1 sanlock[2341]: 2014-06-07 13:40:09-0400 19655 [2341]: s3 check_our_lease warning 78 last_success 19577
Jun 7 13:40:10 hv1 wdmd[2330]: test warning now 19655 ping 19644 close 19654 renewal 19577 expire 19657 client 2341 sanlock_1e8615b0-7876-4a03-bdb0-352087fad0f3:1
Jun 7 13:40:10 hv1 sanlock[2341]: 2014-06-07 13:40:10-0400 19656 [2341]: s3 check_our_lease warning 79 last_success 19577
Jun 7 13:40:11 hv1 wdmd[2330]: test warning now 19656 ping 19644 close 19654 renewal 19577 expire 19657 client 2341 sanlock_1e8615b0-7876-4a03-bdb0-352087fad0f3:1
Jun 7 13:40:11 hv1 sanlock[2341]: 2014-06-07 13:40:11-0400 19657 [2341]: s3 check_our_lease failed 80
Jun 7 13:40:11 hv1 sanlock[2341]: 2014-06-07 13:40:11-0400 19657 [2341]: s3 all pids clear
Jun 7 13:40:11 hv1 wdmd[2330]: /dev/watchdog reopen
Jun 7 13:41:32 hv1 sanlock[2341]: 2014-06-07 13:41:32-0400 19738 [5050]: s3 delta_renew write error -202
Jun 7 13:41:32 hv1 sanlock[2341]: 2014-06-07 13:41:32-0400 19738 [5050]: s3 renewal error -202 delta_length 140 last_success 19577
Jun 7 13:41:42 hv1 sanlock[2341]: 2014-06-07 13:41:42-0400 19748 [5050]: 1e8615b0 close_task_aio 0 0x7fd3040008c0 busy
Jun 7 13:41:52 hv1 sanlock[2341]: 2014-06-07 13:41:52-0400 19758 [5050]: 1e8615b0 close_task_aio 0 0x7fd3040008c0 busy
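(If I'm reading the timestamps right, the lease expires exactly 80 seconds
after the last successful renewal: last_success 19577 + 80 = expire 19657,
with the warnings starting at 70 seconds. As far as I can tell that matches
sanlock's default expiry with its 10-second I/O timeout, and the delta_renew
write error -202 looks like the renewal write to the storage domain timing
out.)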
When this happens, one of the nodes can no longer see the storage and all
of its VMs go into pause mode or stop. I was wondering if you could provide
some advice. Thank you
--Rizzo