After an indirect path to getting here, I believe that the cause of my
problem is with sanlock. (See the thread 'Data Center Status on the GUI flips
between "Non Responsive" and "Contending"' for how I got here.)
My sanlock.log file is full of:
2014-01-26 21:36:19-0500 36474 [2905]: s4576 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:1:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-26 21:36:19-0500 36474 [26767]: 0322a407 aio collect 0
0x7f196c0008c0:0x7f196c0008d0:0x7f196c101000 result 0:0 match len 512
2014-01-26 21:36:19-0500 36474 [26767]: read_sectors delta_leader offset 0 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-26 21:36:20-0500 36475 [2905]: s4576 add_lockspace fail result -90
2014-01-26 21:36:26-0500 36481 [2906]: s4577 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:1:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-26 21:36:26-0500 36481 [26809]: 0322a407 aio collect 0
0x7f196c0008c0:0x7f196c0008d0:0x7f196c101000 result 0:0 match len 512
2014-01-26 21:36:26-0500 36481 [26809]: read_sectors delta_leader offset 0 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-26 21:36:27-0500 36482 [2906]: s4577 add_lockspace fail result -90
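The rv and result values in these lines look like negated Linux errno numbers. As a rough aid to reading them, the sketch below maps such values to errno names with Python's standard errno module. Treating sanlock's return values as negated errno codes is an assumption here, and decode_rv is a hypothetical helper, not part of sanlock. Values with no errno match are reported as unknown rather than guessed.

```python
import errno
import os


def decode_rv(rv: int) -> str:
    """Describe a negative return value as an errno name, if one matches.

    Assumption: the value is a negated errno number. Codes that do not
    match any errno on this platform are flagged instead of guessed.
    """
    code = -rv
    name = errno.errorcode.get(code)
    if name is None:
        return f"{rv}: no matching errno on this platform (possibly sanlock-specific)"
    return f"{rv}: {name} ({os.strerror(code)})"


if __name__ == "__main__":
    # Return values that appear in the log excerpts in this message.
    for rv in (-13, -19, -90, -103, -107, -223):
        print(decode_rv(rv))
```

Errno numbering is platform dependent, so this is best run on the affected Linux host. There, 90 is EMSGSIZE, which would fit a read that returned 0 bytes when 512 were expected ("result 0:0 match len 512").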
The only other thing in the log appears after rebooting, when it says:
2014-01-26 11:28:53-0500 29 [2897]: sanlock daemon started 2.8 host
09423427-2f70-4f68-9696-e41a93937ef0.office2a.l
2014-01-26 11:29:22-0500 57 [2905]: s1 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:1:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-26 11:29:22-0500 57 [3569]: 0322a407 aio collect 0
0x7f196c0008c0:0x7f196c0008d0:0x7f196c001000 result 0:0 match len 512
2014-01-26 11:29:22-0500 57 [3569]: read_sectors delta_leader offset 0 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-26 11:29:23-0500 58 [2905]: s1 add_lockspace fail result -90
2014-01-26 11:29:32-0500 67 [2905]: s2 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:1:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-26 11:29:32-0500 67 [3731]: 0322a407 aio collect 0
0x7f196c0008c0:0x7f196c0008d0:0x7f196c101000 result 0:0 match len 512
2014-01-26 11:29:32-0500 67 [3731]: read_sectors delta_leader offset 0 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-26 11:29:33-0500 68 [2905]: s2 add_lockspace fail result -90
2014-01-26 11:29:42-0500 77 [2905]: s3 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:1:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-26 11:29:42-0500 77 [3806]: 0322a407 aio collect 0
0x7f196c0008c0:0x7f196c0008d0:0x7f196c101000 result 0:0 match len 512
2014-01-26 11:29:42-0500 77 [3806]: read_sectors delta_leader offset 0 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-26 11:29:43-0500 78 [2905]: s3 add_lockspace fail result -90
Background:
Setup is 2 hosts on CentOS 6.5 with gluster storage, engine on a separate
CentOS 6.5 machine, all up to date.
Originally set up running:
Data Center: default
Cluster: default
Master storage domain: VM
The log had entries like:
2013-12-18 23:20:39-0500 4510 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -103:0 match res
2013-12-18 23:20:39-0500 4510 [3318]: s1 delta_renew read rv -103 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:39-0500 4510 [3318]: s1 renewal error -103 delta_length 9
last_success 4481
2013-12-18 23:20:40-0500 4510 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -107:0 match res
2013-12-18 23:20:40-0500 4510 [3318]: s1 delta_renew read rv -107 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:40-0500 4510 [3318]: s1 renewal error -107 delta_length 0
last_success 4481
2013-12-18 23:20:40-0500 4511 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -107:0 match res
2013-12-18 23:20:40-0500 4511 [3318]: s1 delta_renew read rv -107 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:40-0500 4511 [3318]: s1 renewal error -107 delta_length 0
last_success 4481
2013-12-18 23:20:41-0500 4511 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -107:0 match res
2013-12-18 23:20:41-0500 4511 [3318]: s1 delta_renew read rv -107 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:41-0500 4511 [3318]: s1 renewal error -107 delta_length 0
last_success 4481
Storage 'VM' had a split-brain. In trying to fix the split-brain, the file
dom_md/ids was erased. I never revived the Data Center 'default' after that.
The log is full of:
2014-01-16 17:23:21-0500 62902 [29262]: open error -13
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2014-01-16 17:23:21-0500 62902 [29262]: s3 open_disk
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids error
-13
2014-01-16 17:23:22-0500 62903 [3368]: s3 add_lockspace fail result -19
2014-01-16 17:23:31-0500 62912 [3368]: s4 lockspace
9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88:1:/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids:0
Jan. 16: Created new Data Center: mill; Cluster: one; Master storage
domain: 'VM2'. All was working.
Logs looked like this:
2013-12-18 23:20:39-0500 4510 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -103:0 match res
2013-12-18 23:20:39-0500 4510 [3318]: s1 delta_renew read rv -103 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:39-0500 4510 [3318]: s1 renewal error -103 delta_length 9
last_success 4481
2013-12-18 23:20:40-0500 4510 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -107:0 match res
2013-12-18 23:20:40-0500 4510 [3318]: s1 delta_renew read rv -107 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:40-0500 4510 [3318]: s1 renewal error -107 delta_length 0
last_success 4481
2013-12-18 23:20:40-0500 4511 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -107:0 match res
2013-12-18 23:20:40-0500 4511 [3318]: s1 delta_renew read rv -107 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:40-0500 4511 [3318]: s1 renewal error -107 delta_length 0
last_success 4481
2013-12-18 23:20:41-0500 4511 [3318]: 9b6e6bf1 aio collect 0
0x7fd3b40008c0:0x7fd3b40008d0:0x7fd3c59ae000 result -107:0 match res
2013-12-18 23:20:41-0500 4511 [3318]: s1 delta_renew read rv -107 offset 0
/rhev/data-center/mnt/10.41.65.4:_VM/9b6e6bf1-29a1-4abf-8e85-bf9c4a329c88/dom_md/ids
2013-12-18 23:20:41-0500 4511 [3318]: s1 renewal error -107 delta_length 0
last_success 4481
For a while, very little was logged at all, and things were working fine.
Then, on Jan 22, the logs went from:
2014-01-22 00:44:56-0500 521397 [30933]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv
0
2014-01-22 00:44:56-0500 521397 [30933]: leader4 sn rn ts 0 cs 0
2014-01-22 00:44:56-0500 521397 [30933]: s123 delta_renew verify_leader error -223
2014-01-22 00:44:56-0500 521397 [30933]: s123 renewal error -223 delta_length 0
last_success 521369
2014-01-22 00:44:57-0500 521398 [3362]: s123 kill 32599 sig 15 count 9
2014-01-22 00:44:57-0500 521398 [30933]: verify_leader 2 wrong magic 0
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-22 00:44:57-0500 521398 [30933]: leader1 delta_renew error -223 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
2014-01-22 00:44:57-0500 521398 [30933]: leader2 path
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
offset 0
2014-01-22 00:44:57-0500 521398 [30933]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv
0
to:
2014-01-22 00:45:21-0500 521422 [17199]: 0322a407 aio collect 0
0x7f87ac0008c0:0x7f87ac0008d0:0x7f87ac101000 result 0:0 match len 512
2014-01-22 00:45:21-0500 521422 [17199]: read_sectors delta_leader offset 512 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-22 00:45:22-0500 521423 [3368]: s125 add_lockspace fail result -90
2014-01-22 00:45:25-0500 521426 [3368]: s126 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-22 00:45:25-0500 521426 [17243]: 0322a407 aio collect 0
0x7f87ac0008c0:0x7f87ac0008d0:0x7f87ac101000 result 0:0 match len 512
2014-01-22 00:45:25-0500 521426 [17243]: read_sectors delta_leader offset 512 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-22 00:45:26-0500 521427 [3368]: s126 add_lockspace fail result -90
2014-01-22 00:45:36-0500 521437 [3368]: s127 lockspace
0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
and the Data Center has been down ever since.
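To pin down when the failure mode changed, the log can be summarized per day and per result code. The following is a minimal sketch that assumes every log line has the shape seen in the excerpts above (timestamp, a counter, a bracketed number, then the message). The function name and the regular expressions are illustrative, not part of sanlock.

```python
import re
from collections import Counter

# Line shape inferred from the excerpts: "<date> <time><tz> <counter> [<id>]: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[-+]\d{4}) "
    r"(?P<counter>\d+) \[(?P<ident>\d+)\]: (?P<msg>.*)$"
)
FAIL_RE = re.compile(r"add_lockspace fail result (-?\d+)")


def count_add_lockspace_failures(lines):
    """Count 'add_lockspace fail' messages, keyed by (date, result code)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # wrapped continuation lines or unrelated text
        fail = FAIL_RE.search(match.group("msg"))
        if fail:
            day = match.group("stamp")[:10]
            counts[(day, int(fail.group(1)))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2014-01-16 17:23:22-0500 62903 [3368]: s3 add_lockspace fail result -19",
        "2014-01-26 21:36:20-0500 36475 [2905]: s4576 add_lockspace fail result -90",
        "2014-01-26 21:36:27-0500 36482 [2906]: s4577 add_lockspace fail result -90",
    ]
    for (day, code), n in sorted(count_add_lockspace_failures(sample).items()):
        print(day, code, n)
```

Passing an open file object for the real sanlock.log instead of the sample list should work the same way, since the function only iterates over lines.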
Best I can figure is that something is supposed to be in dom_md/ids, but that
file is empty:
ls -l /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
total 1029
-rw-rw---- 1 vdsm kvm 0 Jan 22 00:44 ids
-rw-rw---- 1 vdsm kvm 0 Jan 16 18:50 inbox
-rw-rw---- 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm 491 Jan 21 18:20 metadata
-rw-rw---- 1 vdsm kvm 0 Jan 16 18:50 outbox
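The same check can be scripted so it can be repeated on each host and on each gluster brick. This is only a sketch: inspect_lease_file is a hypothetical helper, the 512-byte sector size is taken from the "match len 512" log lines, and the script does not know sanlock's on-disk format. It only distinguishes an empty file, a first sector of all zero bytes (which would fit the "wrong magic 0" message), and a first sector that holds some data.

```python
import os


def inspect_lease_file(path, sector_size=512):
    """Classify a lease file as 'empty', 'zeroed', or 'has data'.

    Only the first sector is examined; no sanlock structures are parsed.
    """
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as handle:
        first_sector = handle.read(sector_size)
    if not any(first_sector):  # every byte in the first sector is zero
        return "zeroed"
    return "has data"


if __name__ == "__main__":
    import sys

    # Usage: python inspect_lease.py <path to ids> [<path to leases> ...]
    for path in sys.argv[1:]:
        print(path, inspect_lease_file(path))
```

Running it against both dom_md/ids and dom_md/leases would show whether only ids was truncated.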
Any hints as to how to put whatever is needed into 'ids', or reinitialize the
sanlock system--or a better diagnosis and solution--gladly accepted.
Ted Miller
Elkhart, IN, USA