[ovirt-users] sanlock ids file broken after server crash (Fixed)

Johan Bernhardsson johan at kafit.se
Tue Aug 1 20:27:45 UTC 2017


I see two possible causes. One: it got there when we moved the servers
from our lab desk to the hosting site. We had some problems getting it
running.

Two: a couple of weeks ago two servers rebooted after high load. That
might have damaged the file.

I did manage to move all servers off that storage, then removed it,
cleaned it, and added it back as new storage.

Not what I wanted, but it solved the problem.

/Johan

On Sun, 2017-07-30 at 16:24 +0300, Maor Lipchuk wrote:
> Hi David,
> 
> I'm not sure how it got to that character in the first place.
> Nir, is there a safe way to fix that while there are running VMs?
> 
> Regards,
> Maor
> 
> On Sun, Jul 30, 2017 at 11:58 AM, Johan Bernhardsson <johan at kafit.se>
> wrote:
> > 
> > (First reply did not get to the list)
> > 
> > From sanlock.log:
> > 
> > 2017-07-30 10:49:31+0200 1766275 [1171]: s310751 lockspace
> > 0924ff77-
> > ef51-435b-b90d-50bfbf2e8de7:1:/rhev/data-
> > center/mnt/glusterSD/vbgsan02:_fs02/0924ff77-ef51-435b-b90d-
> > 50bfbf2e8de7/dom_md/ids:0
> > 2017-07-30 10:49:31+0200 1766275 [10496]: verify_leader 1 wrong
> > space
> > name 0924ff77-ef51-435b-b90d-50bfbf2e<D5>ke7 0924ff77-ef51-435b-
> > b90d-
> > 50bfbf2e8de7 /rhev/data-
> > center/mnt/glusterSD/vbgsan02:_fs02/0924ff77-
> > ef51-435b-b90d-50bfbf2e8de7/dom_md/ids
> > 2017-07-30 10:49:31+0200 1766275 [10496]: leader1
> > delta_acquire_begin
> > error -226 lockspace 0924ff77-ef51-435b-b90d-50bfbf2e8de7 host_id 1
> > 2017-07-30 10:49:31+0200 1766275 [10496]: leader2 path /rhev/data-
> > center/mnt/glusterSD/vbgsan02:_fs02/0924ff77-ef51-435b-b90d-
> > 50bfbf2e8de7/dom_md/ids offset 0
> > 2017-07-30 10:49:31+0200 1766275 [10496]: leader3 m 12212010 v
> > 30003 ss
> > 512 nh 0 mh 4076 oi 1 og 2031079063 lv 0
> > 2017-07-30 10:49:31+0200 1766275 [10496]: leader4 sn 0924ff77-ef51-
> > 435b-b90d-50bfbf2e<D5>ke7 rn <93><F6>7^\afa5-3a91-415b-a04c-
> > 221d3e060163.vbgkvm01.a ts 4351980 cs eefa4dd7
> > 2017-07-30 10:49:32+0200 1766276 [1171]: s310751 add_lockspace fail
> > result -226
> > 
> > 
> > The vdsm logs don't show any errors, and neither does engine.log.
> > 
> > If I check the ids file manually, I can see that everything in it
> > is correct except for the first host in the cluster, where the
> > space name and host id are broken.
> > 
> > 
> > /Johan
> > 
> > On Sun, 2017-07-30 at 11:18 +0300, Maor Lipchuk wrote:
> > > 
> > > Hi Johan,
> > > 
> > > Can you please share the vdsm and engine logs.
> > > 
> > > It also wouldn't hurt to get the sanlock logs, in case sanlock
> > > was configured to save all debugging in a log file (see
> > > http://people.redhat.com/teigland/sanlock-messages.txt).
> > > Please share the sanlock output from running 'sanlock client
> > > status' and 'sanlock client log_dump'.
> > > 
> > > Regards,
> > > Maor
> > > 
> > > On Thu, Jul 27, 2017 at 6:18 PM, Johan Bernhardsson <johan at kafit.
> > > se>
> > > wrote:
> > > > 
> > > > 
> > > > Hello,
> > > > 
> > > > The ids file for sanlock is broken on one setup. The first host
> > > > id in the file is wrong.
> > > > 
> > > > From the log file I have:
> > > > 
> > > > verify_leader 1 wrong space name 0924ff77-ef51-435b-b90d-
> > > > 50bfbf2e�ke7
> > > > 0924ff77-ef51-435b-b90d-50bfbf2e8de7 /rhev/data-
> > > > center/mnt/glusterSD/
> > > > 
> > > > 
> > > > 
> > > > Note the broken char in the space name.
> > > > 
> > > > This also appears, and it seems the host id is broken in the
> > > > ids file too:
> > > > 
> > > > leader4 sn 0924ff77-ef51-435b-b90d-50bfbf2e�ke7 rn ��7 afa5-
> > > > 3a91-
> > > > 415b-
> > > > a04c-221d3e060163.vbgkvm01.a ts 4351980 cs eefa4dd7
> > > > 
> > > > Note the broken chars there as well.
> > > > 
> > > > If I check the ids file with less or strings, the first row,
> > > > where my vbgkvm01 host is, has broken chars.
> > > > 
> > > > Can this be repaired in some way without taking down all the
> > > > virtual
> > > > machines on that storage?
> > > > 
> > > > 
> > > > /Johan
> > > > _______________________________________________
> > > > Users mailing list
> > > > Users at ovirt.org
> > > > http://lists.ovirt.org/mailman/listinfo/users
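
For anyone hitting the same "verify_leader ... wrong space name" message,
the comparison sanlock reports in the log lines quoted above can be
reproduced by hand. The sketch below is only an illustration: the regular
expression is inferred from the single log line in this thread (it is not
taken from sanlock's source), the function names are made up for this
example, and the sample line is the quoted one with the mail client's line
wrapping undone and the <D5> byte written as "\xd5". A corrupted name that
contains whitespace would not match this pattern.

```python
import re
from itertools import zip_longest

# Assumed layout of the message, inferred from the log line in this thread:
#   verify_leader <host_id> wrong space name <found> <expected> <path>
VERIFY_LEADER = re.compile(
    r"verify_leader (?P<host_id>\d+) wrong space name "
    r"(?P<found>\S+) (?P<expected>\S+) (?P<path>\S+)"
)


def parse_wrong_space_name(line):
    """Return the fields of a 'wrong space name' line, or None if no match."""
    match = VERIFY_LEADER.search(line)
    if match is None:
        return None
    return {
        "host_id": int(match.group("host_id")),
        "found": match.group("found"),
        "expected": match.group("expected"),
        "path": match.group("path"),
    }


def differing_positions(found, expected):
    """List (index, found_char, expected_char) wherever the names differ."""
    return [
        (index, f, e)
        for index, (f, e) in enumerate(zip_longest(found, expected))
        if f != e
    ]


# Sample line rebuilt from the quoted sanlock.log excerpt (unwrapped).
SAMPLE = (
    "2017-07-30 10:49:31+0200 1766275 [10496]: verify_leader 1 wrong space "
    "name 0924ff77-ef51-435b-b90d-50bfbf2e\xd5ke7 "
    "0924ff77-ef51-435b-b90d-50bfbf2e8de7 "
    "/rhev/data-center/mnt/glusterSD/vbgsan02:_fs02/"
    "0924ff77-ef51-435b-b90d-50bfbf2e8de7/dom_md/ids"
)

if __name__ == "__main__":
    fields = parse_wrong_space_name(SAMPLE)
    for index, found_char, expected_char in differing_positions(
        fields["found"], fields["expected"]
    ):
        print(index, repr(found_char), repr(expected_char))
```

Run against the sample line, this points at the two damaged characters near
the end of the space name and leaves the rest of the name intact, which
matches what was seen when inspecting the ids file with less or strings.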

