[ovirt-users] [Gluster-users] open error -13 = sanlock
paf1 at email.cz
Wed Mar 2 05:36:19 EST 2016
Hi guys,
thanks a lot for your support, first of all.
Because we had been under huge time pressure, we found a "Google
workaround" which deletes both copies of the file. It helped, probably
as a first step of the recovery.
eg: " # find /STORAGES/g1r5p5/GFS/ -samefile
/STORAGES/g1r5p5/GFS/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids
-print -delete "
---------------------->
Well, at first I'll fix the permissions to 660 from the mount points.
If the "ids" file becomes writeable, can't that cause Gluster to collapse??
regs.Pavel
On 2.3.2016 08:16, Ravishankar N wrote:
> On 03/02/2016 12:02 PM, Sahina Bose wrote:
>>
>>
>> On 03/02/2016 03:45 AM, Nir Soffer wrote:
>>> On Tue, Mar 1, 2016 at 10:51 PM, paf1 at email.cz <paf1 at email.cz> wrote:
>>> >
>>> > HI,
>>> > requested output:
>>> >
>>> > # ls -lh /rhev/data-center/mnt/glusterSD/localhost:*/*/dom_md
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md:
>>> > total 2,1M
>>> > -rw-rw---- 1 vdsm kvm 1,0M 1. bře 21.28 ids <-- good
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 22.16 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 7. lis 22.17 leases
>>> > -rw-r--r-- 1 vdsm kvm 335 7. lis 22.17 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 22.16 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P1/553d9b92-e4a0-4042-a579-4cabeb55ded4/dom_md:
>>> > total 1,1M
>>> > -rw-r--r-- 1 vdsm kvm 0 24. úno 07.41 ids <-- bad (sanlock
>>> cannot write, other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.14 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 7. lis 03.56 leases
>>> > -rw-r--r-- 1 vdsm kvm 333 7. lis 03.56 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.14 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md:
>>> > total 1,1M
>>> > -rw-r--r-- 1 vdsm kvm 0 24. úno 07.43 ids <-- bad (sanlock
>>> cannot write, other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.15 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 7. lis 22.14 leases
>>> > -rw-r--r-- 1 vdsm kvm 333 7. lis 22.14 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.15 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P3/3c34ad63-6c66-4e23-ab46-084f3d70b147/dom_md:
>>> > total 1,1M
>>> > -rw-r--r-- 1 vdsm kvm 0 24. úno 07.43 ids <-- bad (sanlock
>>> cannot write, other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 23. úno 22.51 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 23. úno 23.12 leases
>>> > -rw-r--r-- 1 vdsm kvm 998 25. úno 00.35 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.16 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md:
>>> > total 1,1M
>>> > -rw-r--r-- 1 vdsm kvm 0 24. úno 07.44 ids <-- bad (sanlock
>>> cannot write, other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.17 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 7. lis 00.18 leases
>>> > -rw-r--r-- 1 vdsm kvm 333 7. lis 00.18 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 00.17 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P1/42d710a9-b844-43dc-be41-77002d1cd553/dom_md:
>>> > total 1,1M
>>> > -rw-rw-r-- 1 vdsm kvm 0 24. úno 07.32 ids <-- bad (other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 22.18 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 7. lis 22.18 leases
>>> > -rw-r--r-- 1 vdsm kvm 333 7. lis 22.18 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 7. lis 22.18 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md:
>>> > total 3,0M
>>> > -rw-rw-r-- 1 vdsm kvm 1,0M 1. bře 21.28 ids <-- bad (other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 25. úno 00.42 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 25. úno 00.44 leases
>>> > -rw-r--r-- 1 vdsm kvm 997 24. úno 02.46 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 25. úno 00.44 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P3/ef010d08-aed1-41c4-ba9a-e6d9bdecb4b4/dom_md:
>>> > total 2,1M
>>> > -rw-r--r-- 1 vdsm kvm 0 24. úno 07.34 ids <-- bad (sanlock
>>> cannot write, other can read)
>>> > -rw-rw---- 1 vdsm kvm 16M 23. úno 22.35 inbox
>>> > -rw-rw---- 1 vdsm kvm 2,0M 23. úno 22.38 leases
>>> > -rw-r--r-- 1 vdsm kvm 1,1K 24. úno 19.07 metadata
>>> > -rw-rw---- 1 vdsm kvm 16M 23. úno 22.27 outbox
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12__P4/300e9ac8-3c2f-4703-9bb1-1df2130c7c97/dom_md:
>>> > total 3,0M
>>> > -rw-rw-r-- 1 vdsm kvm 1,0M 1. bře 21.28 ids <-- bad (other can read)
>>> > -rw-rw-r-- 1 vdsm kvm 16M 6. lis 23.50 inbox <-- bad (other can
>>> read)
>>> > -rw-rw-r-- 1 vdsm kvm 2,0M 6. lis 23.51 leases <-- bad
>>> (other can read)
>>> > -rw-rw-r-- 1 vdsm kvm 734 7. lis 02.13 metadata <-- bad
>>> (group can write, other can read)
>>> > -rw-rw-r-- 1 vdsm kvm 16M 6. lis 16.55 outbox <-- bad (other
>>> can read)
>>> >
>>> >
>>> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P5/1ca56b45-701e-4c22-9f59-3aebea4d8477/dom_md:
>>> > total 1,1M
>>> > -rw-rw-r-- 1 vdsm kvm 0 24. úno 07.35 ids <-- bad (other can read)
>>> > -rw-rw-r-- 1 vdsm kvm 16M 24. úno 01.06 inbox
>>> > -rw-rw-r-- 1 vdsm kvm 2,0M 24. úno 02.44 leases
>>> > -rw-r--r-- 1 vdsm kvm 998 24. úno 19.07 metadata
>>> > -rw-rw-r-- 1 vdsm kvm 16M 7. lis 22.20 outbox
>>>
>>>
>>> It should look like this:
>>>
>>> -rw-rw----. 1 vdsm kvm 1.0M Mar 1 23:36 ids
>>> -rw-rw----. 1 vdsm kvm 2.0M Mar 1 23:35 leases
>>> -rw-r--r--. 1 vdsm kvm 353 Mar 1 23:35 metadata
>>> -rw-rw----. 1 vdsm kvm 16M Mar 1 23:34 outbox
>>> -rw-rw----. 1 vdsm kvm 16M Mar 1 23:34 inbox
>>>
>>> This explains the EACCES error.
>>>
>>> You can start by fixing the permissions manually, you can do this
>>> online.
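>>>
>>> A minimal sketch of the fix, assuming the 1KVM12-P1 paths from the
>>> listing above (repeat for each affected domain, always via the
>>> glusterfs mount, not on the bricks):
>>>
>>> cd /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P1/553d9b92-e4a0-4042-a579-4cabeb55ded4/dom_md
>>> chown vdsm:kvm ids inbox leases metadata outbox
>>> chmod 0660 ids inbox leases outbox    # rw-rw---- as in the good domain
>>> chmod 0644 metadata                   # rw-r--r--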
>>>
>>> > The ids files were generated by the "touch" command after deleting
>>> them, due to the "sanlock locking hang" gluster crash & reboot.
>>> > I expected that they would be filled automatically after the gluster
>>> reboot (the shadow copy in the ".glusterfs" directory was
>>> deleted & recreated empty too).
>>>
>>> I don't know about gluster shadow copy, I would not play with
>>> gluster internals.
>>> Adding Sahina for advice.
>>
>> Did you generate the ids file on the mount point?
>>
>> Ravi, can you help here?
>>
>
> Okay, so what I understand from the output above is that you have different
> gluster volumes mounted and some of them have incorrect permissions
> for the 'ids' file. The way to fix it is to do it from the mount like
> Nir said.
> Why did you delete the file from the .glusterfs in the brick(s)? Was
> there a gfid split brain?
>
> -Ravi
>
>>>
>>> > OK, it looks like sanlock can't work with an empty file or rewrite
>>> it.
>>> > Am I right??
>>>
>>> Yes, the files must be initialized before sanlock can use them.
>>>
>>> You can initialize the file like this:
>>>
>>> sanlock direct init -s <sd_uuid>:0:repair/<sd_uuid>/dom_md/ids:0
>>>
>>> Taken from
>>> http://lists.ovirt.org/pipermail/users/2016-February/038046.html
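>>>
>>> For example, a sketch for the 1KVM12-P4 domain from the listing
>>> above (the lockspace name is the storage domain UUID; if sanlock
>>> rejects the path, the colon in "localhost:_1KVM12-P4" may need
>>> escaping as "\:"):
>>>
>>> sanlock direct init -s 7f52b697-c199-4f58-89aa-102d44327124:0:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0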
>>>
>>> > The last point, about the "ids" workaround: this is the offline
>>> version, i.e. the VMs have to be moved away so they keep running
>>> while the volume is in maintenance mode.
>>> > But this is not acceptable in the current situation, so the question
>>> again: is it safe to do it online?? (YES / NO)
>>>
>>> The ids file is accessed only by sanlock. I guess that you don't
>>> have a running
>>> SPM on this DC, since sanlock fails to acquire a host id, so you are
>>> pretty safe
>>> to fix the permissions and initialize the ids files.
>>>
>>> I would do this:
>>>
>>> 1. Stop engine, so it will not try to start vdsm
>>> 2. Stop vdsm on all hosts, so they do not try to acquire a host id
>>> with sanlock
>>> This does not affect running vms
>>> 3. Fix the permissions on the ids file, via glusterfs mount
>>> 4. Initialize the ids files from one of the hosts, via the glusterfs
>>> mount
>>> This should fix the ids files on all replicas
>>> 5. Start vdsm on all hosts
>>> 6. Start engine
>>>
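>>> A rough sketch of the sequence, assuming the standard EL7 service
>>> names (the chmod and "sanlock direct init" commands are the ones
>>> shown above, run once per affected domain):
>>>
>>> systemctl stop ovirt-engine    # 1. on the engine host
>>> systemctl stop vdsmd           # 2. on every hypervisor host
>>> # 3.-4. fix permissions and initialize the ids files via the
>>> #       glusterfs mount on one host, for each affected domain
>>> systemctl start vdsmd          # 5. on every hypervisor host
>>> systemctl start ovirt-engine   # 6. on the engine host
>>>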
>>> Engine will connect to all hosts, hosts will connect to storage and
>>> try to acquire a host id.
>>> Then Engine will start the SPM on one of the hosts, and your DC
>>> should become up.
>>>
>>> David, Sahina, can you confirm that this procedure is safe?
>>
>> Yes, correcting from the mount point should fix it on all replicas
>>
>>
>>>
>>> Nir
>>>
>>> >
>>> > regs.
>>> > Pavel
>>> >
>>> >
>>> >
>>> > On 1.3.2016 18:38, Nir Soffer wrote:
>>> >
>>> > On Tue, Mar 1, 2016 at 5:07 PM, paf1 at email.cz <paf1 at email.cz> wrote:
>>> >>
>>> >> Hello, can anybody explain this error no. 13 (open file) in
>>> sanlock.log?
>>> >
>>> >
>>> > This is EACCES
>>> >
>>> > Can you share the output of:
>>> >
>>> > ls -lh /rhev/data-center/mnt/<server>:<_path>/<sd_uuid>/dom_md
>>> >
>>> >>
>>> >>
>>> >> The size of the "ids" file is zero (0).
>>> >
>>> >
>>> > This is how we create the ids file when initializing it.
>>> >
>>> > But then we use sanlock to initialize the ids file, and it should
>>> be 1MiB after that.
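>>> >
>>> > A quick sanity check after initialization (a sketch; 1 MiB =
>>> > 1048576 bytes):
>>> >
>>> > stat -c %s /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>>> > # should print 1048576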
>>> >
>>> > Were these ids files created by vdsm, or ones you created yourself?
>>> >
>>> >>
>>> >> 2016-02-28 03:25:46+0100 269626 [1951]: open error -13
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>>> >> 2016-02-28 03:25:46+0100 269626 [1951]: s187985 open_disk
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>>> error -13
>>> >> 2016-02-28 03:25:56+0100 269636 [11304]: s187992 lockspace
>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>>> >>
>>> >> If the main problem is the zero file size, can I regenerate
>>> this file online safely, with no impact on running VMs?
>>> >
>>> >
>>> > Yes, I think I already referred to the instructions how to do that
>>> in a previous mail.
>>> >
>>> >>
>>> >>
>>> >> dist = RHEL - 7 - 2.1511
>>> >> kernel = 3.10.0 - 327.10.1.el7.x86_64
>>> >> KVM = 2.3.0 - 29.1.el7
>>> >> libvirt = libvirt-1.2.17-13.el7_2.3
>>> >> vdsm = vdsm-4.16.30-0.el7
>>> >> GlusterFS = glusterfs-3.7.8-1.el7
>>> >>
>>> >>
>>> >> regs.
>>> >> Pavel
>>> >>
>>> >
>>> >
>>
>>
>>
>
>