Hey,
For posterity: sadly, the only way to fix this was to re-init (wipe)
gluster and start from scratch.
-Chris.
On 03/02/2020 19:23, Strahil Nikolov wrote:
On February 3, 2020 2:29:55 PM GMT+02:00, Christian Reiss
<email(a)christian-reiss.de> wrote:
> Ugh,
>
> disregarding all previous statements:
>
> New finding: the vdsm user can NOT read files larger than 64 MB. Root can.
>
> [vdsm@node02:/rhev/data-cente[...]c51d8a18370] $ for i in 60 62 64 66 68 ; do dd if=/dev/urandom of=file-$i bs=1M count=$i ; done
>
> [vdsm@node03:/rhev/data-cente[...]c51d8a18370] $ for i in 60 62 64 66 68 ; do echo $i ; dd if=file-$i of=/dev/null ; done
> 60
> 122880+0 records in
> 122880+0 records out
> 62914560 bytes (63 MB) copied, 0.15656 s, 402 MB/s
> 62
> 126976+0 records in
> 126976+0 records out
> 65011712 bytes (65 MB) copied, 0.172463 s, 377 MB/s
> 64
> 131072+0 records in
> 131072+0 records out
> 67108864 bytes (67 MB) copied, 0.180701 s, 371 MB/s
> 66
> dd: error reading ‘file-66’: Permission denied
> 131072+0 records in
> 131072+0 records out
> 67108864 bytes (67 MB) copied, 0.105236 s, 638 MB/s
> 68
> dd: error reading ‘file-68’: Permission denied
> 131072+0 records in
> 131072+0 records out
> 67108864 bytes (67 MB) copied, 0.17046 s, 394 MB/s
>
>
> The files appeared instantly on all nodes, and writing large files seems
> to work; it is reading them back as vdsm that fails.
>
> I think this is the core issue.
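>
> Side thought: the 64 MB cut-off happens to match gluster's default shard
> size, so it might be worth checking whether this volume is sharded. Only a
> sketch (the volume name ssd_storage is guessed from the mount path):
>
> # gluster volume get ssd_storage features.shard
> # gluster volume get ssd_storage features.shard-block-size
>
> If sharding is on at 64 MB, only reads that cross into the second shard
> would fail, which would fit the numbers above.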
>
>
> On 03/02/2020 12:22, Christian Reiss wrote:
>> Further findings:
>>
>> - Modified data gets written to the local node only, not across gluster.
>> - The vdsm user can create _new_ files on the cluster; these get synced
>> immediately.
>> - vdsm can modify newly created files across all nodes; changes apply
>> immediately.
>>
>> I think the vdsm user cannot modify already-existing files over
>> gluster. Something SELinux-related?
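>>
>> A quick way to double-check the "local node only" part (just a sketch;
>> the file name testfile is only an example, use any existing file on the
>> mount):
>>
>> [vdsm@node02] $ echo marker >> testfile ; md5sum testfile
>> [vdsm@node03] $ md5sum testfile
>>
>> If the two checksums differ, the modification really did stay on the
>> local node instead of being replicated.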
>>
>> -Chris.
>>
>> On 03/02/2020 11:46, Christian Reiss wrote:
>>> Hey,
>>>
>>> I think I am barking up the right tree with something (else) here;
>>> note the timestamps & IDs:
>>>
>>>
>>> dd'ing a disk image as vdsm user, try 1:
>>>
>>>
>>> [vdsm@node03:/rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage/fec2eb5e-21b5-496b-9ea5-f718b2cb5556/images/4a55b9c0-d550-4ecb-8dd1-cc1f24f2c7ac]
>>> $ date ; id ; dd if=5fca6d0e-e320-425b-a89a-f80563461add | pv | dd
>>> of=/dev/null
>>> Mon 3 Feb 11:39:13 CET 2020
>>> uid=36(vdsm) gid=36(kvm) groups=36(kvm),107(qemu),179(sanlock)
>>> context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
>>> dd: error reading ‘5fca6d0e-e320-425b-a89a-f80563461add’: Permission denied
>>> 131072+0 records in
>>> 131072+0 records out
>>> 67108864 bytes (67 MB) copied, 0.169465 s, 396 MB/s
>>> 64MiB 0:00:00 [ 376MiB/s] [ <=> ]
>>> 131072+0 records in
>>> 131072+0 records out
>>> 67108864 bytes (67 MB) copied, 0.171726 s, 391 MB/s
>>>
>>>
>>> try 2, directly afterward:
>>>
>>>
>>> [vdsm@node03:/rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage/fec2eb5e-21b5-496b-9ea5-f718b2cb5556/images/4a55b9c0-d550-4ecb-8dd1-cc1f24f2c7ac]
>>> $ date ; id ; dd if=5fca6d0e-e320-425b-a89a-f80563461add | pv | dd
>>> of=/dev/null
>>> Mon 3 Feb 11:39:16 CET 2020
>>> uid=36(vdsm) gid=36(kvm) groups=36(kvm),107(qemu),179(sanlock)
>>> context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
>>> dd: error reading ‘5fca6d0e-e320-425b-a89a-f80563461add’: Permission denied
>>> 131072+0 records in
>>> 131072+0 records out
>>> 67108864 bytes (67 MB) copied, 0.148846 s, 451 MB/s
>>> 64MiB 0:00:00 [ 427MiB/s] [ <=> ]
>>> 131072+0 records in
>>> 131072+0 records out
>>> 67108864 bytes (67 MB) copied, 0.149589 s, 449 MB/s
>>>
>>>
>>> try same as root:
>>>
>>>
>>> [root@node03:/rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage/fec2eb5e-21b5-496b-9ea5-f718b2cb5556/images/4a55b9c0-d550-4ecb-8dd1-cc1f24f2c7ac]
>>> # date ; id ; dd if=5fca6d0e-e320-425b-a89a-f80563461add | pv | dd
>>> of=/dev/null
>>> Mon 3 Feb 11:39:33 CET 2020
>>> uid=0(root) gid=0(root) groups=0(root)
>>> context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
>>> 50GiB 0:03:06 [ 274MiB/s] [ <=> ]
>>> 104857600+0 records in
>>> 104857600+0 records out
>>> 53687091200 bytes (54 GB) copied, 186.501 s, 288 MB/s
>>> 104857600+0 records in
>>> 104857600+0 records out
>>> 53687091200 bytes (54 GB) copied, 186.502 s, 288 MB/s
>>>
>>>
>>> Followed by another vdsm dd test:
>>>
>>>
>>> [vdsm@node03:/rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage/fec2eb5e-21b5-496b-9ea5-f718b2cb5556/images/4a55b9c0-d550-4ecb-8dd1-cc1f24f2c7ac]
>>> $ date ; id ; dd if=5fca6d0e-e320-425b-a89a-f80563461add | pv | dd
>>> of=/dev/null
>>> Mon 3 Feb 11:42:46 CET 2020
>>> uid=36(vdsm) gid=36(kvm) groups=36(kvm),107(qemu),179(sanlock)
>>> context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
>>> 50GiB 0:02:56 [ 290MiB/s] [ <=> ]
>>> 104857600+0 records in
>>> 104857600+0 records out
>>> 53687091200 bytes (54 GB) copied, 176.189 s, 305 MB/s
>>> 104857600+0 records in
>>> 104857600+0 records out
>>> 53687091200 bytes (54 GB) copied, 176.19 s, 305 MB/s
>>>
>>> So it's a permission problem (access denied) unless root loads it
>>> first?
>>> Strange: things like file & stat work; I can even cat the meta file
>>> (a small text file). It seems only the disk images (or large files?)
>>> are affected.
>>>
>>> huh!?
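>>>
>>> Maybe comparing the failing image against a small file that does work
>>> narrows it down (only a diagnostic sketch; the .meta name here is
>>> assumed, use whichever small file cats fine):
>>>
>>> $ stat 5fca6d0e-e320-425b-a89a-f80563461add
>>> $ getfacl 5fca6d0e-e320-425b-a89a-f80563461add
>>> $ getfacl 5fca6d0e-e320-425b-a89a-f80563461add.meta
>>>
>>> If mode bits and ACLs look identical, the denial is probably coming from
>>> inside gluster rather than from plain file permissions.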
>>>
>>
What version of gluster are you running?
Have you tried the solution with:
1. Running a fake setfacl?
2. Or killing the brick processes and starting the volume with the 'force' option?
A rough sketch of both is below.
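
Roughly like this (only a sketch; the volume name ssd_storage is assumed
from your mount path, the file is just an example, run as root):

# 1. "fake" setfacl: re-apply an ACL entry on an affected file so gluster
#    refreshes its cached ACLs
cd /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage/fec2eb5e-21b5-496b-9ea5-f718b2cb5556/images/4a55b9c0-d550-4ecb-8dd1-cc1f24f2c7ac
setfacl -m u:vdsm:rw 5fca6d0e-e320-425b-a89a-f80563461add

# 2. brick restart with 'force': stop the brick processes for the volume,
#    then start the volume again so the bricks are respawned
gluster volume status ssd_storage      # note the brick PIDs
kill <brick PID>                       # on each node
gluster volume start ssd_storage force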
I also saw in your brick output that your SELinux context is 'unconfined_u',
so check the labeling too (see the check below).
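
For example (again only a sketch):

getenforce
ls -lZ /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage/fec2eb5e-21b5-496b-9ea5-f718b2cb5556/images/4a55b9c0-d550-4ecb-8dd1-cc1f24f2c7ac
ausearch -m avc -ts recent     # any denials logged for glusterfsd or vdsm?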
Still, it looks like my ACL issue.
Best Regards,
Strahil Nikolov
--
with kind regards,
mit freundlichen Gruessen,
Christian Reiss