Thanks for the hint guys :)
Running the following commands on all nodes resolved the issue:
chmod u+rwx /data/brick*/brick*
chmod g+rx /data/brick*/brick*
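For anyone hitting the same issue, this is roughly equivalent to the following
loop run from one host (assuming root ssh access to the nodes; node1-3 are the
hostnames used earlier in this thread):

for node in node1 node2 node3; do
    ssh root@"$node" 'chmod u+rwx,g+rx /data/brick*/brick* && ls -ld /data/brick*/brick*'
done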
I wonder if the upgrade process (switching from the gluster.org packages to
the CentOS ones) caused this issue.
I have another oVirt cluster running in a data center with package versions
similar to what this one had before the upgrade. I can try to upgrade one of
the nodes there next week and see if the directory permission issue happens
again.
I will notify the gluster list and/or open a bug report with the CentOS
package maintainers once I finish my test.
On Thu, Jul 28, 2016 at 9:29 PM David Gossage <dgossage(a)carouselchecks.com>
wrote:
On Thu, Jul 28, 2016 at 11:44 AM, Siavash Safi
<siavash.safi(a)gmail.com>
wrote:
> It seems that dir modes are wrong!?
> [root@node1 ~]# ls -ld /data/brick*/brick*
> drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick1/brick1
> drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
> [root@node2 ~]# ls -ld /data/brick*/brick*
> drwxr-xr-x. 5 vdsm kvm 107 Apr 26 19:33 /data/brick1/brick1
> drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
> drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick3/brick3
> [root@node3 ~]# ls -ld /data/brick*/brick*
> drw-------. 5 vdsm kvm 107 Jul 28 20:10 /data/brick1/brick1
> drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
>
That would probably do it. The kvm group has no read access, and the lack of
x on the directories isn't great either. Since they are the bricks, I'd think
you could manually chmod them to an appropriate 755. I "think" gluster only
tracks files under /data/brick1/brick1/*, not /data/brick1/brick1 itself.
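For example (assuming the bricks should stay owned by vdsm:kvm, like on the
healthy node), something like this on each node should do it:

chown vdsm:kvm /data/brick*/brick*
chmod 755 /data/brick*/brick*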
/data/brick3/brick3 is the only "new" one? I wonder if it could be a bug from
the brick move in some way. Might be something worth posting to the gluster
list about.
>
> On Thu, Jul 28, 2016 at 9:06 PM Sahina Bose <sabose(a)redhat.com> wrote:
>
>>
>>
>> ----- Original Message -----
>> > From: "Siavash Safi" <siavash.safi(a)gmail.com>
>> > To: "Sahina Bose" <sabose(a)redhat.com>
>> > Cc: "David Gossage" <dgossage(a)carouselchecks.com>, "users" <users(a)ovirt.org>,
>> > "Nir Soffer" <nsoffer(a)redhat.com>, "Allon Mureinik" <amureini(a)redhat.com>
>> > Sent: Thursday, July 28, 2016 9:04:32 PM
>> > Subject: Re: [ovirt-users] Cannot find master domain
>> >
>> > Please check the attachment.
>>
>> Nothing out of place in the mount logs.
>>
>> Can you ensure the brick dir permissions are vdsm:kvm - even for the
>> brick that was replaced?
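>> For example, something along these lines on each node:
>>
>>   stat -c '%U:%G %a' /data/brick*/brick*
>>
>> and a chown vdsm:kvm on any brick directory that doesn't match.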
>>
>> >
>> > On Thu, Jul 28, 2016 at 7:46 PM Sahina Bose <sabose(a)redhat.com> wrote:
>> >
>> > >
>> > >
>> > > ----- Original Message -----
>> > > > From: "Siavash Safi" <siavash.safi(a)gmail.com>
>> > > > To: "Sahina Bose" <sabose(a)redhat.com>
>> > > > Cc: "David Gossage" <dgossage(a)carouselchecks.com>, "users" <users(a)ovirt.org>
>> > > > Sent: Thursday, July 28, 2016 8:35:18 PM
>> > > > Subject: Re: [ovirt-users] Cannot find master domain
>> > > >
>> > > > [root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/
>> > > > drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28 /rhev/data-center/mnt/glusterSD/
>> > > > [root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/
>> > > > getfacl: Removing leading '/' from absolute path names
>> > > > # file: rhev/data-center/mnt/glusterSD/
>> > > > # owner: vdsm
>> > > > # group: kvm
>> > > > user::rwx
>> > > > group::r-x
>> > > > other::r-x
>> > > >
>> > >
>> > >
>> > > The ACLs look correct to me. Adding Nir/Allon for insights.
>> > >
>> > > Can you attach the gluster mount logs from this host?
>> > >
>> > >
>> > > > And as I mentioned in another message, the directory is empty.
>> > > >
>> > > > On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sabose(a)redhat.com> wrote:
>> > > >
>> > > > > Error from vdsm log: Permission settings on the specified path do not
>> > > > > allow access to the storage. Verify permission settings on the specified
>> > > > > storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>> > > > >
>> > > > > I remember another thread about a similar issue - can you check the ACL
>> > > > > settings on the storage path?
>> > > > >
>> > > > > ----- Original Message -----
>> > > > > > From: "Siavash Safi" <siavash.safi(a)gmail.com>
>> > > > > > To: "David Gossage" <dgossage(a)carouselchecks.com>
>> > > > > > Cc: "users" <users(a)ovirt.org>
>> > > > > > Sent: Thursday, July 28, 2016 7:58:29 PM
>> > > > > > Subject: Re: [ovirt-users] Cannot find master domain
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Thu, Jul 28, 2016 at 6:29 PM David Gossage <dgossage(a)carouselchecks.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <siavash.safi(a)gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > Issue: Cannot find master domain
>> > > > > > Changes applied before issue started to happen: replaced
>> > > > > > 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3,
>> > > > > > did minor package upgrades for vdsm and glusterfs
>> > > > > >
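>> > > > > > (Presumably the brick was swapped with something along the lines of
>> > > > > > "gluster volume replace-brick ovirt 172.16.0.12:/data/brick1/brick1
>> > > > > > 172.16.0.12:/data/brick3/brick3 commit force"; the exact command used
>> > > > > > is not shown here.)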
>> > > > > > vdsm log: https://paste.fedoraproject.org/396842/
>> > > > > >
>> > > > > >
>> > > > > > Any errors in gluster's brick or server logs? The client gluster logs
>> > > > > > from ovirt?
>> > > > > > Brick errors:
>> > > > > > [2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup]
>> > > > > > 0-ovirt-posix: null gfid for path (null)
>> > > > > > [2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup]
>> > > > > > 0-ovirt-posix: lstat on null failed [Invalid argument]
>> > > > > > (Both repeated many times)
>> > > > > >
>> > > > > > Server errors:
>> > > > > > None
>> > > > > >
>> > > > > > Client errors:
>> > > > > > None
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > yum log: https://paste.fedoraproject.org/396854/
>> > > > > >
>> > > > > > What version of gluster was running prior to the update to 3.7.13?
>> > > > > > 3.7.11-1 from the gluster.org repository (after the update oVirt switched
>> > > > > > to the CentOS repository)
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Did it create gluster mounts on the server when attempting to start?
>> > > > > > As I checked, the master domain is not mounted on any nodes.
>> > > > > > Restarting vdsmd generated the following errors:
>> > > > > >
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
>> > > > > > directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
>> > > > > > Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>> > > > > > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope
>> > > > > > --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o
>> > > > > > backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt
>> > > > > > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>> > > > > > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l
>> > > > > > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>> > > > > > jsonrpc.Executor/5::ERROR::2016-07-28
>> > > > > > 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not
>> > > > > > connect to storageServer
>> > > > > > Traceback (most recent call last):
>> > > > > >   File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
>> > > > > >     conObj.connect()
>> > > > > >   File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
>> > > > > >     six.reraise(t, v, tb)
>> > > > > >   File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
>> > > > > >     self.getMountObj().getRecord().fs_file)
>> > > > > >   File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
>> > > > > >     raise se.StorageServerAccessPermissionError(dirPath)
>> > > > > > StorageServerAccessPermissionError: Permission settings on the specified
>> > > > > > path do not allow access to the storage. Verify permission settings on the
>> > > > > > specified storage path.: 'path =
>> > > > > > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
>> > > > > > jsonrpc.Executor/5::INFO::2016-07-28
>> > > > > > 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect:
>> > > > > > connectStorageServer, Return response: {'statuslist': [{'status': 469,
>> > > > > > 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare)
>> > > > > > Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist':
>> > > > > > [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>> > > > > > jsonrpc.Executor/5::DEBUG::2016-07-28
>> > > > > > 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState)
>> > > > > > Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing
>> > > > > > -> state finished
>> > > > > >
>> > > > > > I can manually mount the gluster volume on the same server.
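>> > > > > > (For example, a manual mount with the same options vdsm uses works:
>> > > > > > mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 \
>> > > > > >   172.16.0.11:/ovirt /mnt/test
>> > > > > > where /mnt/test is just a scratch mount point.)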
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Setup:
>> > > > > > engine running on a separate node
>> > > > > > 3 x kvm/glusterd nodes
>> > > > > >
>> > > > > > Status of volume: ovirt
>> > > > > > Gluster process TCP Port RDMA Port Online Pid
>> > > > > > ------------------------------------------------------------------------------
>> > > > > > Brick 172.16.0.11:/data/brick1/brick1 49152 0 Y 17304
>> > > > > > Brick 172.16.0.12:/data/brick3/brick3 49155 0 Y 9363
>> > > > > > Brick 172.16.0.13:/data/brick1/brick1 49152 0 Y 23684
>> > > > > > Brick 172.16.0.11:/data/brick2/brick2 49153 0 Y 17323
>> > > > > > Brick 172.16.0.12:/data/brick2/brick2 49153 0 Y 9382
>> > > > > > Brick 172.16.0.13:/data/brick2/brick2 49153 0 Y 23703
>> > > > > > NFS Server on localhost 2049 0 Y 30508
>> > > > > > Self-heal Daemon on localhost N/A N/A Y 30521
>> > > > > > NFS Server on 172.16.0.11 2049 0 Y 24999
>> > > > > > Self-heal Daemon on 172.16.0.11 N/A N/A Y 25016
>> > > > > > NFS Server on 172.16.0.13 2049 0 Y 25379
>> > > > > > Self-heal Daemon on 172.16.0.13 N/A N/A Y 25509
>> > > > > >
>> > > > > > Task Status of Volume ovirt
>> > > > > >
>> > > > > > ------------------------------------------------------------------------------
>> > > > > > Task : Rebalance
>> > > > > > ID : 84d5ab2a-275e-421d-842b-928a9326c19a
>> > > > > > Status : completed
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Siavash
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > _______________________________________________
>> > > > > > Users mailing list
>> > > > > > Users(a)ovirt.org
>> > > > > > http://lists.ovirt.org/mailman/listinfo/users
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>