On Thu, Jul 28, 2016 at 6:29 PM David Gossage <dgossage@carouselchecks.com> wrote:
On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
Hi,

Issue: Cannot find master domain
Changes applied before the issue started: replaced 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3, and did minor package upgrades for vdsm and glusterfs.

Any errors in gluster's brick or server logs? The client gluster logs from oVirt?
Brick errors:
[2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup] 0-ovirt-posix: null gfid for path (null)
[2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup] 0-ovirt-posix: lstat on null failed [Invalid argument]
(Both repeated many times)

Server errors:
None

Client errors:
None

What version of gluster was running prior to the update to 3.7.13?
3.7.11-1 from the gluster.org repository (after the update, oVirt switched to the CentOS repository).

Did it create the gluster mounts on the server when attempting to start?
As far as I can tell, the master domain is not mounted on any node.
Restarting vdsmd generated the following errors:

jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option) Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
jsonrpc.Executor/5::ERROR::2016-07-28 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
    six.reraise(t, v, tb)
  File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
    self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
    raise se.StorageServerAccessPermissionError(dirPath)
StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
jsonrpc.Executor/5::INFO::2016-07-28 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare) Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist': [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState) Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing -> state finished

I can manually mount the gluster volume on the same server.
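
For what it's worth, the StorageServerAccessPermissionError in the traceback above is raised from vdsm's validateDirAccess when the mount point does not grant read/write/execute access to the vdsm user (uid/gid typically 36:36 on oVirt nodes). A minimal sketch of that kind of permission-bit check, assuming a hypothetical helper and not vdsm's actual implementation (the real check also probes actual I/O through ioprocess):

```python
import os
import stat
import tempfile

def dir_allows_rwx(path, uid, gid):
    # Hypothetical helper: would this uid/gid get rwx on `path` through
    # the owner, group, or other permission bits? vdsm's real
    # validateDirAccess does more, but a wrong mode/ownership on the
    # mount point fails in exactly this way.
    st = os.stat(path)
    if st.st_uid == uid:
        need = stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
    elif st.st_gid == gid:
        need = stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IROTH | stat.S_IWOTH | stat.S_IXOTH
    return st.st_mode & need == need

# A freshly created 0700 directory is accessible to its owner, but an
# unrelated uid/gid (such as vdsm:kvm, typically 36:36) is denied:
d = tempfile.mkdtemp()
os.chmod(d, 0o700)
print(dir_allows_rwx(d, os.getuid(), os.getgid()))  # owner bits grant rwx
```

So on the affected node, comparing `ls -ld` output for /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (and for the mounted volume root) against a working node would show whether the ownership or mode changed after the brick replacement.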

Setup:
engine running on a separate node
3 x kvm/glusterd nodes

Status of volume: ovirt
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 172.16.0.11:/data/brick1/brick1       49152     0          Y       17304
Brick 172.16.0.12:/data/brick3/brick3       49155     0          Y       9363
Brick 172.16.0.13:/data/brick1/brick1       49152     0          Y       23684
Brick 172.16.0.11:/data/brick2/brick2       49153     0          Y       17323
Brick 172.16.0.12:/data/brick2/brick2       49153     0          Y       9382
Brick 172.16.0.13:/data/brick2/brick2       49153     0          Y       23703
NFS Server on localhost                     2049      0          Y       30508
Self-heal Daemon on localhost               N/A       N/A        Y       30521
NFS Server on 172.16.0.11                   2049      0          Y       24999
Self-heal Daemon on 172.16.0.11             N/A       N/A        Y       25016
NFS Server on 172.16.0.13                   2049      0          Y       25379
Self-heal Daemon on 172.16.0.13             N/A       N/A        Y       25509

Task Status of Volume ovirt
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 84d5ab2a-275e-421d-842b-928a9326c19a
Status               : completed

Thanks,
Siavash

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users