[ovirt-users] Cannot find master domain

David Gossage dgossage at carouselchecks.com
Thu Jul 28 15:35:57 UTC 2016


On Thu, Jul 28, 2016 at 10:28 AM, Siavash Safi <siavash.safi at gmail.com>
wrote:

> I created the directory with correct permissions:
> drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51
> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/
>
> It was removed after I tried to activate the storage from web.
>
Is the directory missing on all 3 oVirt nodes?  Did you create it on all 3?

When you did the test mount with oVirt's mount options, did the permissions
on the files look proper after mounting?  Can you read from and write to the
mount?
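
A quick way to check this outside of vdsm is a small script run as the vdsm
user against the test mount. This is only a rough sketch of that kind of
check, not vdsm's actual code: the function name check_storage_path is made
up for illustration, and the expected owner 36:36 is taken from the
storage.owner-uid / storage.owner-gid options quoted further down.

```python
import os
import stat


def check_storage_path(path, expected_uid=36, expected_gid=36):
    """Return a list of problems found for `path` (empty list means OK).

    Hypothetical helper, not part of vdsm. Run it as the user that has
    to access the storage, since os.access() answers for the caller.
    """
    try:
        st = os.stat(path)
    except OSError as exc:
        return ["cannot stat %s: %s" % (path, exc)]

    problems = []
    if not stat.S_ISDIR(st.st_mode):
        problems.append("%s is not a directory" % path)
    if st.st_uid != expected_uid:
        problems.append("owner uid is %d, expected %d" % (st.st_uid, expected_uid))
    if st.st_gid != expected_gid:
        problems.append("owner gid is %d, expected %d" % (st.st_gid, expected_gid))
    # Read, write and search permission for the calling user.
    if not os.access(path, os.R_OK | os.W_OK | os.X_OK):
        problems.append("current user lacks read/write/execute access")
    return problems
```

Calling check_storage_path("/mnt") after the test mount and printing the
returned list should show whether ownership or access is off on that node.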


> Engine displays the master storage as inactive:
> [image: oVirt_Engine_Web_Administration.png]
>
>
> On Thu, Jul 28, 2016 at 7:40 PM David Gossage <dgossage at carouselchecks.com>
> wrote:
>
>> On Thu, Jul 28, 2016 at 10:00 AM, Siavash Safi <siavash.safi at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 28, 2016 at 7:19 PM David Gossage <
>>> dgossage at carouselchecks.com> wrote:
>>>
>>>> On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.safi at gmail.com>
>>>> wrote:
>>>>
>>>>> file system: xfs
>>>>> features.shard: off
>>>>>
>>>>
>>>> OK, I was just checking whether this matched the issues the latest 3.7.x
>>>> releases have with ZFS and sharding, but that doesn't look like your issue.
>>>>
>>>> In your logs I see it mounts with these commands.  What happens if you
>>>> use the same ones on a test dir?
>>>>
>>>>  /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13
>>>> 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
>>>>
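
For repeating the test on each node, the mount command above can be
assembled from the server list. This is a sketch under assumptions:
build_mount_cmd is a hypothetical helper, and it only mirrors the command
line shown in the log (first server as the volfile server, the rest in
backup-volfile-servers). It is not how vdsm itself builds the command.

```python
def build_mount_cmd(servers, volume, mountpoint):
    """Build the argv list for a glusterfs test mount.

    Hypothetical helper: the first entry of `servers` is used as the
    primary server, the remaining ones go into backup-volfile-servers.
    """
    if not servers:
        raise ValueError("at least one server is required")
    primary, backups = servers[0], servers[1:]
    cmd = ["/usr/bin/mount", "-t", "glusterfs"]
    if backups:
        cmd += ["-o", "backup-volfile-servers=" + ":".join(backups)]
    cmd += ["%s:/%s" % (primary, volume), mountpoint]
    return cmd


cmd = build_mount_cmd(
    ["172.16.0.11", "172.16.0.12", "172.16.0.13"], "ovirt", "/mnt")
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.check_call() as root to
perform the actual mount.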
>>>
>>> It mounts successfully:
>>> [root at node1 ~]# /usr/bin/mount -t glusterfs -o
>>> backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
>>> [root at node1 ~]# ls /mnt/
>>> 4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__
>>>
>>>
>>>> It then unmounts it and complains about permissions a short while later.
>>>>
>>>> StorageServerAccessPermissionError: Permission settings on the
>>>> specified path do not allow access to the storage. Verify permission
>>>> settings on the specified storage path.: 'path =
>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>>>>
>>>> Are the permissions of dirs to
>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected?
>>>>
>>>
>>> /rhev/data-center/mnt/glusterSD/ is empty. Maybe it removes the
>>> directory as cleanup after the failure?
>>>
>>
>> Maybe, though I don't recall it ever being deleted unless you destroy or
>> detach the storage. What if you create that directory with the appropriate
>> permissions on any node where it is missing, then try to activate the
>> storage?
>>
>> Is the engine still displaying the master storage domain?
>>
>>
>>>> How about on the bricks?  Anything out of place?
>>>>
>>>
>>> I didn't notice anything.
>>>
>>>
>>>> Is gluster still using the same options as before?  Could it have reset
>>>> the user and group to something other than 36?
>>>>
>>>
>>> All options seem to be correct; to make sure, I ran "Optimize for Virt
>>> Store" from web.
>>>
>>> Volume Name: ovirt
>>> Type: Distributed-Replicate
>>> Volume ID: b224d9bc-d120-4fe1-b233-09089e5ca0b2
>>> Status: Started
>>> Number of Bricks: 2 x 3 = 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 172.16.0.11:/data/brick1/brick1
>>> Brick2: 172.16.0.12:/data/brick3/brick3
>>> Brick3: 172.16.0.13:/data/brick1/brick1
>>> Brick4: 172.16.0.11:/data/brick2/brick2
>>> Brick5: 172.16.0.12:/data/brick2/brick2
>>> Brick6: 172.16.0.13:/data/brick2/brick2
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> nfs.disable: off
>>> user.cifs: enable
>>> auth.allow: *
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> cluster.eager-lock: enable
>>> network.remote-dio: enable
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> server.allow-insecure: on
>>> network.ping-timeout: 10
>>>
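
To compare this across nodes without reading the output by eye, the
"Options Reconfigured" section can be parsed and the owner settings
checked against 36. A minimal sketch, assuming the plain-text layout of
the volume info output quoted above; both function names are made up for
illustration.

```python
def parse_reconfigured_options(volume_info_text):
    """Return the 'Options Reconfigured' entries as a dict of strings.

    Hypothetical helper: assumes one 'key: value' pair per line after
    the 'Options Reconfigured:' header, ending at the first blank line.
    """
    options = {}
    in_options = False
    for raw in volume_info_text.splitlines():
        line = raw.strip()
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options:
            if not line:
                break  # a blank line ends the option list
            key, sep, value = line.partition(":")
            if sep:
                options[key.strip()] = value.strip()
    return options


def owner_mismatches(options, uid="36", gid="36"):
    """Return the owner options whose value differs from the expected one."""
    expected = {"storage.owner-uid": uid, "storage.owner-gid": gid}
    return {k: options.get(k) for k, v in expected.items()
            if options.get(k) != v}
```

An empty dict from owner_mismatches() means both storage.owner-uid and
storage.owner-gid are set as expected; an option that is not set at all
shows up with the value None.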
>>>
>>>>> On Thu, Jul 28, 2016 at 7:03 PM David Gossage <
>>>>> dgossage at carouselchecks.com> wrote:
>>>>>
>>>>>> On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <siavash.safi at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>
>>>>>>>> On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <
>>>>>>>> siavash.safi at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Issue: Cannot find master domain
>>>>>>>>> Changes applied before issue started to happen: replaced
>>>>>>>>> 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3,
>>>>>>>>> did minor package upgrades for vdsm and glusterfs
>>>>>>>>>
>>>>>>>>> vdsm log: https://paste.fedoraproject.org/396842/
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Any errors in gluster's brick or server logs?  Or in the gluster
>>>>>>>> client logs from oVirt?
>>>>>>>>
>>>>>>> Brick errors:
>>>>>>> [2016-07-28 14:03:25.002396] E [MSGID: 113091]
>>>>>>> [posix.c:178:posix_lookup] 0-ovirt-posix: null gfid for path (null)
>>>>>>> [2016-07-28 14:03:25.002430] E [MSGID: 113018]
>>>>>>> [posix.c:196:posix_lookup] 0-ovirt-posix: lstat on null failed [Invalid
>>>>>>> argument]
>>>>>>> (Both repeated many times)
>>>>>>>
>>>>>>> Server errors:
>>>>>>> None
>>>>>>>
>>>>>>> Client errors:
>>>>>>> None
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> yum log: https://paste.fedoraproject.org/396854/
>>>>>>>>>
>>>>>>>>
>>>>>>>> What version of gluster was running prior to update to 3.7.13?
>>>>>>>>
>>>>>>> 3.7.11-1 from the gluster.org repository (after the update, oVirt
>>>>>>> switched to the CentOS repository)
>>>>>>>
>>>>>>
>>>>>> What file system do your bricks reside on and do you have sharding
>>>>>> enabled?
>>>>>>
>>>>>>
>>>>>>>> Did it create gluster mounts on server when attempting to start?
>>>>>>>>
>>>>>>> As far as I checked, the master domain is not mounted on any node.
>>>>>>> Restarting vdsmd generated the following errors:
>>>>>>>
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
>>>>>>> directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode:
>>>>>>> None
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
>>>>>>> Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>>>>>>> --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope
>>>>>>> --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o
>>>>>>> backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt
>>>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>>>>>>> --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l
>>>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>>>>>>> jsonrpc.Executor/5::ERROR::2016-07-28
>>>>>>> 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not
>>>>>>> connect to storageServer
>>>>>>> Traceback (most recent call last):
>>>>>>>   File "/usr/share/vdsm/storage/hsm.py", line 2470, in
>>>>>>> connectStorageServer
>>>>>>>     conObj.connect()
>>>>>>>   File "/usr/share/vdsm/storage/storageServer.py", line 248, in
>>>>>>> connect
>>>>>>>     six.reraise(t, v, tb)
>>>>>>>   File "/usr/share/vdsm/storage/storageServer.py", line 241, in
>>>>>>> connect
>>>>>>>     self.getMountObj().getRecord().fs_file)
>>>>>>>   File "/usr/share/vdsm/storage/fileSD.py", line 79, in
>>>>>>> validateDirAccess
>>>>>>>     raise se.StorageServerAccessPermissionError(dirPath)
>>>>>>> StorageServerAccessPermissionError: Permission settings on the
>>>>>>> specified path do not allow access to the storage. Verify permission
>>>>>>> settings on the specified storage path.: 'path =
>>>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
>>>>>>> jsonrpc.Executor/5::INFO::2016-07-28
>>>>>>> 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect:
>>>>>>> connectStorageServer, Return response: {'statuslist': [{'status': 469,
>>>>>>> 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare)
>>>>>>> Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist':
>>>>>>> [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>> 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState)
>>>>>>> Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing ->
>>>>>>> state finished
>>>>>>>
>>>>>>> I can manually mount the gluster volume on the same server.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Setup:
>>>>>>>>> engine running on a separate node
>>>>>>>>> 3 x kvm/glusterd nodes
>>>>>>>>>
>>>>>>>>> Status of volume: ovirt
>>>>>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Brick 172.16.0.11:/data/brick1/brick1       49152     0          Y       17304
>>>>>>>>> Brick 172.16.0.12:/data/brick3/brick3       49155     0          Y       9363
>>>>>>>>> Brick 172.16.0.13:/data/brick1/brick1       49152     0          Y       23684
>>>>>>>>> Brick 172.16.0.11:/data/brick2/brick2       49153     0          Y       17323
>>>>>>>>> Brick 172.16.0.12:/data/brick2/brick2       49153     0          Y       9382
>>>>>>>>> Brick 172.16.0.13:/data/brick2/brick2       49153     0          Y       23703
>>>>>>>>> NFS Server on localhost                     2049      0          Y       30508
>>>>>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       30521
>>>>>>>>> NFS Server on 172.16.0.11                   2049      0          Y       24999
>>>>>>>>> Self-heal Daemon on 172.16.0.11             N/A       N/A        Y       25016
>>>>>>>>> NFS Server on 172.16.0.13                   2049      0          Y       25379
>>>>>>>>> Self-heal Daemon on 172.16.0.13             N/A       N/A        Y       25509
>>>>>>>>>
>>>>>>>>> Task Status of Volume ovirt
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Task                 : Rebalance
>>>>>>>>> ID                   : 84d5ab2a-275e-421d-842b-928a9326c19a
>>>>>>>>> Status               : completed
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Siavash
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Users mailing list
>>>>>>>>> Users at ovirt.org
>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: oVirt_Engine_Web_Administration.png
Type: image/png
Size: 24260 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20160728/43e151b4/attachment-0001.png>

