[ovirt-users] Cannot find master domain

Siavash Safi siavash.safi at gmail.com
Thu Jul 28 15:43:25 UTC 2016


Yes, the dir is missing on all nodes. I only created it on node1 (node2 &
node3 were put in maintenance mode manually).

Yes, manual mount works fine:

[root@node1 ~]# /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34 4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm    0 Jul 27 23:05 __DIRECT_IO_TEST__
[root@node1 ~]# touch /mnt/test
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm  4096 Apr 26 19:34 4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm     0 Jul 27 23:05 __DIRECT_IO_TEST__
-rw-r--r--. 1 root root    0 Jul 28 20:10 test
[root@node1 ~]# chown vdsm:kvm /mnt/test
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34 4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm    0 Jul 27 23:05 __DIRECT_IO_TEST__
-rw-r--r--. 1 vdsm kvm    0 Jul 28 20:10 test
[root@node1 ~]# echo foo > /mnt/test
[root@node1 ~]# cat /mnt/test
foo
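
The test above ran as root, while vdsm runs as the vdsm user (uid 36), so
repeating the check as that user may be more telling. A minimal sketch,
assuming a POSIX shell; check_rwx is a hypothetical helper name, not part of
vdsm, and it only approximates whatever vdsm's validateDirAccess actually
verifies:

```shell
# check_rwx PATH
# Reports whether the user running this shell can read, write and
# enter the directory PATH.
# Returns 0 if all three are allowed, 1 if any is denied,
# 2 if PATH is not a directory (or cannot be reached at all).
check_rwx() {
    path=$1
    if [ ! -d "$path" ]; then
        echo "not a directory (or not reachable): $path"
        return 2
    fi
    # Probe each permission separately so the report names the one that fails.
    for flag in r w x; do
        if ! test "-$flag" "$path"; then
            echo "no -$flag access: $path"
            return 1
        fi
    done
    echo "ok: $path"
}
```

Saved as a script (here called check_rwx.sh, an assumed name) whose last line
is check_rwx "$1", it could be run while the volume is mounted as:
sudo -u vdsm sh check_rwx.sh /mnt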


On Thu, Jul 28, 2016 at 8:06 PM David Gossage <dgossage at carouselchecks.com>
wrote:

> On Thu, Jul 28, 2016 at 10:28 AM, Siavash Safi <siavash.safi at gmail.com>
> wrote:
>
>> I created the directory with correct permissions:
>> drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51
>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/
>>
>> It was removed after I tried to activate the storage from web.
>>
> Is dir missing on all 3 oVirt nodes?  Did you create it on all 3?
>
> When you did the test mount with oVirt's mount options, did permissions on
> files after mount look proper?  Can you read/write to the mount?
>
>
>> Engine displays the master storage as inactive:
>> [image: oVirt_Engine_Web_Administration.png]
>>
>>
>> On Thu, Jul 28, 2016 at 7:40 PM David Gossage <
>> dgossage at carouselchecks.com> wrote:
>>
>>> On Thu, Jul 28, 2016 at 10:00 AM, Siavash Safi <siavash.safi at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jul 28, 2016 at 7:19 PM David Gossage <
>>>> dgossage at carouselchecks.com> wrote:
>>>>
>>>>> On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.safi at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> file system: xfs
>>>>>> features.shard: off
>>>>>>
>>>>>
>>>>> Ok was just seeing if matched up to the issues latest 3.7.x releases
>>>>> have with zfs and sharding but doesn't look like your issue.
>>>>>
>>>>>  In your logs I see it mounts with these commands.  What happens if you
>>>>> use the same to a test dir?
>>>>>
>>>>>  /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13
>>>>> 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
>>>>>
>>>>
>>>> It mounts successfully:
>>>> [root@node1 ~]# /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
>>>> [root@node1 ~]# ls /mnt/
>>>> 4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__
>>>>
>>>>
>>>>> It then umounts it and complains short while later of permissions.
>>>>>
>>>>> StorageServerAccessPermissionError: Permission settings on the
>>>>> specified path do not allow access to the storage. Verify permission
>>>>> settings on the specified storage path.: 'path =
>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>>>>>
>>>>> Are the permissions of dirs to
>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected?
>>>>>
>>>>
>>>> /rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the
>>>> directory after the failure, to clean up?
>>>>
>>>
>>> Maybe, though I don't recall it ever being deleted unless you destroy
>>> or detach the storage. What if you create that directory with appropriate
>>> permissions on any node where it is missing, then try to activate storage?
>>>
>>> In engine is it still displaying the master storage domain?
>>>
>>>
>>>>> How about on the bricks, anything out of place?
>>>>>
>>>>
>>>> I didn't notice anything.
>>>>
>>>>
>>>>> Is gluster still using same options as before?  could it have reset
>>>>> the user and group to not be 36?
>>>>>
>>>>
>>>> All options seem to be correct. To make sure, I ran "Optimize for Virt
>>>> Store" from web.
>>>>
>>>> Volume Name: ovirt
>>>> Type: Distributed-Replicate
>>>> Volume ID: b224d9bc-d120-4fe1-b233-09089e5ca0b2
>>>> Status: Started
>>>> Number of Bricks: 2 x 3 = 6
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 172.16.0.11:/data/brick1/brick1
>>>> Brick2: 172.16.0.12:/data/brick3/brick3
>>>> Brick3: 172.16.0.13:/data/brick1/brick1
>>>> Brick4: 172.16.0.11:/data/brick2/brick2
>>>> Brick5: 172.16.0.12:/data/brick2/brick2
>>>> Brick6: 172.16.0.13:/data/brick2/brick2
>>>> Options Reconfigured:
>>>> performance.readdir-ahead: on
>>>> nfs.disable: off
>>>> user.cifs: enable
>>>> auth.allow: *
>>>> performance.quick-read: off
>>>> performance.read-ahead: off
>>>> performance.io-cache: off
>>>> performance.stat-prefetch: off
>>>> cluster.eager-lock: enable
>>>> network.remote-dio: enable
>>>> cluster.quorum-type: auto
>>>> cluster.server-quorum-type: server
>>>> storage.owner-uid: 36
>>>> storage.owner-gid: 36
>>>> server.allow-insecure: on
>>>> network.ping-timeout: 10
>>>>
>>>>
>>>>>> On Thu, Jul 28, 2016 at 7:03 PM David Gossage <
>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>
>>>>>>> On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <
>>>>>>> siavash.safi at gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <
>>>>>>>>> siavash.safi at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Issue: Cannot find master domain
>>>>>>>>>> Changes applied before issue started to happen: replaced
>>>>>>>>>> 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3,
>>>>>>>>>> did minor package upgrades for vdsm and glusterfs
>>>>>>>>>>
>>>>>>>>>> vdsm log: https://paste.fedoraproject.org/396842/
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Any errors in gluster's brick or server logs?  The client gluster
>>>>>>>>> logs from ovirt?
>>>>>>>>>
>>>>>>>> Brick errors:
>>>>>>>> [2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup] 0-ovirt-posix: null gfid for path (null)
>>>>>>>> [2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup] 0-ovirt-posix: lstat on null failed [Invalid argument]
>>>>>>>> (Both repeated many times)
>>>>>>>>
>>>>>>>> Server errors:
>>>>>>>> None
>>>>>>>>
>>>>>>>> Client errors:
>>>>>>>> None
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> yum log: https://paste.fedoraproject.org/396854/
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> What version of gluster was running prior to update to 3.7.13?
>>>>>>>>>
>>>>>>>> 3.7.11-1 from gluster.org repository (after the update, ovirt switched
>>>>>>>> to centos repository)
>>>>>>>>
>>>>>>>
>>>>>>> What file system do your bricks reside on and do you have sharding
>>>>>>> enabled?
>>>>>>>
>>>>>>>
>>>>>>>>> Did it create gluster mounts on server when attempting to start?
>>>>>>>>>
>>>>>>>> As I checked the master domain is not mounted on any nodes.
>>>>>>>> Restarting vdsmd generated following errors:
>>>>>>>>
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
>>>>>>>> directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
>>>>>>>> mode: None
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
>>>>>>>> Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>>>>>>>> --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope
>>>>>>>> --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o
>>>>>>>> backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt
>>>>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>>>>>>>> --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l
>>>>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>>>>>>>> jsonrpc.Executor/5::ERROR::2016-07-28
>>>>>>>> 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not
>>>>>>>> connect to storageServer
>>>>>>>> Traceback (most recent call last):
>>>>>>>>   File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
>>>>>>>>     conObj.connect()
>>>>>>>>   File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
>>>>>>>>     six.reraise(t, v, tb)
>>>>>>>>   File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
>>>>>>>>     self.getMountObj().getRecord().fs_file)
>>>>>>>>   File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
>>>>>>>>     raise se.StorageServerAccessPermissionError(dirPath)
>>>>>>>> StorageServerAccessPermissionError: Permission settings on the
>>>>>>>> specified path do not allow access to the storage. Verify permission
>>>>>>>> settings on the specified storage path.: 'path =
>>>>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
>>>>>>>> jsonrpc.Executor/5::INFO::2016-07-28
>>>>>>>> 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect:
>>>>>>>> connectStorageServer, Return response: {'statuslist': [{'status': 469,
>>>>>>>> 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare)
>>>>>>>> Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist':
>>>>>>>> [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>>>>>>>> jsonrpc.Executor/5::DEBUG::2016-07-28
>>>>>>>> 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState)
>>>>>>>> Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing ->
>>>>>>>> state finished
>>>>>>>>
>>>>>>>> I can manually mount the gluster volume on the same server.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Setup:
>>>>>>>>>> engine running on a separate node
>>>>>>>>>> 3 x kvm/glusterd nodes
>>>>>>>>>>
>>>>>>>>>> Status of volume: ovirt
>>>>>>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>> Brick 172.16.0.11:/data/brick1/brick1       49152     0          Y       17304
>>>>>>>>>> Brick 172.16.0.12:/data/brick3/brick3       49155     0          Y       9363
>>>>>>>>>> Brick 172.16.0.13:/data/brick1/brick1       49152     0          Y       23684
>>>>>>>>>> Brick 172.16.0.11:/data/brick2/brick2       49153     0          Y       17323
>>>>>>>>>> Brick 172.16.0.12:/data/brick2/brick2       49153     0          Y       9382
>>>>>>>>>> Brick 172.16.0.13:/data/brick2/brick2       49153     0          Y       23703
>>>>>>>>>> NFS Server on localhost                     2049      0          Y       30508
>>>>>>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       30521
>>>>>>>>>> NFS Server on 172.16.0.11                   2049      0          Y       24999
>>>>>>>>>> Self-heal Daemon on 172.16.0.11             N/A       N/A        Y       25016
>>>>>>>>>> NFS Server on 172.16.0.13                   2049      0          Y       25379
>>>>>>>>>> Self-heal Daemon on 172.16.0.13             N/A       N/A        Y       25509
>>>>>>>>>>
>>>>>>>>>> Task Status of Volume ovirt
>>>>>>>>>>
>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>> Task                 : Rebalance
>>>>>>>>>> ID                   : 84d5ab2a-275e-421d-842b-928a9326c19a
>>>>>>>>>> Status               : completed
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Siavash
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list
>>>>>>>>>> Users at ovirt.org
>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: oVirt_Engine_Web_Administration.png
Type: image/x-png
Size: 24260 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20160728/cec52cc6/attachment-0001.bin>

