
Hi,

Issue: Cannot find master domain
Changes applied before issue started to happen: replaced 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3, did minor package upgrades for vdsm and glusterfs

vdsm log: https://paste.fedoraproject.org/396842/
yum log: https://paste.fedoraproject.org/396854/

Setup:
engine running on a separate node
3 x kvm/glusterd nodes

Status of volume: ovirt
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 172.16.0.11:/data/brick1/brick1       49152     0          Y       17304
Brick 172.16.0.12:/data/brick3/brick3       49155     0          Y       9363
Brick 172.16.0.13:/data/brick1/brick1       49152     0          Y       23684
Brick 172.16.0.11:/data/brick2/brick2       49153     0          Y       17323
Brick 172.16.0.12:/data/brick2/brick2       49153     0          Y       9382
Brick 172.16.0.13:/data/brick2/brick2       49153     0          Y       23703
NFS Server on localhost                     2049      0          Y       30508
Self-heal Daemon on localhost               N/A       N/A        Y       30521
NFS Server on 172.16.0.11                   2049      0          Y       24999
Self-heal Daemon on 172.16.0.11             N/A       N/A        Y       25016
NFS Server on 172.16.0.13                   2049      0          Y       25379
Self-heal Daemon on 172.16.0.13             N/A       N/A        Y       25509

Task Status of Volume ovirt
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 84d5ab2a-275e-421d-842b-928a9326c19a
Status               : completed

Thanks,
Siavash
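For reference, a brick replacement like the one described above is normally done with gluster's replace-brick command; a sketch of what was presumably run (the exact command is not shown in the thread):

# on any node of the trusted pool: swap the old brick for the new one;
# self-heal then repopulates the new brick from the other replicas
gluster volume replace-brick ovirt \
    172.16.0.12:/data/brick1/brick1 172.16.0.12:/data/brick3/brick3 \
    commit force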

On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
Hi,
Issue: Cannot find master domain
Changes applied before issue started to happen: replaced 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3, did minor package upgrades for vdsm and glusterfs
vdsm log: https://paste.fedoraproject.org/396842/
Any errors in the gluster brick or server logs? The client gluster logs from oVirt?
What version of gluster was running prior to update to 3.7.13? Did it create gluster mounts on server when attempting to start?
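The logs being asked about are usually found in the default locations below (a sketch; paths assume stock GlusterFS/oVirt installs, and the client log file name is derived from the mount point):

# on each gluster node: brick and glusterd logs
grep ' E ' /var/log/glusterfs/bricks/*.log
grep ' E ' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
# on the oVirt hosts: the FUSE client log for the ovirt volume mount
grep ' E ' '/var/log/glusterfs/rhev-data-center-mnt-glusterSD-172.16.0.11:_ovirt.log'
# installed package versions
rpm -q glusterfs glusterfs-server glusterfs-fuse vdsm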

On Thu, Jul 28, 2016 at 6:29 PM David Gossage <dgossage@carouselchecks.com> wrote:
Any errors in the gluster brick or server logs? The client gluster logs from oVirt?
Brick errors:
[2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup] 0-ovirt-posix: null gfid for path (null)
[2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup] 0-ovirt-posix: lstat on null failed [Invalid argument]
(Both repeated many times)

Server errors: None
Client errors: None
What version of gluster was running prior to update to 3.7.13?
3.7.11-1 from the gluster.org repository (after the update, oVirt switched to the CentOS repository)
Did it create gluster mounts on server when attempting to start?
As I checked, the master domain is not mounted on any of the nodes. Restarting vdsmd generated the following errors:

jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option) Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
jsonrpc.Executor/5::ERROR::2016-07-28 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
    six.reraise(t, v, tb)
  File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
    self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
    raise se.StorageServerAccessPermissionError(dirPath)
StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
jsonrpc.Executor/5::INFO::2016-07-28 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare) Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist': [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
jsonrpc.Executor/5::DEBUG::2016-07-28 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState) Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing -> state finished

I can manually mount the gluster volume on the same server.
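The traceback points at vdsm's validateDirAccess check rather than at the mount command itself. A rough way to reproduce that check by hand, assuming vdsm:kvm map to uid/gid 36 as configured on the volume (a sketch; the test filename is made up for illustration):

# mount the volume the same way vdsm does, then probe it as the vdsm user
mkdir -p '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
/usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 \
    172.16.0.11:/ovirt '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
sudo -u vdsm ls -ld '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
sudo -u vdsm touch '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/__access_test__' && \
    sudo -u vdsm rm -f '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/__access_test__'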

On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
What version of gluster was running prior to update to 3.7.13?
3.7.11-1 from the gluster.org repository (after the update, oVirt switched to the CentOS repository)
What file system do your bricks reside on and do you have sharding enabled?

file system: xfs
features.shard: off

On Thu, Jul 28, 2016 at 7:03 PM David Gossage <dgossage@carouselchecks.com> wrote:
What file system do your bricks reside on and do you have sharding enabled?
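Both of these can be confirmed per node with something like the following (a sketch; brick paths taken from the volume status above):

# filesystem backing the bricks
df -Th /data/brick1/brick1 /data/brick2/brick2
xfs_info /data/brick1/brick1
# sharding setting on the volume (listed under "Options Reconfigured" only if it was ever set)
gluster volume info ovirt | grep -i shard
gluster volume get ovirt features.shard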

On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
file system: xfs
features.shard: off
Ok, I was just checking whether this matched the issues the latest 3.7.x releases have with ZFS and sharding, but that doesn't look like your issue.

In your logs I see it mounts with these commands. What happens if you use the same command on a test dir?

/usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt

It then umounts it and complains a short while later about permissions:

StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'

Are the permissions of the directories down to /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected? How about on the bricks, anything out of place? Is gluster still using the same options as before? Could it have reset the user and group to something other than 36?
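A quick way to run through those checks on each host (a sketch; paths from the logs above, ownership expected to be 36:36, i.e. vdsm:kvm):

# mount-point hierarchy on the oVirt hosts
ls -ld /rhev/data-center/mnt/glusterSD '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
# brick roots on each gluster node (brick paths differ per node)
ls -ldn /data/brick*/brick*
# ownership-related volume options
gluster volume info ovirt | grep -E 'storage\.owner|allow-insecure'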

On Thu, Jul 28, 2016 at 7:19 PM David Gossage <dgossage@carouselchecks.com> wrote:
In your logs I see it mounts with these commands. What happens if you use the same command on a test dir?
/usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
It mounts successfully:
[root@node1 ~]# /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
[root@node1 ~]# ls /mnt/
4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__
Are the permissions of dirs to /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected?
/rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the directory after the failure, to clean up?

How about on the bricks, anything out of place?
I didn't notice anything.
Is gluster still using same options as before? could it have reset the user and group to not be 36?
All options seem to be correct; to make sure, I ran "Optimize for Virt Store" from the web UI.

Volume Name: ovirt
Type: Distributed-Replicate
Volume ID: b224d9bc-d120-4fe1-b233-09089e5ca0b2
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 172.16.0.11:/data/brick1/brick1
Brick2: 172.16.0.12:/data/brick3/brick3
Brick3: 172.16.0.13:/data/brick1/brick1
Brick4: 172.16.0.11:/data/brick2/brick2
Brick5: 172.16.0.12:/data/brick2/brick2
Brick6: 172.16.0.13:/data/brick2/brick2
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: off
user.cifs: enable
auth.allow: *
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
server.allow-insecure: on
network.ping-timeout: 10
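For completeness, if the ownership options had been dropped by the upgrade or the brick replacement (the listing above shows they are still set), they could be reapplied from the CLI with something like:

gluster volume set ovirt storage.owner-uid 36
gluster volume set ovirt storage.owner-gid 36
# sanity check on the new brick's root (run on 172.16.0.12; numeric owner should be 36:36)
ls -ldn /data/brick3/brick3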

On Thu, Jul 28, 2016 at 10:00 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
/rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the directory after the failure, to clean up?
Maybe, though I don't recall it ever being deleted unless you destroy or detach the storage. What if you create that directory with the appropriate permissions on any node where it is missing, then try to activate the storage? In the engine, is it still displaying the master storage domain?
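A sketch of recreating the mount point as suggested (ownership and mode guessed from a healthy setup; vdsm normally creates this directory itself):

mkdir -p '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
chown 36:36 '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'   # vdsm:kvm
chmod 755 '/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'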

I created the directory with the correct permissions:
drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51 /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/
It was removed after I tried to activate the storage from the web UI.

Engine displays the master storage domain as inactive:
[image: oVirt_Engine_Web_Administration.png]

On Thu, Jul 28, 2016 at 7:40 PM David Gossage <dgossage@carouselchecks.com> wrote:
Maybe, though I don't recall it ever being deleted unless you destroy or detach the storage. What if you create that directory with the appropriate permissions on any node where it is missing, then try to activate the storage?
In engine is it still displaying the master storage domain?
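Since the directory disappears again when the domain is activated, it may help to watch the vdsm log on the host while clicking Activate, to see which step removes it (a sketch; default vdsm log location assumed):

tail -f /var/log/vdsm/vdsm.log | grep --line-buffered -E -i 'connectStorage|mount|ERROR'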

On Thu, Jul 28, 2016 at 10:28 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
I created the directory with the correct permissions:
drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51 /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/
It was removed after I tried to activate the storage from web.
Is the dir missing on all 3 oVirt nodes? Did you create it on all 3?
When you did the test mount with oVirt's mount options, did the permissions on the files after the mount look proper? Can you read from and write to the mount?

Yes, the dir is missing on all nodes. I only created it on node1 (node2 & node3 were put in maintenance mode manually)

Yes, manual mount works fine:
[root@node1 ~]# /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34 4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm 0 Jul 27 23:05 __DIRECT_IO_TEST__
[root@node1 ~]# touch /mnt/test
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34 4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm 0 Jul 27 23:05 __DIRECT_IO_TEST__
-rw-r--r--. 1 root root 0 Jul 28 20:10 test
[root@node1 ~]# chown vdsm:kvm /mnt/test
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34 4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm 0 Jul 27 23:05 __DIRECT_IO_TEST__
-rw-r--r--. 1 vdsm kvm 0 Jul 28 20:10 test
[root@node1 ~]# echo foo > /mnt/test
[root@node1 ~]# cat /mnt/test
foo

On Thu, Jul 28, 2016 at 8:06 PM David Gossage <dgossage@carouselchecks.com> wrote:
On Thu, Jul 28, 2016 at 10:28 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
I created the directory with correct permissions: drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51 /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/
It was removed after I tried to activate the storage from the web UI.
Is the dir missing on all 3 oVirt nodes? Did you create it on all 3?
When you did the test mount with oVirt's mount options, did the permissions on the files look proper after mounting? Can you read/write to the mount?
Engine displays the master storage as inactive: [image: oVirt_Engine_Web_Administration.png]
On Thu, Jul 28, 2016 at 7:40 PM David Gossage < dgossage@carouselchecks.com> wrote:
On Thu, Jul 28, 2016 at 10:00 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
On Thu, Jul 28, 2016 at 7:19 PM David Gossage < dgossage@carouselchecks.com> wrote:
On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
file system: xfs
features.shard: off
OK, I was just checking whether it matched the issues the latest 3.7.x releases have with ZFS and sharding, but that doesn't look like your issue.
In your logs I see it mounts with these commands. What happens if you run the same mount against a test dir?
/usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
It mounts successfully:
[root@node1 ~]# /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
[root@node1 ~]# ls /mnt/
4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__
It then unmounts it and complains a short while later about permissions.
StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
Are the permissions of the directories leading to /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected?
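One way to check every directory component along that path in a single command, as a sketch (not something run in the original thread):

# namei -l prints the owner, group and mode of each path component
namei -l /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt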
/rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the directory to clean up after the failure?
Maybe, though I don't recall it ever being deleted unless you destroy or detach the storage. What if you create that directory with appropriate permissions on any node where it's missing, then try to activate the storage?
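A minimal sketch of that suggestion, assuming the same vdsm:kvm ownership and 755 mode the other storage domain directories use:

# recreate the missing mount directory on each node where it is absent
mkdir -p /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
chown vdsm:kvm /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
chmod 755 /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt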
In the engine, is it still displaying the master storage domain?

How about on the bricks, anything out of place?
I didn't notice anything.
Is gluster still using the same options as before? Could it have reset the owner user and group to something other than 36?
All options seem to be correct; to make sure, I ran "Optimize for Virt Store" from the web UI.
Volume Name: ovirt
Type: Distributed-Replicate
Volume ID: b224d9bc-d120-4fe1-b233-09089e5ca0b2
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 172.16.0.11:/data/brick1/brick1
Brick2: 172.16.0.12:/data/brick3/brick3
Brick3: 172.16.0.13:/data/brick1/brick1
Brick4: 172.16.0.11:/data/brick2/brick2
Brick5: 172.16.0.12:/data/brick2/brick2
Brick6: 172.16.0.13:/data/brick2/brick2
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: off
user.cifs: enable
auth.allow: *
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
server.allow-insecure: on
network.ping-timeout: 10
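For reference, the owner options can also be queried per volume and, only if they had actually been reset, put back; a sketch using this volume's name:

gluster volume get ovirt storage.owner-uid   # should report 36 (vdsm)
gluster volume get ovirt storage.owner-gid   # should report 36 (kvm)
# only needed if the values above are not 36:
gluster volume set ovirt storage.owner-uid 36
gluster volume set ovirt storage.owner-gid 36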
On Thu, Jul 28, 2016 at 7:03 PM David Gossage < dgossage@carouselchecks.com> wrote:
> On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
>
>>> What version of gluster was running prior to update to 3.7.13?
>>
>> 3.7.11-1 from gluster.org repository(after update ovirt switched to centos repository)
>
> What file system do your bricks reside on and do you have sharding enabled?

Error from vdsm log: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'

I remember another thread about a similar issue - can you check the ACL settings on the storage path?

[root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/
drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28 /rhev/data-center/mnt/glusterSD/
[root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/
getfacl: Removing leading '/' from absolute path names
# file: rhev/data-center/mnt/glusterSD/
# owner: vdsm
# group: kvm
user::rwx
group::r-x
other::r-x

And as I mentioned in another message, the directory is empty.

On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sabose@redhat.com> wrote:
Error from vdsm log: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
I remember another thread about a similar issue - can you check the ACL settings on the storage path?

----- Original Message -----
From: "Siavash Safi" <siavash.safi@gmail.com> To: "Sahina Bose" <sabose@redhat.com> Cc: "David Gossage" <dgossage@carouselchecks.com>, "users" <users@ovirt.org> Sent: Thursday, July 28, 2016 8:35:18 PM Subject: Re: [ovirt-users] Cannot find master domain
[root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/ drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28 /rhev/data-center/mnt/glusterSD/ [root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/ getfacl: Removing leading '/' from absolute path names # file: rhev/data-center/mnt/glusterSD/ # owner: vdsm # group: kvm user::rwx group::r-x other::r-x
The ACLs look correct to me. Adding Nir/Allon for insights. Can you attach the gluster mount logs from this host?
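If it helps to locate them: by default the FUSE client log for a mount sits under /var/log/glusterfs/, named after the mount point with '/' replaced by '-' (an assumption about the default log location, not something stated in the thread):

ls -l /var/log/glusterfs/rhev-data-center-mnt-glusterSD-172.16.0.11:_ovirt.log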

Please check the attachment. On Thu, Jul 28, 2016 at 7:46 PM Sahina Bose <sabose@redhat.com> wrote:
The ACLs look correct to me. Adding Nir/Allon for insights.

Can you attach the gluster mount logs from this host?

----- Original Message -----
From: "Siavash Safi" <siavash.safi@gmail.com> To: "Sahina Bose" <sabose@redhat.com> Cc: "David Gossage" <dgossage@carouselchecks.com>, "users" <users@ovirt.org>, "Nir Soffer" <nsoffer@redhat.com>, "Allon Mureinik" <amureini@redhat.com> Sent: Thursday, July 28, 2016 9:04:32 PM Subject: Re: [ovirt-users] Cannot find master domain
Please check the attachment.
Nothing out of place in the mount logs. Can you ensure the brick dir permissions are vdsm:kvm - even for the brick that was replaced?
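A quick way to compare ownership and modes across all brick roots, as a sketch using the brick paths from the volume status above:

# run on each gluster node; every brick root should show vdsm:kvm and drwxr-xr-x
stat -c '%U:%G %A %n' /data/brick*/brick*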

It seems that dir modes are wrong!?

[root@node1 ~]# ls -ld /data/brick*/brick*
drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2

[root@node2 ~]# ls -ld /data/brick*/brick*
drwxr-xr-x. 5 vdsm kvm 107 Apr 26 19:33 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick3/brick3

[root@node3 ~]# ls -ld /data/brick*/brick*
drw-------. 5 vdsm kvm 107 Jul 28 20:10 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2

On Thu, Jul 28, 2016 at 9:06 PM Sahina Bose <sabose@redhat.com> wrote:
Nothing out of place in the mount logs.

Can you ensure the brick dir permissions are vdsm:kvm - even for the brick that was replaced?

On Thu, Jul 28, 2016 at 11:44 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
It seems that dir modes are wrong!?

[root@node1 ~]# ls -ld /data/brick*/brick*
drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2

[root@node2 ~]# ls -ld /data/brick*/brick*
drwxr-xr-x. 5 vdsm kvm 107 Apr 26 19:33 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick3/brick3

[root@node3 ~]# ls -ld /data/brick*/brick*
drw-------. 5 vdsm kvm 107 Jul 28 20:10 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
That would probably do it. The kvm group gets no read access, and the lack of x on the directories isn't great either. I'd think, since they are the bricks, you could manually chmod them to an appropriate 755. I "think" gluster only tracks files under /data/brick1/brick1/*, not /data/brick1/brick1 itself.

/data/brick3/brick3 is the only "new" one? I wonder if it could be a bug from the brick move in some way. Might be something worth posting to the gluster list about.
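A sketch of that fix, assuming each node only has the brick directories shown in its own listing above:

# run on each node for the brick roots that currently show drw-------
chmod 755 /data/brick1/brick1 /data/brick2/brick2
chmod 755 /data/brick3/brick3   # node2 only, per the listing above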
On Thu, Jul 28, 2016 at 9:06 PM Sahina Bose <sabose@redhat.com> wrote:
----- Original Message -----
From: "Siavash Safi" <siavash.safi@gmail.com> To: "Sahina Bose" <sabose@redhat.com> Cc: "David Gossage" <dgossage@carouselchecks.com>, "users" < users@ovirt.org>, "Nir Soffer" <nsoffer@redhat.com>, "Allon Mureinik" <amureini@redhat.com> Sent: Thursday, July 28, 2016 9:04:32 PM Subject: Re: [ovirt-users] Cannot find master domain
Please check the attachment.
Nothing out of place in the mount logs.
Can you ensure the brick dir permissions are vdsm:kvm - even for the brick that was replaced?
On Thu, Jul 28, 2016 at 7:46 PM Sahina Bose <sabose@redhat.com> wrote:
----- Original Message -----
From: "Siavash Safi" <siavash.safi@gmail.com> To: "Sahina Bose" <sabose@redhat.com> Cc: "David Gossage" <dgossage@carouselchecks.com>, "users" < users@ovirt.org> Sent: Thursday, July 28, 2016 8:35:18 PM Subject: Re: [ovirt-users] Cannot find master domain
[root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/ drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28
[root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/ getfacl: Removing leading '/' from absolute path names # file: rhev/data-center/mnt/glusterSD/ # owner: vdsm # group: kvm user::rwx group::r-x other::r-x
The ACLs look correct to me. Adding Nir/Allon for insights.
Can you attach the gluster mount logs from this host?
And as I mentioned in another message, the directory is empty.
On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sabose@redhat.com> wrote:
Error from vdsm log: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/glusterSD/ 172.16.0.11: _ovirt'
I remember another thread about a similar issue - can you check
settings on the storage path?
----- Original Message ----- > From: "Siavash Safi" <siavash.safi@gmail.com> > To: "David Gossage" <dgossage@carouselchecks.com> > Cc: "users" <users@ovirt.org> > Sent: Thursday, July 28, 2016 7:58:29 PM > Subject: Re: [ovirt-users] Cannot find master domain > > > > On Thu, Jul 28, 2016 at 6:29 PM David Gossage < dgossage@carouselchecks.com > > wrote: > > > > On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi < siavash.safi@gmail.com > > wrote: > > > > Hi, > > Issue: Cannot find master domain > Changes applied before issue started to happen: replaced > 172.16.0.12:/data/brick1/brick1 with 172.16.0.12: /data/brick3/brick3, did > minor package upgrades for vdsm and glusterfs > > vdsm log: https://paste.fedoraproject.org/396842/ > > > Any errrors in glusters brick or server logs? The client gluster logs from > ovirt? > Brick errors: > [2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup] > 0-ovirt-posix: null gfid for path (null) > [2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup] > 0-ovirt-posix: lstat on null failed [Invalid argument] > (Both repeated many times) > > Server errors: > None > > Client errors: > None > > > > > > > > yum log: https://paste.fedoraproject.org/396854/ > > What version of gluster was running prior to update to 3.7.13? > 3.7.11-1 from gluster.org repository(after update ovirt switched to centos > repository) > > > > > Did it create gluster mounts on server when attempting to start? > As I checked the master domain is not mounted on any nodes. > Restarting vdsmd generated following errors: > > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating > directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None > jsonrpc.Executor/5::DEBUG::2016-07-28 >
18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
> Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13'] > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope > --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o > backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11: /ovirt > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None) > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess... > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None) > jsonrpc.Executor/5::ERROR::2016-07-28 > 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not > connect to storageServer > Traceback (most recent call last): > File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer > conObj.connect() > File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect > six.reraise(t, v, tb) > File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect > self.getMountObj().getRecord().fs_file) > File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess > raise se.StorageServerAccessPermissionError(dirPath) > StorageServerAccessPermissionError: Permission settings on the specified path > do not allow access to the storage. Verify permission settings on the > specified storage path.: 'path = > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt' > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {} > jsonrpc.Executor/5::INFO::2016-07-28 > 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and
protect:
> connectStorageServer, Return response: {'statuslist': [{'status': 469, 'id': > u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]} > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare) > Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist': > [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]} > jsonrpc.Executor/5::DEBUG::2016-07-28 > 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState) > Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing -> > state finished > > I can manually mount the gluster volume on the same server. > > > > > > > > > > > > > > Setup: > engine running on a separate node > 3 x kvm/glusterd nodes > > Status of volume: ovirt > Gluster process TCP Port RDMA Port Online Pid >
> Brick 172.16.0.11:/data/brick1/brick1 49152 0 Y 17304 > Brick 172.16.0.12:/data/brick3/brick3 49155 0 Y 9363 > Brick 172.16.0.13:/data/brick1/brick1 49152 0 Y 23684 > Brick 172.16.0.11:/data/brick2/brick2 49153 0 Y 17323 > Brick 172.16.0.12:/data/brick2/brick2 49153 0 Y 9382 > Brick 172.16.0.13:/data/brick2/brick2 49153 0 Y 23703 > NFS Server on localhost 2049 0 Y 30508 > Self-heal Daemon on localhost N/A N/A Y 30521 > NFS Server on 172.16.0.11 2049 0 Y 24999 > Self-heal Daemon on 172.16.0.11 N/A N/A Y 25016 > NFS Server on 172.16.0.13 2049 0 Y 25379 > Self-heal Daemon on 172.16.0.13 N/A N/A Y 25509 > > Task Status of Volume ovirt >
> Task : Rebalance > ID : 84d5ab2a-275e-421d-842b-928a9326c19a > Status : completed > > Thanks, > Siavash > > > > _______________________________________________ > Users mailing list > Users@ovirt.org > http://lists.ovirt.org/mailman/listinfo/users > > > > _______________________________________________ > Users mailing list > Users@ovirt.org > http://lists.ovirt.org/mailman/listinfo/users >

Thanks for the hint guys :)

Running the following commands on all nodes resolved the issue:
chmod u+rxw /data/brick*/brick*
chmod g+rx /data/brick*/brick*

I wonder if the upgrade process from gluster.org to centos.org packages caused this issue. I have another oVirt cluster running in a data center that had similar package versions before this upgrade. I can try to upgrade one of the nodes there next week and see if the directory permission issue happens again. I will notify the gluster list and/or open a bug report for the CentOS package maintainers when I finish my test.

On Thu, Jul 28, 2016 at 9:29 PM David Gossage <dgossage@carouselchecks.com> wrote:
On Thu, Jul 28, 2016 at 11:44 AM, Siavash Safi <siavash.safi@gmail.com> wrote:
It seems that dir modes are wrong!?

[root@node1 ~]# ls -ld /data/brick*/brick*
drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2

[root@node2 ~]# ls -ld /data/brick*/brick*
drwxr-xr-x. 5 vdsm kvm 107 Apr 26 19:33 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
drw-------. 5 vdsm kvm 107 Jul 28 20:13 /data/brick3/brick3

[root@node3 ~]# ls -ld /data/brick*/brick*
drw-------. 5 vdsm kvm 107 Jul 28 20:10 /data/brick1/brick1
drw-------. 5 vdsm kvm 82 Jul 27 23:08 /data/brick2/brick2
That would probably do it. The kvm group needs read access, and the missing x on the directories isn't great either. Since they are the bricks, I'd think you could manually chmod them to something appropriate like 755. I "think" gluster only tracks files under /data/brick1/brick1/*, not /data/brick1/brick1 itself.
/data/brick3/brick3 is the only "new" one? I wonder if it could be a bug from the brick move in some way. Might be something worth posting to the gluster list about.
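For reference, a rough sketch of that fix, run as root on each node (assuming the untouched brick on node2 -- drwxr-xr-x, vdsm:kvm -- is the state to match; only the brick root directories are changed, not the data under them):

# show the current mode and owner of the brick roots on this node
stat -c '%A %U:%G %n' /data/brick*/brick*

# reset the brick roots to match the healthy brick (755, vdsm:kvm)
chmod 755 /data/brick*/brick*
chown vdsm:kvm /data/brick*/brick*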
On Thu, Jul 28, 2016 at 9:06 PM Sahina Bose <sabose@redhat.com> wrote:
----- Original Message -----
From: "Siavash Safi" <siavash.safi@gmail.com> To: "Sahina Bose" <sabose@redhat.com> Cc: "David Gossage" <dgossage@carouselchecks.com>, "users" < users@ovirt.org>, "Nir Soffer" <nsoffer@redhat.com>, "Allon Mureinik" <amureini@redhat.com> Sent: Thursday, July 28, 2016 9:04:32 PM Subject: Re: [ovirt-users] Cannot find master domain
Please check the attachment.
Nothing out of place in the mount logs.
Can you ensure the brick dir permissions are vdsm:kvm - even for the brick that was replaced?
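One quick way to check that across all three nodes might be something like this (just a sketch, assuming root SSH access from one host to the others; the brick paths are the ones shown in the volume status):

for h in 172.16.0.11 172.16.0.12 172.16.0.13; do
    echo "== $h =="
    ssh root@"$h" 'stat -c "%U:%G %A %n" /data/brick*/brick*'
done

Any brick root not reporting vdsm:kvm with drwxr-xr-x would be suspect.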
On Thu, Jul 28, 2016 at 7:46 PM Sahina Bose <sabose@redhat.com> wrote:
----- Original Message -----
From: "Siavash Safi" <siavash.safi@gmail.com> To: "Sahina Bose" <sabose@redhat.com> Cc: "David Gossage" <dgossage@carouselchecks.com>, "users" < users@ovirt.org> Sent: Thursday, July 28, 2016 8:35:18 PM Subject: Re: [ovirt-users] Cannot find master domain
[root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/
drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28

[root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/
getfacl: Removing leading '/' from absolute path names
# file: rhev/data-center/mnt/glusterSD/
# owner: vdsm
# group: kvm
user::rwx
group::r-x
other::r-x
The ACLs look correct to me. Adding Nir/Allon for insights.
Can you attach the gluster mount logs from this host?
And as I mentioned in another message, the directory is empty.
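A quick way to confirm whether the domain mount is there at all, and to retry the exact mount vdsm attempts (sketch; the command and options are taken from the vdsm log above):

# is anything mounted at the domain path?
findmnt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt

# retry the same mount vdsm runs, by hand
mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13 \
    172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt

If the manual mount succeeds but vdsm still fails with StorageServerAccessPermissionError, that would point at the permission check on the mounted directory (validateDirAccess in the traceback above) rather than at the mount itself.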
On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sabose@redhat.com> wrote:
> Error from vdsm log: Permission settings on the specified path do not > allow access to the storage. Verify permission settings on the specified > storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt' > > I remember another thread about a similar issue - can you check the ACL
> settings on the storage path? > > ----- Original Message ----- > > From: "Siavash Safi" <siavash.safi@gmail.com> > > To: "David Gossage" <dgossage@carouselchecks.com> > > Cc: "users" <users@ovirt.org> > > Sent: Thursday, July 28, 2016 7:58:29 PM > > Subject: Re: [ovirt-users] Cannot find master domain > > > > > > > > On Thu, Jul 28, 2016 at 6:29 PM David Gossage < > dgossage@carouselchecks.com > > > wrote: > > > > > > > > On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi < siavash.safi@gmail.com > > > wrote: > > > > > > > > Hi, > > > > Issue: Cannot find master domain > > Changes applied before issue started to happen: replaced > > 172.16.0.12:/data/brick1/brick1 with 172.16.0.12: /data/brick3/brick3, > did > > minor package upgrades for vdsm and glusterfs > > > > vdsm log: https://paste.fedoraproject.org/396842/ > > > > > > Any errrors in glusters brick or server logs? The client gluster logs > from > > ovirt? > > Brick errors: > > [2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup] > > 0-ovirt-posix: null gfid for path (null) > > [2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup] > > 0-ovirt-posix: lstat on null failed [Invalid argument] > > (Both repeated many times) > > > > Server errors: > > None > > > > Client errors: > > None > > > > > > > > > > > > > > > > yum log: https://paste.fedoraproject.org/396854/ > > > > What version of gluster was running prior to update to 3.7.13? > > 3.7.11-1 from gluster.org repository(after update ovirt switched to > centos > > repository) > > > > > > > > > > Did it create gluster mounts on server when attempting to start? > > As I checked the master domain is not mounted on any nodes. > > Restarting vdsmd generated following errors: > > > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating > > directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > >
18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
> > Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13'] > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset > > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope > > --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o > > backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11: /ovirt > > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None) > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting > IOProcess... > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset > > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l > > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None) > > jsonrpc.Executor/5::ERROR::2016-07-28 > > 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not > > connect to storageServer > > Traceback (most recent call last): > > File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer > > conObj.connect() > > File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect > > six.reraise(t, v, tb) > > File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect > > self.getMountObj().getRecord().fs_file) > > File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess > > raise se.StorageServerAccessPermissionError(dirPath) > > StorageServerAccessPermissionError: Permission settings on the specified > path > > do not allow access to the storage. Verify permission settings on the > > specified storage path.: 'path = > > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt' > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {} > > jsonrpc.Executor/5::INFO::2016-07-28 > > 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and
protect:
> > connectStorageServer, Return response: {'statuslist': [{'status': 469, > 'id': > > u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]} > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare) > > Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist': > > [{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]} > > jsonrpc.Executor/5::DEBUG::2016-07-28 > > 18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState) > > Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing > -> > > state finished > > > > I can manually mount the gluster volume on the same server. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Setup: > > engine running on a separate node > > 3 x kvm/glusterd nodes > > > > Status of volume: ovirt > > Gluster process TCP Port RDMA Port Online Pid > > >
> > Brick 172.16.0.11:/data/brick1/brick1 49152 0 Y 17304 > > Brick 172.16.0.12:/data/brick3/brick3 49155 0 Y 9363 > > Brick 172.16.0.13:/data/brick1/brick1 49152 0 Y 23684 > > Brick 172.16.0.11:/data/brick2/brick2 49153 0 Y 17323 > > Brick 172.16.0.12:/data/brick2/brick2 49153 0 Y 9382 > > Brick 172.16.0.13:/data/brick2/brick2 49153 0 Y 23703 > > NFS Server on localhost 2049 0 Y 30508 > > Self-heal Daemon on localhost N/A N/A Y 30521 > > NFS Server on 172.16.0.11 2049 0 Y 24999 > > Self-heal Daemon on 172.16.0.11 N/A N/A Y 25016 > > NFS Server on 172.16.0.13 2049 0 Y 25379 > > Self-heal Daemon on 172.16.0.13 N/A N/A Y 25509 > > > > Task Status of Volume ovirt > > >
> > Task : Rebalance > > ID : 84d5ab2a-275e-421d-842b-928a9326c19a > > Status : completed > > > > Thanks, > > Siavash > > > > > > > > _______________________________________________ > > Users mailing list > > Users@ovirt.org > > http://lists.ovirt.org/mailman/listinfo/users > > > > > > > > _______________________________________________ > > Users mailing list > > Users@ovirt.org > > http://lists.ovirt.org/mailman/listinfo/users > > >