On Fri, Feb 4, 2022 at 3:18 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
On Fri, Feb 04, 2022 at 03:09:02PM +0200, Nir Soffer wrote:
> On Fri, Feb 4, 2022 at 11:16 AM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> >
> > On Fri, Feb 04, 2022 at 08:42:08AM +0000, Richard W.M. Jones wrote:
> > > On Thu, Feb 03, 2022 at 06:31:52PM +0200, Nir Soffer wrote:
> > > > This is expected on oVirt, our multipath configuration is intentionally
> > > > grabbing any device that multipath can work with, even if the device
> > > > only has one path. The motivation is to be able to configure a system
> > > > when only one path is available (maybe you have an hba/network/server
> > > > issue), and once the other paths are available the system will use
> > > > them transparently.
> > > >
> > > > To avoid this issue with local devices, you need to blacklist the device.
> > > >
> > > > Add this file:
> > > >
> > > > $ cat /etc/multipath/conf.d/local.conf
> > > > blacklist {
> > > > wwid "QEMU HARDDISK"
> > > > }
> > >
> > > Thanks - for the mailing list record the syntax that worked for me is:
> > >
> > > # cat /etc/multipath/conf.d/local.conf
> > > blacklist {
> > > wwid ".*QEMU_HARDDISK.*"
> > > }
> > >
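
Note that after adding a file under /etc/multipath/conf.d/ multipathd usually
has to be told to reload its configuration before the blacklist takes effect.
Assuming multipathd is running as a service, something like this should do it,
after which the QEMU disk should no longer show up in the multipath topology:

$ sudo multipathd reconfigure
$ sudo multipath -ll
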
> > > > Configuring NFS on some other machine is easy.
> > > >
> > > > I'm using another VM for this, so I can easily test negative flows
> > > > like stopping or restarting the NFS server while it is being used by
> > > > vms or storage operations.
> > > > I'm using a 2G alpine vm for this, it works fine even with 1G memory.
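
A rough sketch of such an NFS setup, in case it helps: the export just needs to
be owned by vdsm:kvm (uid:gid 36:36) and exported read-write. The path and
export options below are only an example, adjust to taste:

$ sudo mkdir -p /export/data
$ sudo chown 36:36 /export/data
$ echo '/export/data *(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
$ sudo exportfs -ra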
> > >
> > > I think I can get local storage working now (I had it working before).
> >
> > Well finally it fails with:
> >
> > 2022-02-04 09:14:55,779Z ERROR [org.ovirt.engine.core.bll.storage.domain.AddPosixFsStorageDomainCommand]
> > (default task-2) [25a32edf] Command 'org.ovirt.engine.core.bll.storage.domain.AddPosixFsStorageDomainCommand'
> > failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> > VDSGenericException: VDSErrorException: Failed to CreateStorageDomainVDS, error = Could
> > not initialize cluster lock: (), code = 701 (Failed with error unexpected and code 16)
>
> The error "Could not initialize cluster lock" comes from vdsm. Usually
> engine log is
> not the best way to debug such failures. This is only the starting
> point and you need to
> go to the host and check vdsm and supervdsm logs in /var/log/vdsm/.
I can't really see anything relevant in supervdsm.log, it's all fairly neutral debug messages.
> Since this error comes from sanlock, we also may have useful info in /var/log/sanlock.log.
Interesting:
2022-02-04 13:15:27 16723 [826]: open error -13 EACCES: no permission to open /rhev/data-center/mnt/_dev_sdb1/13a731d2-e1d2-4998-9b02-ac46899e3159/dom_md/ids
2022-02-04 13:15:27 16723 [826]: check that daemon user sanlock 179 group sanlock 179 has access to disk or file.
The issue is selinux:

NFS domain:
$ ls -lhZ /rhev/data-center/mnt/alpine\:_01/e9467633-ee31-4e15-b3f8-3812b374c764/dom_md/
total 2.3M
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 1.0M Feb 4 15:32 ids
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 16M Jan 20 23:53 inbox
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 2.0M Jan 20 23:54 leases
-rw-r--r--. 1 vdsm kvm system_u:object_r:nfs_t:s0 354 Jan 20 23:54 metadata
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 16M Jan 20 23:53 outbox
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 1.3M Jan 20 23:53 xleases

The posix domain mount (mounted manually):
$ ls -lhZ mnt/689c22c4-e264-4873-aa75-1aa4970d4366/dom_md/
total 252K
-rw-rw----. 1 vdsm kvm system_u:object_r:unlabeled_t:s0 0 Feb 4 15:23 ids
-rw-rw----. 1 vdsm kvm system_u:object_r:unlabeled_t:s0 16M Feb 4 15:23 inbox
-rw-rw----. 1 vdsm kvm system_u:object_r:unlabeled_t:s0 0 Feb 4 15:23 leases
-rw-r--r--. 1 vdsm kvm system_u:object_r:unlabeled_t:s0 316 Feb 4 15:23 metadata
-rw-rw----. 1 vdsm kvm system_u:object_r:unlabeled_t:s0 16M Feb 4 15:23 outbox
-rw-rw----. 1 vdsm kvm system_u:object_r:unlabeled_t:s0 1.3M Feb 4 15:23 xleases
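
If it is not obvious that selinux is to blame, the audit log usually confirms
it. Assuming auditd is running, something like this should show the denials
hitting sanlock:

$ sudo ausearch -m avc -ts recent | grep sanlock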

Can be fixed with:
$ sudo chcon -R -t nfs_t mnt
$ ls -lhZ mnt/689c22c4-e264-4873-aa75-1aa4970d4366/dom_md/
total 252K
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 0 Feb 4 15:23 ids
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 16M Feb 4 15:23 inbox
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 0 Feb 4 15:23 leases
-rw-r--r--. 1 vdsm kvm system_u:object_r:nfs_t:s0 316 Feb 4 15:23 metadata
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 16M Feb 4 15:23 outbox
-rw-rw----. 1 vdsm kvm system_u:object_r:nfs_t:s0 1.3M Feb 4 15:23 xleases
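
Note that chcon changes do not survive a filesystem relabel. To make the
labeling persistent, one option is adding a file context rule; the path below
is just a placeholder for wherever the domain is actually mounted:

$ sudo semanage fcontext -a -t nfs_t "/path/to/mnt(/.*)?"
$ sudo restorecon -Rv /path/to/mnt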

After this change, delete the storage domain directory, since vdsm will refuse
to create a storage domain in a non-empty mount:

$ rm -rf mnt/689c22c4-e264-4873-aa75-1aa4970d4366

Then recreate the storage domain in engine.

Works for me with 4.5:

$ mount | grep /dev/sda1
/dev/sda1 on /rhev/data-center/mnt/_dev_sda1 type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
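
Once the domain is attached, something like this on the host should show
sanlock holding the new lockspace (assuming sanlock is running):

$ sudo sanlock client status
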
Nir