On Fri, Feb 4, 2022 at 3:18 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
On Fri, Feb 04, 2022 at 03:09:02PM +0200, Nir Soffer wrote:
> On Fri, Feb 4, 2022 at 11:16 AM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> >
> > On Fri, Feb 04, 2022 at 08:42:08AM +0000, Richard W.M. Jones wrote:
> > > On Thu, Feb 03, 2022 at 06:31:52PM +0200, Nir Soffer wrote:
> > > > This is expected on oVirt, our multipath configuration is intentionally
> > > > grabbing any device that multipath can work with, even if the device only
> > > > has one path. The motivation is to be able to configure a system when only
> > > > one path is available (maybe you have an hba/network/server issue), and
> > > > once the other paths are available the system will use them transparently.
> > > >
> > > > To avoid this issue with local devices, you need to blacklist the device.
> > > >
> > > > Add this file:
> > > >
> > > > $ cat /etc/multipath/conf.d/local.conf
> > > > blacklist {
> > > >     wwid "QEMU HARDDISK"
> > > > }
> > >
> > > Thanks - for the mailing list record, the syntax that worked for me is:
> > >
> > > # cat /etc/multipath/conf.d/local.conf
> > > blacklist {
> > >     wwid ".*QEMU_HARDDISK.*"
> > > }
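
To apply the blacklist without rebooting, something like this should work
(standard multipath-tools commands, nothing oVirt specific), and multipath -ll
lets you confirm the QEMU disk is no longer mapped:

# multipathd reconfigure
# multipath -ll

If multipath already grabbed the disk, flushing the stale map with
"multipath -f" (or a reboot) may also be needed.
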
> > >
> > > > Configuring NFS on some other machine is easy.
> > > >
> > > > I'm using another VM for this, so I can easily test negative flows like
> > > > stopping or restarting the NFS server while it is being used by VMs or
> > > > storage operations. I'm using a 2G Alpine VM for this; it works fine even
> > > > with 1G memory.
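
In case it helps, an NFS export for oVirt typically looks something like this
(the path here is only an example; the important part is that the exported
directory is owned by 36:36 so vdsm can write to it):

# mkdir -p /export/data
# chown 36:36 /export/data
# cat /etc/exports
/export/data *(rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)
# exportfs -ra
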
> > >
> > > I think I can get local storage working now (I had it working before).
> >
> > Well finally it fails with:
> >
> > 2022-02-04 09:14:55,779Z ERROR [org.ovirt.engine.core.bll.storage.domain.AddPosixFsStorageDomainCommand]
> > (default task-2) [25a32edf] Command 'org.ovirt.engine.core.bll.storage.domain.AddPosixFsStorageDomainCommand'
> > failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> > VDSGenericException: VDSErrorException: Failed to CreateStorageDomainVDS,
> > error = Could not initialize cluster lock: (), code = 701 (Failed with error unexpected and code 16)
>
> The error "Could not initialize cluster lock" comes from vdsm. Usually
> engine log is
> not the best way to debug such failures. This is only the starting
> point and you need to
> go to the host and check vdsm and supervdsm logs in /var/log/vdsm/.
I can't really see anything relevant in supervdsm.log; it's all fairly
neutral debug messages.
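
Something like this is usually a good first pass over those logs (the
patterns are only a rough guess at what might be relevant here):

# grep -E 'ERROR|WARN' /var/log/vdsm/vdsm.log | tail -n 50
# grep -E 'ERROR|Traceback' /var/log/vdsm/supervdsm.log | tail -n 50
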
> Since this error comes from sanlock, we also may have useful info in
> /var/log/sanlock.log.
Interesting:
2022-02-04 13:15:27 16723 [826]: open error -13 EACCES: no permission to open
/rhev/data-center/mnt/_dev_sdb1/13a731d2-e1d2-4998-9b02-ac46899e3159/dom_md/ids
2022-02-04 13:15:27 16723 [826]: check that daemon user sanlock 179 group sanlock 179 has
access to disk or file.
I think it's quite likely that the sanlock daemon does not have access
here, since (see below) I chowned the root of the xfs filesystem to
36:36 (otherwise vdsm complains).
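
A quick way to confirm the permission theory is to retry the read as the
sanlock user, using the path from the log above:

# ls -ln /rhev/data-center/mnt/_dev_sdb1/13a731d2-e1d2-4998-9b02-ac46899e3159/dom_md/ids
# sudo -u sanlock head -c 512 /rhev/data-center/mnt/_dev_sdb1/13a731d2-e1d2-4998-9b02-ac46899e3159/dom_md/ids >/dev/null

If the permissions look fine, checking for SELinux denials with
"ausearch -m avc -ts recent" may also be worth a look.
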
> Can you share instructions on how to reproduce this issue?
I have one engine and one node (both happen to be VMs, but I don't
believe that is relevant here). It's running Version 4.4.10.6-1.el8.
I added a second disk to the node, and disabled multipath as
previously discussed. The second disk is /dev/sdb1. I formatted it
as xfs and chowned the root of the filesystem to 36:36.
Looks right; forgetting to change ownership is a common mistake.
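
For the bug report, the host-side preparation could be summarized roughly as
below (the temporary mount point is only an example; the chown on the mount
point changes the root directory of the new filesystem):

# mkfs.xfs /dev/sdb1
# mount /dev/sdb1 /mnt
# chown 36:36 /mnt
# umount /mnt
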
In the admin portal, Storage -> Domains -> New domain
Storage type: Posix compliant fs
Name: ovirt-data
Path: /dev/sdb1
VFS type: xfs
Hit OK ->
Error while executing action AddPosixFsStorageDomain: Unexpected exception
I reproduce this with vdsm-4.50.0.5-1.el8.x86_64 on RHEL 8.6 nightly.
Can you file an oVirt/vdsm bug for this?