On 2/10/20 11:09 AM, Amit Bawer wrote:
Have you compared it with a host having the nfs domain working?

On Mon, Feb 10, 2020 at 11:11 AM Jorick Astrego <jorick@netbulae.eu> wrote:


On 2/9/20 10:27 AM, Amit Bawer wrote:


On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego <jorick@netbulae.eu> wrote:

Hi,

Something weird is going on with our ovirt node 4.3.8 install mounting an NFS share.

We have an NFS domain for a couple of backup disks and we have a couple of 4.2 nodes connected to it.

Now I'm adding a fresh cluster of 4.3.8 nodes and the backupnfs mount doesn't work.

(annoyingly, you cannot copy the text from the events view)

The domain is up and working

ID: f5d2f7c6-093f-46d6-a844-224d92db5ef9
Size: 10238 GiB
Available: 2491 GiB
Used: 7747 GiB
Allocated: 3302 GiB
Over Allocation Ratio: 37%
Images: 7
Path: *.*.*.*:/data/ovirt
NFS Version: AUTO
Warning Low Space Indicator: 10% (1023 GiB)
Critical Space Action Blocker: 5 GiB

But somehow the node appears to think it's an LVM volume? It tries to find the volume group but fails... which is not so strange, as it is an NFS volume:
2020-02-05 14:17:54,190+0000 WARN  (monitor/f5d2f7c) [storage.LVM] Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5 out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not found', '  Cannot process volume group f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
2020-02-05 14:17:54,201+0000 ERROR (monitor/f5d2f7c) [storage.Monitor] Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed (monitor:330)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 327, in _setupLoop
    self._setupMonitor()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 349, in _setupMonitor
    self._produceDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in wrapper
    value = meth(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 367, in _produceDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
    domain.getRealDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
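
For reference, running the same lookup by hand on the node shows the same thing; vgs simply reports that no volume group with that name exists, matching the rc=5 and error lines above:

vgs f5d2f7c6-093f-46d6-a844-224d92db5ef9; echo $?
  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not found
  Cannot process volume group f5d2f7c6-093f-46d6-a844-224d92db5ef9
5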

The volume is actually mounted fine on the node:

On NFS server
Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated mount request from *.*.*.*:673 for /data/ovirt (/data/ovirt)

On the host

mount|grep nfs

*.*.*.*:/data/ovirt on /rhev/data-center/mnt/*.*.*.*:_data_ovirt type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)

And I can see the files:
ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
total 4
drwxr-xr-x. 5 vdsm kvm    61 Oct 26  2016 1ed0a635-67ee-4255-aad9-b70822350706

What does ls -lart show for 1ed0a635-67ee-4255-aad9-b70822350706?

ls -arlt 1ed0a635-67ee-4255-aad9-b70822350706/
total 4
drwxr-xr-x. 2 vdsm kvm    93 Oct 26  2016 dom_md
drwxr-xr-x. 5 vdsm kvm    61 Oct 26  2016 .
drwxr-xr-x. 4 vdsm kvm    40 Oct 26  2016 master
drwxr-xr-x. 5 vdsm kvm  4096 Oct 26  2016 images
drwxrwxrwx. 3 root root   86 Feb  5 14:37 ..

On a host with a working nfs domain we have the following storage hierarchy; feece142-9e8d-42dc-9873-d154f60d0aac is the nfs domain in my case:

/rhev/data-center/
├── edefe626-3ada-11ea-9877-525400b37767
...
│   ├── feece142-9e8d-42dc-9873-d154f60d0aac -> /rhev/data-center/mnt/10.35.18.45:_exports_data/feece142-9e8d-42dc-9873-d154f60d0aac
│   └── mastersd -> /rhev/data-center/mnt/blockSD/a6a14714-6eaa-4054-9503-0ea3fcc38531
└── mnt
    ├── 10.35.18.45:_exports_data
    │   └── feece142-9e8d-42dc-9873-d154f60d0aac
    │       ├── dom_md
    │       │   ├── ids
    │       │   ├── inbox
    │       │   ├── leases
    │       │   ├── metadata
    │       │   ├── outbox
    │       │   └── xleases
    │       └── images
    │           ├── 915e6f45-ea13-428c-aab2-fb27798668e5
    │           │   ├── b83843d7-4c5a-4872-87a4-d0fe27a2c3d2
    │           │   ├── b83843d7-4c5a-4872-87a4-d0fe27a2c3d2.lease
    │           │   └── b83843d7-4c5a-4872-87a4-d0fe27a2c3d2.meta
    │           ├── b3be4748-6e18-43c2-84fb-a2909d8ee2d6
    │           │   ├── ac46e91d-6a50-4893-92c8-2693c192fbc8
    │           │   ├── ac46e91d-6a50-4893-92c8-2693c192fbc8.lease
    │           │   └── ac46e91d-6a50-4893-92c8-2693c192fbc8.meta
    │           ├── b9edd81a-06b0-421c-85a3-f6618c05b25a
    │           │   ├── 9b9e1d3d-fc89-4c08-87b6-557b17a4b5dd
    │           │   ├── 9b9e1d3d-fc89-4c08-87b6-557b17a4b5dd.lease
    │           │   └── 9b9e1d3d-fc89-4c08-87b6-557b17a4b5dd.meta
    │           ├── f88a6f36-fcb2-413c-8fd6-c2b090321542
    │           │   ├── d8f8b2d7-7232-4feb-bce4-dbf0d37dba9b
    │           │   ├── d8f8b2d7-7232-4feb-bce4-dbf0d37dba9b.lease
    │           │   └── d8f8b2d7-7232-4feb-bce4-dbf0d37dba9b.meta
    │           └── fe59753e-f3b5-4840-8d1d-31c49c2448f0
    │               ├── ad0107bc-46d2-4977-b6c3-082adbf3083d
    │               ├── ad0107bc-46d2-4977-b6c3-082adbf3083d.lease
    │               └── ad0107bc-46d2-4977-b6c3-082adbf3083d.meta
 
Maybe I got confused by your ls command output, but I was looking to see what the dir tree for your nfs domain looks like,
which should be rooted under /rhev/data-center/mnt/<nfs server>:<exported path>.

In your output, only 1ed0a635-67ee-4255-aad9-b70822350706 is there, which is not the nfs domain f5d2f7c6-093f-46d6-a844-224d92db5ef9 in question.

So to begin with, we need to figure out why, in your case, the f5d2f7c6-093f-46d6-a844-224d92db5ef9 folder is not to be found on the nfs storage mounted on that node;
as far as I understand it should be there, since the same nfs mount path and server are shared between all hosts connected to this SD.
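
For example (just a sketch, using the mount path from your earlier output), listing the mount root on a working node and on the new node side by side should confirm that:

ls /rhev/data-center/mnt/*.*.*.*:_data_ovirt/

A working node should show f5d2f7c6-093f-46d6-a844-224d92db5ef9 there, while from your output the new node only shows 1ed0a635-67ee-4255-aad9-b70822350706.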

Maybe compare the mount options between the working nodes and the non-working node, and check the export options on the nfs server itself; maybe it has some client-IP-specific export settings?
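
Something along these lines (a rough sketch; adjust the paths, and run the server-side commands as root) would show the mount options actually in effect on each host and the effective export options on the server:

On each host:

grep _data_ovirt /proc/mounts

On the NFS server:

exportfs -v
cat /etc/exports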

Hmm, I didn't notice that.

I checked the NFS server and found "1ed0a635-67ee-4255-aad9-b70822350706" in the exportdom path (/data/exportdom).

This was an old NFS export domain that was deleted a while ago. I remember coming across an issue about old domains still being active after removal, but I cannot find it now.

I unexported the directory on the nfs server, and now I have the correct mount and the domain activates fine.
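
For anyone running into the same thing, roughly the steps (exact paths and export entries will of course differ): remove the stale /data/exportdom entry from /etc/exports on the NFS server (or unexport it with exportfs -u), then re-export and verify:

exportfs -ra
exportfs -v

and on the host re-check that the right content shows up under the mount:

ls /rhev/data-center/mnt/*.*.*.*:_data_ovirt/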

Thanks!

Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts


Tel: 053 20 30 270    Fax: 053 20 30 271
info@netbulae.eu      www.netbulae.eu
Staalsteden 4-3A, 7547 TA Enschede
KvK 08198180    BTW NL821234584B01