On Saturday, February 1, 2020, <asm@pioner.kz> wrote:
Hi! I trying to upgrade my hosts and have problem with it. After uprgading one host i see that this one NonOperational. All was fine with vdsm-4.30.24-1.el7 but after upgrading with new version vdsm-4.30.40-1.el7.x86_64 and some others i have errors.
Firtst of all i see in ovirt Events: Host srv02 cannot access the Storage Domain(s) <UNKNOWN> attached to the Data Center Default. Setting Host state to Non-Operational. My Default storage domain with HE VM data on NFS storage.

In messages log of host: 
srv02 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/a
gent.py", line 131, in _run_agent#012    return action(he)#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper#012    return he.start_monitoring
()#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 432, in start_monitoring#012    self._initialize_broker()#012  File "/usr/lib/python2.7/site-packages/
ovirt_hosted_engine_ha/agent/hosted_engine.py", line 556, in _initialize_broker#012    m.get('options', {}))#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 8
9, in start_monitor#012    ).format(t=type, o=options, e=e)#012RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options:
{'tcp_t_address': None, 'network_test': None, 'tcp_t_port': None, 'addr': '192.168.2.248'}]
Feb  1 15:41:42 srv02 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent

In broker log:
MainThread::WARNING::2020-02-01 15:43:35,167::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with ar
gs {'storagedomainID': 'bbdddea7-9cd6-41e7-ace5-fb9a6795caa8'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=bbdddea7-9cd6-41e7-ace5-fb9a6795caa8',))

In vdsm.lod
2020-02-01 15:44:19,930+0600 INFO  (jsonrpc/0) [vdsm.api] FINISH getStorageDomainInfo error=[Errno 1] Operation not permitted from=::1,57528, task_id=40683f67-d7b0-4105-aab8-6338deb54b00 (api:52)
2020-02-01 15:44:19,930+0600 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='40683f67-d7b0-4105-aab8-6338deb54b00') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getStorageDomainInfo
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2753, in getStorageDomainInfo
    dom = self.validateSdUUID(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 305, in validateSdUUID
    sdDom = sdCache.produce(sdUUID=sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
    domain.getRealDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 145, in findDomain
    return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 378, in __init__
    manifest.sdUUID, manifest.mountpoint)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 853, in _detect_block_size
    block_size = iop.probe_block_size(mountpoint)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/outOfProcess.py", line 384, in probe_block_size
    return self._ioproc.probe_block_size(dir_path)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 602, in probe_block_size
    "probe_block_size", {"dir": dir_path}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 448, in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 1] Operation not permitted
2020-02-01 15:44:19,930+0600 INFO  (jsonrpc/0) [storage.TaskManager.Task] (Task='40683f67-d7b0-4105-aab8-6338deb54b00') aborting: Task is aborted: u'[Errno 1] Operation not permitted' - code 100 (task:1
181)
2020-02-01 15:44:19,930+0600 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH getStorageDomainInfo error=[Errno 1] Operation not permitted (dispatcher:87)

Seems like a reproduction of

 https://bugzilla.redhat.com/show_bug.cgi?id=1777726#c1

Missing file creation/removal permissions on nfs storage.


But i see that this domain is mounted (by mount command):
storage:/volume3/ovirt-hosted on /rhev/data-center/mnt/storage:_volume3_ovirt-hosted type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.2.251,local_lock=none,addr=192.168.2.248)

I didnt see storage directory in /var/run/vdsm? I see many differences with another hosts. Here is listing of var/run/vdsm:
bonding-defaults.json
dhclientmon
nets_restored
payload
svdsm.sock
v2v
vhostuser
bonding-name2numeric.json
mom-vdsm.sock
ovirt-imageio-daemon.sock
supervdsmd.lock
trackedInterfaces
vdsmd.lock
What whe problem? Please help.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/IBUDRUOETQ5WCTZQMGIVBZZZUAITDVHL/