Hi everybody,
we are running an oVirt 4.3.10 production cluster with 9 hosts and 5 storage domains.
Since yesterday we have been getting the following error messages:
26.02.2021 02:00:48 VDSM command SetVolumeDescriptionVDS failed: Could not acquire
resource. Probably resource factory threw an exception.: ()
26.02.2021 02:00:48 Failed to update OVF disks 5aa438e3-8d22-4b6c-bccf-a843151ca0be, OVF
data isn't updated on those OVF stores (Data Center datacenter01, Storage Domain
vmstore13).
26.02.2021 02:00:48 Failed to update VMs/Templates OVF data for Storage Domain vmstore13
in Data Center datacenter01.
These messages recur every hour.
Only one domain (vmstore13) is affected. Like 3 others in the cluster, this domain is
based on GlusterFS.
Trying to update the OVFs manually from the engine web GUI leads to the same result,
followed by a (misleading) message:
26.02.2021 02:00:49 OVF_STORE for domain vmstore13 was updated by admin@internal-authz.
The VMs with disks on the affected domain are running fine, and snapshots are working.
So far I have tried to move the SPM role to another host. This succeeded, but the error
messages persist.
The vdsm log on the SPM host contains something like
2021-02-26 03:00:57,701+0100 INFO (jsonrpc/2) [vdsm.api] START
setVolumeDescription(sdUUID=u'9f731135-f5d9-4609-9e3b-fa9cae75e314',
spUUID=u'33e8dc9e-8bc8-11ea-bd76-00163e741033',
imgUUID=u'5aa438e3-8d22-4b6c-bccf-a843151ca0be',
volUUID=u'0795e58c-4960-413a-a0b4-e8a6d547fda5',
description=u'{"Updated":false,"Last Updated":"Wed Feb 24
17:48:17 CET 2021","Storage
Domains":[{"uuid":"9f731135-f5d9-4609-9e3b-fa9cae75e314"}],"Disk
Description":"OVF_STORE"}', options=None) from=::ffff:10.70.1.1,46968,
flow_id=1f314676, task_id=9101db01-b4f0-447e-a5a9-b6af76278d55 (api:48)
2021-02-26 03:00:57,712+0100 ERROR (jsonrpc/2) [storage.VolumeManifest] [Errno 116] Stale
file handle (fileVolume:155)
for each error. I have attached the relevant part of vdsm.log.
Has anyone experienced this behaviour before and can offer help?
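For anyone triaging a similar log, here is a rough Python sketch that counts
"[Errno N]" errors per logger in a vdsm.log. It assumes every record starts with
the "timestamp LEVEL (thread) [logger]" prefix seen in the excerpt above and
treats any other line as a continuation of the previous record. The names
parse_records and errno_summary are made up for illustration and are not part of
vdsm.

```python
import re
from collections import Counter

# Assumed record prefix, modelled on the excerpt above, e.g.
# "2021-02-26 03:00:57,712+0100 ERROR (jsonrpc/2) [storage.VolumeManifest] ..."
RECORD_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]*)\) \[(?P<logger>[^\]]*)\] (?P<msg>.*)$"
)
ERRNO = re.compile(r"\[Errno (\d+)\]")


def parse_records(text):
    """Split log text into records, folding wrapped lines into the previous one."""
    records = []
    for line in text.splitlines():
        match = RECORD_START.match(line)
        if match:
            records.append(match.groupdict())
        elif records and line.strip():
            # Not a new record: treat it as a continuation line.
            records[-1]["msg"] += " " + line.strip()
    return records


def errno_summary(text):
    """Count ERROR records by (logger, errno number)."""
    counts = Counter()
    for rec in parse_records(text):
        if rec["level"] != "ERROR":
            continue
        for num in ERRNO.findall(rec["msg"]):
            counts[(rec["logger"], int(num))] += 1
    return counts
```

Calling errno_summary(open("vdsm.log").read()) on the SPM host's log would then
show whether Errno 116 is the only error reported by storage.VolumeManifest or
one of several.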
--
juergen
Attachments:
- vdsm.log
(application/octet-stream — 5.6 KB)