Greetings!
I'm running a 3-node oVirt (3.6) hosted-engine
(3.6.5.3-1.el7.centos) cluster with GlusterFS (3.7.11) storage. I
keep getting this error for my hosted-engine storage:
ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed to stop monitoring domain (sd_uuid=ff8ce693-5a52-47df-8e06-3443b4dc98a4): Error 900 from stopMonitoringDomain: Storage domain is member of pool: 'domain=ff8ce693-5a52-47df-8e06-3443b4dc98a4'
ovirt_hosted_engine_ha.lib.image.Image:Teardown images
ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Disconnecting the storage
ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Disconnecting storage server
This seems to be related to these ERROR messages in vdsm.log:
Storage.TaskManager.Task::(_setError) Task=`xyz`::Unexpected error
> Storage domain is member of pool
or
> Domain is either partially accessible or entirely inaccessible
or
Storage.HSM::(disconnectStorageServer) Could not disconnect from
storageServer
After that, it updates the config, mounts the storage again, and
repeats the whole cycle roughly every minute. This seems to cause
side effects such as engine state changes, inability to migrate the
engine VM, and the hosted engine HA status changing, among others.
Extracts from the logs are attached. The main issues seem to stem
from the ERROR lines 214, 778, 1037, etc. in the vdsm.log.
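
In case it helps with triage, here is a rough sketch of how the ERROR lines could be pulled out of vdsm.log, with their line numbers, and tallied per logger to see which ones repeat. It assumes the "::"-separated layout thread::LEVEL::timestamp::module::line::logger::message; that layout and the helper name summarize_errors are assumptions for illustration, not something taken from the attached logs.

```python
from collections import Counter


def summarize_errors(lines):
    """Collect ERROR entries from an iterable of vdsm.log lines.

    Returns (hits, counts): hits is a list of (line_number, logger, message)
    tuples, counts is a Counter of ERROR entries per logger.
    """
    hits = []
    counts = Counter()
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("::")
        # Assumed layout: thread::LEVEL::timestamp::module::line::logger::message
        # Lines that do not fit it (tracebacks, continuation lines) are skipped.
        if len(fields) < 7 or fields[1] != "ERROR":
            continue
        logger = fields[5]
        # The message itself may contain "::", so rejoin the remainder.
        message = "::".join(fields[6:])
        hits.append((lineno, logger, message))
        counts[logger] += 1
    return hits, counts
```

Passing an open file object for /var/log/vdsm/vdsm.log to summarize_errors and printing counts.most_common() should show whether the errors arrive about once per minute, as the cycle above suggests.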
Everything else seems to be working fine.
Please advise.
Thanks,
Roderick