On 11/24/2016 06:15 PM, Simone Tiraboschi wrote:
On Thu, Nov 24, 2016 at 1:26 PM, knarra <knarra@redhat.com> wrote:
Hi,
I have three nodes with glusterfs as the storage domain. For some reason vm.conf is missing from /var/run/ovirt-hosted-engine-ha, and because of this one of my hosts shows Hosted Engine HA: Not Active. Once I copy the file from another node and restart the ovirt-ha-broker and ovirt-ha-agent services, everything works fine, but then it happens again. Can someone please help me identify why this happens? Below is the log I see in the ovirt-ha-agent logs.
https://paste.fedoraproject.org/489120/79990345/
Once the engine correctly imported the hosted-engine storage domain, a couple of OVF_STORE volumes will appear there.
Every modification to the engine VM configuration will be written by the engine into that OVF_STORE, so every ovirt-ha-agent instance running on the hosted-engine hosts will be able to restart the engine VM with a coherent configuration.
Until the engine imports the hosted-engine storage domain, ovirt-ha-agent will fall back to the initial vm.conf.
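To make the order of preference concrete, here is a minimal sketch of that fallback logic. It is not the actual ovirt-ha-agent code: the function name and both callables are hypothetical stand-ins for "extract the configuration from the OVF_STORE" and "read the initial vm.conf".

```python
from typing import Callable, Dict, Optional


def load_engine_vm_config(
    extract_from_ovf_store: Callable[[], Optional[Dict[str, str]]],
    read_initial_vm_conf: Callable[[], Dict[str, str]],
) -> Dict[str, str]:
    """Prefer the OVF_STORE configuration, else fall back to the initial vm.conf.

    Both callables are hypothetical placeholders, not ovirt-hosted-engine-ha APIs.
    """
    try:
        config = extract_from_ovf_store()
    except Exception:
        # Extraction failed (for example, a storage problem): treat as unavailable.
        config = None
    if config:
        return config
    # No usable OVF_STORE configuration: use the initial vm.conf instead.
    return read_initial_vm_conf()
```

Note that the fallback only helps if the initial vm.conf is itself complete; if both sources are unusable, the caller still ends up without a valid configuration.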
In your case the OVF_STORE volume is there,
but the agent fails to extract the engine VM configuration:
MainThread::INFO::2016-11-24 17:55:04,914::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2016-11-24 17:55:04,919::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/glusterSD/10.70.36.79:_engine/27f054c3-c245-4039-b42a-c28b37043016/images/fdf49778-9a06-49c6-bf7a-a0f12425911c/8c954add-6bcf-47f8-ac2e-4c85fc3f8699
MainThread::ERROR::2016-11-24 17:55:04,928::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Unable to extract HEVM OVF
So it tries to roll back to the initial vm.conf, but that one also seems to be missing some values, so the agent fails:
MainThread::ERROR::2016-11-24 17:55:04,974::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: ''Configuration value not found: file=/var/run/ovirt-hosted-engine-ha/vm.conf, key=memSize'' - trying to restart agent
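To see which values a given vm.conf is missing, something like the following sketch may help. It assumes vm.conf is a plain text file of key=value lines; the helper names are made up for illustration, and memSize is the only required key taken from the error above (extend the tuple with whatever keys your setup needs).

```python
from typing import Dict, Iterable, List


def parse_vm_conf(text: str) -> Dict[str, str]:
    """Parse key=value lines (assumed vm.conf layout) into a dict."""
    conf: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def missing_keys(conf: Dict[str, str], required: Iterable[str] = ("memSize",)) -> List[str]:
    """Return the required keys that are absent or empty in conf."""
    return [key for key in required if not conf.get(key)]


if __name__ == "__main__":
    # Path taken from the error message above.
    with open("/var/run/ovirt-hosted-engine-ha/vm.conf") as handle:
        print(missing_keys(parse_vm_conf(handle.read())))
```

Running this on each host and comparing the results should show whether the file is truncated or empty only on the affected host.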
Both issues seem storage related; could you please share your gluster logs?
Thanks
kasturi
Hi Simone,
Below [1] is the link to the sosreports from the first two hosts. The third host has some issue; once it is up I will provide the sosreport from there as well.