On Thu, Nov 16, 2017 at 7:09 AM, Rudi Ahlers <rudiahlers@gmail.com> wrote:
I forgot to add: 

The file /var/run/ovirt-hosted-engine-ha/vm.conf doesn't exist:

[root@virt1 ~]# ll /var/run/ovirt-hosted-engine-ha/vm.conf
ls: cannot access /var/run/ovirt-hosted-engine-ha/vm.conf: No such file or directory
[root@virt1 ~]# ll /var/run/ovirt-hosted-engine-ha/
total 8
-rw-r--r--. 1 root root 5 Nov 16 08:05 agent.pid
-rw-r--r--. 1 root root 5 Nov 16 06:45 broker.pid
srwxr-xr-x. 1 vdsm kvm  0 Nov 16 06:45 broker.socket

I am not sure how to get it back, or how to generate it.

Hi Rudi,
we need to understand what happened at setup time.
Do you still have the hosted-engine-setup log files under /var/log/ovirt-hosted-engine-setup on your first host?
Could you please share them?
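
To pick out the most recent setup log to share, something like the following might help. It is only a sketch: latest_setup_log is a hypothetical helper name, and it assumes the most recently modified file in the directory is the log from the relevant setup run.

```shell
# Hypothetical helper: print the path of the most recently modified file
# in a log directory (defaults to the hosted-engine setup log directory).
latest_setup_log() {
    dir="${1:-/var/log/ovirt-hosted-engine-setup}"
    # ls -t sorts by modification time, newest first.
    newest=$(ls -1t "$dir" 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no files found in $dir" >&2
        return 1
    fi
    printf '%s/%s\n' "$dir" "$newest"
}
```

Running latest_setup_log with no argument on the first host should print the path of the newest file, which can then be attached or searched for ERROR lines with grep.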
 


On Thu, Nov 16, 2017 at 7:55 AM, Rudi Ahlers <rudiahlers@gmail.com> wrote:
Hi, 

I wonder if someone can help. After a system reboot, the Hosted-Agent isn't running. This is a fresh installation of CentOS Linux release 7.4.1708 running ovirt-release41-4.1.7-1.el7.centos.noarch. Gluster is set up on 3 nodes, but hosted-engine is only set up on the 1st node for now.

[root@virt1 ~]# hosted-engine --console
Virtual machine does not exist
The engine VM is not on this host

[root@virt1 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.


[root@virt1 ~]# ps ax | grep ovirt-ha-agent
41309 ?        Rsl    0:14 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
42818 pts/0    S+     0:00 grep --color=auto ovirt-ha-agent


[root@virt1 ~]# mount | grep engine
/dev/mapper/storage-engine on /storage/engine type xfs (rw,relatime,seclabel,attr2,inode64,sunit=1024,swidth=2048,noquota)
virt1:/engine on /mnt/engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
virt1:/engine on /rhev/data-center/mnt/glusterSD/virt1:_engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


And then I see this error:

[root@virt1 ~]# systemctl status ovirt-ha-agent -l
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-11-16 07:44:43 SAST; 4min 23s ago
 Main PID: 41309 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─41309 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon

Nov 16 07:48:30 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:48:30 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored in the HE configuration image
Nov 16 07:48:39 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:48:39 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored in the HE configuration image
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored in the HE configuration image
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
                                                              return action(he)
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
                                                              return he.start_monitoring()
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 423, in start_monitoring
                                                              for old_state, state, delay in self.fsm:
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
                                                              new_data = self.refresh(self._state.data)
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 123, in refresh
                                                              ] = self.hosted_engine.min_memory_threshold
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 183, in min_memory_threshold
                                                              return int(self._config.get(config.VM, config.MEM_SIZE))
                                                            File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 226, in get
                                                              key
                                                          KeyError: 'Configuration value not found: file=/var/run/ovirt-hosted-engine-ha/vm.conf, key=memSize'
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
Nov 16 07:49:01 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:49:01 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored 
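
The KeyError in the traceback says the agent could not find memSize in /var/run/ovirt-hosted-engine-ha/vm.conf. A quick way to check whether that file exists and carries the key might look like the sketch below. has_conf_key is a hypothetical helper, and the sketch assumes vm.conf uses plain key=value lines.

```shell
# Hypothetical helper: check a key=value config file for a given key.
# Returns 0 if the key is present, 1 if it is absent, 2 if the file is missing.
has_conf_key() {
    file="$1"
    key="$2"
    if [ ! -f "$file" ]; then
        echo "missing file: $file" >&2
        return 2
    fi
    # The key is used as a regular expression anchored at the start of a line,
    # so this is only suitable for simple alphanumeric keys such as memSize.
    grep -q "^${key}=" "$file"
}
```

For example, has_conf_key /var/run/ovirt-hosted-engine-ha/vm.conf memSize should return 2 while the file is absent, which would be consistent with the KeyError above.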



All the nodes resolve fine and have been added to /etc/hosts.



--
Kind Regards
Rudi Ahlers
Website: http://www.rudiahlers.co.za




_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users