On Fri, Nov 10, 2017 at 9:42 AM, Kasturi Narra <knarra@redhat.com> wrote:
Hello Logan,

   One reason the liveliness check fails is that the host cannot ping your hosted engine VM. You can try connecting to the HE VM using remote-viewer vnc://hypervisor-ip:5900. From the hosted-engine --vm-status output, it looks like the HE VM is up and running fine.


Hi,
just a small addition:
We can deploy hosted-engine with either VNC or SPICE as the graphical console protocol, so you have to adjust the remote-viewer command according to what you are using.
Also, the TCP port is not always 5900; it depends on the order in which the VMs were started.

To get the actual VNC port number you could use:
. /etc/ovirt-hosted-engine/hosted-engine.conf
vdsm-client VM getInfo vmID=$vmid | jq -r '.devices[] | select(.device | contains("vnc")).port'
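
As a rough sketch of putting those two points together, a small shell function can build the remote-viewer URI from the protocol, the hypervisor address and the port. This is not part of the oVirt tooling: console_uri is a hypothetical helper name, and the address and port in the usage line are placeholders.

```shell
# Hypothetical helper, not part of oVirt: build the URI to pass to remote-viewer.
# $1 = console protocol (vnc or spice), $2 = hypervisor address, $3 = TCP port
console_uri() {
    case "$1" in
        vnc|spice) ;;
        *) echo "unsupported protocol: $1" >&2; return 1 ;;
    esac
    printf '%s://%s:%s\n' "$1" "$2" "$3"
}
```

For example, if the query above returned 5901 on a VNC deployment, you could run: remote-viewer "$(console_uri vnc hypervisor-ip 5901)"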

An alternative is to use the serial console with:
hosted-engine --console
 
  • Please check the internal DNS settings, such as resolv.conf.
  • The host may not be able to resolve the virtual machine's host name or IP address.
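
A minimal sketch of such a DNS check, to be run on the host. The function name check_resolves is made up for illustration, and engine.example.com in the usage line stands in for your engine VM's FQDN. getent hosts goes through the system's name service switch, so it should reflect both /etc/hosts and resolv.conf.

```shell
# Hypothetical helper: report whether a name resolves on this host.
# $1 = host name to look up (for example, the engine VM's FQDN)
check_resolves() {
    if addr=$(getent hosts "$1"); then
        echo "ok: $addr"
    else
        echo "cannot resolve: $1" >&2
        return 1
    fi
}
```

Usage: check_resolves engine.example.com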
Thanks
kasturi


On Fri, Nov 10, 2017 at 12:56 PM, Logan Kuhn <support@jac-properties.com> wrote:
We lost the backend storage that hosts our self-hosted engine tonight. We've recovered it, and there was no data corruption on the volume containing the HE disk. However, when we try to start the HE, it doesn't give an error, but it also doesn't start.

The VM isn't pingable and the liveliness check always fails.

 [root@ovirttest1 ~]# hosted-engine --vm-status | grep -A20 ovirttest1 
Hostname                           : ovirttest1.wolfram.com 
Host ID                            : 1 
Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"} 
Score                              : 3400 
stopped                            : False 
Local maintenance                  : False 
crc32                              : 2c2f3ec9 
local_conf_timestamp               : 18980042 
Host timestamp                     : 18980039 
Extra metadata (valid at timestamp): 
       metadata_parse_version=
       metadata_feature_version=1 
       timestamp=18980039 (Fri Nov 10 01:17:59 2017) 
       host-id=1 
       score=3400 
       vm_conf_refresh_time=18980042 (Fri Nov 10 01:18:03 2017) 
       conf_on_shared_storage=True 
       maintenance=False 
       state=GlobalMaintenance 
       stopped=False

The environment is in Global Maintenance so that we can pin the HE VM to a specific host and eliminate as many variables as possible. I've attached the agent and broker logs.

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


