[ovirt-users] hosted-engine VM and services not working

Andrew Dent adent at ctcroydon.com.au
Fri Jun 16 06:11:57 UTC 2017


Hi

Well I've got myself into a fine mess.

host01 was set up with hosted-engine v4.1. This was successful.
Imported 3 VMs from a v3.6 oVirt AIO instance. (This oVirt 3.6 is still 
running with more VMs on it.)
Tried to add host02 to the new oVirt 4.1 setup. This partially succeeded 
but I couldn't add any storage domains to it. I cannot remember why.
In the oVirt Engine UI I removed host02.
I reinstalled host02 with CentOS 7, tried to add it again, and the oVirt 
UI told me it was already there (but it wasn't listed in the UI).
Renamed the reinstalled host02 to host03, changed the IP address, 
reconfigured the DNS server and added host03 in the oVirt Engine UI.
All good, and I was able to import more VMs to it.
I was also able to shut down a VM on host01, assign it to host03 and 
start the VM. Cool, everything working.
All of the above happened over the last couple of weeks.

This week I performed some yum updates on the Engine VM. No reboot.
Today I noticed that the oVirt services in the Engine VM were in an 
endless restart loop. They would be up for about 5 minutes and then die.
Looking into /var/log/ovirt-engine/engine.log, I could only see 
errors relating to host02. oVirt was trying to find it and failing, then 
falling over.
I ran "hosted-engine --clean-metadata" thinking it would clean up and 
remove bad references to hosts, but I now realise that was a really bad 
idea, as it didn't do what I'd hoped.
At this point the sequence below worked: I could log in to the oVirt UI, 
but after 5 minutes the services would be off again.
service ovirt-engine restart
service ovirt-websocket-proxy restart
service httpd restart

I saw some references to having to remove hosts from the database by 
hand in situations where, under the hood of oVirt, a decommissioned host 
was still listed but wasn't showing in the GUI.
So I removed the references to host02 (vds_id and host_id) in the 
following tables, in this order:
vds_dynamic
vds_statistics
vds_static
host_device
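
As a rough illustration of that table order (not the exact statements that were run), here is a minimal shell sketch that only prints DELETE statements for review and never touches the database. The column-per-table mapping (vds_id for the vds_* tables, host_id for host_device) is an assumption based on the column names mentioned above, and the UUID is a placeholder.

```shell
#!/bin/sh
# Print (do not execute) DELETE statements for one host, in the table
# order listed above.
# Assumption: the vds_* tables key on vds_id and host_device keys on host_id.
print_host_cleanup_sql() {
    host_uuid=$1
    for entry in vds_dynamic:vds_id vds_statistics:vds_id \
                 vds_static:vds_id host_device:host_id; do
        table=${entry%%:*}    # text before the colon
        column=${entry##*:}   # text after the colon
        printf "DELETE FROM %s WHERE %s = '%s';\n" "$table" "$column" "$host_uuid"
    done
}

# Placeholder UUID, purely illustrative.
print_host_cleanup_sql "00000000-0000-0000-0000-000000000000"
```

Printing rather than executing keeps the sketch harmless; anything like this would want a database backup and a careful read-through before being run for real.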

Now when I try to start ovirt-websocket, it will not start:
service ovirt-websocket start
Redirecting to /bin/systemctl start  ovirt-websocket.service
Failed to start ovirt-websocket.service: Unit not found.
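
Worth noting: the restart sequence earlier used the name ovirt-websocket-proxy, while this command uses ovirt-websocket, so "Unit not found" may simply reflect the shorter name rather than a missing package. A minimal sketch for checking which unit names systemd actually knows about follows; the helper name unit_known is hypothetical, and the systemctl call is left as a comment because it needs a systemd host.

```shell
#!/bin/sh
# Read `systemctl list-unit-files` style output on stdin (unit name in the
# first column) and succeed only if the requested unit name is present.
# unit_known is a hypothetical helper, not part of oVirt or systemd.
unit_known() {
    wanted=$1
    awk -v w="$wanted" '$1 == w { found = 1 } END { exit !found }'
}

# Intended use on the engine VM (commented out, requires systemd):
#   systemctl list-unit-files --no-legend 'ovirt-*' \
#       | unit_known ovirt-websocket-proxy.service && echo "unit exists"
```

If the longer name is listed and the shorter one is not, the error above says nothing about the state of the service itself.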

I'm now thinking that I need to do the following in the engine VM
# engine-cleanup
# yum remove ovirt-engine
# yum install ovirt-engine
# engine-setup
But to run engine-cleanup I need to put the engine VM into maintenance 
mode, and because of the --clean-metadata that I ran earlier on host01, 
I cannot do that.

What is the best course of action from here?

Cheers



Andrew