[ovirt-users] issues deploying hosted-engine on multiple hosts

Keppler, Thomas (PEBA) thomas.keppler at kit.edu
Tue Sep 9 08:00:23 UTC 2014


Dear oVirt-Team,

we (as in: our company) currently have a *little* problem regarding your hosted-engine solution.
First, let me describe the steps we took until the errors occurred:

1.) All four hosts have been prepared with CentOS 7, the EPEL repositories and a glusterfs-volume.
2.) The oVirt 3.5 nightly snapshot was added to each host's yum mirror list, and a yum upgrade was performed.
3.) Then we installed the hosted-engine package and triggered a hosted-engine --deploy. It stopped there, complaining that new packages were available and that we should perform an upgrade first, so we did that. We ran the --deploy process again, which resulted in a working engine VM but ended in an error (all log files of all hosts are attached to this mail as a tar.gz package). We completed those steps on Friday, 5th Sept. (As vmnode1 is sadly dead by now, we can't provide any logs for this machine without immense tinkering, but we could provide them if you really need them.)
4.) On Monday, we noticed that the node which had the hosted-engine on it (xxx-vmnode1) had died due to a hardware failure. Not minding this, we gave our glusterfs-volume the good ol' rm -rf in order to get rid of the previously created files, and we moved on with three nodes from there.
5.) We decided to deploy the engine on xxx-vmnode4 this time, since it seemed to be the most stable of the rack. Immediately, an error occurred (stating that /etc/pki/vdsm/certs/cacert.pem couldn't be found), which, thanks to sbonazzo's help on IRC, could be worked around by doing a vdsm config --force. Running the deploy process again worked fine and gave the same result as the first try (see step 3), BUT it brought up the stated error again.
6.) Now we tried to add another host (xxx-vmnode3) to our solution in order to make the engine highly available. This worked fine until the point of entering an ID for the new node, where it complained that the UUID was already in use and that we couldn't add this node to the cluster. According to sbonazzo this is fairly odd, as any machine should have its own, unique UUID.
7.) As this host wouldn't work, we decided to give xxx-vmnode2 a shot and ran the deploy process there, which resulted in ultimate failure. It didn't even get to the steps regarding the path for the resulting VM.
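
To narrow down the UUID clash from step 6, one could collect the UUID each node reports and look for duplicates. The following is only a minimal sketch: the UUID values are made up, the helper name find_duplicate_uuids is hypothetical, and it assumes the UUIDs have already been gathered by hand (for example from /sys/class/dmi/id/product_uuid on each node, which may or may not be the value the deploy script actually compares).

```python
from collections import defaultdict


def find_duplicate_uuids(host_uuids):
    """Return {uuid: [hosts, ...]} for every UUID reported by more than one host.

    host_uuids maps a host name to the UUID string collected on that host.
    UUIDs are compared case-insensitively and with surrounding whitespace removed.
    """
    hosts_by_uuid = defaultdict(list)
    for host, uuid in host_uuids.items():
        hosts_by_uuid[uuid.strip().lower()].append(host)
    return {u: sorted(h) for u, h in hosts_by_uuid.items() if len(h) > 1}


# Hypothetical values, NOT taken from the real nodes.
reported = {
    "xxx-vmnode2": "11111111-2222-3333-4444-555555555555",
    "xxx-vmnode3": "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE",
    "xxx-vmnode4": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
}

duplicates = find_duplicate_uuids(reported)
for uuid, hosts in duplicates.items():
    print("UUID %s is shared by: %s" % (uuid, ", ".join(hosts)))
```

If two nodes show up under the same UUID, that would at least explain why the deploy process refuses the second one.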

Because it might help, I probably should give you an overview of our network setup:
It is currently set up so that we have a company-wide WAN and a rack-wide LAN. The WAN is only there for the VMs to communicate with the outside world; management and access to the engine go through the LAN, which can be reached through a VPN connection. Therefore, we bridged the engine's "ovirtmgmt" bridge to the internal LAN connection. Because the FQDN of the engine can't be resolved through DNS, we hacked it into the hosts file on all nodes prior to deploying the hosted-engine package.
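
Since the engine FQDN only exists in the hosts files, a quick sanity check is to parse the file on each node and confirm that the name maps to the intended LAN address. This is a rough sketch under assumptions: the FQDN engine.example.local and the address 192.168.0.10 are placeholders, parse_hosts is a hypothetical helper, and "first entry wins" is a simplification made for this sketch.

```python
def parse_hosts(text):
    """Map each host name or alias found in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name (assumption of this sketch).
            table.setdefault(name.lower(), ip)
    return table


# Placeholder content; on a node one would read /etc/hosts instead.
sample = """\
127.0.0.1   localhost localhost.localdomain
# engine FQDN is not in DNS, so it is pinned here (hypothetical values)
192.168.0.10  engine.example.local engine
"""

table = parse_hosts(sample)
print(table.get("engine.example.local"))
```

Running the same check on every node would show whether one of them is missing the entry or points at a different address.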

This is where we are and where we come from - the oVirt setup worked initially, when the engine was still separated from the nodes. Our bad luck with hardware didn't really help, either.
I am really looking forward to hearing from you guys, because this project would be a nice successor to our current VMware solution, which is starting to die.

Thank you for any time invested into our problems (and probably solutions) ;)

--
Best regards
Thomas Keppler

PS: I've just heard that GlusterFS is **NOT** (really) compatible with the hosted-engine. Are there any recommendations on what to do?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ovirt.tar.gz
Type: application/gzip
Size: 2117068 bytes
Desc: ovirt.tar.gz
URL: <http://lists.ovirt.org/pipermail/users/attachments/20140909/019a5c13/attachment-0001.bin>