[ovirt-users] Hosting an oVirt test cluster on single machine running Fedora 25 with nested virtualization enabled.

john vorpal at internode.on.net
Mon Feb 27 08:34:55 UTC 2017


Hi everybody,

I'm having a nightmarish time trying to test oVirt and RHEV on my main
machine, which I just upgraded to Fedora 25 (before realising oVirt is
still on F24). Nevertheless, it seems to me I shouldn't have a problem
if I just use F24/RHEV/oVirt nodes running under my F25 machine.

The F25 physical host is a pretty powerful machine: 8 (HT) cores, SSD
storage, 16 GB RAM (a bit light on RAM, but this should be just
sufficient to run the engine & host, and say one test host).

The main problem is that I just cannot seem to add a new host to this
cluster.

I am using the F25 physical host to export NFS4 storage (on SSD),
for the oVirt cluster.

So currently, I have, all on one physical machine:

1) ovirt-engine-host VM - this is the F24 host in which I ran
hosted-engine --deploy. So, if the physical host is layer 0 (L0), this
is L1.

VM has been given 3 CPUs, 6GB of RAM

2) ovirt-engine VM - this is the VM inside ovirt-engine-host, and this
VM is the main oVirt admin engine/appliance. The image for this
appliance was obtained from:
ovirt-engine-appliance-4.1-20170201.1.fc24.noarch.rpm
So this VM is at L2.

According to the ovirt Manager web admin page, this is running:
oVirt Engine Version:4.1.0.3-1.el7.centos

When I ran hosted-engine --deploy, I specified 2 CPUs and 4GB of RAM.
However, when I look at this host in the management webpage, it
reports 3 CPUs and 6GB of RAM, so the hosted engine deploy script seems
to have ignored my wishes and given this VM the full resources of its
host VM, which is a little annoying but shouldn't be causing a problem
at this stage.

Basically, this seems to be working. I can login to the oVirt Manager
webpage and see my whole datacenter, storage, cluster, etc.

I've been able to add a data domain and an iso storage domain, and
when I did this, the new data storage domain became the Master, and the
hosted_storage domain (holding my ovirt-engine host) appeared out of
nowhere in the UI. So I now have 3 storage domains: the hosted_storage,
the master, and the iso storage.


So anyway, this is where the problems start. No matter what I try, I
can't seem to add a host to this cluster, which seems to be working
fine otherwise.

On all my F24 hosts I've installed python2-dnf, which is required by
oVirt on F24, installed nfs-utils so they can mount NFS shares, enabled
the oVirt repository, set up a static IP address, and tweaked /etc/hosts
so all my hosts can see each other without relying on the physical KVM
host's DNS.
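
For anyone wanting to double-check the /etc/hosts part of this setup, here
is a minimal sketch (not something from my actual setup) that parses
hosts-file text and reports which expected names have no entry. The IP
addresses in the sample are made up for illustration, and the matching is
a plain case-sensitive string comparison, which is a simplification.

```python
def parse_hosts(text):
    """Map each hostname/alias in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def missing_hosts(text, expected):
    """Return the expected hostnames that have no entry in the hosts text."""
    known = parse_hosts(text)
    return [name for name in expected if name not in known]


if __name__ == "__main__":
    # Hypothetical addresses; real use would read /etc/hosts on each host.
    sample = """
    127.0.0.1      localhost
    192.168.122.10 ovirt-engine-host
    192.168.122.11 ovirt-engine   # engine appliance
    """
    wanted = ["ovirt-engine-host", "ovirt-engine", "ovirt-host-01"]
    print(missing_hosts(sample, wanted))
```

Running the same check on every host (L1 and L2) would show whether any of
them is missing an entry for one of the others.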

I have tried:

1) create a new F24 VM under my physical host, so, at L1. I called this
ovirt-host-01. I have enabled CPU host-passthrough in KVM so this VM
can also run nested VMs, which is of course, the main objective. I've
given this VM 3 CPUs, and 6GB of RAM. Go to Management webpage and try
to add this as a new host. 

It ALMOST succeeds, but at the end it fails to mount my hosted_storage
domain. It correctly mounts the main master data domain, and the iso
domain, but fails to mount the hosted_storage domain. I can mount this
domain manually, and I just can't work out why this fails, it's
incredibly frustrating.

2) create a new host using the latest oVirt node iso:
ovirt-node-ng-installer-ovirt-4.0-2017011712.iso

The node has a 25 GB disk, 3 CPUs, and 6GB of RAM. The Cockpit admin
webpage reports it is an "oVirt Node 4.0.6.1".

The node seems to be running fine, but I can't seem to add it to my
main cluster. The VDSM service is failing because it's not a member of
any domain. The main "System" tab in Cockpit lists all the details,
but for Domain, it says: "Join Domain". Nowhere in this Cockpit admin
webpage does there appear to be any way to actually specify the domain
to join.

So, I go to my main engine Admin page, and try to add this node as a
new host. It attempts to do so, but fails after trying to install the
package collectd, which is in EPEL. So I enable EPEL, even though I'm
sure something must be drastically wrong here, because this oVirt node
should have everything it needs already installed on it. When I try to
reinstall, it fails with the exact same error in the deployment log,
saying collectd cannot be found even though it is available from EPEL
and the EPEL repo is enabled.
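
To narrow down the hosted_storage mount failure from the first attempt, one
thing that could be compared is which NFS exports a host actually has
mounted against which ones it should have. The following is only a sketch
of that comparison: it parses /proc/mounts-style text, and the export
paths and mount points in the sample are hypothetical, not copied from my
machines.

```python
def nfs_mounts(mounts_text):
    """Return (source, mountpoint, fstype) for each NFS line in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, mountpoint, fstype = fields[:3]
        if fstype in ("nfs", "nfs4"):
            found.append((source, mountpoint, fstype))
    return found


def unmounted_exports(mounts_text, expected_sources):
    """Return the expected server:/path sources that are not currently mounted."""
    mounted = {source for source, _, _ in nfs_mounts(mounts_text)}
    return [src for src in expected_sources if src not in mounted]


if __name__ == "__main__":
    # Hypothetical sample; real use would pass open("/proc/mounts").read().
    sample = (
        "proc /proc proc rw,nosuid 0 0\n"
        "192.168.122.1:/exports/data /mnt/data nfs4 rw,relatime 0 0\n"
        "192.168.122.1:/exports/iso /mnt/iso nfs4 rw,relatime 0 0\n"
    )
    expected = [
        "192.168.122.1:/exports/data",
        "192.168.122.1:/exports/iso",
        "192.168.122.1:/exports/hosted_storage",
    ]
    print(unmounted_exports(sample, expected))
```

The comparison is an exact string match, so a trailing slash or a hostname
used in place of an IP address would show up as a mismatch.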


Summary:

This is incredibly frustrating. Why is it so hard to add a host to an
oVirt cluster? This should be child's play. What is going on?

I have seen another email to this list saying you have to run hosted
engine deploy on bare metal (L0), so the engine VM is at L1, or the HA
features of vdsm are pointless, but I'm not clear why that is. I was
hoping I would be able to get some sort of power management set up
eventually. KVM & virsh should be able to provide full power management
of the hosts, even if I had to kludge together a few scripts myself.

Plus, that still shouldn't prevent me from setting up this test cluster,
surely. I know I haven't configured power management properly on the
hosts I'm adding, but so what, they should at least work for testing
purposes.

If we set aside the power management issues for now, I can't see why
there should be issues setting up this whole cluster under KVM on a
single physical host. The nested virtualisation is working perfectly.
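
For reference, this is roughly how the nested virtualisation claim can be
verified. It is a minimal sketch: on the physical host (L0) the kvm_intel
or kvm_amd module exposes a "nested" parameter under /sys/module, and
inside an L1 guest using CPU host-passthrough the vmx (Intel) or svm (AMD)
flag should be visible in /proc/cpuinfo. The helper names are my own.

```python
from pathlib import Path


def nested_enabled(param_text):
    """Interpret the contents of a kvm module's 'nested' parameter file."""
    return param_text.strip() in ("Y", "y", "1")


def has_virt_flag(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists vmx or svm."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and {"vmx", "svm"} & set(value.split()):
            return True
    return False


if __name__ == "__main__":
    # On L0: report the nested setting of whichever kvm module is loaded.
    for module in ("kvm_intel", "kvm_amd"):
        param = Path("/sys/module") / module / "parameters" / "nested"
        if param.exists():
            print(module, "nested:", nested_enabled(param.read_text()))
    # On L1: the guest can only run its own VMs if it sees vmx or svm.
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("vmx/svm visible:", has_virt_flag(cpuinfo.read_text()))
```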

Apologies for the long email, but I thought it best to get most of the
details out there, so ppl won't have to go back & forth asking questions
just to see what I am trying to do. Any questions re details I've left
out though, pls feel free to ask.

If anyone can offer any help or advice it'd be greatly appreciated,
ta and cheers!

J