On Tue, Sep 15, 2015 at 9:58 PM, Joachim Tingvold <joachim@tingvold.com> wrote:
Hi,

First-time user of oVirt, so bear with me.

Trying to get redundant oVirt + gluster set up. Have four hosts;

  gluster1 (CentOS7)
  gluster2 (CentOS7)
  ovirt1 (CentOS7)
  ovirt2 (CentOS7)

Using replica 3 volume with arbiter node (new in 3.7.0). Got that part up and running (using ovirt1 as the arbiter node), and it works fine.

Initial goal (before reading up on both gluster and oVirt) was to have everything v6-only, but found out quickly enough that we had to scratch that plan for now (I see that there is some activity on both gluster and oVirt on this, which is nice).

Anyways. We wanted to use the "self hosted engine gluster"-feature (which, by the looks of it, is only present in 3.6). We installed 3.6b4 (3.6.0.1-0.1.20150821.gitc8ddcd8.el7.centos).

I already had the network set up (couldn't find any specifics on this in the somewhat lacking oVirt-documentation?), something along these lines;

 * eth0 + eth1, bonded in bond0 (LACP)
 * vlan110 on top of bond0: v6-only for mgmt of host
 * vlan111 on top of bond0: v4 for gluster + ovirt

We then ran the 'hosted-engine --deploy' command, filling out the information as best as we could (some of these options seemed to lack documentation, or at least we had trouble finding it). The end-result was like this[1].

Accepting this, we suddenly found ourselves without connectivity to the host. Logged in via KVM, and this[2] was the last part of the log.

All of the interfaces we had before (bond0, vlan110, vlan111) had their configuration "wiped clean", and VDSM seems to have taken control of that part (however, since the script failed, we seem to have ended up in some kind of "limbo mode"). Rebooting didn't help bring things up again, and we're currently looking into manually configuring things via VDSM.


Hi Joachim,
unfortunately you hit this one:
https://bugzilla.redhat.com/1263311
The latest VDSM build sometimes fails to set up the management network, which causes the issue you described.
It's not systematic; we are seeing it in our CI env: some runs work correctly while others, with the same code and the same parameters, fail.
So, till we solve it, just clean up VDSM network conf and retry deploying hosted-engine.
Feel free to add any relevant findings on that bug.


Thought I'd post here meanwhile, to see if we've missed something obvious, or if oVirt should've handled this any differently?

If relevant, the content of the answer-file referenced in [2] can be found here[3].

[1] <http://files.jocke.no/b/dump_2015-09-08_22.14.21.png>
[2] <http://files.jocke.no/b/dump_2015-09-08_22.06.06.png>
[3] <http://files.jocke.no/b/answers-20150908215613.conf>

--
Joachim