On Tue, Apr 30, 2019 at 9:50 AM Yedidyah Bar David <
didi@redhat.com> wrote:
On Tue, Apr 30, 2019 at 5:09 AM Todd Barton
<tcbarton@ipvoicedatasystems.com> wrote:
>
> I'm having to rebuild an environment that started back in the early 3.x days. A lot has changed, and I'm attempting to use the oVirt Node based setup to build a new environment, but I can't get through the hosted engine deployment process via Cockpit (I've tried the command line as well). I've tried a static DHCP address and static IPs, and I've confirmed I have resolvable hostnames. This is a test environment so I can work through any issues in deployment.
>
> When Cockpit is displaying the "waiting for host to come up" task, Cockpit gets disconnected. It appears to happen when the bridge network is set up. At that point, the deployment is messed up and I can't return to Cockpit. I've tried this with one and with two NICs, and tried every permutation of static and dynamic IP addresses. I've spent a week trying different setups and I've got to be doing something stupid.
>
> Attached is a screen capture of the resulting IP info after my latest failed attempt. I used two NICs, one for the gluster and bridge network and the other for oVirt Cockpit access. I can't access Cockpit on either IP address after the failure.
>
> I've attempted this setup as both a single host hyper-converged setup and a three host hyper-converged environment...same issue in both.
>
> Can someone please help me or give me some thoughts on what is wrong?
There are two parts here: 1. Fix it so that you can continue (and so
that if it happens to you on production, you know what to do) 2. Fix
the code so that it does not happen again. They are not necessarily
identical (or even very similar).
At the point in time of taking the screen capture:
1. Did the ovirtmgmt bridge get the IP address of the intended nic? Which one?
2. Did you check routing? Default gateway, or perhaps you had/have
specific other routes?
3. What nics are in the bridge? Can you check/share output of 'brctl show'?
4. Probably not related, just noting: You have there (currently on
eth0 and on ovirtmgmt, perhaps you tried other combinations):
10.1.2.61/16 and 10.1.1.61/16 . It seems like you wanted two different
subnets, but are actually using a single one. Perhaps you intended to
use 10.1.2.61/24 and 10.1.1.61/24.
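To make point 4 concrete, here is a minimal sketch using Python's standard `ipaddress` module. The addresses are the ones from the screen capture; the helper name `same_subnet` is only illustrative and is not part of any oVirt tooling.

```python
import ipaddress


def same_subnet(cidr_a, cidr_b):
    """Return True if both interface addresses fall in the same IP network."""
    net_a = ipaddress.ip_interface(cidr_a).network
    net_b = ipaddress.ip_interface(cidr_b).network
    return net_a == net_b


# As currently configured: both addresses use a /16 prefix.
print(ipaddress.ip_interface("10.1.2.61/16").network)  # 10.1.0.0/16
print(ipaddress.ip_interface("10.1.1.61/16").network)  # 10.1.0.0/16
print(same_subnet("10.1.2.61/16", "10.1.1.61/16"))     # True

# With a /24 prefix the two addresses land in different networks.
print(ipaddress.ip_interface("10.1.2.61/24").network)  # 10.1.2.0/24
print(ipaddress.ip_interface("10.1.1.61/24").network)  # 10.1.1.0/24
print(same_subnet("10.1.2.61/24", "10.1.1.61/24"))     # False
```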
Good catch: the issue comes exactly from here!
Please see:
https://bugzilla.redhat.com/1694626
The issue happens when the user has two interfaces configured on the same IP subnet, the default gateway is reachable through one of the two interfaces, and the user chooses to create the management bridge on the other one.
When the engine adds the host and creates the management bridge, it also tries to configure the default gateway on the bridge; for some reason this disrupts external connectivity on the host, and the user loses access to it.
If you intend to use one interface for gluster and the other for the management network, I'd strongly suggest using two distinct subnets, with the default gateway on the subnet you are going to use for the management network.
If you want to use two interfaces for reliability reasons, I'd strongly suggest creating a bond of the two instead.
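As a rough pre-flight check of an addressing plan, the sketch below lists which interfaces have the default gateway on their subnet. Everything in it is an assumption for illustration: the gateway address 10.1.1.1, the interface labels `mgmt_nic` and `gluster_nic`, and the helper name `interfaces_reaching` are made up, and this is not how the engine itself decides anything.

```python
import ipaddress


def interfaces_reaching(gateway, interfaces):
    """Return the names of interfaces whose subnet contains the gateway."""
    gw = ipaddress.ip_address(gateway)
    return [
        name
        for name, cidr in interfaces.items()
        if gw in ipaddress.ip_interface(cidr).network
    ]


# Hypothetical gateway address, used only for this example.
GATEWAY = "10.1.1.1"

# Both interfaces on the same /16: the gateway is on-link for both of them,
# which is the ambiguous layout described in the bug.
shared = {"mgmt_nic": "10.1.1.61/16", "gluster_nic": "10.1.2.61/16"}
print(interfaces_reaching(GATEWAY, shared))    # ['mgmt_nic', 'gluster_nic']

# Two distinct /24 subnets: only the management interface sees the gateway.
distinct = {"mgmt_nic": "10.1.1.61/24", "gluster_nic": "10.1.2.61/24"}
print(interfaces_reaching(GATEWAY, distinct))  # ['mgmt_nic']
```

A plan where this list contains exactly the interface you intend to put the management bridge on matches the suggestion above.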
Please also note that deploying a three-host hyper-converged environment over a single 1 Gbps interface will be severely limiting in terms of storage performance.
Each piece of data has to be written on the host itself and on the two remote ones, so you are going to have 1000 Mbps / 2 (external replicas) / 8 (bits per byte) = a maximum of 62.5 MB/s sustained throughput shared between all the VMs, and this ignores all the overheads.
In practice it will be much less, resulting in a barely usable environment.
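The arithmetic above can be reproduced with a few lines of Python. This is only the same back-of-the-envelope upper bound, ignoring all overheads; the function name is illustrative, and the 10 Gbps figure is an assumed comparison point that does not come from this thread.

```python
def max_write_mb_per_s(link_mbps, remote_replicas):
    """Upper bound on sustained write throughput in MB/s.

    Every write is sent once per remote replica over the same link,
    so the link rate is divided by the number of remote replicas and
    then by 8 to convert megabits to megabytes.
    """
    return link_mbps / remote_replicas / 8


# Three-host replica over a single 1 Gbps link: two remote copies per write.
print(max_write_mb_per_s(1000, 2))   # 62.5

# Assumed comparison: the same layout over a 10 Gbps link.
print(max_write_mb_per_s(10000, 2))  # 625.0
```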