Thanks a bunch for the reply, Didi and Simone.  I will admit this last setup was a bit of a wild attempt to see if I could get it working somehow, so maybe it wasn't the best example to submit...and yeah, those should have been /24 subnets.  Initially I tried the single-NIC setup, but the outcome seemed to be the same.

Honestly, I've run through this setup so many times in the last week it's all a blur.  I started messing with multiple NICs in my latest attempts to see if this was something specific I should do in a cockpit setup, as one of the articles I read suggested multiple interfaces to separate traffic.

My "production" 4.0 environment (currently a failed upgrade with a down host that I can't seem to get back online) is a 3-host Gluster setup on 4 bonded 1 Gbps links.  With the exception of the upgrade issue/failure, it has been rock-solid with good performance, and I've only restarted hosts for upgrades in 4+ years.  There are a few networking changes I would like to make in a rebuild, but I wanted to test various options before implementing them.  Getting a single-NIC environment working was the initial goal to get started.

I'm doing this testing in a virtualized setup with pfSense as the firewall/router, and I can set up hosts/NICs however I want.  I will start over again with a more straightforward setup and get more data on the failure.  Considering I can set up the environment how I want, what would be your recommended config for a single-NIC (or single-bond) setup using cockpit?  Static IPs with hosts-file resolution, DHCP with MAC-specific IPs, etc.?

Thank you,

Todd Barton




---- On Tue, 30 Apr 2019 05:20:04 -0400 Simone Tiraboschi <stirabos@redhat.com> wrote ----



On Tue, Apr 30, 2019 at 9:50 AM Yedidyah Bar David <didi@redhat.com> wrote:
On Tue, Apr 30, 2019 at 5:09 AM Todd Barton
>
> I'm having to rebuild an environment that started back in the early 3.x days.  A lot has changed and I'm attempting to use the oVirt Node based setup to build a new environment, but I can't get through the hosted engine deployment process via the cockpit (I've tried the command line as well).  I've tried static DHCP addresses and static IPs, and I've confirmed I have resolvable host names.  This is a test environment so I can work through any issues in deployment.
>
> When the cockpit is displaying the "waiting for host to come up" task, the cockpit gets disconnected.  It appears to happen when the bridge network is set up.  At that point, the deployment is messed up and I can't return to the cockpit.  I've tried this with one or two NICs/interfaces and tried every permutation of static and dynamic IP addresses.  I've spent a week trying different setups and I've got to be doing something stupid.
>
> Attached is a screen capture of the resulting IP info after my latest try failing.  I used two nics, one for the gluster and bridge network and the other for the ovirt cockpit access.  I can't access cockpit on either ip address after the failure.
>
> I've attempted this setup as both a single host hyper-converged setup and a three host hyper-converged environment...same issue in both.
>
> Can someone please help me or give me some thoughts on what is wrong?

There are two parts here: 1. Fix it so that you can continue (and so
that if it happens to you on production, you know what to do) 2. Fix
the code so that it does not happen again. They are not necessarily
identical (or even very similar).

At the point in time of taking the screen capture:

1. Did the ovirtmgmt bridge get the IP address of the intended nic? Which one?

2. Did you check routing? Default gateway, or perhaps you had/have
specific other routes?

3. What nics are in the bridge? Can you check/share output of 'brctl show'?

4. Probably not related, just noting: You have there (currently on
eth0 and on ovirtmgmt, perhaps you tried other combinations):
10.1.2.61/16 and 10.1.1.61/16 . It seems like you wanted two different
subnets, but are actually using a single one. Perhaps you intended to
 
Good catch: the issue comes exactly from here!
Please see:

The issue happens when the user has two interfaces configured on the same IP subnet, the default gateway is configured to be reached from one of the two interfaces, and the user chooses to create the management bridge on the other one.
When the engine adds the host and creates the management bridge, it also tries to configure the default gateway on the bridge; for some reason this disrupts external connectivity on the host, and the user loses access to it.

If you intend to use one interface for gluster and the other for the management network, I'd strongly suggest using two distinct subnets, with the default gateway on the subnet you are going to use for the management network.
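
Whether two addresses actually sit on distinct subnets is easy to check before deploying. The following is only a minimal sketch using Python's standard ipaddress module; the helper name same_subnet is made up for illustration, and the addresses are the ones from the screen capture discussed in this thread:

```python
import ipaddress


def same_subnet(a: str, b: str) -> bool:
    """Return True if two interface addresses (CIDR notation) share one IP network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


# Addresses from the screen capture: with /16 masks both fall in 10.1.0.0/16.
print(same_subnet("10.1.2.61/16", "10.1.1.61/16"))  # True

# With /24 masks the two interfaces land on distinct subnets.
print(same_subnet("10.1.2.61/24", "10.1.1.61/24"))  # False
```

If the check returns True for the gluster and management interfaces, the setup matches the problematic same-subnet case described above.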

If you want to use two interfaces for reliability reasons, I'd strongly suggest creating a bond of the two instead.

Please also notice that deploying a three-host hyper-converged environment over a single 1 Gbps interface will be really penalizing in terms of storage performance.
Each piece of data has to be written on the host itself and on the two remote ones, so you are going to have 1000 Mbps / 2 (external replicas) / 8 (bits per byte) = a max of 62.5 MB/s sustained throughput shared between all the VMs, and this is ignoring all the overheads.
In practice it will be much less, resulting in a barely usable environment.

I'd strongly suggest moving to a 10 Gbps environment if possible, or bonding a few 1 Gbps NICs for gluster.
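
The back-of-the-envelope estimate above can be reproduced for other link speeds. This is a rough sketch only: the function name is hypothetical, and it models just the theoretical ceiling (link rate divided by the number of remote replicas, converted from bits to bytes), ignoring all protocol and replication overheads:

```python
def max_replica_throughput_mb_s(link_mbps: float, remote_replicas: int) -> float:
    """Theoretical ceiling in MB/s: link rate split across the remote
    replicas, then converted from megabits to megabytes (8 bits per byte)."""
    return link_mbps / remote_replicas / 8


# Single 1 Gbps link, two remote replicas (three-host setup).
print(max_replica_throughput_mb_s(1000, 2))   # 62.5

# Same setup on a 10 Gbps link.
print(max_replica_throughput_mb_s(10000, 2))  # 625.0
```

Real-world numbers will be lower than these ceilings, and the result is shared between all the VMs.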


5. Can you ping from/to these two addresses from/to some other machine
on the network? Your laptop? The storage?

6. If possible, please check/share relevant logs, including (from the
host) /var/log/vdsm/* and /var/log/ovirt-hosted-engine-setup/*.

Thanks and best regards,
--
Didi
_______________________________________________
Users mailing list -- users@ovirt.org