On Thu, May 2, 2019 at 9:57 AM Simone Tiraboschi <stirabos(a)redhat.com>
wrote:
On Thu, May 2, 2019 at 5:22 AM Todd Barton <
tcbarton(a)ipvoicedatasystems.com> wrote:
> Didi,
>
> I was able to carve out some time to attempt the original basic setup
> again this evening. The result was similar to my original post. During HE
> deployment, in the process of waiting for the host to come up (cockpit
> message), the networking is disrupted while building the bridged network
> and the host becomes unreachable.
>
> In this state, I can't ping the host from an external machine, and
> ping/nslookup are non-functional from within the host. Nslookup returns
> "connection timed out; no servers could be reached". The networking
> appears to be completely down, although various commands make it appear
> operational.
>
> Upon rebooting the Host (the host locked up on the reboot attempt and
> needed to be reset), the message "libvirt-guests is configured not to
> start any guests on boot" appears. After the reboot, cockpit becomes
> responsive again and logging in displays "This system is already
> registered ovirt-dr-he-standalone.ipvoicedatasystems.lan!" with a
> "Redeploy" button. Looking at the networking setup in cockpit, it appears
> the "ovirtmgmt" network is set up, but the hosted engine did not complete
> deployment and startup. The /etc/hosts file still contains the temporary
> IP address used in deployment, and a HostedEngineLocal VM is listed under
> virtual machines, but it is not running.
>
> Please advise with any help/input on why this is happening. *Your help
> is much appreciated.*
>
>
> Here are the settings and diagnostic info/logs.
>
> This is a single-host hyper-converged setup for lab testing.
>
> - Host behind pfsense firewall with gateway IP address 10.1.1.1/24. The
> Host machine and the machine accessing the cockpit from IP address
> 10.1.1.101 are the only devices on the subnet (other than the router). It
> really can't get any simpler.
>
> - Host set up with a single nic, eth0
> - Hostname is set up as a full FQDN on the Host
> - Static IP set up on the Host with gateway and DNS server set to 10.1.1.1
> - FQDNs confirmed resolvable on subnet via dns server at 10.1.1.1 in
> pfsense
> Host = ovirt-dr-standalone.ipvoicedatasystems.lan , IP = 10.1.1.61
> Hosted Engine VM = ovirt-dr-he-standalone.ipvoicedatasystems.lan ,
> IP = 10.1.1.60
>
> - Gluster portion of cockpit setup installed as expected without problems
>
>
Everything defined here looks OK to me.
> - Hosted-Engine cockpit deployment executed with settings in attached
> screen shots.
> - Hosted engine setup and vdsm logs are attached in zip before the reboot.
> - Other network info captured in text files included in zip.
> - Screen shot of post reboot network setup in cockpit.
>
>
According to the VDSM logs, setupNetworks was executed here:
2019-05-01 20:22:14,656-0400 INFO (jsonrpc/0) [api.network] START
setupNetworks(networks={u'ovirtmgmt': {u'ipv6autoconf': True, u'nic': u'eth0',
u'ipaddr': u'10.1.1.61', u'switch': u'legacy', u'mtu': 1500,
u'netmask': u'255.255.255.0', u'dhcpv6': False, u'STP': u'no',
u'bridged': u'true', u'gateway': u'10.1.1.1', u'defaultRoute': True}},
bondings={}, options={u'connectivityCheck': u'true',
u'connectivityTimeout': 120, u'commitOnSuccess': False})
from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2 (api:48)
and it successfully completed at:
2019-05-01 20:22:22,904-0400 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:22,916-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:22,917-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:23,469-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:23,583-0400 INFO (jsonrpc/0) [api.network] FINISH
setupNetworks return={'status': {'message': 'Done', 'code': 0}}
from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2 (api:54)
2019-05-01 20:22:23,583-0400 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
call Host.setupNetworks succeeded in 8.93 seconds (__init__:312)
2019-05-01 20:22:24,033-0400 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.01 seconds (__init__:312)
Host.confirmConnectivity means that, after setupNetworks, the bootstrap
engine VM was still able to reach the host as expected.
And indeed after that we see:
2019-05-01 20:22:24,051-0400 INFO (jsonrpc/4) [api.host] START
getCapabilities() from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2
(api:48)
...
2019-05-01 20:22:25,557-0400 INFO (jsonrpc/4) [api.host] FINISH
getCapabilities return={'status': {'message': 'Done',
'code': 0}, 'info':
{u'HBAInventory': {u'iSCSI': [{u'InitiatorName':
u'iqn.1994-05.com.redhat:79982989d81e'}], u'FC': []},
u'packages2':
{u'kernel': {u'release': u'957.10.1.el7.x86_64',
u'version': u'3.10.0'},
u'glusterfs-rdma': {u'release': u'2.el7', u'version':
u'5.3'},
u'glusterfs-fuse': {u'release': u'2.el7', u'version':
u'5.3'},
u'spice-server': {u'release': u'6.el7_6.1', u'version':
u'0.14.0'},
u'librbd1': {u'release': u'4.el7', u'version':
u'10.2.5'}, u'vdsm':
{u'release': u'1.el7', u'version': u'4.30.11'},
u'qemu-kvm': {u'release':
u'18.el7_6.3.1', u'version': u'2.12.0'}, u'openvswitch':
{u'release':
u'3.el7', u'version': u'2.10.1'}, u'libvirt':
{u'release': u'10.el7_6.4',
u'version': u'4.5.0'}, u'ovirt-hosted-engine-ha':
{u'release': u'1.el7',
u'version': u'2.3.1'}, u'qemu-img': {u'release':
u'18.el7_6.3.1',
u'version': u'2.12.0'}, u'mom': {u'release':
u'1.el7.centos', u'version':
u'0.5.12'}, u'glusterfs': {u'release': u'2.el7',
u'version': u'5.3'},
u'glusterfs-cli': {u'release': u'2.el7', u'version':
u'5.3'},
u'glusterfs-server': {u'release': u'2.el7', u'version':
u'5.3'},
u'glusterfs-geo-replication': {u'release': u'2.el7',
u'version': u'5.3'}},
u'numaNodeDistance': {u'0': [10]}, u'cpuModel': u'Intel(R)
Xeon(R) CPU
X5675 @ 3.07GHz', u'nestedVirtualization': False,
u'liveMerge':
u'true', u'hooks': {u'after_vm_start':
{u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet':
{u'md5':
u'ea0a5a715da8c1badbcda28e8b8fa00e'}}, u'after_network_setup':
{u'30_ethtool_options': {u'md5':
u'ce1fbad7aa0389e3b06231219140bf0d'}},
u'after_vm_destroy': {u'delete_vhostuserclient_hook': {u'md5':
u'c2f279cc9483a3f842f6c29df13994c1'}, u'50_vhostmd': {u'md5':
u'bdf4802c0521cf1bae08f2b90a9559cf'}}, u'before_vm_start':
{u'50_hostedengine': {u'md5':
u'95c810cdcfe4195302a59574a5148289'},
u'50_vhostmd': {u'md5': u'9206bc390bcbf208b06a8e899581be2d'}},
u'after_device_migrate_destination': {u'openstacknet_utils.py':
{u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet':
{u'md5':
u'6226fbc4d1602994828a3904fc1b875d'}},
u'before_device_migrate_destination': {u'50_vmfex': {u'md5':
u'49caba1a5faadd8efacef966f79bc30a'}}, u'after_get_caps':
{u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'},
u'50_openstacknet': {u'md5':
u'5c3a9ab6e06e039bdd220c0216e45809'},
u'ovirt_provider_ovn_hook': {u'md5':
u'4c4b1d2d5460e6a65114ae36cb775df6'}},
u'after_nic_hotplug': {u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet':
{u'md5':
u'6226fbc4d1602994828a3904fc1b875d'}}, u'before_vm_migrate_destination':
{u'50_vhostmd': {u'md5': u'9206bc390bcbf208b06a8e899581be2d'}},
u'after_device_create': {u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet':
{u'md5':
u'6226fbc4d1602994828a3904fc1b875d'}}, u'before_vm_dehibernate':
{u'50_vhostmd': {u'md5': u'9206bc390bcbf208b06a8e899581be2d'}},
u'before_nic_hotplug': {u'50_vmfex': {u'md5':
u'49caba1a5faadd8efacef966f79bc30a'}, u'openstacknet_utils.py':
{u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet':
{u'md5':
u'ec8d5d7ba063a109f749cd63c7a2b58d'},
u'20_ovirt_provider_ovn_vhostuser_hook': {u'md5':
u'a8af653b7386c138b2e6e9738bd6b62c'}, u'10_ovirt_provider_ovn_hook':
{u'md5': u'73822988042847bab1ea832a6b9fa837'}},
u'before_network_setup':
{u'50_fcoe': {u'md5': u'28c352339c8beef1e1b05c67d106d062'}},
u'before_device_create': {u'50_vmfex': {u'md5':
u'49caba1a5faadd8efacef966f79bc30a'}, u'openstacknet_utils.py':
{u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet':
{u'md5':
u'ec8d5d7ba063a109f749cd63c7a2b58d'},
u'20_ovirt_provider_ovn_vhostuser_hook': {u'md5':
u'a8af653b7386c138b2e6e9738bd6b62c'}, u'10_ovirt_provider_ovn_hook':
{u'md5': u'73822988042847bab1ea832a6b9fa837'}}}, u'supportsIPv6':
True,
u'realtimeKernel': False, u'vmTypes': [u'kvm'],
u'liveSnapshot': u'true',
u'cpuThreads': u'12', u'kdumpStatus': 0, u'networks':
{u'ovirtmgmt':
{u'iface': u'ovirtmgmt', u'ipv6autoconf': True, u'addr':
u'10.1.1.61',
u'dhcpv6': False, u'ipv6addrs': [], u'switch': u'legacy',
u'bridged': True,
u'southbound': u'eth0', u'dhcpv4': False, u'netmask':
u'255.255.255.0',
u'ipv4defaultroute': True, u'stp': u'off', u'ipv4addrs':
[u'10.1.1.61/24'],
u'mtu': u'1500', u'ipv6gateway': u'::',
u'gateway': u'10.1.1.1', u'ports':
[u'eth0']}}, u'kernelArgs':
u'BOOT_IMAGE=/ovirt-node-ng-4.3.2-0.20190319.0+1/vmlinuz-3.10.0-957.10.1.el7.x86_64
root=/dev/onn/ovirt-node-ng-4.3.2-0.20190319.0+1 ro crashkernel=auto
rd.lvm.lv=onn/ovirt-node-ng-4.3.2-0.20190319.0+1 rd.lvm.lv=onn/swap rhgb
quiet LANG=en_US.UTF-8 img.bootid=ovirt-node-ng-4.3.2-0.20190319.0+1',
u'domain_versions': [0, 2, 3, 4, 5], u'bridges': {u'ovirtmgmt':
{u'ipv6autoconf': True, u'addr': u'10.1.1.61', u'dhcpv6':
False,
u'ipv6addrs': [], u'mtu': u'1500', u'dhcpv4': False,
u'netmask':
u'255.255.255.0', u'ipv4defaultroute': True, u'stp':
u'off', u'ipv4addrs':
[u'10.1.1.61/24'], u'ipv6gateway': u'::', u'gateway':
u'10.1.1.1',
u'opts': {u'multicast_last_member_count': u'2',
u'vlan_protocol':
u'0x8100', u'hash_elasticity': u'4',
u'multicast_query_response_interval':
u'1000', u'group_fwd_mask': u'0x0',
u'multicast_snooping': u'1',
u'multicast_startup_query_interval': u'3125', u'hello_timer':
u'0',
u'multicast_querier_interval': u'25500', u'max_age':
u'2000', u'hash_max':
u'512', u'stp_state': u'0', u'topology_change_detected':
u'0', u'priority':
u'32768', u'multicast_igmp_version': u'2',
u'multicast_membership_interval': u'26000', u'root_path_cost':
u'0',
u'root_port': u'0', u'multicast_stats_enabled': u'0',
u'multicast_startup_query_count': u'2', u'nf_call_iptables':
u'0',
u'vlan_stats_enabled': u'0', u'hello_time': u'200',
u'topology_change':
u'0', u'bridge_id': u'8000.00155d380110',
u'topology_change_timer': u'0',
u'ageing_time': u'30000', u'nf_call_ip6tables': u'0',
u'multicast_mld_version': u'1', u'gc_timer': u'29395',
u'root_id':
u'8000.00155d380110', u'nf_call_arptables': u'0',
u'group_addr':
u'1:80:c2:0:0:0', u'multicast_last_member_interval': u'100',
u'default_pvid': u'1', u'multicast_query_interval':
u'12500',
u'multicast_query_use_ifaddr': u'0', u'tcn_timer': u'0',
u'multicast_router': u'1', u'vlan_filtering': u'0',
u'multicast_querier':
u'0', u'forward_delay': u'0'}, u'ports':
[u'eth0']}, u'virbr0':
{u'ipv6autoconf': False, u'addr': u'192.168.122.1',
u'dhcpv6': False,
u'ipv6addrs': [], u'mtu': u'1500', u'dhcpv4': False,
u'netmask':
u'255.255.255.0', u'ipv4defaultroute': False, u'stp':
u'on', u'ipv4addrs':
[u'192.168.122.1/24'], u'ipv6gateway': u'::', u'gateway':
u'', u'opts':
{u'multicast_last_member_count': u'2', u'vlan_protocol':
u'0x8100',
u'hash_elasticity': u'4', u'multicast_query_response_interval':
u'1000',
u'group_fwd_mask': u'0x0', u'multicast_snooping': u'1',
u'multicast_startup_query_interval': u'3125', u'hello_timer':
u'138',
u'multicast_querier_interval': u'25500', u'max_age':
u'2000', u'hash_max':
u'512', u'stp_state': u'1', u'topology_change_detected':
u'0', u'priority':
u'32768', u'multicast_igmp_version': u'2',
u'multicast_membership_interval': u'26000', u'root_path_cost':
u'0',
u'root_port': u'0', u'multicast_stats_enabled': u'0',
u'multicast_startup_query_count': u'2', u'nf_call_iptables':
u'0',
u'vlan_stats_enabled': u'0', u'hello_time': u'200',
u'topology_change':
u'0', u'bridge_id': u'8000.5254008ac0fb',
u'topology_change_timer': u'0',
u'ageing_time': u'30000', u'nf_call_ip6tables': u'0',
u'multicast_mld_version': u'1', u'gc_timer': u'4000',
u'root_id':
u'8000.5254008ac0fb', u'nf_call_arptables': u'0',
u'group_addr':
u'1:80:c2:0:0:0', u'multicast_last_member_interval': u'100',
u'default_pvid': u'1', u'multicast_query_interval':
u'12500',
u'multicast_query_use_ifaddr': u'0', u'tcn_timer': u'0',
u'multicast_router': u'1', u'vlan_filtering': u'0',
u'multicast_querier':
u'0', u'forward_delay': u'200'}, u'ports':
[u'vnet0', u'virbr0-nic']}},
u'uuid': u'cb4aee34-27aa-064d-aaf1-2c27871125bc', u'onlineCpus':
u'0,1,2,3,4,5,6,7,8,9,10,11', u'nameservers': [u'10.1.1.1'],
u'nics':
{u'eth0': {u'ipv6autoconf': False, u'addr': u'',
u'speed': 10000,
u'dhcpv6': False, u'ipv6addrs': [], u'mtu': u'1500',
u'dhcpv4': False,
u'netmask': u'', u'ipv4defaultroute': False,
u'ipv4addrs': [], u'hwaddr':
u'00:15:5d:38:01:10', u'ipv6gateway': u'::', u'gateway':
u''}},
u'software_revision': u'1', u'hostdevPassthrough':
u'false',
u'clusterLevels': [u'4.1', u'4.2', u'4.3'],
u'cpuFlags':
u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ss,ht,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,rep_good,nopl,xtopology,eagerfpu,pni,pclmulqdq,vmx,ssse3,cx16,sse4_1,sse4_2,popcnt,aes,hypervisor,lahf_lm,ibrs,ibpb,stibp,tpr_shadow,vnmi,ept,vpid,spec_ctrl,intel_stibp,arch_capabilities,model_Opteron_G2,model_kvm32,model_coreduo,model_Conroe,model_Nehalem,model_Westmere-IBRS,model_Opteron_G1,model_core2duo,model_Nehalem-IBRS,model_qemu32,model_Penryn,model_pentium2,model_pentium3,model_qemu64,model_Westmere,model_kvm64,model_pentium,model_486',
u'kernelFeatures': {u'RETP': 1, u'IBRS': 0, u'PTI': 1},
u'ISCSIInitiatorName': u'iqn.1994-05.com.redhat:79982989d81e',
u'netConfigDirty': u'True', u'selinux': {u'mode':
u'1'},
u'autoNumaBalancing': 0, u'reservedMem': u'321',
u'bondings': {},
u'software_version': u'4.30.11', u'supportedENGINEs':
[u'4.1', u'4.2',
u'4.3'], u'vncEncrypted': False, u'backupEnabled': False,
u'cpuSpeed':
u'3063.656', u'numaNodes': {u'0': {u'totalMemory':
u'64248', u'cpus': [0,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]}}, u'cpuSockets': u'1',
u'vlans': {},
u'version_name': u'Snow Man', 'lastClientIface':
'ovirtmgmt', u'cpuCores':
u'6', u'hostedEngineDeployed': False, u'hugepages': [1048576,
2048],
u'guestOverhead': u'65', u'additionalFeatures':
[u'libgfapi_supported',
u'GLUSTER_SNAPSHOT', u'GLUSTER_GEO_REPLICATION',
u'GLUSTER_BRICK_MANAGEMENT'], u'openstack_binding_host_ids':
{u'OPENSTACK_OVN': u'ovirt-dr-standalone.ipvoicedatasystems.lan',
u'OPEN_VSWITCH': u'ovirt-dr-standalone.ipvoicedatasystems.lan',
u'OVIRT_PROVIDER_OVN': u'a86b72aa-c9d2-488d-b04d-1ccf4bb010e7'},
u'kvmEnabled': u'true', u'memSize': u'64248',
u'emulatedMachines':
[u'pc-i440fx-rhel7.1.0', u'pc-q35-rhel7.3.0', u'rhel6.3.0',
u'pc-i440fx-rhel7.5.0', u'pc-i440fx-rhel7.0.0', u'rhel6.1.0',
u'pc-q35-rhel7.6.0', u'pc-i440fx-rhel7.4.0', u'rhel6.6.0',
u'pc-q35-rhel7.5.0', u'rhel6.2.0', u'pc',
u'pc-i440fx-rhel7.3.0', u'q35',
u'pc-i440fx-rhel7.2.0', u'rhel6.4.0', u'pc-q35-rhel7.4.0',
u'pc-i440fx-rhel7.6.0', u'rhel6.0.0', u'rhel6.5.0'],
u'rngSources':
[u'hwrng', u'random'], u'operatingSystem': {u'release':
u'6.1810.2.el7.centos', u'pretty_name': u'oVirt Node 4.3.2',
u'version':
u'7', u'name': u'RHEL'}}} from=::ffff:192.168.122.13,47544,
flow_id=2e7d10f2 (api:54)
2019-05-01 20:22:25,579-0400 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC
call Host.getCapabilities succeeded in 1.53 seconds (__init__:312)
2019-05-01 20:22:25,743-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
call Host.setSafeNetworkConfig succeeded in 0.02 seconds (__init__:312)
where Host.setSafeNetworkConfig means that the engine committed the new
network configuration, since everything was fine from its point of view.
And after that we also have:
2019-05-01 20:22:29,748-0400 INFO (jsonrpc/7) [api.host] START
getHardwareInfo() from=::ffff:192.168.122.13,47544 (api:48)
2019-05-01 20:22:29,805-0400 INFO (jsonrpc/7) [api.host] FINISH
getHardwareInfo return={'status': {'message': 'Done',
'code': 0}, 'info':
{'systemProductName': 'Virtual Machine', 'systemUUID':
'cb4aee34-27aa-064d-aaf1-2c27871125bc', 'systemSerialNumber':
'9448-7597-9700-7577-0920-4186-69', 'systemVersion': '7.0',
'systemManufacturer': 'Microsoft Corporation'}}
from=::ffff:192.168.122.13,47544 (api:54)
So I'm pretty sure that the engine VM was correctly able to talk with the
host after the network configuration change, so, in my opinion, we should
focus somewhere else.
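If you want to trace a flow like this yourself in /var/log/vdsm/vdsm.log, filtering on the flow_id is usually enough. A minimal sketch (the sample lines and the flow_id value are taken from the excerpts above; the helper name is mine, and the exact layout of your log lines may differ):

```python
import re

def flow_events(lines, flow_id):
    """Yield (timestamp, verb, call) for the START/FINISH entries of one flow."""
    pat = re.compile(r"^(\S+ \S+) INFO .*\[(api\.\w+)\] (START|FINISH) (\w+)")
    for line in lines:
        # Keep only lines belonging to the flow we care about.
        if f"flow_id={flow_id}" not in line:
            continue
        m = pat.match(line)
        if m:
            ts, _api, verb, call = m.groups()
            yield ts, verb, call

# Two lines from the log above, abbreviated:
sample = [
    "2019-05-01 20:22:14,656-0400 INFO (jsonrpc/0) [api.network] START "
    "setupNetworks(...) from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2 (api:48)",
    "2019-05-01 20:22:23,583-0400 INFO (jsonrpc/0) [api.network] FINISH "
    "setupNetworks return={'status': {'message': 'Done', 'code': 0}} "
    "from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2 (api:54)",
]
for event in flow_events(sample, "2e7d10f2"):
    print(event)
```

Running it against the real file (`flow_events(open("/var/log/vdsm/vdsm.log"), "2e7d10f2")`) gives a compact START/FINISH timeline for the deployment flow.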
If I correctly understood your environment, you have:
- A pfsense software firewall/router on 10.1.1.1
- Your host on 10.1.1.61
- You are accessing cockpit from a browser running on a machine on
10.1.1.101 on the same subnet
And the issue is that once the engine created the management bridge, your
client machine on 10.1.1.101 was no longer able to reach your host on
10.1.1.61. Am I right?
In this case the default gateway or other routes shouldn't be an issue,
since your client is inside the same subnet.
Do you think we are losing some piece of your network configuration while
creating the management bridge, such as a custom MTU or a VLAN id or
something like that?
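One way to check the MTU part of that theory directly on the host is to compare the kernel's view of eth0 and ovirtmgmt via sysfs. A hedged sketch (the helper and its sys_root parameter are mine; the interface names are the ones from this thread):

```python
from pathlib import Path

def mtu(iface: str, sys_root: str = "/sys/class/net") -> int:
    """Read an interface's MTU from sysfs (Linux)."""
    return int(Path(sys_root, iface, "mtu").read_text())

# On the affected host, after the bridge is created, ovirtmgmt should
# carry the same MTU that eth0 had before deployment; a mismatch would
# support the "lost a custom MTU" theory:
#   print(mtu("eth0"), mtu("ovirtmgmt"))
```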
Do you think pfsense can start blocking/dropping the traffic for any
reason?
Dominik, any hint from you?
Todd, another hint: how did you expose cockpit through firewalld?
Maybe something in the firewalld zone configuration?
>
> Regards,
> *Todd Barton*
>
>
> ---- On Wed, 01 May 2019 11:29:39 -0400 *Todd Barton
> <tcbarton(a)ipvoicedatasystems.com <tcbarton(a)ipvoicedatasystems.com>>*
> wrote ----
>
> Thanks again... I've done all the detail work back in the 3.x days, and
> I thought (and was hoping) the node/cockpit setup would make it easier
> to get everything lined up for the HE deploy, but it is not working as
> expected. I've followed best practices/recommendations, but realize
> there are no absolute specifics in these recommendations... there are a
> lot of either/or statements... which is why I was asking for
> recommendations. I've reviewed many articles, including the "up and
> running" one at
https://ovirt.org/blog/2018/02/up-and-running-with-ovirt-4-2-and-gluster-...,
> and in everything I've looked at there isn't anything new or different
> vs what I've already done or attempted.
>
> I was very methodical in my initial attempts, as I was in my initial
> install of v3.3 years ago, which took many attempts and methodical
> configuration to get it up and set up the way I wanted. What I'm trying
> to understand is why it's not coming up in a lab setting with what I
> would consider to be a pretty remedial setup.
>
> I'll get back to a basic setup and run through the process again today or
> tomorrow and post logs of the failure.
>
> Regards,
>
> *Todd Barton*
>
>
> ---- On Wed, 01 May 2019 01:50:49 -0400 *Yedidyah Bar David
> <didi(a)redhat.com <didi(a)redhat.com>>* wrote ----
>
>
>
>
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement:
https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
>
https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
>
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WRVZCJQELCX...
>
> On Tue, Apr 30, 2019 at 4:09 PM Todd Barton
> <tcbarton(a)ipvoicedatasystems.com> wrote:
> >
> > Thanks a bunch for the reply Didi and Simone. I will admit this last
> setup was a bit of a wild attempt to see if I could get it working
> somehow, so maybe it wasn't the best example to submit... and yeah,
> those should have been /24 subnets. Initially I tried the single nic
> setup, but the outcome seemed to be the same scenario.
> >
> > Honestly, I've run through this setup so many times in the last week
> it's all a blur. I started messing with multiple nics in my latest
> attempts to see if this was something specific I should do in a cockpit
> setup, as one of the articles I read suggested multiple interfaces to
> separate traffic.
> >
> > My "production" 4.0 environment (currently a failed upgrade with a
> down host that I can't seem to get back online) is a 3-host gluster
> setup on 4 bonded 1Gbps links. With the exception of the upgrade
> issue/failure, it has been rock-solid with good performance, and I've
> only restarted hosts on upgrades in 4+ years. There are a few networking
> changes I would like to make in a rebuild, but I wanted to test various
> options before implementing. Getting a single nic environment up was the
> initial goal to get started.
> >
> > I'm doing this testing in a virtualized setup with pfsense as the
> firewall/router, and I can set up hosts/nics however I want. I will
> start over again with a more straightforward setup and get more data on
> the failure. Considering I can set up the environment how I want, what
> would be your recommended config for a single nic (or single bond) setup
> using cockpit? Static IPs with host file resolution, DHCP with
> MAC-specific IPs, etc.?
>
> Much of this decision is a matter of personal preference, acquaintance
> with the relevant technologies and the tooling you have around them,
> local needs/policies/mandates, existing infrastructure, etc.
>
> If you search the net, e.g. for "ovirt best practices" or "RHV best
> practices", you can find various articles etc. that can provide some
> good guidelines/ideas.
>
> I suggest reading around a bit, then spending some good time on
> planning, then carefully and systematically implementing your design,
> verifying each step right after doing it. When you run into problems,
> tell us :-). Ideally, IMO, you should not give up on your design due to
> such problems and try workarounds, inferior (in your eyes) solutions,
> etc., unless you manage to find existing open bugs that describe your
> problem and decide you can't wait until they are solved. Instead, try to
> fix problems, perhaps with the list members' help.
>
> I realize spending a week on what is in your perception a simple,
> straightforward task, does not leave you in the best mood for such a
> methodical next attempt. Perhaps first take a break and do something
> else :-), then start from a clean and fresh hardware/software
> environment and mind.
>
> Good luck and best regards,
>
> >
> > Thank you,
> >
> > Todd Barton
> >
> >
> >
> >
> > ---- On Tue, 30 Apr 2019 05:20:04 -0400 Simone Tiraboschi <
> stirabos(a)redhat.com> wrote ----
> >
> >
> >
> > On Tue, Apr 30, 2019 at 9:50 AM Yedidyah Bar David <didi(a)redhat.com>
> wrote:
> >
> > On Tue, Apr 30, 2019 at 5:09 AM Todd Barton
> > <tcbarton(a)ipvoicedatasystems.com> wrote:
> > >
> > > I'm having to rebuild an environment that started back in the early
> 3.x days. A lot has changed, and I'm attempting to use the oVirt Node
> based setup to build a new environment, but I can't get through the
> hosted engine deployment process via the cockpit (I've done command line
> as well). I've tried static DHCP addresses and static IPs, and confirmed
> I have resolvable host-names. This is a test environment, so I can work
> through any issues in deployment.
> > >
> > > When the cockpit is displaying the "waiting for host to come up"
> task, the cockpit gets disconnected. It appears to happen when the
> bridge network is set up. At that point, the deployment is messed up and
> I can't return to the cockpit. I've tried this with one or two
> nics/interfaces and tried every permutation of static and dynamic IP
> addresses. I've spent a week trying different setups, and I've got to be
> doing something stupid.
> > >
> > > Attached is a screen capture of the resulting IP info after my latest
> try failing. I used two nics, one for the gluster and bridge network and
> the other for the ovirt cockpit access. I can't access cockpit on either ip
> address after the failure.
> > >
> > > I've attempted this setup as both a single host hyper-converged setup
> and a three host hyper-converged environment...same issue in both.
> > >
> > > Can someone please help me or give me some thoughts on what is wrong?
> >
> > There are two parts here: 1. Fix it so that you can continue (and so
> > that if it happens to you on production, you know what to do) 2. Fix
> > the code so that it does not happen again. They are not necessarily
> > identical (or even very similar).
> >
> > At the point in time of taking the screen capture:
> >
> > 1. Did the ovirtmgmt bridge get the IP address of the intended nic?
> Which one?
> >
> > 2. Did you check routing? Default gateway, or perhaps you had/have
> > specific other routes?
> >
> > 3. What nics are in the bridge? Can you check/share output of 'brctl
> show'?
> >
> > 4. Probably not related, just noting: You have there (currently on
> > eth0 and on ovirtmgmt, perhaps you tried other combinations):
> > 10.1.2.61/16 and 10.1.1.61/16 . It seems like you wanted two different
> > subnets, but are actually using a single one. Perhaps you intended to
> > use 10.1.2.61/24 and 10.1.1.61/24.
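(To make point 4 concrete: with a /16 mask those two addresses fall into the same network, which Python's standard ipaddress module confirms; a quick check:)

```python
import ipaddress

# 10.1.2.61/16 and 10.1.1.61/16 land in the SAME network (10.1.0.0/16)...
same = (ipaddress.ip_interface("10.1.2.61/16").network
        == ipaddress.ip_interface("10.1.1.61/16").network)
# ...while /24 masks give the two distinct subnets presumably intended:
distinct = (ipaddress.ip_interface("10.1.2.61/24").network
            != ipaddress.ip_interface("10.1.1.61/24").network)
print(same, distinct)  # True True
```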
> >
> >
> > Good catch: the issue comes exactly from here!
> > Please see:
> >
https://bugzilla.redhat.com/1694626
> >
> > The issue happens when the user has two interfaces configured on the
> same IP subnet, the default gateway is configured to be reached from one
> of the two interfaces, and the user chooses to create the management
> bridge on the other one. When the engine, adding the host, creates the
> management bridge, it also tries to configure the default gateway on the
> bridge, and for some reason this disrupts the external connectivity on
> the host, so the user is going to lose it.
> >
> > If you intend to use one interface for gluster and the other for the
> management network, I'd strongly suggest using two distinct subnets,
> with the default gateway on the subnet you are going to use for the
> management network.
> >
> > If you want to use two interfaces for reliability reasons, I'd
> strongly suggest creating a bond of the two instead.
> >
> > Please also notice that deploying a three-host hyper-converged
> environment over a single 1 Gbps interface will be really penalizing in
> terms of storage performance. Each write has to land on the host itself
> and on the two remote ones, so you are going to have 1000 Mbps / 2
> (remote replicas) / 8 (bits/byte) = a max of 62.5 MB/s of sustained
> throughput shared between all the VMs, and this ignores all the
> overheads. In practice it will be much less, ending in a barely usable
> environment.
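The arithmetic above can be sketched as a tiny helper (remote_replicas=2 reflects the two remote copies described here; this is a theoretical ceiling only, since real throughput will be lower once protocol overhead is included):

```python
def max_sustained_write_mbs(link_mbps: float, remote_replicas: int = 2) -> float:
    """Theoretical ceiling on client write throughput, in MB/s.

    Each write crosses the wire once per remote replica; /8 converts
    megabits to megabytes. Ignores all protocol and network overhead.
    """
    return link_mbps / remote_replicas / 8

print(max_sustained_write_mbs(1000))   # single 1 Gbps link -> 62.5
print(max_sustained_write_mbs(10000))  # 10 Gbps -> 625.0
```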
> >
> > I'd strongly suggest moving to a 10 Gbps environment if possible, or
> bonding a few 1 Gbps nics for gluster.
> >
> >
> > 5. Can you ping from/to these two addresses from/to some other machine
> > on the network? Your laptop? The storage?
> >
> > 6. If possible, please check/share relevant logs, including (from the
> > host) /var/log/vdsm/* and /var/log/ovirt-hosted-engine-setup/*.
> >
> > Thanks and best regards,
> > --
> > Didi
> >
> >
> >
>
>
> --
> Didi
>
>
>
>