
Simone/Dominik,

Double reply below, plus more info from my latest attempt.

---------------

Simone...answers to your questions, using CAPS to make my responses easier to see/read.

"If I correctly understood your environment you have: - A pfsense software firewall/router on 10.1.1.1 - Your host on 10.1.1.61 - You are accessing cockpit from a browser running on a machine on 10.1.1.101 on the same subnet"

YES

"And the issue is that once the engine created the management bridge, your client machine on 10.1.1.101 wasn't able anymore to reach your host on 10.1.1.61. Am I right?"

YES

"In this case the default gateway or other routes shouldn't be an issue since your client is inside the same subnet."

CORRECT, IMO

"Do you think we are losing some piece of your network configuration creating the management bridge such as a custom MTU or a VLAN id or something like that?"

NO CUSTOM SETUP HERE...RUNNING A PLAIN/BASIC NETWORK

"Do you think pfsense can start blocking/dropping the traffic for any reason?"

NO, ONLY USING PFSENSE TO PROVIDE DHCP AND DNS IN A TEST/LAB ENVIRONMENT.

"Todd, another hint: how did you expose cockpit over firewalld? Maybe something in the firewalld zone configuration?"

I'M USING THE OVIRT NODE MINIMAL/UTILITY INSTALL AND I DIDN'T DO ANY CUSTOMIZATION OUTSIDE OF THE INSTALL/SETUP. IF IT'S FIREWALLD, THEN IT'S NOT SOMETHING THE NODE SETUP DID PROPERLY. I CAN SEND CONFIG INFO IF YOU WOULD LIKE TO SEE IT, BUT THERE ARE MORE DEVELOPMENTS/INFO FURTHER DOWN IN THE EMAIL FROM ADDITIONAL EFFORTS.

---------------

Dominik,

It is a virtualized setup, as I'm testing install/setup of this version of ovirt/node mainly as a lab/testing environment, but I was hoping to use it as a temporary data center to keep critical VMs running if necessary while performing the reinstall/rebuild of my physical system. I'm running this in Hyper-V with the virtual switch/network set up as a private network with no restrictions from what I can see. The VMs are set up to allow MAC spoofing and the MAC addresses are all unique. The virtualization could be the culprit, but I have no idea how/why it would be causing a problem at this specific point of the install. See the new info below; I'm curious about your thoughts on this issue.

---------------

**** Additional info from "Redeploy" option ****

To see what would happen, I attempted the "Redeploy" option in the cockpit after the reboot described in the previous email. Upon Redeploy using the same settings, it made it through the "Prepare VM" stage. I didn't manually perform a hosted-engine cleanup command, but it looked like the deploy script cleaned up everything before attempting the redeployment. I've repeated this behavior twice, so for some reason the redeployment works after the 1st failure.

Continuing on to specify the "storage" settings to finalize deployment, it failed at "Obtain SSO token using username/password credentials" because it couldn't connect to the engine. The HostedEngineLocal VM is running on the Host with 192.168.122.13 as its IP (the temp IP). Trying to ping that address from the Host gets a "Destination Host Unreachable" from 192.168.122.1. Logging into the HostedEngineLocal console from the cockpit, I can't ping 192.168.122.13 from within the hosted-engine either (same unreachable message), but I can ping the Host address 10.1.1.61 from within the hosted-engine. I'm assuming the hosted-engine should still be on this temp/private IP until after completion of the HE deployment.
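For what it's worth, these are the kinds of checks I've been running from the Host against the temp address (a rough sketch of the commands, assuming the standard iproute2/libvirt tooling on the Node image):

  virsh -r net-list --all          # is the libvirt "default" network active?
  ip addr show virbr0              # host side of the 192.168.122.0/24 network
  ip route get 192.168.122.13      # which interface/route the host would use
  ping -c 3 192.168.122.13         # gives "Destination Host Unreachable" from 192.168.122.1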
It seems like if I could make 192.168.122.13 routable from the Host, I could get this to work. Any advice on how to fix this? I don't understand why I can't ping 192.168.122.13 from anywhere. I've attached logs and the output of ip info commands from the Host, as well as screenshots from cockpit including the storage/final deployment error and the hosted-engine's basic networking info.

Thanks,

Todd B.

---- On Thu, 02 May 2019 04:51:34 -0400 Dominik Holler <dholler@redhat.com> wrote ----

On Thu, 2 May 2019 09:57:08 +0200 Simone Tiraboschi <stirabos@redhat.com> wrote:
On Thu, May 2, 2019 at 5:22 AM Todd Barton <tcbarton@ipvoicedatasystems.com> wrote:
Didi,
I was able to carve out some time to attempt the original basic setup
again this evening. The result was similar to my original post. During HE
deployment, in the process of waiting for the host to come up (cockpit
message), the networking is disrupted while building the bridged network
and the host becomes unreachable.
In this state, I can't ping the host from an external machine, and ping/nslookup are non-functional from within the host. Nslookup returns
"connection timed out; no servers could be reached". The networking appears
to be completely down, although various commands make it appear operational.
Upon rebooting the Host (the host locked up on the reboot attempt and needed
to be reset), the message appears "libvirt-guests is configured not to
start any guests on boot". After the reboot, the cockpit becomes
responsive again and logging in displays "This system is already
registered ovirt-dr-he-standalone.ipvoicedatasystems.lan!" with a
"Redeploy" button. Looking at the networking setup in cockpit, it appears
the "ovirtmgmt" network is set up, but the hosted engine did not complete
deployment and startup. The /etc/hosts file still contains the temporary IP
address used in deployment, and a HostedEngineLocal is listed under virtual
machines, but it is not running.
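If it helps, on the next attempt I can capture the host state right when it goes unreachable, along these lines (just a sketch of what I'd plan to run; assuming the usual iproute2/virsh tools are on the Node image):

  ip -br addr show          # addresses on eth0 / ovirtmgmt at that moment
  ip route show             # confirm the default route still points at 10.1.1.1
  cat /etc/resolv.conf      # confirm the DNS server is still 10.1.1.1
  ping -c 3 10.1.1.1        # reachability of the pfsense gateway
  virsh -r list --all       # state of the HostedEngineLocal VM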
Please advise with any help/input on why this is happening. *Your help
is much appreciated.*
Here are the settings and diagnostic info/logs.
This is a single-host hyper-converged setup for lab testing.
- Host behind pfsense firewall with gateway IP address 10.1.1.1/24. The
Host machine and the machine accessing the cockpit from IP address
10.1.1.101 are the only devices on the subnet (other than the router). It
really can't get any simpler.
- Host setup with single nic eth0
- Hostname is set up as a full FQDN on the Host
- Static IP setup on Host with gateway and DNS server set to 10.1.1.1
- FQDNs confirmed resolvable on subnet via dns server at 10.1.1.1 in
pfsense
Host = ovirt-dr-standalone.ipvoicedatasystems.lan , IP = 10.1.1.61
Hosted Engine VM = ovirt-dr-he-standalone.ipvoicedatasystems.lan ,
IP = 10.1.1.60
- Gluster portion of cockpit setup installed as expected without problems
Everything defined here looks OK to me.
- Hosted-Engine cockpit deployment executed with settings in attached
screenshots.
- Hosted-engine setup and vdsm logs (from before the reboot) are attached in the zip.
- Other network info captured in text files included in the zip.
- Screenshot of post-reboot network setup in cockpit.
According to VDSM logs
setupNetworks got executed here:
2019-05-01 20:22:14,656-0400 INFO (jsonrpc/0) [api.network] START
setupNetworks(networks={u'ovirtmgmt': {u'ipv6autoconf': True, u'nic':
u'eth0', u'ipaddr': u'10.1.1.61', u'switch': u'legacy', u'mtu': 1500,
u'netmask': u'255.255.255.0', u'dhcpv6': False, u'STP': u'no', u'bridged':
u'true', u'gateway': u'10.1.1.1', u'defaultRoute': True}}, bondings={},
options={u'connectivityCheck': u'true', u'connectivityTimeout': 120,
u'commitOnSuccess': False}) from=::ffff:192.168.122.13,47544,
flow_id=2e7d10f2 (api:48)
and it successfully completed at:
2019-05-01 20:22:22,904-0400 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:22,916-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:22,917-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:23,469-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:312)
2019-05-01 20:22:23,583-0400 INFO (jsonrpc/0) [api.network] FINISH
setupNetworks return={'status': {'message': 'Done', 'code': 0}}
from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2 (api:54)
2019-05-01 20:22:23,583-0400 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
call Host.setupNetworks succeeded in 8.93 seconds (__init__:312)
2019-05-01 20:22:24,033-0400 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
call Host.confirmConnectivity succeeded in 0.01 seconds (__init__:312)
Host.confirmConnectivity means that, after setupNetworks, the bootstrap
engine VM was still able to reach the host as expected.
and indeed after that we see:
2019-05-01 20:22:24,051-0400 INFO (jsonrpc/4) [api.host] START
getCapabilities() from=::ffff:192.168.122.13,47544, flow_id=2e7d10f2
(api:48)
...
2019-05-01 20:22:25,557-0400 INFO (jsonrpc/4) [api.host] FINISH
getCapabilities return={'status': {'message': 'Done', 'code': 0}, 'info':
{u'HBAInventory': {u'iSCSI': [{u'InitiatorName':
u'iqn.1994-05.com.redhat:79982989d81e'}], u'FC': []}, u'packages2':
{u'kernel': {u'release': u'957.10.1.el7.x86_64', u'version': u'3.10.0'},
u'glusterfs-rdma': {u'release': u'2.el7', u'version': u'5.3'},
u'glusterfs-fuse': {u'release': u'2.el7', u'version': u'5.3'},
u'spice-server': {u'release': u'6.el7_6.1', u'version': u'0.14.0'},
u'librbd1': {u'release': u'4.el7', u'version': u'10.2.5'}, u'vdsm':
{u'release': u'1.el7', u'version': u'4.30.11'}, u'qemu-kvm': {u'release':
u'18.el7_6.3.1', u'version': u'2.12.0'}, u'openvswitch': {u'release':
u'3.el7', u'version': u'2.10.1'}, u'libvirt': {u'release': u'10.el7_6.4',
u'version': u'4.5.0'}, u'ovirt-hosted-engine-ha': {u'release': u'1.el7',
u'version': u'2.3.1'}, u'qemu-img': {u'release': u'18.el7_6.3.1',
u'version': u'2.12.0'}, u'mom': {u'release': u'1.el7.centos', u'version':
u'0.5.12'}, u'glusterfs': {u'release': u'2.el7', u'version': u'5.3'},
u'glusterfs-cli': {u'release': u'2.el7', u'version': u'5.3'},
u'glusterfs-server': {u'release': u'2.el7', u'version': u'5.3'},
u'glusterfs-geo-replication': {u'release': u'2.el7', u'version': u'5.3'}},
u'numaNodeDistance': {u'0': [10]}, u'cpuModel': u'Intel(R) Xeon(R) CPU
X5675 @ 3.07GHz', u'nestedVirtualization': False, u'liveMerge':
u'true', u'hooks': {u'after_vm_start': {u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet': {u'md5':
u'ea0a5a715da8c1badbcda28e8b8fa00e'}}, u'after_network_setup':
{u'30_ethtool_options': {u'md5': u'ce1fbad7aa0389e3b06231219140bf0d'}},
u'after_vm_destroy': {u'delete_vhostuserclient_hook': {u'md5':
u'c2f279cc9483a3f842f6c29df13994c1'}, u'50_vhostmd': {u'md5':
u'bdf4802c0521cf1bae08f2b90a9559cf'}}, u'before_vm_start':
{u'50_hostedengine': {u'md5': u'95c810cdcfe4195302a59574a5148289'},
u'50_vhostmd': {u'md5': u'9206bc390bcbf208b06a8e899581be2d'}},
u'after_device_migrate_destination': {u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet': {u'md5':
u'6226fbc4d1602994828a3904fc1b875d'}},
u'before_device_migrate_destination': {u'50_vmfex': {u'md5':
u'49caba1a5faadd8efacef966f79bc30a'}}, u'after_get_caps':
{u'openstacknet_utils.py': {u'md5': u'1ed38ddf30f8a9c7574589e77e2c0b1f'},
u'50_openstacknet': {u'md5': u'5c3a9ab6e06e039bdd220c0216e45809'},
u'ovirt_provider_ovn_hook': {u'md5': u'4c4b1d2d5460e6a65114ae36cb775df6'}},
u'after_nic_hotplug': {u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet': {u'md5':
u'6226fbc4d1602994828a3904fc1b875d'}}, u'before_vm_migrate_destination':
{u'50_vhostmd': {u'md5': u'9206bc390bcbf208b06a8e899581be2d'}},
u'after_device_create': {u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet': {u'md5':
u'6226fbc4d1602994828a3904fc1b875d'}}, u'before_vm_dehibernate':
{u'50_vhostmd': {u'md5': u'9206bc390bcbf208b06a8e899581be2d'}},
u'before_nic_hotplug': {u'50_vmfex': {u'md5':
u'49caba1a5faadd8efacef966f79bc30a'}, u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet': {u'md5':
u'ec8d5d7ba063a109f749cd63c7a2b58d'},
u'20_ovirt_provider_ovn_vhostuser_hook': {u'md5':
u'a8af653b7386c138b2e6e9738bd6b62c'}, u'10_ovirt_provider_ovn_hook':
{u'md5': u'73822988042847bab1ea832a6b9fa837'}}, u'before_network_setup':
{u'50_fcoe': {u'md5': u'28c352339c8beef1e1b05c67d106d062'}},
u'before_device_create': {u'50_vmfex': {u'md5':
u'49caba1a5faadd8efacef966f79bc30a'}, u'openstacknet_utils.py': {u'md5':
u'1ed38ddf30f8a9c7574589e77e2c0b1f'}, u'50_openstacknet': {u'md5':
u'ec8d5d7ba063a109f749cd63c7a2b58d'},
u'20_ovirt_provider_ovn_vhostuser_hook': {u'md5':
u'a8af653b7386c138b2e6e9738bd6b62c'}, u'10_ovirt_provider_ovn_hook':
{u'md5': u'73822988042847bab1ea832a6b9fa837'}}}, u'supportsIPv6': True,
u'realtimeKernel': False, u'vmTypes': [u'kvm'], u'liveSnapshot': u'true',
u'cpuThreads': u'12', u'kdumpStatus': 0, u'networks': {u'ovirtmgmt':
{u'iface': u'ovirtmgmt', u'ipv6autoconf': True, u'addr': u'10.1.1.61',
u'dhcpv6': False, u'ipv6addrs': [], u'switch': u'legacy', u'bridged': True,
u'southbound': u'eth0', u'dhcpv4': False, u'netmask': u'255.255.255.0',
u'ipv4defaultroute': True, u'stp': u'off', u'ipv4addrs': [u'10.1.1.61/24'],
u'mtu': u'1500', u'ipv6gateway': u'::', u'gateway': u'10.1.1.1', u'ports':
[u'eth0']}}, u'kernelArgs':
u'BOOT_IMAGE=/ovirt-node-ng-4.3.2-0.20190319.0+1/vmlinuz-3.10.0-957.10.1.el7.x86_64
root=/dev/onn/ovirt-node-ng-4.3.2-0.20190319.0+1 ro crashkernel=auto
rd.lvm.lv=onn/ovirt-node-ng-4.3.2-0.20190319.0+1 rd.lvm.lv=onn/swap rhgb
quiet LANG=en_US.UTF-8 img.bootid=ovirt-node-ng-4.3.2-0.20190319.0+1',
u'domain_versions': [0, 2, 3, 4, 5], u'bridges': {u'ovirtmgmt':
{u'ipv6autoconf': True, u'addr': u'10.1.1.61', u'dhcpv6': False,
u'ipv6addrs': [], u'mtu': u'1500', u'dhcpv4': False, u'netmask':
u'255.255.255.0', u'ipv4defaultroute': True, u'stp': u'off', u'ipv4addrs':
[u'10.1.1.61/24'], u'ipv6gateway': u'::', u'gateway': u'10.1.1.1', u'opts':
{u'multicast_last_member_count': u'2', u'vlan_protocol': u'0x8100',
u'hash_elasticity': u'4', u'multicast_query_response_interval': u'1000',
u'group_fwd_mask': u'0x0', u'multicast_snooping': u'1',
u'multicast_startup_query_interval': u'3125', u'hello_timer': u'0',
u'multicast_querier_interval': u'25500', u'max_age': u'2000', u'hash_max':
u'512', u'stp_state': u'0', u'topology_change_detected': u'0', u'priority':
u'32768', u'multicast_igmp_version': u'2',
u'multicast_membership_interval': u'26000', u'root_path_cost': u'0',
u'root_port': u'0', u'multicast_stats_enabled': u'0',
u'multicast_startup_query_count': u'2', u'nf_call_iptables': u'0',
u'vlan_stats_enabled': u'0', u'hello_time': u'200', u'topology_change':
u'0', u'bridge_id': u'8000.00155d380110', u'topology_change_timer': u'0',
u'ageing_time': u'30000', u'nf_call_ip6tables': u'0',
u'multicast_mld_version': u'1', u'gc_timer': u'29395', u'root_id':
u'8000.00155d380110', u'nf_call_arptables': u'0', u'group_addr':
u'1:80:c2:0:0:0', u'multicast_last_member_interval': u'100',
u'default_pvid': u'1', u'multicast_query_interval': u'12500',
u'multicast_query_use_ifaddr': u'0', u'tcn_timer': u'0',
u'multicast_router': u'1', u'vlan_filtering': u'0', u'multicast_querier':
u'0', u'forward_delay': u'0'}, u'ports': [u'eth0']}, u'virbr0':
{u'ipv6autoconf': False, u'addr': u'192.168.122.1', u'dhcpv6': False,
u'ipv6addrs': [], u'mtu': u'1500', u'dhcpv4': False, u'netmask':
u'255.255.255.0', u'ipv4defaultroute': False, u'stp': u'on', u'ipv4addrs':
[u'192.168.122.1/24'], u'ipv6gateway': u'::', u'gateway': u'', u'opts':
{u'multicast_last_member_count': u'2', u'vlan_protocol': u'0x8100',
u'hash_elasticity': u'4', u'multicast_query_response_interval': u'1000',
u'group_fwd_mask': u'0x0', u'multicast_snooping': u'1',
u'multicast_startup_query_interval': u'3125', u'hello_timer': u'138',
u'multicast_querier_interval': u'25500', u'max_age': u'2000', u'hash_max':
u'512', u'stp_state': u'1', u'topology_change_detected': u'0', u'priority':
u'32768', u'multicast_igmp_version': u'2',
u'multicast_membership_interval': u'26000', u'root_path_cost': u'0',
u'root_port': u'0', u'multicast_stats_enabled': u'0',
u'multicast_startup_query_count': u'2', u'nf_call_iptables': u'0',
u'vlan_stats_enabled': u'0', u'hello_time': u'200', u'topology_change':
u'0', u'bridge_id': u'8000.5254008ac0fb', u'topology_change_timer': u'0',
u'ageing_time': u'30000', u'nf_call_ip6tables': u'0',
u'multicast_mld_version': u'1', u'gc_timer': u'4000', u'root_id':
u'8000.5254008ac0fb', u'nf_call_arptables': u'0', u'group_addr':
u'1:80:c2:0:0:0', u'multicast_last_member_interval': u'100',
u'default_pvid': u'1', u'multicast_query_interval': u'12500',
u'multicast_query_use_ifaddr': u'0', u'tcn_timer': u'0',
u'multicast_router': u'1', u'vlan_filtering': u'0', u'multicast_querier':
u'0', u'forward_delay': u'200'}, u'ports': [u'vnet0', u'virbr0-nic']}},
u'uuid': u'cb4aee34-27aa-064d-aaf1-2c27871125bc', u'onlineCpus':
u'0,1,2,3,4,5,6,7,8,9,10,11', u'nameservers': [u'10.1.1.1'], u'nics':
{u'eth0': {u'ipv6autoconf': False, u'addr': u'', u'speed': 10000,
u'dhcpv6': False, u'ipv6addrs': [], u'mtu': u'1500', u'dhcpv4': False,
u'netmask': u'', u'ipv4defaultroute': False, u'ipv4addrs': [], u'hwaddr':
u'00:15:5d:38:01:10', u'ipv6gateway': u'::', u'gateway': u''}},
u'software_revision': u'1', u'hostdevPassthrough': u'false',
u'clusterLevels': [u'4.1', u'4.2', u'4.3'], u'cpuFlags':
u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ss,ht,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,rep_good,nopl,xtopology,eagerfpu,pni,pclmulqdq,vmx,ssse3,cx16,sse4_1,sse4_2,popcnt,aes,hypervisor,lahf_lm,ibrs,ibpb,stibp,tpr_shadow,vnmi,ept,vpid,spec_ctrl,intel_stibp,arch_capabilities,model_Opteron_G2,model_kvm32,model_coreduo,model_Conroe,model_Nehalem,model_Westmere-IBRS,model_Opteron_G1,model_core2duo,model_Nehalem-IBRS,model_qemu32,model_Penryn,model_pentium2,model_pentium3,model_qemu64,model_Westmere,model_kvm64,model_pentium,model_486',
u'kernelFeatures': {u'RETP': 1, u'IBRS': 0, u'PTI': 1},
u'ISCSIInitiatorName': u'iqn.1994-05.com.redhat:79982989d81e',
u'netConfigDirty': u'True', u'selinux': {u'mode': u'1'},
u'autoNumaBalancing': 0, u'reservedMem': u'321', u'bondings': {},
u'software_version': u'4.30.11', u'supportedENGINEs': [u'4.1', u'4.2',
u'4.3'], u'vncEncrypted': False, u'backupEnabled': False, u'cpuSpeed':
u'3063.656', u'numaNodes': {u'0': {u'totalMemory': u'64248', u'cpus': [0,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]}}, u'cpuSockets': u'1', u'vlans': {},
u'version_name': u'Snow Man', 'lastClientIface': 'ovirtmgmt', u'cpuCores':
u'6', u'hostedEngineDeployed': False, u'hugepages': [1048576, 2048],
u'guestOverhead': u'65', u'additionalFeatures': [u'libgfapi_supported',
u'GLUSTER_SNAPSHOT', u'GLUSTER_GEO_REPLICATION',
u'GLUSTER_BRICK_MANAGEMENT'], u'openstack_binding_host_ids':
{u'OPENSTACK_OVN': u'ovirt-dr-standalone.ipvoicedatasystems.lan',
u'OPEN_VSWITCH': u'ovirt-dr-standalone.ipvoicedatasystems.lan',
u'OVIRT_PROVIDER_OVN': u'a86b72aa-c9d2-488d-b04d-1ccf4bb010e7'},
u'kvmEnabled': u'true', u'memSize': u'64248', u'emulatedMachines':
[u'pc-i440fx-rhel7.1.0', u'pc-q35-rhel7.3.0', u'rhel6.3.0',
u'pc-i440fx-rhel7.5.0', u'pc-i440fx-rhel7.0.0', u'rhel6.1.0',
u'pc-q35-rhel7.6.0', u'pc-i440fx-rhel7.4.0', u'rhel6.6.0',
u'pc-q35-rhel7.5.0', u'rhel6.2.0', u'pc', u'pc-i440fx-rhel7.3.0', u'q35',
u'pc-i440fx-rhel7.2.0', u'rhel6.4.0', u'pc-q35-rhel7.4.0',
u'pc-i440fx-rhel7.6.0', u'rhel6.0.0', u'rhel6.5.0'], u'rngSources':
[u'hwrng', u'random'], u'operatingSystem': {u'release':
u'6.1810.2.el7.centos', u'pretty_name': u'oVirt Node 4.3.2', u'version':
u'7', u'name': u'RHEL'}}} from=::ffff:192.168.122.13,47544,
flow_id=2e7d10f2 (api:54)
2019-05-01 20:22:25,579-0400 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC
call Host.getCapabilities succeeded in 1.53 seconds (__init__:312)
2019-05-01 20:22:25,743-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
call Host.setSafeNetworkConfig succeeded in 0.02 seconds (__init__:312)
where Host.setSafeNetworkConfig means that the engine committed the new
network configuration since everything was fine from its point of view.
And after that we also have:
2019-05-01 20:22:29,748-0400 INFO (jsonrpc/7) [api.host] START
getHardwareInfo() from=::ffff:192.168.122.13,47544 (api:48)
2019-05-01 20:22:29,805-0400 INFO (jsonrpc/7) [api.host] FINISH
getHardwareInfo return={'status': {'message': 'Done', 'code': 0}, 'info':
{'systemProductName': 'Virtual Machine', 'systemUUID':
'cb4aee34-27aa-064d-aaf1-2c27871125bc', 'systemSerialNumber':
'9448-7597-9700-7577-0920-4186-69', 'systemVersion': '7.0',
'systemManufacturer': 'Microsoft Corporation'}}
from=::ffff:192.168.122.13,47544 (api:54)
So I'm pretty sure that the engine VM was correctly able to talk with the
host after the network configuration change, so, in my opinion, we should
focus somewhere else.
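For reference, the same flow can be pulled out of the host's VDSM log with something like the following (just a sketch; the path on oVirt Node is the usual /var/log/vdsm/vdsm.log):

  grep -E 'setupNetworks|confirmConnectivity|setSafeNetworkConfig' /var/log/vdsm/vdsm.log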
If I correctly understood your environment you have:
- A pfsense software firewall/router on 10.1.1.1
- Your host on 10.1.1.61
- You are accessing cockpit from a browser running on a machine on
10.1.1.101 on the same subnet
And the issue is that once the engine created the management bridge, your
client machine on 10.1.1.101 was no longer able to reach your host on
10.1.1.61. Am I right?
In this case the default gateway or other routes shouldn't be an issue since
your client is inside the same subnet.
Do you think we are losing some piece of your network configuration
creating the management bridge, such as a custom MTU or a VLAN id or
something like that?
Do you think pfsense can start blocking/dropping the traffic for any reason?
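A quick way to rule out the MTU/VLAN side would be output like this from the host (a sketch; ip -d shows the detailed link attributes):

  ip -d link show eth0         # MTU and any VLAN config on the physical nic
  ip -d link show ovirtmgmt    # same details as carried over onto the bridge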
Dominik, any hint from you?
Layer 3 looks good, so let's check layer 2: I understood that the oVirt host is a VM. Does the network interface of this VM have some kind of MAC spoofing protection or any other kind of filtering? Are the MAC addresses of all involved interfaces, including the ones from the router, unique?
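From the Linux side, the MAC addresses in play can be listed with something like this (a sketch):

  ip -br link show    # one line per interface with its MAC address
  ip neigh show       # MACs the host currently sees on the segment (router, client, engine VM)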
Regards,
*Todd Barton*
---- On Wed, 01 May 2019 11:29:39 -0400 *Todd Barton
<tcbarton@ipvoicedatasystems.com>* wrote ----
Thanks again...I've done all the detail work back in the 3.x days and I
thought (and was hoping) the node/cockpit setup would make this easier to
get everything lined up for the HE deploy, but it is not working as
expected. I've followed best practices/recommendations, but realize there
are no absolute specifics in these recommendations...there are a lot of
either/or statements...which is why I was asking for recommendations. I've
reviewed many articles, including the "up and running" one at
https://ovirt.org/blog/2018/02/up-and-running-with-ovirt-4-2-and-gluster-sto...,
and in everything I've looked at there isn't anything new or different vs what
I've already done or attempted.
I was very methodical in my initial attempts, just as I was in my initial
install of v3.3 years ago, which took many attempts and methodical
configuration to get it up and set up the way I wanted. What I'm trying to
understand is why it's not coming up in a lab setting with what I would
consider to be a pretty remedial setup.
I'll get back to a basic setup and run through the process again today or
tomorrow and post logs of the failure.
Regards,
*Todd Barton*
---- On Wed, 01 May 2019 01:50:49 -0400 *Yedidyah Bar David
<didi@redhat.com>* wrote ----
On Tue, Apr 30, 2019 at 4:09 PM Todd Barton
<tcbarton@ipvoicedatasystems.com> wrote:
Thanks a bunch for the reply Didi and Simone. I will admit this last
setup was a bit of a wild attempt to see if I could get it working somehow, so
maybe it wasn't the best example to submit...and yeah, those should have been /24
subnets. Initially I tried the single nic setup, but the outcome seemed to
be the same scenario.
Honestly, I've run through this setup so many times in the last week it's
all a blur. I started messing with multiple nics in my latest attempts to see if
this was something specific I should do in a cockpit setup, as one of the
articles I read suggested multiple interfaces to separate traffic.
My "production" 4.0 environment (currently a failed upgrade with a down
host that I can't seem to get back online) is 3 host gluster on 4 bonded
1Gbps links. With the exception of the upgrade issue/failure, it has been
rock-solid with good performance and I've only restarted hosts on upgrades
in 4+ years. There are a few networking changes i would like to make in a
rebuild, but I wanted to test various options before implementing. Getting
a single nic environment was the initial goal to get started.
I'm doing this testing in a virtualized setup with pfsense as the
firewall/router and I can set up hosts/nics however I want. I will start
over again with a more straightforward setup and get more data on the failure.
Considering I can set up the environment how I want, what would be your
recommended config for a single nic (or single bond) setup using cockpit?
Static IPs with host file resolution, DHCP with MAC-specific IPs, etc.?
Many of these decisions are a matter of personal preference,
acquaintance with the relevant technologies and tooling you have
around them, local needs/policies/mandates, existing infrastructure,
etc.
If you search the net, e.g. for "ovirt best practices" or "RHV best
practices", you can find various articles etc. that can provide some
good guidelines/ideas.
I suggest to read around a bit, then spend some good time on planning,
then carefully and systematically implement your design, verifying
each step right after doing it. When you run into problems, tell us
:-). Ideally, IMO, you should not give up on your design due to such
problems and try workarounds, inferior (in your eyes) solutions, etc.,
unless you manage to find existing open bugs that describe your
problem and you decide you can't wait until they are solved. Instead,
try to fix problems, perhaps with the list members' help.
I realize spending a week on what is in your perception a simple,
straightforward task, does not leave you in the best mood for such a
methodical next attempt. Perhaps first take a break and do something
else :-), then start from a clean and fresh hardware/software
environment and mind.
Good luck and best regards,
Thank you,
Todd Barton
---- On Tue, 30 Apr 2019 05:20:04 -0400 Simone Tiraboschi
<stirabos@redhat.com> wrote ----
On Tue, Apr 30, 2019 at 9:50 AM Yedidyah Bar David <didi@redhat.com>
wrote:
On Tue, Apr 30, 2019 at 5:09 AM Todd Barton
<tcbarton@ipvoicedatasystems.com> wrote:
I'm having to rebuild an environment that started back in the early
3.x days. A lot has changed and I'm attempting to use the oVirt Node based
setup to build a new environment, but I can't get through the hosted engine
deployment process via the cockpit (I've done command line as well). I've
tried static DHCP addresses as well as static IPs, and confirmed I have
resolvable host-names. This is a test environment so I can work through any
issues in deployment.
When the cockpit is displaying the waiting for host to come up task,
the cockpit gets disconnected. It appears to happen when the bridge
network is set up. At that point, the deployment is messed up and I can't
return to the cockpit. I've tried this with one or two nics/interfaces and
tried every permutation of static and dynamic IP addresses. I've spent a
week trying different setups and I've got to be doing something stupid.
Attached is a screen capture of the resulting IP info after my latest
failed try. I used two nics, one for the gluster and bridge network and
the other for the oVirt cockpit access. I can't access cockpit on either IP
address after the failure.
I've attempted this setup as both a single host hyper-converged setup
and a three host hyper-converged environment...same issue in both.
Can someone please help me or give me some thoughts on what is wrong?
There are two parts here: 1. Fix it so that you can continue (and so
that if it happens to you on production, you know what to do) 2. Fix
the code so that it does not happen again. They are not necessarily
identical (or even very similar).
At the point in time of taking the screen capture:
1. Did the ovirtmgmt bridge get the IP address of the intended nic?
Which one?
2. Did you check routing? Default gateway, or perhaps you had/have
specific other routes?
3. What nics are in the bridge? Can you check/share output of 'brctl
show'?
4. Probably not related, just noting: You have there (currently on
eth0 and on ovirtmgmt, perhaps you tried other combinations):
10.1.2.61/16 and 10.1.1.61/16 . It seems like you wanted two different
subnets, but are actually using a single one. Perhaps you intended to
use 10.1.2.61/24 and 10.1.1.61/24.
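For 1-3 above, output along these lines from the host would capture what I'm asking about (a sketch; 'bridge link' works as well if brctl is not installed):

  ip addr show ovirtmgmt    # which IP address ended up on the bridge
  ip route show             # default gateway and any other routes
  brctl show                # which nics are enslaved to which bridge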
Good catch: the issue comes exactly from here!
Please see:
The issue happens when the user has two interfaces configured on the
same IP subnet, the default gateway is configured to be reached from one of
the two interfaces and the user chooses to create the management bridge on
the other one.
When the engine, adding the host, creates the management bridge it also
tries to configure the default gateway on the bridge, and for some reason
this disrupts the external connectivity on the host and the user is going
to lose it.
If you intend to use one interface for gluster and the other for the
management network, I'd strongly suggest using two distinct subnets, with
the default gateway on the subnet you are going to use for the management
network.
If you want to use two interfaces for reliability reasons, I'd strongly
suggest creating a bond of the two instead.
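Concretely, just as an illustrative sketch (the addresses here are only examples), something like this is what I mean:

  # management network: eth0 -> ovirtmgmt bridge, 10.1.1.61/24, default gw 10.1.1.1
  # gluster network:    eth1, 10.1.2.61/24, no gateway configured
  ip route show default     # after deployment this should point at 10.1.1.1 via ovirtmgmt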
Please also notice that deploying a three-host hyper-converged
environment over a single 1 Gbps interface will be really penalizing in
terms of storage performance.
Each write has to land on the host itself and on the two remote
ones, so you are going to have 1000 Mbps / 2 (external replicas) / 8
(bits/byte) = a max of 62.5 MB/s sustained throughput shared between all
the VMs, and this ignores all the overheads.
In practice it will be much less, ending in a barely usable environment.
I'd strongly suggest moving to a 10 Gbps environment if possible, or
bonding a few 1 Gbps nics for gluster.
5. Can you ping from/to these two addresses from/to some other machine
on the network? Your laptop? The storage?
6. If possible, please check/share relevant logs, including (from the
host) /var/log/vdsm/* and /var/log/ovirt-hosted-engine-setup/*.
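If it is easier, everything can be bundled in one command, something like (a sketch):

  tar czf /tmp/he-deploy-logs.tar.gz /var/log/vdsm /var/log/ovirt-hosted-engine-setup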
Thanks and best regards,
--
Didi