Hi Simone,

Yes, up until that point there were no errors output by the hosted-engine --deploy command.

Ok, many thanks for looking into this for me. Would you mind sending me a link to the ticket you create in bugzilla so I can keep an eye on it?

Thanks again,

Ben

On 05/10/2018 15:43, Simone Tiraboschi wrote:


On Fri, Oct 5, 2018 at 4:35 PM Simone Tiraboschi <stirabos@redhat.com> wrote:


On Fri, Oct 5, 2018 at 4:23 PM Ben Webber <ben.webber@egsgroup.com> wrote:

Hi Simone,

Attached are the logs you requested. In supervdsm.log I can see the same "Unknown nics" error that appears in the engine log, at the same timestamp:


Up to here it was correct, right?

MainProcess|jsonrpc/0::INFO::2018-10-04 21:51:30,291::netconfpersistence::68::root::(setBonding) Adding bond0({'nics': ['bond1', 'bond2'], 'switch': 'legacy', 'options': 'miimon=100 mode=1'})
MainProcess|jsonrpc/0::INFO::2018-10-04 21:51:30,291::netconfpersistence::68::root::(setBonding) Adding bond1({'nics': ['p2p1', 'p2p2', 'p2p3', 'p2p4'], 'switch': 'legacy', 'options': 'miimon=100 mode=4'})
MainProcess|jsonrpc/0::INFO::2018-10-04 21:51:30,291::netconfpersistence::68::root::(setBonding) Adding bond2({'nics': ['p3p1', 'p3p2', 'p3p3', 'p3p4'], 'switch': 'legacy', 'options': 'miimon=100 mode=4'})

Ok, looking at vdsm code I fear you hit a bug:

I'll file a ticket for it in Bugzilla, thanks for reporting it.
Please note that this piece of code is already different on the master branch:
 
 

MainProcess|jsonrpc/0::ERROR::2018-10-04 21:51:30,428::supervdsm_server::100::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 210, in setupNetworks
    validator.validate(networks, bondings)
  File "/usr/lib/python2.7/site-packages/vdsm/network/validator.py", line 29, in validate
    netswitch.configurator.validate(networks, bondings)
  File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 127, in validate
    legacy_switch.validate_network_setup(legacy_nets, legacy_bonds)
  File "/usr/lib/python2.7/site-packages/vdsm/network/legacy_switch.py", line 598, in validate_network_setup
    'Unknown nics in: %r' % list(nics))
ConfigNetworkError: (23, "Unknown nics in: ['bond1', 'bond2']")
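
For anyone following the thread, here is a minimal Python sketch of the kind of check that would produce this error for a stacked bond. It is not the actual vdsm code: validate_bond_slaves, the stand-in ConfigNetworkError class and the ERR_BAD_NIC name are hypothetical, and the list of physical NICs is simply assumed from the bond1 and bond2 definitions in the log lines above.

```python
# Hypothetical sketch; NOT the actual vdsm implementation.

ERR_BAD_NIC = 23  # error code as it appears in the traceback (name is an assumption)


class ConfigNetworkError(Exception):
    """Stand-in for the exception type named in the traceback."""

    def __init__(self, code, message):
        super().__init__(code, message)
        self.code = code
        self.message = message


def validate_bond_slaves(bondings, physical_nics):
    """Raise if any bond lists a slave that is not a known physical NIC."""
    known = set(physical_nics)
    for _name, attrs in sorted(bondings.items()):
        unknown = sorted(set(attrs['nics']) - known)
        if unknown:
            raise ConfigNetworkError(
                ERR_BAD_NIC, 'Unknown nics in: %r' % unknown)


# Bond definitions copied from the setBonding log lines in this thread.
bondings = {
    'bond0': {'nics': ['bond1', 'bond2'], 'options': 'miimon=100 mode=1'},
    'bond1': {'nics': ['p2p1', 'p2p2', 'p2p3', 'p2p4'],
              'options': 'miimon=100 mode=4'},
    'bond2': {'nics': ['p3p1', 'p3p2', 'p3p3', 'p3p4'],
              'options': 'miimon=100 mode=4'},
}

# Assumed: only the p2p*/p3p* ports count as physical NICs.
physical_nics = ['p2p1', 'p2p2', 'p2p3', 'p2p4',
                 'p3p1', 'p3p2', 'p3p3', 'p3p4']


def check(bondings, physical_nics):
    """Return the error message, or None if every slave is a physical NIC."""
    try:
        validate_bond_slaves(bondings, physical_nics)
    except ConfigNetworkError as exc:
        return exc.message
    return None
```

With these inputs, check(bondings, physical_nics) reports bond1 and bond2 as unknown, because a check of this shape only accepts physical devices as bond slaves and bond0 is built on top of two other bonds.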

Thanks,

Ben


On 05/10/2018 14:55, Simone Tiraboschi wrote:


On Fri, Oct 5, 2018 at 3:33 PM Ben Webber <ben.webber@egsgroup.com> wrote:
Hi Miguel,

Thanks for getting back to me so quickly! The pastebin is here:

https://pastebin.com/xNJWiymw

Yes, bond1 and bond2 are 802.3ad bonds, and bond0 is an active-backup bond of bond1 and bond2.

OK, according to the provided output, bond0.101, bond0.201 and bond0.202 were fine.
 

Thanks

Ben

On 05/10/2018 14:09, Miguel Duarte de Mora Barroso wrote:
> On Thu, Oct 4, 2018 at 11:49 PM, Ben Webber <ben.webber@egsgroup.com> wrote:
>> Hi,
>>
>> I'm trying to set up oVirt using the hosted-engine --deploy command on CentOS 7, but am encountering an error. I am running a slightly unusual network configuration. I have two fairly basic, non-stacked gigabit switches, with port channels connecting the two switches together. From the host I have a 4-port LACP bond to each switch (bond1 and bond2). I have then created an active-backup bond (bond0) using the two LACP bonds as slaves, in the hope of getting HA at the switch layer with my basic switches. There is then a VLAN (101) on bond0.
>>
>> This network configuration runs fine on the host. However, a short while into the run, the hosted-engine --deploy command outputs the following error:
>>
>> ...
>>
>> [ INFO  ] TASK [Force host-deploy in offline mode]
>> [ INFO  ] ok: [localhost]
>> [ INFO  ] TASK [Add host]
>> [ INFO  ] changed: [localhost]
>> [ INFO  ] TASK [Wait for the host to be up]
>> [ INFO  ] ok: [localhost]
>> [ INFO  ] TASK [Check host status]
>> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
>>
>> ...
>>
>>
>> Looking in /var/log/ovirt-engine/engine.log on the machine created, I can see the following errors logged:
>>
>> ...
>>
>> 2018-10-04 21:51:30,116+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [59fb360a] START, HostSetupNetworksVDSCommand(HostName = ov1.test.local, HostSetupNetworksVdsCommandParameters:{hostId='7440c9b9-e530-4341-a317-d3a9041dc777', vds='Host[ov1.test.local,7440c9b9-e530-4341-a317-d3a9041dc777]', rollbackOnFailure='true', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='true', bonding='true', networkName='ovirtmgmt', vdsmName='ovirtmgmt', nicName='bond0', vlan='101', vmNetwork='true', stp='false', properties='null', ipv4BootProtocol='STATIC_IP', ipv4Address='192.168.1.11', ipv4Netmask='255.255.255.0', ipv4Gateway='192.168.1.1', ipv6BootProtocol='AUTOCONF', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null', nameServers='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'}), log id: 4f0c7eaa
>> 2018-10-04 21:51:30,121+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [59fb360a] FINISH, HostSetupNetworksVDSCommand, log id: 4f0c7eaa
>> 2018-10-04 21:51:30,645+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [59fb360a] Failed in 'HostSetupNetworksVDS' method
>> 2018-10-04 21:51:30,687+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [59fb360a] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ov1.test.local command HostSetupNetworksVDS failed: Unknown nics in: ['bond1', 'bond2']
>> 2018-10-04 21:51:30,688+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [59fb360a] Error: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = Unknown nics in: ['bond1', 'bond2'], code = 23

I fear that the real issue comes from here.
Can you please attach the vdsm and supervdsm logs for the relevant timeframe (both of them are in /var/log/vdsm/ on your host)?
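
If it helps to trim the logs to the relevant timeframe, here is a rough Python sketch. It assumes the timestamp layout seen in the supervdsm.log lines quoted in this thread ("::YYYY-MM-DD HH:MM:SS,mmm::"). The function name lines_in_window and the 21:50 to 21:53 window are only illustrative assumptions.

```python
import re
from datetime import datetime

# Timestamp as it appears in the supervdsm.log lines quoted in this thread,
# e.g. "...::INFO::2018-10-04 21:51:30,291::netconfpersistence::...".
TS_RE = re.compile(r'::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}::')


def lines_in_window(lines, start, end):
    """Yield lines stamped within [start, end].

    Lines without a timestamp (such as traceback continuation lines)
    follow the decision made for the last timestamped line before them.
    """
    keep = False
    for line in lines:
        match = TS_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), '%Y-%m-%d %H:%M:%S')
            keep = start <= stamp <= end
        if keep:
            yield line


if __name__ == '__main__':
    # Illustrative window around the failure seen in the engine log.
    start = datetime(2018, 10, 4, 21, 50, 0)
    end = datetime(2018, 10, 4, 21, 53, 0)
    with open('/var/log/vdsm/supervdsm.log') as log:
        for line in lines_in_window(log, start, end):
            print(line, end='')
```

The same function should work for vdsm.log as long as its lines carry the timestamp in the same position. I have not checked that here.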
 
>>
>> ...
>>
>>
>> It looks like when HostSetupNetworksVDS runs, it checks that the slave interfaces of each bond are physical network devices. Since the slaves of bond0 are bond1 and bond2 rather than physical devices, it throws the error Unknown nics in: ['bond1', 'bond2'].
>>
>> Is there anything I can do, or any configuration I can add somewhere, to make it work with this "stacked bond" configuration, or does oVirt just not work when bonds are set up like this?
> Forwarding to Simone, who is an ovirt-hosted-engine-setup expert.
>
> Please get us a pastebin with the output of 'ansible-playbook -vvv -i
> localhost, /usr/share/ovirt-hosted-engine-setup/ansible/get_network_interfaces.yml'
> on your engine node.
>
> One thing I want to make sure of: please confirm that your bond1 and
> bond2 configurations are IEEE 802.3ad bonds.
>
>> Thanks in advance,
>>
>> Ben
>>
>>
>>
>>
>> _______________________________________________
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-leave@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XPHQTPUINKZBSZVUDP2G66UPA5OJL3J7/
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-leave@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/RG2SGSFO6XILGKPZH4RLGGEK66NDHPWF/