[ovirt-users] bonding problems

Tim Macy macytd at gmail.com
Thu Feb 12 17:07:10 UTC 2015


Jorick, I ran into similar problems when setting up a few different test
clusters using bonds.  I found that vdsm was not setting the bonds to
persist.  These are the steps that worked for me to get the host back
online and fix it permanently:
1. Put the host in Maintenance Mode.
2. Manually restore the network configuration to what it was before the
host joined the cluster. Tar up the configuration files in case you need
them again.
3. At this point you can try running Re-install on the host from the
engine. Sometimes this worked; other times it did not and I needed the
additional steps below.
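
For the "tar up the configuration files" part of step 2, a minimal sketch.
It assumes an EL-style host where the interface files live in
/etc/sysconfig/network-scripts as ifcfg-*; that location and the helper
name backup_netcfg are my assumptions, not something vdsm provides, so
adjust them to your layout.

```shell
#!/bin/sh
# Sketch: archive the host's interface config files before changing them.
# Assumption: files are named ifcfg-* under /etc/sysconfig/network-scripts.
# backup_netcfg is a hypothetical helper, not part of vdsm or oVirt.
backup_netcfg() {
    src_dir=${1:-/etc/sysconfig/network-scripts}
    # The archive path must be absolute, because tar runs from inside src_dir.
    out=${2:-/root/netcfg-backup-$(date +%Y%m%d-%H%M%S).tar.gz}
    # Run in a subshell so the caller's working directory is untouched;
    # print the archive path only if tar succeeded.
    ( cd "$src_dir" && tar -czf "$out" ifcfg-* ) && echo "$out"
}
```

Running backup_netcfg with no arguments as root writes a timestamped
archive under /root and prints its path; tar -tzf on that path lists what
was saved.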

If it fails to join:
1. Put the host in Maintenance Mode.
2. Manually restore the network configuration to what it was before the
host joined the cluster.
3. Remove vdsm from the host:  yum remove 'vdsm*'
4. Delete the following vdsm configs:  rm -Rf /etc/vdsm ; rm -Rf /var/lib/vdsm
5. Run Re-install on the host from the engine.

Making sure the network is set up to persist:
* Any networks that will persist are listed under
/var/lib/vdsm/persistence/netconfs (in the bonds and nets directories).  If
they are missing, do the following:
1.  In the engine, go to DC/Cluster/Hosts, select each host, then in the
lower pane select Network Interfaces and run Setup Host Networks.
2.  Verify that they were added to /var/lib/vdsm/persistence/netconfs/bonds
or /var/lib/vdsm/persistence/netconfs/nets.
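
A quick way to do the check in step 2 is sketched here. It assumes each
persisted bond or network shows up as one entry in the bonds or nets
directory; the helper name list_persisted is made up for this example.

```shell
#!/bin/sh
# Sketch: list the entries vdsm has persisted under netconfs/bonds and
# netconfs/nets. list_persisted is a hypothetical helper, not a vdsm tool.
list_persisted() {
    base=${1:-/var/lib/vdsm/persistence/netconfs}
    for kind in bonds nets; do
        if [ -d "$base/$kind" ]; then
            for f in "$base/$kind"/*; do
                # An empty directory leaves the glob unexpanded; skip it.
                [ -e "$f" ] && echo "$kind/$(basename "$f")"
            done
        else
            # Report a missing directory on stderr so stdout stays a clean list.
            echo "missing: $base/$kind" >&2
        fi
    done
    return 0
}
```

If the output does not include the bonds and networks you configured (for
example bonds/bond1), run Setup Host Networks again from the engine and
re-check.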

Hope this helps.

Tim Macy

On Thu, Feb 12, 2015 at 10:29 AM, Martin Pavlík <mpavlik at redhat.com> wrote:

> Hi Jorick,
>
> if I understand correctly you had ovirtmgmt over bond and now you’ve tried
> to move it to single interface? You mention that everything is gone after
> restarting. Restarting of what? What system is running on your hypervisors?
> Can you provide supervdsm.log from the affected node?
>
> Martin Pavlik
>
> RHEV QE
> > On 10 Feb 2015, at 17:58, Jorick Astrego <j.astrego at netbulae.eu> wrote:
> >
> > After having problems with a bond and the ovirtmgmt interface on 3.5.1,
> I skipped bond0 and just added bond1 for gluster and bond2 for internet.
> >
> > When restarting, I lose all the bonds and I cannot use the "Setup Host
> networks" anymore:
> >
> > <ehehbhcg.png>
> >
> > In the engine log, I see the following:
> >
> >
> > 2015-02-10 17:29:41,799 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> Connecting to test4.netbulae.test/xx.xx.xx.xx
> > 2015-02-10 17:29:42,414 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand]
> (ajp--127.0.0.1-8702-4) [4be2b53a] Failed in SetupNetworksVDS method
> > 2015-02-10 17:29:42,414 WARN
> [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker)
> Exception thrown during message processing
> > 2015-02-10 17:29:42,414 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand]
> (ajp--127.0.0.1-8702-4) [4be2b53a]
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error =
> Missing required nics for bonding device., code = 21
> > 2015-02-10 17:29:42,416 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand]
> (ajp--127.0.0.1-8702-4) [4be2b53a] Command SetupNetworksVDSCommand(HostName
> = test4.netbulae.test, HostId = c4e1f226-a243-4a9a-8fa9-fc65f4518114,
> force=false, checkConnectivity=true, conectivityTimeout=120,
> >     networks=[gluster {id=123ff683-fcf8-407b-94ca-f323c1fba326,
> description=null, comment=null, subnet=null, gateway=null, type=null,
> vlanId=null, stp=false, dataCenterId=00000002-0002-0002-0002-00000000031a,
> mtu=9000, vmNetwork=false, cluster=NetworkCluster {id={clusterId=null,
> networkId=null}, status=OPERATIONAL, display=false, required=true,
> migration=true}, providedBy=null, label=null, qosId=null},
> >         uplink {id=51a63339-5570-4b58-b00c-9d54a534ff20,
> description=null, comment=null, subnet=null, gateway=null, type=null,
> vlanId=null, stp=false, dataCenterId=00000002-0002-0002-0002-00000000031a,
> mtu=0, vmNetwork=true, cluster=NetworkCluster {id={clusterId=null,
> networkId=null}, status=OPERATIONAL, display=false, required=true,
> migration=false}, providedBy=null, label=null, qosId=null}],
> >     bonds=[bond1 {id=64135fde-a5de-404c-89a3-2ce13a1a7d84,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=bond1,
> macAddress=**:**:**:**:**:**, networkName=gluster, bondOptions=mode=4
> miimon=100, bootProtocol=STATIC_IP, address=, subnet=, gateway=, mtu=9000,
> bridged=false, type=0, networkImplementationDetails={inSync=true,
> managed=true}},
> >         bond2 {id=f2f26c27-85f3-413a-abb0-79db4b323c0e,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=bond2,
> macAddress=**:**:**:**:**:**, networkName=null, bondOptions=mode=4
> miimon=100, bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500,
> bridged=false, type=0, networkImplementationDetails=null}],
> >     interfaces=[enp2s0 {id=542b77c3-a412-4435-87f1-182663317893,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp2s0,
> macAddress=**:**:**:**:**:**, networkName=null, bondName=null,
> bootProtocol=DHCP, address=, subnet=, gateway=null, mtu=1500,
> bridged=false, speed=0, type=0, networkImplementationDetails=null},
> >         enp9s0f1 {id=f5cc13ec-ac66-446c-b2b4-c6561c10a62b,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp9s0f1,
> macAddress=**:**:**:**:**:**, networkName=null, bondName=bond2,
> bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500,
> bridged=false, speed=0, type=0, networkImplementationDetails=null},
> >         enp9s0f0 {id=6bf7215a-ca7e-4634-8225-5762f5bd3620,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp9s0f0,
> macAddress=**:**:**:**:**:**, networkName=null, bondName=bond2,
> bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500,
> bridged=false, speed=0, type=0, networkImplementationDetails=null},
> >         enp1s0 {id=e5319631-6c7e-4be2-8914-a24df70f26af,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp1s0,
> macAddress=**:**:**:**:**:**, networkName=ovirtmgmt, bondName=null,
> bootProtocol=DHCP, address=xx.xx.xx.xx, subnet=255.255.255.0,
> gateway=*.*.*.*, mtu=1500, bridged=true, speed=1000, type=2,
> networkImplementationDetails={inSync=true, managed=true}},
> >         enp7s0f1 {id=665c7d78-0ce3-46d9-9458-6828eecf23e7,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp7s0f1,
> macAddress=**:**:**:**:**:**, networkName=null, bondName=bond1,
> bootProtocol=NONE, address=, subnet=, gateway=null, mtu=9000,
> bridged=false, speed=0, type=0, networkImplementationDetails=null},
> >         enp7s0f0 {id=3bb50de2-590d-451e-841f-805ebba1d1b6,
> vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp7s0f0,
> macAddress=**:**:**:**:**:**, networkName=null, bondName=bond1,
> bootProtocol=NONE, address=, subnet=, gateway=null, mtu=9000,
> bridged=false, speed=0, type=0, networkImplementationDetails=null},
> >         bond1 {id=null, vdsId=null, name=bond1, macAddress=null,
> networkName=gluster, bondOptions=mode=4 miimon=100, bootProtocol=STATIC_IP,
> address=xx.xx.xx.xx, subnet=255.255.255.0, gateway=null, mtu=9000,
> bridged=false, type=0, networkImplementationDetails=null},
> >         bond2 {id=null, vdsId=null, name=bond2, macAddress=null,
> networkName=uplink, bondOptions=mode=4 miimon=100, bootProtocol=NONE,
> address=null, subnet=null, gateway=null, mtu=0, bridged=true, type=0,
> networkImplementationDetails=null}],
> >     removedNetworks=[],
> >     removedBonds=[]) execution failed. Exception: VDSErrorException:
> VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error =
> Missing required nics for bonding device., code = 21
> > 2015-02-10 17:29:42,425 ERROR
> [org.ovirt.engine.core.bll.network.host.SetupNetworksCommand]
> (ajp--127.0.0.1-8702-4) [4be2b53a] Command
> org.ovirt.engine.core.bll.network.host.SetupNetworksCommand throw Vdc Bll
> exception. With error message VdcBLLException:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error =
> Missing required nics for bonding device., code = 21 (Failed with error
> ERR_BAD_PARAMS and code 21)
> > 2015-02-10 17:30:00,053 INFO
> [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (DefaultQuartzScheduler_Worker-32) Autorecovering 2 hosts
> > 2015-02-10 17:30:00,053 INFO
> [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (DefaultQuartzScheduler_Worker-32) Autorecovering hosts id:
> c4e1f226-a243-4a9a-8fa9-fc65f4518114, name : test4.netbulae.test
> > 2015-02-10 17:30:00,075 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (DefaultQuartzScheduler_Worker-32) [655e85c2] Lock Acquired to object
> EngineLock [exclusiveLocks= key: c4e1f226-a243-4a9a-8fa9-fc65f4518114
> value: VDS
> > , sharedLocks= ]
> >
> > When I check the host I see that bond0 has somehow appeared and all the
> IPs have been lost.  But I never configured bond0!!!
> >
> >
> > What's up with the bonding?
> >
> >
> >
> >
> >
> >
> >
> > Met vriendelijke groet, With kind regards,
> >
> > Jorick Astrego
> >
> > Netbulae Virtualization Experts
> > Tel: 053 20 30 270    info at netbulae.eu        Staalsteden 4-3A
> KvK 08198180
> > Fax: 053 20 30 271    www.netbulae.eu 7547 TA Enschede        BTW
> NL821234584B01
> >
> >
> > _______________________________________________
> > Users mailing list
> > Users at ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users