[ovirt-users] bonding problems

Rik Theys Rik.Theys at esat.kuleuven.be
Thu Feb 12 17:29:25 UTC 2015


 

Hi, 

I had a similar problem with an upgrade from 3.4 to 3.5. My workaround
was: 

1. Put the host in maintenance mode 

2. Stop the vdsmd and supervdsmd services (a rough command sketch for steps 2-4 follows this list) 

3. Manually restore the ifcfg network config files 

4. Start the vdsmd service (which also starts supervdsmd); with the config files restored, this now succeeds. 

5. In the engine web interface, go to the "Setup Networks" tab for this
host and click the pencil icon on each bond and network. Make a change,
change it back, and save it to the host. The host will become
unreachable for a few seconds. 

6. Afterwards the files should have been generated in
/var/lib/vdsm/persistence/netconf 

It seems that every time the system boots now, vdsmd erases and recreates
the ifcfg files. This might be the expected behaviour now, but I don't
really see the point: on boot the network comes up, and once it reaches
the vdsmd service it goes down and comes back up again. 

Regards, 

Rik 

On 2015-02-12 18:07, Tim Macy wrote: 

> Jorick, I experienced similar problems when setting up a few different test clusters using bonds. I found that vdsm was not persisting the bond configuration. These are the steps that worked for me to get the host back online and fix it permanently: 
> 1. Put the host in Maintenance Mode. 
> 2. Manually restore the network configuration to your original setup from before joining the cluster. Tar up the configuration files in case you need them again. 
> 3. At this point you can attempt to re-install the host from the engine. Sometimes this worked; other times it did not, and I needed the additional steps below. 
> 
> If it fails to join: 
> 
> 1. Put the host in Maintenance Mode. 
> 2. Manually restore the network configuration to your original setup from before joining the cluster. 
> 3. Remove vdsm from the host: yum remove vdsm* 
> 4. Delete the vdsm configuration directories: rm -Rf /etc/vdsm ; rm -Rf /var/lib/vdsm (steps 3-4 are sketched after this list) 
> 5. Run re-install on the host from the engine 
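> 
> A rough sketch of steps 3-4 as shell commands, assuming you back up the
> network scripts first (the tar filename is just an example):
> 
>     tar czf /root/ifcfg-backup.tgz /etc/sysconfig/network-scripts/ifcfg-*
>     yum remove "vdsm*"
>     rm -Rf /etc/vdsm /var/lib/vdsm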
> 
> Making sure the network is set up to persist: 
> * Any networks that persist will be listed in /var/lib/vdsm/persistence/netconf (in the bonds and nets directories). If they are missing, do the following: 
> 1. From the engine, under DC/Cluster/Hosts, select each host, choose Network Interfaces from the menu below, and run Setup Host Networks. 
> 2. Verify they were added to /var/lib/vdsm/persistence/netconf/bonds or nets, as in the check sketched below. 
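> 
> For example, to verify (the netconf path is as on my 3.5 hosts; adjust
> if your version differs):
> 
>     ls /var/lib/vdsm/persistence/netconf/bonds
>     ls /var/lib/vdsm/persistence/netconf/nets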
> 
> Hope this helps. 
> 
> Tim Macy 
> 
> On Thu, Feb 12, 2015 at 10:29 AM, Martin Pavlík <mpavlik at redhat.com> wrote:
> 
>> Hi Jorick,
>> 
>> if I understand correctly, you had ovirtmgmt over a bond and have now tried to move it to a single interface? You mention that everything is gone after restarting. Restarting what? Which OS is running on your hypervisors? Can you provide supervdsm.log from the affected node?
>> 
>> Martin Pavlik
>> 
>> RHEV QE
>>> On 10 Feb 2015, at 17:58, Jorick Astrego <j.astrego at netbulae.eu> wrote:
>>> 
>>> After having problems with a bond and the ovirtmgmt interface on 3.5.1, I skipped bond0 and just added bond1 for gluster and bond2 for internet.
>>> 
>>> When restarting, I lose all the bonds and I cannot use "Setup Host Networks" anymore:
>>> 
>>> [image attachment <ehehbhcg.png> scrubbed]
>>> 
>>> In the engine log, I see the following:
>>> 
>>> 
>>> 2015-02-10 17:29:41,799 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) Connecting to test4.netbulae.test/xx.xx.xx.xx
>>> 2015-02-10 17:29:42,414 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-4) [4be2b53a] Failed in SetupNetworksVDS method
>>> 2015-02-10 17:29:42,414 WARN [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) Exception thrown during message processing
>>> 2015-02-10 17:29:42,414 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-4) [4be2b53a] org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error = Missing required nics for bonding device., code = 21
>>> 2015-02-10 17:29:42,416 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-4) [4be2b53a] Command SetupNetworksVDSCommand(HostName = test4.netbulae.test, HostId = c4e1f226-a243-4a9a-8fa9-fc65f4518114, force=false, checkConnectivity=true, conectivityTimeout=120,
>>> networks=[gluster {id=123ff683-fcf8-407b-94ca-f323c1fba326, description=null, comment=null, subnet=null, gateway=null, type=null, vlanId=null, stp=false, dataCenterId=00000002-0002-0002-0002-00000000031a, mtu=9000, vmNetwork=false, cluster=NetworkCluster {id={clusterId=null, networkId=null}, status=OPERATIONAL, display=false, required=true, migration=true}, providedBy=null, label=null, qosId=null},
>>> uplink {id=51a63339-5570-4b58-b00c-9d54a534ff20, description=null, comment=null, subnet=null, gateway=null, type=null, vlanId=null, stp=false, dataCenterId=00000002-0002-0002-0002-00000000031a, mtu=0, vmNetwork=true, cluster=NetworkCluster {id={clusterId=null, networkId=null}, status=OPERATIONAL, display=false, required=true, migration=false}, providedBy=null, label=null, qosId=null}],
>>> bonds=[bond1 {id=64135fde-a5de-404c-89a3-2ce13a1a7d84, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=bond1, macAddress=**:**:**:**:**:**, networkName=gluster, bondOptions=mode=4 miimon=100, bootProtocol=STATIC_IP, address=, subnet=, gateway=, mtu=9000, bridged=false, type=0, networkImplementationDetails={inSync=true, managed=true}},
>>> bond2 {id=f2f26c27-85f3-413a-abb0-79db4b323c0e, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=bond2, macAddress=**:**:**:**:**:**, networkName=null, bondOptions=mode=4 miimon=100, bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500, bridged=false, type=0, networkImplementationDetails=null}],
>>> interfaces=[enp2s0 {id=542b77c3-a412-4435-87f1-182663317893, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp2s0, macAddress=**:**:**:**:**:**, networkName=null, bondName=null, bootProtocol=DHCP, address=, subnet=, gateway=null, mtu=1500, bridged=false, speed=0, type=0, networkImplementationDetails=null},
>>> enp9s0f1 {id=f5cc13ec-ac66-446c-b2b4-c6561c10a62b, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp9s0f1, macAddress=**:**:**:**:**:**, networkName=null, bondName=bond2, bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500, bridged=false, speed=0, type=0, networkImplementationDetails=null},
>>> enp9s0f0 {id=6bf7215a-ca7e-4634-8225-5762f5bd3620, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp9s0f0, macAddress=**:**:**:**:**:**, networkName=null, bondName=bond2, bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500, bridged=false, speed=0, type=0, networkImplementationDetails=null},
>>> enp1s0 {id=e5319631-6c7e-4be2-8914-a24df70f26af, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp1s0, macAddress=**:**:**:**:**:**, networkName=ovirtmgmt, bondName=null, bootProtocol=DHCP, address=xx.xx.xx.xx, subnet=255.255.255.0, gateway=*.*.*.*, mtu=1500, bridged=true, speed=1000, type=2, networkImplementationDetails={inSync=true, managed=true}},
>>> enp7s0f1 {id=665c7d78-0ce3-46d9-9458-6828eecf23e7, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp7s0f1, macAddress=**:**:**:**:**:**, networkName=null, bondName=bond1, bootProtocol=NONE, address=, subnet=, gateway=null, mtu=9000, bridged=false, speed=0, type=0, networkImplementationDetails=null},
>>> enp7s0f0 {id=3bb50de2-590d-451e-841f-805ebba1d1b6, vdsId=c4e1f226-a243-4a9a-8fa9-fc65f4518114, name=enp7s0f0, macAddress=**:**:**:**:**:**, networkName=null, bondName=bond1, bootProtocol=NONE, address=, subnet=, gateway=null, mtu=9000, bridged=false, speed=0, type=0, networkImplementationDetails=null},
>>> bond1 {id=null, vdsId=null, name=bond1, macAddress=null, networkName=gluster, bondOptions=mode=4 miimon=100, bootProtocol=STATIC_IP, address=xx.xx.xx.xx, subnet=255.255.255.0, gateway=null, mtu=9000, bridged=false, type=0, networkImplementationDetails=null},
>>> bond2 {id=null, vdsId=null, name=bond2, macAddress=null, networkName=uplink, bondOptions=mode=4 miimon=100, bootProtocol=NONE, address=null, subnet=null, gateway=null, mtu=0, bridged=true, type=0, networkImplementationDetails=null}],
>>> removedNetworks=[],
>>> removedBonds=[]) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error = Missing required nics for bonding device., code = 21
>>> 2015-02-10 17:29:42,425 ERROR [org.ovirt.engine.core.bll.network.host.SetupNetworksCommand] (ajp--127.0.0.1-8702-4) [4be2b53a] Command org.ovirt.engine.core.bll.network.host.SetupNetworksCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error = Missing required nics for bonding device., code = 21 (Failed with error ERR_BAD_PARAMS and code 21)
>>> 2015-02-10 17:30:00,053 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (DefaultQuartzScheduler_Worker-32) Autorecovering 2 hosts
>>> 2015-02-10 17:30:00,053 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (DefaultQuartzScheduler_Worker-32) Autorecovering hosts id: c4e1f226-a243-4a9a-8fa9-fc65f4518114, name : test4.netbulae.test
>>> 2015-02-10 17:30:00,075 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (DefaultQuartzScheduler_Worker-32) [655e85c2] Lock Acquired to object EngineLock [exclusiveLocks= key: c4e1f226-a243-4a9a-8fa9-fc65f4518114 value: VDS
>>> , sharedLocks= ]
>>> 
>>> When I check the host, I see that somehow bond0 has appeared and all the IPs have been lost. But I never configured bond0!!!
>>> 
>>> 
>>> What's up with the bonding?
>>> 
>>> Met vriendelijke groet, With kind regards,
>>> 
>>> Jorick Astrego
>>> 
>>> Netbulae Virtualization Experts
>>> Tel: 053 20 30 270   info at netbulae.eu   Staalsteden 4-3A   KvK 08198180
>>> Fax: 053 20 30 271   www.netbulae.eu [1]   7547 TA Enschede   BTW NL821234584B01
>>> 
>>> 

-- 
Rik Theys
 

Links:
------
[1] http://www.netbulae.eu

