-----Original Message-----
From: Michael S. Tsirkin [mailto:mst@redhat.com]
Sent: Wednesday, 29 July 2015 12:03
To: NUNIN Roberto
Cc: Fabian Deutsch; users(a)ovirt.org
Subject: Re: R: [ovirt-users] R: R: R: R: R: R: PXE boot of a VM on vdsm don't read DHCP offer
On Wed, Jul 29, 2015 at 12:00:38PM +0200, NUNIN Roberto wrote:
>
> > -----Original Message-----
> > From: users-bounces(a)ovirt.org [mailto:users-bounces@ovirt.org] On behalf of
> > Michael S. Tsirkin
> > Sent: Thursday, 9 July 2015 15:15
> > To: Fabian Deutsch
> > Cc: users(a)ovirt.org
> > Subject: Re: [ovirt-users] R: R: R: R: R: R: PXE boot of a VM on vdsm don't read
> > DHCP offer
> >
> > On Thu, Jul 09, 2015 at 08:57:50AM -0400, Fabian Deutsch wrote:
> > > ----- Original Message -----
> > > > On Wed, Jul 08, 2015 at 09:11:42AM +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 07, 2015 at 05:13:28PM +0100, Dan Kenigsberg wrote:
> > > > > > On Tue, Jul 07, 2015 at 10:14:54AM +0200, NUNIN Roberto wrote:
> > > > > > > >
> > > > > > > > On Mon, Jul 06, 2015 at 10:33:59AM +0200, NUNIN Roberto wrote:
> > > > > > > > > Hi Dan
> > > > > > > > >
> > > > > > > > > Sorry for the question: what do you mean by interface vnetxxxx?
> > > > > > > > > Currently our path is:
> > > > > > > > > eno1 - eno2 ---- bond0 ----- bond.3500 (VLAN) ------ bridge ----- vm.
> > > > > > > > >
> > > > > > > > > Which one of these?
> > > > > > > > > Moreover, having read Fabian's statements about bonding limits, today I
> > > > > > > > > can try to switch to a config without bonding.
> > > > > > > >
> > > > > > > > "vm" is a complicated term.
> > > > > > > >
> > > > > > > > `brctl show` would not show you a "vm" connected to a bridge. What
> > > > > > > > you WOULD see is a vnet888 tap device. The "other side" of this device
> > > > > > > > is held by qemu, which implements the VM.
> > > > > > >
> > > > > > > Ok, understood and found it, vnet2
> > > > > > >
> > > > > > > >
> > > > > > > > I'm asking if the dhcp offer has reached that tap device.
> > > > > > >
> > > > > > > No, the DHCP offer packet does not reach the vnet2 interface; I can
> > > > > > > see only DHCP DISCOVER.
> > > > > >
> > > > > > Ok, so it seems that we have a problem in the host bridging.
> > > > > >
> > > > > > Is it the latest kernel-3.10.0-229.7.2.el7.x86_64 ?
> > > > > >
> > > > > > Michael, a DHCP DISCOVER is sent out of a just-booted guest, and OFFER
> > > > > > returns to the bridge, but is not propagated to the tap device.
> > > > > > Can you suggest how to debug this further?
> > > > >
> > > > > Dump packets including the ethernet headers.
> > > > > Likely something interfered with them so the eth address is wrong.
> > > > >
> > > > > Since bonding does this sometimes, this is the most likely culprit.
> > > >
> > > > We've ruled this out already - Roberto reproduces the issue without a
> > > > bond.
> > >
> > > To me this looks like a regression in the host side bridging. But otoh it
> > > doesn't look like it's happening always, because otherwise I'd expect more
> > > noise around this issue.
> > >
> > > - fabian
> >
> > Hard to say. E.g. forwarding delay would do this for a while.
> > If eth address of the packets is okay, poke at the fdb, maybe there's
> > something wrong there. Maybe stp is detecting a loop - try checking that.
>
> Is someone checking this?
> In the tested config STP was off.

Then maybe you have a loop :)

No, it is not a loop.
I ran further tests today and narrowed down the conditions under which the problem occurs.

The erratic behavior shows up only in a cluster whose nodes are HP ProLiant BL660c Gen8
blades, connected to Cisco Nexus 7K through HP FEX B22 blade interconnects and Cisco Nexus
5596 switches. All NICs are 10 Gbit.

It does not happen with two HP ProLiant DL380 G5 servers with 10 Gbit NICs connected directly
to Cisco Nexus 5548UP switches, nor with two HP ProLiant ML350e Gen8 servers with 1 Gbit NICs
connected to a Cisco 4948 and from there to the same Nexus 5548UP.

All nodes are running CentOS 7.1 with the latest updates, and all networks are configured in
the same way: bonding over two NICs, then VLAN interfaces and a bridge towards the VMs.
Bonding mode is 4 on all of them and works correctly on the DL380 and ML350 clusters.

I then changed the bonding mode on the BL660 cluster to mode 1, and the issue disappeared.

In all other bonding modes it doesn't work: the bridge interfaces receive the DHCP offers and
do NOT reject the packets, but the tap interfaces never receive the offer. It works only
with mode 1.

How can I investigate further? The goal is to run mode 4, to aggregate the available
bandwidth.
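
To compare the working and failing hosts side by side, here is a minimal Python sketch that reads the bonding mode and slave list from the bonding driver's status file. It assumes the usual /proc/net/bonding/<bond> layout ("Bonding Mode:" and "Slave Interface:" lines) and that the bond is named bond0 as in the path described earlier in the thread; the function name `bonding_summary` is only illustrative.

```python
import re
from pathlib import Path


def bonding_summary(text: str) -> dict:
    """Extract the bonding mode and slave interface names from the
    contents of a /proc/net/bonding/<bond> status file."""
    # Example line: "Bonding Mode: IEEE 802.3ad Dynamic link aggregation"
    mode = re.search(r"^Bonding Mode:[ \t]*(.+)$", text, re.MULTILINE)
    # Example line: "Slave Interface: eno1"
    slaves = re.findall(r"^Slave Interface:[ \t]*(\S+)", text, re.MULTILINE)
    return {
        "mode": mode.group(1).strip() if mode else None,
        "slaves": slaves,
    }


if __name__ == "__main__":
    # Assumption: the bond is called bond0; adjust the name per host.
    status_file = Path("/proc/net/bonding/bond0")
    if status_file.exists():
        print(bonding_summary(status_file.read_text()))
    else:
        print(f"{status_file} not found (no bond with that name on this host)")
```

Running this on a BL660 node and on a DL380 node and diffing the output would at least confirm that both report the same mode and the same pair of slaves before looking deeper at the bridge.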
RN
> RN
> >
> > --
> > MST
> > _______________________________________________
> > Users mailing list
> > Users(a)ovirt.org
> >
> > http://lists.ovirt.org/mailman/listinfo/users
>
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise private information. If you have received it in error,
> please notify the sender immediately, deleting the original and all copies and
> destroying any hard copies. Any other use is strictly prohibited and may be unlawful.