Sure, so, unfortunately, if you need a tcpdump from the VM, we'll have
to wait until Monday. They have things set up here so that I can't get
to a console on the VM from the VDI desktop they provide as a
replacement for a real VPN.
What I have, though, is two tcpdumps. I used this command line:
"tcpdump -i p1p1 port not 22". I did not filter out any other ports,
just in case.
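A minimal sketch of how the two captures could be taken so that each host writes its own file. The interface and filter are the ones from the command above; the -w output paths and the capture_cmd helper are assumptions for illustration, not what was actually run.

```shell
#!/bin/sh
# Hypothetical helper: print the tcpdump command for one host.
# $1 = label for the capture file ("from" or "to"; assumed naming)
capture_cmd() {
    # -i p1p1      : interface used in the original command
    # -w FILE      : write raw packets to FILE instead of printing them
    # port not 22  : original filter, excludes SSH traffic only
    printf 'tcpdump -i p1p1 -w /tmp/%s.pcap port not 22\n' "$1"
}

capture_cmd from   # run the printed command on swm-01
capture_cmd to     # run the printed command on swm-02
```

The helper only prints the command, so it can be reviewed before anything is run as root on the hosts.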
The "from" file is where the VM was migrating from. The host's name is
swm-01.
The "to" file is where the VM was migrating to. The host's name is swm-02.
When I migrated from swm-01 to swm-02, it seems that the hosts stayed
up. The VM is very small, so I figured it just got by without
triggering whatever is closing off these ports. When I migrated it
back, though, it did trigger the event and swm-01 went "NonResponsive".
Thank you so much!
cecjr
On Sun, Aug 25, 2019 at 1:56 AM Strahil <hunter86_bg(a)yahoo.com> wrote:
It seems that according to
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/...
the ports of interest are:
16514/TCP
49152 - 49216/TCP
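
To look only at the ports listed above, a saved capture could be re-read with a narrower filter. This is a sketch: the migration_filter and read_cmd helpers and the capture path are hypothetical, and the port numbers are simply the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: pcap filter matching the ports listed above.
migration_filter() {
    # 16514/TCP and 49152-49216/TCP
    printf '%s\n' 'tcp port 16514 or tcp portrange 49152-49216'
}

# Print the command that would re-read a saved capture with that filter.
# -nn skips name and port resolution, -r reads from a file instead of a NIC.
read_cmd() {
    printf "tcpdump -nn -r %s '%s'\n" "$1" "$(migration_filter)"
}

read_cmd /tmp/from.pcap   # /tmp/from.pcap is an assumed file name
```
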
Best Regards,
Strahil Nikolov

On Aug 25, 2019 08:48, Strahil <hunter86_bg(a)yahoo.com> wrote:
>
> Curtis,
>
> Do you have enough space to run tcpdump (port not 22) on both hosts and
> on the small VM you created previously, and then start the migration?
>
> Best Regards,
> Strahil Nikolov
>
> On Aug 24, 2019 22:15, "Curtis E. Combs Jr." <ej.albany(a)gmail.com> wrote:
> >
> > I applied a 90 Mbps QoS rate limit, with shares set to 10, to both
> > interfaces of 2 of the hosts. My hosts' names are swm-01 and swm-02.
> >
> > Creating a small VM from a Cinder template and running it gave me a test VM.
> >
> > When I migrated from swm-01 to swm-02, swm-01 immediately became
> > unresponsive to pings, SSH, and the oVirt interface, which marked
> > it as "NonResponsive" soon after the VM finished. The VM did finish
> > migrating; however, I'm unsure whether that counts as a good migration.
> >
> > Thank you, Strahil.
> >
> > On Sat, Aug 24, 2019 at 12:39 PM Strahil <hunter86_bg(a)yahoo.com> wrote:
> > >
> > > What is your bandwidth threshold for the network used for VM migration?
> > > Can you set a 90 Mbit/s threshold (yes, less than 100 Mbit/s) and try to
> > > migrate a small (1 GB RAM) VM?
> > >
> > > Do you see disconnects?
> > >
> > > If not, raise the threshold a little and check again.
> > >
> > > Best Regards,
> > > Strahil Nikolov
> > >
> > > On Aug 23, 2019 23:19, "Curtis E. Combs Jr." <ej.albany(a)gmail.com> wrote:
> > > >
> > > > It took a while for my servers to come back on the network this time.
> > > > I think it's due to oVirt continuing to try to migrate the VMs around
> > > > like I requested. The 3 servers' names are "swm-01, swm-02 and
> > > > swm-03". Eventually (about 2-3 minutes ago) they all came back online.
> > > >
> > > > So I disabled and stopped the lldpad service.
> > > >
> > > > Nope. Started some more migrations and swm-02 and swm-03 disappeared
> > > > again. No ping, SSH hung, same as before - almost as soon as the
> > > > migration started.
> > > >
> > > > If you all have any ideas what switch-level setting might be enabled,
> > > > let me know, 'cause I'm stumped. I can add it to the ticket that's
> > > > requesting the port configurations. I've already added the port
> > > > numbers and switch name that I got from CDP.
> > > >
> > > > Thanks again, I really appreciate the help!
> > > > cecjr
> > > >
> > > >
> > > >
> > > > On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler <dholler(a)redhat.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <dholler(a)redhat.com> wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > > >> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany(a)gmail.com> wrote:
> > > > >>>
> > > > > >>> This little cluster isn't in production or anything like that yet.
> > > > > >>>
> > > > > >>> So, I went ahead and used your ethtool commands to disable pause
> > > > > >>> frames on both interfaces of each server. I then chose a few VMs to
> > > > > >>> migrate around at random.
> > > > > >>>
> > > > > >>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
> > > > > >>> ssh, and the SSH session that I had open was unresponsive.
> > > > >>>
> > > > >>> Any other ideas?
> > > > >>>
> > > > >>
> > > > > >> Sorry, no. It looks like two different NICs with different
> > > > > >> drivers and firmware go down together.
> > > > > >> This is a strong indication that the root cause is related to the switch.
> > > > > >> Maybe you can get some information about the switch config with
> > > > > >> 'lldptool get-tlv -n -i em1'
> > > > >>
> > > > >
> > > > > > Another guess:
> > > > > > After the optional 'lldptool get-tlv -n -i em1', run
> > > > > > 'systemctl stop lldpad'
> > > > > > and then try to migrate again.
> > > > >
> > > > >
> > > > >>
> > > > >>
> > > > >>>
> > > > > >>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler(a)redhat.com> wrote:
> > > > >>> >
> > > > >>> >
> > > > >>> >
> > > > > >>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany(a)gmail.com> wrote:
> > > > >>> >>
> > > > > >>> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
> > > > > >>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
> > > > > >>> >> the port configuration. They just get ignored - but that's par for the
> > > > > >>> >> course for IT here. Only about 2 out of 10 of our tickets get any
> > > > > >>> >> response and usually the response doesn't help. Then