Re: Need to enable STP on ovirt bridges

Sure, so, unfortunately, if you need a tcpdump from the VM, we'll have to wait until Monday. They have things set up here so that I can't get to a console on the VM from the VDI desktop they provide as a replacement for a real VPN.
What I have, though, is two tcpdumps. I used this command line: "tcpdump -i p1p1 port not 22". I did not use any other ports, just in case.
The "from" file is where the VM was migrating from. The host's name is swm-01. The "to" file is where the VM was migrating to. The host's name is swm-02.
When I migrated from swm-01 to swm-02, it seems that the hosts stayed up. The VM is very small, so I figured it just got by without triggering whatever is closing off these ports. When I migrated it back, though, it did trigger the event and swm-01 went "NonResponsive".
Thank you so much! cecjr
On Sun, Aug 25, 2019 at 1:56 AM Strahil <hunter86_bg@yahoo.com> wrote:
According to https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/htm... the ports of interest seem to be 16514/TCP and 49152-49216/TCP.
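To narrow a capture to just those ports, a filter along the following lines might help. This is only a sketch: the function names are made up for illustration, the interface name p1p1 is taken from the tcpdump command used earlier in the thread, and the only assumption about tcpdump itself is the standard pcap filter syntax ("port" and "portrange").

```python
# Ports quoted from the message above; the helper names are hypothetical.
SINGLE_PORT = 16514
PORT_RANGE = (49152, 49216)


def migration_capture_filter():
    """Build a pcap filter expression matching only the ports of interest."""
    low, high = PORT_RANGE
    return f"tcp port {SINGLE_PORT} or tcp portrange {low}-{high}"


def tcpdump_argv(interface, capture_filter=None):
    """Return the argument list for tcpdump; nothing is executed here."""
    if capture_filter is None:
        capture_filter = migration_capture_filter()
    # tcpdump accepts the whole filter expression as a single argument.
    return ["tcpdump", "-i", interface, capture_filter]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it as root.
    print(" ".join(tcpdump_argv("p1p1")))
```

The script only prints the command rather than running it, since capturing needs root and the right interface name on each host.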
Best Regards, Strahil Nikolov
On Aug 25, 2019 08:48, Strahil <hunter86_bg@yahoo.com> wrote:
Curtis,
Do you have enough space to run tcpdump (port not 22) on both hosts and on the small VM you created previously, and then start the migration?
Best Regards, Strahil Nikolov
On Aug 24, 2019 22:15, "Curtis E. Combs Jr." <ej.albany@gmail.com> wrote:
I applied a 90 Mbps QoS rate limit, with the shares set to 10, to both interfaces of 2 of the hosts. My hosts' names are swm-01 and swm-02.
Creating a small VM from a Cinder template and running it gave me a test VM.
When I migrated from swm-01 to swm-02, swm-01 immediately became unresponsive to pings, to SSH, and to the ovirt interface, which marked it as "NonResponsive" soon after the VM finished. The VM did finish migrating; however, I'm unsure if that's a good migration or not.
Thank you, Strahil.
On Sat, Aug 24, 2019 at 12:39 PM Strahil <hunter86_bg@yahoo.com> wrote:
What is your bandwidth threshold for the network used for VM migration? Can you set a 90 Mbit/s threshold (yes, less than 100 Mbit/s) and try to migrate a small (1 GB RAM) VM?
Do you see disconnects?
If not, raise the threshold a little and check again.
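For a rough idea of how long such a test takes, the arithmetic can be sketched as below. It assumes "1 GB RAM" means 2^30 bytes and that memory is copied exactly once at the full rate limit, so the result is only a lower bound (pages dirtied during migration and protocol overhead are ignored). The 10 Mbit/s step size is a made-up value for illustration.

```python
def transfer_seconds(ram_bytes, rate_mbit_per_s):
    """Lower-bound time to copy ram_bytes once at the given rate limit."""
    bits = ram_bytes * 8
    return bits / (rate_mbit_per_s * 1_000_000)


def threshold_steps(start, stop, step):
    """Thresholds (Mbit/s) to try in order, raising a little each round."""
    return list(range(start, stop + 1, step))


if __name__ == "__main__":
    one_gib = 1024 ** 3  # assumption: 1 GB RAM taken as 2**30 bytes
    for rate in threshold_steps(90, 130, 10):  # step size is hypothetical
        secs = transfer_seconds(one_gib, rate)
        print(f"{rate} Mbit/s: at least {secs:.0f} s for a 1 GB VM")
```

At 90 Mbit/s this comes out to roughly a minute and a half, so a host that drops off the network within seconds of the migration starting is failing long before the copy could complete.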
Best Regards, Strahil Nikolov
On Aug 23, 2019 23:19, "Curtis E. Combs Jr." <ej.albany@gmail.com> wrote:
It took a while for my servers to come back on the network this time. I think it's due to ovirt continuing to try to migrate the VMs around like I requested. The 3 servers' names are "swm-01, swm-02 and swm-03". Eventually (about 2-3 minutes ago) they all came back online.
So I disabled and stopped the lldpad service.
Nope. Started some more migrations and swm-02 and swm-03 disappeared again. No ping, SSH hung, same as before - almost as soon as the migration started.
If you all have any ideas what switch-level setting might be enabled, let me know, 'cause I'm stumped. I can add it to the ticket that's requesting the port configurations. I've already added the port numbers and switch name that I got from CDP.
Thanks again, I really appreciate the help! cecjr
On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <dholler@redhat.com> wrote:
> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
>> This little cluster isn't in production or anything like that yet.
>>
>> So, I went ahead and used your ethtool commands to disable pause
>> frames on both interfaces of each server. I then chose a few VMs to
>> migrate around at random.
>>
>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
>> ssh, and the SSH session that I had open was unresponsive.
>>
>> Any other ideas?
>
> Sorry, no. Looks like two different NICs with different drivers and firmware go down together.
> This is a strong indication that the root cause is related to the switch.
> Maybe you can get some information about the switch config by
> 'lldptool get-tlv -n -i em1'
Another guess: after the optional 'lldptool get-tlv -n -i em1', run 'systemctl stop lldpad' and then try to migrate again.
>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler@redhat.com> wrote:
>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
>> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
>> >> the port configuration. They just get ignored - but that's par for the
>> >> course for IT here. Only about 2 out of 10 of our tickets get any
>> >> response and usually the response doesn't help. Then
participants (2)
- Curtis E. Combs Jr.
- Strahil