
It seems that according to https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/htm... the ports of interest are:
16514/TCP
49152 - 49216/TCP
Best Regards,
Strahil Nikolov
On Aug 25, 2019 08:48, Strahil <hunter86_bg@yahoo.com> wrote:
Curtis,
Do you have enough space to run tcpdump (with the filter 'port not 22') on both hosts and on the small VM you created previously, and then start the migration?
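A capture along those lines could be sketched as below. This is only a rough sketch: the interface name em1 and the output paths are assumptions, and the helper migration_filter is a made-up name. The narrower filter uses the 16514/TCP and 49152 - 49216/TCP ports mentioned elsewhere in this thread, in case 'port not 22' produces too much data.

```shell
#!/bin/sh
# Sketch only: build a capture filter for the migration ports named in this
# thread. Nothing is captured by sourcing this file; it only defines a helper.

migration_filter() {
    # 16514/TCP plus the 49152-49216/TCP range, in pcap filter syntax
    printf '%s' 'tcp port 16514 or tcp portrange 49152-49216'
}

# Usage (as root, on each host and on the test VM, before starting the
# migration). The interface em1 and the file paths are assumptions:
#   tcpdump -n -i em1 -w /tmp/migration-ports.pcap "$(migration_filter)"
# Or, capturing everything except SSH as suggested above:
#   tcpdump -n -i em1 -w /tmp/all-but-ssh.pcap 'port not 22'
```

Writing to a file with -w keeps the capture small on screen and lets the traces from both hosts be compared afterwards.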
Best Regards,
Strahil Nikolov
On Aug 24, 2019 22:15, "Curtis E. Combs Jr." <ej.albany@gmail.com> wrote:
I applied a 90 Mbps QoS rate limit, with the shares set to 10, to both interfaces of 2 of the hosts. The host names are swm-01 and swm-02.
Creating a small VM from a Cinder template and running it gave me a test VM.
When I migrated from swm-01 to swm-02, swm-01 immediately became unresponsive to pings, to SSH, and to the oVirt interface, which marked it as "NonResponsive" soon after the VM finished. The VM did finish migrating; however, I'm unsure whether that counts as a good migration or not.
Thank you, Strahil.
On Sat, Aug 24, 2019 at 12:39 PM Strahil <hunter86_bg@yahoo.com> wrote:
What is your bandwidth threshold for the network used for VM migration? Can you set a 90 Mbit/s threshold (yes, less than 100 Mbit/s) and try to migrate a small (1 GB RAM) VM?
Do you see disconnects?
If not, raise the threshold a little and check again.
Best Regards,
Strahil Nikolov
On Aug 23, 2019 23:19, "Curtis E. Combs Jr." <ej.albany@gmail.com> wrote:
It took a while for my servers to come back on the network this time. I think it's due to oVirt continuing to try to migrate the VMs around like I requested. The 3 servers' names are "swm-01, swm-02 and swm-03". Eventually (about 2-3 minutes ago) they all came back online.
So I disabled and stopped the lldpad service.
Nope. Started some more migrations and swm-02 and swm-03 disappeared again. No ping, SSH hung, same as before - almost as soon as the migration started.
If you all have any ideas what switch-level setting might be enabled, let me know, because I'm stumped. I can add it to the ticket that's requesting the port configurations. I've already added the port numbers and switch name that I got from CDP.
Thanks again, I really appreciate the help! cecjr
On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
>
> This little cluster isn't in production or anything like that yet.
>
> So, I went ahead and used your ethtool commands to disable pause
> frames on both interfaces of each server. I then, chose a few VMs to
> migrate around at random.
>
> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
> ssh, and the SSH session that I had open was unresponsive.
>
> Any other ideas?
>
Sorry, no. It looks like two different NICs with different drivers and firmware go down together. This is a strong indication that the root cause is related to the switch. Maybe you can get some information about the switch config with 'lldptool get-tlv -n -i em1'.
Another guess: after the optional 'lldptool get-tlv -n -i em1', run 'systemctl stop lldpad' and then try to migrate again.
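The order of those two steps could be sketched like this. It is only a sketch: the helper names run and lldp_then_stop are made up for illustration, and DRY_RUN defaults to 1 so the commands are printed rather than executed until that is changed on purpose.

```shell
#!/bin/sh
# Sketch: query the LLDP neighbour TLVs first (optional), then stop lldpad
# before retrying the migration. With DRY_RUN=1 (the default here) the
# commands are only echoed, not run.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"      # show what would be executed
    else
        "$@"           # actually execute the command
    fi
}

lldp_then_stop() {
    iface="$1"
    run lldptool get-tlv -n -i "$iface"   # neighbour (switch) TLVs
    run systemctl stop lldpad             # stop the LLDP agent
}

# Usage (as root): DRY_RUN=0 lldp_then_stop em1
# then start a test migration and watch whether the host stays reachable.
```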
>
> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler@redhat.com> wrote:
> >
> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
> >>
> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
> >> the port configuration. They just get ignored - but that's par for the
> >> course for IT here. Only about 2 out of 10 of our tickets get any
> >> response and usually the response doesn't help. Then