
The log from the start of the migration:

2021-01-20 20:27:11,027-0500 INFO (jsonrpc/5) [api.virt] START migrate(params={u'incomingLimit': 2, u'src': u'vhost2.my.domain.name', u'dstqemu': u'10.0.20.100', u'autoConverge': u'true', u'tunneled': u'false', u'enableGuestEvents': True, u'dst': u'vhost1.my.domain.name:54321', u'convergenceSchedule': {u'init': [{u'params': [u'100'], u'name': u'setDowntime'}], u'stalling': [{u'action': {u'params': [u'150'], u'name': u'setDowntime'}, u'limit': 1}, {u'action': {u'params': [u'200'], u'name': u'setDowntime'}, u'limit': 2}, {u'action': {u'params': [u'300'], u'name': u'setDowntime'}, u'limit': 3}, {u'action': {u'params': [u'400'], u'name': u'setDowntime'}, u'limit': 4}, {u'action': {u'params': [u'500'], u'name': u'setDowntime'}, u'limit': 6}, {u'action': {u'params': [], u'name': u'abort'}, u'limit': -1}]}, u'vmId': u'a24fd7e3-161c-451e-8880-b3e7e1f7d86f', u'abortOnError': u'true', u'outgoingLimit': 2, u'compressed': u'false', u'maxBandwidth': 125, u'method': u'online'}) from=::ffff:10.0.0.20,40308, flow_id=fc4e0792-a3a0-425f-b0b6-bcaf5e0f4775, vmId=a24fd7e3-161c-451e-8880-b3e7e1f7d86f (api:48)
2020-01-20 20:27:13,367-0500 INFO (migsrc/a24fd7e3) [virt.vm] (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Creation of destination VM took: 2 seconds (migration:469)
2020-01-20 20:27:13,367-0500 INFO (migsrc/a24fd7e3) [virt.vm] (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') starting migration to qemu+tls://vhost1.my.domain.name/system with miguri tcp://10.0.20.100 (migration:498)

That all appears to be in order, as 10.0.20.100 is the correct IP address of the migration interface on the destination host. The netcat also looks good:

[root@vhost2 ~]# nc -vz 10.0.20.100 54321
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.20.100:54321.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.

However, the final test is very telling:

[root@vhost2 ~]# ping -M do -s $((9000 - 28)) 10.0.20.100
PING 10.0.20.100 (10.0.20.100) 8972(9000) bytes of data.
^C
--- 10.0.20.100 ping statistics ---
14 packets transmitted, 0 received, 100% packet loss, time 12999ms

I don't think my switch is handling the MTU setting, even though it is configured to do so. I will have to investigate further.

-Ben

On Mon, Jan 20, 2020 at 3:54 PM Dominik Holler <dholler@redhat.com> wrote:
On Mon, Jan 20, 2020 at 2:34 PM Milan Zamazal <mzamazal@redhat.com> wrote:
Ben <gravyfish@gmail.com> writes:
Hi Milan,
Thanks for your reply. I checked the firewall and saw that both the bond0 interface and the VLAN interface bond0.20 had been added to the default zone, which I believe should provide the necessary firewall access (output below).
I double-checked the destination host's VDSM logs and wasn't able to find any warning or error-level logs during the migration timeframe.
I checked the migration_port_* and *_port settings in qemu.conf and libvirtd.conf, and all of those lines are commented out. I have not modified either file.
The commented-out settings define the default ports used for migrations, so they apply even when commented out. I can see you have libvirt-tls open below; I'm not sure about the QEMU ports. If migration works when not using a separate migration network, then it should work with the same rules for the migration network, so I think your settings are OK.
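For reference, the relevant defaults look roughly like this when left untouched (the exact values can vary between libvirt versions, so treat these as illustrative rather than authoritative):

    # /etc/libvirt/qemu.conf -- defaults still apply while commented out
    #migration_port_min = 49152
    #migration_port_max = 49215

    # /etc/libvirt/libvirtd.conf
    #tls_port = "16514"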
The fact that you don't get any better explanation than "unexpectedly failed", and that it fails before transferring any data, indicates a possible networking error, but I can't help with that; someone with networking knowledge should.
Can you please share the relevant lines from vdsm.log on the source host which log the start of the migration? Those lines should contain the destination host's IP address on the migration network. Please note that there are two network connections: libvirt's control data is transmitted encrypted on the management network, while QEMU's data is transmitted on the migration network. On the source host, can you please run

ping -M do -s $((9000 - 28)) dest_ip_address_on_migration_network_from_vdsm_log
nc -vz dest_ip_address_on_migration_network_from_vdsm_log dest_port_on_migration_network_from_vdsm_log
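One way to pull those lines out (the log path and wording below are the usual ones on an oVirt host; adjust if your layout differs):

    grep 'starting migration to' /var/log/vdsm/vdsm.log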
You can also try to enable libvirt debugging on both sides in /etc/libvirt/libvirtd.conf and restart libvirt (beware, those logs are huge). The libvirt logs should report some error.
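A minimal sketch of the debug settings, assuming the standard libvirtd.conf option names (tune the filters as needed):

    # /etc/libvirt/libvirtd.conf
    log_filters="1:qemu 1:libvirt 3:object 3:json 3:event 1:util"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"

Then restart libvirtd on both hosts, e.g. with systemctl restart libvirtd.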
[root@vhost2 vdsm]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: bond0 bond0.20 em1 em2 migration ovirtmgmt p1p1
  sources:
  services: cockpit dhcpv6-client libvirt-tls ovirt-imageio ovirt-vmconsole snmp ssh vdsm
  ports: 1311/tcp 22/tcp 6081/udp 5666/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
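For what it's worth, if the QEMU migration port range did turn out to be blocked, opening it would look something like this (49152-49215 is assumed here to be the libvirt default range; check qemu.conf on your hosts first):

    firewall-cmd --permanent --add-port=49152-49215/tcp
    firewall-cmd --reload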
On Mon, Jan 20, 2020 at 6:29 AM Milan Zamazal <mzamazal@redhat.com> wrote:
Ben <gravyfish@gmail.com> writes:
Hi, I'm pretty stuck at the moment so I hope someone can help me.
I have an oVirt 4.3 data center with two hosts. Recently, I attempted to segregate migration traffic from the standard ovirtmgmt network, where the VM traffic and all other traffic reside.
I set up the VLAN on my router and switch, and created LACP bonds on both hosts, tagging them with the VLAN ID. I confirmed the routes work fine, and traffic speeds are as expected. MTU is set to 9000.
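A couple of quick sanity checks for this kind of setup (the interface and bond names here are just examples from my configuration):

    ip -d link show bond0.20       # VLAN interface: confirm mtu 9000 and the VLAN ID
    cat /proc/net/bonding/bond0    # bond: confirm 802.3ad mode and that the slaves are up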
After configuring the migration network in the cluster and dragging and dropping it onto the bonds on each host, VMs fail to migrate.
oVirt is not reporting any issues with the network interfaces or with host network sync. However, when I attempt to live-migrate a VM, progress gets to 1% and stalls. The transfer rate is 0 Mbps, and the operation eventually fails.
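If it is useful, the transfer can also be watched directly on the source host while it stalls (assuming virsh access on the host; the VM name is a placeholder):

    virsh domjobinfo <vm_name>    # shows data processed/remaining and current memory bandwidth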
I have not been able to identify anything useful in the VDSM logs on the source or destination hosts, or in the engine logs. VDSM repeats the WARN and INFO entries below for the duration of the process, then logs the final entries when it fails. I can provide more logs if it would help. I'm not even sure where to start: since I am, at best, a novice at networking, my suspicion the entire time has been that something is misconfigured in my network. However, the routes are good, speed tests are fine, and I can't find anything else wrong with the connections. It's not impacting any other traffic over the bond interfaces.
Are there other requirements that must be met for VMs to migrate over a separate interface/network?
Hi, did you check your firewall settings? Are the required ports open? See migration_port_* options in /etc/libvirt/qemu.conf and *_port options in /etc/libvirt/libvirtd.conf.
Is there any error reported in the destination vdsm.log?
Regards, Milan
2020-01-12 03:18:28,245-0500 WARN (migmon/a24fd7e3) [virt.vm] (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Migration stalling: remaining (4191MiB) > lowmark (4191MiB). (migration:854)
2020-01-12 03:18:28,245-0500 INFO (migmon/a24fd7e3) [virt.vm] (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Migration Progress: 930.341 seconds elapsed, 1% of data processed, total data: 4192MB, processed data: 0MB, remaining data: 4191MB, transfer speed 0MBps, zero pages: 149MB, compressed: 0MB, dirty rate: 0, memory iteration: 1 (migration:881)
2020-01-12 03:18:31,386-0500 ERROR (migsrc/a24fd7e3) [virt.vm] (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') operation failed: migration out job: unexpectedly failed (migration:282)
2020-01-12 03:18:32,695-0500 ERROR (migsrc/a24fd7e3) [virt.vm] (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Failed to migrate (migration:450)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 431, in _regular_run
    time.time(), migrationParams, machineParams
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 505, in _startUnderlyingMigration
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 591, in _perform_with_conv_schedule
    self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 525, in _perform_migration
    self._migration_flags)
libvirtError: operation failed: migration out job: unexpectedly failed
2020-01-12 03:18:40,880-0500 INFO (jsonrpc/6) [api.virt] FINISH getMigrationStatus return={'status': {'message': 'Done', 'code': 0}, 'migrationStats': {'status': {'message': 'Fatal error during migration', 'code': 12}, 'progress': 1L}} from=::ffff:10.0.0.20,41462, vmId=a24fd7e3-161c-451e-8880-b3e7e1f7d86f (api:54)