The problem appears to be MTU related; I may have a network configuration problem. Setting the MTU back to 1500 seems to have solved it for now.
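
For anyone wanting to confirm a similar MTU mismatch, one rough check is to ping the peer host with the Don't Fragment bit set and a payload sized to the MTU under test. The sketch below only builds the Linux iputils `ping` command lines. The host name is a placeholder, and 9000 is an assumed jumbo-frame value, not something stated in this thread.

```python
import shlex

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} is too small for an IPv4 ICMP echo")
    return mtu - overhead


def ping_command(host: str, mtu: int, count: int = 3) -> str:
    """Build a Linux iputils ping command that forbids fragmentation (-M do)."""
    args = ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]
    return shlex.join(args)


if __name__ == "__main__":
    # "peer-host.example" is a hypothetical name; substitute the address of
    # the other host on the migration network.
    for mtu in (1500, 9000):  # 9000 is an assumed jumbo-frame setting
        print(ping_command("peer-host.example", mtu))
```

If the larger size fails between two hosts while the 1500-sized ping succeeds, something on that path (NIC, bond, bridge, or switch port) is likely not passing the larger frames.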

On Thu, May 27, 2021 at 2:26 PM Jayme <jaymef@gmail.com> wrote:
I've gotten a bit further. I have a separate 10GbE network for GlusterFS traffic, which was also set as the migration network. I disabled migration on the GlusterFS network and enabled it on the default management network, and now migration seems to be working. I'm not sure why at this point; it used to work fine on the GlusterFS migration network in the past.

On Thu, May 27, 2021 at 2:11 PM Jayme <jaymef@gmail.com> wrote:
I have a three-node oVirt 4.4.5 cluster running oVirt Node hosts. Storage is a mix of GlusterFS and NFS. Everything has been running smoothly, but the other day I noticed many VMs had invalid snapshots. I run a script to export OVAs of the VMs for backup purposes; the exports seem to have been fine, but the snapshots failed to delete at the end. I was able to manually delete the snapshots through the oVirt admin GUI without any errors or warnings, and the VMs have been running fine and can be restarted without problems.

I thought this problem might be due to a snapshot bug which is supposedly fixed in oVirt 4.4.6. I decided to start upgrading the cluster to 4.4.6 and am now having a problem with VMs not being able to migrate.

When I migrate any VM (it doesn't seem to matter which host to and from), the process starts but stops at 0-1%. Eventually, after 15-30 minutes or more, the tasks are all completed but the VM is not migrated.

I am unable to migrate any VMs and as such I cannot place any host in maintenance mode.

I've attached some VDSM logs from the source and destination hosts; these were taken after initiating a migration of a single VM.

I'm seeing some errors in the logs regarding the migration stalling, but I'm not able to determine why it's stalling.
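
To pick the stall out of the noise, the "Migration Progress" entries can be pulled from a VDSM log like the excerpt that follows and checked for samples where nothing is moving. This is only a sketch: the regular expression is written against the line format seen in this excerpt, and the helper names and the three-sample threshold are my own assumptions, not anything VDSM defines.

```python
import re

# Matches the "Migration Progress" lines as they appear in this log excerpt.
PROGRESS_RE = re.compile(
    r"^(?P<ts>\S+ \S+) INFO\s+\(migmon/\w+\) \[virt\.vm\] "
    r"\(vmId='(?P<vm>[0-9a-f-]+)'\) Migration Progress: "
    r"(?P<elapsed>[\d.]+) seconds elapsed, (?P<pct>\d+)% of data processed, "
    r"total data: (?P<total>\d+)MB, processed data: (?P<processed>\d+)MB, "
    r"remaining data: (?P<remaining>\d+)MB, transfer speed (?P<speed>\d+)Mbps"
)


def parse_progress(line):
    """Return a dict of progress fields, or None if the line is not a progress entry."""
    m = PROGRESS_RE.match(line)
    if m is None:
        return None
    return {
        "timestamp": m["ts"],
        "vm_id": m["vm"],
        "elapsed_s": float(m["elapsed"]),
        "percent": int(m["pct"]),
        "processed_mb": int(m["processed"]),
        "remaining_mb": int(m["remaining"]),
        "speed_mbps": int(m["speed"]),
    }


def is_stuck(samples, min_samples=3):
    """True if the last min_samples entries show no data processed at zero speed."""
    recent = samples[-min_samples:]
    return len(recent) >= min_samples and all(
        s["processed_mb"] == 0 and s["speed_mbps"] == 0 for s in recent
    )


def stuck_migration(lines, min_samples=3):
    """Parse an iterable of log lines and report whether the migration looks stuck."""
    samples = [p for p in map(parse_progress, lines) if p is not None]
    return is_stuck(samples, min_samples)
```

A migration that reports 0MB processed at 0Mbps for several consecutive samples, as here, is not transferring memory at all, which suggests a transport problem rather than a guest dirtying pages faster than they can be copied.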

2021-05-27 17:10:22,167+0000 INFO  (jsonrpc/4) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'code': 0, 'message': 'Done'}, 'io_tune_policies_dict': {'f8f4e4a1-b565-4663-8962-c8804dbb86fb': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme1n1/bce04425-1d25-4489-bdab-2834a1a57db8/images/38b27cce-c744-4a12-85a3-3af07d386da2/93c1e793-f8cb-42c9-86a6-0e9ce4a6023a', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '2b87204f-f695-474a-9f08-47b85fcac366': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/f2e0c9f3-ab0d-441a-85a6-07a42e78b5a8/848f353e-6787-4e20-ab7b-0541ebd852c6', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '26332421-54a3-4afc-90e7-551a7e314c80': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/b7a785f9-307b-42af-9bbe-23cac884fe97/ed1d027e-a36a-4e6b-9207-119915044e06', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '60edbd80-dad7-4bf8-8fd1-e138413cf9f6': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/535fcb2e-ece9-4d50-86fe-bf6264d11ae1/6c01a036-8a14-46ba-a4b4-fe4f66a586a3', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': 
'/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/1f467fb5-5ea7-42ba-bace-f175c86791b2/cbe8327f-9b7f-442f-a650-6888bb11a674', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdd', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/c93956d5-c88d-41f9-8c38-9f5f62cc90dd/3920b46c-5fab-4b63-b47f-2fa5c6714c36', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, 'beeefe06-78a0-4e14-a932-cc8d734d542d': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/glusterSD/gluster0.grove.silverorange.com:_data__sdb/30fd0a2f-ab42-4a8a-8f0b-67242dc2d15d/images/310d8b3e-d578-418d-9802-dc0ebcea06d6/aa758c51-8478-4273-aeef-d4b374b8d6b4', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': '/rhev/data-center/mnt/glusterSD/gluster0.grove.silverorange.com:_data__sdb/30fd0a2f-ab42-4a8a-8f0b-67242dc2d15d/images/4072fda1-ec82-45c9-b353-91fceb13bf08/891f5982-dead-48b4-8907-caa1e309fa82', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '7e5156de-649d-4904-9092-21a699242a37': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/ca0c1208-a7aa-4ef6-a450-4a40bd4455f3/a2335199-ddd4-429b-b55d-f4d527081fd3', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}}} from=::1,35012 (api:54)
2021-05-27 17:10:31,118+0000 WARN  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration stalling: remaining (32863MiB) > lowmark (32863MiB). (migration:801)
2021-05-27 17:10:31,118+0000 INFO  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration Progress: 190.035 seconds elapsed, 1% of data processed, total data: 32864MB, processed data: 0MB, remaining data: 32863MB, transfer speed 0Mbps, zero pages: 160MB, compressed: 0MB, dirty rate: 0, memory iteration: 1 (migration:814)
2021-05-27 17:10:33,827+0000 INFO  (jsonrpc/5) [throttled] Current getAllVmStats: {'f8f4e4a1-b565-4663-8962-c8804dbb86fb': 'Up', '2b87204f-f695-474a-9f08-47b85fcac366': 'Up', '26332421-54a3-4afc-90e7-551a7e314c80': 'Up', '60edbd80-dad7-4bf8-8fd1-e138413cf9f6': 'Up', 'beeefe06-78a0-4e14-a932-cc8d734d542d': 'Up', '7e5156de-649d-4904-9092-21a699242a37': 'Migration Source'} (throttledlog:104)
2021-05-27 17:10:37,186+0000 INFO  (jsonrpc/5) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'code': 0, 'message': 'Done'}, 'io_tune_policies_dict': {'f8f4e4a1-b565-4663-8962-c8804dbb86fb': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme1n1/bce04425-1d25-4489-bdab-2834a1a57db8/images/38b27cce-c744-4a12-85a3-3af07d386da2/93c1e793-f8cb-42c9-86a6-0e9ce4a6023a', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '2b87204f-f695-474a-9f08-47b85fcac366': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/f2e0c9f3-ab0d-441a-85a6-07a42e78b5a8/848f353e-6787-4e20-ab7b-0541ebd852c6', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '26332421-54a3-4afc-90e7-551a7e314c80': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/b7a785f9-307b-42af-9bbe-23cac884fe97/ed1d027e-a36a-4e6b-9207-119915044e06', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '60edbd80-dad7-4bf8-8fd1-e138413cf9f6': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/535fcb2e-ece9-4d50-86fe-bf6264d11ae1/6c01a036-8a14-46ba-a4b4-fe4f66a586a3', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': 
'/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/1f467fb5-5ea7-42ba-bace-f175c86791b2/cbe8327f-9b7f-442f-a650-6888bb11a674', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdd', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/c93956d5-c88d-41f9-8c38-9f5f62cc90dd/3920b46c-5fab-4b63-b47f-2fa5c6714c36', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, 'beeefe06-78a0-4e14-a932-cc8d734d542d': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/glusterSD/gluster0.grove.silverorange.com:_data__sdb/30fd0a2f-ab42-4a8a-8f0b-67242dc2d15d/images/310d8b3e-d578-418d-9802-dc0ebcea06d6/aa758c51-8478-4273-aeef-d4b374b8d6b4', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': '/rhev/data-center/mnt/glusterSD/gluster0.grove.silverorange.com:_data__sdb/30fd0a2f-ab42-4a8a-8f0b-67242dc2d15d/images/4072fda1-ec82-45c9-b353-91fceb13bf08/891f5982-dead-48b4-8907-caa1e309fa82', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '7e5156de-649d-4904-9092-21a699242a37': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/ca0c1208-a7aa-4ef6-a450-4a40bd4455f3/a2335199-ddd4-429b-b55d-f4d527081fd3', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}}} from=::1,35012 (api:54)
2021-05-27 17:10:41,120+0000 WARN  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration stalling: remaining (32863MiB) > lowmark (32863MiB). (migration:801)
2021-05-27 17:10:41,120+0000 INFO  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration Progress: 200.037 seconds elapsed, 1% of data processed, total data: 32864MB, processed data: 0MB, remaining data: 32863MB, transfer speed 0Mbps, zero pages: 160MB, compressed: 0MB, dirty rate: 0, memory iteration: 1 (migration:814)
2021-05-27 17:10:51,121+0000 WARN  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration stalling: remaining (32863MiB) > lowmark (32863MiB). (migration:801)
2021-05-27 17:10:51,121+0000 INFO  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration Progress: 210.039 seconds elapsed, 1% of data processed, total data: 32864MB, processed data: 0MB, remaining data: 32863MB, transfer speed 0Mbps, zero pages: 160MB, compressed: 0MB, dirty rate: 0, memory iteration: 1 (migration:814)
2021-05-27 17:10:52,211+0000 INFO  (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'code': 0, 'message': 'Done'}, 'io_tune_policies_dict': {'f8f4e4a1-b565-4663-8962-c8804dbb86fb': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme1n1/bce04425-1d25-4489-bdab-2834a1a57db8/images/38b27cce-c744-4a12-85a3-3af07d386da2/93c1e793-f8cb-42c9-86a6-0e9ce4a6023a', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '2b87204f-f695-474a-9f08-47b85fcac366': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/f2e0c9f3-ab0d-441a-85a6-07a42e78b5a8/848f353e-6787-4e20-ab7b-0541ebd852c6', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '26332421-54a3-4afc-90e7-551a7e314c80': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/b7a785f9-307b-42af-9bbe-23cac884fe97/ed1d027e-a36a-4e6b-9207-119915044e06', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '60edbd80-dad7-4bf8-8fd1-e138413cf9f6': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/535fcb2e-ece9-4d50-86fe-bf6264d11ae1/6c01a036-8a14-46ba-a4b4-fe4f66a586a3', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': 
'/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/1f467fb5-5ea7-42ba-bace-f175c86791b2/cbe8327f-9b7f-442f-a650-6888bb11a674', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdd', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme2n1/a7efa448-201b-4453-9bc9-900559b891ca/images/c93956d5-c88d-41f9-8c38-9f5f62cc90dd/3920b46c-5fab-4b63-b47f-2fa5c6714c36', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, 'beeefe06-78a0-4e14-a932-cc8d734d542d': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/glusterSD/gluster0.grove.silverorange.com:_data__sdb/30fd0a2f-ab42-4a8a-8f0b-67242dc2d15d/images/310d8b3e-d578-418d-9802-dc0ebcea06d6/aa758c51-8478-4273-aeef-d4b374b8d6b4', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': '/rhev/data-center/mnt/glusterSD/gluster0.grove.silverorange.com:_data__sdb/30fd0a2f-ab42-4a8a-8f0b-67242dc2d15d/images/4072fda1-ec82-45c9-b353-91fceb13bf08/891f5982-dead-48b4-8907-caa1e309fa82', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}, '7e5156de-649d-4904-9092-21a699242a37': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/10.11.0.9:_vmstorage_nvme0n1/a99cd663-f6d5-42d8-bd7a-ee0b5d068608/images/ca0c1208-a7aa-4ef6-a450-4a40bd4455f3/a2335199-ddd4-429b-b55d-f4d527081fd3', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}}} from=::1,35012 (api:54)
2021-05-27 17:11:01,123+0000 WARN  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration stalling: remaining (32863MiB) > lowmark (32863MiB). (migration:801)
2021-05-27 17:11:01,123+0000 INFO  (migmon/7e5156de) [virt.vm] (vmId='7e5156de-649d-4904-9092-21a699242a37') Migration Progress: 220.041 seconds elapsed, 1% of data processed, total data: 32864MB, processed data: 0MB, remaining data: 32863MB, transfer speed 0Mbps, zero pages: 160MB, compressed: 0MB, dirty rate: 0, memory iteration: 1 (migration:814)
ats return={'86245648-abd8-46e3-9c10-432e8788a074': {'code': 0, 'lastCheck': '1.6', 'delay': '0.00353497', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::1,35010, task_id=c4e65f55-1367-41d3-9bf6-f357a382df4a (api:54)
2021-05-27 17:09:33,156+0000 INFO  (jsonrpc/2) [api.host] START getStats() from=::ffff:10.11.0.219,54952 (api:48)