VM live migration fails after update to qemu-kvm-9.0.0-4.el9.x86_64

Hello After updating a hypervisor with qemu-kvm-9.0.0-4.el9.x86_64 live migration stopped working with errors related to the cpu topology. When I downgraded to qemu-kvm-9.0.0-3.el9.x86_64 migration worked again. Is this a known bug? Has anyone else encountered this?

On 2024-07-10 11:27, g.vasilopoulos@uoc.gr wrote:
After updating a hypervisor with qemu-kvm-9.0.0-4.el9.x86_64 live migration stopped working with errors related to the cpu topology. When I downgraded to qemu-kvm-9.0.0-3.el9.x86_64 migration worked again. Is this a known bug? Has anyone else encountered this?
Unfortunately I've encountered the same issue. I've decided to stay on qemu-kvm-9.0.0-4.el9 and restarted all VMs on hosts with the new version. However, the update from 9.0.0-4 to 9.0.0-6 was flawless and did not break VM live migration. Ciao - Frank

That is a qemu bug. Can you send the exact error and eventually the domain xml/args? Needs to be fixed in the qemu-kvm packages. Jean-Louis On 10/07/2024 13:17, Frank Wall wrote:
On 2024-07-10 11:27, g.vasilopoulos@uoc.gr wrote:
After updating a hypervisor with qemu-kvm-9.0.0-4.el9.x86_64 live migration stopped working with errors related to the cpu topology. When I downgraded to qemu-kvm-9.0.0-3.el9.x86_64 migration worked again. Is this a known bug? Has anyone else encountered this?
Unfortunately I've encountered the same issue. I've decided to stay on qemu-kvm-9.0.0-4.el9 and restarted all VMs on hosts with the new version.
However, the update from 9.0.0-4 to 9.0.0-6 was flawless and did not break VM live migration.
Ciao - Frank _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XA5KZCUSMZJLAU...

Hello libvirt logs From the source host: 2024-07-09 14:00:26.242+0000: Domain id=3 is tainted: deprecated-config (machine type 'pc-q35-rhel8.6.0') 2024-07-09T14:00:26.339600Z qemu-kvm: warning: Deprecated CPU topology (considered invalid): Unsupported clusters parameter mustn't be specified as 1 2024-07-09T14:00:26.353423Z qemu-kvm: -accel kvm: warning: Number of hotpluggable cpus requested (64) exceeds the recommended cpus supported by KVM (20) 2024-07-09T14:00:26.354825Z qemu-kvm: warning: Machine type 'pc-q35-rhel8.6.0' is deprecated: machine types for previous major releases are deprecated 2024-07-09 14:00:50.900+0000: Domain id=3 is tainted: custom-ga-command 2024-07-10 14:15:31.812+0000: initiating migration 2024-07-10 14:15:39.245+0000: shutting down, reason=migrated 2024-07-10T14:15:39.533067Z qemu-kvm: terminating on signal 15 from pid 1897 (<unknown process>) 2024-07-11 07:00:32.691+0000: starting up libvirt version: 10.3.0, package: 1.el9 (builder@centos.org, 2024-05-09-10:33:35, ), qemu version: 9.0.0qemu-kvm-9.0.0-3.el9, kernel: 5.14.0-460.el9.x86_64, hostname: o14-car0109.vmmgmt-int.uoc.gr 2024-07-11 07:00:32.691+0000: Domain id=4 is tainted: hook-script 2024-07-11 07:00:32.692+0000: Domain id=4 is tainted: deprecated-config (machine type 'pc-q35-rhel8.6.0') 2024-07-11T07:00:32.791666Z qemu-kvm: warning: Deprecated CPU topology (considered invalid): Unsupported clusters parameter mustn't be specified as 1 2024-07-11T07:00:32.797464Z qemu-kvm: -accel kvm: warning: Number of hotpluggable cpus requested (64) exceeds the recommended cpus supported by KVM (20) 2024-07-11T07:00:32.799091Z qemu-kvm: warning: Machine type 'pc-q35-rhel8.6.0' is deprecated: machine types for previous major releases are deprecated warning: old compression is deprecated; use multifd compression methods instead warning: old compression is deprecated; use multifd compression methods instead warning: old compression is deprecated; use multifd compression methods instead warning: block migration is deprecated; use blockdev-mirror with NBD instead 2024-07-11 07:00:43.330+0000: Domain id=4 is tainted: custom-ga-command 2024-07-11 07:01:25.444+0000: initiating migration 2024-07-11T07:01:27.203660Z qemu-kvm: failed to save SaveStateEntry with id(name): 2(ram): -5 2024-07-11T07:01:27.228804Z qemu-kvm: Unable to shutdown socket: Transport endpoint is not connected 2024-07-11T07:01:27.228861Z qemu-kvm: Unable to read from socket: Connection reset by peer warning: old compression is deprecated; use multifd compression methods instead warning: old compression is deprecated; use multifd compression methods instead warning: old compression is deprecated; use multifd compression methods instead warning: block migration is deprecated; use blockdev-mirror with NBD instead Destination host: 2024-07-11 07:01:18.096+0000: Domain id=1 is tainted: deprecated-config (machine type 'pc-q35-rhel8.6.0') 2024-07-11T07:01:18.189344Z qemu-kvm: warning: Deprecated CPU topology (considered invalid): Unsupported clusters parameter mustn't be specified as 1 2024-07-11T07:01:18.201070Z qemu-kvm: -accel kvm: warning: Number of hotpluggable cpus requested (64) exceeds the recommended cpus supported by KVM (40) 2024-07-11T07:01:18.202475Z qemu-kvm: warning: Machine type 'pc-q35-rhel8.6.0' is deprecated: machine types for previous major releases are deprecated 2024-07-11T07:01:18.277421Z qemu-kvm: -device {"driver":"virtio-blk-pci","iothread":"iothread1","bus":"pci.6","addr":"0x0","drive":"libvirt-1-storage","id":"ua-f589b059-4a80-416b-a954-5442729d9d6c","bootindex":1,"write-cache":"on","serial":"f589b059-4a80-416b-a954-5442729d9d6c","werror":"stop","rerror":"stop"}: Failed to get "write" lock Is another process using the image [/run/vdsm/storage/07a500b2-dd25-484a-b4e7-290caf22f03a/f589b059-4a80-416b-a954-5442729d9d6c/9ca107b7-7813-4b5e-a5ea-edb94732a2ee]? 2024-07-11 07:01:18.279+0000: shutting down, reason=failed =============================================================== source vdsm 2024-07-11 10:01:21,897+0300 INFO (jsonrpc/4) [api.virt] START migrate(params={'abortOnError': 'true', 'autoConverge': 'true', 'dst': 'destination.migration.host:54321', 'method': 'online', 'vmId': '45036639-b358-4d21-b048-5d150071a276', 'src': 'source.migration.host', 'dstqemu': 'XXX.XXX.164.144', 'convergenceSchedule': {'init': [{'name': 'setDowntime', 'params': ['100']}], 'stalling': [{'limit': 1, 'action': {'name': 'setDowntime', 'params': ['150']}}, {'limit': 2, 'action': {'name': 'setDowntime', 'params': ['200']}}, {'limit': 3, 'action': {'name': 'setDowntime', 'params': ['300']}}, {'limit': 4, 'action': {'name': 'setDowntime', 'params': ['400']}}, {'limit': 6, 'action': {'name': 'setDowntime', 'params': ['500']}}, {'limit': -1, 'action': {'name': 'abort', 'params': []}}]}, 'outgoingLimit': 2, 'enableGuestEvents': True, 'maxBandwidth': 1250, 'tunneled': 'false', 'encrypted': False, 'compressed': 'false', 'incomingLimit': 2}) from=::ffff:XXX.XXX.162.210,49024, flow_id=b0078327- 5961-4201-a6e8-3e6b823d40d9, vmId=45036639-b358-4d21-b048-5d150071a276 (api:31) 2024-07-11 10:01:21,899+0300 INFO (jsonrpc/4) [api.virt] FINISH migrate return={'status': {'code': 0, 'message': 'Migration in progress'}, 'progress': 0} from=::ffff:XXX.XXX.162.210,49024, flow_id=b0078327-5961-4201-a6e8-3e6b823d40d9, vmId=45036639-b358-4d21-b048-5d150071a276 (api:37) 2024-07-11 10:01:21,923+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Migration semaphore: acquiring (migration:463) 2024-07-11 10:01:21,923+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Migration semaphore: acquired (migration:465) 2024-07-11 10:01:21,925+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Switching from State.INITIALIZED to State.PREPARED (migration:229) 2024-07-11 10:01:21,960+0300 INFO (jsonrpc/0) [vdsm.api] START repoStats(domains=['07a500b2-dd25-484a-b4e7-290caf22f03a']) from=::1,39466, task_id=bc129db0-b985-4641-866f-4e5c5144099f (api:31) 2024-07-11 10:01:21,960+0300 INFO (jsonrpc/0) [vdsm.api] FINISH repoStats return={'07a500b2-dd25-484a-b4e7-290caf22f03a': {'code': 0, 'lastCheck': '2.6', 'delay': '0.00144745', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::1,39466, task_id=bc129db0-b985-4641-866f-4e5c5144099f (api:37) 2024-07-11 10:01:22,023+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Creation of destination VM took: 0 seconds (migration:527) 2024-07-11 10:01:22,023+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Switching from State.PREPARED to State.STARTED (migration:229) 2024-07-11 10:01:22,023+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') starting migration to qemu+tls://destination.migration.host/system with miguri tcp://XXX.XXX.164.144 (migration:576) 2024-07-11 10:01:22,030+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Migrating to qemu+tls://destination.migration.host/system with params {'bandwidth': 1250, 'migrate_uri': 'tcp://XXX.XXX.164.144', 'destination_xml': '...'} and flags 12299 (migration:597) 2024-07-11 10:01:25,197+0300 INFO (jsonrpc/6) [api.virt] START getStats() from=::1,39466, vmId=45036639-b358-4d21-b048-5d150071a276 (api:31) 2024-07-11 10:01:25,199+0300 INFO (jsonrpc/6) [api.virt] FINISH getStats return={'status': {'code': 0, 'message': 'Done'}, 'statsList': [{'statusTime': '2296497004', 'statu s': 'Migration Source', 'vmId': '45036639-b358-4d21-b048-5d150071a276', 'vmName': 'HostedEngine', 'vmType': 'kvm', 'kvmEnable': 'true', 'acpiEnable': 'true', 'elapsedTime': '147659', 'monitorResponse': '0', 'clientIp': '', 'timeOffset': '0', 'migrationProgress': 0, 'cpuUser': '0.00', 'cpuSys': '9.04', 'cpuUsage': '10240709000', 'cpuActual': True, 'network': {'vnet3': {'macAddr': '00:16:3e:4c:7b:b8', 'name': 'vnet3', 'speed': '1000', 'state': 'unknown', 'rxErrors': '0', 'rxDropped': '0', 'txErrors': '0', 'txDropped': '0', 'rx': '167113', 'tx': '63277', 'sampleTime': 149013.357784503}}, 'disks': {'sdc': {'truesize': '0', 'apparentsize': '0', 'readLatency': '0', 'writeLatency': '0', 'flushLatency': '0', 'writtenBytes': '0', 'writeOps': '0', 'readOps': '0', 'readBytes': '0', 'readRate': '0.0', 'writeRate': '0.0'}, 'vda': {'truesize': '6111555584', 'apparentsize': '128849018880', 'readLatency': '0', 'writeLatency': '13216404.046511628', 'flushLatency': '11532105.147058824', 'writtenBytes': '1246208', 'writeOps': '118', 'readOps': '0', 'readBytes': '0', 'readRate': '0.0', 'writeRate': '27477.939441741062', 'imageID': 'f589b059-4a80-416b-a954-5442729d9d6c'}}, 'balloonInfo': {'balloon_max': '16777216', 'balloon_min': '16777216', 'balloon_cur': '16777216', 'balloon_target': '16777216', 'ballooning_enabled': True}, 'vcpuCount': '4', 'memoryStats': {'mem_total': '16097916', 'mem_unused': '9031864', 'mem_free': '10009084', 'swap_in': 0, 'swap_out': 0, 'majflt': 0, 'minflt': 438, 'pageflt': 438}, 'displayInfo': [{'type': 'vnc', 'port': '5900', 'tlsPort': '', 'ipAddress': 'XXX.XXX.162.157'}], 'hash': '5471222823762019987', 'vmJobs': {}, 'vcpuQuota': '17592186044415', 'vcpuPeriod': 100000, 'username': '', 'session': 'Unknown', 'memUsage': '40', 'guestCPUCount': 4, 'appsList': ('kernel-5.14.0-474.el9.x86_64', 'qemu-guest-agent-9.0.0'), 'guestIPs': '', 'guestFQDN': 'kmanage.vmmgmt-int.uoc.gr', 'netIfaces': [{'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6': ['::1'], 'name': 'lo'}, {'hw': '00:16:3e:4c:7b:b8', 'inet': ['XXX.XXX.162.210'], 'inet6': ['fe80::216:3eff:fe4c:7bb8'], 'name': 'eth0'}], 'disksUsage': [{'path': '/', 'total': '7449083 904', 'used': '4110884864', 'fs': 'xfs'}, {'path': '/boot', 'total': '1006632960', 'used': '427859968', 'fs': 'xfs'}, {'path': '/var', 'total': '19260243968', 'used': '1031921664', 'fs': 'xfs'}, {'path': '/home', 'total': '1006632960', 'used': '40878080', 'fs': 'xfs'}, {'path': '/var/tmp', 'total': '2080374784', 'used': '48451584', 'fs': 'xfs'}, {'path': '/tmp', 'total': '2080374784', 'used': '48537600', 'fs': 'xfs'}, {'path': '/var/log', 'total': '10670309376', 'used': '159739904', 'fs': 'xfs'}, {'path': '/var/log/audit', 'total': '1006632960', 'used': '41656320', 'fs': 'xfs'}], 'guestName': 'kmanage.vmmgmt-int.uoc.gr', 'guestOs': '5.14.0-474.el9.x86_64', 'guestOsInfo': {'type': 'linux', 'arch': 'x86_64', 'kernel': '5.14.0-474.el9.x86_64', 'distribution': 'CentOS Stream', 'version': '9', 'codename': ''}, 'guestTimezone': {'offset': 180, 'zone': 'EEST'}}]} from=::1,39466, vmId=45036639-b358-4d21-b048-5d150071a276 (api:37) 2024-07-11 10:01:26,288+0300 INFO (jsonrpc/2) [api.host] START getStats() from=::ffff:XXX.XXX.162.210,49024 (api:31) 2024-07-11 10:01:26,309+0300 INFO (jsonrpc/2) [vdsm.api] START repoStats(domains=()) from=::ffff:XXX.XXX.162.210,49024, task_id=baf1b9aa-0dd6-43d7-85b9-7f1be658fa68 (api:31) 2024-07-11 10:01:26,310+0300 INFO (jsonrpc/2) [vdsm.api] FINISH repoStats return={'07a500b2-dd25-484a-b4e7-290caf22f03a': {'code': 0, 'lastCheck': '1.1', 'delay': '0.00144745', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::ffff:XXX.XXX.162.210,49024, task_id=baf1b9aa-0dd6-43d7-85b9-7f1be658fa68 (api:37) 2024-07-11 10:01:26,310+0300 INFO (jsonrpc/2) [vdsm.api] START multipath_health() from=::ffff:XXX.XXX.162.210,49024, task_id=5ae042a8-13c6-4255-82e6-16fa502a4667 (api:31) 2024-07-11 10:01:26,310+0300 INFO (jsonrpc/2) [vdsm.api] FINISH multipath_health return={} from=::ffff:XXX.XXX.162.210,49024, task_id=5ae042a8-13c6-4255-82e6-16fa502a4667 (api:37) 2024-07-11 10:01:26,320+0300 INFO (jsonrpc/2) [api.host] FINISH getStats return={'status': {'code': 0, 'message': 'Done'}, 'info': (suppressed)} from=::ffff:XXX.XXX.162.210,49024 (api:37) 2024-07-11 10:01:26,357+0300 INFO (jsonrpc/4) [vdsm.api] START repoStats(domains=['07a500b2-dd25-484a-b4e7-290caf22f03a']) from=::1,39466, task_id=fefede66-f7de-4da8-ac00-aff32ec27365 (api:31) 2024-07-11 10:01:26,357+0300 INFO (jsonrpc/4) [vdsm.api] FINISH repoStats return={'07a500b2-dd25-484a-b4e7-290caf22f03a': {'code': 0, 'lastCheck': '1.2', 'delay': '0.00144745', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::1,39466, task_id=fefede66-f7de-4da8-ac00-aff32ec27365 (api:37) 2024-07-11 10:01:26,468+0300 INFO (jsonrpc/7) [api.host] START getStats() from=::1,39466 (api:31) 2024-07-11 10:01:26,490+0300 INFO (jsonrpc/7) [vdsm.api] START repoStats(domains=()) from=::1,39466, task_id=773d02a5-140b-49a3-ad4a-ba76e512a866 (api:31) 2024-07-11 10:01:26,491+0300 INFO (jsonrpc/7) [vdsm.api] FINISH repoStats return={'07a500b2-dd25-484a-b4e7-290caf22f03a': {'code': 0, 'lastCheck': '1.3', 'delay': '0.00144745', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::1,39466, task_id=773d02a5-140b-49a3-ad4a-ba76e512a866 (api:37) 2024-07-11 10:01:26,491+0300 INFO (jsonrpc/7) [vdsm.api] START multipath_health() from=::1,39466, task_id=dd0ca999-d6ce-4da9-b377-f43b1455f4fd (api:31) 2024-07-11 10:01:26,491+0300 INFO (jsonrpc/7) [vdsm.api] FINISH multipath_health return={} from=::1,39466, task_id=dd0ca999-d6ce-4da9-b377-f43b1455f4fd (api:37) 2024-07-11 10:01:26,493+0300 INFO (periodic/2) [vdsm.api] START getVolumeSize(sdUUID='07a500b2-dd25-484a-b4e7-290caf22f03a', spUUID='00000000-0000-0000-0000-000000000000', imgUUID='f589b059-4a80-416b-a954-5442729d9d6c', volUUID='9ca107b7-7813-4b5e-a5ea-edb94732a2ee') from=internal, task_id=b8e5fe57-aae1-4bb4-a0a5-4a9af4b21aab (api:31) 2024-07-11 10:01:26,496+0300 INFO (periodic/2) [vdsm.api] FINISH getVolumeSize return={'apparentsize': '128849018880', 'truesize': '6111625216'} from=internal, task_id=b8e5fe57-aae1-4bb4-a0a5-4a9af4b21aab (api:37) 2024-07-11 10:01:26,500+0300 INFO (jsonrpc/7) [api.host] FINISH getStats return={'status': {'code': 0, 'message': 'Done'}, 'info': (suppressed)} from=::1,39466 (api:37) 2024-07-11 10:01:28,013+0300 ERROR (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') operation failed: job 'migration out' failed: Unable to read from socket: Connection reset by peer (migration:331) 2024-07-11 10:01:28,018+0300 INFO (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Switching from State.STARTED to State.FAILED (migration:229) 2024-07-11 10:01:28,019+0300 ERROR (migsrc/45036639) [virt.vm] (vmId='45036639-b358-4d21-b048-5d150071a276') Failed to migrate (migration:506) Traceback (most recent call last): File "/usr/lib/python3.9/site-packages/vdsm/virt/migration.py", line 479, in _regular_run self._startUnderlyingMigration( File "/usr/lib/python3.9/site-packages/vdsm/virt/migration.py", line 580, in _startUnderlyingMigration self._perform_with_conv_schedule(duri, muri) File "/usr/lib/python3.9/site-packages/vdsm/virt/migration.py", line 700, in _perform_with_conv_schedule self._perform_migration(duri, muri) File "/usr/lib/python3.9/site-packages/vdsm/virt/migration.py", line 602, in _perform_migration self._dom.migrateToURI3(duri, params, flags) File "/usr/lib/python3.9/site-packages/vdsm/virt/virdomain.py", line 162, in call return getattr(self._vm._dom, name)(*a, **kw) File "/usr/lib/python3.9/site-packages/vdsm/virt/virdomain.py", line 104, in f ret = attr(*args, **kwargs) File "/usr/lib/python3.9/site-packages/vdsm/common/libvirtconnection.py", line 114, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python3.9/site-packages/vdsm/common/function.py", line 78, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python3.9/site-packages/libvirt.py", line 2174, in migrateToURI3 raise libvirtError('virDomainMigrateToURI3() failed') libvirt.libvirtError: operation failed: job 'migration out' failed: Unable to read from socket: Connection reset by peer
participants (3)
-
Frank Wall
-
g.vasilopoulos@uoc.gr
-
Jean-Louis Dupond