The log from the start of the migration:
2020-01-20 20:27:11,027-0500 INFO (jsonrpc/5) [api.virt] START
migrate(params={u'incomingLimit': 2, u'src': u'vhost2.my.domain.name',
u'dstqemu': u'10.0.20.100', u'autoConverge': u'true', u'tunneled': u'false',
u'enableGuestEvents': True, u'dst': u'vhost1.my.domain.name:54321',
u'convergenceSchedule': {u'init': [{u'params': [u'100'], u'name': u'setDowntime'}],
u'stalling': [{u'action': {u'params': [u'150'], u'name': u'setDowntime'}, u'limit': 1},
{u'action': {u'params': [u'200'], u'name': u'setDowntime'}, u'limit': 2},
{u'action': {u'params': [u'300'], u'name': u'setDowntime'}, u'limit': 3},
{u'action': {u'params': [u'400'], u'name': u'setDowntime'}, u'limit': 4},
{u'action': {u'params': [u'500'], u'name': u'setDowntime'}, u'limit': 6},
{u'action': {u'params': [], u'name': u'abort'}, u'limit': -1}]},
u'vmId': u'a24fd7e3-161c-451e-8880-b3e7e1f7d86f', u'abortOnError': u'true',
u'outgoingLimit': 2, u'compressed': u'false', u'maxBandwidth': 125,
u'method': u'online'}) from=::ffff:10.0.0.20,40308,
flow_id=fc4e0792-a3a0-425f-b0b6-bcaf5e0f4775,
vmId=a24fd7e3-161c-451e-8880-b3e7e1f7d86f (api:48)
2020-01-20 20:27:13,367-0500 INFO (migsrc/a24fd7e3) [virt.vm]
(vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Creation of destination VM
took: 2 seconds (migration:469)
2020-01-20 20:27:13,367-0500 INFO (migsrc/a24fd7e3) [virt.vm]
(vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') starting migration to
qemu+tls://vhost1.my.domain.name/system with miguri tcp://10.0.20.100
(migration:498)
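For anyone repeating this, the lines above came straight out of the source
host's vdsm.log; assuming the default log location, something like this
finds them:

grep -E 'START migrate|starting migration to' /var/log/vdsm/vdsm.log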
That all appears to be in order, as 10.0.20.100 is the correct IP address
of the migration interface on the destination host.
The netcat test also looks good:
[root@vhost2 ~]# nc -vz 10.0.20.100 54321
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.20.100:54321.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
However, the final test is very telling:
[root@vhost2 ~]# ping -M do -s $((9000 - 28)) 10.0.20.100
PING 10.0.20.100 (10.0.20.100) 8972(9000) bytes of data.
^C
--- 10.0.20.100 ping statistics ---
14 packets transmitted, 0 received, 100% packet loss, time 12999ms
I don't think my switch is actually passing jumbo frames, even though it
is configured to do so. I will have to investigate further.
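For the record, the -s $((9000 - 28)) payload accounts for the 20-byte IP
header plus the 8-byte ICMP header. To narrow down where the path MTU
actually tops out, a quick sweep along these lines should help (just a
sketch; 1472 is the standard payload for a 1500-byte MTU):

for size in 1472 4000 8000 8972; do
  if ping -M do -c 1 -W 2 -s "$size" 10.0.20.100 >/dev/null 2>&1; then
    echo "payload $size: ok"
  else
    echo "payload $size: blocked"
  fi
done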
-Ben
On Mon, Jan 20, 2020 at 3:54 PM Dominik Holler <dholler(a)redhat.com> wrote:
On Mon, Jan 20, 2020 at 2:34 PM Milan Zamazal <mzamazal(a)redhat.com> wrote:
> Ben <gravyfish(a)gmail.com> writes:
>
> > Hi Milan,
> >
> > Thanks for your reply. I checked the firewall, and saw that both the
> > bond0 interface and the VLAN interface bond0.20 had been added to the
> > default zone, which I believe should provide the necessary firewall
> > access (output below).
> >
> > I double-checked the destination host's VDSM logs and wasn't able to
> > find any warning or error-level logs during the migration timeframe.
> >
> > I checked the migration_port_* and *_port settings in qemu.conf and
> > libvirtd.conf and all lines are commented. I have not modified either
> > file.
>
> The commented-out settings define the default ports used for migrations,
> so they are valid even when commented out. I can see you have
> libvirt-tls open below; I'm not sure about the QEMU ports. If migration
> works when not using a separate migration network, then it should work
> with the same rules for the migration network, so I think your settings
> are OK.
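For reference, on a stock install the defaults in question typically read
as follows (worth confirming against your own copies of the files):

# /etc/libvirt/qemu.conf
#migration_port_min = 49152
#migration_port_max = 49215

# /etc/libvirt/libvirtd.conf
#tls_port = "16514"
#tcp_port = "16509"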
>
> The fact that you don't get any better explanation than "unexpectedly
> failed", and that it fails before transferring any data, indicates a
> possible networking error, but I can't help with that; someone with
> networking knowledge should.
>
>
Can you please share, from the source host's vdsm.log, the relevant lines
that log the start of the migration?
These lines should contain the destination host's IP address on the
migration network.
Please note that there are two network connections: libvirt's control
data is transmitted encrypted on the management network, while QEMU's
data is transmitted on the migration network.
On the source host, can you please run

ping -M do -s $((9000 - 28)) dest_ip_address_on_migration_network_from_vdsm_log
nc -vz dest_ip_address_on_migration_network_from_vdsm_log dest_port_on_migration_network_from_vdsm_log
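To sanity-check the libvirt control channel on the management network as
well, assuming the default TLS port of 16514 (unchanged in libvirtd.conf),
something like:

openssl s_client -connect dest_hostname_on_management_network:16514 </dev/null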
> You can also try to enable libvirt debugging on both sides in
> /etc/libvirt/libvirtd.conf and restart libvirt (beware, those logs are
> huge). The libvirt logs should report some error.
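As a sketch, a typical debug setup in /etc/libvirt/libvirtd.conf looks
something like this (the filter list is only a starting point and can be
tuned):

log_filters="1:libvirt 1:qemu 3:object 3:json 3:event 3:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"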
>
> > [root@vhost2 vdsm]# firewall-cmd --list-all
> > public (active)
> >   target: default
> >   icmp-block-inversion: no
> >   interfaces: bond0 bond0.20 em1 em2 migration ovirtmgmt p1p1
> >   sources:
> >   services: cockpit dhcpv6-client libvirt-tls ovirt-imageio
> >     ovirt-vmconsole snmp ssh vdsm
> >   ports: 1311/tcp 22/tcp 6081/udp 5666/tcp
> >   protocols:
> >   masquerade: no
> >   forward-ports:
> >   source-ports:
> >   icmp-blocks:
> >   rich rules:
> >
> > On Mon, Jan 20, 2020 at 6:29 AM Milan Zamazal <mzamazal(a)redhat.com> wrote:
> >
> >> Ben <gravyfish(a)gmail.com> writes:
> >>
> >> > Hi, I'm pretty stuck at the moment so I hope someone can help me.
> >> >
> >> > I have an oVirt 4.3 data center with two hosts. Recently, I
> >> > attempted to segregate migration traffic from the standard
> >> > ovirtmgmt network, where the VM traffic and all other traffic
> >> > resides.
> >> >
> >> > I set up the VLAN on my router and switch, and created LACP bonds
> >> > on both hosts, tagging them with the VLAN ID. I confirmed the
> >> > routes work fine, and traffic speeds are as expected. MTU is set
> >> > to 9000.
> >> >
> >> > After configuring the migration network in the cluster and
> >> > dragging and dropping it onto the bonds on each host, VMs fail to
> >> > migrate.
> >> >
> >> > oVirt is not reporting any issues with the network interfaces or
> >> > sync with the hosts. However, when I attempt to live-migrate a VM,
> >> > progress gets to 1% and stalls. The transfer rate is 0 Mbps, and
> >> > the operation eventually fails.
> >> >
> >> > I have not been able to identify anything useful in the VDSM logs
> >> > on the source or destination hosts, or in the engine logs. It
> >> > repeats the WARNING and INFO logs below for the duration of the
> >> > process, then logs the last entries when it fails. I can provide
> >> > more logs if that would help. I'm not even sure where to start --
> >> > since I am, at best, a novice at networking, my suspicion the
> >> > entire time was that something was misconfigured in my network.
> >> > However, the routes are good, speed tests are fine, and I can't
> >> > find anything else wrong with the connections. It's not impacting
> >> > any other traffic over the bond interfaces.
> >> >
> >> > Are there other requirements that must be met for VMs to migrate
> >> > over a separate interface/network?
> >>
> >> Hi, did you check your firewall settings? Are the required ports open?
> >> See migration_port_* options in /etc/libvirt/qemu.conf and *_port
> >> options in /etc/libvirt/libvirtd.conf.
> >>
> >> Is there any error reported in the destination vdsm.log?
> >>
> >> Regards,
> >> Milan
> >>
> >> > 2020-01-12 03:18:28,245-0500 WARN (migmon/a24fd7e3) [virt.vm]
> >> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Migration stalling:
> >> > remaining (4191MiB) > lowmark (4191MiB). (migration:854)
> >> > 2020-01-12 03:18:28,245-0500 INFO (migmon/a24fd7e3) [virt.vm]
> >> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Migration Progress:
> >> > 930.341 seconds elapsed, 1% of data processed, total data: 4192MB,
> >> > processed data: 0MB, remaining data: 4191MB, transfer speed 0MBps,
> >> > zero pages: 149MB, compressed: 0MB, dirty rate: 0, memory
> >> > iteration: 1 (migration:881)
> >> > 2020-01-12 03:18:31,386-0500 ERROR (migsrc/a24fd7e3) [virt.vm]
> >> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') operation failed:
> >> > migration out job: unexpectedly failed (migration:282)
> >> > 2020-01-12 03:18:32,695-0500 ERROR (migsrc/a24fd7e3) [virt.vm]
> >> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Failed to migrate
> >> > (migration:450)
> >> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 431, in _regular_run
> >> >     time.time(), migrationParams, machineParams
> >> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 505, in _startUnderlyingMigration
> >> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 591, in _perform_with_conv_schedule
> >> >     self._perform_migration(duri, muri)
> >> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 525, in _perform_migration
> >> >     self._migration_flags)
> >> > libvirtError: operation failed: migration out job: unexpectedly failed
> >> > 2020-01-12 03:18:40,880-0500 INFO (jsonrpc/6) [api.virt] FINISH
> >> > getMigrationStatus return={'status': {'message': 'Done', 'code': 0},
> >> > 'migrationStats': {'status': {'message': 'Fatal error during
> >> > migration', 'code': 12}, 'progress': 1L}} from=::ffff:10.0.0.20,41462,
> >> > vmId=a24fd7e3-161c-451e-8880-b3e7e1f7d86f (api:54)