On Mon, Jan 20, 2020 at 2:34 PM Milan Zamazal <mzamazal@redhat.com> wrote:
Ben <gravyfish@gmail.com> writes:

> Hi Milan,
>
> Thanks for your reply. I checked the firewall, and saw that both the bond0
> interface and the VLAN interface bond0.20 had been added to the default
> zone, which I believe should provide the necessary firewall access (output
> below).
>
> I double-checked the destination host's VDSM logs and wasn't able to find
> any warning or error-level logs during the migration timeframe.
>
> I checked the migration_port_* and *_port settings in qemu.conf and
> libvirtd.conf and all lines are commented. I have not modified either file.

The commented-out settings show the default ports used for migrations,
so those defaults apply even when the lines are commented out.  I can
see you have libvirt-tls open below, but I'm not sure about the QEMU
migration ports.  If migration works when not using a separate
migration network, then it should work with the same rules on the
migration network, so I think your settings are OK.
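
If you want to double-check what those defaults are on your hosts, something like
this should print the relevant (still commented) lines; treat it as a sketch, the
values shown in the comments are the upstream defaults and may differ in your build:
grep -E '^#?[[:space:]]*migration_port_(min|max)' /etc/libvirt/qemu.conf    # QEMU migration port range, 49152-49215 upstream
grep -E '^#?[[:space:]]*(tls_port|tcp_port)' /etc/libvirt/libvirtd.conf     # libvirtd listen ports, 16514/16509 upstream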

The fact that you don't get any better explanation than "unexpectedly
failed", and that it fails before transferring any data, points to a
possible networking error, but I can't help with that; someone with
networking knowledge should.


Can you please share the relevant lines from vdsm.log on the source host that log the start of the migration?
That line should contain the destination host's IP address on the migration network (a sketch of how to find it follows the commands below).
Please note that there are two network connections: libvirt's control data is transmitted encrypted on the management
network, while QEMU's data is transmitted on the migration network.
On the source host, can you please run
ping -M do -s $((9000 - 28)) dest_ip_address_on_migration_network_from_vdsm_log
nc -vz dest_ip_address_on_migration_network_from_vdsm_log dest_port_on_migration_network_from_vdsm_log
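
In case it is useful, here is a hypothetical run of the above with made-up values
(the real address and port come from the vdsm.log line; the log wording in the grep
pattern is a guess and can differ between vdsm versions):
grep -i 'starting migration to' /var/log/vdsm/vdsm.log | tail -n 5   # one way to find the migration start line on the source host
DEST_IP=192.0.2.20      # hypothetical destination address on the migration network, taken from that line
DEST_PORT=49152         # hypothetical QEMU migration port, taken from the same line
ping -c 3 -M do -s $((9000 - 28)) "$DEST_IP"    # 8972 = 9000 minus 20-byte IP and 8-byte ICMP headers; -M do forbids fragmentation
nc -vz "$DEST_IP" "$DEST_PORT"                  # confirms the TCP migration port is reachable
If the ping fails while a plain ping succeeds, jumbo frames are not passing end to
end, i.e. some hop on the migration path does not have MTU 9000.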


 
You can also try to enable libvirt debugging on both sides in
/etc/libvirt/libvirtd.conf and restart libvirtd (beware, those logs are
huge).  The libvirt logs should report some error.
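
For reference, a minimal sketch of the debug settings in /etc/libvirt/libvirtd.conf
(the filter list is only a suggestion; adjust it and the output path as needed, and
remember to revert them afterwards because the log grows quickly):
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd-debug.log"
Then restart the daemon on both hosts (systemctl restart libvirtd) and retry the migration.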

> [root@vhost2 vdsm]# firewall-cmd --list-all
> public (active)
>   target: default
>   icmp-block-inversion: no
>   interfaces: bond0 bond0.20 em1 em2 migration ovirtmgmt p1p1
>   sources:
>   services: cockpit dhcpv6-client libvirt-tls ovirt-imageio ovirt-vmconsole
> snmp ssh vdsm
>   ports: 1311/tcp 22/tcp 6081/udp 5666/tcp
>   protocols:
>   masquerade: no
>   forward-ports:
>   source-ports:
>   icmp-blocks:
>   rich rules:
>
> On Mon, Jan 20, 2020 at 6:29 AM Milan Zamazal <mzamazal@redhat.com> wrote:
>
>> Ben <gravyfish@gmail.com> writes:
>>
>> > Hi, I'm pretty stuck at the moment so I hope someone can help me.
>> >
>> > I have an oVirt 4.3 data center with two hosts. Recently, I attempted to
>> > segregate migration traffic from the standard ovirtmgmt network, where
>> > the VM traffic and all other traffic resides.
>> >
>> > I set up the VLAN on my router and switch, and created LACP bonds on both
>> > hosts, tagging them with the VLAN ID. I confirmed the routes work fine,
>> > and traffic speeds are as expected. MTU is set to 9000.
>> >
>> > After configuring the migration network in the cluster and dragging and
>> > dropping it onto the bonds on each host, VMs fail to migrate.
>> >
>> > oVirt is not reporting any issues with the network interfaces or sync
>> > with the hosts. However, when I attempt to live-migrate a VM, progress gets to
>> > 1% and stalls. The transfer rate is 0Mbps, and the operation eventually
>> > fails.
>> >
>> > I have not been able to identify anything useful in the VDSM logs on the
>> > source or destination hosts, or in the engine logs. It repeats the below
>> > WARNING and INFO logs for the duration of the process, then logs the last
>> > entries when it fails. I can provide more logs if it would help. I'm not
>> > even sure where to start -- since I am a novice at networking, at best,
>> > my suspicion the entire time was that something is misconfigured in my
>> > network. However, the routes are good, speed tests are fine, and I can't
>> > find anything else wrong with the connections. It's not impacting any
>> > other traffic over the bond interfaces.
>> >
>> > Are there other requirements that must be met for VMs to migrate over a
>> > separate interface/network?
>>
>> Hi, did you check your firewall settings?  Are the required ports open?
>> See migration_port_* options in /etc/libvirt/qemu.conf and *_port
>> options in /etc/libvirt/libvirtd.conf.
>>
>> Is there any error reported in the destination vdsm.log?
>>
>> Regards,
>> Milan
>>
>> > 2020-01-12 03:18:28,245-0500 WARN  (migmon/a24fd7e3) [virt.vm]
>> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Migration stalling:
>> > remaining (4191MiB) > lowmark (4191MiB). (migration:854)
>> > 2020-01-12 03:18:28,245-0500 INFO  (migmon/a24fd7e3) [virt.vm]
>> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Migration Progress: 930.341
>> > seconds elapsed, 1% of data processed, total data: 4192MB, processed
>> > data: 0MB, remaining data: 4191MB, transfer speed 0MBps, zero pages: 149MB,
>> > compressed: 0MB, dirty rate: 0, memory iteration: 1 (migration:881)
>> > 2020-01-12 03:18:31,386-0500 ERROR (migsrc/a24fd7e3) [virt.vm]
>> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') operation failed: migration
>> > out job: unexpectedly failed (migration:282)
>> > 2020-01-12 03:18:32,695-0500 ERROR (migsrc/a24fd7e3) [virt.vm]
>> > (vmId='a24fd7e3-161c-451e-8880-b3e7e1f7d86f') Failed to migrate
>> > (migration:450)
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 431, in _regular_run
>> >     time.time(), migrationParams, machineParams
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 505, in _startUnderlyingMigration
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 591, in _perform_with_conv_schedule
>> >     self._perform_migration(duri, muri)
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 525, in _perform_migration
>> >     self._migration_flags)
>> > libvirtError: operation failed: migration out job: unexpectedly failed
>> > 2020-01-12 03:18:40,880-0500 INFO  (jsonrpc/6) [api.virt] FINISH
>> > getMigrationStatus return={'status': {'message': 'Done', 'code': 0},
>> > 'migrationStats': {'status': {'message': 'Fatal error during migration',
>> > 'code': 12}, 'progress': 1L}} from=::ffff:10.0.0.20,41462,
>> > vmId=a24fd7e3-161c-451e-8880-b3e7e1f7d86f (api:54)
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/WQ64NLLPDC7E4QGUT6AUTZ6JMZUHRIST/