On Thu, Feb 14, 2019 at 02:05:00AM +0200, Nir Soffer wrote:
> On Thu, Feb 14, 2019 at 1:28 AM Hetz Ben Hamo <hetz(a)hetz.biz> wrote:
> > Hi,
> >
> > After digging around and finding a bit of info about viodiskcache, I
> > understand that if the user enables it, the VM cannot be live migrated.
> >
> > Unless the operator decides to do a live migration that also moves the
> > storage, I don't understand why live migration is disabled. If the VM
> > will only be live migrated between nodes, then the storage is the same
> > and nothing is saved locally on the node's hard disk, so what is the
> > reason to disable live migration?
> >
> I think the issue is synchronizing the host buffer cache on different
> hosts. Each kernel thinks it controls the storage, while the storage is
> actually accessed by two hosts at the same time.

Yes, the problem is that the host page cache on the destination may
contain stale data, because the destination QEMU reads the shared disk
before the migration handover.
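
For context, the cache mode ends up on the disk's <driver> element in the
libvirt domain XML. A minimal sketch, assuming viodiskcache maps to
cache='writeback' (the image path and target name are just placeholders):

    <disk type='file' device='disk'>
      <!-- cache='writeback' goes through the host page cache; this is
           the configuration that can leave stale data on the destination
           host during live migration -->
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/path/to/shared/disk.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>

With cache='none' the image is opened with O_DIRECT instead, so the host
page cache is bypassed and the stale-data problem does not arise.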
QEMU 3.0.0 introduced support for live migration even when the host page
cache is in use. This was added in commit
dd577a26ff03b6829721b1ffbbf9e7c411b72378 ("block/file-posix: implement
bdrv_co_invalidate_cache() on Linux").

Libvirt still considers such configurations unsafe for live migration,
but this can be overridden with the "virsh migrate --unsafe" option.
Work is still needed so that libvirt can detect QEMU binaries that
support live migration while the host page cache is in use.
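
As a rough sketch of that override (the domain name and destination URI
below are placeholders):

    # Live-migrate "myguest" to host "dst", skipping libvirt's
    # cache-mode safety check.
    virsh migrate --live --unsafe myguest qemu+ssh://dst/system

Only use --unsafe when you know the destination QEMU invalidates its
page cache (3.0.0 or later, as above); otherwise you risk exactly the
stale-data problem described earlier.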
(A note on performance: cache=writeback can produce misleading results,
so think carefully before choosing it just because it appears faster.
The numbers may look good during benchmarking but will vary with host
load and memory pressure. In production there are usually other VMs on
the same host, so cache=none gives more consistent results, and the host
page cache isn't much help when the host is close to memory capacity
anyway.)
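
For completeness, a sketch of what cache=none means on the QEMU command
line (the image path and node names are placeholders); it corresponds to
cache.direct=on, i.e. the image is opened with O_DIRECT and the host
page cache is bypassed:

    # legacy -drive syntax
    -drive file=/path/to/guest.img,format=raw,if=virtio,cache=none

    # roughly equivalent -blockdev spelling
    -blockdev driver=file,filename=/path/to/guest.img,node-name=proto0,cache.direct=on
    -blockdev driver=raw,file=proto0,node-name=fmt0
    -device virtio-blk-pci,drive=fmt0
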
Stefan