
Hi,

After digging around and finding a bit of info about viodiskcache, I understand that if the user enables it, the VM cannot be live migrated.

Unless the operator decides to do a live migration that also changes storage, I don't understand why live migration is disabled. If the VM is only live migrated between nodes, the storage stays the same and nothing is saved locally on the node's hard disk, so what is the reason to disable live migration?

Thanks,
Hetz

On Thu, Feb 14, 2019 at 1:28 AM Hetz Ben Hamo <hetz@hetz.biz> wrote:
Hi,
After digging around and finding a bit of info about viodiskcache, I understand that if the user enables it, the VM cannot be live migrated.
Unless the operator decides to do a live migration that also changes storage, I don't understand why live migration is disabled. If the VM is only live migrated between nodes, the storage stays the same and nothing is saved locally on the node's hard disk, so what is the reason to disable live migration?
I think the issue is synchronizing the host buffer cache across different hosts: during migration each kernel thinks it controls the storage, while the storage is actually accessed by two hosts at the same time. I'm sure Kevin or Eric can provide a better answer.

Nir
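
(As a rough illustration of where the host page cache comes into play, here is a small sketch assuming the libvirt Python bindings; the connection URI and domain name are placeholders, not anything from this thread. It prints the cache mode of each disk in a domain's XML; any mode other than "none" means guest I/O goes through the host page cache.)

# Hedged sketch: inspect a domain's disk cache modes with libvirt-python.
# The connection URI and domain name are placeholders, not from this thread.
import xml.etree.ElementTree as ET

import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("example-vm")  # hypothetical domain name

# Parse the live domain XML and report each disk's cache mode.
root = ET.fromstring(dom.XMLDesc(0))
for disk in root.findall("./devices/disk"):
    driver = disk.find("driver")
    target = disk.find("target")
    cache = driver.get("cache", "default") if driver is not None else "default"
    dev = target.get("dev") if target is not None else "?"
    # Anything other than cache='none' means reads and writes go through the
    # host page cache, so two hosts sharing the disk can hold stale copies.
    print(f"{dev}: cache={cache}")

conn.close()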

On Thu, Feb 14, 2019 at 02:05:00AM +0200, Nir Soffer wrote:
On Thu, Feb 14, 2019 at 1:28 AM Hetz Ben Hamo <hetz@hetz.biz> wrote:
Hi,
After digging around and finding a bit of info about viodiskcache, I understand that if the user enables it, the VM cannot be live migrated.
Unless the operator decides to do a live migration that also changes storage, I don't understand why live migration is disabled. If the VM is only live migrated between nodes, the storage stays the same and nothing is saved locally on the node's hard disk, so what is the reason to disable live migration?
I think the issue is synchronizing the host buffer cache across different hosts: during migration each kernel thinks it controls the storage, while the storage is actually accessed by two hosts at the same time.
Yes, the problem is that the host page cache on the destination may contain stale data, because the destination QEMU reads the shared disk before the migration handover.

QEMU 3.0.0 introduced support for live migration even with the host page cache in use, in commit dd577a26ff03b6829721b1ffbbf9e7c411b72378 ("block/file-posix: implement bdrv_co_invalidate_cache() on Linux"). Libvirt still considers such configurations unsafe for live migration, but this can be overridden with the "virsh migrate --unsafe" option. Work is still required so that libvirt can detect QEMU binaries that support live migration while the host page cache is in use.

(cache=writeback can produce misleading performance results, so think carefully if you're using it because it appears faster. The results may look good during benchmarking but change depending on host load and memory pressure. In production there are probably other VMs on the same host, so cache=none leads to more consistent results; the host page cache isn't a great help if the host is close to capacity anyway.)

Stefan
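
(For reference, the override Stefan mentions can also be requested through the libvirt Python bindings by passing VIR_MIGRATE_UNSAFE in the migration flags. The sketch below is only an illustration of that flag, assuming libvirt-python; the host and domain names are made up, and the flag should only be used when the QEMU on both hosts is new enough, 3.0.0+ per the commit above, to invalidate the destination's page cache.)

# Hedged sketch: live migration with libvirt's "unsafe" flag, the API
# equivalent of "virsh migrate --live --unsafe". Host and domain names are
# placeholders; only use this when both hosts run QEMU >= 3.0.0, which can
# invalidate the destination's host page cache at migration handover.
import libvirt

SRC_URI = "qemu+ssh://src.example.com/system"   # hypothetical source host
DST_URI = "qemu+ssh://dst.example.com/system"   # hypothetical destination host

conn = libvirt.open(SRC_URI)
dom = conn.lookupByName("example-vm")  # hypothetical domain name

flags = (
    libvirt.VIR_MIGRATE_LIVE         # keep the guest running during migration
    | libvirt.VIR_MIGRATE_PEER2PEER  # source libvirtd connects to the destination
    | libvirt.VIR_MIGRATE_UNSAFE     # skip libvirt's disk-cache safety check
)

# With VIR_MIGRATE_PEER2PEER the URI is the destination's libvirt connection URI.
dom.migrateToURI(DST_URI, flags, None, 0)

conn.close()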
participants (3)
- Hetz Ben Hamo
- Nir Soffer
- Stefan Hajnoczi