Nir,
On 21/01/16 23:55, "Nir Soffer" <nsoffer(a)redhat.com> wrote:
> live migration starts by creating a snapshot, then copying the disks
> to the new storage, and then mirroring the active layer so both the
> old and the new disks are the same. Finally we switch to the new
> disk, and delete the old disk. So probably the issue is in the
> mirroring step. This is most likely a qemu issue.
Thank you for the clarification. It gave me the idea to check the
consistency of the old disk.
I performed the following test:
1. Create a VM on MS NFS.
2. Initiate live disk migration to another storage domain.
3. Catch the source files before oVirt removes them by creating hard
   links to them in another directory.
4. Shut down the VM.
5. Create another VM and move the caught files to the place where the
   new disk files are located.
6. Check the filesystem consistency in both VMs.
The source disk is consistent. The destination disk is corrupted.
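Step 3 works because hard links share an inode with the original file,
so the data survives when oVirt unlinks the source path. A minimal
sketch of the idea with placeholder paths (the real images live under
/rhev/data-center/... on the storage domain):

```shell
# Placeholder paths for illustration only; substitute the actual image
# directory on the storage domain.
workdir=$(mktemp -d)
mkdir -p "$workdir/images" "$workdir/saved"
printf 'disk-data' > "$workdir/images/disk.img"

# Step 3: hard-link the image into a safe directory before oVirt
# removes it. Both names now point at the same inode.
ln "$workdir/images/disk.img" "$workdir/saved/disk.img"

# Simulate oVirt deleting the source file after the migration.
rm "$workdir/images/disk.img"

# The data is still intact through the second link.
recovered=$(cat "$workdir/saved/disk.img")
echo "$recovered"
```

Because the link is created on the same filesystem, it costs no extra
space and no copy time, so it can be done in the short window before
oVirt's cleanup runs.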
> I'll try to get instructions for this from libvirt developers. If
> this happens with libvirt alone, this is a libvirt or qemu bug, and
> there is little we (oVirt) can do about it.
I've tried to reproduce the mirroring of the active layer:
1. Create two thin-provisioned VMs from the same template on different
   storage domains.
2. Start VM1.
3. virsh blockcopy VM1 vda /rhev/data-center/...path.to.disk.of.VM2.. --wait --verbose --reuse-external --shallow
4. virsh blockjob VM1 vda --abort --pivot
5. Shut down VM1.
6. Start VM2, boot into recovery mode, and check the filesystem.
I tried this a dozen times. Everything worked fine, with no data
corruption.
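For repeating the sequence, the steps can be wrapped in a small script.
VM1/VM2 and DEST are placeholders, not the real environment; with
DRY_RUN=1 (the default here) each command is only printed, so the
sequence can be sanity-checked on a host without libvirt:

```shell
# Repro loop sketch. Set DRY_RUN=0 on a host with libvirt installed to
# actually execute the commands.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

DEST="/rhev/data-center/PATH_TO_VM2_DISK"   # substitute the real path

for i in $(seq 1 12); do
    run virsh start VM1
    run virsh blockcopy VM1 vda "$DEST" --wait --verbose \
        --reuse-external --shallow
    run virsh blockjob VM1 vda --abort --pivot
    run virsh shutdown VM1
done
```

This only automates the copy/pivot part; the filesystem check in VM2's
recovery mode (step 6) still has to be done by hand after each pivot.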
Ideas?