We are trying Fedora 24, because the qemu-img from git requires a
different version of glibc 2.12 and gthread 2.0 from the one running on
the hosts.
In the meantime we fixed the problem in another way:
1) we dumped (with dd) all the snapshots and the base to files
2) then we checked where the backing file name becomes too long (in our
case in the 2nd snapshot) with
qemu-img info --backing-chain <dumped snapshot filename>
3) we fixed the backing file name with hexedit on that file only (be
sure to also correct the length of the backing file name; see the qcow2
format)
4) we checked the other snapshots, starting from the 3rd, and all were OK
5) we converted everything into a qcow2 image and a raw image with
qemu-img convert <latest snapshot filename> -O qcow2 <output filename in qcow2>
qemu-img convert <latest snapshot filename> -O raw <output filename in raw>
6) we mounted the qcow2 image with guestfish and checked that everything
was correct, and it was
7) we did the same steps (from 1 to 6) for all the disks (base +
snapshots) in the VM
8) we created a new VM in oVirt with the same structure as the old one
(RAM, disks, disk sizes, network)
9) we started the new VM from a live CD and then wrote all the fixed
disks back (from the raw files) using dd with netcat
10) we stopped and restarted the new VM and everything was working as we
had left it the last time it worked
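
For anyone who prefers a script to hexedit for step 3, here is a minimal
sketch of what that edit amounts to. It assumes the standard qcow2 header
layout (big-endian; backing_file_offset at byte 8, backing_file_size at
byte 16). The function names are hypothetical, it is not what we actually
ran, and it should only ever be pointed at a dumped copy. As an extra
safety assumption it refuses a new name longer than the old one, so it
never writes outside the space the old name occupied.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# magic (4 bytes), version (u32), backing_file_offset (u64), backing_file_size (u32)
HEADER_PREFIX = struct.Struct(">4sIQI")


def read_backing_file(path):
    """Return (offset, size, name_bytes) of the backing file name in a qcow2 header."""
    with open(path, "rb") as f:
        magic, _version, offset, size = HEADER_PREFIX.unpack(
            f.read(HEADER_PREFIX.size)
        )
        if magic != QCOW2_MAGIC:
            raise ValueError(f"{path} is not a qcow2 image")
        if offset == 0:
            # An offset of 0 means the image has no backing file.
            return 0, 0, b""
        f.seek(offset)
        return offset, size, f.read(size)


def rewrite_backing_file(path, new_name):
    """Overwrite the backing file name in place and update its length field."""
    offset, old_size, _old_name = read_backing_file(path)
    if offset == 0:
        raise ValueError("image has no backing file entry")
    new_bytes = new_name.encode()
    if len(new_bytes) > old_size:
        # Assumption: only reuse the bytes the old name occupied.
        raise ValueError("new name is longer than the old one")
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)
        f.seek(16)  # backing_file_size field
        f.write(struct.pack(">I", len(new_bytes)))
```

Calling read_backing_file() on the dump before and after the change shows
the stored name and length, which is the same thing we verified by eye in
hexedit.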
I know our solution was longer and harder, but this way, if something
had gone wrong, we would have destroyed only a copy of the original
data. Also, we did not manage to compile qemu-img from git on our 4.0.4
hosts because of the glibc and gthread versions.
Anyway, we will have to check all the VMs (about 200) for snapshots with
a backing file name that is too long, but then we would like to fix the
problem directly on the snapshot LVs (our method would require, I
suppose, 2-3 months of work), so we need a working patched qemu-img.
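
For the check across all the VMs, a rough sketch of how the scan could be
scripted. It assumes each snapshot volume (or a dump of it) is readable
as a file, and it relies on the 1023-byte limit for the backing file name
given in the qcow2 specification. The function names and the way the
paths are collected are hypothetical. It reads only the first 20 bytes of
each volume and does not cover qemu's additional check of the name
against the cluster size.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
MAX_BACKING_NAME = 1023  # limit from the qcow2 specification


def backing_name_length(path):
    """Return the backing_file_size header field, or None if not qcow2."""
    with open(path, "rb") as f:
        head = f.read(20)
    if len(head) < 20 or head[:4] != QCOW2_MAGIC:
        return None
    # Bytes 8-15: backing_file_offset (u64), bytes 16-19: backing_file_size (u32)
    _offset, size = struct.unpack(">QI", head[8:20])
    return size


def find_oversized(paths):
    """Return the paths whose header declares a backing file name that is too long."""
    bad = []
    for path in paths:
        length = backing_name_length(path)
        if length is not None and length > MAX_BACKING_NAME:
            bad.append(path)
    return bad
```

The list returned by find_oversized() would be the volumes that still
need the rebase fix; everything else can be left alone.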
I will let you know if Fedora 24 still gives problems with the compiler
and the gthread version.
I opened another thread on the mailing list, because since we upgraded
to 4.0.4, every VM that we shut down never starts again and fails with
the following error:
"Unable to get size for domain ..."
Do you think it could be related to the backing file problem?
For now thanks to all
Claudio
On 19/10/16 16:55, Adam Litke wrote:
On 19/10/16 14:43 +0200, Claudio Soprano wrote:
> Hi Adam, we tried your solution.
>
> This is our situation with the current VM that has 2 disks
>
> base -> snap1 -> snap2 -> snap3 -> snap4 -> snap5 -> .. -> snap15
> for each disk
>
> We tried to do
>
> qemu-img rebase -u -b base snap1
>
> results OK
>
> qemu-img rebase -u -b snap1 snap2
>
> results:
>
> qemu-img: Could not open 'snap2': Backing file name too long
>
> our qemu version is
>
> qemu-img version 2.3.0 (qemu-kvm-ev-2.3.0-31.el7.16.1), Copyright (c)
> 2004-2008 Fabrice Bellard
>
> How do you think we can resolve this?
I talked with the qemu developers about this issue and the best way to
fix this is by using a patched version of qemu-img that ignores
invalid backing_file values when doing an unsafe rebase. Here is what
you will need to do to fix your images.
1. Save the attached patch
2. Grab a copy of the latest qemu.git
3. Apply the patch to the source
4. Install qemu build dependencies
5. Build qemu
6. Run the built version of qemu-img when fixing your chain as I
suggested above:
./qemu-img rebase -u -b snap1 snap2
The patch disables other qemu-img functionality since you should not
be using this for anything but the rebase part. After the rebase you
can use the system qemu-img binary to check the image. Please try
this on one VM disk and make sure everything is okay.
--
Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: Claudio.Soprano(a)lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy