
Hello,

TL;DR: qcow2 images keep getting corrupted. Any workaround?

Long version:
I have already started this discussion on the oVirt and qemu-block mailing lists, under similar circumstances, but I have learned more in the months since, so here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 7.{2,3} hosts
- Hosts :
  - CentOS 7.2 1511 :
    - Kernel : 3.10.0 327
    - KVM : 2.3.0-31
    - libvirt : 1.2.17
    - vdsm : 4.17.32-1
  - CentOS 7.3 1611 :
    - Kernel : 3.10.0 514
    - KVM : 2.3.0-31
    - libvirt : 2.0.0-10
    - vdsm : 4.17.32-1
- Our storage is 2 Equallogic SANs connected via iSCSI on a dedicated network
- It varies from week to week, but all in all there are around 32 hosts, 8 storage domains and, for various reasons, very few VMs (less than 200).
- One peculiar point is that most of our VMs are given an additional dedicated network interface that is iSCSI-connected to some volumes of our SAN - these volumes not being part of the oVirt setup. That can lead to a lot of additional iSCSI traffic.

From time to time, a random VM appears paused by oVirt. Digging into the oVirt engine logs, then into the host vdsm logs, it appears that the host considers the qcow2 image corrupted. In what I consider conservative behavior, vdsm stops any interaction with this image and marks the VM as paused. Any attempt to unpause it leads to the same conservative pause.

After finding (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find leaked cluster errors that I can correct using "qemu-img check -r all".
- On 5% of them, I find leaked cluster errors and further fatal errors, which cannot be corrected with qemu-img. In rare cases, qemu-img can correct them but destroys large parts of the image (it becomes unusable), and in other cases it cannot correct them at all.

Months ago, I already sent a similar message, but the error message was about "No space left on device" (https://www.mail-archive.com/qemu-block@gnu.org/msg00110.html). This time, I don't have this message about space, only corruption.

I kept reading and found a similar discussion in the Proxmox group :
https://lists.ovirt.org/pipermail/users/2018-February/086750.html
https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-d...

What I read that is similar to my case is :
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the Proxmox thread, they tend to say that using virtio-scsi is the solution. I asked this question to oVirt experts (https://lists.ovirt.org/pipermail/users/2018-February/086753.html), but it's not clear the driver is to blame.

I agree with the answer Yaniv Kaul gave me, saying I have to properly report the issue, so I'm eager to know which particular information I can give you now.

As you can imagine, all this setup is in production, and for most of the VMs I cannot "play" with them. Moreover, we launched a campaign of nightly stopping every VM, running qemu-img check on each one, then booting it again. So it might take some time before I find another corrupted image (which I'll carefully store for debugging).

Other information : we very rarely do snapshots, but I'm beginning to suspect that automated migration of VMs could trigger similar behavior on qcow2 images.

Last point about the versions we use : yes, that's old; yes, we're planning to upgrade, but we don't know when.

Regards,

--
Nicolas ECARNOT
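For reference, a minimal sketch of the check/repair workflow described above, assuming the VM is stopped, the correct logical volume has been identified as per the Red Hat solution linked above, and /dev/<vg>/<lv> is a placeholder for its path:

    # Read-only check: reports corruptions and leaked clusters without touching the image
    qemu-img check /dev/<vg>/<lv>

    # Repair only leaked clusters (the milder class of error)
    qemu-img check -r leaks /dev/<vg>/<lv>

    # Attempt to repair all errors, including corruptions; this is the risky step
    # that, as described above, can leave large parts of the image unusable
    qemu-img check -r all /dev/<vg>/<lv>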

On Feb 7, 2018 7:08 PM, "Nicolas Ecarnot" <nicolas@ecarnot.net> wrote:

Hello,

TL;DR: qcow2 images keep getting corrupted. Any workaround?

[...]

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 7.{2,3} hosts
- Hosts :
  - CentOS 7.2 1511 : Kernel 3.10.0 327, KVM 2.3.0-31, libvirt 1.2.17, vdsm 4.17.32-1
  - CentOS 7.3 1611 : Kernel 3.10.0 514, KVM 2.3.0-31, libvirt 2.0.0-10, vdsm 4.17.32-1

All are somewhat old releases. I suggest upgrading to the latest RHEL and qemu-kvm bits. Later on, upgrade oVirt.

Y.

On 08/02/2018 at 13:59, Yaniv Kaul wrote:
On Feb 7, 2018 7:08 PM, "Nicolas Ecarnot" <nicolas@ecarnot.net> wrote:
Hello,
TL; DR : qcow2 images keep getting corrupted. Any workaround?
Long version: I have already started this discussion on the oVirt and qemu-block mailing lists, under similar circumstances, but I have learned more in the months since, so here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 7.{2,3} hosts
- Hosts :
  - CentOS 7.2 1511 : Kernel 3.10.0 327, KVM 2.3.0-31, libvirt 1.2.17, vdsm 4.17.32-1
  - CentOS 7.3 1611 : Kernel 3.10.0 514, KVM 2.3.0-31, libvirt 2.0.0-10, vdsm 4.17.32-1
All are somewhat old releases. I suggest upgrading to the latest RHEL and qemu-kvm bits.
Later on, upgrade oVirt. Y.
Hello Yaniv,

We could discuss for hours the fact that CentOS 7.3 was released in January 2017, and thus is not that old. We could also discuss for hours the gap between developers' wish to push their freshest releases and the brake we - industry users - put on adopting such new versions. In my case, the virtualization infrastructure is just one of the 30+ domains I have to master every day, and the more stable the better.

In the setup described previously, the qemu qcow2 images were correct, then they were not. We did not change anything. We have to find a workaround and we need your expertise.

Not understanding the cause of the corruption leaves us exposed to the same situation in oVirt 4.2.

--
Nicolas Ecarnot

On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:
TL; DR : qcow2 images keep getting corrupted. Any workaround?
Not without knowing the cause.

The first thing to make sure is that the image isn't touched by a second process while QEMU is running a VM. The classic case is using 'qemu-img snapshot' on the image of a running VM, which is instant corruption (newer QEMU versions have locking in place to prevent this), but we have seen more absurd cases of things outside QEMU tampering with the image when we were investigating previous corruption reports.

This covers the majority of all reports; we haven't had a real corruption caused by a QEMU bug in ages.
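As an illustration of the distinction Kevin draws (a sketch only; the device path and snapshot name are placeholders, and the locking behaviour applies only to QEMU versions recent enough to have image locking):

    # The classic mistake: creating an internal snapshot on the image of a
    # *running* VM from a second process, rewriting qcow2 metadata behind
    # QEMU's back (older QEMU versions do not prevent this)
    qemu-img snapshot -c my-snapshot /dev/<vg>/<lv>

    # On QEMU versions with image locking, a read-only check of an in-use
    # image is still possible by explicitly sharing the lock
    qemu-img check -U /dev/<vg>/<lv>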
After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it. - On 80% of my VMs, I find no errors. - On 15% of them, I find Leaked cluster errors that I can correct using "qemu-img check -r all" - On 5% of them, I find Leaked clusters errors and further fatal errors, which can not be corrected with qemu-img. In rare cases, qemu-img can correct them, but destroys large parts of the image (becomes unusable), and on other cases it can not correct them at all.
It would be good if you could make the 'qemu-img check' output available somewhere. It would be even better if we could have a look at the respective image. I seem to remember that John (CCed) had a few scripts to analyse corrupted qcow2 images, maybe we would be able to see something there.
What I read similar to my case is : - usage of qcow2 - heavy disk I/O - using the virtio-blk driver
In the proxmox thread, they tend to say that using virtio-scsi is the solution. Having asked this question to oVirt experts (https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's not clear the driver is to blame.
This seems very unlikely. The corruption you're seeing is in the qcow2 metadata, not only in the guest data. If anything, virtio-scsi exercises more qcow2 code paths than virtio-blk, so any potential bug that affects virtio-blk should also affect virtio-scsi, but not the other way around.
I agree with the answer Yaniv Kaul gave to me, saying I have to properly report the issue, so I'm longing to know which peculiar information I can give you now.
To be honest, debugging corruption after the fact is pretty hard. We'd need the 'qemu-img check' output and ideally the image to do anything, but I can't promise that anything would come out of this. Best would be a reproducer, or at least some operation that you can link to the appearance of the corruption. Then we could take a more targeted look at the respective code.
As you can imagine, all this setup is in production, and for most of the VMs, I can not "play" with them. Moreover, we launched a campaign of nightly stopping every VM, qemu-img check them one by one, then boot. So it might take some time before I find another corrupted image. (which I'll preciously store for debug)
Other informations : We very rarely do snapshots, but I'm close to imagine that automated migrations of VMs could trigger similar behaviors on qcow2 images.
To my knowledge, oVirt only uses external snapshots and creates them with QMP. This should be perfectly safe because from the perspective of the qcow2 image being snapshotted, it just means that it gets no new write requests. Migration is something more involved, and if you could relate the problem to migration, that would certainly be something to look into. In that case, it would be important to know more about the setup, e.g. is it migration with shared or non-shared storage?
Last point about the versions we use : yes that's old, yes we're planning to upgrade, but we don't know when.
That would be helpful, too. Nothing is more frustrating than debugging a bug in an old version only to find that it's already fixed in the current version (well, except maybe debugging and finding nothing).

Kevin

Hello Kevin,

On 13/02/2018 at 10:41, Kevin Wolf wrote:
On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:
TL; DR : qcow2 images keep getting corrupted. Any workaround?
Not without knowing the cause.
Actually, my main concern is mostly about finding the cause rather than repairing my corrupted VMs. Put another way: I would rather help oVirt than just help myself.
The first thing to make sure is that the image isn't touched by a second process while QEMU is running a VM.
Indeed, I read some BZs about this issue: they were raised by a user who ran qemu-img commands on a "mounted" image, thus leading to corruption. In my case, I'm not doing anything like that, and the corrupted VMs were only touched by classical oVirt actions.
The classic one is using 'qemu-img snapshot' on the image of a running VM, which is instant corruption (and newer QEMU versions have locking in place to prevent this), but we have seen more absurd cases of things outside QEMU tampering with the image when we were investigating previous corruption reports.
This covers the majority of all reports, we haven't had a real corruption caused by a QEMU bug in ages.
May I ask in which QEMU version this kind of locking was added? As I wrote, our oVirt setup is 3.6, so not recent.
After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it. - On 80% of my VMs, I find no errors. - On 15% of them, I find Leaked cluster errors that I can correct using "qemu-img check -r all" - On 5% of them, I find Leaked clusters errors and further fatal errors, which can not be corrected with qemu-img. In rare cases, qemu-img can correct them, but destroys large parts of the image (becomes unusable), and on other cases it can not correct them at all.
It would be good if you could make the 'qemu-img check' output available somewhere.
See attachment.
It would be even better if we could have a look at the respective image. I seem to remember that John (CCed) had a few scripts to analyse corrupted qcow2 images, maybe we would be able to see something there.
I just exported it like this :

    qemu-img convert /dev/the_correct_path /home/blablah.qcow2.img

The resulting file is 32G, and I need an idea for how to transfer this image to you.
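One point worth keeping in mind for the debugging use case: 'qemu-img convert' writes a brand-new image, so the corrupted qcow2 metadata itself is not carried over into the copy. A sketch of capturing the device contents verbatim instead, assuming the VM is stopped, the LV is activated, and the paths are placeholders:

    # Byte-for-byte copy of the logical volume, preserving the broken metadata
    dd if=/dev/the_correct_path of=/home/corrupted-lv.qcow2 bs=1M

    # Compress for transfer; unallocated/zeroed regions usually shrink a lot
    xz -T0 -9 /home/corrupted-lv.qcow2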
What I read similar to my case is : - usage of qcow2 - heavy disk I/O - using the virtio-blk driver
In the proxmox thread, they tend to say that using virtio-scsi is the solution. Having asked this question to oVirt experts (https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's not clear the driver is to blame.
This seems very unlikely. The corruption you're seeing is in the qcow2 metadata, not only in the guest data.
Are you saying:
- the corruption is in the metadata and in the guest data, OR
- the corruption is only in the metadata?
If anything, virtio-scsi exercises more qcow2 code paths than virtio-blk, so any potential bug that affects virtio-blk should also affect virtio-scsi, but not the other way around.
I get that.
I agree with the answer Yaniv Kaul gave to me, saying I have to properly report the issue, so I'm longing to know which peculiar information I can give you now.
To be honest, debugging corruption after the fact is pretty hard. We'd need the 'qemu-img check' output
Done.
and ideally the image to do anything,
I remember some Red Hat people once gave me temporary access to upload a large file to some dedicated server. Is that still possible?
but I can't promise that anything would come out of this.
Best would be a reproducer, or at least some operation that you can link to the appearance of the corruption. Then we could take a more targeted look at the respective code.
Sure. Alas, I find no obvious pattern leading to the corruption.

From the guest side, it appeared with Windows 2003, 2008 and 2012, and with Linux CentOS 6 and 7. It appeared with virtio-blk; I have changed some VMs to use virtio-scsi, but it's too soon to tell whether corruption appears in that case. As I said, I use snapshots VERY rarely, and our versions are too old, so we do them the cold way only (VM shut down) - so very safely. The "weirdest" thing we do is migrate VMs: you see how conservative we are!
As you can imagine, all this setup is in production, and for most of the VMs, I can not "play" with them. Moreover, we launched a campaign of nightly stopping every VM, qemu-img check them one by one, then boot. So it might take some time before I find another corrupted image. (which I'll preciously store for debug)
Other informations : We very rarely do snapshots, but I'm close to imagine that automated migrations of VMs could trigger similar behaviors on qcow2 images.
To my knowledge, oVirt only uses external snapshots and creates them with QMP. This should be perfectly safe because from the perspective of the qcow2 image being snapshotted, it just means that it gets no new write requests.
Migration is something more involved, and if you could relate the problem to migration, that would certainly be something to look into. In that case, it would be important to know more about the setup, e.g. is it migration with shared or non-shared storage?
I'm 99% sure the corrupted VMs have never seen a snapshot, and 99% sure they have been migrated at most once. For me *this* is the track to follow.

We have 2 main 3.6 oVirt DCs, each having 4 dedicated LUNs, connected via iSCSI. Two SANs are serving those volumes. These are Equallogic, and the setup of each volume contains a checkbox saying:

Access type : "Shared"
http://psonlinehelp.equallogic.com/V5.0/Content/V5TOC/Allowing_or_disallowin...
(shared access to the iSCSI target from multiple initiators)

To be honest, I've never been comfortable with this point:
- In a completely different context, I'm using it to allow two file servers to publish an OCFS2 volume embedded in a clustered LVM. It is absolutely reliable, as cLVM and OCFS2 are explicitly written to manage concurrent access.
- In the case of oVirt, we are allowing tens of hosts to connect to the same LUN. This LUN is then managed by a classical LVM setup, but I see no notion of concurrent access management here. To date, I still haven't understood how this concurrent access to the same LUN is managed without a crash. I hope I won't find any skeletons in the closet.
Last point about the versions we use : yes that's old, yes we're planning to upgrade, but we don't know when.
That would be helpful, too. Nothing is more frustrating than debugging a bug in an old version only to find that it's already fixed in the current version (well, except maybe debugging and finding nothing).
Kevin
Exactly, but as I wrote to Yaniv, it would be sad to set up a brand new 4.2 DC and then face the same old issues. For the record, I just finished setting up another 4.2 DC, but it will be a long time before I can apply to it a workload similar to the 3.6 production site.

--
Nicolas ECARNOT

On 13/02/2018 at 16:26, Nicolas Ecarnot wrote:
It would be good if you could make the 'qemu-img check' output available somewhere.
I found this: https://github.com/ShijunDeng/qcow2-dump

The transcript (beautiful colors when viewed with "more") is attached.

--
Nicolas ECARNOT

On 02/13/2018 04:41 AM, Kevin Wolf wrote:
On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:
TL; DR : qcow2 images keep getting corrupted. Any workaround?
Not without knowing the cause.
The first thing to make sure is that the image isn't touched by a second process while QEMU is running a VM. The classic one is using 'qemu-img snapshot' on the image of a running VM, which is instant corruption (and newer QEMU versions have locking in place to prevent this), but we have seen more absurd cases of things outside QEMU tampering with the image when we were investigating previous corruption reports.
This covers the majority of all reports, we haven't had a real corruption caused by a QEMU bug in ages.
After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it. - On 80% of my VMs, I find no errors. - On 15% of them, I find Leaked cluster errors that I can correct using "qemu-img check -r all" - On 5% of them, I find Leaked clusters errors and further fatal errors, which can not be corrected with qemu-img. In rare cases, qemu-img can correct them, but destroys large parts of the image (becomes unusable), and on other cases it can not correct them at all.
It would be good if you could make the 'qemu-img check' output available somewhere.
It would be even better if we could have a look at the respective image. I seem to remember that John (CCed) had a few scripts to analyse corrupted qcow2 images, maybe we would be able to see something there.
Hi!

I did write a pretty simplistic tool for trying to tell the shape of a corruption at a glance. It seems to work pretty similarly to the other tool you already found, but it won't hurt anything to run it:

https://github.com/jnsnow/qcheck

(Actually, that other tool looks like it has an awful lot of options. I'll have to check it out.)

It can print a really upsetting amount of data (especially for very corrupt images), but in the default case, the simple setting should do the trick just fine.

You could always put the output from this tool in a pastebin too; it might help me visualize the problem a bit more -- I find seeing the exact offsets and locations of all the various tables and things to be pretty helpful.

You can also always use the "deluge" option and compress it if you want, just don't let it print to your terminal:

    jsnow@probe (dev) ~/s/qcheck> ./qcheck -xd /home/bos/jsnow/src/qemu/bin/git/install_test_f26.qcow2 > deluge.log; and ls -sh deluge.log
    4.3M deluge.log

but it compresses down very well:

    jsnow@probe (dev) ~/s/qcheck> 7z a -t7z -m0=ppmd deluge.ppmd.7z deluge.log
    jsnow@probe (dev) ~/s/qcheck> ls -s deluge.ppmd.7z
    316 deluge.ppmd.7z

So I suppose if you want to send along:
(1) The basic output without any flags, in a pastebin
(2) The zipped deluge output, just in case

and I will try my hand at guessing what went wrong.

(Also, maybe my tool will totally choke on your image, who knows. It hasn't received an overwhelming amount of testing apart from when I go to use it personally and inevitably wind up displeased with how it handles certain situations, so ...)
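If it helps, a guess at what the "basic output without any flags" run might look like; the exact qcheck invocation is assumed here rather than taken from the thread:

    # Default, non-deluge run, redirected to a file that can go in a pastebin
    ./qcheck /path/to/suspect-image.qcow2 > basic.log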
And, looking at your other email:

"- In the case of oVirt, we are allowing tens of hosts to connect to the same LUN. This LUN is then managed by a classical LVM setup, but I see no notion of concurrent access management here. To date, I still haven't understood how this concurrent access to the same LUN is managed without a crash."

I'm hoping someone else on list can chime in on whether this is safe or not -- I'm not really familiar with how oVirt does things, but as long as the rest of the stack is sound and nothing else is touching the qcow2 data area, we should be OK, I'd hope. (Though the last big qcow2 corruption I had to debug wound up being in the storage stack and not in QEMU, so I have some prejudices here.)

Anyway, I'll try to help as best as I'm able, but no promises.

--js

https://framadrop.org/r/Lvvr392QZo#/wOeYUUlHQAtkUw1E+x2YdqTqq21Pbic6OPBIH0Tj...

On 14/02/2018 at 00:01, John Snow wrote:
So I suppose if you want to send along: (1) The basic output without any flags, in a pastebin (2) The zipped deluge output, just in case
and I will try my hand at guessing what went wrong.
-- Nicolas ECARNOT

On Wed, Feb 7, 2018 at 7:09 PM Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Hello,
TL; DR : qcow2 images keep getting corrupted. Any workaround?
Long version: I have already started this discussion on the oVirt and qemu-block mailing lists, under similar circumstances, but I have learned more in the months since, so here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 7.{2,3} hosts
- Hosts :
  - CentOS 7.2 1511 : Kernel 3.10.0 327, KVM 2.3.0-31, libvirt 1.2.17, vdsm 4.17.32-1
  - CentOS 7.3 1611 : Kernel 3.10.0 514, KVM 2.3.0-31, libvirt 2.0.0-10, vdsm 4.17.32-1
- Our storage is 2 Equallogic SANs connected via iSCSI on a dedicated network
In 3.6 with iSCSI storage, you have the issue of the lvmetad service activating oVirt volumes by default, and also activating guest LVs inside oVirt raw volumes. This can lead to data corruption if an LV was activated before it was extended on another host, and the LV size on the host does not reflect the actual LV size. We had many bugs related to this; check this for related bugs: https://bugzilla.redhat.com/1374545

To avoid this issue, you need to:

1. Edit /etc/lvm/lvm.conf global/use_lvmetad to:

   use_lvmetad = 0

2. Disable and mask these services:
   - lvm2-lvmetad.socket
   - lvm2-lvmetad.service

Note that this may cause warnings from systemd during boot; the warnings are harmless: https://bugzilla.redhat.com/1462792

For extra safety and better performance, you should also set up an lvm filter on all hosts. Check this for an example of how it is done in 4.x: https://www.ovirt.org/blog/2017/12/lvm-configuration-the-easy-way/

Since you run 3.6, you will have to set up the filter manually in the same way.

Nir
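A sketch of what those steps might look like on an EL7 host (the filter line is only an example following the pattern in the linked blog post; the devices to whitelist depend on each host's local disks and must be adapted):

    # 1. Disable lvmetad in the LVM configuration
    sed -i 's/^\( *use_lvmetad *=\).*/\1 0/' /etc/lvm/lvm.conf

    # 2. Stop, disable and mask the lvmetad socket and service
    systemctl stop lvm2-lvmetad.socket lvm2-lvmetad.service
    systemctl disable lvm2-lvmetad.socket lvm2-lvmetad.service
    systemctl mask lvm2-lvmetad.socket lvm2-lvmetad.service

    # 3. Example lvm filter in /etc/lvm/lvm.conf (devices section): accept only
    #    the host's local boot disk and reject everything else, so the host
    #    never scans or activates LVs found on the shared iSCSI LUNs
    # filter = [ "a|^/dev/sda2$|", "r|.*|" ]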
participants (5):
- John Snow
- Kevin Wolf
- Nicolas Ecarnot
- Nir Soffer
- Yaniv Kaul