<div dir="auto"><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Feb 6, 2018 11:09 AM, "Nicolas Ecarnot" <<a href="mailto:nicolas@ecarnot.net">nicolas@ecarnot.net</a>> wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>
<br>
On our two 3.6 DCs, we're still facing qcow2 corruptions, even on freshly installed VMs (CentOS7, win2012, win2008...).<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">Please provide complete information on the issue. When, how often, which storage, etc. </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
(We are still hoping to find some time to migrate all this to 4.2, but it's a big job and our one-person team - me - is overwhelmed.)<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">Understood. Note that we have some scripts that can assist somewhat. </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
My workaround is described in my previous thread below, but it's just a workaround.<br>
<br>
Reading further, I found this:<br>
<br>
<a href="https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2" rel="noreferrer" target="_blank">https://forum.proxmox.com/thre<wbr>ads/qcow2-corruption-after-<wbr>snapshot-or-heavy-disk-i-o.<wbr>32865/page-2</a><br>
<br>
There are many things I don't know or understand, and I'd like your opinion:<br>
<br>
- Is "virtio" a synonym for "virtio-blk"?<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">Yes. </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
- Is it true that virtio-scsi is under active development while development of virtio has stopped?<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">No. </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
- People on the Proxmox forum seem to say that no qcow2 corruption occurs when using IDE (not an option for me) or virtio-scsi. </blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">Anecdotal evidence or properly reproduced? </div><div dir="auto">Have they filed an issue? </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Has anyone at Red Hat ever heard of this?<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">I'm not aware of an existing corruption issue. <br></div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
- Is converting all my VMs to use virtio-scsi a guarantee against further corruptions?<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">No. </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
- Unofficially, which driver do the oVirt devs recommend in terms of future prospects, ongoing development, and stability?<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">Depends. I like virtio-scsi for its features (DISCARD mainly), but in some workloads virtio-blk may be somewhat faster (supposedly lower overhead). </div><div dir="auto">Both interfaces are stable. </div><div dir="auto"><br></div><div dir="auto">We should focus on properly reporting the issue so the qemu folks can look at this. </div><div dir="auto">Y. </div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Regards,<font color="#888888"><br>
<br>
-- <br>
Nicolas ECARNOT</font><div class="elided-text"><br>
<br>
On 15/09/2017 at 14:06, Nicolas Ecarnot wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
TL;DR:<br>
How to avoid image corruption?<br>
<br>
<br>
Hello,<br>
<br>
On two of our old 3.6 DCs, a recent series of VM migrations led to some issues:<br>
- I put a host into maintenance mode<br>
- most of the VMs migrate nicely<br>
- one remaining VM never migrates, and the logs show:<br>
<br>
* engine.log : "...VM has been paused due to I/O error..."<br>
* vdsm.log : "...Improbable extension request for volume..."<br>
<br>
After digging through the RH BZ tickets, I saved the day by:<br>
- stopping the VM<br>
- running lvchange -ay on the relevant /dev/...<br>
- running qemu-img check [-r all] /rhev/blahblah<br>
- running lvchange -an...<br>
- booting the VM<br>
- enjoy!<br>
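<br>
The steps above can be sketched as a small Python helper that only builds and prints the commands (a dry run by default). This is an illustration, not a tested recovery tool: the function names and both placeholder paths are assumptions, since the real /dev/... and /rhev/... paths are elided in this message. The VM must be stopped first.<br>
<pre>
```python
import subprocess


def repair_commands(lv_path, image_path, repair=False):
    """Build the commands for the steps above as argument lists.

    lv_path and image_path are hypothetical placeholders: the original
    message elides the real /dev/... and /rhev/... paths.
    """
    check = ["qemu-img", "check"]
    if repair:
        # "-r all" asks qemu-img to try to repair every kind of error it finds.
        check += ["-r", "all"]
    check.append(image_path)
    return [
        ["lvchange", "-ay", lv_path],  # activate the logical volume
        check,                         # check (and optionally repair) the image
        ["lvchange", "-an", lv_path],  # deactivate the volume again
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # qemu-img check exits non-zero when it finds problems, so the
            # return code is not treated as fatal here.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # Dry run only: stop the VM first and replace the placeholder paths.
    run_commands(repair_commands("/dev/VG_PLACEHOLDER/LV_PLACEHOLDER",
                                 "/rhev/IMAGE_PLACEHOLDER", repair=True))
```
</pre>
Running the check without repair first (repair=False) shows what is wrong before anything is modified.<br>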
<br>
Yesterday this worked for a VM where only one error occurred on the qemu image, and the repair was easily done by qemu-img.<br>
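<br>
To tell a clean image from a corrupted one in a script, the exit status of qemu-img check can be mapped to its documented meaning. The table below follows the qemu-img man page; the helper name is an assumption made for this sketch.<br>
<pre>
```python
# Exit statuses documented in the qemu-img man page for "qemu-img check".
CHECK_STATUS = {
    0: "check completed, the image is (now) consistent",
    1: "check not completed because of internal errors",
    2: "check completed, image is corrupted",
    3: "check completed, image has leaked clusters, but is not corrupted",
    63: "checks are not supported by the image format",
}


def describe_check_status(returncode):
    """Translate a qemu-img check return code into a readable message."""
    return CHECK_STATUS.get(returncode, "unknown exit status %d" % returncode)


if __name__ == "__main__":
    for code in sorted(CHECK_STATUS):
        print(code, describe_check_status(code))
```
</pre>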
<br>
Today, facing the same issue on another VM, it failed because the errors were very numerous, and also because of this message :<br>
<br>
[...]<br>
Rebuilding refcount structure<br>
ERROR writing refblock: No space left on device<br>
qemu-img: Check failed: No space left on device<br>
[...]<br>
<br>
The PV/VG/LV are far from being full, so I don't know where to look.<br>
I tried many ways to solve it, but I'm not at all comfortable with qemu images, their corruption, or their repair, so I ended up exporting this VM (to an NFS export domain) and importing it into another DC. This had the side effect of running qemu-img convert from qcow2 to qcow2, which may have fixed some of the errors.<br>
I also copied it into another qcow2 file using the same qemu-img convert method, which likewise produced a clean qcow2 image without errors.<br>
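<br>
As a rough sketch of that copy step: qemu-img convert writes a new image file, and the copy can then be checked on its own. Both file names are hypothetical placeholders and the helper name is an assumption; the code only builds the commands.<br>
<pre>
```python
def convert_and_check_commands(src, dst):
    """Build commands to copy a qcow2 image into a new qcow2 file and check it.

    src and dst are hypothetical placeholder paths.
    """
    return [
        # -f is the source format, -O the output format.
        ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2", src, dst],
        # Check the new copy without attempting any repair.
        ["qemu-img", "check", dst],
    ]


if __name__ == "__main__":
    for cmd in convert_and_check_commands("SOURCE_PLACEHOLDER.qcow2",
                                          "COPY_PLACEHOLDER.qcow2"):
        print(" ".join(cmd))
```
</pre>
A clean check on the copy only says that the new file's metadata is consistent; it does not prove that guest data damaged in the source was recovered.<br>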
<br>
I saw that some VM migration bugs are fixed in 4.x, but that is not the point here.<br>
I checked my SANs, my network layers, my blades, the OS (CentOS 7.2) of my hosts, but I see nothing special.<br>
<br>
The real reason behind my message is not to learn how to repair anything, but rather to understand what could have led to this situation.<br>
Where should I keep a keen eye?<br>
<br>
</blockquote>
<br>
<br>
-- <br>
Nicolas ECARNOT<br>
</div></blockquote></div><br></div></div></div>