[Users] Horrid performance during disk I/O

This is probably more appropriate for the qemu users mailing list, but that list doesn’t get much traffic and most posts go unanswered… As I’ve mentioned in the past, I’m migrating my environment from ESXi to oVirt AIO. Under ESXi I was pretty happy with the disk performance, and noticed very little difference from bare metal to the HV. Under oVirt/QEMU/KVM, not so much….

Running hdparm on the disk from the HV and from the guest yields the same number, about 180 MB/sec (SATA III disks, 7200 RPM). The problem is that during disk activity, and it doesn’t matter whether it’s the Windows 7 guests or the Fedora 20 guests (both using virtio-scsi), the qemu-system-x86 process starts consuming 100% of the hypervisor CPU. The hypervisor is a Core i7 950 with 24 GB of RAM. There are 2 Fedora 20 guests and 2 Windows 7 guests, each configured with 4 GB of guaranteed RAM. Load averages can go above 40 during sustained disk IO, and performance obviously suffers greatly.

I have tried all combinations of having the guests on ext4 and Btrfs, using ext4 and Btrfs inside the guests, as well as direct LUN. It doesn’t make any difference: disk IO sends qemu-system-x86 to high CPU percentages. This can’t be normal, so I’m wondering what I’ve done wrong. Is there some magic setting I’m missing?
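For a like-for-like number on the hypervisor versus inside a guest, a small O_DIRECT read loop gives roughly the same kind of figure as hdparm -t while keeping the page cache out of the picture. This is only a sketch: the device path is a placeholder, it needs root, and it assumes the target is at least 1 GiB.

#!/usr/bin/env python3
"""Rough sequential-read throughput check. Run it on the hypervisor and
then inside a guest against the disk under test so the two numbers are
comparable. The device path is a placeholder."""
import mmap
import os
import sys
import time

DEV = sys.argv[1] if len(sys.argv) > 1 else "/dev/sdb"  # placeholder device
BLOCK = 1 << 20   # 1 MiB per read
TOTAL = 1 << 30   # read 1 GiB in total

fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
buf = mmap.mmap(-1, BLOCK)          # anonymous mmap is page-aligned, as O_DIRECT requires
done = 0
start = time.monotonic()
while done < TOTAL:
    n = os.readv(fd, [buf])         # read into the aligned buffer
    if n == 0:
        break                       # device/file smaller than TOTAL
    done += n
elapsed = time.monotonic() - start
os.close(fd)
print(f"{done / elapsed / 1e6:.1f} MB/s over {done >> 20} MiB")

If the guest and host numbers come out close but qemu-system-x86 still pegs a core, that points at the virtualization path (cache mode, I/O threads, virtio backend) rather than at the disk itself.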

Under ESXi I was pretty happy with the disk performance, and noticed very little difference from bare metal to HV.
Under oVirt/QEMU/KVM, not so much….

Still unanswered ... have a look here ...

http://lists.ovirt.org/pipermail/users/2014-January/019429.html

Markus

Sorry I missed your original post. But at least I know now I’m not the only one suffering this problem. I would be interested in knowing how widespread this is, and why there aren’t people screaming up and down on the qemu mailing list about this issue.

My system board is a Sabertooth X58 using the ICH10R chipset. I wish I hadn’t destroyed my AMD oVirt system yesterday; I didn’t notice any issues on there, but I didn’t push the IO very hard on it.

Have you opened a BZ on this issue? I wonder if it’s possible to down-rev qemu to see if this is a recent issue, or will oVirt have a problem with that?

On Jan 13, 2014, at 11:32 AM, Markus Stockhausen <stockhausen@collogia.de> wrote:
Under ESXi I was pretty happy with the disk performance, and noticed very little difference from bare metal to HV. Under oVirt/QEMU/KVM, not so much….
Still unanswered ... have a look here ...
http://lists.ovirt.org/pipermail/users/2014-January/019429.html
Markus

Hi, I suspect most users didn’t complain because they don’t have high IO workloads (yet). I think most people run test setups (as do I). However, in my own (artificial) IO workload test on engine 3.2.3 there was no extreme CPU penalty for higher IO; the read/write results (cache/no cache) scaled linearly as expected, with no high CPU at all. I didn’t test with 3.3, though, and as I said it was an artificial workload, not real applications.

On 13.01.2014 21:20, Blaster wrote:
I didn’t push the IO very hard on it.
Sven Kieske

----- Original Message -----
From: "Blaster" <blaster@556nato.com> To: users@ovirt.org Sent: Monday, January 13, 2014 12:22:37 PM Subject: [Users] Horrid performance during disk I/O
This is probably more appropriate for the qemu users mailing list, but that list doesn’t get much traffic and most posts go unanswered…
As I’ve mentioned in the past, I’m migrating my environment from ESXi to oVirt AIO.
Under ESXi I was pretty happy with the disk performance, and noticed very little difference from bare metal to HV.
Under oVirt/QEMU/KVM, not so much….
Running hdparm on the disk from the HV and from the guest yields the same number, about 180 MB/sec (SATA III disks, 7200 RPM). The problem is that during disk activity, and it doesn’t matter whether it’s the Windows 7 guests or the Fedora 20 guests (both using virtio-scsi), the qemu-system-x86 process starts consuming 100% of the hypervisor CPU. The hypervisor is a Core i7 950 with 24 GB of RAM. There are 2 Fedora 20 guests and 2 Windows 7 guests, each configured with 4 GB of guaranteed RAM.
Did you compare virtio-blk to virtio-scsi? The former will likely outperform the latter.
Load averages can go up over 40 during sustained disk IO. Performance obviously suffers greatly.
I have tried all combinations of having the guests on EXT 4, BTRFS and using EXT 4 and BTRFS inside the guests, as well as direct LUN. Doesn’t make any difference. Disk IO sends qemu-system-x86 to high CPU percentages.
This can’t be normal, so I’m wondering what I’ve done wrong. Is there some magic setting I’m missing?

On 1/14/2014 11:04 AM, Andrew Cathrow wrote:
Did you compare virtio-blk to virtio-scsi? The former will likely outperform the latter.
No, but I have been meaning to, out of curiosity. But why do you say virtio-blk will be faster than virtio-scsi? The virtio-scsi wiki claims equal performance. I've been trying to get some real numbers on the performance differences using iozone, but the numbers are all over the place, both on the HV and the guests, so not very meaningful. Not an iozone expert, so still trying to figure out what I'm doing wrong there as well.

On Tue, Jan 14, 2014 at 10:38 PM, Blaster <Blaster@556nato.com> wrote:
On 1/14/2014 11:04 AM, Andrew Cathrow wrote:
Did you compare virtio-blk to virtio-scsi? The former will likely outperform the latter.
No, but I have been meaning to, out of curiosity.
But why do you say virtio-blk will be faster than virtio-scsi? The virtio-scsi wiki claims equal performance.
That's also what I read but ... this presentation:
http://www.linux-kvm.org/wiki/images/f/f9/2012-forum-virtio-blk-performance-improvement.pdf
claims: "virtio-blk is about ~3 times faster than virtio-scsi in my setup" o_O
So testing is definitely a good idea.
I've been trying to get some real numbers of the performance differences using iozone, but the numbers are all over the place, both on the HV and the guests, so not very meaningful. Not an iozone expert, so still trying to figure out what I'm doing wrong there as well.
I've also done some (relatively simple) testing with fio.
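For numbers that are easier to compare than iozone's, one option is a fixed fio job run the same way on the HV and in a guest. A minimal sketch follows; it assumes fio is installed, and the test file path, size, and runtime are arbitrary placeholders.

#!/usr/bin/env python3
"""Run a fixed fio random-read job and print the headline numbers, so the
same test can be repeated on the hypervisor and inside a guest."""
import json
import subprocess

cmd = [
    "fio",
    "--name=randread",
    "--filename=/var/tmp/fio.test",   # placeholder; put it on the storage under test
    "--rw=randread",
    "--bs=4k",
    "--size=1G",
    "--direct=1",                     # O_DIRECT keeps the page cache out of the measurement
    "--ioengine=libaio",
    "--iodepth=16",
    "--runtime=60",
    "--time_based",
    "--output-format=json",
]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
job = json.loads(result.stdout)["jobs"][0]["read"]
print(f"read: {job['iops']:.0f} IOPS, {job['bw']} KiB/s")

Running the identical job in both places and comparing IOPS and latency tends to show the virtualization overhead more clearly than raw sequential MB/s does.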

I finally found time to switch from virtio-scsi to virtio. It seems to have made a big difference.
The Windows VMs boot faster, web pages load faster, same with the Fedora 20 VMs.
Everything just feels smoother.
I found this interesting presentation:
http://events.linuxfoundation.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf
It claims virtio-blk and virtio-scsi are the same speed, although it doesn’t mention whether this was with or without data plane enabled.
When will data plane be the default?
On Jan 15, 2014, at 3:11 AM, Sander Grendelman <sander@grendelman.com> wrote:
On Tue, Jan 14, 2014 at 10:38 PM, Blaster <Blaster@556nato.com> wrote:
On 1/14/2014 11:04 AM, Andrew Cathrow wrote:
Did you compare virtio-blk to virtio-scsi? The former will likely outperform the latter.
No, but I have been meaning to, out of curiosity.
But why do you say virtio-blk will be faster than virtio-scsi? The virtio-scsi wiki claims equal performance.
That's also what I read but ... this presentation:
http://www.linux-kvm.org/wiki/images/f/f9/2012-forum-virtio-blk-performance-improvement.pdf
claims: "virtio-blk is about ~3 times faster than virtio-scsi in my setup" o_O
So testing is definitely a good idea.
I've been trying to get some real numbers of the performance differences using iozone, but the numbers are all over the place, both on the HV and the guests, so not very meaningful. Not an iozone expert, so still trying to figure out what I'm doing wrong there as well.
I've also done some (relatively simple) testing with fio.
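On the data-plane question: in the qemu releases of that era, virtio-blk data plane was still the experimental, opt-in x-data-plane property rather than a default, so one rough way to tell whether a given guest has it is to look at the running qemu command line. A sketch, assuming the property is spelled x-data-plane and the qemu processes are visible under /proc:

#!/usr/bin/env python3
"""Scan /proc for qemu processes and report whether their command line
contains the experimental x-data-plane=on property."""
import glob

for path in glob.glob("/proc/[0-9]*/cmdline"):
    try:
        with open(path, "rb") as f:
            argv = f.read().split(b"\0")
    except OSError:
        continue                      # process exited while we were scanning
    if not argv or b"qemu" not in argv[0]:
        continue
    pid = path.split("/")[2]
    has_dp = any(b"x-data-plane=on" in arg for arg in argv)
    print(f"pid {pid}: data plane {'enabled' if has_dp else 'not enabled'}")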
participants (6)
- Andrew Cathrow
- Blaster
- Blaster
- Markus Stockhausen
- Sander Grendelman
- Sven Kieske