Re: Hosted Engine I/O scheduler

Hi Darrell,

Still, based on my experience we shouldn't queue our I/O in the VM just to do the same on the host.

I'm still considering whether I should keep deadline on my hosts or switch to 'cfq'. After all, I'm using hyper-converged oVirt and this needs testing. What I/O scheduler are you using on the host?

Best Regards,
Strahil Nikolov

On Mar 18, 2019 19:15, Darrell Budic <budic@onholyground.com> wrote:
Checked this on mine, see the same thing. Switching the engine to noop definitely feels more responsive.
I checked on some VMs as well; it looks like virtio drives (vda, vdb, …) get mq-deadline by default, but virtio-scsi gets noop. I used to think the tuned profile for virtual-guest would set noop, but apparently not…
-Darrell
On Mar 18, 2019, at 1:58 AM, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Hi All,
I have changed my I/O scheduler to none and here are the results so far:
Before (mq-deadline):
Adding a disk to VM (initial creation) START: 2019-03-17 16:34:46.709
Adding a disk to VM (initial creation) COMPLETED: 2019-03-17 16:45:17.996

After (none):
Adding a disk to VM (initial creation) START: 2019-03-18 08:52:02.xxx
Adding a disk to VM (initial creation) COMPLETED: 2019-03-18 08:52:20.xxx
Of course the results are inconclusive, as I have tested only once, but the engine feels more responsive.
Best Regards, Strahil Nikolov
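(For reference, a runtime switch like the one tested above can be done through sysfs; a minimal sketch assuming the engine's disk is vda, and note that it does not survive a reboot:)

  # show the available schedulers; the active one is shown in brackets
  cat /sys/block/vda/queue/scheduler
  # select 'none' for this device until the next reboot
  echo none > /sys/block/vda/queue/scheduler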
On Sunday, 17 March 2019, 18:30:23 GMT+2, Strahil <hunter86_bg@yahoo.com> wrote:
Dear All,
I have just noticed that my Hosted Engine has a strange I/O scheduler:
Last login: Sun Mar 17 18:14:26 2019 from 192.168.1.43
[root@engine ~]# cat /sys/block/vda/queue/scheduler
[mq-deadline] kyber none
[root@engine ~]#
Based on my experience, anything other than noop/none is useless for a VM and degrades performance.
Is there any reason that we have this scheduler? It seems pointless to process (and delay) the I/O in the VM and then process (and delay again) at the host level.
If there is no reason to keep the deadline, I will open a bug about it.
Best Regards, Strahil Nikolov
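(If the default does get changed, one way to make the choice persistent on a guest is a udev rule; a sketch with an assumed file name, matching only virtio disks:)

  # /etc/udev/rules.d/60-io-scheduler.rules (hypothetical file name)
  ACTION=="add|change", KERNEL=="vd[a-z]", ATTR{queue/scheduler}="none"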

I agree. I’ve been checking some of my more disk-intensive VMs this morning, and switching them to noop definitely improved responsiveness. All the virtio ones I’ve found were using deadline (with RHEL/CentOS guests), but some of the virtio-scsi ones were using deadline and some were noop, so I’m not sure of a definitive answer on that level yet.

For the hosts, it depends on what your backend is running. With a separate storage server on my main cluster, it doesn’t matter what the hosts are set to for me. You mentioned you run hyper-converged, so I’d say it depends on what your disks are. If you’re using SSDs, go none/noop, as they don’t benefit from the queuing. If they are HDDs, I’d test cfq or deadline and see which gives better latency and throughput to your VMs. I’d guess you’ll find deadline to offer better performance, but cfq to share better amongst multiple VMs. Unless you use ZFS underneath; in that case, go noop and let ZFS take care of it.
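(A quick way to check which case applies on a host; a sketch assuming the backing disk is sda, where 1 means rotational/HDD and 0 means SSD:)

  cat /sys/block/sda/queue/rotational
  # and the scheduler currently in use on that host disk
  cat /sys/block/sda/queue/scheduler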

On Mon, 18 Mar 2019 at 22:14, Darrell Budic <budic@onholyground.com> wrote:
I agree. I’ve been checking some of my more disk-intensive VMs this morning, and switching them to noop definitely improved responsiveness. All the virtio ones I’ve found were using deadline (with RHEL/CentOS guests), but some of the virtio-scsi ones were using deadline and some were noop, so I’m not sure of a definitive answer on that level yet.

For the hosts, it depends on what your backend is running. With a separate storage server on my main cluster, it doesn’t matter what the hosts are set to for me. You mentioned you run hyper-converged, so I’d say it depends on what your disks are. If you’re using SSDs, go none/noop, as they don’t benefit from the queuing. If they are HDDs, I’d test cfq or deadline and see which gives better latency and throughput to your VMs. I’d guess you’ll find deadline to offer better performance, but cfq to share better amongst multiple VMs. Unless you use ZFS underneath; in that case, go noop and let ZFS take care of it.
Our internal scale team is now testing the 'throughput-performance' tuned profile, and it is giving promising results; I suggest you try it as well. We will go over the results of a comparison against the virtual-guest profile, and if there is evidence of improvement we will make it the default (provided it doesn't degrade small and medium scale environments).
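(For anyone who wants to try it, profiles can be inspected and switched with tuned-adm; a minimal sketch:)

  # show the currently active profile
  tuned-adm active
  # switch to the profile the scale team is evaluating
  tuned-adm profile throughput-performance
  # check that the running system matches the profile's settings
  tuned-adm verify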

Inline:
On Mar 20, 2019, at 4:25 AM, Roy Golan <rgolan@redhat.com> wrote:
Our internal scale team is now testing the 'throughput-performance' tuned profile, and it is giving promising results; I suggest you try it as well. We will go over the results of a comparison against the virtual-guest profile, and if there is evidence of improvement we will make it the default (provided it doesn't degrade small and medium scale environments).
I don’t think that will make a difference in this case. Both virtual-host and virtual-guest include the throughput-performance profile, just with “better” virtual memory tunings for guests and hosts. None of those three modify the disk queue schedulers by default, at least not on my CentOS 7.6 systems.

Re my testing, I have virtual-host on my hosts and virtual-guest on my guests already.
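(The inheritance is easy to confirm on a CentOS 7 box; a sketch assuming the stock profiles live under /usr/lib/tuned:)

  grep '^include' /usr/lib/tuned/virtual-guest/tuned.conf
  # expected output: include=throughput-performance
  grep '^include' /usr/lib/tuned/virtual-host/tuned.conf
  # expected output: include=throughput-performance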

On Wed, Mar 20, 2019, 1:16 PM Darrell Budic <budic@onholyground.com> wrote:
I don’t think that will make a difference in this case. Both virtual-host and virtual-guest include the throughput-performance profile, just with “better” virtual memory tunings for guests and hosts. None of those three modify the disk queue schedulers by default, at least not on my CentOS 7.6 systems.

Re my testing, I have virtual-host on my hosts and virtual-guest on my guests already.
Unfortunately, the ideal scheduler really depends on the storage configuration. Gluster, ZFS, iSCSI, FC, and NFS don't align on a single "best" configuration (to say nothing of direct LUNs on guests), and then there are workload considerations.

The scale team is aiming for a balanced "default" policy rather than one which is best for a specific environment.

That said, I'm optimistic that the results will let us give better recommendations if your workload/storage benefits from a different scheduler.

On Mar 20, 2019, at 12:42 PM, Ryan Barry <rbarry@redhat.com> wrote:
Unfortunately, the ideal scheduler really depends on the storage configuration. Gluster, ZFS, iSCSI, FC, and NFS don't align on a single "best" configuration (to say nothing of direct LUNs on guests), and then there are workload considerations.

The scale team is aiming for a balanced "default" policy rather than one which is best for a specific environment.

That said, I'm optimistic that the results will let us give better recommendations if your workload/storage benefits from a different scheduler.
Agreed, but that wasn’t my point. I was commenting that those tuned profiles do not set schedulers, so that won’t make a difference disk-scheduler-wise. Or are they testing changes to the default policy config? Good point on direct LUNs, too.

And a question: why not virtual-guest, if you’re talking about in-guest/engine defaults? Or are they testing host profiles, in which case the question becomes why not virtual-host? Or am I missing where they are testing the scheduler?

I’m already using virtual-host on my hosts, which appears to have been set by the oVirt Node setup process, and virtual-guest in my RHEL-based guests, which I’ve been setting with puppet for a long time now.
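(If a profile were meant to pin the scheduler, tuned's disk plugin can do that; a hypothetical child profile as a sketch, with an assumed name:)

  # /etc/tuned/virtual-guest-none/tuned.conf (hypothetical custom profile)
  [main]
  include=virtual-guest

  [disk]
  # tuned's disk plugin sets this elevator on the devices it manages
  elevator=none

  # activate it with: tuned-adm profile virtual-guest-none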
Participants (4): Darrell Budic, Roy Golan, Ryan Barry, Strahil