Max number of datacenters per oVirt engine

Hi all, I have a somewhat awkward requirement to deploy datacenters to around 500 satellite locations. Each datacenter will have 1 or 2 hypervisors, and if 2, a cluster with GlusterFS for shared storage will be deployed. I am expecting to run between 4 and 10 VMs per datacenter. During operations it is expected that 20% to 40% of the satellite locations will be down or have a very bad connection to the engine. I would still like to manage these datacenters from the oVirt engine. Does anyone have any figures on how many datacenters I can add to a single engine given these requirements, and perhaps some best practices for dealing with 'bad' connections between the hypervisors and the engine? Thanks a lot!
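For context, the two-hypervisor case would look roughly like the sketch below; host and volume names are placeholders and this is only an illustration, not a tested recipe. As far as I know a plain replica 2 GlusterFS volume is prone to split-brain, so a third node or an arbiter brick is usually recommended when it backs oVirt storage.

# gluster peer probe hv2
# gluster volume create vmstore replica 2 hv1:/gluster/vmstore/brick hv2:/gluster/vmstore/brick
# gluster volume set vmstore group virt
# gluster volume start vmstore

The 'virt' option group (shipped with the gluster packages, as far as I know) applies the settings oVirt expects for VM image storage.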

On Tuesday, April 26, 2016 12:13:11 PM Joost@familiealbers.nl wrote:
Hi all, I have a somewhat awkward requirement to deploy datacenters to around 500 satellite locations. Each datacenter will have 1 or 2 hypervisors, and if 2, a cluster with GlusterFS for shared storage will be deployed. I am expecting to run between 4 and 10 VMs per datacenter. During operations it is expected that 20% to 40% of the satellite locations will be down or have a very bad connection to the engine. I would still like to manage these datacenters from the oVirt engine. Does anyone have any figures on how many datacenters I can add to a single engine given these requirements, and perhaps some best practices for dealing with 'bad' connections between the hypervisors and the engine? Thanks a lot!
Sounds like a good use case for ManageIQ. If possible I would do hosted engines in each satellite location, and then use ManageIQ to manage the engines.

Alexander

P.S. Happy King's Day tomorrow

Thanks for the response. If possible I would like to reduce the amount of tooling; we already have quite a lot. Do you think a single oVirt engine could cope?

Sent from my iPhone

Hi,

I think that 1000 hosts per engine is a bit over what we recommend (and support). The fact that all of them are going to be remote might not be ideal either. The engine assumes the network connection to all hosts is almost flawless, and the necessary routing and distance to your hosts might not play nice with (for example) the fencing logic.

I too would recommend splitting the deployment into multiple engines, especially if you do not plan on migrating VMs between satellite locations.

Martin Sivak
SLA / oVirt

On Tue, Apr 26, 2016 at 2:29 PM, Joost@familiealbers.nl <Joost@familiealbers.nl> wrote:
Thanks for the response. If possible I would like to reduce the amount of tooling; we already have quite a lot. Do you think a single oVirt engine could cope?

On 26.04.2016 14:46, Martin Sivak wrote:
I think that 1000 hosts per engine is a bit over what we recommend (and support). The fact that all of them are going to be remote might not be ideal either. The engine assumes the network connection to all hosts is almost flawless and the necessary routing and distance to your hosts might not play nice with (for example) the fencing logic.
Hi,

this seems a little surprising. At least RHEV states in the documentation that you support up to 200 hosts per cluster alone. There are no documented maxima for clusters or datacenters, though.

@awels: to add another layer of indirection via a dedicated hosted-engine per outlet seems a little much. We are talking about 500 * 4 GB RAM at least in this example, so 2 TB of RAM just for management purposes, if you follow the engine hardware recommendations?

But I agree, oVirt does not handle unstable or remote connections that well, so you might be better off with hundreds of remote engines; that, however, seems to be a nightmare to manage, even if you automate everything.

My personal experience is that oVirt scales to at least about 30-50 DCs managed by a single engine, but that setup was also on a LAN (though I would say it could scale well beyond these numbers, at least on a LAN).

HTH

Sven

@awels: to add another layer of indirection via a dedicated hosted-engine per outlet seems a little much. we are talking about 500 * 4GB RAM at least in this example, so 2 TB RAM just for management purposes, if you follow engine hardware recommendations?
I would not go that far. Creating zones per continent (for example) might be enough.
At least RHEV states in the documentation you support up to 200 hosts per cluster alone.
The default configuration seems to only allow 250 hosts per datacenter:

# engine-config -g MaxNumberOfHostsInStoragePool
MaxNumberOfHostsInStoragePool: 250 version: general

--
Martin Sivak
SLA / oVirt
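PS: for completeness, that is just an engine-config key, so it could presumably be raised with the set mode plus an engine restart, roughly like this (500 is only an example value; the engine's assumptions about host connectivity are the real constraint, so I would not simply crank it up):

# engine-config -s MaxNumberOfHostsInStoragePool=500 --cver=general
# systemctl restart ovirt-engine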

On 26 Apr 2016, at 16:48, Martin Sivak <msivak@redhat.com> wrote:
@awels: to add another layer of indirection via a dedicated hosted-engine per outlet seems a little much. we are talking about 500 * 4GB RAM at least in this example, so 2 TB RAM just for management purposes, if you follow engine hardware recommendations?
I would not go that far. Creating zones per continent (for example) might be enough.
At least RHEV states in the documentation you support up to 200 hosts per cluster alone.
The default configuration seems to only allow 250 hosts per datacenter.
# engine-config -g MaxNumberOfHostsInStoragePool
MaxNumberOfHostsInStoragePool: 250 version: general
Yep, but that limit is there because within a DC there are a lot of assumptions about flawless, fast-enough communication; the most problematic one is that all hosts need to access the same storage, and the monitoring gets expensive then. This is a different situation with separate DCs: there is no cross-DC communication. I would guess many DCs work great, actually. Too many hosts and VMs in total might be an issue, but since the last official updates there have been a lot of changes. E.g., in the stable state, due to the VM status events introduced in 3.6, the traffic required between each host and the engine is much lower. I would not be so afraid of thousands anymore, but of course YMMV.
-- Martin Sivak SLA / oVirt
On Tue, Apr 26, 2016 at 4:03 PM, Sven Kieske <svenkieske@gmail.com> wrote:
On 26.04.2016 14:46, Martin Sivak wrote:
I think that 1000 hosts per engine is a bit over what we recommend (and support). The fact that all of them are going to be remote might not be ideal either. The engine assumes the network connection to all hosts is almost flawless and the necessary routing and distance to your hosts might not play nice with (for example) the fencing logic.
Hi,
this seems a little surprising.
At least RHEV states in the documentation you support up to 200 hosts per cluster alone.
There are no documented maxima for clusters or datacenters though.
@awels: to add another layer of indirection via a dedicated hosted-engine per outlet seems a little much. we are talking about 500 * 4GB RAM at least in this example, so 2 TB RAM just for management purposes, if you follow engine hardware recommendations?
Yeah. Currently the added layer of ManageIQ with hosted engines everywhere is not that helpful for this particular case. Still, a per-continent or per-low-latency-area split might not be a bad idea. I can imagine that with somewhat more tolerant timeouts and refreshes it might work well, with incidents/disconnects being isolated within a DC.
But I agree, ovirt does not handle unstable or remote connections that
Right, but most of that is again per-DC. You can't do much cross-DC though (e.g. sharing a template is a pain).

Thanks,
Michal
well, so you might be better off with hundreds of remote engines, but it seems to be a nightmare to manage, even if you automate everything.
My personal experience is that oVirt scales to at least about 30-50 DCs managed by a single engine, but that setup was also on a LAN (though I would say it could scale well beyond these numbers, at least on a LAN).
HTH
Sven

OK, that sounds promising. Does it require 3.6 at minimum? We already handle starting VMs via VDSM in case the network from the hypervisors to the engine is down, as well as disabling fencing, to avoid a sloppy network causing the engine to try and make changes. I would like to reduce the communication between hosts and engine. Are there any ideas you would have about that?

Sent from my iPhone

On 28 Apr 2016, at 14:39, Joost@familiealbers.nl wrote:
Ok that sounds promising. Does it require 3.6 at minimum?
Yes; 3.6 requires far less network bandwidth during stable conditions (when nothing is going on with the VM status).
We already handle starting VMs via VDSM in case the network from the hypervisors to the engine is down.
That's… well, a bit tricky to do. Can you share how exactly? What compromises did you make?
As well as disabling fencing, to avoid a sloppy network causing the engine to try and make changes. I would like to reduce the communication between hosts and engine. Are there any ideas you would have about that?
Fencing sometimes does crazy things indeed. There were also quite a few enhancements in 3.5/3.6, but IIRC they are not the default, so you would need to enable them (e.g. skipping fencing when a storage lease is active). Beyond that there really should not be much else. Some parameters might help in unstable conditions; I guess increasing vdsHeartbeatInSeconds would make a big difference in general, at the expense of slower detection of network outages (mostly relevant for fencing, which you don't use, so it shouldn't matter to you).
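To illustrate, something like the following (60 is only an example value and I have not checked the default on every version, so treat it as a sketch):

# engine-config -g vdsHeartbeatInSeconds
# engine-config -s vdsHeartbeatInSeconds=60
# systemctl restart ovirt-engine

The skip-fencing-when-a-storage-lease-is-active behaviour I mentioned is, if I remember correctly, a per-cluster fencing policy option rather than an engine-config key, so it is enabled in the cluster's fencing policy settings.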

On Tuesday, April 26, 2016 02:29:22 PM Joost@familiealbers.nl wrote:
Thanks for the response. If possible I would like to reduce the amount of tooling; we already have quite a lot. Do you think a single oVirt engine could cope?
The reason I mentioned ManageIQ is that it would give you one tool to manage everything. You would not have to do much (perhaps the initial setup) in the oVirt manager; after that, everything goes through ManageIQ. Since it is designed to manage the managers, it is much more fault tolerant, which appears to be one of your requirements.

As far as the maximum number of datacenters goes, I am not entirely sure, to be honest, but I believe your proposed setup would push the engine beyond its limits. I would definitely go with multiple engine instances, maybe in zones where the connection to the hosts is better, and then something to control the engines from a central location (like ManageIQ).
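If it helps, registering an engine in ManageIQ is a single API call per engine, so rolling in many sites can be scripted. Roughly along these lines (host names, provider name and credentials are placeholders, and the exact payload differs between ManageIQ versions, so please check the API documentation of your version rather than taking this verbatim):

# curl -k -u admin:smartvm -X POST https://manageiq.example.com/api/providers \
    -H "Content-Type: application/json" \
    -d '{"type": "ManageIQ::Providers::Redhat::InfraManager", "name": "satellite-001", "hostname": "engine-001.example.com", "credentials": [{"userid": "admin@internal", "password": "secret"}]}'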

Thanks a lot for the feedback!

Sent from my iPhone
participants (5)
- Alexander Wels
- Joost@familiealbers.nl
- Martin Sivak
- Michal Skrivanek
- Sven Kieske