upgrade of CL and DC vs running VMs

Hi all,

I believe that the introduction of bug 1413150 (Add warning to change CL to match the installed engine version) may have an unfortunate consequence: people actually moving forward with the CL and DC without realizing the constraints on running existing VMs. The periodic nagging is likely to make people run into the following issue even more frequently.

We have a cluster level override per VM which takes care of compatibility on CL update by setting the VM's override to the original CL. That is visible in VM properties, but that's pretty much it; it's not very prominent at the moment and it can't be searched on (bug 1454389). When the cluster update is made there is a dialog informing you, and there's also the pending config change for those running VMs… until you shut the VM down; from that time on it only has the CL override set.

But the real problem is with the DC, which AFAIK does not have an override capability and currently does not have any checks for running VMs. With the above mechanism you can easily get a VM with a CL override (say, 3.6) and a mindlessly updated DC at 4.1… and once you stop such a VM you won't be able to start it anymore, as there is a proper check for an unsupported 3.6 CL VM in a newer DC (implemented by bug 1436577 - Solve DC/Cluster upgrade of VMs with now-unsupported custom compatibility level).

We either need to warn/block on DC upgrade, or implement some kind of a DC override (I guess this is a storage question?).

Thoughts/ideas?

Thanks,
michal
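The trap described above can be illustrated with a small sketch. This is hypothetical Python, not the actual oVirt engine code; the names (`Vm`, `effective_compat_version`, `can_run_vm`) and the assumption that a 4.x DC no longer supports the 3.6 level are illustrative only:

```python
# Hypothetical sketch of the start-VM compatibility check discussed above.
# None of these names come from the real oVirt engine API.

from dataclasses import dataclass
from typing import Optional

Version = tuple[int, int]  # (major, minor), e.g. (3, 6)

@dataclass
class Vm:
    name: str
    custom_compat_version: Optional[Version]  # per-VM CL override, if set

def effective_compat_version(vm: Vm, cluster_version: Version) -> Version:
    # The per-VM override (set automatically for running VMs on cluster
    # upgrade) wins over the cluster's own compatibility level.
    return vm.custom_compat_version or cluster_version

def can_run_vm(vm: Vm, cluster_version: Version, dc_version: Version) -> bool:
    # bug 1436577 added a check along these lines: a VM whose effective CL
    # is older than what the data center still supports may not be started.
    # We assume (for illustration) a DC raised to 4.x supports only >= 4.0.
    oldest_supported_in_dc = (4, 0) if dc_version >= (4, 0) else (3, 6)
    return effective_compat_version(vm, cluster_version) >= oldest_supported_in_dc

# The scenario from the mail: a VM kept its 3.6 override, DC raised to 4.1.
legacy_vm = Vm("legacy", custom_compat_version=(3, 6))
print(can_run_vm(legacy_vm, cluster_version=(4, 1), dc_version=(4, 1)))  # False
```

The VM keeps running after the DC upgrade, so the problem only surfaces once it is stopped and the start-time check rejects it.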

On Thu, May 25, 2017 at 12:16 PM, Michal Skrivanek <mskrivan@redhat.com> wrote:
Hi all, I believe that the introduction of bug 1413150 (Add warning to change CL to match the installed engine version) may have an unfortunate consequence: people actually moving forward with the CL and DC without realizing the constraints on running existing VMs. The periodic nagging is likely to make people run into the following issue even more frequently.
Shall we note on the cluster upgrade operation that the user should be aware of the implications? Do we know those constraints in advance, and whether they are relevant in the environment? And if they aren't, could we skip the warning?
We have a cluster level override per VM which takes care of compatibility on CL update by setting the VM's override to the original CL. That is visible in VM properties, but that's pretty much it; it's not very prominent at the moment and it can't be searched on (bug 1454389). When the cluster update is made there is a dialog informing you, and there's also the pending config change for those running VMs… until you shut the VM down; from that time on it only has the CL override set.
But the real problem is with the DC, which AFAIK does not have an override capability and currently does not have any checks for running VMs. With the above mechanism you can easily get a VM with a CL override (say, 3.6) and a mindlessly updated DC at 4.1… and once you stop such a VM you won't be able to start it anymore, as there is a proper check for an unsupported 3.6 CL VM in a newer DC (implemented by bug 1436577 - Solve DC/Cluster upgrade of VMs with now-unsupported custom compatibility level).
I don't recall. Do we have a warning on data center level as well? Or only cluster level?
We either need to warn/block on DC upgrade, or implement some kind of a DC override (I guess this is a storage question?)
(Similar to my question above), do we have a way to identify those constraints and whether they are relevant in the environment? And if so, block upgrading of the DC level?
Thoughts/ideas?
Thanks, michal
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

On Thu, May 25, 2017 at 2:26 PM, Oved Ourfali <oourfali@redhat.com> wrote:
On Thu, May 25, 2017 at 12:16 PM, Michal Skrivanek <mskrivan@redhat.com> wrote:
Hi all, I believe that the introduction of bug 1413150 (Add warning to change CL to match the installed engine version) may have an unfortunate consequence: people actually moving forward with the CL and DC without realizing the constraints on running existing VMs. The periodic nagging is likely to make people run into the following issue even more frequently.
Shall we note on the cluster upgrade operation that the user should be aware of the implications? Do we know those constraints in advance, and whether they are relevant in the environment? And if they aren't, could we skip the warning?
We have a cluster level override per VM which takes care of compatibility on CL update by setting the VM's override to the original CL. That is visible in VM properties, but that's pretty much it; it's not very prominent at the moment and it can't be searched on (bug 1454389). When the cluster update is made there is a dialog informing you, and there's also the pending config change for those running VMs… until you shut the VM down; from that time on it only has the CL override set.
But the real problem is with the DC, which AFAIK does not have an override capability and currently does not have any checks for running VMs. With the above mechanism you can easily get a VM with a CL override (say, 3.6) and a mindlessly updated DC at 4.1… and once you stop such a VM you won't be able to start it anymore, as there is a proper check for an unsupported 3.6 CL VM in a newer DC (implemented by bug 1436577 - Solve DC/Cluster upgrade of VMs with now-unsupported custom compatibility level).
I don't recall. Do we have a warning on data center level as well? Or only cluster level?
Yes, we have a weekly alert for both data centers and clusters which are not upgraded to the latest version (level).
We either need to warn/block on DC upgrade, or implement some kind of a DC override (I guess this is a storage question?)
(Similar to my question above), do we have a way to identify those constraints and whether they are relevant in the environment? And if so, block upgrading of the DC level?
We can add a similar weekly alert for all VMs whose cluster_level_override does not match the data center version, but that's only an alert. We could also prevent data center upgrade if any running VM has a lower cluster_level_override than the cluster level it belongs to. But there is a question: we know that VMs need to be restarted to be properly upgraded to the cluster level, but do they also need to be restarted after the data center level is upgraded? Dan/Allon, could you confirm/refute from the point of view of networking/storage features bound to the data center level?
Thoughts/ideas?
Thanks, michal
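The guard Martin proposes could look roughly like this. A hypothetical Python sketch only: the version tuples stand in for compatibility levels, and none of these names come from the actual oVirt engine code; the same predicate could drive either the blocking check or the weekly alert.

```python
# Hypothetical sketch of the proposed DC-upgrade guard / weekly alert query.
# running_vms maps VM name -> its cluster_level_override (None if unset).

Version = tuple[int, int]  # (major, minor), e.g. (3, 6)

def find_dc_upgrade_blockers(running_vms: dict, target_dc_version: Version) -> list:
    """Return names of running VMs whose CL override is older than the
    target data center compatibility level. An empty list means the
    upgrade may proceed (or the weekly alert has nothing to report)."""
    return [
        name
        for name, override in running_vms.items()
        if override is not None and override < target_dc_version
    ]

# Example: one VM still carries a 3.6 override while the DC goes to 4.1.
vms = {"legacy": (3, 6), "fresh": None, "current": (4, 1)}
print(find_dc_upgrade_blockers(vms, (4, 1)))  # ['legacy']
```

Whether a non-empty result should block the upgrade outright or merely warn is exactly the open question in this thread.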

On 25 May 2017, at 14:26, Oved Ourfali <oourfali@redhat.com> wrote:
On Thu, May 25, 2017 at 12:16 PM, Michal Skrivanek <mskrivan@redhat.com> wrote:
Hi all, I believe that the introduction of bug 1413150 (Add warning to change CL to match the installed engine version) may have an unfortunate consequence: people actually moving forward with the CL and DC without realizing the constraints on running existing VMs. The periodic nagging is likely to make people run into the following issue even more frequently.
Shall we note on the cluster upgrade operation that the user should be aware of the implications?
Cluster upgrade itself is not the problem (anymore), but we don't have these checks and warnings on DC upgrade.
Do we know those constraints in advance, and whether they are relevant in the environment? And if they aren't, could we skip the warning?
Well, yeah, but when we were trying various ways to "motivate" people to do it properly it didn't work very well. It's easy to miss/dismiss. Blocking worked… but was followed by a backlash and reverted.
We have a cluster level override per VM which takes care of compatibility on CL update by setting the VM's override to the original CL. That is visible in VM properties, but that's pretty much it; it's not very prominent at the moment and it can't be searched on (bug 1454389). When the cluster update is made there is a dialog informing you, and there's also the pending config change for those running VMs… until you shut the VM down; from that time on it only has the CL override set.
But the real problem is with the DC, which AFAIK does not have an override capability and currently does not have any checks for running VMs. With the above mechanism you can easily get a VM with a CL override (say, 3.6) and a mindlessly updated DC at 4.1… and once you stop such a VM you won't be able to start it anymore, as there is a proper check for an unsupported 3.6 CL VM in a newer DC (implemented by bug 1436577 - Solve DC/Cluster upgrade of VMs with now-unsupported custom compatibility level).
I don't recall. Do we have a warning on data center level as well? Or only cluster level?
We either need to warn/block on DC upgrade, or implement some kind of a DC override (I guess this is a storage question?)
(Similar to my question above), do we have a way to identify those constraints and whether they are relevant in the environment? And if so, block upgrading of the DC level?
Blocking is a possibility (e.g. when VMs are running with an older CL override), but I do not know what the dependent features are; I suppose it's mostly the storage and network teams' features.
Thanks, michal
participants (3)
-
Martin Perina
-
Michal Skrivanek
-
Oved Ourfali