[Call for feedback] Please also share your successful upgrade experiences

Hi, now that oVirt 4.2.0 has been released, we're starting to see some reports about issues that, for now, are related to not-so-common deployments. We'd also like to get some feedback from those who upgraded to this amazing release without any issue, and to add this positive feedback under our developers' (digital) Christmas tree as a gift for the effort put into this release. Looking forward to your positive reports!

Not having positive feedback? Let us know too! We are making an effort over the next weeks to promptly assist anyone who hits trouble during or after the upgrade. Let us know on this users@ovirt.org mailing list (preferred) or on IRC using the irc.oftc.net server and the #ovirt channel.

We are also closely monitoring bugzilla.redhat.com for new bugs in the oVirt project, so you can report issues there as well.

Thanks,
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

Did the upgrade to 4.2 yesterday. Everything went smoothly except for a few glitches.

I have a 4-host install - 3 gluster hosts and one node with local storage. One of the gluster servers failed to start its brick with a Peer Rejected status; that was solved very quickly by googling. On the node I hit an old bug - it can't be upgraded since version 4.1.5 - so in the end it seems I just need to reinstall it from scratch.

Best regards,
Misak Khachatryan

Seems an upgrade guide is more than needed for a 4.1 to 4.2 upgrade. Perhaps, with all the feedback coming in, that can be done.


2017-12-21 15:33 GMT+01:00 Sandro Bonazzola <sbonazzo@redhat.com>:
2017-12-21 15:16 GMT+01:00 FERNANDO FREDIANI <fernando.frediani@upx.com>:
Seems an upgrade guide is more than needed for a 4.1 to 4.2 upgrade. Perhaps, with all the feedback coming in, that can be done.
Sorry, the previous mail was sent by mistake. We have an upgrade guide which indeed needs some love: https://ovirt.org/documentation/upgrade-guide/upgrade-guide/ But upgrading from 4.0 to 4.1 and from 4.1 to 4.2 is no different; only a different ovirt-release rpm needs to be used.
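For a standalone engine this means the usual sequence, just pointing at the 4.2 release package. Roughly, and assuming the release rpm location documented in the guide linked above:

    # On the engine machine
    yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm
    yum update "ovirt-*-setup*"
    engine-setup
    # then bring the remaining packages up to date
    yum update

The only per-release change is the ovirt-release rpm (ovirt-release41 vs ovirt-release42); everything else follows the linked upgrade guide.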
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

Hi Sandro, nice to see the final 4.2!

I successfully updated a test/lab nested HCI cluster from oVirt 4.1.7 + Gluster 3.10 to oVirt 4.2 + Gluster 3.12 (automatically picked by the upgrade); 3 hosts with CentOS 7.4.

Basically following the procedure here: https://ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine with steps 5 and 6 substituted by a reboot of the upgraded host.

My running C6 VM had no downtime during the upgrade of the three hosts.

The only problem I registered was at the first start of the engine on the first upgraded host (that should be step 7 in the link above), where I got an error on the engine (not able to access the web admin portal); I see this in server.log (full attachment here: https://drive.google.com/file/d/1UQAllZfjueVGkXDsBs09S7THGDFn9YPa/view?usp=sharing ):

2017-12-22 00:40:17,674+01 INFO [org.quartz.core.QuartzScheduler] (ServerService Thread Pool -- 63) Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2017-12-22 00:40:17,682+01 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started timeout-base cache from ovirt-engine container
2017-12-22 00:44:28,611+01 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[ ("core-service" => "management"), ("management-interface" => "native-interface") ]'
2017-12-22 00:44:28,722+01 INFO [org.wildfly.extension.undertow] (ServerService Thread Pool -- 65) WFLYUT0022: Unregistered web context: '/ovirt-engine/apidoc' from server 'default-server'

Then I restarted the hosted engine VM on the same first upgraded host and it was now able to start correctly, with the web admin portal OK. The corresponding lines in server.log had become:

2017-12-22 00:48:17,536+01 INFO [org.quartz.core.QuartzScheduler] (ServerService Thread Pool -- 63) Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2017-12-22 00:48:17,545+01 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started timeout-base cache from ovirt-engine container
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "ovirt-web-ui.war" (runtime-name : "ovirt-web-ui.war")
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "restapi.war" (runtime-name : "restapi.war")
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "engine.ear" (runtime-name : "engine.ear")
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "apidoc.war" (runtime-name : "apidoc.war")
2017-12-22 00:48:24,175+01 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2017-12-22 00:48:24,219+01 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://127.0.0.1:8706/management

I also updated the cluster and DC compatibility level to 4.2.

Gianluca
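For anyone wanting to follow the same path, the hosted-engine flow referenced above amounts to roughly the following sketch (based on the linked how-to; check the guide for the exact step numbers, and upgrade one host at a time with the host in maintenance):

    # Engine upgrade: enable global maintenance first
    hosted-engine --set-maintenance --mode=global
    # ...inside the engine VM: install the 4.2 release rpm, yum update "ovirt-*-setup*", engine-setup...
    hosted-engine --set-maintenance --mode=none

    # Host upgrade, repeated per host (host in maintenance from the Administration Portal)
    yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm
    yum update
    reboot                      # used here in place of steps 5 and 6 of the guide
    hosted-engine --vm-status   # verify HA agent/broker state once the host is back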

2017-12-22 11:44 GMT+01:00 Gianluca Cecchi <gianluca.cecchi@gmail.com>:
Hi Sandro, nice to see final 4.2!
I successfully updated a test/lab nested HCI cluster from oVirt 4.1.7 + Gluster 3.10 to oVirt 4.2 + Gluster 3.12 (automatically picked by the upgrade); 3 hosts with CentOS 7.4.
Thanks for the report Gianluca!
Basically following here: https://ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
steps 5,6 substituted by reboot of the upgraded host.
My running C6 VM had no downtime during upgrade of the three hosts
Only problem I registered was at the first start of the engine on the first upgraded host (that should be step 7 in the link above), where I got an error on the engine (not able to access the web admin portal); I see this in server.log (full attachment here: https://drive.google.com/file/d/1UQAllZfjueVGkXDsBs09S7THGDFn9YPa/view?usp=sharing )
2017-12-22 00:40:17,674+01 INFO [org.quartz.core.QuartzScheduler] (ServerService Thread Pool -- 63) Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2017-12-22 00:40:17,682+01 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started timeout-base cache from ovirt-engine container
2017-12-22 00:44:28,611+01 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[ ("core-service" => "management"), ("management-interface" => "native-interface") ]'
Adding Vaclav, maybe something in Wildfly? Martin, any hint on engine side?
2017-12-22 00:44:28,722+01 INFO [org.wildfly.extension.undertow] (ServerService Thread Pool -- 65) WFLYUT0022: Unregistered web context: '/ovirt-engine/apidoc' from server 'default-server'
Then I restarted the hosted engine VM on the same first upgraded host and it was now able to start correctly, with the web admin portal OK. The corresponding lines in server.log had become:
2017-12-22 00:48:17,536+01 INFO [org.quartz.core.QuartzScheduler] (ServerService Thread Pool -- 63) Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2017-12-22 00:48:17,545+01 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started timeout-base cache from ovirt-engine container
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "ovirt-web-ui.war" (runtime-name : "ovirt-web-ui.war")
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "restapi.war" (runtime-name : "restapi.war")
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "engine.ear" (runtime-name : "engine.ear")
2017-12-22 00:48:23,248+01 INFO [org.jboss.as.server] (ServerService Thread Pool -- 25) WFLYSRV0010: Deployed "apidoc.war" (runtime-name : "apidoc.war")
2017-12-22 00:48:24,175+01 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2017-12-22 00:48:24,219+01 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://127.0.0.1:8706/management
I also updated cluster and DC level to 4.2.
Gianluca
-- SANDRO BONAZZOLA ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D Red Hat EMEA <https://www.redhat.com/> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

On Fri, Dec 22, 2017 at 11:59 AM, Sandro Bonazzola <sbonazzo@redhat.com> wrote:
2017-12-22 11:44 GMT+01:00 Gianluca Cecchi <gianluca.cecchi@gmail.com>:
Only problem I registered was at the first start of the engine on the first upgraded host (that should be step 7 in the link above), where I got an error on the engine (not able to access the web admin portal); I see this in server.log (full attachment here: https://drive.google.com/file/d/1UQAllZfjueVGkXDsBs09S7THGDFn9YPa/view?usp=sharing )
2017-12-22 00:40:17,674+01 INFO [org.quartz.core.QuartzScheduler] (ServerService Thread Pool -- 63) Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2017-12-22 00:40:17,682+01 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started timeout-base cache from ovirt-engine container
2017-12-22 00:44:28,611+01 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[ ("core-service" => "management"), ("management-interface" => "native-interface") ]'
Adding Vaclav, maybe something in Wildfly? Martin, any hint on engine side?
Yeah, I've already seen such an error a few times; it usually happens when access to storage is really slow or the host itself is overloaded, and WildFly is not able to start up properly before the default 300-second interval is over. If this is going to happen often, we will have to raise that timeout for all installations.
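For reference, the timeout in the WFLYCTL0348 message is WildFly's service container stability timeout, controlled by the jboss.as.management.blocking.timeout system property. On an oVirt engine it can typically be raised through an engine.conf.d snippet; the file name below is illustrative and the variable should be verified against your ovirt-engine version:

    # /etc/ovirt-engine/engine.conf.d/99-wildfly-timeout.conf (illustrative name)
    # Raise the WildFly service container stability timeout from 300 to 600 seconds
    ENGINE_PROPERTIES="${ENGINE_PROPERTIES} jboss.as.management.blocking.timeout=600"

After adding the snippet, restart the ovirt-engine service for it to take effect.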
-- Martin Perina Associate Manager, Software Engineering Red Hat Czech s.r.o.

On Fri, Dec 22, 2017 at 12:06 PM, Martin Perina <mperina@redhat.com> wrote:
Yeah, I've already seen such an error a few times; it usually happens when access to storage is really slow or the host itself is overloaded, and WildFly is not able to start up properly before the default 300-second interval is over.
If this is going to happen often, we will have to raise that timeout for all installations.
This is under investigation here: https://bugzilla.redhat.com/show_bug.cgi?id=1528292 We have had more than one occurrence in fresh installs with the new ansible flow but, up to now, no evidence on upgrades.
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

On Fri, Dec 22, 2017 at 12:06 PM, Martin Perina <mperina@redhat.com> wrote:
Yeah, I've already seen such an error a few times; it usually happens when access to storage is really slow or the host itself is overloaded, and WildFly is not able to start up properly before the default 300-second interval is over.
If this is going to happen often, we will have to raise that timeout for all installations.
Ok, thanks for the confirmation of what I suspected. Actually this particular environment is based on a single NUC where I have ESXi 6.0U2, and the 3 hosts are actually VMs of this vSphere environment, so the hosted engine is an L2 VM. And the replica 3 (with arbiter) volume of the hosts ends up, at the bottom, on a single physical SSD disk... All in all it is already a great success that this kind of infrastructure was able to update from 4.1 to 4.2. I use it basically for functional testing, and btw on vSphere there is also another CentOS 7 VM running ;-)

Thanks,
Gianluca

Well, I finally upgraded everything to 4.2. Unfortunately I broke a server in the upgrade process and needed much more time than expected (the host was running the Hosted Engine and Gluster). I won't go further into those problems, because I interrupted the upgrade process to try to fix it and ended up with a kernel panic. I recommend using tmux or screen for the upgrade.

My experience:

I used this tutorial for the upgrade process: https://ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
You can use it for the Hosted Engine VM and the hosts. Be patient, don't interrupt the yum update process, and follow the instructions.
If you have a locale different from en_US.UTF-8, please change it to en_US.UTF-8 before the upgrade process. Where? /etc/locale.conf on CentOS. If you are using e.g. Puppet, please deactivate it during the upgrade process to avoid the locale being changed back (if you have something in Puppet that changes it).

Problems:

I'm still having problems with GlusterFS (Peer Rejected) and it is unstable, but that probably happens because I copied the Gluster UUID from the broken server and added the newly installed server with the same IP back to the cluster.

IMHO: please check that you have the engine backup done. Save it somewhere - NFS, rsync it to another server... When running engine-setup after the yum update on the ovirt-engine VM, don't do the in-place upgrade of PostgreSQL. It's really nice to have, but if you can avoid the risk, why not do so? Keep the backup from PostgreSQL.

Previous version: 4.1.x
Updated to: 4.2

Setup: CentOS 7.4.108, 4 servers, 3 with Gluster for the engine.

If you have questions....

Best Regards,
Gabriel

Gabriel Stein
------------------------------
Gabriel Ferraz Stein
Tel.: +49 (0) 170 2881531
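Gabriel's backup advice maps onto the standard engine-backup tool; a minimal sketch, with illustrative paths and host names:

    # On the engine VM, before starting the upgrade
    engine-backup --mode=backup \
        --file=/var/backup/engine-$(date +%Y%m%d).tar.gz \
        --log=/var/backup/engine-$(date +%Y%m%d).log

    # Keep a copy off the engine VM, e.g. on another server or an NFS export
    rsync -av /var/backup/ backuphost:/srv/ovirt-backups/

The backup includes the engine database, so even if the optional in-place PostgreSQL upgrade is skipped or goes wrong, the state can be brought back with engine-backup --mode=restore.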

On Sun, Dec 24, 2017 at 12:41 AM, Gabriel Stein <gabrielstein@gmail.com> wrote:
I'm still having problems with GlusterFS (Peer Rejected) and it is unstable, but that probably happens because I copied the Gluster UUID from the broken server and added the newly installed server with the same IP back to the cluster.
Are your problems with GlusterFS resolved? If not, please reach out to us and to gluster-users@gluster.org for help.
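For reference, the commonly documented recovery for a node stuck in Peer Rejected state is roughly the following. This is a sketch of the generic procedure (not something confirmed in this thread), to be run on the rejected node and double-checked against the Gluster documentation before deleting anything:

    # On the rejected node: stop glusterd and clear its state,
    # keeping only the node's own UUID file (glusterd.info)
    systemctl stop glusterd
    cd /var/lib/glusterd
    ls | grep -v glusterd.info | xargs rm -rf
    systemctl start glusterd

    # Re-probe a healthy peer, then restart glusterd and verify
    gluster peer probe <healthy-peer-hostname>
    systemctl restart glusterd
    gluster peer status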
participants (8)
- FERNANDO FREDIANI
- Gabriel Stein
- Gianluca Cecchi
- Martin Perina
- Misak Khachatryan
- Sahina Bose
- Sandro Bonazzola
- Simone Tiraboschi