Problems upgrading oVirt 3.5.4 to 3.6 RC


On Tue, Sep 29, 2015 at 8:28 PM, Adrian Garay <adrian.garay@thaultanklines.com> wrote:
I followed the instructions here <http://www.ovirt.org/OVirt_3.6_Release_Notes> on upgrading from oVirt 3.5.4 to 3.6 RC and have encountered a few problems.
My oVirt 3.5.4 test environment consisted of:

1 CentOS 7.1 host running the hosted engine, stored on a separate NFS server
1 CentOS 7.1 oVirt engine VM
With some research I was able to solve two of the three issues I've experienced. I'll list them here for academia, and perhaps they point to a misstep on my part that is causing the third.
1. Upon a "successful" upgrade, the admin@local account was expired. The problem is documented here <https://bugzilla.redhat.com/show_bug.cgi?id=1261382> and currently caused by following the upgrade instructions as seen here <http://www.ovirt.org/OVirt_3.6_Release_Notes#Install_.2F_Upgrade_from_previous_versions>. Solution was to do the following from the ovirt-engine vm (they may not have all been necessary, it was late!): a. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user password-reset admin --force b. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user password-reset admin --password-valid-to="2019-01-01 12:00:00Z" c. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user edit admin --account-valid-from="2014-01-01 12:00:00Z" --account-valid-to="2019-01-01 12:00:00Z" d. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user unlock admin
2. Rebooting the CentOS 7.1 host caused a loss of the default gateway. The engine does not allow you to modify the host because it is in use, and modifying /etc/sysconfig/network-scripts is undone by VDSM upon the next reboot. I assume this worked in the past because I had a GATEWAY=xxxx entry in /etc/sysconfig/network as a pre-oVirt relic. The solution here was to add gateway and defaultRoute fields using the vdsClient command line utility:

a. vdsClient -s 0 setupNetworks networks='{ovirtmgmt:{ipaddr:10.1.0.21,netmask:255.255.254.0,bonding:bond0,bridged:true,gateway:10.1.1.254,defaultRoute:True}}'
b. vdsClient -s 0 setSafeNetworkConfig
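A quick way to verify the fix survives a reboot, using only standard iproute2 (nothing oVirt-specific assumed):

    # the default route should point at the gateway via the management bridge
    ip route show default
    # expected: default via 10.1.1.254 dev ovirtmgmt

vdsClient -s 0 getVdsCapabilities should likewise report the gateway under the ovirtmgmt network, assuming I read its output correctly.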
Thanks for reporting it. We already have an open bug on that: https://bugzilla.redhat.com/1262431 What you did to manually fix it is correct; we'll try to find a way to properly fix it without user actions.

Now for the issue I can't solve. When I reboot the CentOS 7.1 host I get the following:

[root@ovirt-one /]# hosted-engine --vm-status
You must run deploy first
This message is not coherent. Can you please report the rpm versions you are using? This one should already be fixed.
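For reference, a one-liner that gathers those versions on the host, using plain rpm and grep with no oVirt tooling assumed:

    # list all installed oVirt and VDSM packages with their full versions
    rpm -qa | grep -E '^(ovirt|vdsm)' | sort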
I then notice that the NFS share for the hosted engine is not mounted and that ovirt-ha-agent.service failed to start at boot.
[root@ovirt-one /]# systemctl status ovirt-ha-agent.service
ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled)
   Active: failed (Result: exit-code) since Tue 2015-09-29 12:17:55 CDT; 9min ago
  Process: 1424 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-agent stop (code=exited, status=0/SUCCESS)
  Process: 1210 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent start (code=exited, status=0/SUCCESS)
 Main PID: 1377 (code=exited, status=254)
   CGroup: /system.slice/ovirt-ha-agent.service

Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd-ovirt-ha-agent[1210]: Starting ovirt-ha-agent: [ OK ]
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Sep 29 12:17:55 ovirt-one.thaultanklines.com ovirt-ha-agent[1377]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Service vdsmd is not running and the admin is responsible for starting it. Shutting down.
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=254/n/a
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Unit ovirt-ha-agent.service entered failed state.
Manually starting ovirt-ha-agent.service works: it then correctly mounts the hosted engine NFS share, everything works, and I can eventually start the hosted engine. Why would ovirt-ha-agent.service attempt to start before VDSM was ready?
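One way to confirm the ordering from the failed boot is to interleave the two units' journal entries; a sketch with standard journalctl options:

    # show this boot's messages from both units, in timestamp order
    journalctl -b -u vdsmd.service -u ovirt-ha-agent.service --no-pager

If the agent's "Service vdsmd is not running" line predates vdsmd's startup messages, the unit ordering quoted below is the likely culprit.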
Snippet from /usr/lib/systemd/system/ovirt-ha-agent.service:

[Unit]
Description=oVirt Hosted Engine High Availability Monitoring Agent
Wants=ovirt-ha-broker.service
Wants=vdsmd.service
In the past ovirt-ha-agent started the VDSM service directly because we also supported el6; now we rely on systemd alone for that, but probably something should be fixed on that side. Thanks for reporting.
Wants=sanlock.service
After=ovirt-ha-broker.service
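Note that the unit declares Wants=vdsmd.service but no After=vdsmd.service, so systemd is free to start both in parallel. Until the shipped unit is fixed, a local drop-in adding the missing ordering should work as a stopgap; a sketch using the standard systemd override mechanism (the drop-in file name is arbitrary):

    # add an ordering dependency on vdsmd without editing the packaged unit
    mkdir -p /etc/systemd/system/ovirt-ha-agent.service.d
    cat > /etc/systemd/system/ovirt-ha-agent.service.d/10-after-vdsmd.conf <<'EOF'
    [Unit]
    After=vdsmd.service
    EOF
    systemctl daemon-reload

This only changes ordering, not dependencies; the Wants=vdsmd.service already present in the stock unit is what pulls vdsmd in.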
Any help would be appreciated!


On Thu, Oct 1, 2015 at 1:54 AM, Adrian Garay <adrian.garay@thaultanklines.com> wrote:
These are the packages I have installed:
ovirt-release35-005-1.noarch
ovirt-release36-001-0.5.beta.noarch
ovirt-vmconsole-1.0.0-0.0.master.20150821105434.gite14b2f0.el7.noarch
ovirt-vmconsole-host-1.0.0-0.0.master.20150821105434.gite14b2f0.el7.noarch
ovirt-setup-lib-1.0.0-0.0.master.20150812132738.git6a54bc0.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.0-0.0.master.20150909150556.20150909150548.git9a2bd43.el7.noarch
This one, like the others, is pretty old: the latest one in the main repo is http://resources.ovirt.org/pub/ovirt-3.6-pre/rpm/el7/noarch/ovirt-hosted-eng... So maybe the yum mirror you downloaded from was out of sync. Could you please try to run yum update? If it doesn't find a new set of rpms, please edit /etc/yum.repos.d/ovirt-3.6.repo, commenting out the mirrorlist line and uncommenting the baseurl one, then clean the yum metadata and try again.
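Something along these lines should do it (a sketch; the exact mirrorlist= and baseurl= key names in ovirt-3.6.repo are assumed, so check the file before running the sed):

    # switch the repo from the mirror list to the main baseurl
    sed -i -e 's/^mirrorlist=/#mirrorlist=/' -e 's/^#baseurl=/baseurl=/' /etc/yum.repos.d/ovirt-3.6.repo
    yum clean metadata
    yum update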
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150909090214.git794400d.el7.centos.noarch
ovirt-host-deploy-1.4.0-0.0.master.20150806005708.git670e9c8.el7.noarch
ovirt-engine-sdk-python-3.6.0.1-1.20150909.gitbf05a3a.el7.centos.noarch

vdsm-python-4.17.6-0.el7.centos.noarch
vdsm-yajsonrpc-4.17.6-0.el7.centos.noarch
vdsm-4.17.6-0.el7.centos.noarch
vdsm-hook-smbios-4.17.6-0.el7.centos.noarch
vdsm-python-zombiereaper-4.16.26-0.el7.centos.noarch
vdsm-infra-4.17.6-0.el7.centos.noarch
vdsm-xmlrpc-4.17.6-0.el7.centos.noarch
vdsm-jsonrpc-4.17.6-0.el7.centos.noarch
vdsm-cli-4.17.6-0.el7.centos.noarch
Thanks!
On 9/30/2015 3:09 AM, Simone Tiraboschi wrote:
Now for the issue I can't solve. When I reboot the CentOS 7.1 host I get the following:

[root@ovirt-one /]# hosted-engine --vm-status
You must run deploy first
This message is not coherent. Can you please report the rpm versions you are using? This one should already be fixed.