Problems upgrading Ovirt 3.5.4 to 3.6 RC

29 Sep 2015

      --------------070306020106050808000908
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit

I followed the instructions here 
<http://www.ovirt.org/OVirt_3.6_Release_Notes> on upgrading to Ovirt 
3.6RC from 3.5.4 and have encountered a few problems.

My Ovirt 3.5.4 test environment consisted of:

1 Centos 7.1 host running hosted engine stored on a separate NFS server
1 Centos 7.1 ovirt engine vm

With some research I was able to solve two of the three issues I've 
experienced.  I'll list them here for acadamia - and perhaps they point 
to a misstep on my behalf that is causing the third.

1. Upon a "successful" upgrade, the admin@local account was expired.  
The problem is documented here 
<https://bugzilla.redhat.com/show_bug.cgi?id=1261382> and currently 
caused by following the upgrade instructions as seen here 
<http://www.ovirt.org/OVirt_3.6_Release_Notes#Install_.2F_Upgrade_from_previous_versions>. 
Solution was to do the following from the ovirt-engine vm (they may not 
have all been necessary, it was late!):
     a. ovirt-aaa-jdbc-tool 
--db-config=/etc/ovirt-engine/aaa/internal.properties user 
password-reset admin --force
     b. ovirt-aaa-jdbc-tool 
--db-config=/etc/ovirt-engine/aaa/internal.properties user 
password-reset admin --password-valid-to="2019-01-01 12:00:00Z"
     c. ovirt-aaa-jdbc-tool 
--db-config=/etc/ovirt-engine/aaa/internal.properties user edit admin 
--account-valid-from="2014-01-01 12:00:00Z" 
--account-valid-to="2019-01-01 12:00:00Z"
     d. ovirt-aaa-jdbc-tool 
--db-config=/etc/ovirt-engine/aaa/internal.properties user unlock admin

2. Rebooting the Centos 7.1 host caused a loss of default gateway. The 
engine does not allow you to modify the host because it is in use and 
modifying /etc/sysconfig/network-scripts is undone by VDSM upon the next 
reboot.  I assume in the past this worked okay because I had a 
GATEWAY=xxxx in /etc/sysconfig/network as a pre-Ovirt relic.  Solution 
here was to add gateway and defaultRoute fields using the vdsClient 
command line utility:
     a. vdsClient -s 0 setupNetworks 
networks='{ovirtmgmt:{ipaddr:10.1.0.21,netmask:255.255.254.0,bonding:bond0,bridged:true,gateway:10.1.1.254,defaultRoute:True}}'
     b. vdsClient -s 0 setSafeNetworkConfig

Now for the issue I can't solve.  When I reboot the Centos 7.1 host I 
get the following:

[root@ovirt-one /]# hosted-engine --vm-status
You must run deploy first

I then notice that the NFS share to the hosted engine is not mounted and 
the ovirt-ha-agent.service has failed to start itself at boot.

[root@ovirt-one /]# systemctl status ovirt-ha-agent.service
ovirt-ha-agent.service - oVirt Hosted Engine High Availability 
Monitoring Agent
    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled)
    Active: failed (Result: exit-code) since Tue 2015-09-29 12:17:55 
CDT; 9min ago
   Process: 1424 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-agent stop 
(code=exited, status=0/SUCCESS)
   Process: 1210 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent start 
(code=exited, status=0/SUCCESS)
  Main PID: 1377 (code=exited, status=254)
    CGroup: /system.slice/ovirt-ha-agent.service

Sep 29 12:17:55 ovirt-one.thaultanklines.com 
systemd-ovirt-ha-agent[1210]: Starting ovirt-ha-agent: [  OK  ]
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Started oVirt 
Hosted Engine High Availability Monitoring Agent.
Sep 29 12:17:55 ovirt-one.thaultanklines.com ovirt-ha-agent[1377]: 
ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine 
ERROR Service vdsmd is not running and the admin is responsible for 
starting it. Shutting down.
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: 
ovirt-ha-agent.service: main process exited, code=exited, status=254/n/a
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Unit 
ovirt-ha-agent.service entered failed state.

Manually starting ovirt-ha-agent.service works and it then correctly 
mounts the hosted engine NFS share and all works and I can eventually 
start the hosted engine.  Why would the ovirt-ha-agent.service attempt 
to start before VDSM was ready?

Snippet from /usr/lib/systemd/system/ovirt-ha-agent.service
[Unit]
Description=oVirt Hosted Engine High Availability Monitoring Agent
Wants=ovirt-ha-broker.service
Wants=vdsmd.service
Wants=sanlock.service
After=ovirt-ha-broker.service

Any help would be appreciated!

--------------070306020106050808000908
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: 8bit

<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    I followed the instructions <a
      href="http://www.ovirt.org/OVirt_3.6_Release_Notes">here</a> on
    upgrading to Ovirt 3.6RC from 3.5.4 and have encountered a few
    problems.<br>
    <br>
    My Ovirt 3.5.4 test environment consisted of:<br>
    <br>
    1 Centos 7.1 host running hosted engine stored on a separate NFS
    server<br>
    1 Centos 7.1 ovirt engine vm<br>
    <br>
    With some research I was able to solve two of the three issues I've
    experienced.  I'll list them here for acadamia - and perhaps they
    point to a misstep on my behalf that is causing the third.<br>
    <br>
    1. Upon a "successful" upgrade, the admin@local account was
    expired.  The problem is <a
      href="https://bugzilla.redhat.com/show_bug.cgi?id=1261382">documented
      here</a> and currently caused by following the upgrade
    instructions <a
href="http://www.ovirt.org/OVirt_3.6_Release_Notes#Install_.2F_Upgrade_from_previous_versions">as
      seen here</a>. Solution was to do the following from the
    ovirt-engine vm (they may not have all been necessary, it was
    late!):<br>
        a. ovirt-aaa-jdbc-tool
    --db-config=/etc/ovirt-engine/aaa/internal.properties user
    password-reset admin --force<br>
        b. ovirt-aaa-jdbc-tool
    --db-config=/etc/ovirt-engine/aaa/internal.properties user
    password-reset admin --password-valid-to="2019-01-01 12:00:00Z"<br>
        c. ovirt-aaa-jdbc-tool
    --db-config=/etc/ovirt-engine/aaa/internal.properties user edit
    admin --account-valid-from="2014-01-01 12:00:00Z"
    --account-valid-to="2019-01-01 12:00:00Z"<br>
        d. ovirt-aaa-jdbc-tool
    --db-config=/etc/ovirt-engine/aaa/internal.properties user unlock
    admin<br>
    <br>
    2. Rebooting the Centos 7.1 host caused a loss of default gateway. 
    The engine does not allow you to modify the host because it is in
    use and modifying /etc/sysconfig/network-scripts is undone by VDSM
    upon the next reboot.  I assume in the past this worked okay because
    I had a GATEWAY=xxxx in /etc/sysconfig/network as a pre-Ovirt
    relic.  Solution here was to add gateway and defaultRoute fields
    using the vdsClient command line utility:<br>
        a. vdsClient -s 0 setupNetworks
networks='{ovirtmgmt:{ipaddr:10.1.0.21,netmask:255.255.254.0,bonding:bond0,bridged:true,gateway:10.1.1.254,defaultRoute:True}}'<br>
        b. vdsClient -s 0 setSafeNetworkConfig<br>
    <br>
    Now for the issue I can't solve.  When I reboot the Centos 7.1 host
    I get the following:<br>
    <br>
    [root@ovirt-one /]# hosted-engine --vm-status<br>
    You must run deploy first<br>
    <br>
    I then notice that the NFS share to the hosted engine is not mounted
    and the ovirt-ha-agent.service has failed to start itself at boot.<br>
    <br>
    [root@ovirt-one /]# systemctl status ovirt-ha-agent.service<br>
    ovirt-ha-agent.service - oVirt Hosted Engine High Availability
    Monitoring Agent<br>
       Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service;
    enabled)<br>
       Active: failed (Result: exit-code) since Tue 2015-09-29 12:17:55
    CDT; 9min ago<br>
      Process: 1424 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-agent
    stop (code=exited, status=0/SUCCESS)<br>
      Process: 1210 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent
    start (code=exited, status=0/SUCCESS)<br>
     Main PID: 1377 (code=exited, status=254)<br>
       CGroup: /system.slice/ovirt-ha-agent.service<br>
    <br>
    Sep 29 12:17:55 ovirt-one.thaultanklines.com
    systemd-ovirt-ha-agent[1210]: Starting ovirt-ha-agent: [  OK  ]<br>
    Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Started
    oVirt Hosted Engine High Availability Monitoring Agent.<br>
    Sep 29 12:17:55 ovirt-one.thaultanklines.com ovirt-ha-agent[1377]:
    ovirt-ha-agent
    ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR
    Service vdsmd is not running and the admin is responsible for
    starting it. Shutting down.<br>
    Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]:
    ovirt-ha-agent.service: main process exited, code=exited,
    status=254/n/a<br>
    Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Unit
    ovirt-ha-agent.service entered failed state.<br>
    <br>
    Manually starting ovirt-ha-agent.service works and it then correctly
    mounts the hosted engine NFS share and all works and I can
    eventually start the hosted engine.  Why would the
    ovirt-ha-agent.service attempt to start before VDSM was ready?<br>
    <br>
    Snippet from /usr/lib/systemd/system/ovirt-ha-agent.service<br>
    [Unit]<br>
    Description=oVirt Hosted Engine High Availability Monitoring Agent<br>
    Wants=ovirt-ha-broker.service<br>
    Wants=vdsmd.service<br>
    Wants=sanlock.service<br>
    After=ovirt-ha-broker.service<br>
    <br>
    Any help would be appreciated!<br>
    <br>
  </body>
</html>

--------------070306020106050808000908--

Adrian Garay

Simone Tiraboschi

Adrian Garay

Simone Tiraboschi

tags

participants (2)