On Tue, Sep 29, 2015 at 8:28 PM, Adrian Garay <adrian.garay@thaultanklines.com> wrote:
I followed the instructions here on upgrading to oVirt 3.6 RC from 3.5.4 and have encountered a few problems.

My oVirt 3.5.4 test environment consisted of:

1 CentOS 7.1 host running the hosted engine, stored on a separate NFS server
1 CentOS 7.1 oVirt engine VM

With some research I was able to solve two of the three issues I've experienced. I'll list them here for posterity, and perhaps they point to a misstep on my part that is causing the third.

1. Upon a "successful" upgrade, the admin@local account was expired. The problem is documented here and is currently caused by following the upgrade instructions as seen here. The solution was to run the following from the ovirt-engine VM (they may not all have been necessary; it was late!):
    a. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user password-reset admin --force
    b. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user password-reset admin --password-valid-to="2019-01-01 12:00:00Z"
    c. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user edit admin --account-valid-from="2014-01-01 12:00:00Z" --account-valid-to="2019-01-01 12:00:00Z"
    d. ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties user unlock admin
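After running the steps above, the account state can be double-checked with the same tool (a quick verification sketch; it assumes the `user show` subcommand present in the ovirt-aaa-jdbc-tool shipped with 3.6):

```shell
# Print the admin account's attributes, including the validity
# dates set above and whether the account is still locked.
ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties \
    user show admin
```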

2. Rebooting the CentOS 7.1 host caused a loss of the default gateway. The engine does not allow you to modify the host because it is in use, and changes to /etc/sysconfig/network-scripts are undone by VDSM on the next reboot. I assume this worked in the past because I had GATEWAY=xxxx in /etc/sysconfig/network as a pre-oVirt relic. The solution here was to add gateway and defaultRoute fields using the vdsClient command-line utility:
    a. vdsClient -s 0 setupNetworks networks='{ovirtmgmt:{ipaddr:10.1.0.21,netmask:255.255.254.0,bonding:bond0,bridged:true,gateway:10.1.1.254,defaultRoute:True}}'
    b. vdsClient -s 0 setSafeNetworkConfig
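To confirm the gateway actually persisted, the network definition and the kernel routing table can be inspected after a reboot (a sketch only; the getVdsCaps output layout varies between VDSM versions):

```shell
# Show the running network definition as VDSM sees it; the
# ovirtmgmt gateway set via setupNetworks should appear here.
vdsClient -s 0 getVdsCaps | grep -i -A 10 ovirtmgmt

# Confirm the kernel routing table picked up the default route.
ip route show default
```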

Thanks for reporting it.
We already have an open bug on that 
https://bugzilla.redhat.com/1262431
What you did to manually fix it is correct; we'll try to find a way to fix it properly without requiring user action.

Now for the issue I can't solve. When I reboot the CentOS 7.1 host I get the following:

[root@ovirt-one /]# hosted-engine --vm-status
You must run deploy first

This message is not coherent.
Can you please report the rpm version you are using?
This one should already be fixed.
 
I then noticed that the NFS share for the hosted engine was not mounted and that ovirt-ha-agent.service had failed to start at boot.

[root@ovirt-one /]# systemctl status ovirt-ha-agent.service
ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled)
   Active: failed (Result: exit-code) since Tue 2015-09-29 12:17:55 CDT; 9min ago
  Process: 1424 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-agent stop (code=exited, status=0/SUCCESS)
  Process: 1210 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent start (code=exited, status=0/SUCCESS)
 Main PID: 1377 (code=exited, status=254)
   CGroup: /system.slice/ovirt-ha-agent.service

Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd-ovirt-ha-agent[1210]: Starting ovirt-ha-agent: [  OK  ]
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Sep 29 12:17:55 ovirt-one.thaultanklines.com ovirt-ha-agent[1377]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Service vdsmd is not running and the admin is responsible for starting it. Shutting down.
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=254/n/a
Sep 29 12:17:55 ovirt-one.thaultanklines.com systemd[1]: Unit ovirt-ha-agent.service entered failed state.

Manually starting ovirt-ha-agent.service works: it then correctly mounts the hosted-engine NFS share and I can eventually start the hosted engine. Why would ovirt-ha-agent.service attempt to start before VDSM was ready?

Snippet from /usr/lib/systemd/system/ovirt-ha-agent.service:
[Unit]
Description=oVirt Hosted Engine High Availability Monitoring Agent
Wants=ovirt-ha-broker.service
Wants=vdsmd.service

In the past, ovirt-ha-agent started the VDSM service directly because we also supported el6; now we rely on systemd for that, but something probably needs to be fixed on that side.
Thanks for reporting.
 
Wants=sanlock.service
After=ovirt-ha-broker.service
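Until that is fixed upstream, one local workaround (a sketch only; the drop-in path is a standard systemd convention, not something shipped by oVirt, and the unit names are taken from the snippet above) is to add an ordering dependency so the agent waits for VDSM:

```shell
# Create a systemd drop-in that orders ovirt-ha-agent after vdsmd.
mkdir -p /etc/systemd/system/ovirt-ha-agent.service.d
cat > /etc/systemd/system/ovirt-ha-agent.service.d/after-vdsmd.conf <<'EOF'
[Unit]
# Wants= alone pulls vdsmd in but does not order the two units;
# After= makes systemd delay this service until vdsmd has started.
After=vdsmd.service
EOF
systemctl daemon-reload
```

This only changes startup ordering on the local host; it does not modify the packaged unit file, so it survives package updates.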

Any help would be appreciated!


_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users