[ovirt-users] hosted engine setup failed for 10 minutes delay.. engine seems alive

Simone Tiraboschi stirabos at redhat.com
Mon May 2 07:58:50 UTC 2016


On Sat, Apr 30, 2016 at 10:59 PM, Gianluca Cecchi
<gianluca.cecchi at gmail.com> wrote:
> Hello,
> trying to deploy a self hosted engine on an Intel NUC6i5SYB with CentOS 7.2
> using oVirt 3.6.5 and appliance (picked up rpm is
> ovirt-engine-appliance-3.6-20160420.1.el7.centos.noarch)
>
> Near the end of the command
> hosted-engine --deploy
>
> I get
> ...
>           |- [ INFO  ] Initializing PostgreSQL
>           |- [ INFO  ] Creating PostgreSQL 'engine' database
>           |- [ INFO  ] Configuring PostgreSQL
>           |- [ INFO  ] Creating/refreshing Engine database schema
>           |- [ INFO  ] Creating/refreshing Engine 'internal' domain database
> schema
> [ ERROR ] Engine setup got stuck on the appliance
> [ ERROR ] Failed to execute stage 'Closing up': Engine setup is stalled on
> the appliance since 600 seconds ago. Please check its log on the appliance.
> [ INFO  ] Stage: Clean up
> [ INFO  ] Generating answer file
> '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160430200654.conf'
> [ INFO  ] Stage: Pre-termination
> [ INFO  ] Stage: Termination
> [ ERROR ] Hosted Engine deployment failed: this system is not reliable,
> please check the issue, fix and redeploy
>
> On host log I indeed see the 10 minutes timeout:
>
> 2016-04-30 19:56:52 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:SEND                 |- [ INFO  ]
> Creating/refreshing Engine 'internal' domain database schema
> 2016-04-30 20:06:53 ERROR
> otopi.plugins.ovirt_hosted_engine_setup.engine.health health._closeup:140
> Engine setup got stuck on the appliance
>
> On engine I don't see any particular problem but a ten minutes delay in its
> log:
>
> 2016-04-30 17:56:57 DEBUG otopi.context context.dumpEnvironment:514
> ENVIRONMENT DUMP - END
> 2016-04-30 17:56:57 DEBUG otopi.context context._executeMethod:142 Stage
> misc METHOD
> otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc.Plugin._setupAdminPassword
> 2016-04-30 17:56:57 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc
> plugin.executeRaw:828 execute: ('/usr/bin/ovirt-aaa-jdbc-tool',
> '--db-config=/etc/ovirt-engine/aaa/internal.properties', 'user',
> 'password-reset', 'admin', '--password=env:pass', '--force',
> '--password-valid-to=2216-03-13 17:56:57Z'), executable='None', cwd='None',
> env={'LANG': 'en_US.UTF-8', 'SHLVL': '1', 'PYTHONPATH':
> '/usr/share/ovirt-engine/setup/bin/..::', 'pass': '**FILTERED**',
> 'OVIRT_ENGINE_JAVA_HOME_FORCE': '1', 'PWD': '/', 'OVIRT_ENGINE_JAVA_HOME':
> u'/usr/lib/jvm/jre', 'PATH':
> '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin', 'OTOPI_LOGFILE':
> '/var/log/ovirt-engine/setup/ovirt-engine-setup-20160430175551-dttt2p.log',
> 'OVIRT_JBOSS_HOME': '/usr/share/ovirt-engine-wildfly', 'OTOPI_EXECDIR': '/'}
> 2016-04-30 18:07:06 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc
> plugin.executeRaw:878 execute-result: ('/usr/bin/ovirt-aaa-jdbc-tool',
> '--db-config=/etc/ovirt-engine/aaa/internal.properties', 'user',
> 'password-reset', 'admin', '--password=env:pass', '--force',
> '--password-valid-to=2216-03-13 17:56:57Z'), rc=0
>
> and its last lines are:
>
> 2016-04-30 18:07:06 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc
> plugin.execute:936 execute-output: ('/usr/bin/ovirt-aaa-jdbc-tool',
> '--db-config=/etc/ovirt-engine/aaa/internal.properties', 'user',
> 'password-reset', 'admin', '--password=env:pass', '--force',
> '--password-valid-to=2216-03-13 17:56:57Z') stdout:
> updating user admin...
> user updated successfully

hosted-engine-setup creates a fresh VM and injects a cloud-init script
to configure it and to run engine-setup there, which configures the
engine as needed.
Since engine-setup runs on the engine VM, triggered by cloud-init,
hosted-engine-setup has no way to directly monitor its process status,
so we simply gather its output with a timeout of 10 minutes between
consecutive output lines.
If nothing happens within 10 minutes (the value is easily
customizable), hosted-engine-setup assumes that engine-setup is stuck.
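
To make the per-line timeout idea concrete, here is a minimal Python
sketch. It is not the actual hosted-engine-setup code: the names
watch_lines and StalledError are hypothetical, and the 600 second
default is only taken from the error message quoted above. A reader
thread pushes each output line onto a queue, and the consumer gives up
if the gap between two lines exceeds the timeout.

```python
import queue
import threading


class StalledError(Exception):
    """Raised when no new output line arrives within the timeout."""


def watch_lines(stream, timeout_seconds=600):
    """Yield lines from `stream` (any iterable of lines).

    Raises StalledError if more than `timeout_seconds` pass without a
    new line. The timer restarts after every line, so a long run is fine
    as long as it keeps producing output.
    """
    lines = queue.Queue()
    end_of_stream = object()  # sentinel: the producer has finished

    def reader():
        # Runs in a background thread so a silent producer cannot block
        # the consumer forever.
        for line in stream:
            lines.put(line)
        lines.put(end_of_stream)

    threading.Thread(target=reader, daemon=True).start()

    while True:
        try:
            item = lines.get(timeout=timeout_seconds)
        except queue.Empty:
            raise StalledError(
                "no output for %s seconds" % timeout_seconds
            )
        if item is end_of_stream:
            return
        yield item
```

With this approach a single silent step, such as the
ovirt-aaa-jdbc-tool call below, is enough to trip the timeout even if
the process is still alive and would eventually finish.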

So what we have to understand is why this simple command took
more than 10 minutes in your env:
2016-04-30 17:56:57 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc
plugin.executeRaw:828 execute: ('/usr/bin/ovirt-aaa-jdbc-tool',
'--db-config=/etc/ovirt-engine/aaa/internal.properties', 'user',
'password-reset', 'admin', '--password=env:pass', '--force',
'--password-valid-to=2216-03-13 17:56:57Z'), executable='None',
cwd='None', env={'LANG': 'en_US.UTF-8', 'SHLVL': '1', 'PYTHONPATH':
'/usr/share/ovirt-engine/setup/bin/..::', 'pass': '**FILTERED**',
'OVIRT_ENGINE_JAVA_HOME_FORCE': '1', 'PWD': '/',
'OVIRT_ENGINE_JAVA_HOME': u'/usr/lib/jvm/jre', 'PATH':
'/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin', 'OTOPI_LOGFILE':
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20160430175551-dttt2p.log',
'OVIRT_JBOSS_HOME': '/usr/share/ovirt-engine-wildfly',
'OTOPI_EXECDIR': '/'}

Can you please check the entropy value on your host?
 cat /proc/sys/kernel/random/entropy_avail
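
If you prefer to check it from a script, a minimal Python sketch
follows. The helper names and the threshold of 200 are my own
assumptions for illustration, not an official limit; the only thing
taken from above is the /proc path. The idea is that when the pool is
nearly empty, reads from /dev/random can block, which is one plausible
explanation for a long silent stall.

```python
from pathlib import Path

# Path used by the cat command above (Linux only).
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy(path=ENTROPY_PATH):
    """Return the available entropy reported by the kernel, as an int."""
    return int(Path(path).read_text().strip())


def is_low(value, threshold=200):
    """Return True if `value` is below `threshold`.

    The default threshold is an arbitrary assumption for this sketch.
    """
    return value < threshold


if __name__ == "__main__":
    value = read_entropy()
    print("entropy_avail:", value, "(low)" if is_low(value) else "(ok)")
```

Running it a few times while the setup is waiting gives a rough idea
of whether the pool stays drained.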


> 2016-04-30 18:07:06 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc
> plugin.execute:941 execute-output: ('/usr/bin/ovirt-aaa-jdbc-tool',
> '--db-config=/etc/ovirt-engine/aaa/internal.properties', 'user',
> 'password-reset', 'admin', '--password=env:pass', '--force',
> '--password-valid-to=2216-03-13 17:56:57Z') stderr:
>
>
> 2016-04-30 18:07:06 DEBUG otopi.context context._executeMethod:142 Stage
> misc METHOD
> otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca.Plugin._miscUpgrade
> 2016-04-30 18:07:06 INFO
> otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca ca._miscUpgrade:510
> Upgrading CA
>
> Full logs of host and engine here:
> https://drive.google.com/file/d/0BwoPbcrMv8mvQm9jeDhpZEdRUjg/view?usp=sharing
>
> I can connect via vnc to the engine and see 277 tables in engine database
> (277 rows in output of "\d" command)
>
> Can anyone tell me if I can follow up without starting from scratch and how
> in case?
> Also understand the reason of this delay, as the NUC is a physical host with
> 32 GB of RAM and SSD disks and should be quite fast... faster than a VM on
> my laptop where I had no problems in similar setup...
>
> As a last question how to clean up things in case I have to start from
> scratch.

I'd recommend redeploying from scratch instead of trying to fix it
but, before that, we need to understand the root issue.

> I can leave the situation as it is in the moment, so I can work on the live
> environment before power off....
>
> Thanks in advance,
> Gianluca
>
>
>
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


