[ovirt-users] WebAdmin is down, HostedEngine is up

Simone Tiraboschi stirabos at redhat.com
Mon Mar 20 15:34:49 UTC 2017


On Mon, Mar 20, 2017 at 4:28 PM, Logan Kuhn <support at jac-properties.com>
wrote:

> Yup, ovirttest1 ran out of disk space on Friday. We recovered it and
> everything seemed completely normal.
>
> The postgres service is down on the HEVM, but that is because the database
> is on our postgresql cluster and has been for weeks.  I can connect to its
> database from within the HEVM using the credentials stored in
> /etc/ovirt-engine/engine.conf.d/10-setup-database.conf.  I can tail the
> logs on the postgres master, and ovirt can and does connect to it.
>
> However, from ovirttest1 I cannot connect to the engine database using
> those same credentials. Should I be able to?  It would make sense to be
> able to connect to it.
>

That depends on how pg_hba.conf is configured on your DBMS instance.
Only the engine and dwh have to connect to the engine DB; a direct DB
connection from the hosts is not required.
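
As a rough illustration of why the same credentials can work from the
engine VM and fail from ovirttest1: PostgreSQL uses the first pg_hba.conf
record that matches the connection type, database, user and client address,
and rejects the connection if no record matches. Below is a minimal Python
sketch of that lookup. It is not PostgreSQL's implementation: it only
handles host/hostssl/hostnossl records with a CIDR address or "all", treats
all three record types alike, and ignores hostnames, samehost/samenet, the
separate-netmask form and special names such as sameuser or +role. The
function name, the sample rules and the addresses (taken from the
192.0.2.0/24 documentation range) are made up for the example.

```python
import ipaddress

# Hypothetical rules for illustration only; not taken from any real server.
SAMPLE_HBA = """\
# TYPE  DATABASE  USER    ADDRESS        METHOD
host    engine    engine  192.0.2.10/32  md5
host    all       all     127.0.0.1/32   md5
"""


def first_matching_rule(hba_text, database, user, client_ip):
    """Return (line_number, auth_method) of the first matching TCP rule.

    Returns None when no rule matches, which PostgreSQL treats as a
    rejected connection.
    """
    ip = ipaddress.ip_address(client_ip)
    for lineno, raw in enumerate(hba_text.splitlines(), start=1):
        # Drop comments, then split the record into whitespace-separated fields.
        fields = raw.split("#", 1)[0].split()
        # Simplification: only TYPE DATABASE USER CIDR-ADDRESS METHOD records.
        if len(fields) < 5 or fields[0] not in ("host", "hostssl", "hostnossl"):
            continue
        _, dbs, users, address, method = fields[:5]
        if dbs != "all" and database not in dbs.split(","):
            continue
        if users != "all" and user not in users.split(","):
            continue
        if address != "all":
            try:
                network = ipaddress.ip_network(address, strict=False)
            except ValueError:
                continue  # hostname, samehost, samenet: skipped in this sketch
            if ip not in network:
                continue
        return lineno, method
    return None


if __name__ == "__main__":
    print(first_matching_rule(SAMPLE_HBA, "engine", "engine", "192.0.2.10"))
    print(first_matching_rule(SAMPLE_HBA, "engine", "engine", "192.0.2.20"))
```

With these sample rules the first client matches the record on line 2,
while the second matches nothing and would be refused, which is what you
would expect for a host that was never listed in pg_hba.conf.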


>
> Logan
>
> On Mon, Mar 20, 2017 at 10:14 AM, Alexander Wels <awels at redhat.com> wrote:
>
>> On Monday, March 20, 2017 9:14:51 AM EDT Logan Kuhn wrote:
>> > Starting at 1:09am on Saturday the Hosted Engine has been rebooting
>> > because it failed its liveliness check.  This is due to the webadmin
>> > not loading.  Nothing changed as far as I can tell on the engine since
>> > its last successful reboot on Friday afternoon.
>> >
>> > The engine, dwhd and httpd are all up and do not seem to be reporting
>> > anything unusual in their respective logs.  The engine can talk to the
>> > database, as I can log in using the credentials in
>> > /etc/ovirt-engine/engine.conf.d/10-setup-database.conf and the logs on
>> > the postgres server are showing activity.
>> >
>> > I tried to run engine-setup but it says it's not in global maintenance
>> > even though the hosted engine hosts agree that it is.  We are on version
>> > 4.0.6.3.
>> >
>> > Server, engine and agent logs are attached
>> >
>> > Regards,
>> > Logan
>>
>> Looking at your logs, it appears that on Friday one of your hosts ran out
>> of disk space in its logs or temp directory, at which point connectivity
>> started to be spotty. I see a bunch of attempts to migrate VMs away from
>> that host (ovirttest1). All of them fail, and that repeats many times. I
>> skipped forward to Saturday, where it appears you had a bunch of stale
>> locks, which also repeats a bunch of times until the engine VM gets
>> restarted.
>>
>> Then I see nothing but restarts of the engine and no apparent errors in
>> the engine log.
>>
>> The server log does however reveal this:
>> 2017-03-20 07:04:27,282 ERROR [org.quartz.core.ErrorLogger]
>> (QuartzOvirtDBScheduler_QuartzSchedulerThread) An error occurred while
>> scanning for the next triggers to fire.: org.quartz.JobPersistenceException:
>> Failed to obtain DB connection from data source 'NMEngineDS':
>> java.sql.SQLException: Could not retrieve datasource via JNDI url
>> 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
>> javax.resource.ResourceException: IJ000470: You are trying to use a
>> connection factory that has been shut down: java:/ENGINEDataSourceNoJTA
>> [See nested exception: java.sql.SQLException: Could not retrieve
>> datasource via JNDI url 'java:/ENGINEDataSourceNoJTA'
>> java.sql.SQLException: javax.resource.ResourceException: IJ000470: You are
>> trying to use a connection factory that has been shut down:
>> java:/ENGINEDataSourceNoJTA]
>>         at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:168) [quartz.jar:]
>>         at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3807) [quartz.jar:]
>>         at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2751) [quartz.jar:]
>>         at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:264) [quartz.jar:]
>> Caused by: java.sql.SQLException: Could not retrieve datasource via JNDI
>> url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
>> javax.resource.ResourceException: IJ000470: You are trying to use a
>> connection factory that has been shut down: java:/ENGINEDataSourceNoJTA
>>         at org.quartz.utils.JNDIConnectionProvider.getConnection(JNDIConnectionProvider.java:163) [quartz.jar:]
>>         at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:108) [quartz.jar:]
>>         at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:165) [quartz.jar:]
>>         ... 3 more
>>
>> Is your postgresql service running? That is the most likely source of the
>> engine not coming up.
>>