On Monday, March 20, 2017 11:34:49 AM EDT Simone Tiraboschi wrote:
On Mon, Mar 20, 2017 at 4:28 PM, Logan Kuhn <support(a)jac-properties.com> wrote:
> Yup, ovirttest1 ran out of disk space on Friday, we recovered it and
> everything seemed completely normal.
>
> The postgres service is down on the HEVM, but that is because the engine
> database is on our postgresql cluster, and has been for weeks. I can connect
> to the database from within the HEVM using the credentials stored at
> /etc/ovirt-engine/engine.conf.d/10-setup-database.conf. I can tail the logs
> on the postgres master, and ovirt can and does connect to it.
>
> However, trying from ovirttest1 I cannot connect to the engine database
> using those same credentials. Should I be able to? It'd make sense to be
> able to connect to it....
It could depend on how you configured pg_hba.conf for your DBMS instance.
Only the engine and DWH have to connect to the engine database; a direct DB
connection from the hosts is not required.
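
To illustrate why the same credentials can work from the engine VM and fail from a host, here is a minimal Python sketch of the first-match lookup that pg_hba.conf entries go through. It is a simplification, not PostgreSQL's actual parser: it only handles `host*` records with a CIDR address or `all`, and database/user fields that are `all` or comma-separated names. The addresses and rules in `SAMPLE` are hypothetical and are not taken from Logan's setup.

```python
import ipaddress

# Hypothetical pg_hba.conf content; 10.0.0.10 stands in for the engine VM.
SAMPLE = """\
# TYPE  DATABASE  USER    ADDRESS        METHOD
host    engine    engine  10.0.0.10/32   md5
host    all       all     127.0.0.1/32   md5
"""

def first_matching_rule(pg_hba_text, client_ip, database, user):
    """Return the first 'host*' record matching the connection, or None."""
    ip = ipaddress.ip_address(client_ip)
    for raw in pg_hba_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 5 or not fields[0].startswith("host"):
            continue  # 'local' records and malformed lines are ignored here
        _type, dbs, users, address, method = fields[:5]
        if dbs != "all" and database not in dbs.split(","):
            continue
        if users != "all" and user not in users.split(","):
            continue
        if address != "all":
            if "/" not in address:
                continue  # hostnames, samehost/samenet, IP+netmask: not handled
            network = ipaddress.ip_network(address, strict=False)
            if ip.version != network.version or ip not in network:
                continue
        return {"database": dbs, "user": users,
                "address": address, "method": method}
    return None

# The engine VM address matches the first record; a host address (here the
# made-up 10.0.0.21) matches nothing, so the server would reject it.
print(first_matching_rule(SAMPLE, "10.0.0.10", "engine", "engine"))
print(first_matching_rule(SAMPLE, "10.0.0.21", "engine", "engine"))
```

With rules like these, a connection attempt from a hypervisor host is refused regardless of the password, which would be consistent with the engine VM connecting fine while ovirttest1 cannot.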
I looked a little closer in the server.log. This in particular stood out:
2017-03-20 07:04:27,282 ERROR [org.quartz.core.ErrorLogger]
(QuartzOvirtDBScheduler_QuartzSchedulerThread) An error occurred while
scanning for the next triggers to fire.: org.quartz.JobPersistenceException:
Failed to obtain DB connection from data source 'NMEngineDS':
java.sql.SQLException: Could not retrieve datasource via JNDI url
'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
javax.resource.ResourceException: IJ000470: You are trying to use a connection
factory that has been shut down: java:/ENGINEDataSourceNoJTA [See nested
exception: java.sql.SQLException: Could not retrieve datasource via JNDI url
'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
javax.resource.ResourceException: IJ000470: You are trying to use a connection
factory that has been shut down: java:/ENGINEDataSourceNoJTA]
Grepping the code, it looks like NMEngineDS has something to do with the
scheduler, which is completely outside my realm of knowledge, so I can't help
there.
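
For anyone who wants to see when this error first appeared and how often it repeats, a rough Python sketch for scanning server.log follows. It assumes every entry starts with a timestamp in the same layout as the line quoted above, and it treats untimestamped lines (wrapped messages, stack traces) as part of the preceding entry. The second and third lines of `sample` are made up for illustration.

```python
import re

# Entry header layout, taken from the quoted line:
# "2017-03-20 07:04:27,282 ERROR [org.quartz.core.ErrorLogger] ..."
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+\[([^\]]+)\]"
)

def find_entries(lines, needle="IJ000470"):
    """Yield (timestamp, level, logger) for each log entry containing needle."""
    current = None   # header fields of the entry being read
    hit = False      # whether the current entry contains the needle
    for line in lines:
        m = ENTRY_START.match(line)
        if m:
            if current and hit:
                yield current
            current = m.groups()
            hit = needle in line
        elif current and needle in line:
            hit = True
    if current and hit:
        yield current

# Illustrative input; only the first line is copied from the thread.
sample = [
    "2017-03-20 07:04:27,282 ERROR [org.quartz.core.ErrorLogger] "
    "(QuartzOvirtDBScheduler_QuartzSchedulerThread) An error occurred while "
    "scanning for the next triggers to fire.",
    "javax.resource.ResourceException: IJ000470: You are trying to use a "
    "connection factory that has been shut down: java:/ENGINEDataSourceNoJTA",
    "2017-03-20 07:04:28,000 INFO  [org.example.Other] (thread) unrelated",
]

for timestamp, level, logger in find_entries(sample):
    print(timestamp, level, logger)
```

`find_entries` accepts any iterable of lines, so an open file object for the engine's server.log can be passed in place of `sample`.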
> Logan
>
> On Mon, Mar 20, 2017 at 10:14 AM, Alexander Wels <awels(a)redhat.com> wrote:
>> On Monday, March 20, 2017 9:14:51 AM EDT Logan Kuhn wrote:
>> > Starting at 1:09am on Saturday the Hosted Engine has been rebooting
>> > because it failed its liveliness check. This is due to the webadmin not
>> > loading. Nothing changed as far as I can tell on the engine since its
>> > last successful reboot on Friday afternoon.
>> >
>> > The engine, dwhd and httpd are all up and do not seem to be reporting
>> > anything unusual in their respective logs. The engine can talk to the
>> > database, as I can log in using the credentials in
>> > /etc/ovirt-engine/engine.conf.d/10-setup-database.conf and the logs on
>> > the postgres server are showing activity.
>> >
>> > I tried to run engine-setup but it says it's not in global maintenance
>> > even though the hosted engine hosts agree that it is. We are on version
>> > 4.0.6.3.
>> >
>> > Server, engine and agent logs are attached
>> >
>> > Regards,
>> > Logan
>>
>> Looking at your logs, it appears that on Friday one of your hosts ran out
>> of disk space in its logs or temp directory, at which point connectivity
>> started to be spotty. I see a bunch of attempts to migrate VMs away from
>> that host (ovirttest1). All of them fail. That repeats a ton of times. I
>> forwarded to Saturday, where it appears you had a bunch of stale locks,
>> which also repeats a bunch of times until the engine VM gets restarted.
>>
>> Then I see nothing but restarts of the engine and no apparent errors in
>> the engine log.
>>
>> The server log does however reveal this:
>> 2017-03-20 07:04:27,282 ERROR [org.quartz.core.ErrorLogger]
>> (QuartzOvirtDBScheduler_QuartzSchedulerThread) An error occurred while
>> scanning for the next triggers to fire.: org.quartz.JobPersistenceException:
>> Failed to obtain DB connection from data source 'NMEngineDS':
>> java.sql.SQLException: Could not retrieve datasource via JNDI url
>> 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
>> javax.resource.ResourceException: IJ000470: You are trying to use a
>> connection factory that has been shut down: java:/ENGINEDataSourceNoJTA
>> [See nested exception: java.sql.SQLException: Could not retrieve datasource
>> via JNDI url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
>> javax.resource.ResourceException: IJ000470: You are trying to use a
>> connection factory that has been shut down: java:/ENGINEDataSourceNoJTA]
>>     at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:168) [quartz.jar:]
>>     at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3807) [quartz.jar:]
>>     at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2751) [quartz.jar:]
>>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:264) [quartz.jar:]
>> Caused by: java.sql.SQLException: Could not retrieve datasource via JNDI
>> url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException:
>> javax.resource.ResourceException: IJ000470: You are trying to use a
>> connection factory that has been shut down: java:/ENGINEDataSourceNoJTA
>>     at org.quartz.utils.JNDIConnectionProvider.getConnection(JNDIConnectionProvider.java:163) [quartz.jar:]
>>     at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:108) [quartz.jar:]
>>     at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:165) [quartz.jar:]
>>     ... 3 more
>>
>> Is your postgresql service running? That is the most likely source of the
>> engine not coming up.