Ovirt Manager - FATAL: the database system is in recovery mode

PACKAGE_NAME="ovirt-engine"
PACKAGE_VERSION="3.6.2.6"
PACKAGE_DISPLAY_VERSION="3.6.2.6-1.el6"
OPERATING SYSTEM="Centos 6.7"

It's been running error-free for over 3 years now. Over the past few weeks, the oVirt manager application has been frequently losing its connection to its own database, which resides on the same server. It throws errors saying: “Caused by org.postgresql.util.PSQLException: FATAL: the database system is in recovery mode”. The only way we have found to recover is to hard reboot the server, after which it works fine for a couple of days before the errors return.

Here are some logs:

Caused by: org.postgresql.util.PSQLException: FATAL: the database system is in recovery mode
    at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:293)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
    at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
    at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
    at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
    at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:32)
    at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
    at org.postgresql.Driver.makeConnection(Driver.java:393)
    at org.postgresql.Driver.connect(Driver.java:267)
    at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:322)

On Thu, Aug 8, 2019 at 12:54 PM Sameer Sardar <sameersardar2410@gmail.com> wrote:
Caused by: org.postgresql.util.PSQLException: FATAL: the database system is in recovery mode

This error means that the engine cannot connect to the PostgreSQL database: the database rejects every connection with the error message "the database system is in recovery mode". Please take a look under /var/lib/pgsql; there should be a pg_log directory with more detailed logs.

_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/CZZGTO25LKDYXV...

--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.
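
For readers following the suggestion to look for a pg_log directory under /var/lib/pgsql, here is a minimal sketch of locating the most recently written log files. The helper name `find_pg_logs` is hypothetical, and the exact location of pg_log below that root is an assumption (it depends on how PostgreSQL was installed and configured), so the sketch simply searches the whole tree.

```python
import os


def find_pg_logs(root="/var/lib/pgsql"):
    """Return files found in any pg_log directory under root, newest first.

    The default root comes from the thread; where pg_log sits below it is
    an assumption, so every subdirectory is searched.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) == "pg_log":
            for name in filenames:
                path = os.path.join(dirpath, name)
                found.append((os.path.getmtime(path), path))
    # Sort by modification time so the most recent log comes first.
    found.sort(reverse=True)
    return [path for _mtime, path in found]


if __name__ == "__main__":
    # Prints nothing if the root does not exist or is not readable.
    for log_path in find_pg_logs():
        print(log_path)
```

The newest file is usually the one to read first, since it covers the most recent crash.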

Hi Martin,

Thanks for your reply. I'm aware that there is a connectivity issue between PostgreSQL and the engine. I've checked the pg_log files but couldn't find the root cause of the problem. Please look into it and suggest what to do; the log file is attached.

Thanks & Regards,
Sameer Sardar

On Fri, Aug 9, 2019 at 3:21 PM Martin Perina <mperina@redhat.com> wrote:
This error means that engine cannot connect to PostgreSQL database and for all connections database returns "the database system is in recovery mode" error message. So please take a look at /var/lib/pgsql, there should be pg_log directory with more detailed logs ...

Hi Martin,

Thanks for your reply. I'm aware that there is some connectivity issue between PostgreSQL and engine. I've checked pg_logs but couldn't find the root cause of the problem. Please look into it and suggest what to do.

[root@manager pg_log]# tail -100 postgresql-Sun.log
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
PANIC: corrupted item pointer: offset = 1664, size = 16
LOG: server process (PID 19115) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2019-08-11 09:35:45 UTC
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 247/ED612898
LOG: record with zero length at 247/EDF74C18
LOG: redo done at 247/EDF74BE0
LOG: last completed transaction was at log time 2019-08-11 09:37:51.409331+00
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
LOG: database system is ready to accept connections
LOG: autovacuum launcher started

Thanks & Regards,
Sameer Sardar

Hi,

I somehow missed your email, sorry :-(

On Sun, Aug 11, 2019 at 2:45 PM Sameer Sardar <sameersardar2410@gmail.com> wrote:
PANIC: corrupted item pointer: offset = 1664, size = 16

It looks like your PostgreSQL data is corrupted; please check whether your disk is failing. Regarding recovery, I'm not a PostgreSQL expert, but you will most probably need to restore your database from a backup. Please also ask on the PostgreSQL mailing list; they might be able to help you.

--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.
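
The PANIC entry in the log excerpt is easy to overlook among the repeated FATAL and WARNING entries that follow a crash. As a rough sketch, assuming log lines have the plain `SEVERITY: message` shape seen in this thread (no timestamp or other prefix), the following filters out the entries that are only consequences of a crash and keeps the ones that point at its cause. The function name `summarize` and the choice of which messages count as noise are illustrative assumptions, not part of oVirt or PostgreSQL.

```python
import re
from collections import Counter

# Assumed line shape: "SEVERITY: message", as in the excerpt in this thread.
SEVERITY_RE = re.compile(r"^(PANIC|FATAL|ERROR|WARNING|LOG|DETAIL|HINT):\s*(.*)$")

# Messages that are consequences of a crash rather than its cause.
NOISE = (
    "the database system is in recovery mode",
    "terminating connection because of crash of another server process",
)


def summarize(lines):
    """Return (count per severity, list of lines worth reading first)."""
    counts = Counter()
    notable = []
    for raw in lines:
        line = raw.strip()
        match = SEVERITY_RE.match(line)
        if not match:
            continue  # Not in the assumed format; skip it.
        severity, message = match.groups()
        counts[severity] += 1
        if message in NOISE:
            continue
        # Keep PANIC/ERROR entries and the report of a backend being killed.
        if severity in ("PANIC", "ERROR") or "was terminated by signal" in message:
            notable.append(line)
    return counts, notable


if __name__ == "__main__":
    sample = [
        "FATAL: the database system is in recovery mode",
        "LOG: database system is ready to accept connections",
        "PANIC: corrupted item pointer: offset = 1664, size = 16",
        "LOG: server process (PID 19115) was terminated by signal 6: Aborted",
        "WARNING: terminating connection because of crash of another server process",
    ]
    counts, notable = summarize(sample)
    print(dict(counts))
    for entry in notable:
        print(entry)
```

To run it against a real file, pass the open file object (or `f.readlines()`) to `summarize`; if the server is configured with a line prefix such as a timestamp, the regular expression would need adjusting.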