
Hi Artur

Please find attached. Let me know if I need to rerun. The dumps were taken 5 minutes apart.

[root@engine-aa-1-01 ovirt-engine]# ps -ef | grep jboss | grep -v grep | awk '{ print $2 }'
27390
[root@engine-aa-1-01 ovirt-engine]# jstack -F 27390 > your_engine_thread_dump_1.txt
[root@engine-aa-1-01 ovirt-engine]# jstack -F 27390 > your_engine_thread_dump_2.txt
[root@engine-aa-1-01 ovirt-engine]# jstack -F 27390 > your_engine_thread_dump_3.txt

Regards
Nar

On Thu, 6 Aug 2020 at 15:55, Artur Socha <asocha@redhat.com> wrote:
Sure thing. On engine host please find jboss pid. You can use this command:
ps -ef | grep jboss | grep -v grep | awk '{ print $2 }'
or jps tool from jdk. Sample output on my dev environment is:
$ jps
64853 jboss-modules.jar
196217 Jps

Then use jstack from the JDK:

jstack <pid> > your_engine_thread_dump.txt

2 or 3 dumps taken at approximately 5-minute intervals would be even more useful.
Here you can find even more options https://www.baeldung.com/java-thread-dump
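The steps above could be scripted roughly as follows. This is only a sketch: the function names, the output file names, and the count and interval arguments are illustrative assumptions, and it takes the first PID whose command line mentions jboss.

```shell
#!/bin/sh
# Sketch: take several thread dumps of the engine JVM at a fixed interval.
# Function names and output file names are illustrative assumptions.

# Print the PID of the first process whose command line mentions jboss.
# The [j]boss pattern keeps grep from matching its own process entry.
find_engine_pid() {
    ps -ef | grep '[j]boss' | awk '{ print $2 }' | head -n 1
}

# take_dumps PID COUNT INTERVAL_SECONDS
# Writes engine_thread_dump_1.txt .. engine_thread_dump_COUNT.txt
# into the current directory.
take_dumps() {
    pid=$1
    count=$2
    interval=$3
    i=1
    while [ "$i" -le "$count" ]; do
        jstack "$pid" > "engine_thread_dump_${i}.txt"
        # Wait between dumps, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For three dumps five minutes apart, something like take_dumps "$(find_engine_pid)" 3 300 should do, run as a user that is allowed to attach to the engine JVM.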
Artur
On Thu, Aug 6, 2020 at 3:15 PM Nardus Geldenhuys <nardusg@gmail.com> wrote:
Hi
I can create a thread dump; please send details on how to do it.
Regards
Nardus
On Thu, 6 Aug 2020 at 14:17, Artur Socha <asocha@redhat.com> wrote:
Hi Nardus,

You might have hit an issue I have been hunting for some time ([1] and [2]). [1] could not be properly resolved because at the time I was not able to reproduce the issue on a dev setup. I suspect [2] is related.

Would you be able to prepare a thread dump from your engine instance? Additionally, please check for potential libvirt errors/warnings. Can you also paste the output of:

sudo yum list installed | grep vdsm
sudo yum list installed | grep ovirt-engine
sudo yum list installed | grep libvirt
Usually, according to previous reports, restarting the engine helps to restore connectivity with hosts ... at least for some time.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1845152
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1846338
regards, Artur
On Thu, Aug 6, 2020 at 8:01 AM Nardus Geldenhuys <nardusg@gmail.com> wrote:
Also see this in engine:
Aug 6, 2020, 7:37:17 AM VDSM someserver command Get Host Capabilities failed: Message timeout which can be caused by communication issues
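To see which hosts are affected and how often, the same event can be counted per host in the engine log. This is a rough sketch: the function name is made up for illustration, and it assumes each event sits on a single log line in the format quoted in this thread.

```shell
#!/bin/sh
# Sketch: count "Get Host Capabilities failed" events per host.
# The function name is an illustrative assumption; the pattern assumes
# one event per line, in the format quoted in this thread.

# count_capability_timeouts LOGFILE
# Prints "<count> <host>" lines, most affected host first.
count_capability_timeouts() {
    grep 'VDS_BROKER_COMMAND_FAILURE' "$1" \
        | sed -n 's/.*VDSM \([^ ]*\) command Get Host Capabilities failed.*/\1/p' \
        | sort | uniq -c | sort -rn
}
```

On the engine host this could be run as count_capability_timeouts /var/log/ovirt-engine/engine.log, assuming the log is in its default location.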
On Thu, 6 Aug 2020 at 07:09, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Can you check for errors on the affected host? Most probably you need the vdsm logs.
Best Regards, Strahil Nikolov
On 6 August 2020 at 7:40:23 GMT+03:00, Nardus Geldenhuys <nardusg@gmail.com> wrote:
Hi Strahil
Hope you are well. I get the following error when I tried to confirm reboot:
Error while executing action: Cannot confirm 'Host has been rebooted' Host. Valid Host statuses are "Non operational", "Maintenance" or "Connecting".
And I can't put it in maintenance, only option is "restart" or "stop".
Regards
Nar
On Thu, 6 Aug 2020 at 06:16, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
> After rebooting the node, have you "marked" it as rebooted?
>
> Best Regards,
> Strahil Nikolov
>
> On 5 August 2020 at 21:29:04 GMT+03:00, Nardus Geldenhuys <nardusg@gmail.com> wrote:
> >Hi oVirt land
> >
> >Hope you are well. Got a bit of an issue, actually a big issue. We had
> >some sort of dip. All the VMs are still running, but some of the hosts
> >show "Unassigned" or "NonResponsive". All the hosts were showing UP and
> >were fine before our dip. I increased vdsHeartbeatInSecond to 240, no
> >luck.
> >
> >I still get a timeout on the engine lock even though I can connect to
> >that host from the engine using nc to test port 54321. I also restarted
> >vdsmd and rebooted the host, with no luck.
> >
> > nc -v someserver 54321
> >Ncat: Version 7.50 ( https://nmap.org/ncat )
> >Ncat: Connected to 172.40.2.172:54321.
> >
> >2020-08-05 20:20:34,256+02 ERROR
> >[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> >(EE-ManagedThreadFactory-engineScheduled-Thread-70) [] EVENT_ID:
> >VDS_BROKER_COMMAND_FAILURE(10,802), VDSM someserver command Get Host
> >Capabilities failed: Message timeout which can be caused by
> >communication issues
> >
> >Any troubleshooting ideas will be gladly appreciated.
> >
> >Regards
> >
> >Nar
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/C4HB2J3MH76FI2...
-- Artur Socha Senior Software Engineer, RHV Red Hat