Sure thing.
On the engine host, please find the JBoss PID. You can use this command:
ps -ef | grep jboss | grep -v grep | awk '{ print $2 }'
or the jps tool from the JDK. Sample output from my dev environment:
± % jps
!2860
64853 jboss-modules.jar
196217 Jps
Then use jstack from the JDK:
jstack <pid> > your_engine_thread_dump.txt
Two or three dumps taken at roughly 5-minute intervals would be even more
useful.
Here you can find even more options
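The steps above can be sketched as a small script. This is only a sketch under a few assumptions that are mine, not part of the procedure: the function names, the numbered output file names, and the optional fourth argument (which lets you swap in another dump command) are all made up for illustration. jstack generally has to run as the same user that owns the JVM, so you may need to adjust for that.

```shell
#!/bin/sh
# Sketch only: function names and output file names are hypothetical.

# Print the PID of the first process whose command line mentions jboss,
# using the same ps/grep/awk pipeline as above.
find_engine_pid() {
    ps -ef | grep jboss | grep -v grep | awk '{ print $2 }' | head -n 1
}

# take_dumps PID COUNT INTERVAL_SECONDS [DUMP_CMD]
# Writes COUNT dumps to your_engine_thread_dump_<n>.txt in the current
# directory, sleeping INTERVAL_SECONDS between consecutive dumps.
# DUMP_CMD defaults to jstack.
take_dumps() {
    pid=$1
    count=$2
    interval=$3
    dump_cmd=${4:-jstack}
    i=1
    while [ "$i" -le "$count" ]; do
        "$dump_cmd" "$pid" > "your_engine_thread_dump_${i}.txt"
        # No need to wait after the last dump.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For three dumps five minutes apart you would call: take_dumps "$(find_engine_pid)" 3 300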
Artur
On Thu, Aug 6, 2020 at 3:15 PM Nardus Geldenhuys <nardusg(a)gmail.com> wrote:
Hi
I can create a thread dump; please send details on how to do it.
Regards
Nardus
On Thu, 6 Aug 2020 at 14:17, Artur Socha <asocha(a)redhat.com> wrote:
> Hi Nardus,
> You might have hit an issue I have been hunting for some time ([1] and
> [2]).
> [1] could not be properly resolved because at the time I was not able to
> reproduce the issue on a dev setup.
> I suspect [2] is related.
>
> Would you be able to prepare a thread dump from your engine instance?
> Additionally, please check for potential libvirt errors/warnings.
> Can you also paste the output of:
> sudo yum list installed | grep vdsm
> sudo yum list installed | grep ovirt-engine
> sudo yum list installed | grep libvirt
>
> Usually, according to previous reports, restarting the engine helps to
> restore connectivity with hosts ... at least for some time.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1845152
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1846338
>
> regards,
> Artur
>
>
>
> On Thu, Aug 6, 2020 at 8:01 AM Nardus Geldenhuys <nardusg(a)gmail.com>
> wrote:
>
>> Also see this in engine:
>>
>> Aug 6, 2020, 7:37:17 AM
>> VDSM someserver command Get Host Capabilities failed: Message timeout
>> which can be caused by communication issues
>>
>> On Thu, 6 Aug 2020 at 07:09, Strahil Nikolov <hunter86_bg(a)yahoo.com>
>> wrote:
>>
>>> Can you check for errors on the affected host? Most probably you need
>>> the vdsm logs.
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On 6 August 2020 at 7:40:23 GMT+03:00, Nardus Geldenhuys <nardusg(a)gmail.com> wrote:
>>> >Hi Strahil
>>> >
>>> >Hope you are well. I get the following error when I try to confirm the
>>> >reboot:
>>> >
>>> >Error while executing action: Cannot confirm 'Host has been rebooted'
>>> >Host.
>>> >Valid Host statuses are "Non operational", "Maintenance" or
>>> >"Connecting".
>>> >
>>> >And I can't put it in maintenance; the only options are "restart" or "stop".
>>> >
>>> >Regards
>>> >
>>> >Nar
>>> >
>>> >On Thu, 6 Aug 2020 at 06:16, Strahil Nikolov <hunter86_bg(a)yahoo.com>
>>> >wrote:
>>> >
>>> >> After rebooting the node, have you "marked" it as rebooted?
>>> >>
>>> >> Best Regards,
>>> >> Strahil Nikolov
>>> >>
>>> >> On 5 August 2020 at 21:29:04 GMT+03:00, Nardus Geldenhuys <nardusg(a)gmail.com> wrote:
>>> >> >Hi oVirt land
>>> >> >
>>> >> >Hope you are well. Got a bit of an issue, actually a big issue. We had
>>> >> >some sort of dip. All the VMs are still running, but some of the hosts
>>> >> >are showing "Unassigned" or "NonResponsive". All the hosts were showing
>>> >> >UP and were fine before our dip. I increased vdsHeartbeatInSecond to
>>> >> >240, but no luck.
>>> >> >
>>> >> >I still get a timeout on the engine lock even though I can connect to
>>> >> >that host from the engine, using nc to test port 54321. I also
>>> >> >restarted vdsmd and rebooted the host, with no luck.
>>> >> >
>>> >> > nc -v someserver 54321
>>> >> >Ncat: Version 7.50 ( https://nmap.org/ncat )
>>> >> >Ncat: Connected to 172.40.2.172:54321.
>>> >> >
>>> >> >2020-08-05 20:20:34,256+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM someserver command Get Host Capabilities failed: Message timeout which can be caused by communication issues
>>> >> >
>>> >> >Any troubleshooting ideas would be greatly appreciated.
>>> >> >
>>> >> >Regards
>>> >> >
>>> >> >Nar
>>> >>
>>>
>
>
> --
> Artur Socha
> Senior Software Engineer, RHV
> Red Hat
>