----- Original Message -----
From: "Eli Mesika" <emesika(a)redhat.com>
To: "users" <users(a)ovirt.org>
Cc: "Martin Perina" <mperina(a)redhat.com>
Sent: Thursday, September 18, 2014 10:44:43 AM
Subject: [OVIRT-3.5-TEST-DAY-3] : fence kdump integration
I was testing fence kdump integration.
Due to the following issues I was not able to test over JSON RPC, so all
tests were done with XML RPC:
1) Manually applied:
http://gerrit.ovirt.org/#/c/32843 (this was not
yet included
in 3.5, which causes an exception in VDSM when fenceNode is called)
2) JSON: does not return status correctly: Test Succeeded, null
Therefore, the host cannot be stopped/started/restarted
Related to engine patch:
http://gerrit.ovirt.org/#/c/32855/
not yet back-ported to 3.5 (blocker IMO)
--------------------------------------------------------------
ALL tests below should be redone after resolving JSON issues !
--------------------------------------------------------------
-------------------
General comment
-------------------
All fence kdump flows are related to a scenario in which a host becomes non
responsive.
In the case of a manual PM action (stop/restart), fence kdump does not take
place.
I think this should be fully documented in order to prevent misunderstanding
when the kdump flag
is checked and the host is rebooted/stopped manually while kdumping, which will
result in losing the dump file.
If a host is in status Kdumping, then the user cannot execute manual PM actions
on this host.
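The last rule above can be written as a small guard. This is only a minimal sketch for illustration, not the engine's actual code: the function name can_run_manual_pm_action and the use of plain strings for host statuses are assumptions.

```python
# Hypothetical illustration; not taken from the oVirt engine source.
KDUMPING = "Kdumping"


def can_run_manual_pm_action(host_status: str) -> bool:
    """Return False while the host is in Kdumping status, True otherwise."""
    # Manual stop/restart during kdump would lose the dump file,
    # so the action is refused for that status only.
    return host_status != KDUMPING
```

With this sketch, a host in status Up or Non Responsive would accept a manual stop/restart, while a host in status Kdumping would not.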
*********************************
Tests using XML RPC
*********************************
------------------------
Installation tests :
------------------------
1) Adding a host with Detect kdump flow set to on and without crashkernel
command line parameter
Result: host installation is OK, but warning message is displayed in
Events tab and Audit log
TEST PASSED
2) Adding a host with Detect kdump flow set to on, with crashkernel command
line parameter, but without
required version of kexec-tools package
Result: host installation is OK, but warning message is displayed in
Events tab and Audit log
TEST PASSED
3) Adding a host with Detect kdump flow set to on, with crashkernel command
line parameter and with required
version of kexec-tools package
Result: host installation is OK, in General tab of host detail view you
should see Kdump Status: Enabled
TEST PASSED
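For anyone repeating the installation tests, the crashkernel prerequisite can be checked on the host before adding it. The following is a minimal sketch and is not how host deploy performs the check: the helper names has_crashkernel and read_cmdline are made up here, and the kexec-tools version check is left out because the required version is not stated in this report.

```python
def has_crashkernel(cmdline: str) -> bool:
    """True if any kernel command-line token is a crashkernel= setting."""
    return any(tok.startswith("crashkernel=") for tok in cmdline.split())


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the host itself this would be called as has_crashkernel(read_cmdline()).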
------------------------
Kdump detection tests:
------------------------
1) Crashdumping a host with kdump detection disabled
Prerequisites: host was successfully deployed with Detect kdump flow set
to off, fence_kdump listener
is running
Result: Host changes its status Up -> Connecting -> Non Responsive ->
Reboot -> Non Responsive -> Up,
hard fencing is executed
TEST PASSED
2) Crashdumping a host with kdump detection enabled
Prerequisites: host was successfully deployed with Detect kdump flow set
to on, fence_kdump listener
is running
Result: Host changes its status Up -> Connecting -> Non Responsive ->
Kdumping -> Non Responsive -> Up,
hard fencing is not executed, there are messages in Events tab Kdump flow
detected on host and Kdump
flow finished on host
TEST PASSED
3) Crashdumping a host with kdump detection enabled but fence_kdump listener
down
Prerequisites: host was successfully deployed with Detect kdump flow set
to on, fence_kdump listener
is not running
Result: Host changes its status Up -> Connecting -> Non Responsive ->
Reboot -> Non Responsive -> Up,
hard fencing is executed, there's a message in the Events tab: Kdump detection
for host had started,
but fence_kdump listener is not running
TEST PASSED
4) Host with kdump detection enabled, fence_kdump listener is running, but
network between engine
and host is down
Prerequisites: host was successfully deployed with Detect kdump flow set
to on, fence_kdump listener
is running, alter firewall rules on engine to drop everything coming from
host's IP address
Result: Host changes its status Up -> Connecting -> Non Responsive ->
Reboot -> Non Responsive -> Up,
hard fencing is executed, there's a message in the Events tab: Kdump flow not
detected on host
TEST PASSED
5) Crashdumping a host with kdump detection enabled, fence_kdump listener is
running, stop fence_kdump
listener during kdump
Prerequisites: host was successfully deployed with Detect kdump flow set
to on, fence_kdump listener
is running
Actions: When host status is changed to Kdumping, stop fence_kdump
listener
Result: Host changes its status Up -> Connecting -> Non Responsive ->
Kdumping -> Reboot -> Non Responsive
-> Up, hard fencing is executed, there are messages in Events tab Kdump
flow detected on host and Kdump
detection for host had started, but fence_kdump listener is not running
TEST PASSED
I got this message in the event log:
Unable to determine if Kdump is in progress on host
'pluto-vdsf', because fence_kdump listener is not
running.
Is this OK?
Yes, that's the correct error message.
6) Crashdumping a host with kdump detection enabled, fence_kdump listener is
running,
restart engine during kdump
Prerequisites: host was successfully deployed with Detect kdump flow set
to on,
fence_kdump listener is running
Actions: When host status is changed to Kdumping, restart engine
Result: Host changes its status Up -> Connecting -> Non Responsive ->
Kdumping, hard fencing is not
executed, and there is a message in the Events tab: Kdump flow detected on host.
After the engine restart the host
stays in Kdumping status for DisableFenceAtStartupInSec
seconds. After that there
are messages in the Events tab, Kdump flow detected on host and Kdump flow
finished on host, and the host
changes status Kdumping -> Non Responsive -> Up
TEST PASSED
I got only this message in the event log:
Kdump flow is in progress on host 'pluto-vdsf'.
Is this OK?
Yes, even when the engine was restarted during the host dumping operation, it was able
to switch the host from Kdumping to Non Responsive (once the timeout that disables
fencing after startup had expired) and then to Up (after establishing communication
with the host).
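For re-running the kdump detection tests after the JSON issues are resolved, the expected results of tests 1-6 can be summarized as data. This is a minimal sketch: the status sequences are copied from the Result lines above, while the names EXPECTED, hard_fencing_executed and kdump_detected are made up for illustration. Treating a Reboot status as the sign of hard fencing is an observation that holds for these six results, not a statement about engine internals.

```python
# Status sequences copied from the "Result" lines of kdump detection tests 1-6.
FENCED = ["Up", "Connecting", "Non Responsive", "Reboot", "Non Responsive", "Up"]
KDUMPED = ["Up", "Connecting", "Non Responsive", "Kdumping", "Non Responsive", "Up"]

EXPECTED = {
    1: FENCED,   # kdump detection disabled
    2: KDUMPED,  # kdump detection enabled, listener running
    3: FENCED,   # listener down
    4: FENCED,   # network between engine and host down
    5: ["Up", "Connecting", "Non Responsive", "Kdumping",
        "Reboot", "Non Responsive", "Up"],  # listener stopped during kdump
    6: KDUMPED,  # engine restarted during kdump
}


def hard_fencing_executed(sequence):
    """In the results above, hard fencing shows up as a Reboot status."""
    return "Reboot" in sequence


def kdump_detected(sequence):
    """The engine noticed the kdump flow if the host entered Kdumping."""
    return "Kdumping" in sequence
```

A recorded status history from a new run can then be compared against EXPECTED[n] for the matching test number.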
Thanks
Eli Mesika
Thanks a lot for testing.
Martin Perina