[ovirt-users] [OVIRT-3.5-TEST-DAY-3] : fence kdump integration

Martin Perina mperina at redhat.com
Thu Sep 18 09:03:22 UTC 2014



----- Original Message -----
> From: "Eli Mesika" <emesika at redhat.com>
> To: "users" <users at ovirt.org>
> Cc: "Martin Perina" <mperina at redhat.com>
> Sent: Thursday, September 18, 2014 10:44:43 AM
> Subject: [OVIRT-3.5-TEST-DAY-3] : fence kdump integration
> 
> 
> I was testing fence kdump integration.
> 
> Due to the following issues, I was not able to test over JSON RPC, so all
> tests were done with XML RPC:
> 
> 1) Manually applied : http://gerrit.ovirt.org/#/c/32843 (this was not yet
>    included in 3.5, which causes an exception in VDSM when fenceNode is
>    called)
> 
> 2) JSON : does not return status correctly : Test Succeeded, null
>    Therefore, cannot stop/start/restart host
>    Related to engine patch : http://gerrit.ovirt.org/#/c/32855/
>    not yet back-ported to 3.5 (blocker IMO)
> 
> --------------------------------------------------------------
> ALL tests below should be redone after resolving JSON issues !
> --------------------------------------------------------------
> 
> -------------------
> General comment
> -------------------
> 
> All fence kdump flows relate to a scenario in which the host became non
> responsive.
> In the case of a manual PM action (stop/restart), fence kdump does not take
> place.
> I think this should be fully documented in order to prevent misunderstanding
> when the kdump flag
> is checked and the host is rebooted/stopped manually while kdumping, which
> will result in losing the dump file.

If host is in status Kdumping, then user cannot execute manual PM actions
on this host.

> 
> 
> *********************************
> Tests using XML RPC
> *********************************
> 
> ------------------------
> Installation tests :
> ------------------------
> 
> 1) Adding a host with Detect kdump flow set to on and without crashkernel
> command line parameter
>    Result: host installation is OK, but warning message is displayed in
>    Events tab and Audit log
> 
>    TEST PASSED
> 
> 2) Adding a host with Detect kdump flow set to on, with crashkernel command
> line parameter, but without
>    required version of kexec-tools package
>    Result: host installation is OK, but warning message is displayed in
>    Events tab and Audit log
> 
>    TEST PASSED
> 
> 3) Adding a host with Detect kdump flow set to on, with crashkernel command
> line parameter and with required
>    version of kexec-tools package
>    Result: host installation is OK, in General tab of host detail view you
>    should see Kdump Status: Enabled
> 
>    TEST PASSED
> 
> ------------------------
> Kdump detection tests:
> ------------------------
> 
> 1) Crashdumping a host with kdump detection disabled
> 
>     Prerequisites: host was successfully deployed with Detect kdump flow set
>     to off, fence_kdump listener
>     is running
>     Result: Host changes its status Up -> Connecting -> Non Responsive ->
>     Reboot -> Non Responsive -> Up,
>     hard fencing is executed
> 
>     TEST PASSED
> 
> 2) Crashdumping a host with kdump detection enabled
> 
>     Prerequisites: host was successfully deployed with Detect kdump flow set
>     to on, fence_kdump listener
>     is running
>     Result: Host changes its status Up -> Connecting -> Non Responsive ->
>     Kdumping -> Non Responsive -> Up,
>     hard fencing is not executed, there are messages in Events tab Kdump flow
>     detected on host and Kdump
>     flow finished on host
> 
>     TEST PASSED
> 
> 3) Crashdumping a host with kdump detection enabled but fence_kdump listener
> down
> 
>     Prerequisites: host was successfully deployed with Detect kdump flow set
>     to on, fence_kdump listener
>     is not running
>     Result: Host changes its status Up -> Connecting -> Non Responsive ->
>     Reboot -> Non Responsive -> Up,
>     hard fencing is executed, there's message in Events tab Kdump detection
>     for host had started,
>     but fence_kdump listener is not running
> 
>     TEST PASSED
> 
> 
> 4) Host with kdump detection enabled, fence_kdump listener is running, but
> network between engine
>    and host is down
> 
>     Prerequisites: host was successfully deployed with Detect kdump flow set
>     to on, fence_kdump listener
>     is running, alter firewall rules on engine to drop everything coming from
>     host's IP address
>     Result: Host changes its status Up -> Connecting -> Non Responsive ->
>     Reboot -> Non Responsive -> Up,
>     hard fencing is executed, there's message in Events tab Kdump flow not
>     detected on host
> 
>     TEST PASSED
> 
> 5) Crashdumping a host with kdump detection enabled, fence_kdump listener is
> running, stop fence_kdump
>    listener during kdump
> 
>     Prerequisites: host was successfully deployed with Detect kdump flow set
>     to on, fence_kdump listener
>     is running
>     Actions: When host status is changed to Kdumping, stop fence_kdump
>     listener
>     Result: Host changes its status Up -> Connecting -> Non Responsive ->
>     Kdumping -> Reboot -> Non Responsive
>     -> Up, hard fencing is executed, there are messages in Events tab Kdump
>     flow detected on host and Kdump
>     detection for host had started, but fence_kdump listener is not running
> 
>     TEST PASSED           I got this message in event log :
>                           Unable to determine if Kdump is in progress on host
>                           'pluto-vdsf', because fence_kdump listener is not
>                           running.
>                           Is this OK ?

Yes, that's the correct error message

> 
> 
> 6) Crashdumping a host with kdump detection enabled, fence_kdump listener is
> running,
>    restart engine during kdump
> 
>     Prerequisites: host was successfully deployed with Detect kdump flow set
>     to on,
>     fence_kdump listener is running
>     Actions: When host status is changed to Kdumping, restart engine
>     Result: Host changes its status Up -> Connecting -> Non Responsive ->
>     Kdumping, hard fencing is not
>     executed, there are messages in Events tab Kdump flow detected on host,
>     after engine restart host
>     stays in Kdumping status for the period of DisableFenceAtStartupInSec
>     seconds, after that there
>     are messages in Events tab Kdump flow detected on host and Kdump flow
>     finished on host and
>     changes status Kdumping -> Non Responsive -> Up
> 
>     TEST PASSED           I got only this message in event log :
>                           Kdump flow is in progress on host 'pluto-vdsf'.
>                           Is this OK?

Yes, even when the engine was restarted during the host's dumping operation, it
was able to switch the host from Kdumping to Non Responsive (once the period
during which fencing is disabled after startup had expired) and then to Up
(after establishing communication with the host).

> 
> Thanks
> Eli Mesika
> 
> 

Thanks a lot for testing.

Martin Perina


