Hi, I have a 4.1 cluster with FC block storage and a hosted engine. Last night a host went
unreachable due to a driver/firmware issue with the NIC. The Engine spotted this, the
host was fenced and everything behaved as expected. However, it got me thinking - if the
affected host had been the one running the Engine, what would have happened? I'm
assuming the Engine would have failed the liveness check on the other hosted engine hosts, and
they would have attempted to start the Engine. But since the "failed" host still had
access to the storage (I believe the HBA was still working), they would not be able to
get a lock on the storage. In that case I'm in a catch-22: the Engine cannot fence
the failed host because its network is isolated, but the Engine cannot be restarted
elsewhere until the failed host is fenced. At this point it requires human intervention to
fence the failed host. Is my understanding of this correct? If so, is there any way to
mitigate this risk? Thanks, Alan