What do I mean by 'Hosts in Cluster B crashed'?
I have had (3) events (events during which I tweaked the Data Center A network) during
which I accumulated symptoms:
Event 1: A handful of the several dozen VMs in Cluster B Paused due to Storage Issues.
Restarted the VMs to restore service.
Event 2: Same
Event 3: One of the (3) Hosts in Cluster B rebooted (that's what I mean by
'crash'). Gluster was also unhappy (two of the three Hosts also function as
Gluster Bricks) ... but that could be a byproduct.
At the start of Event 3, I put hosted-engine into global maintenance mode:
hosted-engine --set-maintenance --mode=global
Why? Because I was imagining that hosted-engine might perform some sort of connectivity
checks with its local IP gateway ... and if it couldn't reach it, then emit some sort
of 'shutdown' commands to *all* the KVM hosts it knows about (yes, I'm waving
my hands a lot right here ... ergo my interest in reading about what kind of checks
ovirt-engine performs and what kind of remedial action it might take based on the results
of those checks).
You are suggesting that Cluster B depends, storage-wise, on Cluster A (or, more precisely,
on Storage located at Cluster A's site). That's where my thoughts turned
immediately ... but thus far, I don't see it in the pcaps I've gathered -- lots
of ovirt-engine traffic, but nothing else. More poking needed.
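One way to poke further would be to filter the pcaps for storage protocols only. The sketch below builds a capture filter for NFS (2049), iSCSI (3260), the glusterd management port (24007), and Gluster brick ports (49152 and up). The function name, the subnet, and the upper bound of the brick port range are placeholders I made up for illustration, not values from my environment.

```shell
#!/bin/sh
# Build a BPF filter matching common storage traffic to or from one subnet.
# The subnet argument is a placeholder; substitute Data Center A's real range.
storage_filter() {
    subnet="$1"
    # 2049 = NFS, 3260 = iSCSI, 24007 = glusterd management,
    # 49152 and up = Gluster brick ports (the 49251 upper bound is a guess).
    printf 'net %s and (port 2049 or port 3260 or port 24007 or portrange 49152-49251)\n' "$subnet"
}

# Print the filter for a hypothetical subnet.
storage_filter 192.0.2.0/24
```

The result can be handed to tcpdump when reading a saved capture, e.g. tcpdump -nn -r capture.pcap "$(storage_filter 192.0.2.0/24)", where capture.pcap stands in for whatever file was gathered on a Cluster B host. If that comes back empty, it would support the claim that Cluster B does not talk storage protocols to Data Center A, at least for these protocols and ports.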
Versions in use:
oVirt 3.5
GlusterFS 3.7.6
I want to do more homework, to demonstrate that Cluster B has no storage dependency on
Data Center A.
But back to my original question: where might I go to better understand what kind of
checks ovirt-engine performs on KVM hosts and what kind of remedial action it might take,
based on the results of those checks?
--sk