Sorry, Strahil, that I didn't get back to you on this...
I dare not repeat the exercise, because I have no idea whether I'd get out of such a
complete breakdown cleanly again.
Since I don't have duplicate physical infrastructure just to test this behavior, I was
going to run a nested test farm on a single big machine.
I spent about a week trying to get nesting to work, but it ultimately failed to run the
overlay network of the hosted engine properly (see my separate post elsewhere on this list).
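For anyone attempting the same thing, a first sanity check before any nested experiment is whether the L0 kernel exposes nested KVM at all. A minimal sketch (assuming an Intel host; substitute kvm_amd on AMD) might look like:

```shell
# Check whether the kvm_intel module advertises nested virtualization.
# (On AMD hosts, substitute kvm_amd for kvm_intel.)
nested=$(cat /sys/module/kvm_intel/parameters/nested 2>/dev/null)
case "$nested" in
  Y|1) echo "nested KVM enabled" ;;
  N|0) echo "nested KVM disabled (try: modprobe -r kvm_intel && modprobe kvm_intel nested=1)" ;;
  *)   echo "kvm_intel not loaded or no nested parameter exposed" ;;
esac
```

Even with nesting enabled at this level, that only gets the L1 hosts a virtualized CPU; it says nothing about whether oVirt's own network stack survives being layered, which is where my attempt fell apart.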
And then I read, in a response to a much older post on this list, that oVirt nested on
oVirt isn't just "not supported" but known (although not documented or
advertised) not to work at all.
So there went the chance to reproduce the issue...
What I find striking is that the 'original' oVirt, or RHV from pre-Gluster-HCI
days, seems to support the notion of shutting down compute nodes when there isn't
enough workload to fill them, in order to save energy. In an HCI environment that obviously
doesn't play well with the Gluster storage nodes, but pure compute nodes should still
support cold standby.
I can't find any documentation on this, though.