Hi,
i'm running ovirt-3.2.2-el6 on 18 el6 hosts with FC san storage, 46 HA
vms in 2 datacenters (3 hosts uses different storage with no
connectivity to first storage, that's why second DC)
Recently (2014-05-17) i had a double power outage: first blackout at
00:16, went back at ~00:19, second blackout at 00:26, went back at 10:06
When finally all went up (after approx. 10:16) - only 2 vms were
restarted from 46.
From browsing engine log i saw failed restart attemts of almost all vms
after first blackout with error 'Failed with error ENGINE and code
5001', but after second blackout i saw no attempts to restart vms, and
only error was 'connect timeout' (probably to srv5 - that host
physically died after blackouts).
And i cant figure why HA vms were not restarted? Please advice
engine and (supposedly) spm host logs in attach.
--
Yuriy Demchenko