Dear oVirt users,
I've been having several problems with my oVirt nodes utilising GlusterFS after simulating a power loss.
Our oVirt test setup consists of 3 nodes: 2 with GlusterFS storage and 1 used only for compute. Everything seemed to be set up correctly and was working. After I simulated a power loss, the master storage went down, and the VMs and the data center went down with it.
I checked the GlusterFS configuration and the oVirt configuration, and both seem to be correct (see attachment). I tried putting several nodes into maintenance several times, and even reinstalling them while they were in maintenance. Only putting a node into maintenance (and choosing to stop the GlusterFS service) triggers Contending on the other nodes. Only then is there a chance that a node goes from Contending to SPM, and even that does not always happen. After trying this several times I got one of the main nodes to become SPM, but the master storage remained down. When I select the master storage, go to the Data Center, select the data center and click Activate, both the master storage and the data center go into the Locked state for a few seconds and then become Inactive again.
Then I upgraded to oVirt 4 (oVirt Engine Version: 4.0.0.6-1.el7.centos) and tried everything again, with the same result.
I have attached the relevant logs:
- GlusterFS service on both main nodes
- /var/log/vdsm/vdsm.log, /var/log/vdsm/supervdsm.log, /var/log/messages, /var/log/sanlock.log, /var/log/ovirt-engine/engine.log-20160725.gz, /var/log/ovirt-engine/engine.log of the oVirt engine
I clicked Activate on the data center at 9:45:30 am GMT. It is possible that I need to send different engine logs; please tell me if that is the case. Any help would be highly appreciated.
Kind regards
Robin Vanderveken