[ovirt-users] Ovirt 3.5.x Power failure recovery

Raymond raymond at raymondwortel.nl
Mon Feb 9 09:50:57 UTC 2015


Hi,
Long time read-only user here :)
Unfortunately I need some info which I can't seem to find in the archives and on Google.

In the past 5 months we’ve had two very rare occasions of power failures in my hometown.
The first was a spike or drop lasting only milliseconds, just enough to cause a reboot; the other was a 30-minute outage.

Problem
After the power failure my ovirt nodes boot into CentOS 6.6 and are running fine.
In both cases my ovirt-engine VM wouldn't "start" because the PostgreSQL service failed to start.
I tried a few things and then reverted to an older DB snapshot that was on the VM disk, but it still did not work.
PostgreSQL refuses to start with a "Duplicate UUID" error.

My solution
Calm down girlfriend and don't sleep
Reinstall 1 node, create a new ovirt-engine, recreate the VMs and copy the VM disk data from the old store to the new store.
Add second node

Other solutions I can think of
1. Start cluster without reinstall ;)
2. Buy UPS (€150 one time, extra €60 yearly power usage = €450 in 5 years)

At this moment I’m 1 click away from buying the UPS, but I prefer a more elegant solution.

After the last power failure I did NOT reinstall my second ovirt node, and the old engine is still available on storage.
Does anyone want data from them to troubleshoot/analyse?

Or is this just one of those things? Buy the UPS and get on with your life?

Short HW overview
2x ovirt 3.5.x (i3, 32GB RAM, 1Gb eth VM network, dual x520 NFS eth)
1x NFS (i3, 4GB RAM, 5TB SAS and 6TB SATA on Dell PERC, 1Gb eth mgmt, dual x520 10Gb)

The 1Gb links are used for VM networking; the 10Gb links are connected via DAC cables and carry the NFS storage.
This works great! The whole cluster runs below 120 W and VM disk performance is around 700 MB/s :)
I bought all the HW with power usage in mind: PicoPSUs and 35 W CPUs in all nodes.
So adding 30 W "extra" just for the UPS feels a bit like killing an elephant with tissues...
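
For anyone who wants to check the UPS numbers, here is a rough sketch in Python. The €150 purchase price, the €60 yearly power cost and the 30 W draw are the figures from this mail. The function names and the implied price per kWh are only derived for illustration (assuming the UPS draws its 30 W around the clock), so treat that last figure as an estimate, not a quoted tariff.

```python
# Rough UPS cost check. Constants come from the mail; the helpers are illustrative.
UPS_PRICE_EUR = 150          # one-time purchase price
YEARLY_POWER_COST_EUR = 60   # extra electricity cost per year
UPS_DRAW_WATTS = 30          # extra continuous draw of the UPS


def total_cost(years: int) -> int:
    """Purchase price plus electricity over the given number of years."""
    return UPS_PRICE_EUR + YEARLY_POWER_COST_EUR * years


def yearly_kwh(watts: float) -> float:
    """Energy used in one year by a load that runs 24/7 (assumption)."""
    return watts * 24 * 365 / 1000


def implied_price_per_kwh() -> float:
    """Electricity price implied by the yearly cost and the 24/7 draw."""
    return YEARLY_POWER_COST_EUR / yearly_kwh(UPS_DRAW_WATTS)


if __name__ == "__main__":
    print(f"5-year cost: EUR {total_cost(5)}")
    print(f"Yearly energy: {yearly_kwh(UPS_DRAW_WATTS):.1f} kWh")
    print(f"Implied price: EUR {implied_price_per_kwh():.3f} per kWh")
```

Against a cluster that stays below 120 W, the 30 W would add roughly a quarter to the total draw.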

kind regards
Raymond


