Yesterday my oVirt cluster, which is made up of 3 nodes, crashed.
All I did was try to add a new network, but the whole cluster went down.
I had added a new network to my cluster, and while I was debugging the new switch, the
switch was powered off. The nodes detected that the network card link was down and then
moved to the Non-Operational state.
At that point all 3 nodes were in the Non-Operational state.
All virtual machines started migrating automatically. By the time I received the alert
email, all virtual machines were suspended.
Within 15 minutes the new switch was powered up again. The 3 oVirt nodes became active
again, but many virtual machines had become unresponsive or suspended because of the forced
migration, and only a few came back up because their migration had been cancelled.
Even after I tried to terminate the migration tasks and restarted the ovirt-engine service,
I was still unable to restore most of the virtual machines, so I had to restart all 3 oVirt
nodes to restore them.
It took an hour before I had recovered all the virtual machines.
Then I changed my migration policy to "Do Not Migrate Virtual Machines".
Which migration policy do you recommend?
I'm afraid to use the cluster...
zhouhao(a)vip.friendtimes.net