[ovirt-users] oVirt 3.4 - Hosted Engine: Cluster Reboot procedure
适兕
lijiangsheng1 at gmail.com
Sat Apr 26 01:30:52 UTC 2014
Hi,
Thanks, great work.
Why not add this to the wiki?
2014-04-25 21:08 GMT+08:00 Daniel Helgenberger <daniel.helgenberger at m-box.de>:
> Hello ovirt-users,
>
> After playing around with my oVirt 3.4 hosted-engine two-node HA cluster,
> I have devised a procedure for restarting the whole cluster after a
> power loss or a normal shutdown. It assumes all HA nodes have been taken
> offline. Parts of it also apply to individually rebooted HA nodes.
>
> Please feel free to ask questions and/or suggest improvements. Most
> of this should be made obsolete by future updates anyway.
>
> Note 1:
> The problem IMHO seems to be the NFS storage domain not being connected,
> which makes the HA agent crash or hang. The ha-broker service should be
> up and running all the time; please check this.
>
> Note 2:
> My setup consists of two nodes; 'all nodes' means the task has to be
> performed on every HA node in the cluster.
>
> Note 3:
> By 'Login' I mean SSH or local access.
>
>
> Part A: SHUTDOWN THE CLUSTER
> Prerequisite: oVirt HE cluster is running and should be taken offline for
> maintenance:
> 1. In oVirt, shut down all VMs except HostedEngine.
> 2. Login to one cluster node and run 'hosted-engine
> --set-maintenance --mode=global' to put the cluster into global
> maintenance.
> 3. Login to the oVirt engine VM and shut it down with 'shutdown -h now'.
> 4. Login to one cluster node and run 'hosted-engine --vm-status' to
> check that the engine is really down.
> 5. Afterwards, shut down all HA nodes.
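
The command-line parts of Part A (steps 2 and 4) could be wrapped roughly as below. This is only a sketch: the function names and the HE_CMD variable are made up for illustration, and only the two hosted-engine invocations are taken from the procedure. Steps 1, 3 and 5 stay manual.

```shell
#!/bin/sh
# Sketch of Part A, steps 2 and 4, to be run on one HA node.
# HE_CMD is a hypothetical override so the functions can be dry-run
# against a stub; by default the real hosted-engine tool is called.
HE_CMD="${HE_CMD:-hosted-engine}"

# Step 2: put the whole cluster into global maintenance.
enter_global_maintenance() {
    "$HE_CMD" --set-maintenance --mode=global
}

# Step 4: print the engine status for manual inspection.
# The output is deliberately not parsed here; read it yourself and
# confirm the engine VM is down before powering off the nodes.
show_engine_status() {
    "$HE_CMD" --vm-status
}
```

After sourcing the file, call enter_global_maintenance, shut the engine VM down from inside (step 3), then call show_engine_status before shutting down the nodes.
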
>
>
> Part B: STARTING THE CLUSTER
> Prerequisite: oVirt HE cluster down, NFS storage server running and
> exporting the vdsm share.
> 1. Start all nodes and wait for them to boot up.
> 2. Login to one cluster node. Check the status of the following
> services: vdsm, ovirt-ha-agent, ovirt-ha-broker. All of them
> should be running except ovirt-ha-agent, which is down and in the
> 'locked' state.
> 3. Run 'hosted-engine --vm-status'; at this point it should fail with a
> Python stack trace (crash).
> 4. On all cluster nodes, connect the storage pool: 'hosted-engine
> --connect-storage'. Now, 'hosted-engine --vm-status' runs and
> reports 'up to date: False' and 'unknown-stale-data' for all
> nodes.
> 5. On all cluster nodes, start the 'ovirt-ha-agent' service:
> 'service ovirt-ha-agent start'
> 6. Wait a few minutes for the ha-broker and the agent to collect
> the cluster state.
> 7. Login to one cluster node. Check 'hosted-engine --vm-status'
> until the cluster nodes report 'status-up-to-date: True' and
> 'score: 2400'.
> 8. If you shut the cluster down yourself and it is still in global
> maintenance, remove the maintenance mode with 'hosted-engine
> --set-maintenance --mode=none'. Now, the system should do an FSM
> reinitialize and start the HostedEngine by itself.¹ If it was
> not in maintenance (e.g. after a power failure), the engine should be
> started as soon as one host gets a score of 2400.
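
Steps 4 to 8 of Part B lend themselves to a small helper script. The following is a rough sketch, not a tested tool: the function names and the HE_CMD and SVC_CMD variables are hypothetical, and the score check assumes that the status text contains a line of the form 'score: 2400' (case and spacing may differ on your version). It only checks that some host reports 2400, which is the condition named in step 8; the 'status-up-to-date' field is not evaluated.

```shell
#!/bin/sh
# Sketch of Part B, steps 4-8. HE_CMD and SVC_CMD are hypothetical
# overrides so the functions can be dry-run against stubs.
HE_CMD="${HE_CMD:-hosted-engine}"
SVC_CMD="${SVC_CMD:-service}"

# Steps 4 and 5 (run on every HA node): connect the storage,
# then start the HA agent. Stops if the storage connect fails.
bring_up_ha_agent() {
    "$HE_CMD" --connect-storage || return 1
    "$SVC_CMD" ovirt-ha-agent start
}

# Step 7: succeed if the status output shows a score of 2400.
# ASSUMPTION: the output has a line like 'score: 2400'.
has_full_score() {
    "$HE_CMD" --vm-status 2>/dev/null |
        grep -Eiq 'score[[:space:]]*:[[:space:]]*2400'
}

# Steps 6 and 7: poll up to $1 times, sleeping $2 seconds in between.
wait_for_full_score() {
    tries="${1:-30}"
    delay="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        if has_full_score; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Step 8: leave global maintenance so the HA agent starts the engine.
leave_global_maintenance() {
    "$HE_CMD" --set-maintenance --mode=none
}
```

On every node, source the file and run bring_up_ha_agent. Then, on one node, something like 'wait_for_full_score 30 10 && leave_global_maintenance' covers the remaining steps; skip the last call if the cluster was not in global maintenance.
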
>
>
> Part C: STARTING A SINGLE NODE
> Prerequisite: oVirt HE cluster up, HostedEngine running. One HA node was
> taken offline via local maintenance in oVirt and rebooted.
> 1. Follow steps 1-5 of Part B.
> 2. In oVirt, navigate to Cluster, Hosts and activate the node
> previously in maintenance.
>
> ---
> ¹ I observed the following things:
> * If you use the command 'hosted-engine --vm-shutdown' instead of
> logging in to the oVirt HE and doing a local shutdown, the Default
> Data Center is set to non-responsive and shows as 'Contending' after
> the reboot. I strongly suspect that the command causes an unclean
> shutdown. Furthermore, it waits about two minutes before shutting down.
> * If you use the command 'hosted-engine --vm-start' on a cluster
> in global maintenance, wait for a successful start ({'health':
> 'good', 'vm': 'up', 'detail': 'up'}) and then remove the maintenance
> status, the engine gets restarted once. If you remove the
> maintenance first and let ha-agent do the work, the engine
> is not restarted.
>
>
> Cheers,
> Daniel
> --
>
> Daniel Helgenberger
> m box bewegtbild GmbH
>
> P: +49/30/2408781-22
> F: +49/30/2408781-10
>
> ACKERSTR. 19
> D-10115 BERLIN
>
>
> www.m-box.de www.monkeymen.tv
>
> Geschäftsführer: Martin Retschitzegger / Michaela Göllner
> Handelsregister: Amtsgericht Charlottenburg / HRB 112767
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
--
Independence of thought, freedom of spirit.
-- 陈寅恪