[ovirt-users] Can HA Agent control NFS Mount?

Doron Fediuck dfediuck at redhat.com
Fri Jun 13 12:33:38 UTC 2014



----- Original Message -----
> From: "Andrew Lau" <andrew at andrewklau.com>
> To: "Bob Doolittle" <bob at doolittle.us.com>
> Cc: "users" <users at ovirt.org>
> Sent: Friday, June 6, 2014 6:14:18 AM
> Subject: Re: [ovirt-users] Can HA Agent control NFS Mount?
> 
> On Fri, Jun 6, 2014 at 1:09 PM, Bob Doolittle <bob at doolittle.us.com> wrote:
> > Thanks Andrew, I'll try this workaround tomorrow for sure. But reading
> > through that bug report (closed not a bug) it states that the problem should
> > only arise if something is not releasing a sanlock lease. So if we've
> > entered Global Maintenance and shut down Engine, the question is what's
> > holding the lease?
> >
> > How can that be debugged?
> 
> For me it's wdmd and sanlock itself failing to shut down properly. I
> also noticed even when in global maintenance and the engine VM powered
> off there is still a sanlock lease for the
> /rhev/mnt/....hosted-engine/? lease file or something along those
> lines. So the global maintenance may not actually be releasing that
> lock.
> 
> I'm not too familiar with sanlock etc. So it's like stabbing in the dark :(
> 

Sounds like a bug since once the VM is off there should not
be a lease taken.

Please check whether a lease is still held a minute after the
VM goes down; see: http://www.ovirt.org/SANLock#sanlock_timeouts

If it is, try stopping vdsm and libvirt so we'll know
which of them still holds the lease.
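
Something along these lines might help with the check. It is only a rough sketch: I am assuming that "sanlock client status" prints one line per held resource lease starting with "r ", and the helper names held_leases and wait_for_release are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumption: `sanlock client status` prints held resource
# leases as lines beginning with "r " (lockspaces begin with "s ").

# Filter status output on stdin down to the resource-lease lines.
# (hypothetical helper name)
held_leases() {
    grep '^r ' || true
}

# Poll until no resource lease is held, or until the timeout in seconds
# (default 60, i.e. "a minute") runs out.
# Returns 0 if released, 1 if still held (leases are printed), 2 if the
# status command itself failed. (hypothetical helper name)
wait_for_release() {
    timeout=${1:-60}
    elapsed=0
    leases=""
    while [ "$elapsed" -lt "$timeout" ]; do
        status=$(sanlock client status) || return 2
        leases=$(printf '%s\n' "$status" | held_leases)
        if [ -z "$leases" ]; then
            return 0
        fi
        sleep 5
        elapsed=$((elapsed + 5))
    done
    printf '%s\n' "$leases"
    return 1
}
```

Run "wait_for_release 60" as root once the engine VM is down. If it still prints a lease, stop vdsmd, run it again, then stop libvirtd and run it once more; the step after which the lease disappears points at the holder.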

> >
> > -Bob
> >
> > On Jun 5, 2014 10:56 PM, "Andrew Lau" <andrew at andrewklau.com> wrote:
> >>
> >> On Mon, May 26, 2014 at 5:10 AM, Bob Doolittle <bob at doolittle.us.com>
> >> wrote:
> >> >
> >> > On 05/25/2014 02:51 PM, Joop wrote:
> >> >>
> >> >> On 25-5-2014 19:38, Bob Doolittle wrote:
> >> >>>
> >> >>>
> >> >>> Also curious is that when I say "poweroff" it actually reboots and
> >> >>> comes
> >> >>> up again. Could that be due to the timeouts on the way down?
> >> >>>
> >> >> Ah, that's something my F19 host does too. Some more info: if engine
> >> >> hasn't been started on the host then I can shut it down and it will
> >> >> poweroff.
> >> >> IF engine has been run on it then it will reboot.
> >> >> It's not vdsm (I think) because my shutdown sequence is (on my f19
> >> >> host):
> >> >>  service ovirt-ha-agent stop
> >> >>  service ovirt-ha-broker stop
> >> >>  service vdsmd stop
> >> >>  ssh root@engine01 "init 0"
> >> >> init 0
> >> >>
> >> >> I don't use maintenance mode because when I poweron my host (= my
> >> >> desktop)
> >> >> I want engine to power on automatically which it does most of the time
> >> >> within 10 min.
> >> >
> >> >
> >> > For comparison, I see this issue and I *do* use maintenance mode
> >> > (because
> >> > presumably that's the 'blessed' way to shut things down and I'm scared
> >> > to
> >> > mess this complex system up by straying off the beaten path ;). My
> >> > process
> >> > is:
> >> >
> >> > ssh root@engine "init 0"
> >> > (wait for "vdsClient -s 0 list | grep Status:" to show the vm as down)
> >> > hosted-engine --set-maintenance --mode=global
> >> > poweroff
> >> >
> >> > And then on startup:
> >> > hosted-engine --set-maintenance --mode=none
> >> > hosted-engine --vm-start
> >> >
> >> > There are two issues here. I am not sure if they are related or not.
> >> > 1. The NFS timeout during shutdown (Joop do you see this also? Or just
> >> > #2?)
> >> > 2. The system reboot instead of poweroff (which messes up remote machine
> >> > management)
> >> >
> >> > Thanks,
> >> >      Bob
> >> >
> >> >
> >> >> I think wdmd or sanlock are causing the reboot instead of poweroff
> >>
> >> While searching for my issue of wdmd/sanlock not shutting down, I
> >> found this which may interest you both:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=888197
> >>
> >> Specifically:
> >> "To shut down sanlock without causing a wdmd reboot, you can run the
> >> following command: "sanlock client shutdown -f 1"
> >>
> >> This will cause sanlock to kill any pid's that are holding leases,
> >> release those leases, and then exit.
> >> "
> >>
> >> >>
> >> >> Joop
> >> >>


