Hi,
Did anyone have any luck tracking this down? I rebooted one of our servers
and hit this issue again; conveniently, the Dell remote access card has
borked as well, so that meant a 50-minute trip to the DC.
On Thu, Jun 19, 2014 at 10:10 AM, Bob Doolittle <bobddroid(a)gmail.com> wrote:
Specifically, if I do the following:
- Enter global maintenance (hosted-engine --set-maintenance --mode=global)
- Run init 0 on the engine VM
- systemctl stop ovirt-ha-agent ovirt-ha-broker libvirtd vdsmd
and then run "sanlock client status" I see:
# sanlock client status
daemon c715b5de-fd98-4146-a0b1-e9801179c768.xion2.smar
p -1 helper
p -1 listener
p -1 status
s 003510e8-966a-47e6-a5eb-3b5c8a6070a9:1:/rhev/data-center/mnt/xion2.smartcity.net\:_export_VM__NewDataDomain/003510e8-966a-47e6-a5eb-3b5c8a6070a9/dom_md/ids:0
s 18eeab54-e482-497f-b096-11f8a43f94f4:1:/rhev/data-center/mnt/xion2\:_export_vm_he1/18eeab54-e482-497f-b096-11f8a43f94f4/dom_md/ids:0
s hosted-engine:1:/rhev/data-center/mnt/xion2\:_export_vm_he1/18eeab54-e482-497f-b096-11f8a43f94f4/ha_agent/hosted-engine.lockspace:0
Waiting a few minutes does not change this state.
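For reference, here is the full sequence as I run it. This is a minimal
sketch: "engine-vm" is a placeholder for the engine VM's hostname, and I am
assuming the --set-maintenance spelling of the maintenance flag.

# 1. Put the cluster into global maintenance
hosted-engine --set-maintenance --mode=global
# 2. Shut down the engine VM from inside it
ssh root@engine-vm init 0
# 3. Stop the HA services, libvirt and vdsm on the host
systemctl stop ovirt-ha-agent ovirt-ha-broker libvirtd vdsmd
# 4. Even minutes later the lockspaces are still acquired
sleep 300
sanlock client status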
The earlier data I shared, which showed the HostedEngine VM, was from a
different test scenario.
-Bob
On 06/18/2014 07:53 AM, Bob Doolittle wrote:
I see I have a very unfortunate typo in my previous mail. As the vm-status
output I attached confirms, I had set --mode=global (not none) in step 1.
I am not the only one experiencing this. I can reproduce it easily. It
appears that shutting down vdsm causes the HA services to incorrectly think
the system has come out of Global Maintenance and restart the engine.
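A quick way to watch this happen (the agent log path is the standard
ovirt-hosted-engine-ha location; the exact wording of the maintenance line
varies between versions):

# Check whether the agent still sees the global maintenance flag
hosted-engine --vm-status | grep -i maintenance
# Follow the agent's view of the flag while vdsm is stopped
tail -f /var/log/ovirt-hosted-engine-ha/agent.log | grep -i maintenance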
-Bob
On Jun 18, 2014 5:06 AM, "Federico Simoncelli" <fsimonce(a)redhat.com> wrote:
> ----- Original Message -----
> > From: "Bob Doolittle" <bob(a)doolittle.us.com>
> > To: "Doron Fediuck" <dfediuck(a)redhat.com>, "Andrew Lau" <andrew(a)andrewklau.com>
> > Cc: "users" <users(a)ovirt.org>, "Federico Simoncelli" <fsimonce(a)redhat.com>
> > Sent: Saturday, June 14, 2014 1:29:54 AM
> > Subject: Re: [ovirt-users] Can HA Agent control NFS Mount?
> >
> >
> > But there may be more going on. Even if I stop vdsmd, the HA services,
> > and libvirtd, and sleep 60 seconds, I still see a lock held on the
> > Engine VM storage:
> >
> > daemon 6f3af037-d05e-4ad8-a53c-61627e0c2464.xion2.smar
> > p -1 helper
> > p -1 listener
> > p -1 status
> > s 003510e8-966a-47e6-a5eb-3b5c8a6070a9:1:/rhev/data-center/mnt/xion2.smartcity.net\:_export_VM__NewDataDomain/003510e8-966a-47e6-a5eb-3b5c8a6070a9/dom_md/ids:0
> > s hosted-engine:1:/rhev/data-center/mnt/xion2\:_export_vm_he1/18eeab54-e482-497f-b096-11f8a43f94f4/ha_agent/hosted-engine.lockspace:0
>
> This output shows that the lockspaces are still acquired. When you put
> hosted-engine in maintenance they must be released: one directly via
> rem_lockspace (since it's the hosted-engine lockspace) and the other via
> stopMonitoringDomain.
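>
> As a rough sketch (reusing the values from your output above; the
> vdsClient verb is my assumption about what the fix should call, not a
> tested recipe), the missing cleanup would look something like:
>
> # Release the hosted-engine lockspace directly
> sanlock client rem_lockspace -s hosted-engine:1:/rhev/data-center/mnt/xion2\:_export_vm_he1/18eeab54-e482-497f-b096-11f8a43f94f4/ha_agent/hosted-engine.lockspace:0
> # Stop monitoring the storage domain so its lockspace is released as well
> vdsClient -s 0 stopMonitoringDomain 003510e8-966a-47e6-a5eb-3b5c8a6070a9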
>
> I quickly looked at the ovirt-hosted-engine* projects and I haven't found
> anything
> related to that.
>
> --
> Federico
>