On Thu, Jan 31, 2019 at 2:48 PM Nir Soffer <nsoffer(a)redhat.com> wrote:
On Thu, Jan 31, 2019 at 2:52 PM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
> Dear Nir,
>
> the issue with the 'The method does not exist or is not available:
> {'method': u'GlusterHost.list'}, code = -32601' is not related to the
> sanlock. I don't know why the 'vdsm-gluster' package was not installed as a
> dependency.
>
Please file a bug about this.
> Can you share your sanlock log?
> >
> I'm attaching the contents of /var/log , but here is a short snippet:
>
> About the sanlock issue - it reappeared with errors like :
> 2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin error -223 lockspace hosted-engine host_id 1
>
As I said, the error is not -233 but -223, which makes sense: this error
means sanlock did not find the magic number for a delta lease area, which
means the area was either never formatted or has been corrupted.
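
To pick such failures out of a longer sanlock log, something like the following could be used. This is only a rough sketch: the line layout is inferred from the log lines quoted in this thread, the names `find_errors` and `KNOWN_CODES` are made up for illustration, and the only error code it describes is -223, with the meaning given above.

```python
import re

# Layout inferred from the lines quoted in this thread (an assumption):
# "<date> <time> <uptime> [<thread id>]: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<uptime>\d+) \[(?P<tid>\d+)\]: (?P<message>.*)$"
)
# Messages in this thread report failures as "error -N" or "result -N".
ERROR_RE = re.compile(r"(?:error|result) (?P<code>-\d+)")

# Only the code discussed in this thread; other codes are deliberately left out.
KNOWN_CODES = {
    -223: "no magic number found in the delta lease area "
          "(area not formatted, or corrupted)",
}


def find_errors(lines):
    """Yield (timestamp, code, message) for each line reporting a negative code."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a sanlock log line in the assumed layout
        err = ERROR_RE.search(match.group("message"))
        if err:
            timestamp = f"{match.group('date')} {match.group('time')}"
            yield timestamp, int(err.group("code")), match.group("message")


# Sample lines copied from this thread, with the mail wrapping undone.
SAMPLE = [
    "2019-01-31 13:33:10 27551 [17279]: leader1 delta_acquire_begin error -223 "
    "lockspace hosted-engine host_id 1",
    "2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 nh 0 mh 1 "
    "oi 0 og 0 lv 0",
    "2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223",
]

for timestamp, code, message in find_errors(SAMPLE):
    meaning = KNOWN_CODES.get(code, "meaning not listed in this sketch")
    print(f"{timestamp} code {code}: {meaning} <- {message}")
```

Run against the full /var/log/sanlock.log, this would show whether -223 is the only failure reported for the hosted-engine lockspace.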
> 2019-01-31 13:33:10 27551 [17279]: leader2 path /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b offset 0
> 2019-01-31 13:33:10 27551 [17279]: leader3 m 0 v 30003 ss 512 nh 0 mh 1 oi 0 og 0 lv 0
> 2019-01-31 13:33:10 27551 [17279]: leader4 sn hosted-engine rn ts 0 cs 60346c59
> 2019-01-31 13:33:11 27551 [21482]: s6 add_lockspace fail result -223
> 2019-01-31 13:33:16 27556 [21482]: s7 lockspace hosted-engine:1:/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5/411b6cee-5b01-47ca-8c28-bb1fed8ac83b:0
>
>
> I have managed to fix it by running the following immediately after the
> ha services were started by ansible:
>
> cd /rhev/data-center/mnt/glusterSD/ovirt1.localdomain\:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/ha_agent/
>
This is not a path managed by vdsm, so I guess the issue is with the
hosted-engine-specific lockspace, which is managed by hosted engine, not by
vdsm.
> sanlock direct init -s hosted-engine:0:hosted-engine.lockspace:0
>
This formats the lockspace, and is expected to fix this issue.
> systemctl stop ovirt-ha-agent ovirt-ha-broker
> systemctl status vdsmd
> systemctl start ovirt-ha-broker ovirt-ha-agent
>
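
For anyone comparing the lockspace argument passed to `sanlock direct init` with the one in the `s7 lockspace` log line, here is a small sketch that splits such a string into its fields. It assumes the `name:host_id:path:offset` order described in the sanlock man page; `LockspaceSpec` and `parse_lockspace` are illustrative names, not part of sanlock or vdsm, and the split is simplified (no handling of escaped characters).

```python
from typing import NamedTuple


class LockspaceSpec(NamedTuple):
    """Fields of a lockspace string, assumed order: name:host_id:path:offset."""
    name: str
    host_id: int
    path: str
    offset: int


def parse_lockspace(spec: str) -> LockspaceSpec:
    # The first two fields are split from the left and the offset from the
    # right, so whatever remains in the middle is treated as the path.
    name, host_id, rest = spec.split(":", 2)
    path, offset = rest.rsplit(":", 1)
    return LockspaceSpec(name, int(host_id), path, int(offset))


# The argument used with "sanlock direct init -s" in this thread.
init_spec = parse_lockspace("hosted-engine:0:hosted-engine.lockspace:0")

# The lockspace reported in the "s7 lockspace" log line in this thread.
log_spec = parse_lockspace(
    "hosted-engine:1:/var/run/vdsm/storage/"
    "808423f9-8a5c-40cd-bc9f-2568c85b8c74/"
    "2c74697a-8bd9-4472-8a98-bf624f3462d5/"
    "411b6cee-5b01-47ca-8c28-bb1fed8ac83b:0"
)

print(init_spec)
print(log_spec)
```

Both strings name the same lockspace (`hosted-engine`) at offset 0; they differ in the host_id field and in the path through which the lockspace is reached.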
> Once the VM started - ansible managed to finish the deployment without
> any issues.
> I hope someone can check the sanlock init stuff , as it is really
> frustrating.
>
I'd suggest not touching the managed lockspace directly in the middle of the
deployment, to avoid further issues.
If I understand the flow correctly, you created a new environment from
scratch, so this is an issue with hosted engine deployment not initializing
the lockspace.
I think filing a bug with the info in this thread is the first step.
Simone, can you take a look at this?
On our CI environment everything works as expected, and the lockspace volume
was initialised as expected.
In the attached logs a lot of steps were skipped because a lot of things were
already up and running, so the logs are not really useful.
Strahil, can you please retry on a really clean environment and, if you are
able to reproduce the issue, attach the relevant logs?