> Doesn't cleaning sanlock lockspace require also to stop sanlock itself?
> I guess it's supposed to be able to handle this, but perhaps users want
> to clean the lockspace because dirt there causes also problems with
> sanlock, no?

Sanlock can be up, but the lockspace has to be unused.
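
In practice that means putting the cluster into global maintenance and
making sure nothing is using the lockspace while you rewrite it. Roughly
like this (a sketch from memory, please double-check the exact steps
against the howto page and 'hosted-engine --help' on your version):

  # on one host: stop HA state-machine actions cluster-wide
  hosted-engine --set-maintenance --mode=global

  # on every HA host: make sure the lockspace is no longer in use
  systemctl stop ovirt-ha-agent ovirt-ha-broker

  # on one host: rewrite the sanlock lockspace
  hosted-engine --reinitialize-lockspace

  # bring everything back
  systemctl start ovirt-ha-broker ovirt-ha-agent
  hosted-engine --set-maintenance --mode=none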

> So the only tool we have to clean metadata is '--clean-metadata', which
> works one-by-one?

Correct, it needs to acquire the lock first to make sure nobody is writing.
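
For a stale entry like the host 52 slot in this thread, you would run it
from a working HA host, something along these lines (the --host-id flag
is from memory, please verify with 'hosted-engine --help' first):

  hosted-engine --clean-metadata --host-id=52

Without --host-id it only cleans the local host's slot, which is why it
is a one-by-one operation, and the agent owning the slot being cleaned
must not be running, otherwise the lock cannot be acquired.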

The dirty disk issue should not be happening anymore; we added an
equivalent of the dd cleanup to the hosted-engine setup. But we might
have a bug there, of course.
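
If it does turn out to be leftover dirt on the FC LUN, the manual cleanup
is still the one quoted below: stop the HA tooling, locate the
hosted-engine.metadata symlink under /rhev, and zero out its target.
A rough sketch (the path is a placeholder, yours will differ, and be
very careful to zero only the metadata volume):

  systemctl stop ovirt-ha-agent ovirt-ha-broker
  find /rhev -name hosted-engine.metadata
  ls -l <path printed by find>     # inspect the symlink target before touching it
  dd if=/dev/zero of="$(readlink -f '<path printed by find>')" bs=1M
  systemctl start ovirt-ha-broker ovirt-ha-agent
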
Martin
On Wed, Apr 20, 2016 at 10:34 AM, Yedidyah Bar David <didi(a)redhat.com> wrote:
> On Wed, Apr 20, 2016 at 11:20 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>> after moving to global maintenance.
>>
>> Good point.
>>
>>> Martin - any advantage of this over '--reinitialize-lockspace'? Besides
>>> that it works also in older versions? Care to add this to the howto page?
>>
>> Reinitialize lockspace clears the sanlock lockspace, not the metadata
>> file. Those are two different places.
>
> So the only tool we have to clean metadata is '--clean-metadata', which
> works one-by-one?
>
> Doesn't cleaning sanlock lockspace require also to stop sanlock itself?
> I guess it's supposed to be able to handle this, but perhaps users want
> to clean the lockspace because dirt there causes also problems with
> sanlock, no?
>
>>
>>> Care to add this to the howto page?
>>
>> Yeah, I can do that.
>
> Thanks!
>
>>
>> Martin
>>
>> On Wed, Apr 20, 2016 at 10:17 AM, Yedidyah Bar David <didi(a)redhat.com> wrote:
>>> On Wed, Apr 20, 2016 at 11:11 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>> Assuming you never deployed a host with ID 52, this is likely a result of a
>>>>> corruption or dirt or something like that.
>>>>
>>>>> I see that you use FC storage. In previous versions, we did not clean such
>>>>> storage, so you might have dirt left.
>>>>
>>>> This is the exact reason for an error like yours. Using dirty block
>>>> storage. Please stop all hosted engine tooling (both agent and broker)
>>>> and fill the metadata drive with zeros.
>>>
>>> after moving to global maintenance.
>>>
>>> Martin - any advantage of this over '--reinitialize-lockspace'? Besides
>>> that it works also in older versions? Care to add this to the howto page?
>>> Thanks!
>>>
>>>>
>>>> You will have to find the proper hosted-engine.metadata file (which
>>>> will be a symlink) under /rhev:
>>>>
>>>> Example:
>>>>
>>>> [root@dev-03 rhev]# find . -name hosted-engine.metadata
>>>>
>>>> ./data-center/mnt/str-01.rhev.lab.eng.brq.redhat.com:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
>>>>
>>>> [root@dev-03 rhev]# ls -al ./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
>>>>
>>>> lrwxrwxrwx. 1 vdsm kvm 201 Mar 15 15:00 ./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata -> /rhev/data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/images/6ab3f215-f234-4cd4-b9d4-8680767c3d99/dcbfa48d-8543-42d1-93dc-aa40855c4855
>>>>
>>>> And use (for example) dd if=/dev/zero of=/path/to/metadata bs=1M to
>>>> clean it - But be CAREFUL to not touch any other file or disk you
>>>> might find.
>>>>
>>>> Then restart the hosted engine tools and all should be fine.
>>>>
>>>>
>>>>
>>>> Martin
>>>>
>>>>
>>>> On Wed, Apr 20, 2016 at 8:20 AM, Yedidyah Bar David <didi(a)redhat.com> wrote:
>>>>> On Wed, Apr 20, 2016 at 7:15 AM, Wee Sritippho <wee.s(a)forest.go.th> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I used CentOS-7-x86_64-Minimal-1511.iso to install the hosts and the engine.
>>>>>>
>>>>>> The 1st host and the hosted-engine were installed successfully, but the 2nd
>>>>>> host failed with this error message:
>>>>>>
>>>>>> "Failed to execute stage 'Setup validation':
Metadata version 2 from host 52
>>>>>> too new for this agent (highest compatible version: 1)"
>>>>>
>>>>> Assuming you never deployed a host with ID 52, this is likely a result of a
>>>>> corruption or dirt or something like that.
>>>>>
>>>>> What do you get on host 1 running 'hosted-engine --vm-status'?
>>>>>
>>>>> I see that you use FC storage. In previous versions, we did not clean such
>>>>> storage, so you might have dirt left. See also [1]. You can try cleaning
>>>>> using [2].
>>>>>
>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1238823
>>>>> [2] https://www.ovirt.org/documentation/how-to/hosted-engine/#lockspace-corru...
>>>>>
>>>>>>
>>>>>> Here are the package versions:
>>>>>>
>>>>>> [root@host02 ~]# rpm -qa | grep ovirt
>>>>>> libgovirt-0.3.3-1.el7_2.1.x86_64
>>>>>> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>>>>>> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
>>>>>> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
>>>>>> ovirt-hosted-engine-ha-1.3.5.1-1.el7.centos.noarch
>>>>>> ovirt-hosted-engine-setup-1.3.4.0-1.el7.centos.noarch
>>>>>> ovirt-release36-007-1.noarch
>>>>>> ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
>>>>>> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
>>>>>>
>>>>>> [root@engine ~]# rpm -qa | grep ovirt
>>>>>> ovirt-engine-setup-base-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-setup-plugin-ovirt-engine-common-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
>>>>>> ovirt-engine-tools-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
>>>>>> ovirt-release36-007-1.noarch
>>>>>> ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
>>>>>> ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
>>>>>> ovirt-engine-extensions-api-impl-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
>>>>>> ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
>>>>>> ovirt-engine-setup-plugin-websocket-proxy-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>>>>>> ovirt-engine-backend-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-dbscripts-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-webadmin-portal-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-setup-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-guest-agent-common-1.0.11-1.el7.noarch
>>>>>> ovirt-engine-wildfly-8.2.1-1.el7.x86_64
>>>>>> ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
>>>>>> ovirt-engine-websocket-proxy-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-restapi-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-userportal-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-engine-setup-plugin-ovirt-engine-3.6.4.1-1.el7.centos.noarch
>>>>>> ovirt-image-uploader-3.6.0-1.el7.centos.noarch
>>>>>> ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
>>>>>> ovirt-engine-lib-3.6.4.1-1.el7.centos.noarch
>>>>>>
>>>>>>
>>>>>> Here are the log files:
>>>>>> https://gist.github.com/weeix/1743f88d3afe1f405889a67ed4011141
>>>>>>
>>>>>> --
>>>>>> Wee
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Didi
>>>
>>>
>>>
>>> --
>>> Didi
>
>
>
> --
> Didi