On 6/3/2016 at 6:37 PM, Nir Soffer wrote:
On Fri, Jun 3, 2016 at 11:27 AM, InterNetX - Juergen Gotteswinter
<juergen.gotteswinter(a)internetx.com> wrote:
> What if we move all VMs off the LUN which causes this error, drop the LUN
> and recreate it? Will we "migrate" the error with the VMs to a different
> LUN, or could this be a fix?
This should fix the ids file, but since we don't know why this corruption
happened, it may happen again.
I am pretty sure I know when and why this happened: these messages first
occurred after a major outage, with the engine going crazy fencing hosts
plus a crash / hard reset of the SAN.
But I can provide a log package, no problem.
Please open a bug with the log I requested so we can investigate this
issue.
To fix the ids file you don't have to recreate the lun, just
initialize the ids lv.
1. Put the domain into maintenance (via engine)
No host should access it while you reconstruct the ids file.
2. Activate the ids lv
You may need to connect to this iSCSI target first, unless you have other
VGs connected on the same target.
lvchange -ay <sd_uuid>/ids
3. Initialize the lockspace
sanlock direct init -s <sd_uuid>:0:/dev/<sd_uuid>/ids:0
4. Deactivate the ids lv
lvchange -an <sd_uuid>/ids
5. Activate the domain (via engine)
The domain should become active after a while.
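
The activate / init / deactivate commands above could be wrapped in a small
helper, roughly as sketched below. This is only a sketch under assumptions:
the function names reinit_ids and run and the DRY_RUN variable are made up
for illustration and are not part of oVirt or sanlock; only the lvchange and
sanlock command lines come from the steps above. It assumes the domain is
already in maintenance and that the iSCSI target is connected. By default it
only prints the commands.

```shell
#!/bin/sh
# Sketch: reinitialize the lockspace on a storage domain's ids LV.
# ASSUMPTIONS: reinit_ids, run and DRY_RUN are hypothetical names used here
# for illustration. The domain must already be in maintenance (via engine).

run() {
    # Always print the command; execute it only when DRY_RUN is not 1.
    # DRY_RUN defaults to 1 so nothing touches storage by accident.
    echo "$*"
    if [ "${DRY_RUN:-1}" != "1" ]; then
        "$@"
    fi
}

reinit_ids() {
    sd_uuid=$1
    if [ -z "$sd_uuid" ]; then
        echo "usage: reinit_ids <sd_uuid>" >&2
        return 1
    fi

    # Activate the ids LV
    run lvchange -ay "$sd_uuid/ids" || return 1

    # Initialize the lockspace on the ids LV
    run sanlock direct init -s "$sd_uuid:0:/dev/$sd_uuid/ids:0"
    rc=$?

    # Deactivate the ids LV again, even if the init step failed
    run lvchange -an "$sd_uuid/ids" || return 1

    return "$rc"
}
```

Calling `reinit_ids <sd_uuid>` prints the three commands for review; running
it as root with `DRY_RUN=0 reinit_ids <sd_uuid>` would actually execute them.
Putting the domain into maintenance and activating it again still happens in
the engine.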
Oh, this is great, I am going to announce a maintenance window. Thanks a lot,
this had already started to drive me crazy. Will report back after we have
done this!
Nir
>
> On 6/3/2016 at 10:08 AM, InterNetX - Juergen Gotteswinter wrote:
>> Hello David,
>>
>> thanks for your explanation of those messages. Is there any possibility
>> to get rid of this? I already figured out that it might be a corruption
>> of the ids file, but I didn't find anything about re-creating it or other
>> solutions to fix this.
>>
>> IMHO this occurred after an outage where several hosts and the iSCSI
>> SAN were fenced and/or rebooted.
>>
>> Thanks,
>>
>> Juergen
>>
>>
>> On 6/2/2016 at 6:03 PM, David Teigland wrote:
>>> On Thu, Jun 02, 2016 at 06:47:37PM +0300, Nir Soffer wrote:
>>>>> This is a mess that's been caused by improper use of storage, and various
>>>>> sanity checks in sanlock have all reported errors for "impossible"
>>>>> conditions indicating that something catastrophic has been done to the
>>>>> storage it's using. Some fundamental rules are not being followed.
>>>>
>>>> Thanks David.
>>>>
>>>> Do you need more output from sanlock to understand this issue?
>>>
>>> I can think of nothing more to learn from sanlock. I'd suggest tighter,
>>> higher level checking or control of storage. Low level sanity checks
>>> detecting lease corruption are not a convenient place to work from.
>>>
>>
>> _______________________________________________
>> Users mailing list
>> Users(a)ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>