[ovirt-users] Sanlock add Lockspace Errors

InterNetX - Juergen Gotteswinter jg at internetx.com
Sat Jun 4 10:10:01 UTC 2016


On 6/3/2016 at 6:37 PM, Nir Soffer wrote:
> On Fri, Jun 3, 2016 at 11:27 AM, InterNetX - Juergen Gotteswinter
> <juergen.gotteswinter at internetx.com> wrote:
>> What if we move all VMs off the LUN which causes this error, drop the LUN
>> and recreate it? Will we "migrate" the error with the VMs to a different
>> LUN, or could this be a fix?
> 
> This should fix the ids file, but since we don't know why this corruption
> happened, it may happen again.
>

I am pretty sure I know when and why this happened: these messages first
occurred after a major outage, with the engine going crazy fencing hosts
plus a crash / hard reset of the SAN.

But I can provide a log package, no problem.


> Please open a bug with the log I requested so we can investigate this issue.
> 
> To fix the ids file you don't have to recreate the lun, just
> initialize the ids lv.
> 
> 1. Put the domain into maintenance (via engine)
> 
> No host should access it while you reconstruct the ids file
> 
> 2. Activate the ids lv
> 
> You may need to connect to this iscsi target first, unless you have other
> vgs connected on the same target.
> 
>     lvchange -ay <sd_uuid>/ids
> 
> 3. Initialize the lockspace
> 
>     sanlock direct init -s <sd_uuid>:0:/dev/<sd_uuid>/ids:0
> 
> 4. Deactivate the ids lv
> 
>     lvchange -an <sd_uuid>/ids
> 
> 5. Activate the domain (via engine)
> 
> The domain should become active after a while.
> 

Oh, this is great, I am going to announce a maintenance window. Thanks a lot,
this had already started to drive me crazy. Will report back after we have
done this!
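
For the maintenance window, steps 2 to 4 quoted above could be wrapped in a
small script along these lines. This is only a sketch: the lvchange and
sanlock commands are the ones from Nir's mail, while the names reinit_ids,
run, SD_UUID and DRY_RUN are made up for illustration. It assumes the domain
is already in maintenance (step 1) and that the iSCSI target is connected.
By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: reinitialize the ids lockspace of a block storage domain.
# Assumption: the domain is in maintenance and no host is accessing it.
# SD_UUID and DRY_RUN are hypothetical variable names used for illustration.

SD_UUID="${SD_UUID:-<sd_uuid>}"   # placeholder, set to the storage domain UUID
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reinit_ids() {
    sd_uuid="$1"

    # Step 2: activate the ids lv
    run lvchange -ay "$sd_uuid/ids" || return 1

    # Step 3: initialize the lockspace (remember the result for later)
    run sanlock direct init -s "$sd_uuid:0:/dev/$sd_uuid/ids:0"
    rc=$?

    # Step 4: deactivate the ids lv, even if the init failed
    run lvchange -an "$sd_uuid/ids" || return 1

    return "$rc"
}

reinit_ids "$SD_UUID"
```

Running it with SD_UUID set and DRY_RUN left at 1 shows the three commands
for review; setting DRY_RUN=0 would execute them. Putting the domain into
maintenance and activating it again still happens in the engine.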

> Nir
> 
>>
>> On 6/3/2016 at 10:08 AM, InterNetX - Juergen Gotteswinter wrote:
>>> Hello David,
>>>
>>> thanks for your explanation of those messages. Is there any possibility
>>> to get rid of them? I already figured out that it might be a corruption
>>> of the ids file, but I didn't find anything about re-creating it or other
>>> solutions to fix this.
>>>
>>> IMHO this occurred after an outage where several hosts and the iSCSI
>>> SAN were fenced and/or rebooted.
>>>
>>> Thanks,
>>>
>>> Juergen
>>>
>>>
>>> On 6/2/2016 at 6:03 PM, David Teigland wrote:
>>>> On Thu, Jun 02, 2016 at 06:47:37PM +0300, Nir Soffer wrote:
>>>>>> This is a mess that's been caused by improper use of storage, and various
>>>>>> sanity checks in sanlock have all reported errors for "impossible"
>>>>>> conditions indicating that something catastrophic has been done to the
>>>>>> storage it's using.  Some fundamental rules are not being followed.
>>>>>
>>>>> Thanks David.
>>>>>
>>>>> Do you need more output from sanlock to understand this issue?
>>>>
>>>> I can think of nothing more to learn from sanlock.  I'd suggest tighter,
>>>> higher level checking or control of storage.  Low level sanity checks
>>>> detecting lease corruption are not a convenient place to work from.
>>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>



