[ovirt-users] [HELP] unmapping a deleted storage domain triggers crash ovirt

Andrea Ghelardi a.ghelardi at iontrading.com
Mon Nov 21 11:43:30 UTC 2016


Hello Nir,

Actually, the oVirt 3.5 host crashes even if the storage domain has been destroyed / removed from oVirt (hence oVirt has no knowledge of this storage and its mpath device and doesn’t bother about it). But I guess this is what you refer to when you write about “handling of unmapped LUNs”.

However, now that I caught you online, could you confirm that deleting an entity (storage domain) from oVirt should release the sanlock lockspace?
In other words: what is the supported way to delete an iscsi SAN volume?
My current approach is:

1)      Maintenance storage domain from ovirt4

2)      Detach+remove or just destroy storage domain from ovirt4

3)      Wait ~10 minutes

4)      Access SAN control panel and unmap+delete the associated volume.

5)      If needed, access every host and remove the leftover mpath device manually (multipath -f <device ID>); see the sketch after this list.
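
For step 5, the per-host cleanup is roughly the following (untested sketch; the WWID and the sdX path names are placeholders for the actual device):

    # 1. List multipath maps and note the WWID and sdX paths of the dead device
    multipath -ll

    # 2. Flush the stale multipath map (this is the "multipath -f" step above)
    multipath -f 36000d31000abcd0000000000000000     # placeholder WWID

    # 3. Remove each underlying SCSI path device (names taken from multipath -ll)
    echo 1 > /sys/block/sdq/device/delete             # repeat for every path
    echo 1 > /sys/block/sdr/device/delete

    # 4. Make sure the udev event queue drains cleanly afterwards
    udevadm settle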

SAN is a DELL Compellent SC040
Cheers
AG

From: Nir Soffer [mailto:nsoffer at redhat.com]
Sent: Thursday, November 17, 2016 18:59
To: Andrea Ghelardi <a.ghelardi at iontrading.com>
Cc: users at ovirt.org
Subject: Re: [ovirt-users] [HELP] unmapping a deleted storage domain triggers crash ovirt

On Thu, Nov 17, 2016 at 5:48 PM, Andrea Ghelardi <a.ghelardi at iontrading.com> wrote:
Hello again,

Just to provide an update for future reference, I’ve been unable to sort this problem out.
However, Ovirt4 does not show this failure.

Handling of unmapped luns was fixed in 3.6.

The best way is to upgrade to 3.6, and then upgrade to 4, and I think this is also
the only supported upgrade path.


My current approach to move VMs from one installation to another is:
1)            Shutdown VM in ovirt3.
2)            Maintenance VM storage - do you mean the storage server?
3)            Clone VM storage at LUN level
4)            Import cloned storage in ovirt4
5)            Import VM in ovirt4
If everything is ok:

You must put the storage domain into maintenance here.

Since we don't yet support removing a PV from a storage domain, you cannot
remove a single LUN from a storage domain; you must remove the entire storage domain.
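
To see which LUNs a block storage domain actually spans before removing it, something like this can be run on any host (rough sketch; the UUID is a placeholder; the VG backing a block domain is named after the storage domain UUID):

    # every PV of the domain's VG is one multipath LUN that will be freed
    pvs -o pv_name,vg_name,pv_size | grep 6d3fc9a0-0000-0000-0000-000000000000

    # cross-check the multipath WWIDs behind those PVs
    multipath -ll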

6a)          detach + remove original VM storage in ovirt3
7a)          remove mappings for the original VM storage LUN in ovirt3; this implies cluster instability (sanlock crash on hosts);
8a)          recover the cluster; this implies shutting down hosts (and their VMs) one by one
9a)          delete unmapped original LUN from storage manager

When sanlock freezes on a host, the only solution found so far is to reboot it. This has to be done on every host experiencing the problem (which basically means: all hosts in the cluster).

If sanlock froze, it means that you unmapped a LUN that was still used by sanlock.
This is not supported.
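
One way to check, before unmapping anything, whether sanlock still uses that LUN (rough sketch; the domain UUID is a placeholder):

    # lockspaces of block storage domains are named after the domain UUID;
    # if the domain still shows up here, its LUN is still in use by sanlock
    sanlock client status

    # the ids/leases volumes live on LVs inside the domain's VG; this shows
    # which physical devices they actually sit on
    lvs -o lv_name,vg_name,devices 6d3fc9a0-0000-0000-0000-000000000000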


Cheers
AG


From: Andrea Ghelardi
Sent: Tuesday, November 08, 2016 16:40
To: users at ovirt.org
Subject: [HELP] unmapping a deleted storage domain triggers crash ovirt

Hello people,

something’s not right in my ovirt infrastructure.
I currently have two different ovirt installation:
Ovirt3: 7 hosts linked to Compellent iscsi storage running ovirt 3.5.6
Ovirt4: 4 hosts linked to (same) Compellent iscsi storage running ovirt 4.0.4

I’m currently moving my guests from ovirt3 to ovirt4.
Since the iscsi storage is linked to both installations, my high-level approach is:

1)      Shutdown VM in ovirt3.

2)      Maintenance + detach + remove VM storage in ovirt3

3)      Change LUN mapping via iscsi storage manager from ovirt3 to ovirt4 (see the pre-check sketch after this list)

4)      Import storage in ovirt4

5)      Import VM in ovirt4

6)      Run and cheers with high grade liquor. GOTO step 1 for different VM.
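
Before step 3, a quick per-host sanity check that nothing on the ovirt3 side still holds the detached domain's device (rough sketch; the WWID and UUID prefix are placeholders):

    # the detached domain's multipath device should no longer be held open
    # ("Open count: 0" means no LVM, sanlock or qemu user is left)
    dmsetup info /dev/mapper/36000d31000abcd0000000000000000

    # and none of the domain's LVs should still be mapped on this host
    dmsetup ls | grep 6d3fc9a0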

Now, as soon as I perform step 3 (removing the mappings for the LUN), ovirt3 goes crazy and eventually forces me to reboot all hosts one by one.
I tried different low-level approaches to unmap the mpath’ed LUN from the hosts, with inconsistent results.
A notable error log extract is:
Nov  8 15:30:52 sovana vdsm root ERROR Process failed with rc=1 out='
udevadm settle - timeout of 5 seconds reached, the event queue contains:
  /sys/devices/virtual/block/dm-39 (8603)
  /sys/devices/virtual/block/dm-39 (8604)
  /sys/devices/virtual/block/dm-39 (8605)
  [the same /sys/devices/virtual/block/dm-39 entry repeats for events 8606 through 8646; the rest of the output is truncated in the original message]
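
When udevadm settle times out like this, the device in question can be inspected directly; a rough sketch, using the dm-39 name from the log above:

    # what is dm-39, and is anything still holding it open?
    dmsetup info /dev/dm-39
    ls /sys/block/dm-39/holders/

    # retry the settle and see whether the queue ever drains
    udevadm settle --timeout=10; echo "settle exit code: $?"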

I really need your help to sort this out as I’m actually blocked in my task.

Why do mapping changes trigger an ovirt crash for a storage domain that ovirt should no longer care about?
Thanks


Andrea Ghelardi

+39 050 2203 71 | www.iongroup.com | a.ghelardi at iontrading.com
Via San Martino, 52 – 56125 Pisa - ITALY




_______________________________________________
Users mailing list
Users at ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
