On Tue, May 22, 2018 at 11:49 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Fri, May 11, 2018 at 2:59 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello,
I had an error during live storage migration of a disk.
The destination image was created but the process was not completed, because of a bug in the original version of sw.

sw?

I originally had this problem in an oVirt environment. If I'm not wrong, it was 4.1.6 at that time.
The storage was iSCSI based.
It seems to me that in 4.1.9 it no longer occurs.
I had the same problem in an RHV environment based on FC-SAN.
I opened a case for it, and there is this bugzilla where I provided information about the RHV version in which the problem was solved:

 
 
Then I updated the sw, but if I try to move the same disk to the same destination storage domain again, I get

VDSM command HSMGetAllTasksStatusesVDS failed: Cannot create Logical Volume: ('679c0725-75fb-4af7-bff1-7c447c5d789c', 'd2a89b5e-7d62-4695-96d8-b762ce52b379')

This engine log is not very useful. We need the complete vdsm log to understand
what happened.
 
On the destination storage domain, which is empty, I see only the 2 OVF_STORE disks in the web admin GUI.
From the OS point of view, using lvs, I see the leftover LV that oVirt complains it is unable to create (I suppose because it already exists due to the former error)

# lvs 679c0725-75fb-4af7-bff1-7c447c5d789c/d2a89b5e-7d62-4695-96d8-b762ce52b379
  LV                                   VG                                   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  d2a89b5e-7d62-4695-96d8-b762ce52b379 679c0725-75fb-4af7-bff1-7c447c5d789c -wi------- 55.00g     

Using lvs -o+tags will give more useful info on this lv, like the image uuid and state of
the lv.
 
I know that I should use the "vdsClient -s 0 deleteVolume " command from the SPM host.

the syntax should be
# vdsClient -s 0 deleteVolume --help
Error using command: list index out of range 

deleteVolume
<sdUUID> <spUUID> <imgUUID> <volUUID>,...,<volUUID> <postZero> [<force>]
Deletes an volume if its a leaf. Else returns error

This is an old tool that was removed a long time ago. If you are using 4.1 or later, you can
use vdsm-client, which is better documented and supported.
 
I am having difficulty mapping the various elements exactly.
Is the following right?

sdUUID --> VG name

spUUID I can retrieve using:

# vdsClient -s 0 getStorageDomainInfo 679c0725-75fb-4af7-bff1-7c447c5d789c
        uuid = 679c0725-75fb-4af7-bff1-7c447c5d789c
        type = ISCSI
        vguuid = nkoZA2-nQOu-oeXX-Phpa-moqh-FWuR-AFAh4B
        metadataDevice = 36589cfc0000006dd999f5618bf759d3f
        state = OK
        version = 4
        role = Master
        vgMetadataDevice = 36589cfc0000006dd999f5618bf759d3f
        class = Data
        pool = ['5af30d59-004c-02f2-01c9-0000000000b8']
        name = ISCSI_400G

so spUUID is the pool --> 5af30d59-004c-02f2-01c9-0000000000b8 in my case ?
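
If the pool field is indeed the spUUID, as the question above assumes, it can be pulled out of the getStorageDomainInfo output with a small filter. This is only a sketch: the function name pool_uuid_from_info is made up for illustration, and it relies on the output keeping the "pool = ['<uuid>']" layout shown above.

```shell
# Hypothetical helper: read getStorageDomainInfo output on stdin and print
# the first UUID listed in the "pool = [...]" line, if there is one.
pool_uuid_from_info() {
    sed -n "s/^[[:space:]]*pool = \['\([^']*\)'.*/\1/p"
}

# Example input copied from the output above; on a host the input would
# come from the getStorageDomainInfo command instead of printf.
printf "        class = Data\n        pool = ['5af30d59-004c-02f2-01c9-0000000000b8']\n" \
    | pool_uuid_from_info
```

The filter prints nothing when no pool line is present, so an empty result suggests the domain is not attached to any pool.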

For imgUUID I don't know a command to retrieve it.
In my case the target storage domain (ISCSI_400G) is currently the master one, and I can see it under /rhev/data-center/mnt/blockSD/
and so I find

# ll /rhev/data-center/mnt/blockSD/679c0725-75fb-4af7-bff1-7c447c5d789c/images/
total 4
drwxr-xr-x. 2 vdsm kvm 4096 May 10 15:39 530b3e7f-4ce4-4051-9cac-1112f5f9e8b5

So it seems to me in my case imgUUID is 530b3e7f-4ce4-4051-9cac-1112f5f9e8b5

But even if it is right in my particular case, how can I get it in general?

The image uuid is in the lv tag. If you use:

lvs -o vg_name,lv_name,tags

You will find it as IU_<image-uuid>
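
As a rough sketch of how the IU_ tag could be picked out of a tag list in a script: the function name image_uuid_from_tags and the sample tag string (including the MD_5 tag) are made up for illustration. It assumes lvs prints the tags as one comma-separated list, possibly padded with spaces.

```shell
# Hypothetical helper: print the image UUID carried by the IU_ tag in a
# comma-separated tag list. Prints nothing if no IU_ tag is present.
image_uuid_from_tags() {
    printf '%s\n' "$1" | tr -d ' \t' | tr ',' '\n' | sed -n 's/^IU_//p'
}

# Sample tag string, made up for illustration. On a real host it would
# come from something like:
#   lvs --noheadings -o tags <vg_name>/<lv_name>
image_uuid_from_tags '  MD_5,IU_530b3e7f-4ce4-4051-9cac-1112f5f9e8b5'
```

Stripping the whitespace first keeps the match working even when lvs pads the column.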

volUUID? Is it the corresponding LV name, so in my case d2a89b5e-7d62-4695-96d8-b762ce52b379?

Should the vdsClient command also remove the LV?

Probably, but note that vdsClient (and vdsm-client) are not a supported API.

Do you see this disk on the engine side? It should be aware of this disk, since it created
the disk during live storage migration.

Also, we should not have leftover volumes after failed operations. Please file a bug
for this and attach both engine.log and vdsm.log from the host doing the live storage
migration.
 
Nir