
On Wed, Aug 15, 2018 at 10:30 PM Алексей Максимов <aleksey.i.maksimov@yandex.ru> wrote:
Hello Nir
To confirm this theory, please share the output of:

Top volume:

dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=16 iflag=direct

DOMAIN=6db73566-0f7f-4438-a9ef-6815075f45ea
CTIME=1533083673
FORMAT=COW
DISKTYPE=DATA
LEGALITY=LEGAL
SIZE=62914560
VOLTYPE=LEAF
DESCRIPTION=
IMAGE=cdf1751b-64d3-42bc-b9ef-b0174c7ea068
PUUID=208ece15-1c71-46f2-a019-6a9fce4309b2
MTIME=0
POOL_UUID=
TYPE=SPARSE
GEN=0
EOF
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000348555 s, 1.5 MB/s
Base volume:

dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=23 iflag=direct

DOMAIN=6db73566-0f7f-4438-a9ef-6815075f45ea
CTIME=1512474404
FORMAT=COW
DISKTYPE=2
LEGALITY=LEGAL
SIZE=62914560
VOLTYPE=INTERNAL
DESCRIPTION={"DiskAlias":"KOM-APP14_Disk1","DiskDescription":""}
IMAGE=cdf1751b-64d3-42bc-b9ef-b0174c7ea068
PUUID=00000000-0000-0000-0000-000000000000
MTIME=0
POOL_UUID=
TYPE=SPARSE
GEN=0
EOF
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00031362 s, 1.6 MB/s
Deleted volume?:

dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=15 iflag=direct

NONE=######################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################
EOF
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000350361 s, 1.5 MB/s
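The same kind of read can be used to scan the whole metadata LV for this cleared pattern, in case other volumes on this domain were left in the same state. A rough sketch (the 64-slot upper bound is an assumption; raise it if the domain has more volumes):

for slot in $(seq 0 63); do
    dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=$slot iflag=direct 2>/dev/null \
        | grep -q '^NONE=' && echo "slot $slot: cleared"
done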
This confirms that 6db73566-0f7f-4438-a9ef-6815075f45ea/4974a4cc-b388-456f-b98e-19d2158f0d58 is a deleted volume: its metadata slot (MD_15, read with skip=15 above) contains only the cleared-slot pattern, while the other two volumes still have valid metadata.

To fix this VM, please remove this volume. Run these commands on the SPM host:

systemctl stop vdsmd
lvremove 6db73566-0f7f-4438-a9ef-6815075f45ea/4974a4cc-b388-456f-b98e-19d2158f0d58
systemctl start vdsmd

You should be able to create a snapshot after that.
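If you want to verify the cleanup, this should report the volume gone afterwards (just a sanity check, assuming the VG and LV names above):

lvs -o lv_name 6db73566-0f7f-4438-a9ef-6815075f45ea | grep 4974a4cc-b388-456f-b98e-19d2158f0d58 \
    || echo "volume removed"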
15.08.2018, 21:09, "Nir Soffer" <nsoffer@redhat.com>:
On Wed, Aug 15, 2018 at 6:14 PM Алексей Максимов <aleksey.i.maksimov@yandex.ru> wrote:
Hello Nir
Thanks for the answer. The output of the commands is below.
1. Please share the output of this command on one of the hosts:

lvs -o vg_name,lv_name,tags | grep cdf1751b-64d3-42bc-b9ef-b0174c7ea068
# lvs -o vg_name,lv_name,tags | grep cdf1751b-64d3-42bc-b9ef-b0174c7ea068
VG                                   LV                                   LV Tags
...
6db73566-0f7f-4438-a9ef-6815075f45ea 208ece15-1c71-46f2-a019-6a9fce4309b2 IU_cdf1751b-64d3-42bc-b9ef-b0174c7ea068,MD_23,PU_00000000-0000-0000-0000-000000000000
6db73566-0f7f-4438-a9ef-6815075f45ea 4974a4cc-b388-456f-b98e-19d2158f0d58 IU_cdf1751b-64d3-42bc-b9ef-b0174c7ea068,MD_15,PU_00000000-0000-0000-0000-000000000000
6db73566-0f7f-4438-a9ef-6815075f45ea 8c66f617-7add-410c-b546-5214b0200832 IU_cdf1751b-64d3-42bc-b9ef-b0174c7ea068,MD_16,PU_208ece15-1c71-46f2-a019-6a9fce4309b2
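A note on these tags: IU_<uuid> names the image the volume belongs to, PU_<uuid> names the parent volume (all zeros means no parent), and MD_<n> is the volume's slot in the domain's metadata LV, one 512-byte block per slot, which is why the dd commands later in this thread use bs=512 skip=<n>. So a volume's metadata can be dumped from its LV name alone, for example (a sketch, assuming the tag layout shown above):

VG=6db73566-0f7f-4438-a9ef-6815075f45ea
LV=8c66f617-7add-410c-b546-5214b0200832
SLOT=$(lvs --noheadings -o tags $VG/$LV | tr -d ' ' | tr ',' '\n' | sed -n 's/^MD_//p')
dd if=/dev/$VG/metadata bs=512 count=1 skip=$SLOT iflag=direct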
So we have 3 volumes - 2 are base volumes:

- 208ece15-1c71-46f2-a019-6a9fce4309b2 IU_cdf1751b-64d3-42bc-b9ef-b0174c7ea068,MD_23,PU_00000000-0000-0000-0000-000000000000
- 4974a4cc-b388-456f-b98e-19d2158f0d58 IU_cdf1751b-64d3-42bc-b9ef-b0174c7ea068,MD_15,PU_00000000-0000-0000-0000-000000000000

And one is a top volume:

- 8c66f617-7add-410c-b546-5214b0200832 IU_cdf1751b-64d3-42bc-b9ef-b0174c7ea068,MD_16,PU_208ece15-1c71-46f2-a019-6a9fce4309b2
So according to vdsm, this is the chain:
208ece15-1c71-46f2-a019-6a9fce4309b2 <- 8c66f617-7add-410c-b546-5214b0200832 (top)
The volume 4974a4cc-b388-456f-b98e-19d2158f0d58 is not part of this chain.
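This is visible directly in the PU_ (parent UUID) tags. A sketch that prints each volume's parent, reusing the names above:

VG=6db73566-0f7f-4438-a9ef-6815075f45ea
for lv in 208ece15-1c71-46f2-a019-6a9fce4309b2 \
          4974a4cc-b388-456f-b98e-19d2158f0d58 \
          8c66f617-7add-410c-b546-5214b0200832; do
    parent=$(lvs --noheadings -o tags $VG/$lv | tr -d ' ' | tr ',' '\n' | sed -n 's/^PU_//p')
    echo "$lv parent: $parent"
done

Only 8c66f617-7add-410c-b546-5214b0200832 has a real parent; the other two claim to be base volumes, and no volume names 4974a4cc-b388-456f-b98e-19d2158f0d58 as its parent.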
2. Please share the output of this command for each of these volumes:

qemu-img info --backing /dev/vg_name/lv_name
# qemu-img info --backing /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/208ece15-1c71-46f2-a019-6a9fce4309b2

image: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/208ece15-1c71-46f2-a019-6a9fce4309b2
file format: qcow2
virtual size: 30G (32212254720 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
This is the base volume according to vdsm and qemu, good.
# qemu-img info --backing /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/4974a4cc-b388-456f-b98e-19d2158f0d58

image: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/4974a4cc-b388-456f-b98e-19d2158f0d58
file format: qcow2
virtual size: 30G (32212254720 bytes)
disk size: 0
cluster_size: 65536
backing file: 208ece15-1c71-46f2-a019-6a9fce4309b2 (actual path: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/208ece15-1c71-46f2-a019-6a9fce4309b2)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/208ece15-1c71-46f2-a019-6a9fce4309b2
file format: qcow2
virtual size: 30G (32212254720 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
This is the deleted volume according to vdsm metadata. We can see that this volume still has a backing file pointing to the base volume.
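Before such a volume is removed, it is worth confirming that nothing still backs onto it, i.e. that neither remaining volume uses it as a backing file (a quick sketch, assuming the device paths above; the LVs must be active for qemu-img to read them):

for lv in 208ece15-1c71-46f2-a019-6a9fce4309b2 8c66f617-7add-410c-b546-5214b0200832; do
    qemu-img info /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/$lv \
        | grep -q 4974a4cc && echo "$lv still references the deleted volume"
done

The qemu-img outputs in this mail already show no such reference, so removing it will not break the chain.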
# qemu-img info --backing /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/8c66f617-7add-410c-b546-5214b0200832

image: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/8c66f617-7add-410c-b546-5214b0200832
file format: qcow2
virtual size: 30G (32212254720 bytes)
disk size: 0
cluster_size: 65536
backing file: 208ece15-1c71-46f2-a019-6a9fce4309b2 (actual path: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/208ece15-1c71-46f2-a019-6a9fce4309b2)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /dev/6db73566-0f7f-4438-a9ef-6815075f45ea/208ece15-1c71-46f2-a019-6a9fce4309b2
file format: qcow2
virtual size: 30G (32212254720 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
This is the top volume.
So I think this is what happened:
You had this chain in the past:
208ece15-1c71-46f2-a019-6a9fce4309b2 <- 4974a4cc-b388-456f-b98e-19d2158f0d58 <- 8c66f617-7add-410c-b546-5214b0200832 (top)
You deleted a snapshot in engine, which created the new chain:
208ece15-1c71-46f2-a019-6a9fce4309b2 <- 8c66f617-7add-410c-b546-5214b0200832 (top)
                                     <- 4974a4cc-b388-456f-b98e-19d2158f0d58 (deleted)

Deleting 4974a4cc-b388-456f-b98e-19d2158f0d58 failed, but we cleared the metadata of this volume.
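If that is what happened, the MD_15 slot of the deleted volume should now contain only the cleared-slot pattern. A quick check for that, without eyeballing the dump (a sketch; the NONE= marker matches how vdsm renders cleared slots, as the dump at the top of this thread shows):

dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=15 iflag=direct 2>/dev/null \
    | grep -q '^NONE=' && echo "slot 15 cleared" || echo "slot 15 still has metadata"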
To confirm this theory, please share the output of:
Top volume:
dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=16 iflag=direct
Base volume:
dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=23 iflag=direct
Deleted volume?:
dd if=/dev/6db73566-0f7f-4438-a9ef-6815075f45ea/metadata bs=512 count=1 skip=15 iflag=direct

Nir