Hi Nir, thanks for the reply, here's the output:

(BASE)
[root@node02 ~]# qemu-img info --backing-chain /rhev/data-center/mnt/blockSD/cec63cf0-9311-488d-b1fa-99c4405e8379/images/65ec515e-0aae-4fe6-a561-387929c7fb4d/52532d05-970e-4643-9774-96c31796062c
image: /rhev/data-center/mnt/blockSD/cec63cf0-9311-488d-b1fa-99c4405e8379/images/65ec515e-0aae-4fe6-a561-387929c7fb4d/52532d05-970e-4643-9774-96c31796062c
file format: raw
virtual size: 700G (751619276800 bytes)
disk size: 0

(with Snapshot)
[root@node02 ~]# qemu-img info --backing-chain /rhev/data-center/mnt/blockSD/cec63cf0-9311-488d-b1fa-99c4405e8379/images/65ec515e-0aae-4fe6-a561-387929c7fb4d/86a6fdb9-b9a4-4b78-8e3d-940f83cedc5a
image: /rhev/data-center/mnt/blockSD/cec63cf0-9311-488d-b1fa-99c4405e8379/images/65ec515e-0aae-4fe6-a561-387929c7fb4d/86a6fdb9-b9a4-4b78-8e3d-940f83cedc5a
file format: qcow2
virtual size: 1.1T (1181116006400 bytes)
disk size: 0
cluster_size: 65536
backing file: 52532d05-970e-4643-9774-96c31796062c (actual path: /rhev/data-center/mnt/blockSD/cec63cf0-9311-488d-b1fa-99c4405e8379/images/65ec515e-0aae-4fe6-a561-387929c7fb4d/52532d05-970e-4643-9774-96c31796062c)
backing file format: raw
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /rhev/data-center/mnt/blockSD/cec63cf0-9311-488d-b1fa-99c4405e8379/images/65ec515e-0aae-4fe6-a561-387929c7fb4d/52532d05-970e-4643-9774-96c31796062c
file format: raw
virtual size: 700G (751619276800 bytes)
disk size: 0

I really appreciate your time helping,
regards,

2018-05-14 11:36 GMT-03:00 Nir Soffer <nsoffer@redhat.com>:

On Mon, May 14, 2018 at 5:19 PM Juan Pablo <pablo.localhost@gmail.com> wrote:

OK, so I'm confirming that the image is wrong somehow:
with no snapshot, from inside the VM the disk size is reported as 750G.
with a snapshot, from inside the VM the disk size is reported as 1100G.
Both have no partitions on them, so I guess oVirt migrated the structure of the 750G disk onto a 1100G disk. Any ideas to troubleshoot this and see if there's data to recover?

Maybe you resized the disk after making a snapshot?

If the base is raw, the size seen by the guest is the size of the image.
The snapshot is qcow2, the size seen by the guest is the size saved in the qcow2 header.

Can you share the output of:

qemu-img info --backing-chain /path/to/snapshot

And:

qemu-img info --backing-chain /path/to/base

You can see the path in the vm xml, either in vdsm.log, or using virsh:

virsh -r list
virsh -r dumpxml vm-id

Nir

regards,

2018-05-13 15:25 GMT-03:00 Juan Pablo <pablo.localhost@gmail.com>:

2 clues:
- The original size of the disk was 750G, and it was extended a month ago to 1100G. The system rebooted fine several times and took the new size with no problems.
- I ran fdisk from a CentOS 7 rescue CD and '/dev/vda' reported 750G. Then I took a snapshot of the disk to play with recovery tools, and now fdisk reports 1100G... ¬¬

So my guess is that the extend and the later migration to a different storage domain caused the issue.
I'm currently running testdisk to see if there's any partition to recover.

regards,

2018-05-13 12:31 GMT-03:00 Juan Pablo <pablo.localhost@gmail.com>:

I removed the auto-snapshot and still no luck. No bootable disk found. =(
Ideas?

2018-05-13 12:26 GMT-03:00 Juan Pablo <pablo.localhost@gmail.com>:

Benny, thanks for your reply:
OK, so the steps are: remove the snapshot in the first place. Then what do you suggest?

2018-05-12 15:23 GMT-03:00 Nir Soffer <nsoffer@redhat.com>:

On Sat, 12 May 2018, 11:32 Benny Zlotnik, <bzlotnik@redhat.com> wrote:

Using the auto-generated snapshot is generally a bad idea as it's inconsistent,

What do you mean by inconsistent?

you should remove it before moving further

On Fri, May 11, 2018 at 7:25 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

I rebooted it with no luck, then I used the auto-gen snapshot, same luck.
Attaching the logs in gdrive.
Thanks in advance

2018-05-11 12:50 GMT-03:00 Benny Zlotnik <bzlotnik@redhat.com>:

I see here a failed attempt:

2018-05-09 16:00:20,129-03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [bd8eeb1d-f49a-4f91-a521-e0f31b4a7cbd] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk mail02-int_Disk1 to domain 2penLA.

Then another:

2018-05-09 16:15:06,998-03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk mail02-int_Disk1 to domain 2penLA.

Here I see a successful attempt:

2018-05-09 21:58:42,628-03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-50) [940b051c-8c63-4711-baf9-f3520bb2b825] EVENT_ID: USER_MOVED_DISK(2,008), User admin@internal-authz moving disk mail02-int_Disk1 to domain 2penLA.

Then, in the last attempt I see the attempt was successful but live merge failed:

2018-05-11 03:37:59,509-03 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Failed to live merge, still in volume chain: [5d9d2958-96bc-49fa-9100-2f33a3ba737f, 52532d05-970e-4643-9774-96c31796062c]
2018-05-11 03:38:01,495-03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command 'LiveMigrateDisk' (id: '115fc375-6018-4d59-b9f2-51ee05ca49f8') waiting on child command id: '26bc52a4-4509-4577-b342-44a679bc628f' type:'RemoveSnapshot' to complete
2018-05-11 03:38:01,501-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command id: '4936d196-a891-4484-9cf5-fceaafbf3364 failed child command status for step 'MERGE_STATUS'
2018-05-11 03:38:01,501-03 INFO [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command 'RemoveSnapshotSingleDiskLive' id: '4936d196-a891-4484-9cf5-fceaafbf3364' child commands '[8da5f261-7edd-4930-8d9d-d34f232d84b3, 1c320f4b-7296-43c4-a3e6-8a868e23fc35, a0e9e70c-cd65-4dfb-bd00-076c4e99556a]' executions were completed, status 'FAILED'
2018-05-11 03:38:02,513-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Merging of snapshot '319e8bbb-9efe-4de4-a9a6-862e3deb891f' images '52532d05-970e-4643-9774-96c31796062c'..'5d9d2958-96bc-49fa-9100-2f33a3ba737f' failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.
2018-05-11 03:38:02,519-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand' with failure.
2018-05-11 03:38:03,530-03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Command 'RemoveSnapshot' id: '26bc52a4-4509-4577-b342-44a679bc628f' child commands '[4936d196-a891-4484-9cf5-fceaafbf3364]' executions were completed, status 'FAILED'
2018-05-11 03:38:04,548-03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand' with failure.
2018-05-11 03:38:04,557-03 INFO [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [d5b7fdf5-9c37-4c1f-8543-a7bc75c993a5] Lock freed to object 'EngineLock:{exclusiveLocks='[4808bb70-c9cc-4286-aa39-16b5798213ac=LIVE_STORAGE_MIGRATION]', sharedLocks=''}'

I do not see the merge attempt in the vdsm.log, so please send vdsm logs for node02.phy.eze.ampgn.com.ar from that time.

Also, did you use the auto-generated snapshot to start the vm?

On Fri, May 11, 2018 at 6:11 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

After the xfs_repair, it says: sorry I could not find valid secondary superblock

2018-05-11 12:09 GMT-03:00 Juan Pablo <pablo.localhost@gmail.com>:

hi,

Alias: mail02-int_Disk1
Description:
ID: 65ec515e-0aae-4fe6-a561-387929c7fb4d
Alignment: Unknown
Disk Profile:
Wipe After Delete: No

that one

2018-05-11 11:12 GMT-03:00 Benny Zlotnik <bzlotnik@redhat.com>:

I looked at the logs and I see some disks have moved successfully and some failed. Which disk is causing the problems?

On Fri, May 11, 2018 at 5:02 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

Hi, I just sent you the files via drive. Attaching some extra info, thanks, thanks and thanks:
From inside the migrated VM I had the following attached dmesg output before rebooting.

regards and thanks again for the help,

2018-05-11 10:45 GMT-03:00 Benny Zlotnik <bzlotnik@redhat.com>:

Dropbox or google drive I guess. Also, can you attach engine.log?

On Fri, May 11, 2018 at 4:43 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

vdsm is too big for gmail... any other way I can share it with you?

---------- Forwarded message ----------
From: Juan Pablo <pablo.localhost@gmail.com>
Date: 2018-05-11 10:40 GMT-03:00
Subject: Re: [ovirt-users] strange issue: vm lost info on disk
To: Benny Zlotnik <bzlotnik@redhat.com>
Cc: users <Users@ovirt.org>

Benny, thanks for your reply! It was a live migration. Sorry, it was from NFS to iSCSI, not the other way around. I have rebooted the VM for rescue and it does not detect any partitions with fdisk. I'm running xfs_repair with -n and it found a corrupted primary superblock; it's still running... (so... there's info on the disk maybe?)
Attaching logs, let me know if those are the ones.
Thanks again!

2018-05-11 9:45 GMT-03:00 Benny Zlotnik <bzlotnik@redhat.com>:

Can you provide the logs? engine and vdsm.
Did you perform a live migration (the VM is running) or cold?

On Fri, May 11, 2018 at 2:49 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

Hi! I'm struggling with an ongoing problem:
After migrating a VM's disk from an iSCSI domain to an NFS one, with oVirt reporting the migration as successful, I see there's no data 'inside' the VM's disk. We never had this kind of issue with oVirt, so I'm puzzled about the root cause and whether there's a chance of recovering the information.
Can you please help me troubleshoot this one? I would really appreciate it =)
Running oVirt 4.2.1 here!

Thanks in advance,
JP
_________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
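
For anyone following along, Nir's point about raw versus qcow2 sizes can be checked mechanically. The sketch below is not from the thread. It assumes the chain was captured with `qemu-img info --backing-chain --output=json`, which prints a JSON array with the top image first, and the helper name `find_size_mismatches` is made up for illustration. The sample data uses the two virtual sizes reported in the output at the top of the thread, with filenames shortened to the volume IDs.

```python
import json


def find_size_mismatches(chain):
    """Return adjacent (top, backing) pairs whose virtual sizes differ.

    `chain` is the parsed JSON list from
    `qemu-img info --backing-chain --output=json`:
    top image first, base image last.
    """
    mismatches = []
    for top, backing in zip(chain, chain[1:]):
        if top["virtual-size"] != backing["virtual-size"]:
            mismatches.append((top, backing))
    return mismatches


# Virtual sizes copied from the thread; everything else is trimmed for brevity.
sample = json.loads("""
[
  {"filename": "86a6fdb9-b9a4-4b78-8e3d-940f83cedc5a",
   "format": "qcow2", "virtual-size": 1181116006400},
  {"filename": "52532d05-970e-4643-9774-96c31796062c",
   "format": "raw", "virtual-size": 751619276800}
]
""")

for top, backing in find_size_mismatches(sample):
    diff_gib = (top["virtual-size"] - backing["virtual-size"]) // 2**30
    print(
        f"{top['filename']} ({top['format']}) is {diff_gib} GiB larger than "
        f"{backing['filename']} ({backing['format']})"
    )
```

A qcow2 layer that is larger than its backing file is legal, so a mismatch reported here is only a pointer to where the guest-visible size changes, not proof of corruption.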