
Hi everyone,

Thank you to everyone who answered. I will gladly file a bug once I am done recovering this very critical VM, but my main concern right now is to get it running as soon as possible, or else switch to the painful route of tape recovery.

I found similarities between some already-filed bugs and my issue, but I think my case is much simpler. In my case:

- the VM has only one disk
- the whole oVirt setup uses an iSCSI SAN
- the VM was shut down; there was no attempt at a live snapshot
- I did not stop the engine during the delete, nor take any other disruptive action
- I performed the exact same steps two days ago on a test VM and it ran fine
- in between, I did not upgrade or reset anything

I found many, many common points with this thread:
http://list-archives.org/2013/10/25/users-ovirt-org/vm-snapshot-delete-faile...

Reading my logs, some of you jumped to the Python errors, but looking further up, one can see earlier (non-Python) errors complaining about a logical volume not being found.

Today, nothing more was being written to engine.log, so I decided to restart the engine:

- the logs came back (...)
- the faulty VM now shows NO snapshot at all
- I still see the disk
- trying to start the VM leads to the following error:

VM uc-674 is down. Exit message: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/rhev/data-center/5849b030-626e-47cb-ad90-3ce782d831b3/11a077c7-658b-49bb-8596-a785109c24c9/images/69220da6-eeed-4435-aad0-7aa33f3a0d21/c50561d9-c3ba-4366-b2bc-49bbfaa4cd23,if=none,id=drive-virtio-disk0,format=qcow2,serial=69220da6-eeed-4435-aad0-7aa33f3a0d21,cache=none,werror=stop,rerror=stop,aio=native: could not open disk image /rhev/data-center/5849b030-626e-47cb-ad90-3ce782d831b3/11a077c7-658b-49bb-8596-a785109c24c9/images/69220da6-eeed-4435-aad0-7aa33f3a0d21/c50561d9-c3ba-4366-b2bc-49bbfaa4cd23: Invalid argument.
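For an "Invalid argument" at image-open time, a first hedged check (a sketch only; the image path is copied verbatim from the error above, and the commands assume shell access to a host attached to that storage domain) is whether the /rhev symlink chain actually ends at a live device-mapper node:

```shell
# Path copied verbatim from the qemu-kvm error above.
IMG=/rhev/data-center/5849b030-626e-47cb-ad90-3ce782d831b3/11a077c7-658b-49bb-8596-a785109c24c9/images/69220da6-eeed-4435-aad0-7aa33f3a0d21/c50561d9-c3ba-4366-b2bc-49bbfaa4cd23

# Does the symlink chain resolve to a real block device?
ls -la "$IMG"
readlink -f "$IMG"

# If it resolves, inspect the qcow2 header and its backing chain;
# an open-time "Invalid argument" can mean the device behind the
# symlink is gone rather than that the qcow2 header is corrupt.
qemu-img info "$IMG" || echo "cannot open image: $IMG"
```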
And indeed, looking for the device from the SPM, I see nothing:

[root@serv-vm-adm9 11a077c7-658b-49bb-8596-a785109c24c9]# ls -la /dev/11a077c7-658b-49bb-8596-a785109c24c9/
total 0
drwxr-xr-x.  2 root root  200 Jan  7 08:23 .
drwxr-xr-x. 21 root root 4480 Jan  7 08:23 ..
lrwxrwxrwx.  1 root root    8 Dec  5 11:58 5c71e53b-21f2-4671-94f8-4603d1b0bf5e -> ../dm-19
lrwxrwxrwx.  1 root root    8 Dec  5 11:58 7369a73a-fea5-40d9-ad0a-7d81a43fe931 -> ../dm-20
lrwxrwxrwx.  1 root root    7 Oct 10 17:22 ids -> ../dm-5
lrwxrwxrwx.  1 root root    7 Oct 10 17:22 inbox -> ../dm-7
lrwxrwxrwx.  1 root root    7 Oct 10 17:22 leases -> ../dm-6
lrwxrwxrwx.  1 root root    7 Oct 10 17:22 master -> ../dm-9
lrwxrwxrwx.  1 root root    7 Oct 10 17:22 metadata -> ../dm-4
lrwxrwxrwx.  1 root root    7 Oct 10 17:22 outbox -> ../dm-8

There is no trace of the LV it should be using (/dev/11a077c7-658b-49bb-8596-a785109c24c9/c50561d9-c3ba-4366-b2bc-49bbfaa4cd23).

In the thread linked above, the OP was able to lvchange -aey the device. In my case, although both lvmdiskscan and lvs show me the LV, there is no device node in /dev/{the proper VG}/{my missing LV}.

So the last question is: is there a way to recover it, that is, to recreate a device node for this LV and activate it?

--
Nicolas Ecarnot

On 07/01/2014 04:09, Maor Lipchuk wrote:
Hi Nicolas,

I think the initial problem started at 10:06, when VDSM tried to clear the records of the ancestor volume c50561d9-c3ba-4366-b2bc-49bbfaa4cd23 (see [1]).
Looking at bugzilla, it could be related to https://bugzilla.redhat.com/1029069 (based on the exception described at https://bugzilla.redhat.com/show_bug.cgi?id=1029069#c1)
The issue there was fixed after an upgrade to 3.3.1 (as Sander mentioned earlier on this mailing list).
Could you give it a try and check if that works for you?
Also, it would be great if you could open a bug on this, attaching the full VDSM and engine logs and the list of LVs.
Regards,
Maor
[1] 236b3c5a-452a-4614-801a-c30cefbce87e::ERROR::2014-01-06 10:06:14,407::task::850::TaskManager.Task::(_setError) Task=`236b3c5a-452a-4614-801a-c30cefbce87e`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1937, in mergeSnapshots
    sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1162, in merge
    srcVol.shrinkToOptimalSize()
  File "/usr/share/vdsm/storage/blockVolume.py", line 315, in shrinkToOptimalSize
    volParams = self.getVolumeParams()
  File "/usr/share/vdsm/storage/volume.py", line 1008, in getVolumeParams
    volParams['imgUUID'] = self.getImage()
  File "/usr/share/vdsm/storage/blockVolume.py", line 494, in getImage
    return self.getVolumeTag(TAG_PREFIX_IMAGE)
  File "/usr/share/vdsm/storage/blockVolume.py", line 464, in getVolumeTag
    return _getVolumeTag(self.sdUUID, self.volUUID, tagPrefix)
  File "/usr/share/vdsm/storage/blockVolume.py", line 662, in _getVolumeTag
    tags = lvm.getLV(sdUUID, volUUID).tags
  File "/usr/share/vdsm/storage/lvm.py", line 851, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
LogicalVolumeDoesNotExistError: Logical volume does not exist: ('11a077c7-658b-49bb-8596-a785109c24c9/_remove_me_aVmPgweS_c50561d9-c3ba-4366-b2bc-49bbfaa4cd23',)
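Since lvs still lists the volume but no node exists under /dev/<VG>/, one hedged recovery sketch is to try reactivating the LV and rebuilding the /dev entries. The VG/LV names are copied from the logs above; lvchange -aey and vgmknodes are standard LVM commands, but only run them once you are sure no other host is touching the VG:

```shell
# VG and LV names copied from the traceback above.
VG=11a077c7-658b-49bb-8596-a785109c24c9
LV=c50561d9-c3ba-4366-b2bc-49bbfaa4cd23

# The failed merge may have left the volume under its original name or
# under the _remove_me_<random> prefix that VDSM gives volumes it has
# marked for deletion, so look for both:
lvs --noheadings -o lv_name,lv_tags "$VG" 2>/dev/null | grep "$LV"

# Activate the LV exclusively (as the OP did in the linked thread);
# this should recreate the device-mapper node:
lvchange -aey "$VG/$LV"

# If the node is still missing, rebuild /dev entries from LVM metadata:
vgmknodes "$VG"

ls -la "/dev/$VG/$LV" || echo "device node still missing: /dev/$VG/$LV"
```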
On 01/06/2014 04:39 PM, Meital Bourvine wrote:
I got the attachment.
This is the relevant error:

6caec3bc-fc66-42be-a642-7733fc033103::ERROR::2014-01-06 10:13:21,068::task::850::TaskManager.Task::(_setError) Task=`6caec3bc-fc66-42be-a642-7733fc033103`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1937, in mergeSnapshots
    sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1101, in merge
    dstVol = vols[ancestor]
KeyError: '506085b6-40e0-4176-a4df-9102857f51f2'
I don't know why it happens, so you'll have to wait for someone else to answer.
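For context on that KeyError: in the merge code path, vols maps the volume UUIDs of the image's snapshot chain to volume objects, and the lookup fails because the ancestor UUID the engine still references is no longer part of the chain on storage. A minimal hypothetical illustration (hand-made data, not VDSM's actual structures):

```python
# Hypothetical sketch of the failing lookup in image.py:merge().
# After the earlier half-completed merge, the ancestor volume UUID that
# the engine still references is absent from the chain found on storage.
vols = {
    "c50561d9-c3ba-4366-b2bc-49bbfaa4cd23": "<base volume object>",
    # "506085b6-40e0-4176-a4df-9102857f51f2" is missing from the chain
}
ancestor = "506085b6-40e0-4176-a4df-9102857f51f2"

try:
    dst_vol = vols[ancestor]
except KeyError as exc:
    # Same failure mode as in the traceback above.
    print(f"KeyError: {exc}")
```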
----- Original Message -----
From: "Nicolas Ecarnot" <nicolas@ecarnot.net>
To: "users" <users@ovirt.org>
Sent: Monday, January 6, 2014 4:22:57 PM
Subject: Re: [Users] Unable to delete a snapshot
On 06/01/2014 12:51, Nicolas Ecarnot wrote:
Also, please attach the whole vdsm.log; it's hard to read it this way (the lines are broken).
See attachment.
Actually, I don't know whether this mailing list allows attachments.
--
Nicolas Ecarnot
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
-- Nicolas Ecarnot