[ovirt-users] Can't remove snapshot

Marcelo Leandro marceloltmm at gmail.com
Sun May 1 22:36:25 UTC 2016


Hello,
I have a problem deleting one snapshot. Here is the output of the
vm-disk-info.py script:

Warning: volume 023110fa-7d24-46ec-ada8-d617d7c2adaf is in chain but illegal
    Volumes:
        a09bfb5d-3922-406d-b4e0-daafad96ffec

After running the md5sum command I realized that the volume that changes
is the base:
a09bfb5d-3922-406d-b4e0-daafad96ffec

The volume 023110fa-7d24-46ec-ada8-d617d7c2adaf does not change.
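
For reference, a rough Python equivalent of that md5sum check (the paths
are placeholders; on file storage the volumes typically live under
/rhev/data-center/.../images/<disk-id>/<volume-id>). Running it twice
while the VM is writing shows which volume's hash changes:

    import hashlib

    def md5_of(path, block=1024 * 1024):
        # hash in chunks so large volumes do not exhaust memory
        h = hashlib.md5()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(block), b''):
                h.update(chunk)
        return h.hexdigest()

    # placeholder paths -- substitute the real volume paths for your domain
    for vol in ('/path/to/023110fa-7d24-46ec-ada8-d617d7c2adaf',
                '/path/to/a09bfb5d-3922-406d-b4e0-daafad96ffec'):
        print(vol, md5_of(vol))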

Thanks.



2016-03-18 16:50 GMT-03:00 Greg Padgett <gpadgett at redhat.com>:

> On 03/18/2016 03:10 PM, Nir Soffer wrote:
>
>> On Fri, Mar 18, 2016 at 7:55 PM, Nathanaël Blanchet <blanchet at abes.fr>
>> wrote:
>>
>>> Hello,
>>>
>>> I can create a snapshot when none exists, but I'm not able to remove
>>> it afterwards.
>>>
>>
>> Did you try to remove it while the VM was running?
>>
>>> It affects many of my VMs: once stopped, they can't boot anymore
>>> because of the illegal status of their disks, which leaves me in a
>>> critical situation:
>>>
>>> VM fedora23 is down with error. Exit message: Unable to get volume size
>>> for
>>> domain 5ef8572c-0ab5-4491-994a-e4c30230a525 volume
>>> e5969faa-97ea-41df-809b-cc62161ab1bc
>>>
>>> Since I didn't initiate any live merge, am I affected by this bug:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1306741?
>>> I'm running 3.6.2; will upgrading to 3.6.3 solve this issue?
>>>
>>
>> If you tried to remove a snapshot while the VM was running, you did
>> initiate a live merge, and this bug may affect you.
>>
>> Adding Greg, who can provide more info about this.
>>
>>
> Hi Nathanaël,
>
> From the logs you pasted below, showing RemoveSnapshotSingleDiskCommand
> (not ..SingleDiskLiveCommand), it looks like a non-live snapshot removal.
> In that case, bug 1306741 would not affect you.
>
> To dig deeper, we'd need to know why the image could not be deleted.
> You should be able to find some clues in your engine log above the
> snippet you pasted below, or perhaps something in the vdsm log will
> reveal the reason.
>
> Thanks,
> Greg
>
>
>
>>> 2016-03-18 18:26:57,652 ERROR
>>> [org.ovirt.engine.core.bll.RemoveSnapshotCommand]
>>> (org.ovirt.thread.pool-8-thread-39) [a1e222d] Ending command
>>> 'org.ovirt.engine.core.bll.RemoveSnapshotCommand' with failure.
>>> 2016-03-18 18:26:57,663 ERROR
>>> [org.ovirt.engine.core.bll.RemoveSnapshotCommand]
>>> (org.ovirt.thread.pool-8-thread-39) [a1e222d] Could not delete image
>>> '46e9ecc8-e168-4f4d-926c-e769f5df1f2c' from snapshot
>>> '88fcf167-4302-405e-825f-ad7e0e9f6564'
>>> 2016-03-18 18:26:57,678 WARN
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (org.ovirt.thread.pool-8-thread-39) [a1e222d] Correlation ID: a1e222d,
>>> Job ID: 00d3e364-7e47-4022-82ff-f772cd79d4a1, Call Stack: null,
>>> Custom Event ID: -1, Message: Due to partial snapshot removal,
>>> Snapshot 'test' of VM 'fedora23' now contains only the following
>>> disks: 'fedora23_Disk1'.
>>> 2016-03-18 18:26:57,695 ERROR
>>> [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand]
>>> (org.ovirt.thread.pool-8-thread-39) [724e99fd] Ending command
>>> 'org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand' with failure.
>>> 2016-03-18 18:26:57,708 ERROR
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandlin
>>>
>>> Thank you for your help.
>>>
>>>
>>> On 23/02/2016 19:51, Greg Padgett wrote:
>>>
>>>>
>>>> On 02/22/2016 07:10 AM, Marcelo Leandro wrote:
>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> Will the snapshot bug be fixed in oVirt 3.6.3?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>> Hi Marcelo,
>>>>
>>>> Yes, the bug below (bug 1301709) is now targeted to 3.6.3.
>>>>
>>>> Thanks,
>>>> Greg
>>>>
>>>>> 2016-02-18 11:34 GMT-03:00 Adam Litke <alitke at redhat.com>:
>>>>>
>>>>>>
>>>>>> On 18/02/16 10:37 +0100, Rik Theys wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 02/17/2016 05:29 PM, Adam Litke wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 17/02/16 11:14 -0500, Greg Padgett wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 02/17/2016 03:42 AM, Rik Theys wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 02/16/2016 10:52 PM, Greg Padgett wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 02/16/2016 08:50 AM, Rik Theys wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>    From the above I conclude that the disk with id that ends
>>>>>>>>>>>> with
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Similar to what I wrote to Marcelo above in the thread, I'd
>>>>>>>>>>> recommend
>>>>>>>>>>> running the "VM disk info gathering tool" attached to [1].  It's
>>>>>>>>>>> the
>>>>>>>>>>> best way to ensure the merge was completed and determine which
>>>>>>>>>>> image
>>>>>>>>>>> is
>>>>>>>>>>> the "bad" one that is no longer in use by any volume chains.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I've run the disk info gathering tool and it outputs (for the
>>>>>>>>>> affected VM):
>>>>>>>>>>
>>>>>>>>>> VM lena
>>>>>>>>>>       Disk b2390535-744f-4c02-bdc8-5a897226554b
>>>>>>>>>> (sd:a7ba2db3-517c-408a-8b27-ea45989d6416)
>>>>>>>>>>       Volumes:
>>>>>>>>>>           24d78600-22f4-44f7-987b-fbd866736249
>>>>>>>>>>
>>>>>>>>>> The ID of the volume is the ID of the snapshot that is marked
>>>>>>>>>> "illegal". So the "bad" image would be the dc39 one, which,
>>>>>>>>>> according to the UI, is in use by the "Active VM" snapshot. Does
>>>>>>>>>> this make sense?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It looks accurate.  Live merges are "backwards" merges, so the
>>>>>>>>> merge
>>>>>>>>> would have pushed data from the volume associated with "Active VM"
>>>>>>>>> into the volume associated with the snapshot you're trying to
>>>>>>>>> remove.
>>>>>>>>>
>>>>>>>>> Upon completion, we "pivot" so that the VM uses that older volume,
>>>>>>>>> and
>>>>>>>>> we update the engine database to reflect this (basically we
>>>>>>>>> re-associate that older volume with, in your case, "Active VM").
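>>>>>>>>>
>>>>>>>>> (For what it's worth, a quick way to see where the on-disk chain
>>>>>>>>> actually ends after the pivot is qemu-img -- a sketch with a
>>>>>>>>> placeholder path; on block storage, activate the LV first:
>>>>>>>>>
>>>>>>>>>     import subprocess
>>>>>>>>>
>>>>>>>>>     # placeholder path to the leaf volume of the chain
>>>>>>>>>     top = '/path/to/<leaf-volume>'
>>>>>>>>>     # prints each image in the chain along with its backing file
>>>>>>>>>     out = subprocess.check_output(
>>>>>>>>>         ['qemu-img', 'info', '--backing-chain', top])
>>>>>>>>>     print(out.decode())
>>>>>>>>>
>>>>>>>>> The leaf reported there should match what the database says.)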
>>>>>>>>>
>>>>>>>>> In your case, it seems the pivot operation was done, but the
>>>>>>>>> database
>>>>>>>>> wasn't updated to reflect it.  Given snapshot/image associations
>>>>>>>>> e.g.:
>>>>>>>>>
>>>>>>>>>    VM Name  Snapshot Name  Volume
>>>>>>>>>    -------  -------------  ------
>>>>>>>>>    My-VM    Active VM      123-abc
>>>>>>>>>    My-VM    My-Snapshot    789-def
>>>>>>>>>
>>>>>>>>> My-VM in your case is actually running on volume 789-def.  If you
>>>>>>>>> run
>>>>>>>>> the db fixup script and supply ("My-VM", "My-Snapshot", "123-abc")
>>>>>>>>> (note the volume is the newer, "bad" one), then it will switch the
>>>>>>>>> volume association for you and remove the invalid entries.
>>>>>>>>>
>>>>>>>>> Of course, I'd shut down the VM, and back up the db beforehand.
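>>>>>>>>>
>>>>>>>>> Purely to illustrate the shape of that fixup (use the actual
>>>>>>>>> script; this sketch assumes the engine db's images table links
>>>>>>>>> image_guid to vm_snapshot_id, and it skips the extra cleanup the
>>>>>>>>> real script does):
>>>>>>>>>
>>>>>>>>>     import psycopg2
>>>>>>>>>
>>>>>>>>>     BAD = '123-abc'    # newer volume, no longer in any chain
>>>>>>>>>     GOOD = '789-def'   # older volume the VM actually runs on
>>>>>>>>>
>>>>>>>>>     conn = psycopg2.connect(dbname='engine', user='engine')
>>>>>>>>>     cur = conn.cursor()
>>>>>>>>>     # re-associate the surviving volume with "Active VM" ...
>>>>>>>>>     cur.execute(
>>>>>>>>>         "UPDATE images SET vm_snapshot_id ="
>>>>>>>>>         " (SELECT vm_snapshot_id FROM images WHERE image_guid = %s)"
>>>>>>>>>         " WHERE image_guid = %s", (BAD, GOOD))
>>>>>>>>>     # ... then drop the now-orphaned newer volume's row
>>>>>>>>>     cur.execute("DELETE FROM images WHERE image_guid = %s", (BAD,))
>>>>>>>>>     conn.commit()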
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I've executed the SQL script and it seems to have worked. Thanks!
>>>>>>>
>>>>>>>>> "Active VM" should now be unused; it previously (pre-merge) held
>>>>>>>>> the data written since the snapshot was taken.  Normally the larger
>>>>>>>>> actual size might be from qcow format overhead.  If your listing
>>>>>>>>> above is complete (i.e. one volume for the VM), then I'm not sure
>>>>>>>>> why the base volume would have a larger actual size than virtual
>>>>>>>>> size.
>>>>>>>>>
>>>>>>>>> Adam, Nir--any thoughts on this?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> There is a bug which has caused inflation of the snapshot volumes
>>>>>>>> when
>>>>>>>> performing a live merge.  We are submitting fixes for 3.5, 3.6, and
>>>>>>>> master right at this moment.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Which bug number has been assigned to this? Will upgrading to a release
>>>>>>> with a fix reduce the disk usage again?
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> See https://bugzilla.redhat.com/show_bug.cgi?id=1301709 for the bug.
>>>>>> It's about a clone disk failure after the problem occurs.
>>>>>> Unfortunately, there is no automatic way to repair the raw base
>>>>>> volumes if they were affected by this bug.  They will need to be
>>>>>> manually shrunk using lvreduce if you are certain that they are
>>>>>> inflated.
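>>>>>>
>>>>>> As a sanity check before shrinking anything, something like the
>>>>>> following (placeholder device path; raw volumes only, VM down,
>>>>>> backup taken -- it prints the command instead of running it):
>>>>>>
>>>>>>     import json
>>>>>>     import subprocess
>>>>>>
>>>>>>     lv = '/dev/<sd-uuid>/<volume-uuid>'   # placeholder device path
>>>>>>     info = json.loads(subprocess.check_output(
>>>>>>         ['qemu-img', 'info', '--output=json', lv]))
>>>>>>     # a raw volume needs no more space than its virtual size
>>>>>>     print('lvreduce -L %dB %s' % (info['virtual-size'], lv))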
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Adam Litke
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Nathanaël Blanchet
>>>
>>> Supervision réseau
>>> Pôle Infrastructures Informatiques
>>> 227 avenue Professeur-Jean-Louis-Viala
>>> 34193 MONTPELLIER CEDEX 5
>>> Tél. 33 (0)4 67 54 84 55
>>> Fax  33 (0)4 67 54 84 14
>>> blanchet at abes.fr
>>>
>>>
>>