[ovirt-users] Bug in Snapshot Removing

Soeren Malchow soeren.malchow at mcon.net
Wed Jun 3 16:07:13 UTC 2015


Hi,

This does not happen every time. The last time I had this, a script was
running, and something like the 9th VM and the 23rd VM had a problem.
It is not always the same VMs, and it is not about the OS (it happens for
Windows and Linux alike).

And as I said, it also happened when I tried to remove the snapshots
sequentially. Here is the code (I know it is probably not the most elegant
way, but I am not a developer); the actual code has correct indentation.

<― snip ―>

import time

# Connect() and api are defined in the part of the script not shown here.
print("Snapshot deletion")
try:
    time.sleep(300)
    Connect()
    vms = api.vms.list()
    for vm in vms:
        print("Deleting snapshots for %s" % vm.name)
        snapshotlist = vm.snapshots.list()
        for snapshot in snapshotlist:
            if snapshot.description != "Active VM":
                time.sleep(30)
                snapshot.delete()
                try:
                    # Poll while the snapshot is still locked. Once it is gone,
                    # the chained lookup is expected to raise, which ends up
                    # in the except branch below.
                    while api.vms.get(name=vm.name).snapshots.get(id=snapshot.id).snapshot_status == "locked":
                        print("Waiting for snapshot %s on %s deletion to finish" % (snapshot.description, vm.name))
                        time.sleep(60)
                except Exception as e:
                    print("Snapshot %s does not exist anymore" % snapshot.description)
        print("Snapshot deletion for %s done" % vm.name)
    print("Deletion of snapshots done")
    api.disconnect()
except Exception as e:
    print("Something went wrong when deleting the snapshots\n%s" % str(e))



<― snip ―> 
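
The wait loop above polls forever if a snapshot stays locked. A minimal
sketch of the same polling idea with an upper time limit could look like
the following. It is only an illustration: the helper name
wait_for_snapshot_removal and its get_status argument are hypothetical and
not part of the oVirt SDK. get_status is assumed to be any callable that
returns the snapshot status string, or None once the snapshot is gone.

```python
import time


def wait_for_snapshot_removal(get_status, timeout=1800, interval=60,
                              sleep=time.sleep, clock=time.time):
    """Poll until the snapshot is gone or no longer locked.

    get_status is a hypothetical callable supplied by the caller. It should
    return the snapshot status string, or None once the snapshot no longer
    exists. Returns True when the snapshot is gone or unlocked, and False
    if it is still locked after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status is None or status != "locked":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In the script above, get_status could wrap the
api.vms.get(name=vm.name).snapshots.get(id=snapshot.id) lookup and return
None when that lookup no longer finds the snapshot. A False result would
then be a hint to stop and look at the host before deleting more snapshots.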


Cheers
Soeren





On 03/06/15 15:20, "Adam Litke" <alitke at redhat.com> wrote:

>On 03/06/15 07:36 +0000, Soeren Malchow wrote:
>>Dear Adam
>>
>>First we were using a python script that was working on 4 threads and
>>therefore removing 4 snapshots at a time throughout the cluster; that
>>still caused problems.
>>
>>Now I took the snapshot removal out of the threaded part and I am just
>>looping through each snapshot on each VM one after another, even with
>>"sleeps" in between, but the problem remains.
>>But I am getting the impression that it is a problem with the number of
>>snapshots that are deleted in a certain time. If I delete manually and
>>one after another (meaning every 10 min or so), I do not have problems.
>>If I delete manually and do several at once, and on one VM the next one
>>just after one finished, the risk seems to increase.
>
>Hmm.  In our lab we extensively tested removing a snapshot for a VM
>with 4 disks.  This means 4 block jobs running simultaneously.  Less
>than 10 minutes later (closer to 1 minute) we would remove a second
>snapshot for the same VM (again involving 4 block jobs).  I guess we
>should rerun this flow on a fully updated CentOS 7.1 host to see about
>local reproduction.  Seems your case is much simpler than this though.
>Is this happening every time or intermittently?
>
>>I do not think it is the number of VMs, because we had this on hosts with
>>only 3 or 4 VMs running.
>>
>>I will try restarting libvirt and see what happens.
>>
>>We are not using RHEL 7.1, only CentOS 7.1.
>>
>>Is there anything else we can look at when this happens again?
>
>I'll defer to Eric Blake for the libvirt side of this.  Eric, would
>enabling debug logging in libvirtd help to shine some light on the
>problem?
>
>-- 
>Adam Litke


