Delete snapshot with status illegal - live merge not possible

Hello,

after a failed live storage migration (cause unknown) we have a snapshot which is undeletable due to its status 'illegal' (as seen in the storage/snapshot tab). I have already found some bugs [1],[2],[3] regarding this issue, but no way to solve it within oVirt 3.5.3.

I have attached the relevant engine.log snippet. Is there any way to do a live merge (and therefore delete the snapshot)?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1213157
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1247377 links to [3]
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1247379 (no access)

Kind regards

Jan Siml

Further investigation showed that the time on the target storage went backwards due to a misconfigured NTP service. Someone corrected the configuration while the live storage migration was running and the time jumped backwards. Might this be a cause of the issue?

-------- Original Message --------
Subject: Delete snapshot with status illegal - live merge not possible
From: Jan Siml <jsiml@plusline.net>
To: users@ovirt.org
Date: 08/26/2015 06:14 PM
Hello,
after a failed live storage migration (cause unknown) we have a snapshot which is undeletable due to its status 'illegal' (as seen in the storage/snapshot tab). I have already found some bugs [1],[2],[3] regarding this issue, but no way to solve it within oVirt 3.5.3.
I have attached the relevant engine.log snippet. Is there any way to do a live merge (and therefore delete the snapshot)?
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1213157 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1247377 links to [3] [3] https://bugzilla.redhat.com/show_bug.cgi?id=1247379 (no access)
Kind regards
Jan Siml

Hello,
some additional information: I have checked the images on both storages and verified the disk paths with virsh's dumpxml.

a) The images and snapshots are on both storages.
b) The images on the source storage aren't used (modification time).
c) The images on the target storage are used (modification time).
d) virsh -r dumpxml tells me the disk images are located on the _target_ storage.
e) The admin interface tells me that the images and snapshots are located on the _source_ storage, which isn't true; see b), c) and d).

What can we do to solve this issue? Is this to be corrected in the database only?

Kind regards

Jan Siml
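For reference, checks b) to e) can be reproduced on the hypervisor with something like the sketch below. It is only a rough outline; the VM name and the image directory paths are placeholders, not values from this thread.

```bash
#!/usr/bin/env bash
# Rough sketch only: VM name and image directories are placeholders.
# Run on the hypervisor that currently hosts the VM.
set -eu

VM="example-vm"
SRC_DIR="/rhev/data-center/mnt/source-nfs/SD_UUID/images/IMG_UUID"   # example path
DST_DIR="/rhev/data-center/mnt/target-nfs/SD_UUID/images/IMG_UUID"   # example path

# Disk paths the running qemu process was defined with (read-only libvirt query).
virsh -r dumpxml "$VM" | grep 'source file='

# Modification times on both storage domains; the copy in use keeps changing.
stat -c '%y %n' "$SRC_DIR"/* "$DST_DIR"/*

# Format and backing file of every volume in the directory qemu points at.
for vol in "$DST_DIR"/*; do
    case "$vol" in *.meta|*.lease) continue ;; esac
    qemu-img info "$vol"
done
```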

Hello,

if no one has an idea how to correct the disk/snapshot paths in the Engine database, I see only one possible way to solve the issue:

Stop the VM and copy the image/meta files from the target storage to the source storage (the one where the Engine thinks the files are located). Start the VM.

Any concerns regarding this procedure? I still hope that someone from the oVirt team can give advice on how to correct the database entries. If necessary I would open a bug in Bugzilla.

Kind regards

Jan Siml

Got exactly the same issue, with all the nice side effects like performance degradation. Until now I was not able to fix this, or to fool the engine somehow so that it would show the image as OK again and give me a second chance to drop the snapshot.

In some cases this procedure helped (needs a second storage domain):

-> live migrate the image to a different storage domain (check which combinations are supported; iscsi -> nfs seems unsupported, iscsi -> iscsi works)
-> the snapshot went into OK state, and in ~50% of cases I was able to drop the snapshot then. Space had been reclaimed, so it seems like this worked.

Another workaround is exporting the image onto an NFS export domain; there you can tell the engine to not export snapshots. After re-importing, everything is fine.

The snapshot feature (live at least) should be avoided entirely at the moment... simply not reliable enough.

Your way works, too. I already did that, even though it was a pain to figure out where to find what. This symlinking mess between /rhev, /dev and /var/lib/libvirt is really awesome. Not.

Hello Juergen,
Got exactly the same issue, with all the nice side effects like performance degradation. Until now I was not able to fix this, or to fool the engine somehow so that it would show the image as OK again and give me a second chance to drop the snapshot. In some cases this procedure helped (needs a second storage domain): live migrate the image to a different storage domain (check which combinations are supported; iscsi -> nfs seems unsupported, iscsi -> iscsi works) -> the snapshot went into OK state, and in ~50% of cases I was able to drop the snapshot then. Space had been reclaimed, so it seems like this worked.
Okay, seems interesting. But I'm afraid of not knowing which image files the Engine uses when a live migration is requested. If the Engine uses the ones which are actually in use and updates the database afterwards -- fine. But if the images referenced in the Engine database are used, we will take a journey into the past.
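One way to see which files the qemu process actually has open, independent of what the engine database references, is to look at the process itself. A rough sketch; the VM name is a placeholder and the pgrep pattern assumes the usual qemu-kvm command line:

```bash
# Rough sketch: check which image files the qemu process really has open,
# independent of what the engine database references.
VM="example-vm"
PID=$(pgrep -f "qemu-kvm.*${VM}" | head -n1)

# Open file descriptors, resolved through the /rhev symlinks to real paths.
ls -l "/proc/${PID}/fd" | grep -E '/rhev|/images/'

# Alternatively, if lsof is installed:
lsof -p "$PID" 2>/dev/null | grep '/rhev'
```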
Kind regards

Jan Siml

Jan Siml <jsiml@plusline.net> wrote on 28 August 2015 at 15:15:
Hello Juergen,
Okay, seems interesting. But I'm afraid of not knowing which image files the Engine uses when a live migration is requested. If the Engine uses the ones which are actually in use and updates the database afterwards -- fine. But if the images referenced in the Engine database are used, we will take a journey into the past.
Knocking on wood: so far no problems, and I have used this way easily 50+ times.

In cases where the live merge failed, offline merging worked in another 50%. Those which failed offline too went back to the illegal snapshot state.

Hello,
Okay, seems interesting. But I'm afraid of not knowing which image files the Engine uses when a live migration is requested. If the Engine uses the ones which are actually in use and updates the database afterwards -- fine. But if the images referenced in the Engine database are used, we will take a journey into the past.

Knocking on wood: so far no problems, and I have used this way easily 50+ times.
This doesn't work. Engine creates the snapshots on the wrong (old) storage and this process fails, because the VM (qemu process) uses the images on the other (new) storage.
In cases where the live merge failed, offline merging worked in another 50%. Those which failed offline too went back to the illegal snapshot state.
I fear an offline merge would cause data corruption, because even if I shut down the VM, the information in the Engine database is still wrong: Engine thinks the image files and snapshots are on the old storage, but the VM has written to the equally named image files on the new storage. An offline merge might therefore use the "old" files on the old storage.
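Before attempting any offline merge it may help to dump the volume chain on both storages and compare which backing file each layer really points to. A rough sketch with placeholder paths:

```bash
# Rough sketch: inspect the volume chain on *both* storages before attempting
# any offline merge, so the merge cannot silently pick up a stale parent.
OLD_DIR="/rhev/data-center/mnt/source-nfs/SD_UUID/images/IMG_UUID"   # example path
NEW_DIR="/rhev/data-center/mnt/target-nfs/SD_UUID/images/IMG_UUID"   # example path

for dir in "$OLD_DIR" "$NEW_DIR"; do
    echo "== $dir =="
    for vol in "$dir"/*; do
        case "$vol" in *.meta|*.lease) continue ;; esac
        qemu-img info "$vol"          # shows format, virtual size and backing file
    done
done

# Optional read-only consistency check of a qcow2 layer while the VM is down:
# qemu-img check "$NEW_DIR"/VOLUME_UUID
```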
Another workaround is exporting the image onto an NFS export domain; there you can tell the engine to not export snapshots. After re-importing, everything is fine.
Same issue as with the offline merge. Meanwhile I think we need to shut down the VM, copy the image files from the one storage (the one qemu has used so far) to the other storage (the one the Engine expects), and pray while starting the VM again.
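A manual move along those lines could look roughly like the sketch below. This is not an officially supported procedure; the paths are placeholders, the VM must stay down for the whole operation, and a backup should be taken first.

```bash
#!/usr/bin/env bash
# Rough sketch of the manual move: copy the files qemu has been writing to
# (target storage) over to the directory the engine expects (source storage).
set -eu

USED_DIR="/rhev/data-center/mnt/target-nfs/SD_UUID/images/IMG_UUID"      # written to by qemu
EXPECTED_DIR="/rhev/data-center/mnt/source-nfs/SD_UUID/images/IMG_UUID"  # where the engine looks

# Keep an untouched copy of what the engine currently points at.
cp -a "$EXPECTED_DIR" "${EXPECTED_DIR}.bak"

# Copy data and .meta files, preserving sparseness, ownership (vdsm:kvm) and times.
rsync -aS "$USED_DIR"/ "$EXPECTED_DIR"/

# Verify both copies match before starting the VM again.
( cd "$USED_DIR" && md5sum * ) > /tmp/used.md5
( cd "$EXPECTED_DIR" && md5sum -c /tmp/used.md5 )
```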
Kind regards

Jan Siml

Jan Siml <jsiml@plusline.net> wrote on 28 August 2015 at 16:47:
Hello,
This doesn't work. Engine creates the snapshots on the wrong (old) storage and this process fails, because the VM (qemu process) uses the images on the other (new) storage.
Sounds like there are some other problems in your case, maybe wrong DB entries for the image -> snapshot mapping? I didn't investigate further on the VMs where this process failed; I directly went on and exported them.
I fear an offline merge would cause data corruption, because even if I shut down the VM, the information in the Engine database is still wrong: Engine thinks the image files and snapshots are on the old storage, but the VM has written to the equally named image files on the new storage. An offline merge might therefore use the "old" files on the old storage.
Then your initial plan is an alternative. Do you use thin or raw images, and on what kind of storage domain? But like I said, manual processing is a pain due to the symlink mess.
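To untangle the symlink chain for a single disk, something like this helps; the path is a placeholder, the real one comes from virsh -r dumpxml:

```bash
# Rough sketch: resolve the /rhev -> /dev -> /var/lib/libvirt symlink chain for
# one disk, so you can see which file (or LV) is really behind it.
DISK="/rhev/data-center/DC_UUID/SD_UUID/images/IMG_UUID/VOLUME_UUID"   # placeholder

readlink -f "$DISK"    # final target after following all symlinks
namei -l "$DISK"       # every path component, with owners and link targets
```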

Hello,
Sounds like there are some other problems in your case, maybe wrong DB entries for the image -> snapshot mapping? I didn't investigate further on the VMs where this process failed; I directly went on and exported them.
Yes, the engine thinks the image and snapshot are on storage A, but the qemu process uses equally named images on storage B. It seems to me that the live storage migration was at first successful at the qemu level, but the engine hasn't updated the database entries. Correcting the database entries seems to be a possible solution, but I'm not familiar with the oVirt schema and won't even try it without advice from the oVirt developers.
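For inspection only, a read-only query against the engine database can show where the engine thinks each image lives. Heavily hedged: the table and column names below are an assumption about the 3.5 schema, so verify them with \dt and \d in psql first, and do not change anything without a backup and guidance from the oVirt developers.

```bash
# Assumed schema: images, image_storage_domain_map, storage_domain_static.
# Read-only SELECT; run on the engine host as root.
sudo -u postgres psql engine <<'SQL'
SELECT i.image_guid,
       i.image_group_id,
       sds.storage_name
FROM   images i
JOIN   image_storage_domain_map m ON m.image_id = i.image_guid
JOIN   storage_domain_static sds  ON sds.id = m.storage_domain_id;
SQL
```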
Then your initial plan is an alternative. Do you use thin or raw images, and on what kind of storage domain? But like I said, manual processing is a pain due to the symlink mess.
We are using raw images which are thin provisioned on NFS-based storage domains. On storage B I can see a qcow-formatted image file which qemu uses, and the original (raw) image which is now its backing file.
Kind regards

Jan Siml

Jan Siml <jsiml@plusline.net> wrote on 28 August 2015 at 19:52:
We are using raw images which are thin provisioned on NFS-based storage domains. On storage B I can see a qcow-formatted image file which qemu uses, and the original (raw) image which is now its backing file.
Might sound a little bit curious, but IMHO this is the best setup for your plan. Thin on iSCSI is a totally different story... LVM volumes which get extended on demand (which fails with default settings during heavy writes and causes the VM to pause), and additionally oVirt writes qcow images raw onto those LVs. Since you can get your hands directly on the image files, this would be my preferred workaround. But maybe one of the oVirt devs has a better idea/solution?
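For contrast, this is roughly what the iSCSI/LVM case looks like on a host. An illustration only; the VG and LV names are placeholders.

```bash
# On a block storage domain each volume is an LV in the domain's VG, with
# qcow2 data written directly onto the LV.
SD_VG="example-sd-vg"   # placeholder: the storage domain's volume group

# Volumes (LVs) of that storage domain and their sizes.
lvs --units g "$SD_VG"

# If a volume's LV is active on this host, its format can be inspected directly.
qemu-img info "/dev/$SD_VG/EXAMPLE_LV"   # placeholder LV name
```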
participants (3)

- InterNetX - Juergen Gotteswinter
- InterNetX - Juergen Gotteswinter
- Jan Siml