Jan Siml <jsiml(a)plusline.net> wrote on 28 August 2015 at 19:52:
Hello,
>> > > > got exactly the same issue, with all nice side effects like
>> > > > performance degradation. Until now i was not able to fix this, or
>> > > > to fool the engine somehow so that it would show the image as ok
>> > > > again and give me a 2nd chance to drop the snapshot.
>> > > > in some cases this procedure helped (needs 2nd storage domain)
>> > > > -> image live migration to a different storage domain (check which
>> > > > combinations are supported, iscsi -> nfs domain seems unsupported.
>> > > > iscsi -> iscsi works)
>> > > > -> snapshot went into ok state, and in ~50% i was able to drop the
>> > > > snapshot then. space had been reclaimed, so seems like this worked
>> > >
>> > > okay, seems interesting. But I'm afraid of not knowing which image
>> > > files Engine uses when live migration is demanded. If Engine uses
>> > > the ones which are actually used and updates the database afterwards
>> > > -- fine. But if the images are used that are referenced in Engine
>> > > database, we will take a journey into the past.
>> > knocking on wood. so far no problems, and i used this way for sure 50
>> > times +
>>
>> This doesn't work. Engine creates the snapshots on wrong storage (old)
>> and this process fails, cause the VM (qemu process) uses the images on
>> other storage (new).
>
> sounds like there are some other problems in your case, wrong db entries
> image -> snapshot? i didn't investigate further in the vm which failed
> this process, i went straight on and exported them
Yes, engine thinks image and snapshot are on storage a, but the qemu process
uses identically named images on storage b.
It seems to me that the first live storage migration was successful on qemu
level, but engine hasn't updated the database entries.
Correcting the database entries seems to be a possible solution, but I'm
not familiar with the oVirt schema and won't even try it without
advice from oVirt developers.
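To make the mismatch between engine and qemu visible without touching the database, the disk paths libvirt reports can be pulled out of the `virsh -r dumpxml` output. The following is only a minimal sketch: the helper names are made up for illustration, and it assumes the usual path layout `.../<storage domain UUID>/images/<image UUID>/<volume UUID>`, which should be checked against the real paths on the host.

```python
import xml.etree.ElementTree as ET


def disk_sources(domain_xml):
    """Return the source path of every device='disk' entry in a libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    paths = []
    for disk in root.findall("./devices/disk"):
        if disk.get("device") != "disk":
            continue  # skip cdrom, floppy, ...
        source = disk.find("source")
        if source is None:
            continue
        # File-backed disks carry file=, block-backed disks carry dev=.
        path = source.get("file") or source.get("dev")
        if path:
            paths.append(path)
    return paths


def storage_domain_of(path):
    """Return the path component right before 'images' (assumed to be the domain UUID)."""
    parts = path.split("/")
    if "images" in parts and parts.index("images") > 0:
        return parts[parts.index("images") - 1]
    return None


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): virsh -r dumpxml VMNAME > vm.xml; python3 this_script.py vm.xml
    with open(sys.argv[1]) as handle:
        for disk_path in disk_sources(handle.read()):
            print(storage_domain_of(disk_path), disk_path)
```

The printed storage domain can then be compared by hand with the one the admin interface shows for the same disk.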
>> > in cases where the live merge failed, offline merging worked in another
>> > 50%. those which fail offline, too went back to illegal snap state
>>
>> I fear offline merge would cause data corruption. Because if I shut down
>> the VM, the information in Engine database is still wrong. Engine thinks
>> image files and snapshots are on old storage. But VM has written to the
>> equal named image files on new storage. And offline merge might use the
>> "old" files on old storage.
>
> then your initial plan is an alternative. you use thin or raw on what
> kind of storage domain? but like said, manual processing is a pita due
> to the symlink mess.
We are using raw images which are thin provisioned on NFS based storage
domains. On storage b I can see a qcow formatted image file which qemu
uses and the original (raw) image which is now the backing file.
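The format and backing file of each volume can be read from `qemu-img info --output=json`. A minimal sketch, assuming `qemu-img` is installed on the host; the function names are illustrative, and only the JSON keys `filename`, `format` and `backing-filename` are used:

```python
import json
import subprocess


def describe_image(info_json):
    """Extract file name, format and backing file from qemu-img's JSON output."""
    info = json.loads(info_json)
    return {
        "filename": info.get("filename"),
        "format": info.get("format"),
        # The key is absent for images without a backing file.
        "backing_file": info.get("backing-filename"),
    }


def qemu_img_info(path):
    """Run 'qemu-img info --output=json' on path and return the parsed summary."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return describe_image(result.stdout)
```

Calling `qemu_img_info()` on the active volume on storage b should report a qcow format with the raw image as `backing_file`, if the situation is as described above.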
might sound a little bit curious, but imho this is the best setup for your plan.
thin on iscsi is a totally different story... lvm volumes which get extended on
demand (which fails with default settings during heavy writes, and causes the vm
to pause), additionally ovirt writes qcow images raw onto those logical volumes.
since you can get your hands directly on the images this would be my preferred
workaround. but maybe one of the ovirt devs has a better idea/solution?
>> > > > other workaround is through exporting the image onto a nfs export
>> > > > domain, here you can tell the engine to not export snapshots. after
>> > > > re-importing everything is fine
>>
>> Same issue as with offline merge.
>>
>> Meanwhile I think, we need to shut down the VM, copy the image files
>> from one storage (qemu has used before) to the other storage (the one
>> Engine expects) and pray while starting the VM again.
>> > > > the snapshot feature (live at least) should be avoided at all
>> > > > currently.... simply not reliable enough.
>> > > > your way works, too. already did that, even if it was a pita to
>> > > > figure out where to find what. this symlinking mess between /rhev,
>> > > > /dev and /var/lib/libvirt is really awesome. not.
>> > > > > Jan Siml <jsiml(a)plusline.net> wrote on 28 August 2015 at 12:56:
>> > > > >
>> > > > >
>> > > > > Hello,
>> > > > >
>> > > > > if no one has an idea how to correct the Disk/Snapshot paths in
>> > > > > Engine database, I see only one possible way to solve the issue:
>> > > > >
>> > > > > Stop the VM and copy image/meta files from target storage to source
>> > > > > storage (the one where Engine thinks the files are located). Start
>> > > > > the VM.
>> > > > >
>> > > > > Any concerns regarding this procedure? But I still hope that someone
>> > > > > from oVirt team can give advice on how to correct the database
>> > > > > entries. If necessary I would open a bug in Bugzilla.
>> > > > >
>> > > > > Kind regards
>> > > > >
>> > > > > Jan Siml
>> > > > >
>> > > > > >> after a failed live storage migration (cause unknown) we have a
>> > > > > >> snapshot which is undeletable due to its status 'illegal' (as seen
>> > > > > >> in storage/snapshot tab). I have already found some bugs [1],[2],[3]
>> > > > > >> regarding this issue, but no way how to solve the issue within oVirt
>> > > > > >> 3.5.3.
>> > > > > >>
>> > > > > >> I have attached the relevant engine.log snippet. Is there any way to
>> > > > > >> do a live merge (and therefore delete the snapshot)?
>> > > > > >>
>> > > > > >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1213157
>> > > > > >> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1247377 links to [3]
>> > > > > >> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1247379 (no access)
>> > > > > >
>> > > > > > some additional information. I have checked the images on both
>> > > > > > storages and verified the disk paths with virsh's dumpxml.
>> > > > > >
>> > > > > > a) The images and snapshots are on both storages.
>> > > > > > b) The images on source storage aren't used. (modification time)
>> > > > > > c) The images on target storage are used. (modification time)
>> > > > > > d) virsh -r dumpxml tells me disk images are located on _target_
>> > > > > > storage.
>> > > > > > e) Admin interface tells me that images and snapshot are located on
>> > > > > > _source_ storage, which isn't true, see b), c) and d).
>> > > > > >
>> > > > > > What can we do to solve this issue? Is this to be corrected in
>> > > > > > database only?
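The modification-time check from b) and c) can be repeated for all identically named files in one go. This is only a sketch under the assumption that both image directories are mounted on the same host; the function name and the directory arguments are placeholders, not anything oVirt provides.

```python
import os


def newer_copies(source_dir, target_dir):
    """For files present in both directories, report which copy was modified last."""
    result = {}
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        dst = os.path.join(target_dir, name)
        if not (os.path.isfile(src) and os.path.isfile(dst)):
            continue  # only compare files that exist on both storages
        src_mtime = os.path.getmtime(src)
        dst_mtime = os.path.getmtime(dst)
        if dst_mtime > src_mtime:
            result[name] = "target"
        elif src_mtime > dst_mtime:
            result[name] = "source"
        else:
            result[name] = "same"
    return result
```

If every volume comes back as "target", that matches the picture above: qemu writes to the target storage while the source copies are stale. Modification time alone is not proof that a file is unused, so the result should be read together with the dumpxml output.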
Kind regards
Jan Siml