----- Original Message -----
From: "Martijn Grendelman"
<martijn.grendelman(a)isaac.nl>
To: users(a)ovirt.org
Sent: Tuesday, October 1, 2013 12:29:58 PM
Subject: Re: [Users] VMs and volumes disappearing
Hi,
>> I have recently set up an oVirt environment, I think in a pretty
>> standard fashion, with engine 3.3 on one host, one oVirt host on a
>> physical machine, both running CentOS 6.4, using NFS for all storage
>> domains.
>
> Please provide rpm -qa on the ovirt rpms (ovirt engine).
martijn@ovirt:~> rpm -qa | grep ovirt
ovirt-log-collector-3.3.0-1.el6.noarch
ovirt-engine-3.3.0-1.el6.noarch
ovirt-host-deploy-1.1.1-1.el6.noarch
ovirt-engine-cli-3.3.0.4-1.el6.noarch
ovirt-engine-userportal-3.3.0-1.el6.noarch
ovirt-engine-tools-3.3.0-1.el6.noarch
ovirt-engine-setup-3.3.0-4.el6.noarch
ovirt-engine-sdk-python-3.3.0.6-1.el6.noarch
ovirt-image-uploader-3.3.0-1.el6.noarch
ovirt-engine-restapi-3.3.0-1.el6.noarch
ovirt-engine-webadmin-portal-3.3.0-1.el6.noarch
ovirt-host-deploy-java-1.1.1-1.el6.noarch
ovirt-engine-backend-3.3.0-1.el6.noarch
ovirt-release-el6-8-1.noarch
ovirt-iso-uploader-3.3.0-1.el6.noarch
ovirt-engine-dbscripts-3.3.0-1.el6.noarch
ovirt-engine-lib-3.3.0-4.el6.noarch
>> Today I was playing around with snapshots, when I noticed that the
>> Snapshots panel didn't show any of the snapshots I created, not even the
>> 'Current - Active VM' snapshot that all VMs have.
>
> Not sure why this has happened. How do you know that snapshot
> creation was completed? Did you look at the events tab? (Asking to be
> sure) engine.log will be quite helpful here.
I find engine.log somewhat hard to read, to be honest, and documentation
is hard to find, but I think I found some clues.
Hi,
I understand what you're saying about engine.log. I asked for it because
I'm one of the maintainers of oVirt engine and thought I could give you a hand here,
especially since your email gave me the sense that I had seen a similar issue
in the past.
I tried to create 4 snapshots of a certain VM, 2 of which completed
normally and 2 of which failed:
"Failed with VDSM error SNAPSHOT_FAILED and code 48"
However, what I find most upsetting is that the VMs that disappeared
were not the subject of my experiments. I was creating snapshots of a
single VM, and the VMs that disappeared were unrelated. As a matter of
fact, the VM I was experimenting with IS THE ONLY ONE that survived.
By the way, the Snapshots panel has been displaying snapshots correctly
for a while, but when I logged in this morning, it appeared empty again,
for all VMs.
Is there anything I can check to see what causes this?
>> Not sure what to do, I decided to restart the ovirt-engine process.
>>
>> When I logged back on to the administrator panel, I was shocked to
>> see 2 of my 4 VMs completely missing from the inventory. I
>> haven't been able to find back a single trace of either machine,
>> neither in the portal nor on disk. It seems like they never
>> existed. The storage of both VMs seems to be erased from the data
>> domain.
> Not sure why the storage domain was erased. About the VMs that disappeared: there
> were previous discussions on that at users(a)ovirt.org. In a nutshell,
> due to a bug (that was already fixed), prior to the restart you might
> have had records in the table that contained the value
> "empty guid" (a string in UUID format with only 0 and -) in the
> vdsm_task_id column. This means that the task is not associated with
> a real SPM task, and when the engine restarts, if for a given flow
> (let's say, snapshot creation) there are tasks with such a
> vdsm_task_id, the flow will end with failure. For some flows,
> ending with failure means erasing the VM (for example, a real failure
> of importing a VM). By the way, a similar issue can probably occur with
> disks as well, as there are flows that run async tasks that deal with
> disks.
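The "empty guid" condition described above can be sketched in a few lines of Python. This is only an illustration of the check, not engine code: it assumes the task records have already been fetched from the engine database as dictionaries, and the `task_id` key and the `find_orphan_tasks` helper are hypothetical names. Only the `vdsm_task_id` column comes from this thread.

```python
import uuid

# The "empty guid": UUID format containing only zeros and dashes.
EMPTY_GUID = uuid.UUID(int=0)  # 00000000-0000-0000-0000-000000000000


def is_empty_guid(value):
    """Return True if value parses as the all-zero UUID."""
    try:
        return uuid.UUID(str(value)) == EMPTY_GUID
    except ValueError:
        # Not a UUID at all (None, empty string, garbage).
        return False


def find_orphan_tasks(rows):
    """Return the rows whose vdsm_task_id is the empty GUID.

    rows is an iterable of dicts; the dict shape is an assumption
    made for this sketch, not the real table layout.
    """
    return [row for row in rows if is_empty_guid(row.get("vdsm_task_id"))]


if __name__ == "__main__":
    sample = [
        {"task_id": "a", "vdsm_task_id": "00000000-0000-0000-0000-000000000000"},
        {"task_id": "b", "vdsm_task_id": "123e4567-e89b-12d3-a456-426614174000"},
    ]
    for row in find_orphan_tasks(sample):
        print("task without a real SPM task:", row["task_id"])
```

Any row reported this way would be a task that is not tied to a real SPM task, which is the situation that made flows fail after the engine restart.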
I think I have an idea about what happened now.
The 2 disappeared VMs have been imported into oVirt using virt-v2v. The
3rd one that's now missing a disk volume was not, but I have been
playing with storage migration in the past.
Then this is the reason. Other users have complained about it at users(a)ovirt.org.
Yesterday's engine.log seems to suggest that all of these tasks
(importing the 2 VMs and trying to move a volume) have been restarted
immediately after restarting Engine. After failure, the VMs and volume
were removed. It seems to fit the above description of the bug.
So...
What can I do to prevent this from happening again?
Should I periodically check the 'async_tasks' table for anomalies? Is
there a bugfix I can apply, or should I wait for a new release of oVirt?
If the latter, when is that expected to happen?
Upgrade.
I just talked with Ofer (CC'ed), our release engineer, and he said that all packages
should be 3.3.0-4 (notice that ovirt-engine is not).
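To see which packages are behind, the `rpm -qa | grep ovirt` output can be compared against the expected version and release. The following is a rough sketch: the helper names are hypothetical, and limiting the check to names starting with `ovirt-engine` at version 3.3.0 is an assumption, since the thread does not say exactly which packages are expected at release 4.

```python
def split_nvr(package):
    """Split 'name-version-release.arch' into (name, version, release)."""
    nvr, _, _arch = package.rpartition(".")
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


def packages_behind(packages, prefix="ovirt-engine", version="3.3.0", release="4"):
    """Return names of packages at the given version but a different release.

    The prefix filter is an assumption made for this sketch.
    """
    behind = []
    for package in packages:
        name, pkg_version, pkg_release = split_nvr(package)
        if not name.startswith(prefix) or pkg_version != version:
            continue
        # '1.el6' -> '1': compare only the leading release number.
        if pkg_release.split(".")[0] != release:
            behind.append(name)
    return behind


if __name__ == "__main__":
    installed = [
        "ovirt-engine-3.3.0-1.el6.noarch",
        "ovirt-engine-setup-3.3.0-4.el6.noarch",
        "ovirt-engine-lib-3.3.0-4.el6.noarch",
    ]
    print(packages_behind(installed))
```

Feeding it the full package list posted earlier in this thread would list every `ovirt-engine*` package still at 3.3.0-1.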
I hope this helps you out,
Yair
Thanks,
Martijn.