[ovirt-users] Disks Illegal State

Colin Coe colin.coe at gmail.com
Mon Apr 18 18:37:09 EDT 2016


We're seeing this in RHEV 3.5 with snapshot management on VMs with multiple
disks.  It would be awesome to have an "fsck"-type script that could be run
daily and report any problems with the snapshot disks.
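
A rough sketch of what such a check might look like for file-based (NFS)
domains, built on the "qemu-img info" backing-chain check Markus describes
below. The function names walk_backing_chain and qemu_img_info are made up
for illustration; the sketch assumes qemu-img supports --output=json and that
the image paths are readable on the host where it runs. It does not cover
iSCSI/LVM domains and does not compare anything against the engine database.

```python
import json
import os
import subprocess


def qemu_img_info(path):
    """Return the parsed JSON that `qemu-img info --output=json` prints for path."""
    out = subprocess.check_output(["qemu-img", "info", "--output=json", path])
    return json.loads(out)


def walk_backing_chain(top, info=qemu_img_info, exists=os.path.exists):
    """Follow backing files starting at `top` and return (chain, problem).

    `chain` lists the images that were found, top first. `problem` is None
    when the chain ends at an image without a backing file, otherwise a
    short message naming the first broken link.
    """
    chain = []
    seen = set()
    path = top
    while True:
        if path in seen:
            return chain, "loop at %s" % path
        if not exists(path):
            return chain, "missing file %s" % path
        seen.add(path)
        chain.append(path)
        backing = info(path).get("backing-filename")
        if not backing:
            return chain, None
        # A relative backing file name is resolved against the directory
        # of the image that refers to it; an absolute one is used as-is.
        path = os.path.normpath(os.path.join(os.path.dirname(path), backing))
```

A daily job could call walk_backing_chain() for each top-level image path and
mail out every result where the second value is not None. The info and exists
parameters are only there so the walk can be tried out without real images.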

On Mon, Apr 18, 2016 at 10:59 PM, Clint Boggio <clint at theboggios.com> wrote:

> Markus, thank you so much for the information. I'll be focusing on
> resolving this problem this week and I'll keep you in the loop.
>
> On Apr 18, 2016, at 7:39 AM, Markus Stockhausen <stockhausen at collogia.de>
> wrote:
>
> >> From: users-bounces at ovirt.org [users-bounces at ovirt.org] on behalf
> >> of Clint Boggio [clint at theboggios.com]
> >> Sent: Monday, 18 April 2016 14:16
> >> To: users at ovirt.org
> >> Subject: [ovirt-users] Disks Illegal State
> >>
> >> oVirt 3.6, 4-node cluster with a dedicated engine. The main storage
> >> domain is iSCSI; the ISO and Export domains are NFS.
> >>
> >> Several of my VM snapshot disks appear to be in an "illegal state". The
> >> system will not allow me to manipulate the snapshots in any way, clone
> >> the active system, or create a new snapshot.
> >>
> >> In the logs I see that the system complains about not being able to
> >> "get volume size for xxx", and also that the system appears to believe
> >> that the image is "locked" and is currently in the snapshot process.
> >>
> >> Of the VMs with this status, one rebooted and was lost due to "cannot
> >> get volume size for domain xxx".
> >>
> >> I fear that in this current condition, should any of the other machines
> >> reboot, they too will be lost.
> >>
> >> How can I troubleshoot this problem further, and hopefully alleviate
> >> the condition?
> >>
> >> Thank you for your help.
> >
> > Hi Clint,
> >
> > For us the problem always boils down to the following steps. It might be
> > simpler for us, as we use NFS for all of our domains and have direct
> > access to the image files.
> >
> > 1) Check whether the snapshot disks are currently in use. Capture the
> > qemu command line with "ps -ef" on the nodes; it shows which images qemu
> > was started with. For each of those files, check the backing chain:
> >
> > # qemu-img info /rhev/.../bbd05dd8-c3bf-4d15-9317-73040e04abae
> > image: bbd05dd8-c3bf-4d15-9317-73040e04abae
> > file format: qcow2
> > virtual size: 50G (53687091200 bytes)
> > disk size: 133M
> > cluster_size: 65536
> > backing file: ../f8ebfb39-2ac6-4b87-b193-4204d1854edc/595b95f4-ce1a-4298-bd27-3f6745ae4e4c
> > backing file format: raw
> > Format specific information:
> >    compat: 0.10
> >
> > # qemu-img info .../595b95f4-ce1a-4298-bd27-3f6745ae4e4c (see above)
> > ...
> >
> > I don't know how you can accomplish this on iSCSI (with LVM-based images
> > inside, iirc). We usually follow the backing chain and test whether all
> > the files exist and are linked correctly, and in particular whether
> > everything matches the oVirt GUI. I guess this is the most important
> > part for you.
> >
> > 2) In most of our cases everything is fine and only the oVirt database
> > is wrong, so we fix it at our own risk. Given your description, I do not
> > recommend that for you; it is here just for documentation purposes.
> >
> > engine# su - postgres
> > $ psql engine postgres
> >
> > engine=# select image_group_id, imagestatus from images where imagestatus = 4;
> > ... list of illegal images
> > engine=# update images set imagestatus = 1 where imagestatus = 4
> >          and <other criteria>;
> > engine=# commit;
> >
> > engine=# select description, status from snapshots where status <> 'OK';
> > ... list of locked snapshots
> > engine=# update snapshots set status = 'OK' where status <> 'OK'
> >          and <other criteria>;
> > engine=# commit;
> >
> > engine=# \q
> >
> > Restart the engine and everything should be in sync again.
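
Before changing anything in the database as described above, a read-only
report of the affected rows may be useful. The following is only a sketch:
run_query, parse_rows and describe are hypothetical helper names, the
database name and user are taken from the "psql engine postgres" call above,
and the only status codes it names are the two mentioned in this thread
(1 treated as OK, 4 as illegal). It assumes psql can connect without a
password prompt, for example when run as the postgres user.

```python
import subprocess

# Status codes as used in this thread; any other value is shown by number.
IMAGE_STATUS = {1: "OK", 4: "ILLEGAL"}

# Read-only: lists every image whose status is not 1.
QUERY = "select image_group_id, imagestatus from images where imagestatus <> 1;"


def run_query(sql, dbname="engine", user="postgres"):
    """Run one statement through psql and return its raw text output."""
    # -A: unaligned output, -t: rows only, -F: field separator, -c: command.
    cmd = ["psql", "-A", "-t", "-F", "|", "-d", dbname, "-U", user, "-c", sql]
    return subprocess.check_output(cmd).decode()


def parse_rows(text):
    """Turn 'id|status' lines into a list of (id, status) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        image_group_id, status = line.split("|")
        rows.append((image_group_id, int(status)))
    return rows


def describe(rows):
    """Format each row as 'id: STATUS' for a report."""
    return ["%s: %s" % (gid, IMAGE_STATUS.get(st, "status %d" % st))
            for gid, st in rows]
```

Calling describe(parse_rows(run_query(QUERY))) on the engine host would give
one line per image that is not in status 1, without modifying any rows.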
> >
> > Best regards.
> >
> > Markus
>