
On Thu, Apr 21, 2016 at 1:10 AM, Clint Boggio <clint@theboggios.com> wrote:
Bug is filed.
[Bug 1329000] Snapshot Images Flagged as "ILLEGAL" After Backup Script Is Run
Thanks
Is there a technique for recovery from this condition, or should I back up the data on the afflicted VMs that are still running and start over?
I'm not sure what the condition is. It may be a good chain on the host side with extra volumes on the engine side, which are not needed and can be deleted, or volumes missing on the host side.

To check whether there are missing volumes on the host side, you can inspect the real chain used by libvirt using virsh:

  # virsh
  virsh # list
  Please enter your authentication name: vdsm@ovirt
  Please enter your password: shibboleth
   Id    Name                           State
  ----------------------------------------------------
   2     lsm-test                       running

  virsh # dumpxml 2
  <domain type='kvm' id='2'>
  [...]
    <devices>
    [...]
      <disk type='block' device='disk' snapshot='no'>
        <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
        <source dev='/rhev/data-center/f9374c0e-ae24-4bc1-a596-f61d5f05bc5f/1e999a77-8fbb-4792-9224-0693be3242b9/images/bb26f6eb-d54d-43f3-8d18-e260efb1df7e/4786bb86-da94-44af-b012-51d899cc7225'/>
        <backingStore type='block' index='1'>
          <format type='qcow2'/>
          <source dev='/rhev/data-center/f9374c0e-ae24-4bc1-a596-f61d5f05bc5f/1e999a77-8fbb-4792-9224-0693be3242b9/images/bb26f6eb-d54d-43f3-8d18-e260efb1df7e/../bb26f6eb-d54d-43f3-8d18-e260efb1df7e/a07c0fec-242f-444b-8892-e4a0b22e08a7'/>
          <backingStore type='block' index='2'>
            <format type='qcow2'/>
            <source dev='/rhev/data-center/f9374c0e-ae24-4bc1-a596-f61d5f05bc5f/1e999a77-8fbb-4792-9224-0693be3242b9/images/bb26f6eb-d54d-43f3-8d18-e260efb1df7e/../bb26f6eb-d54d-43f3-8d18-e260efb1df7e/../bb26f6eb-d54d-43f3-8d18-e260efb1df7e/c76a763a-8208-4fa6-ab60-6bdab15b6159'/>
            <backingStore type='block' index='3'>
              <format type='qcow2'/>
              <source dev='/rhev/data-center/f9374c0e-ae24-4bc1-a596-f61d5f05bc5f/1e999a77-8fbb-4792-9224-0693be3242b9/images/bb26f6eb-d54d-43f3-8d18-e260efb1df7e/../bb26f6eb-d54d-43f3-8d18-e260efb1df7e/../bb26f6eb-d54d-43f3-8d18-e260efb1df7e/../bb26f6eb-d54d-43f3-8d18-e260efb1df7e/7c0d5c23-710d-445b-868c-9add6219436d'/>
              <backingStore/>
            </backingStore>
          </backingStore>
        </backingStore>
        <target dev='vda' bus='virtio'/>
        <serial>bb26f6eb-d54d-43f3-8d18-e260efb1df7e</serial>
        <boot order='1'/>
        <alias name='virtio-disk0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
      </disk>
  [...]

Here we can see the real chain (removing everything but the volume id):

  1. 4786bb86-da94-44af-b012-51d899cc7225
  2. a07c0fec-242f-444b-8892-e4a0b22e08a7
  3. c76a763a-8208-4fa6-ab60-6bdab15b6159
  4. 7c0d5c23-710d-445b-868c-9add6219436d

If engine complains about a snapshot which is not part of this chain, the problem is in the engine database and we can safely remove the snapshot from the database. If engine complains about a volume which is in this chain, and the volume is missing on disk, this is an issue on the host side. I'm not sure it is possible to restore such a missing file unless you have a backup.

It would be useful if you dump the xml of the vms with this issue and attach it to the bug.

Nir
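If it helps, here is a quick sketch for listing just the volume ids in the chain. This is not official tooling; it assumes the <source dev='...'/> layout shown above and uses the read-only connection (virsh -r), which should not prompt for credentials:

  # Print the last path component of every <source dev='...'/> line,
  # i.e. the volume id, in top-to-base order (domain id 2 as above).
  virsh -r dumpxml 2 | awk -F"'" '/source dev=/ {n = split($2, p, "/"); print p[n]}'

You can then compare this list with the volumes engine shows for the disk.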
Thank you all for your help.
On Apr 20, 2016, at 3:19 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Apr 20, 2016 at 10:42 PM, Clint Boggio <clint@theboggios.com> wrote:
I grepped through the engine logs until I found a reference to the illegal disk in question. The log indicates that the image has been flagged illegal because the original disk is no longer present. So it is very possible that the backup script, somehow through the miracle of 1's and 0's, deleted the base VM disks.
###################### # BEGIN ######################
2016-03-27 18:57:41,769 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (org.ovirt.thread.pool-8-thread-11) [30680dce] START, CreateSnapshotVDSCommand( CreateSnapshotVDSCommandParameters:{runAsync='true', storagePoolId='85a72afd-7bde-4065-a1bc-7fc6e22e6bf6', ignoreFailoverLimit='false', storageDomainId='045c7fda-ab98-4905-876c-00b5413a619f', imageGroupId='ad486d26-4594-4d16-a402-68b45d82078a', imageSizeInBytes='268435456000', volumeFormat='COW', newImageId='e87e0c7c-4f6f-45e9-90ca-cf34617da3f6', newImageDescription='', imageInitialSizeInBytes='0', imageId='d538e0ef-2f55-4c74-b8f1-8900fd6b814b', sourceImageGroupId='ad486d26-4594-4d16-a402-68b45d82078a'}), log id: 7648bbd2

2016-03-27 18:57:42,835 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (org.ovirt.thread.pool-8-thread-11) [30680dce] FINISH, CreateSnapshotVDSCommand, return: e87e0c7c-4f6f-45e9-90ca-cf34617da3f6, log id: 7648bbd2

2016-03-28 18:58:24,395 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.GetImageInfoVDSCommand] (org.ovirt.thread.pool-8-thread-20) [30680dce] START, GetImageInfoVDSCommand( GetImageInfoVDSCommandParameters:{runAsync='true', storagePoolId='85a72afd-7bde-4065-a1bc-7fc6e22e6bf6', ignoreFailoverLimit='false', storageDomainId='045c7fda-ab98-4905-876c-00b5413a619f', imageGroupId='ad486d26-4594-4d16-a402-68b45d82078a', imageId='e87e0c7c-4f6f-45e9-90ca-cf34617da3f6'}), log id: 6d2d19f6

2016-03-28 14:14:49,454 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-7-thread-3) [718f57] START, MergeVDSCommand(HostName = KVM04, MergeVDSCommandParameters:{runAsync='true', hostId='b51933a3-9201-4446-a3e3-906a2ec1b467', vmId='6ef30172-b010-46fa-9482-accd30682232', storagePoolId='85a72afd-7bde-4065-a1bc-7fc6e22e6bf6', storageDomainId='045c7fda-ab98-4905-876c-00b5413a619f', imageGroupId='ad486d26-4594-4d16-a402-68b45d82078a', imageId='e87e0c7c-4f6f-45e9-90ca-cf34617da3f6', baseImageId='6e008200-3c21-4285-96b8-07c29c0cb72c', topImageId='d538e0ef-2f55-4c74-b8f1-8900fd6b814b', bandwidth='0'}), log id: 2cc2db4

2016-03-28 17:01:22,368 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (default task-77) [410b6a44] START, CreateSnapshotVDSCommand( CreateSnapshotVDSCommandParameters:{runAsync='true', storagePoolId='85a72afd-7bde-4065-a1bc-7fc6e22e6bf6', ignoreFailoverLimit='false', storageDomainId='045c7fda-ab98-4905-876c-00b5413a619f', imageGroupId='ad486d26-4594-4d16-a402-68b45d82078a', imageSizeInBytes='268435456000', volumeFormat='COW', newImageId='919d6991-43e4-4f26-868e-031a01011191', newImageDescription='', imageInitialSizeInBytes='0', imageId='e87e0c7c-4f6f-45e9-90ca-cf34617da3f6', sourceImageGroupId='ad486d26-4594-4d16-a402-68b45d82078a'}), log id: 4ed3e9ca

2016-03-28 18:36:28,404 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (pool-7-thread-1) [6911a44f] START, MergeVDSCommand(HostName = KVM04, MergeVDSCommandParameters:{runAsync='true', hostId='b51933a3-9201-4446-a3e3-906a2ec1b467', vmId='6ef30172-b010-46fa-9482-accd30682232', storagePoolId='85a72afd-7bde-4065-a1bc-7fc6e22e6bf6', storageDomainId='045c7fda-ab98-4905-876c-00b5413a619f', imageGroupId='ad486d26-4594-4d16-a402-68b45d82078a', imageId='919d6991-43e4-4f26-868e-031a01011191', baseImageId='e87e0c7c-4f6f-45e9-90ca-cf34617da3f6', topImageId='919d6991-43e4-4f26-868e-031a01011191', bandwidth='0'}), log id: d09cb70

2016-03-28 18:39:53,773 INFO [org.ovirt.engine.core.bll.MergeCommandCallback] (DefaultQuartzScheduler_Worker-99) [6911a44f] Merge command has completed for images 'e87e0c7c-4f6f-45e9-90ca-cf34617da3f6'..'919d6991-43e4-4f26-868e-031a01011191'

2016-03-28 18:41:23,003 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskLiveCommand] (DefaultQuartzScheduler_Worker-44) [a00e3a8] Merging of snapshot 'a1b3c247-2c6f-4731-9e62-c15f5cfb9a72' images 'e87e0c7c-4f6f-45e9-90ca-cf34617da3f6'..'919d6991-43e4-4f26-868e-031a01011191' failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.
This is a live merge failure - we have a similar bug causing this, and I reproduced a similar failure today. This may be the same bug; we must inspect the logs to be sure.
Typically the merge succeeds on the vdsm side, but for some reason the engine fails to detect the merge success and marks the volumes as illegal.
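To see which images the engine has flagged, you can also run a read-only query against the engine database. A minimal sketch, assuming the images table and its imagestatus column (4 = ILLEGAL) match your engine version - please verify that before relying on it, and never modify the database without a backup:

  # on the engine host
  su - postgres -c 'psql engine -c "SELECT image_group_id, image_guid, imagestatus FROM images WHERE imagestatus = 4;"'

This only reads; any fix to the status should be decided per volume after comparing with the real chain on the host.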
################## # END ##################
If that's the case, then why (how) are the afflicted machines that have not been rebooted still running without their backing disks?
It is possible to unlink a file while it is being used by another process. The directory entry is removed so another process cannot access the file, but processes that already opened the file are not affected.
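You can see this behavior with a tiny shell illustration (throwaway path, just for demonstration):

  $ echo data > /tmp/demo
  $ exec 3< /tmp/demo   # open the file on fd 3
  $ rm /tmp/demo        # unlink: the directory entry is gone
  $ cat /tmp/demo       # new opens fail
  cat: /tmp/demo: No such file or directory
  $ cat <&3             # the fd opened earlier still reads the data
  data

qemu behaves the same way: it keeps its disk images open, so a running VM is not affected when the files are unlinked, and ls -l /proc/<qemu-pid>/fd would show them as "(deleted)".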
But this looks like the live merge issue, not like your backup script trying too hard.
I can upload the logs and a copy of the backup script. Do you all have a repository you'd like me to upload to? Let me know and I'll upload them right now.
Please file a bug and attach the files there.
Nir
On Wed, 2016-04-20 at 13:33 -0400, users-request@ovirt.org wrote:
Message: 1
Date: Wed, 20 Apr 2016 19:09:39 +0200
From: Arsène Gschwind <arsene.gschwind@unibas.ch>
To: Simon Barrett <Simon.Barrett@tradingscreen.com>, "users@ovirt.org" <users@ovirt.org>
Subject: Re: [ovirt-users] vhostmd vdsm-hook
I've never tried it with 2 disks, but I assume the next free available disk will be used by the vdsm hook, and the vm-dump-metrics command will check which kind of disk it is. Let me know if you give it a try...
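If you want to check by hand which virtio disk carries the metrics, here is a rough sketch to run inside the guest. It assumes the vhostmd XML payload shows up near the start of the device, which I have not verified on every version:

  # probe the first KiB of each virtio disk for the metrics XML
  for d in /dev/vd?; do
      head -c 1024 "$d" | grep -aq "<metrics" && echo "$d looks like the metrics disk"
  done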
Thanks, Arsène
On 04/19/2016 02:43 PM, Simon Barrett wrote:
Thanks again, but how does that work when a VM is configured to have more than one disk?
If I have a VM with a /dev/vda disk and a /dev/vdb disk, when I turn the vhostmd hook on, the VM metric device gets created as /dev/vdb and the original /dev/vdb disk gets bumped to /dev/vdc.
Is that expected behavior? Will that not cause problems?
Thanks,
Simon
From: Arsène Gschwind [mailto:arsene.gschwind@unibas.ch]
Sent: Tuesday, 19 April, 2016 13:06
To: Simon Barrett <Simon.Barrett@tradingscreen.com>; users@ovirt.org
Subject: Re: [ovirt-users] vhostmd vdsm-hook
The metric information is available on this additional disk /dev/vdb. You may install the package vm-dump-metrics and use the vm-dump-metrics command, which will display all metrics in XML format.
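For illustration, an abridged run from inside a guest; the metric names and values below are made up, the real set depends on the vhostmd configuration on the node:

  # yum install vm-dump-metrics
  # vm-dump-metrics
  <metrics>
    <metric type="string" context="host">
      <name>HostName</name>
      <value>kvm04.example.com</value>
    </metric>
    ...
  </metrics>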
Arsène
On 04/19/2016 10:48 AM, Simon Barrett wrote:
Thanks Arsène,
I have vhostmd running on the oVirt node and have set sap_agent to true in the VM configuration. I also stopped and started the VM to ensure the config change took effect.
On the oVirt node I see vhostmd running and the following entry in the qemu-kvm command line:
-drive file=/dev/shm/vhostmd0,if=none,id=drive-virtio-disk701,readonly=on,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk701,id=virtio-disk701
The part I wasn't quite understanding was how this presented itself on the VM, but I now see a new disk device "/dev/vdb". If I cat the contents of /dev/vdb I now see the information provided from the ovirt node, which is great news and very useful.
Thanks for your help.
Simon
From: users-bounces@ovirt.org On Behalf Of Arsène Gschwind
Sent: Monday, 18 April, 2016 16:03
To: users@ovirt.org
Subject: Re: [ovirt-users] vhostmd vdsm-hook
Hi Simon,
You will need to have vhostmd running on the oVirt node and the "sap_agent" custom property set for the VM, as you can see in the screenshot.
[screenshot: sap_agent custom property in the VM configuration]
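In case the property is not offered on your setup, here is a hedged sketch for checking and defining it on the engine host; the cluster level passed to --cver is an assumption, adjust it to yours:

  # see whether sap_agent is already a known property
  engine-config -g PredefinedVMProperties
  engine-config -g UserDefinedVMProperties
  # if it is missing, define it and restart the engine
  engine-config -s UserDefinedVMProperties='sap_agent=^(true|false)$' --cver=3.6
  systemctl restart ovirt-engine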
Arsène
On 04/15/2016 12:15 PM, Simon Barrett wrote:
I'm trying to use the vhostmd vdsm hook to access ovirt node metrics from within a VM. Vhostmd is running and updating /dev/shm/vhostmd0 on the ovirt node.
The part I'm stuck on is: "This disk image is exported read-only to guests. Guests can read the disk image to see metrics" from http://www.ovirt.org/develop/developer-guide/vdsm/hook/vhostmd/
Does the hook do this by default? I don't see any new read-only device mounted in the guest. Is there additional work I need to do to mount this and access the data from within the guest?
Many thanks,
Simon