 
*Steve *
On Wed, Apr 23, 2014 at 5:14 AM, Dafna Ron <dron@redhat.com> wrote:
Steve, I did not say that there is a limit. There is no limit and you can take 1000 snapshots if you like; I simply said that I don't think it would be good practice to do so.
I'm not trying to be adversarial here, but this is contradictory; if there's 'no limit' but 'it's not good practice' and we assume that we want our virtual infrastructure to run smoothly, then effectively there is a limit; we just don't know what it is.
I also did not say that this is your current problem with the vm so you are jumping to conclusions here.
I wasn't connecting the dots between the number of snapshots and the current issue; I have other VMs with the same number of snapshots without this problem. No conclusion jumping going on. I'm more interested in what the best practice is for VMs that accumulate snapshots over time. There is a feature slated for 3.5, http://www.ovirt.org/Features/Live_Merge, which merges snapshots on a running VM, so I suppose in the long run I won't have a high snapshot count.
I simply explained how snapshots work, which is that they are created in a chain; if there is a problem at a single point in time it would affect the rest of the snapshots below it.
Just for clarity, does 'below it' mean the snapshots taken after the problematic one? Example: snapshots 1,2,3,4,5. #4 has a consistency issue; snaps 1,2,3 should be OK? I can try incrementally rolling back snapshots if this is the case (after the vdsm restart suggested). Is there any way to do a consistency check? I can imagine scheduling a cron job to run a nightly check for consistency issues, then rolling back to an earlier snapshot to circumvent the issue.
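A minimal sketch of what such a nightly check could look like, assuming a file-based storage domain where each snapshot volume is a qcow2 file whose header names its backing file (as the backing_file value in the vdsm error in this thread suggests). The function names and the max_depth guard are hypothetical, not part of vdsm. It only confirms that every backing file referenced in the chain exists; it does not validate qcow2 metadata the way `qemu-img check` does.

```python
import os
import struct

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of every qcow2 image


def read_backing_file(path):
    """Return the backing file name stored in a qcow2 header, or None.

    None means the file is not qcow2 (e.g. a raw base volume) or has
    no backing file recorded.
    """
    with open(path, "rb") as f:
        header = f.read(20)
        if len(header) < 20 or header[:4] != QCOW2_MAGIC:
            return None
        # Big-endian: version (u32), backing_file_offset (u64),
        # backing_file_size (u32).
        _version, offset, size = struct.unpack(">IQI", header[4:20])
        if offset == 0 or size == 0:
            return None
        f.seek(offset)
        return f.read(size).decode("utf-8")


def walk_chain(leaf_path, max_depth=1000):
    """Follow backing files from the newest volume towards the base.

    Returns (chain, missing): the volumes that were found, newest first,
    and the first referenced path that does not exist (None if the
    chain is complete).
    """
    chain = []
    path = os.path.realpath(leaf_path)
    for _ in range(max_depth):
        if not os.path.isfile(path):
            return chain, path
        chain.append(path)
        backing = read_backing_file(path)
        if backing is None:
            return chain, None
        # Relative backing paths are resolved against the directory
        # of the image that references them.
        path = os.path.realpath(os.path.join(os.path.dirname(path), backing))
    raise RuntimeError("chain deeper than max_depth; possible loop")
```

Called on the active volume of the image directory, a non-None `missing` would point at the first broken link, and everything in `chain` before it is at least present on disk.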
And that we query all images under the base Image so if you have a lot of them it would take a long time for the results to come back.
That's good to know, is this query done on new snapshot creation only? So over time the more snapshots I have, new snapshots will take longer to complete?
As for your vm, since you fail to create a snapshot on only that vm, it means that there is a problem in the current vm and its chain.
I can see when comparing the UUIDs that the pool, domain, base image and last snapshot all exist under the rhev link.
2014-04-22 12:13:41,083 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (pool-6-thread-49) [7ccaed5] -- createVolume parameters: sdUUID=95b9d922-4df7-4d3b-9bca-467e2fd9d573 spUUID=9497ef2c-8368-4c92-8d61-7f318a90748f imgGUID=466d9ae9-e46a-46f8-9f4b-964d8af0675b size=21,474,836,480 bytes volFormat=COW volType=Sparse volUUID=0b2d15e5-bf4f-4eaf-90e2-f1bd51a3a936 descr= srcImgGUID=466d9ae9-e46a-46f8-9f4b-964d8af0675b srcVolUUID=1a67de4b-aa1c-4436-baca-ca55726d54d7
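The UUID comparison described above can be sketched in a few lines, assuming the engine log line keeps the key=value layout shown here and that volumes live under /rhev/data-center/<spUUID>/<sdUUID>/images/<imgGUID>/<volUUID>, as the paths elsewhere in this thread suggest. Both helper names are hypothetical.

```python
import posixpath
import re


def parse_create_volume(line):
    """Extract the key=value pairs from a 'createVolume parameters:' line."""
    _, _, params = line.partition("createVolume parameters:")
    # Values contain no spaces except 'size', where only the number is kept;
    # an empty value (e.g. 'descr=') becomes an empty string.
    return dict(re.findall(r"(\w+)=(\S*)", params))


def source_volume_path(params, root="/rhev/data-center"):
    """Build the path of the volume the new snapshot would be based on."""
    return posixpath.join(
        root,
        params["spUUID"],
        params["sdUUID"],
        "images",
        params["srcImgGUID"],
        params["srcVolUUID"],
    )
```

Feeding the resulting path to `os.path.exists` on the host would confirm whether the source volume is actually visible there.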
Let's see if it's possibly a cache issue. Can you please restart vdsm on the hosts?
I'll update when I have a chance to restart the services. Thanks
On 04/22/2014 08:22 PM, Steve Dainard wrote:
All snapshots are from before failure.
That's a bit scary that there may be a 'too many snapshots' issue. I take snapshots for point-in-time consistency, and without the ability to collapse them while the vm is running I'm not sure what the best option is here. What is the recommended snapshot limit? Or maybe a better question: what's the intended use case for snapshots in oVirt?
Export domain is currently unavailable, and without it active I can't disable it properly.
# ls -tl /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b
total 8517740
-rw-rw----. 1 vdsm kvm 97583104 Apr 22 14:03 1a67de4b-aa1c-4436-baca-ca55726d54d7
-rw-r--r--. 1 vdsm kvm 268 Apr 22 12:13 1a67de4b-aa1c-4436-baca-ca55726d54d7.meta
-rw-r--r--. 1 vdsm kvm 272 Apr 22 01:06 87390b64-becd-4a6f-a4fc-d27655f59b64.meta
-rw-rw----. 1 vdsm kvm 1048576 Apr 22 01:04 1a67de4b-aa1c-4436-baca-ca55726d54d7.lease
-rw-rw----. 1 vdsm kvm 107413504 Apr 20 22:00 87390b64-becd-4a6f-a4fc-d27655f59b64
-rw-rw----. 1 vdsm kvm 104267776 Apr 19 22:00 6f9fd451-6c82-4390-802c-9e23a7d89427
-rw-rw----. 1 vdsm kvm 1048576 Apr 19 22:00 87390b64-becd-4a6f-a4fc-d27655f59b64.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 19 22:00 6f9fd451-6c82-4390-802c-9e23a7d89427.meta
-rw-rw----. 1 vdsm kvm 118358016 Apr 18 22:00 c298ce3b-ec6a-4526-9971-a769f4d3d69b
-rw-rw----. 1 vdsm kvm 1048576 Apr 18 22:00 6f9fd451-6c82-4390-802c-9e23a7d89427.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 18 22:00 c298ce3b-ec6a-4526-9971-a769f4d3d69b.meta
-rw-rw----. 1 vdsm kvm 120913920 Apr 17 22:00 0ee58208-6be8-4f81-bd51-0bd4b6d5d83a
-rw-rw----. 1 vdsm kvm 1048576 Apr 17 22:00 c298ce3b-ec6a-4526-9971-a769f4d3d69b.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 17 22:00 0ee58208-6be8-4f81-bd51-0bd4b6d5d83a.meta
-rw-rw----. 1 vdsm kvm 117374976 Apr 16 22:00 9aeb973d-9a54-441e-9ce9-f4f1a233da26
-rw-rw----. 1 vdsm kvm 1048576 Apr 16 22:00 0ee58208-6be8-4f81-bd51-0bd4b6d5d83a.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 16 22:00 9aeb973d-9a54-441e-9ce9-f4f1a233da26.meta
-rw-rw----. 1 vdsm kvm 110886912 Apr 15 22:00 0eae2185-884a-44d3-9099-e952b6b7ec37
-rw-rw----. 1 vdsm kvm 1048576 Apr 15 22:00 9aeb973d-9a54-441e-9ce9-f4f1a233da26.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 15 22:00 0eae2185-884a-44d3-9099-e952b6b7ec37.meta
-rw-rw----. 1 vdsm kvm 1048576 Apr 14 22:00 0eae2185-884a-44d3-9099-e952b6b7ec37.lease
-rw-rw----. 1 vdsm kvm 164560896 Apr 14 22:00 ceffc643-b823-44b3-961e-93f3dc971886
-rw-r--r--. 1 vdsm kvm 272 Apr 14 22:00 ceffc643-b823-44b3-961e-93f3dc971886.meta
-rw-rw----. 1 vdsm kvm 1048576 Apr 13 22:00 ceffc643-b823-44b3-961e-93f3dc971886.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 13 22:00 878fc690-ab08-489c-955b-9159f62026b1.meta
-rw-rw----. 1 vdsm kvm 109182976 Apr 13 21:59 878fc690-ab08-489c-955b-9159f62026b1
-rw-rw----. 1 vdsm kvm 110297088 Apr 12 22:00 5210eec2-a0eb-462e-95d5-7cf27db312f5
-rw-rw----. 1 vdsm kvm 1048576 Apr 12 22:00 878fc690-ab08-489c-955b-9159f62026b1.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 12 22:00 5210eec2-a0eb-462e-95d5-7cf27db312f5.meta
-rw-rw----. 1 vdsm kvm 76480512 Apr 11 22:00 dcce0903-0f24-434b-9d1c-d70e3969e5ea
-rw-rw----. 1 vdsm kvm 1048576 Apr 11 22:00 5210eec2-a0eb-462e-95d5-7cf27db312f5.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 11 22:00 dcce0903-0f24-434b-9d1c-d70e3969e5ea.meta
-rw-rw----. 1 vdsm kvm 1048576 Apr 11 12:34 dcce0903-0f24-434b-9d1c-d70e3969e5ea.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 11 12:34 d3a1c505-8f6a-4c2b-97b7-764cd5baea47.meta
-rw-rw----. 1 vdsm kvm 208666624 Apr 11 12:33 d3a1c505-8f6a-4c2b-97b7-764cd5baea47
-rw-rw----. 1 vdsm kvm 14614528 Apr 10 16:12 638c2164-2edc-4294-ac99-c51963140940
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 16:12 d3a1c505-8f6a-4c2b-97b7-764cd5baea47.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 16:12 638c2164-2edc-4294-ac99-c51963140940.meta
-rw-rw----. 1 vdsm kvm 12779520 Apr 10 16:06 f8f1f164-c0d9-4716-9ab3-9131179a79bd
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 16:05 638c2164-2edc-4294-ac99-c51963140940.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 16:05 f8f1f164-c0d9-4716-9ab3-9131179a79bd.meta
-rw-rw----. 1 vdsm kvm 92995584 Apr 10 16:00 f9b14795-a26c-4edb-ae34-22361531a0a1
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 16:00 f8f1f164-c0d9-4716-9ab3-9131179a79bd.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 16:00 f9b14795-a26c-4edb-ae34-22361531a0a1.meta
-rw-rw----. 1 vdsm kvm 30015488 Apr 10 14:57 39cbf947-f084-4e75-8d6b-b3e5c32b82d6
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 14:57 f9b14795-a26c-4edb-ae34-22361531a0a1.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:57 39cbf947-f084-4e75-8d6b-b3e5c32b82d6.meta
-rw-rw----. 1 vdsm kvm 19267584 Apr 10 14:34 3ece1489-9bff-4223-ab97-e45135106222
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 14:34 39cbf947-f084-4e75-8d6b-b3e5c32b82d6.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:34 3ece1489-9bff-4223-ab97-e45135106222.meta
-rw-rw----. 1 vdsm kvm 22413312 Apr 10 14:29 dcee2e8a-8803-44e2-80e8-82c882af83ef
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 14:28 3ece1489-9bff-4223-ab97-e45135106222.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:28 dcee2e8a-8803-44e2-80e8-82c882af83ef.meta
-rw-rw----. 1 vdsm kvm 54460416 Apr 10 14:26 57066786-613a-46ff-b2f9-06d84678975b
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 14:26 dcee2e8a-8803-44e2-80e8-82c882af83ef.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:26 57066786-613a-46ff-b2f9-06d84678975b.meta
-rw-rw----. 1 vdsm kvm 15728640 Apr 10 13:31 121ae509-d2b2-4df2-a56f-dfdba4b8d21c
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 13:30 57066786-613a-46ff-b2f9-06d84678975b.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:30 121ae509-d2b2-4df2-a56f-dfdba4b8d21c.meta
-rw-rw----. 1 vdsm kvm 5767168 Apr 10 13:18 1d95a9d2-e4ba-4bcc-ba71-5d493a838dcc
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 13:17 121ae509-d2b2-4df2-a56f-dfdba4b8d21c.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:17 1d95a9d2-e4ba-4bcc-ba71-5d493a838dcc.meta
-rw-rw----. 1 vdsm kvm 5373952 Apr 10 13:13 3ce8936a-38f5-43a9-a4e0-820094fbeb04
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 13:13 1d95a9d2-e4ba-4bcc-ba71-5d493a838dcc.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:12 3ce8936a-38f5-43a9-a4e0-820094fbeb04.meta
-rw-rw----. 1 vdsm kvm 3815243776 Apr 10 13:11 7211d323-c398-4c1c-8524-a1047f9d5ec9
-rw-rw----. 1 vdsm kvm 1048576 Apr 10 13:11 3ce8936a-38f5-43a9-a4e0-820094fbeb04.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:11 7211d323-c398-4c1c-8524-a1047f9d5ec9.meta
-rw-r--r--. 1 vdsm kvm 272 Mar 19 10:35 af94adc4-fad4-42f5-a004-689670311d66.meta
-rw-rw----. 1 vdsm kvm 21474836480 Mar 19 10:22 af94adc4-fad4-42f5-a004-689670311d66
-rw-rw----. 1 vdsm kvm 1048576 Mar 19 09:39 7211d323-c398-4c1c-8524-a1047f9d5ec9.lease
-rw-rw----. 1 vdsm kvm 1048576 Mar 19 09:39 af94adc4-fad4-42f5-a004-689670311d66.lease
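In the listing above every volume appears as three files: the data file, a .meta file and a .lease file. A quick sketch, assuming that three-file layout is what a healthy file-based image directory should look like, that flags any volume missing one of them. The function name is hypothetical and the check says nothing about the contents of the files.

```python
import os
from collections import defaultdict

# Suffixes seen for each volume in the listing: "" is the data file itself.
EXPECTED_SUFFIXES = {"", ".meta", ".lease"}


def incomplete_volumes(image_dir):
    """Map volume UUID -> sorted list of missing suffixes for that volume."""
    found = defaultdict(set)
    for name in os.listdir(image_dir):
        vol_uuid, suffix = os.path.splitext(name)
        found[vol_uuid].add(suffix)
    return {
        vol: sorted(EXPECTED_SUFFIXES - suffixes)
        for vol, suffixes in found.items()
        if EXPECTED_SUFFIXES - suffixes
    }
```

An empty dictionary means every volume in the directory has all three files.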
It's just very odd that I can snapshot any other VM except this one.
I just cloned a new VM from the last snapshot on this VM and it created without issue. I was also able to snapshot the new VM without a problem.
*Steve *
On Tue, Apr 22, 2014 at 12:51 PM, Dafna Ron <dron@redhat.com> wrote:
it's the same error:
c1d7c4e-392b-4a62-9836-3add1360a46d::DEBUG::2014-04-22 12:13:44,340::volume::1058::Storage.Misc.excCmd::(createVolume) FAILED: <err> = '/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/0b2d15e5-bf4f-4eaf-90e2-f1bd51a3a936: error while creating qcow2: No such file or directory\n'; <rc> = 1
Were these 23 snapshots created anyway each time we failed to create a snapshot, or are these older snapshots which you actually created before the failure?
at this point my main theory is that somewhere along the line you had some sort of failure in your storage and from that time each snapshot you create will fail. if the snapshots are created during the failure can you please delete the snapshots you do not need and try again?
There should not be a limit on how many snapshots you can have since it's only a link changing the image the vm should boot from. Having said that, it's not ideal to have that many snapshots and can probably lead to unexpected results so I would not recommend having that many snapshots on a single vm :)
for example, my second theory would be that because we have so many snapshots we have some sort of race where part of the createVolume command expects some result from a query run before the create itself and because there are so many snapshots there is "no such file" on the volume because it's too far up the list.
can you also run: ls -l /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b
let's see what images are listed under that vm.
btw, you know that your export domain is getting StorageDomainDoesNotExist in the vdsm log? is that domain in up state? can you try to deactivate the export domain?
Thanks,
Dafna
On 04/22/2014 05:20 PM, Steve Dainard wrote:
Ominous..
23 snapshots. Is there an upper limit?
Offline snapshot fails as well. Both logs attached again (snapshot attempted at 12:13 EST).
*Steve *
On Tue, Apr 22, 2014 at 11:20 AM, Dafna Ron <dron@redhat.com> wrote:
are you able to take an offline snapshot? (while the vm is down) how many snapshots do you have on this vm?
On 04/22/2014 04:19 PM, Steve Dainard wrote:
No alert in web ui, I restarted the VM yesterday just in case, no change. I also restored an earlier snapshot and tried to re-snapshot, same result.
*Steve *
On Tue, Apr 22, 2014 at 10:57 AM, Dafna Ron <dron@redhat.com> wrote:
This is the actual problem:
bf025a73-eeeb-4ac5-b8a9-32afa4ae482e::DEBUG::2014-04-22 10:21:49,374::volume::1058::Storage.Misc.excCmd::(createVolume) FAILED: <err> = '/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/87efa937-b31f-4bb1-aee1-0ee14a0dc6fb: error while creating qcow2: No such file or directory\n'; <rc> = 1
from that you see the actual failure:
bf025a73-eeeb-4ac5-b8a9-32afa4ae482e::ERROR::2014-04-22 10:21:49,392::volume::286::Storage.Volume::(clone) Volume.clone: can't clone: /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7 to /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/87efa937-b31f-4bb1-aee1-0ee14a0dc6fb
bf025a73-eeeb-4ac5-b8a9-32afa4ae482e::ERROR::2014-04-22 10:21:49,392::volume::508::Storage.Volume::(create) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 466, in create
    srcVolUUID, imgPath, volPath)
  File "/usr/share/vdsm/storage/fileVolume.py", line 160, in _create
    volParent.clone(imgPath, volUUID, volFormat, preallocate)
  File "/usr/share/vdsm/storage/volume.py", line 287, in clone
    raise se.CannotCloneVolume(self.volumePath, dst_path, str(e))
CannotCloneVolume: Cannot clone volume: 'src=/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7, dst=/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/87efa937-b31f-4bb1-aee1-0ee14a0dc6fb: Error creating a new volume: (["Formatting \'/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/87efa937-b31f-4bb1-aee1-0ee14a0dc6fb\', fmt=qcow2 size=21474836480 backing_file=\'../466d9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7\' backing_fmt=\'qcow2\' encryption=off cluster_size=65536 "],)'
do you have any alert in the webadmin to restart the vm?
Dafna
On 04/22/2014 03:31 PM, Steve Dainard wrote:
Sorry for the confusion.
I attempted to take a live snapshot of a running VM. After that failed, I migrated the VM to another host, and attempted the live snapshot again without success, eliminating a single host as the cause of failure.
Ovirt is 3.3.4, storage domain is gluster 3.4.2.1, OS is CentOS 6.5.
Package versions:
libvirt-0.10.2-29.el6_5.5.x86_64
libvirt-lock-sanlock-0.10.2-29.el6_5.5.x86_64
qemu-img-rhev-0.12.1.2-2.415.el6.nux.3.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6.nux.3.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.415.el6.nux.3.x86_64
vdsm-4.13.3-4.el6.x86_64
vdsm-gluster-4.13.3-4.el6.noarch
I made another live snapshot attempt at 10:21 EST today, full vdsm.log attached, and a truncated engine.log.
Thanks,
*Steve *
On Tue, Apr 22, 2014 at 9:48 AM, Dafna Ron <dron@redhat.com> wrote:
please explain the flow of what you are trying to do, are you trying to live migrate the disk (from one storage to another), are you trying to migrate the vm and after vm migration is finished you try to take a live snapshot of the vm? or are you trying to take a live snapshot of the vm during a vm migration from host1 to host2?
Please attach full vdsm logs from any host you are using (if you are trying to migrate the vm from host1 to host2) + please attach engine log.
Also, what are the vdsm, libvirt and qemu versions, what oVirt version are you using, and what storage are you using?
Thanks,
Dafna
On 04/22/2014 02:12 PM, Steve Dainard wrote:
I've attempted migrating the vm to another host and taking a snapshot, but I get this error:
6efd33f4-984c-4513-b5e6-fffdca2e983b::ERROR::2014-04-22 01:09:37,296::volume::286::Storage.Volume::(clone) Volume.clone: can't clone: /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7 to /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/b230596f-97bc-4532-ba57-5654fa9c6c51
A bit more of the vdsm log is attached.
Other vm's are snapshotting without issue.
Any help appreciated,
*Steve *
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
-- Dafna Ron