September 2019 - Users - Ovirt List Archives

Re: Urgent help needed / Snapshot deletion failure

by Strahil

Is it really down ? Have you tried to migrate it to another host. I found out yesterday that one of my VMs was constantly reporting errors (relayed to snapshot delete) and the errors stopped when the system got migrated to another host. Best Regards, Strahil NikolovOn Sep 25, 2019 15:27, smirta(a)gmx.net wrote: > > Dear all we are desperate > > We have tried to delete a Snapshot and it hangs at merging the snapshots. We've found out that this is a known bug because the merge process is being called with wrong parameters according to the bug report https://bugzilla.redhat.com/show_bug.cgi?id=1601212. The Snapshot's Disks are in illegal state and the VM is locked. Since it is a very important VM for our non-profit organization, we need to have this machine back online as soon as possible. Is there a way to fix this without updating to qemu-kvm-ev-2.12.0? At least back to the state before the deletion would be fantastic. > > Our vdsm.log: > 2019-09-25 14:01:41,283+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/0 running <Task <JsonRpcTask {'params': {u'topVolUUID': u'8a9f190f-2725-4535-a69d-c74e4e57d372', u'vmID': u'1422899a-2151-4d5d-9d66-e74f19084542', u'drive': {u'imageID': u'eb2bce92-e758-4bea-93fa-02a56574b932', u'volumeID': u'8a9f190f-2725-4535-a69d-c74e4e57d372', u'domainID': u'022f39ee-eeb8-4b51-9549-9d7e3c88d4a8', u'poolID': u'00000001-0001-0001-0001-000000000307'}, u'bandwidth': u'0', u'jobUUID': u'c3fdc4a9-9d6d-424a-b9df-b96be5622e0a', u'baseVolUUID': u'3e15121b-0795-4056-bafe-448068c9ec71'}, 'jsonrpc': '2.0', 'method': u'VM.merge', 'id': u'9cd540b7-a32f-4f95-9fe2-9ce70d5b6478'} at 0x7f674fec5710> timeout=60, duration=6420 at 0x7f674fec58d0> task#=1896299 at 0x7f674c06ae90>, traceback: > File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap > self.__bootstrap_inner() > File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner > self.run() > File: "/usr/lib64/python2.7/threading.py", line 765, in run > self.__target(*self.__args, **self.__kwargs) > File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run > ret = func(*args, **kwargs) > File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run > self._execute_task() > File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task > task() > File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__ > self._callable() > File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__ > self._handler(self._ctx, self._req) > File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest > response = self._handle_request(req, ctx) > File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request > res = method(**params) > File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 197, in _dynamicMethod > result = fn(*methodArgs) > File: "<string>", line 2, in merge > File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method > ret = func(*args, **kwargs) > File: "<string>", line 2, in merge > File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method > ret = func(*args, **kwargs) > File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 739, in merge > drive, baseVolUUID, topVolUUID, bandwidth, jobUUID) > File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6041, in merge > self.updateVmJobs() > File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5818, in updateVmJobs > self._vmJobs = self.queryBlockJobs() > File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5832, in queryBlockJobs > with self._jobsLock: > File: "/usr/lib/python2.7/site-packages/pthreading.py", line 60, in __enter__ > self.acquire() > File: "/usr/lib/python2.7/site-packages/pthreading.py", line 68, in acquire > rc = self.lock() if blocking else self.trylock() > File: "/usr/lib/python2.7/site-packages/pthread.py", line 96, in lock > return _libpthread.pthread_mutex_lock(self._mutex) (executor:363) > > > Kind regards > Simon > _______________________________________________ > Users mailing list -- users(a)ovirt.org > To unsubscribe send an email to users-leave(a)ovirt.org > Privacy Statement: https://www.ovirt.org/site/privacy-policy/ > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/LW7BNZBO5AA...

5 years, 9 months

1
0
0 / 0

Change hostname of physical hosts under an oVirt and Gluster combination

by TomK

Hey All, Would like to change the hostname of hosts within a gluster volume that's been added to oVirt. I also need to readd the bricks to the gluster volume under the new hostname. What's the recommended approach to do so when all this is added to oVirt? -- Thx, TK.

5 years, 9 months

2
7
0 / 0

Managed Block Storage: ceph detach_volume failing after migration

by Dan Poltawski

On ovirt 4.3.5 we are seeing various problems related to the rbd device staying mapped after a guest has been live migrated. This causes problems migrating the guest back, as well as rebooting the guest when it starts back up on the original host. The error returned is ‘nrbd: unmap failed: (16) Device or resource busy’. I’ve pasted the full vdsm log below. As far as I can tell this isn’t happening 100% of the time, and seems to be more prevalent on busy guests. (Not sure if I should create a bug for this, so thought I’d start here first) Thanks, Dan Sep 24 19:26:18 mario vdsm[5485]: ERROR FINISH detach_volume error=Managed Volume Helper failed.: ('Error executing helper: Command [\'/usr/libexec/vdsm/managedvolume-helper\', \'detach\'] failed with rc=1 out=\'\' err=\'oslo.privsep.daemon: Running privsep helper: [\\\'sudo\\\', \\\'privsep-helper\\\', \\\'--privsep_context\\\', \\\'os_brick.privileged.default\\\', \\\'--privsep_sock_path\\\', \\\'/tmp/tmptQzb10/privsep.sock\\\']\\noslo.privsep.daemon: Spawned new privsep daemon via rootwrap\\noslo.privsep.daemon: privsep daemon starting\\noslo.privsep.daemon: privsep process running with uid/gid: 0/0\\noslo.privsep.daemon: privsep process running with capabilities (eff/prm/inh): CAP_SYS_ADMIN/CAP_SYS_ADMIN/none\\noslo.privsep.daemon: privsep daemon running as pid 76076\\nTraceback (most recent call last):\\n File "/usr/libexec/vdsm/managedvolume-helper", line 154, in <module>\\n sys.exit(main(sys.argv[1:]))\\n File "/usr/libexec/vdsm/managedvolume-helper", line 77, in main\\n args.command(args)\\n File "/usr/libexec/vdsm/managedvolume-helper", line 149, in detach\\n ignore_errors=False)\\n File "/usr/lib/python2.7/site-packages/vdsm/storage/nos_brick.py", line 121, in disconnect_volume\\n run_as_root=True)\\n File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute\\n result = self.__execute(*args, **kwargs)\\n File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 169, in execute\\n return execute_root(*cmd, **kwargs)\\n File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 241, in _wrap\\n return self.channel.remote_call(name, args, kwargs)\\n File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 203, in remote_call\\n raise exc_type(*result[2])\\noslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.\\nCommand: rbd unmap /dev/rbd/rbd/volume-0e8c1056-45d6-4740-934d-eb07a9f73160 --conf /tmp/brickrbd_LCKezP --id ovirt --mon_host 172.16.10.13:3300 --mon_host 172.16.10.14:3300 --mon_host 172.16.10.12:6789\\nExit code: 16\\nStdout: u\\\'\\\'\\nStderr: u\\\'rbd: sysfs write failed\\\\nrbd: unmap failed: (16) Device or resource busy\\\\n\\\'\\n\'',)#012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method#012 ret = func(*args, **kwargs)#012 File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1766, in detach_volume#012 return managedvolume.detach_volume(vol_id)#012 File "/usr/lib/python2.7/site-packages/vdsm/storage/managedvolume.py", line 67, in wrapper#012 return func(*args, **kwargs)#012 File "/usr/lib/python2.7/site-packages/vdsm/storage/managedvolume.py", line 135, in detach_volume#012 run_helper("detach", vol_info)#012 File "/usr/lib/python2.7/site-packages/vdsm/storage/managedvolume.py", line 179, in run_helper#012 sub_cmd, cmd_input=cmd_input)#012 File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 56, in __call__#012 return callMethod()#012 File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>#012 **kwargs)#012 File "<string>", line 2, in managedvolume_run_helper#012 File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod#012 raise convert_to_error(kind, result)#012ManagedVolumeHelperFailed: Managed Volume Helper failed.: ('Error executing helper: Command [\'/usr/libexec/vdsm/managedvolume-helper\', \'detach\'] failed with rc=1 out=\'\' err=\'oslo.privsep.daemon: Running privsep helper: [\\\'sudo\\\', \\\'privsep-helper\\\', \\\'--privsep_context\\\', \\\'os_brick.privileged.default\\\', \\\'--privsep_sock_path\\\', \\\'/tmp/tmptQzb10/privsep.sock\\\']\\noslo.privsep.daemon: Spawned new privsep daemon via rootwrap\\noslo.privsep.daemon: privsep daemon starting\\noslo.privsep.daemon: privsep process running with uid/gid: 0/0\\noslo.privsep.daemon: privsep process running with capabilities (eff/prm/inh): CAP_SYS_ADMIN/CAP_SYS_ADMIN/none\\noslo.privsep.daemon: privsep daemon running as pid 76076\\nTraceback (most recent call last):\\n File "/usr/libexec/vdsm/managedvolume-helper", line 154, in <module>\\n sys.exit(main(sys.argv[1:]))\\n File "/usr/libexec/vdsm/managedvolume-helper", line 77, in main\\n args.command(args)\\n File "/usr/libexec/vdsm/managedvolume-helper", line 149, in detach\\n ignore_errors=False)\\n File "/usr/lib/python2.7/site-packages/vdsm/storage/nos_brick.py", line 121, in disconnect_volume\\n run_as_root=True)\\n File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute\\n result = self.__execute(*args, **kwargs)\\n File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 169, in execute\\n return execute_root(*cmd, **kwargs)\\n File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 241, in _wrap\\n return self.channel.remote_call(name, args, kwargs)\\n File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 203, in remote_call\\n raise exc_type(*result[2])\\noslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.\\nCommand: rbd unmap /dev/rbd/rbd/volume-0e8c1056-45d6-4740-934d-eb07a9f73160 --conf /tmp/brickrbd_LCKezP --id ovirt --mon_host 172.16.10.13:3300 --mon_host 172.16.10.14:3300 --mon_host 172.16.10.12:6789\\nExit code: 16\\nStdout: u\\\'\\\'\\nStderr: u\\\'rbd: sysfs write failed\\\\nrbd: unmap failed: (16) Device or resource busy\\\\n\\\'\\n\'',) ________________________________ The Networking People (TNP) Limited. Registered office: Network House, Caton Rd, Lancaster, LA1 3PE. Registered in England & Wales with company number: 07667393 This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.

5 years, 9 months

4
4
0 / 0

Urgent help needed / Snapshot deletion failure

by smirta＠gmx.net

Dear all we are desperate We have tried to delete a Snapshot and it hangs at merging the snapshots. We've found out that this is a known bug because the merge process is being called with wrong parameters according to the bug report https://bugzilla.redhat.com/show_bug.cgi?id=1601212. The Snapshot's Disks are in illegal state and the VM is locked. Since it is a very important VM for our non-profit organization, we need to have this machine back online as soon as possible. Is there a way to fix this without updating to qemu-kvm-ev-2.12.0? At least back to the state before the deletion would be fantastic. Our vdsm.log: 2019-09-25 14:01:41,283+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/0 running <Task <JsonRpcTask {'params': {u'topVolUUID': u'8a9f190f-2725-4535-a69d-c74e4e57d372', u'vmID': u'1422899a-2151-4d5d-9d66-e74f19084542', u'drive': {u'imageID': u'eb2bce92-e758-4bea-93fa-02a56574b932', u'volumeID': u'8a9f190f-2725-4535-a69d-c74e4e57d372', u'domainID': u'022f39ee-eeb8-4b51-9549-9d7e3c88d4a8', u'poolID': u'00000001-0001-0001-0001-000000000307'}, u'bandwidth': u'0', u'jobUUID': u'c3fdc4a9-9d6d-424a-b9df-b96be5622e0a', u'baseVolUUID': u'3e15121b-0795-4056-bafe-448068c9ec71'}, 'jsonrpc': '2.0', 'method': u'VM.merge', 'id': u'9cd540b7-a32f-4f95-9fe2-9ce70d5b6478'} at 0x7f674fec5710> timeout=60, duration=6420 at 0x7f674fec58d0> task#=1896299 at 0x7f674c06ae90>, traceback: File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap self.__bootstrap_inner() File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner self.run() File: "/usr/lib64/python2.7/threading.py", line 765, in run self.__target(*self.__args, **self.__kwargs) File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run ret = func(*args, **kwargs) File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run self._execute_task() File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task task() File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__ self._callable() File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__ self._handler(self._ctx, self._req) File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest response = self._handle_request(req, ctx) File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request res = method(**params) File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 197, in _dynamicMethod result = fn(*methodArgs) File: "<string>", line 2, in merge File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method ret = func(*args, **kwargs) File: "<string>", line 2, in merge File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method ret = func(*args, **kwargs) File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 739, in merge drive, baseVolUUID, topVolUUID, bandwidth, jobUUID) File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6041, in merge self.updateVmJobs() File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5818, in updateVmJobs self._vmJobs = self.queryBlockJobs() File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5832, in queryBlockJobs with self._jobsLock: File: "/usr/lib/python2.7/site-packages/pthreading.py", line 60, in __enter__ self.acquire() File: "/usr/lib/python2.7/site-packages/pthreading.py", line 68, in acquire rc = self.lock() if blocking else self.trylock() File: "/usr/lib/python2.7/site-packages/pthread.py", line 96, in lock return _libpthread.pthread_mutex_lock(self._mutex) (executor:363) Kind regards Simon

5 years, 9 months

1
1
0 / 0

How to delete obsolete Data Centers with no hosts, but with domains inside

by Claudio Soprano

Hi to all, We are using ovirt to manage 6 Data Centers, 3 of them are old Data Centers with no hosts inside, but with domains, storage and VMs not running. We left them because we wanted to have some backups in case of failure of the new Data Centers created. Time pass and now we would like to remove these Data Centers, but we got no way for now to remove them. If we try to remove the Storage Domains (using remove o destroy) we get "Error while executing action: Cannot destroy the master Storage Domain from the Data Center without another active Storage Domain to take its place. -Either activate another Storage Domain in the Data Center, or remove the Data Center. -If you have problems with the master Data Domain, consider following the recovery process described in the documentation, or contact your system administrator." if we try to remove the Data Center directly we get "Error while executing action: Cannot remove Data Center. There is no active Host in the Data Center." How can we solve the problem ? It can be done via ovirt-shell or using some script or via ovirt management interface ? Thanks in advance Claudio

5 years, 9 months

2
6
0 / 0

Add External Provider - OpenStack Glance - Test Failed - "Failed to communicate with the External Provider"

by Pravin Mohandass

We have installed Ovirt Manager 4.3 with couple of KVM compute host added into cluster. When we add OpenStack Glance as an External Provider, failing with following error message - "Failed to communicate with the External Provider" We are able to add the external provider for Openstack Neutron and Openstack Cinder and import the storage and network in to ovirt manager and use it for vm. We are getting following error message on engine.log file "Failed with error PROVIDER_FAILURE and code 5050". But we are able to fetch the images using postman tool. Requesting you to provide some insights on adding the Glance provider.

5 years, 9 months

1
0
0 / 0

[ANN] oVirt 4.3.6 Sixth Release Candidate is now available for testing

by Sandro Bonazzola

The oVirt Project is pleased to announce the availability of the oVirt 4.3.6 Sixth Release Candidate for testing, as of September 25th, 2019. This update is a release candidate of the sixth in a series of stabilization updates to the 4.3 series. This is pre-release software. This pre-release should not to be used in production. This release is available now on x86_64 architecture for: * Red Hat Enterprise Linux 7.7 or later (but <8) * CentOS Linux (or similar) 7.7 or later (but <8) This release supports Hypervisor Hosts on x86_64 and ppc64le architectures for: * Red Hat Enterprise Linux 7.7 or later (but <8) * CentOS Linux (or similar) 7.7 or later (but <8) * oVirt Node 4.3 (available for x86_64 only) has been built consuming CentOS 7.7 Release See the release notes [1] for known issues, new features and bugs fixed. Notes: - oVirt Appliance is already available - oVirt Node is already available Additional Resources: * Read more about the oVirt 4.3.6 release highlights: http://www.ovirt.org/release/4.3.6/ * Get more oVirt Project updates on Twitter: https://twitter.com/ovirt * Check out the latest project news on the oVirt blog: http://www.ovirt.org/blog/ [1] http://www.ovirt.org/release/4.3.6/ [2] http://resources.ovirt.org/pub/ovirt-4.3-pre/iso/ -- Sandro Bonazzola MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV Red Hat EMEA <https://www.redhat.com/> sbonazzo(a)redhat.com <https://www.redhat.com/>*Red Hat respects your work life balance. Therefore there is no need to answer this email out of your office hours.*

5 years, 9 months

1
0
0 / 0

Messed up 4.2.3.1 installation - SSL handshake ERROR

by souvaliotimaria＠mail.com

Hello, everyone! So, I have an experimental installation of ovirt 4.2.3.1 with 3 nodes and glustered. Recently I deployed a new installation with ovirt 4.3.5.2, 3nodes and glustered storage here as well. The thing is, in my enthousiasm I thought "hey! what if I can import the experimental nodes as hosts in the new installation in a new cluster and see what happens? Will the 4.3.5.2 engine see them? Probably yeah. But will it see the VMs I have there?" And so I imported the experimental nodes. Without detaching them from their hosted engine. I could see the only VM that was active at the moment and not one of the suspended ones and of course I could not see the 4.2.3.1 HE VM. I have removed the hosts from the new installation and I have tried reconnecting the old engine and its nodes. Passwordless ssh works just fine, but the problem persists. hosted-engine --vm-status reports stale-data on node 2 and node 3 The thing is I know I messed the experimental installation (and I blame only my curiosity), SSL handshake is no longer feasable and I can't remove the hosts from the initial Cluster to Import them again. Basically everything is either in the process of activating without ever being able to do so or down or Non responsive. I would like to find a way around this, as I have seen in other posts in the ovirt forum that the SSL handshake error appears in some other cases and I would like to have a know-how if an occasion like this occurs in the future in production. Is it possible to re-deploy the engine on the nodes and not lose the glustered space or the existing VMs? Can the HE be destroyed and then deployed from scratch? What about the glustered space and the VMs' space? Will the VMs just take up space without being able to neither bring them up nor destroy them? I know I'm asking a lot and it was my fault to begin with but I am really curious if we can see this through. Thanks in advance

5 years, 9 months

1
0
0 / 0

oVirt node - post-upgrade tasks

by dan.munteanu＠mdc-berlin.de

Dear oVirt community, recently I've started to use oVirt as a replacement for KVM + VirtManager and I've decided to use the oVirt Node installed on Dell EMC servers with self-hosted engine. So far, everything works fine, except that, each time I'm updating the nodes, I must reinstall Dell OMSA needed by the monitoring system (via SNMP). Is there any way to automatize the OMSA installation as a post-upgrade task/hook? Thank you Dan

5 years, 9 months

1
0
0 / 0

Got a RequestError: status: 409 reason: Conflict

by smidhunraj＠gmail.com

I'm trying to clone the snapshot into a new vm. The tool I am using is ovirtBAckup from the github wefixit-AT The link is here https://github.com/wefixit-AT/oVirtBackup this piece of code snippet throws me error. if not config.get_dry_run(): api.vms.add(params.VM(name=vm_clone_name, memory=vm.get_memory(), cluster=api.clusters.get(config.get_cluster_name()), snapshots=snapshots_param)) VMTools.wait_for_vm_operation(api, config, "Cloning", vm_from_list) print 'hellooooo' logger.info("Cloning finished") The above lines are from the 325 line number of backup.py of the github. I am getting error as !!! Got a RequestError: status: 409 reason: Conflict detail: Cannot add VM. The VM is performing an operation on a Snapshot. Please wait for the operation to finish, and try again. How can i further debug the code to know what is happening wrong in my program.I am new to python please help me.

5 years, 9 months

2
1
0 / 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Users September 2019