VM hanging at sustained high throughput
by David Johnson
Hi ovirt gurus,
This is an interesting issue, one I never expected to have.
When I push high volumes of writes to my NAS, I will cause VM's to go into
a paused state. I'm looking at this from a number of angles, including
upgrades on the NAS appliance.
I can reproduce this problem at will running a centos 7.9 VM on Ovirt 4.5.
*Questions:*
1. Is my analysis of the failure (below) reasonable/correct?
2. What am I looking for to validate this?
3. Is there a configuration that I can set to make it a little more robust
while I acquire the hardware to improve the NAS?
*Reproduction:*
Standard test of file write speed:
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=4096
oflag=direct
4096+0 records in
4096+0 records out
2147483648 bytes (2.1 GB) copied, 1.68431 s, 1.3 GB/s
Give it more data
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=12228
oflag=direct
12228+0 records in
12228+0 records out
6410993664 bytes (6.4 GB) copied, 7.22078 s, 888 MB/s
The odds are about 50/50 that 6 GB will kill the VM, but 100% when I hit 8
GB.
*Analysis:*
What I think appears to be happening is that the intent cache on the NAS is
on an SSD, and my VM's are pushing data about three times as fast as the
SSD can handle. When the SSD gets queued up beyond a certain point, the NAS
(which places reliability over speed) says "Whoah Nellie!", and the VM
chokes.
*David Johnson*
5 months, 1 week
Deploy ovirt-csi in the kubernetes cluster
by ssarang520@gmail.com
Hi,
I want to deploy ovirt-csi in the kubernetes cluster. But the guide only has how to deploy to openshift.
How can I deploy the ovirt-csi in the kubernetes cluster? Is there any way to do that?
6 months, 1 week
Host needs to be reinstalled after configuring power management
by Andrew DeMaria
Hi,
I am running ovirt 4.3 and have found the following action item immediately
after configuring power management for a host:
Host needs to be reinstalled as important configuration changes were
applied on it.
The thing is - I've just freshly installed this host and it seems strange
that I need to reinstall it.
Is there a better way to install a host and configure power management
without having to reinstall it after?
Thanks,
Andrew
6 months, 2 weeks
Cinderlib RBD ceph template issues
by Sketch
This is on oVirt 4.4.8, engine on CS8, hosts on C8, cluster and DC are
both set to 4.6.
With a newly configured cinderlib/ceph RBD setup. I can create new VM
images, and copy existing VM images, but I can't copy existing template
images to RBD. When I do, I try, I get this error in cinderlib.log (see
below), which sounds like the disk already exists there, but it definitely
does not. This leaves me unable to create new VMs on RBD, only migrate
existing VM disks.
2021-09-01 04:31:05,881 - cinder.volume.driver - INFO - Driver hasn't implemented _init_vendor_properties()
2021-09-01 04:31:05,882 - cinderlib-client - INFO - Creating volume '0e8b9aca-1eb1-4837-ac9e-cb3d8f4c1676', with size '500' GB [5c5d0a6b]
2021-09-01 04:31:05,943 - cinderlib-client - ERROR - Failure occurred when trying to run command 'create_volume': Entity '<class 'cinder.db.sqlalchemy.models.Volume'>' has no property 'glance_metadata' [5c5d0a6b]
2021-09-01 04:31:05,944 - cinder - CRITICAL - Unhandled error
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/cinderlib/objects.py", line 455, in create
self._raise_with_resource()
File "/usr/lib/python3.6/site-packages/cinderlib/objects.py", line 222, in _raise_with_resource
six.reraise(*exc_info)
File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/usr/lib/python3.6/site-packages/cinderlib/objects.py", line 448, in create
model_update = self.backend.driver.create_volume(self._ovo)
File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py", line 986, in create_volume
features=client.features)
File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit
result = proxy_call(self._autowrap, f, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call
rv = execute(f, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute
six.reraise(c, e, tb)
File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker
rv = meth(*args, **kwargs)
File "rbd.pyx", line 629, in rbd.RBD.create
rbd.ImageExists: [errno 17] RBD image already exists (error creating image)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/base.py", line 399, in _entity_descriptor
return getattr(entity, key)
AttributeError: type object 'Volume' has no attribute 'glance_metadata'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./cinderlib-client.py", line 170, in main
args.command(args)
File "./cinderlib-client.py", line 208, in create_volume
backend.create_volume(int(args.size), id=args.volume_id)
File "/usr/lib/python3.6/site-packages/cinderlib/cinderlib.py", line 175, in create_volume
vol.create()
File "/usr/lib/python3.6/site-packages/cinderlib/objects.py", line 457, in create
self.save()
File "/usr/lib/python3.6/site-packages/cinderlib/objects.py", line 628, in save
self.persistence.set_volume(self)
File "/usr/lib/python3.6/site-packages/cinderlib/persistence/dbms.py", line 254, in set_volume
self.db.volume_update(objects.CONTEXT, volume.id, changed)
File "/usr/lib/python3.6/site-packages/cinder/db/sqlalchemy/api.py", line 236, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/cinder/db/sqlalchemy/api.py", line 184, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/cinder/db/sqlalchemy/api.py", line 2570, in volume_update
result = query.filter_by(id=volume_id).update(values)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3818, in update
update_op.exec_()
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1670, in exec_
self._do_pre_synchronize()
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1743, in _do_pre_synchronize
self._additional_evaluators(evaluator_compiler)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1912, in _additional_evaluators
values = self._resolved_values_keys_as_propnames
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1831, in _resolved_values_keys_as_propnames
for k, v in self._resolved_values:
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1818, in _resolved_values
desc = _entity_descriptor(self.mapper, k)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/base.py", line 402, in _entity_descriptor
"Entity '%s' has no property '%s'" % (description, key)
sqlalchemy.exc.InvalidRequestError: Entity '<class 'cinder.db.sqlalchemy.models.Volume'>' has no property 'glance_metadata'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./cinderlib-client.py", line 390, in <module>
sys.exit(main(sys.argv[1:]))
File "./cinderlib-client.py", line 176, in main
sys.stderr.write(traceback.format_exc(e))
File "/usr/lib64/python3.6/traceback.py", line 167, in format_exc
return "".join(format_exception(*sys.exc_info(), limit=limit, chain=chain))
File "/usr/lib64/python3.6/traceback.py", line 121, in format_exception
type(value), value, tb, limit=limit).format(chain=chain))
File "/usr/lib64/python3.6/traceback.py", line 498, in __init__
_seen=_seen)
File "/usr/lib64/python3.6/traceback.py", line 498, in __init__
_seen=_seen)
File "/usr/lib64/python3.6/traceback.py", line 509, in __init__
capture_locals=capture_locals)
File "/usr/lib64/python3.6/traceback.py", line 338, in extract
if limit >= 0:
TypeError: '>=' not supported between instances of 'InvalidRequestError' and 'int'
6 months, 2 weeks
Import an exported VM using Ansible
by paolo@airaldi.it
Hello everybody!
I'm trying to automate a copy of a VM from one Datacenter to another using an Ansible.playbook.
I'm able to:
- Create a snapshot of the source VM
- create a clone from the snapshot
- remove the snapshot
- attach an Export Domain
- export the clone to the Export Domain
- remove the clone
- detach the Export domain from the source Datacenter and attach to the destination.
Unfortunately I cannot find a module to:
- import the VM from the Export Domain
- delete the VM image from the Export Domain.
Any hint on how to do that?
Thanks in advance. Cheers.
Paolo
PS: if someone is interested I can share the playbook.
6 months, 3 weeks
did 4.3.9 reset bug https://bugzilla.redhat.com/show_bug.cgi?id=1590266
by kelley bryan
I am experiencing the error message in the ovirt-hosted-engine-setup-ansible-create_target_vm log
{2020-05-06 14:15:30,024-0500 ERROR ansible failed {'status': 'FAILED', 'ansible_type': 'task', 'ansible_task': u"Fail if Engine IP is different from engine's he_fqdn resolved IP", 'ansible_result': u'type: <type \'dict\'>\nstr: {\'msg\': u"Engine VM IP address is while the engine\'s he_fqdn ovirt1-engine.kelleykars.org resolves to 192.168.122.2. If you are using DHCP, check your DHCP reservation configuration", \'changed\': False, \'_ansible_no_log\': False}', 'task_duration': 1, 'ansible_host': u'localhost', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml'}}:Q!
The bug 1590266 says it should report the engine VM IP address xxx.xxx.xxx.xxx while the Engines he_fqdn is xxxxxxxxx
I need to see what it thins is wrong as both dig fqdn engine name and dig -x ip return the correct information.
Now this bug looks like it may play but I don't see the failed rediness check in the this log https://access.redhat.com/solutions/4462431
or is it because the vm fails or dies or ???
7 months, 1 week
Lots of storage.MailBox.SpmMailMonitor
by Fabrice Bacchella
My vdsm log files are huge:
-rw-r--r-- 1 vdsm kvm 1.8G Nov 22 11:32 vdsm.log
And this is juste half an hour of logs:
$ head -1 vdsm.log
2018-11-22 11:01:12,132+0100 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] mailbox 2 checksum failed, not clearing mailbox, clearing new mail (data='...lots of data', expected='\xa4\x06\x08\x00') (mailbox:612)
I just upgraded vdsm:
$ rpm -qi vdsm
Name : vdsm
Version : 4.20.43
7 months, 1 week
How to renew vmconsole-proxy* certificates
by capelle@labri.fr
Hi,
Since a few weeks, we are not able to connect to the vmconsole proxy:
$ ssh -t -p 2222 ovirt-vmconsole@ovirt
ovirt-vmconsole@ovirt: Permission denied (publickey).
Last successful login record: Mar 29 11:31:32
First login failure record: Mar 31 17:28:51
We tracked the issue to the following log in /var/log/ovirt-engine/engine.log:
ERROR [org.ovirt.engine.core.services.VMConsoleProxyServlet] (default task-11) [] Error validating ticket: : sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Indeed, certificate /etc/pki/ovirt-engine/certs/vmconsole-proxy-helper.cer and others did expire:
--
# grep 'Not After' /etc/pki/ovirt-engine/certs/vmconsole-proxy-*
/etc/pki/ovirt-engine/certs/vmconsole-proxy-helper.cer: Not After : Mar 31 13:18:44 2021 GMT
/etc/pki/ovirt-engine/certs/vmconsole-proxy-host.cer: Not After : Mar 31 13:18:44 2021 GMT
/etc/pki/ovirt-engine/certs/vmconsole-proxy-user.cer: Not After : Mar 31 13:18:44 2021 GMT
--
But we did not manage to found how to renew them. Any advice ?
--
Benoît
9 months
Snapshot and disk size allocation
by jorgevisentini@gmail.com
Hello everyone.
I would like to know how disk size and snapshot allocation works, because every time I create a new snapshot, it increases 1 GB in the VM's disk size, and when I remove the snap, that space is not returned to Domain Storage.
I'm using the oVirt 4.3.10
How do I reprovision the VM disk?
Thank you all.
9 months, 3 weeks
fresh hyperconverged Gluster setup failed in ovirt 4.4.8
by dhanaraj.ramesh@yahoo.com
Hi Team
I'm trying to setup 3 node Gluster + ovirt setup with latest stable 4.4.8 version but while deploying the gluster from cokpit getting below error what could be the reason
TASK [gluster.infra/roles/backend_setup : Set Gluster specific SeLinux context on the bricks] ***
failed: [beclovkvma03.bec. lab ] (item={'path': '/gluster_bricks/engine', 'lvname': 'gluster_lv_engine', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_engine", "path": "/gluster_bricks/engine", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma01.bec. lab ] (item={'path': '/gluster_bricks/engine', 'lvname': 'gluster_lv_engine', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_engine", "path": "/gluster_bricks/engine", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma02.bec. lab ] (item={'path': '/gluster_bricks/engine', 'lvname': 'gluster_lv_engine', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_engine", "path": "/gluster_bricks/engine", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma03.bec. lab ] (item={'path': '/gluster_bricks/data', 'lvname': 'gluster_lv_data', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_data", "path": "/gluster_bricks/data", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma01.bec. lab ] (item={'path': '/gluster_bricks/data', 'lvname': 'gluster_lv_data', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_data", "path": "/gluster_bricks/data", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma02.bec. lab ] (item={'path': '/gluster_bricks/data', 'lvname': 'gluster_lv_data', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_data", "path": "/gluster_bricks/data", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma03.bec. lab ] (item={'path': '/gluster_bricks/vmstore', 'lvname': 'gluster_lv_vmstore', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_vmstore", "path": "/gluster_bricks/vmstore", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma01.bec. lab ] (item={'path': '/gluster_bricks/vmstore', 'lvname': 'gluster_lv_vmstore', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_vmstore", "path": "/gluster_bricks/vmstore", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
failed: [beclovkvma02.bec. lab ] (item={'path': '/gluster_bricks/vmstore', 'lvname': 'gluster_lv_vmstore', 'vgname': 'gluster_vg_sde'}) => {"ansible_loop_var": "item", "changed": false, "item": {"lvname": "gluster_lv_vmstore", "path": "/gluster_bricks/vmstore", "vgname": "gluster_vg_sde"}, "msg": "ValueError: Type glusterd_brick_t is invalid, must be a file or device type\n"}
NO MORE HOSTS LEFT *************************************************************
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
beclovkvma01.bec. lab : ok=53 changed=14 unreachable=0 failed=1 skipped=116 rescued=0 ignored=1
beclovkvma02.bec. lab : ok=52 changed=13 unreachable=0 failed=1 skipped=116 rescued=0 ignored=1
beclovkvma03.bec. lab : ok=52 changed=13 unreachable=0 failed=1 skipped=116 rescued=0 ignored=1
10 months