Failed to delete snapshot (again)
by Giulio Casella
Hi folks,
For some months I have had an issue with snapshot removals (I use the Storware
vProtect backup system, which relies heavily on snapshots).
After some time spent on a Bugzilla report
(https://bugzilla.redhat.com/show_bug.cgi?id=1948599) we discovered that
my issue is not caused by that bug :-(
So they pointed me here again.
Briefly: sometimes snapshot removal fails, leaving the snapshot in an illegal
state. Trying to remove it again (via the oVirt UI) keeps failing and does not
resolve the problem. The only way to get back to a consistent state is to live
migrate the affected disk to another storage domain; after the disk is moved,
the snapshot is no longer marked illegal and I can remove it. As you can
imagine, this is cumbersome, especially for large disks.
In my logs I can find:
2022-08-29 09:17:11,890+02 ERROR
[org.ovirt.engine.core.bll.MergeStatusCommand]
(EE-ManagedExecutorService-commandCoordinator-Thread-1)
[0eced56f-689d-422b-b15c-20b824377b08] Failed to live merge. Top volume
f8f84b1c-53ab-4c99-a01d-743ed3d7859b is still in qemu chain
[0ea89fbc-d39a-48ff-aa2b-0381d79d7714,
55bb387f-01a6-41b6-b585-4bcaf2ea5e32, f8f84b1c-53ab-4c99-a01d-743ed3d7859b]
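When the same failure recurs across many snapshots, it can help to pull the top volume and the reported chain out of the engine log programmatically. The following is a minimal sketch, assuming the wrapped log lines have first been rejoined with single spaces; the names `parse_merge_failure`, `PATTERN` and `LOG_LINE` are hypothetical and not part of oVirt.

```python
import re

# Sample copied from the MergeStatusCommand error above, with the wrapped
# lines rejoined by single spaces (an assumption about preprocessing).
LOG_LINE = (
    "Failed to live merge. Top volume "
    "f8f84b1c-53ab-4c99-a01d-743ed3d7859b is still in qemu chain "
    "[0ea89fbc-d39a-48ff-aa2b-0381d79d7714, "
    "55bb387f-01a6-41b6-b585-4bcaf2ea5e32, "
    "f8f84b1c-53ab-4c99-a01d-743ed3d7859b]"
)

UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"
PATTERN = re.compile(
    r"Top volume (" + UUID + r") is still in qemu chain \[([^\]]*)\]"
)


def parse_merge_failure(line):
    """Return (top_volume, chain) from a 'still in qemu chain' message, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    top_volume = match.group(1)
    chain = [item.strip() for item in match.group(2).split(",")]
    return top_volume, chain


result = parse_merge_failure(LOG_LINE)
if result is not None:
    top, chain = result
    print("top volume:", top)
    print("volumes in chain:", len(chain))
    print("top volume still in chain:", top in chain)
```

This only extracts what the log already states; it does not explain why the merge left the top volume in the chain.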
My setup is ovirt-engine-4.5.2.4-1.el8.noarch, with hypervisors based on
oVirt Node 4.5.2 (vdsm-4.50.2.2-1.el8).
Thank you in advance.
Regards,
gc
2 years, 4 months
Ovirt 4.4.7, can't renew certificate of ovirt engine (certificates expired)
by vk@itiviti.com
Hi Team,
I'm looking for your help since I didn't find any clear documentation. Is there clear documentation somewhere on the oVirt website about how to renew the engine certificates located in /etc/pki/ovirt-engine/certs/?
Our engine GUI is not working and shows the error message "PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed".
After checking, all the certificates in /etc/pki/ovirt-engine/certs/ have expired.
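One way to confirm expiry is to run `openssl x509 -enddate -noout -in <certificate>` on each file and compare the printed `notAfter=` value with the current time. The sketch below does only the comparison step, using Python's standard `ssl.cert_time_to_seconds`; the function names and the sample date are hypothetical and not taken from the system described here.

```python
import ssl
import time


def not_after_to_epoch(enddate_line):
    """Convert a line such as 'notAfter=Jan  5 09:34:43 2018 GMT' to epoch seconds."""
    prefix = "notAfter="
    if not enddate_line.startswith(prefix):
        raise ValueError("expected a line starting with 'notAfter='")
    return ssl.cert_time_to_seconds(enddate_line[len(prefix):].strip())


def is_expired(enddate_line, now=None):
    """Return True if the certificate's notAfter time is earlier than 'now'."""
    if now is None:
        now = time.time()
    return not_after_to_epoch(enddate_line) < now


# Hypothetical sample value, for illustration only.
SAMPLE = "notAfter=Jan  5 09:34:43 2018 GMT"
print("expired:", is_expired(SAMPLE))
```

This checks validity dates only; it does not renew anything.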
I did not find clear documentation on the oVirt website, or even on the Red Hat website (it always covered the hosts, not the engine).
Anyway, I've read that the renewal can be done via "engine-setup --offline", but when I try it, it produces this error:
--== PKI CONFIGURATION ==--
[ ERROR ] Failed to execute stage 'Environment customization': Unable to load certificate. See https://cryptography.io/en/latest/faq/#why-can-t-i-import-my-pem-file for more details.
and in log file:
File "/usr/lib64/python3.6/site-packages/cryptography/hazmat/backends/openssl/backend.py", line 1371, in load_pem_x509_certificate
"Unable to load certificate. See https://cryptography.io/en/la"
ValueError: Unable to load certificate. See https://cryptography.io/en/latest/faq/#why-can-t-i-import-my-pem-file for more details.
2022-08-29 19:16:29,502+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Environment customization': Unable to load certificate. See https://cryptography.io/en/latest/faq/#why-can-t-i-import-my-pem-file for more details.
I've also tried the manual procedure (using /usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh) mentioned in https://users.ovirt.narkive.com/4ugjgicE/ovirt-regenerating-new-ssl-certi... (message from Alon Bar-Lev), but the 4th command always says I entered a wrong password, although the password is correct.
We are stuck here and cannot use our oVirt cluster, so this is a serious blocker for us.
Thanks a lot in advance.
2 years, 4 months
unable to bring up gluster bricks after 4.5 upgrade
by Jayme
Hello All,
I've been struggling with a few issues upgrading my 3-node HCI cluster from
4.4 to 4.5.
At present the self-hosted engine VM is properly running oVirt 4.5 on
CentOS Stream 8.
I set the first host node to maintenance and installed the new node-ng image. I
ran into an issue with rescue mode on boot, which appears to have been related
to an LVM devices bug. I was able to work past that and get the node to boot.
The node running the 4.5.2 image boots properly, and the gluster/LVM mounts etc.
all look good. I am able to activate the host and run VMs on it.
However, the oVirt CLI shows that all bricks on the host are DOWN.
I was unable to get the bricks back up even after doing a force start of
the volumes.
Here is the glusterd log from the host in question when I try a force start
on the engine volume (other volumes are similar):
==> glusterd.log <==
The message "I [MSGID: 106568] [glusterd-svc-mgmt.c:266:glusterd_svc_stop]
0-management: bitd service is stopped" repeated 2 times between [2022-08-29
18:09:56.027147 +0000] and [2022-08-29 18:10:34.694144 +0000]
[2022-08-29 18:10:34.695348 +0000] I [MSGID: 106618]
[glusterd-svc-helper.c:909:glusterd_attach_svc] 0-glusterd: adding svc
glustershd (volume=engine) to existing process with pid 2473
[2022-08-29 18:10:34.695669 +0000] I [MSGID: 106131]
[glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already
stopped
[2022-08-29 18:10:34.695691 +0000] I [MSGID: 106568]
[glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: scrub service is
stopped
[2022-08-29 18:10:34.695832 +0000] I [MSGID: 106617]
[glusterd-svc-helper.c:698:glusterd_svc_attach_cbk] 0-management: svc
glustershd of volume engine attached successfully to pid 2473
[2022-08-29 18:10:34.703718 +0000] E [MSGID: 106115]
[glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit
failed on gluster2.xxxxx. Please check log file for details.
[2022-08-29 18:10:34.703774 +0000] E [MSGID: 106115]
[glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit
failed on gluster1.xxxxx. Please check log file for details.
[2022-08-29 18:10:34.703797 +0000] E [MSGID: 106664]
[glusterd-mgmt.c:1969:glusterd_mgmt_v3_post_commit] 0-management: Post
commit failed on peers
[2022-08-29 18:10:34.703800 +0000] E [MSGID: 106664]
[glusterd-mgmt.c:2664:glusterd_mgmt_v3_initiate_all_phases] 0-management:
Post commit Op Failed
If I run the start command manually on the host CLI:
gluster volume start engine force
volume start: engine: failed: Post commit failed on gluster1.xxxx. Please
check log file for details.
Post commit failed on gluster2.xxxx. Please check log file for details.
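To see at a glance which peers rejected the post-commit phase, the error-level entries can be filtered out of glusterd.log. This is a rough sketch, assuming each entry has been rejoined onto one line and follows the `[timestamp] LEVEL [MSGID: n] [source] message` layout seen in the excerpt above; `error_entries` and `failed_peers` are hypothetical helper names.

```python
import re

# Excerpt of the glusterd.log entries quoted above, with wrapped lines rejoined.
LOG = """\
[2022-08-29 18:10:34.695832 +0000] I [MSGID: 106617] [glusterd-svc-helper.c:698:glusterd_svc_attach_cbk] 0-management: svc glustershd of volume engine attached successfully to pid 2473
[2022-08-29 18:10:34.703718 +0000] E [MSGID: 106115] [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit failed on gluster2.xxxxx. Please check log file for details.
[2022-08-29 18:10:34.703774 +0000] E [MSGID: 106115] [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit failed on gluster1.xxxxx. Please check log file for details.
[2022-08-29 18:10:34.703797 +0000] E [MSGID: 106664] [glusterd-mgmt.c:1969:glusterd_mgmt_v3_post_commit] 0-management: Post commit failed on peers
"""

ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<source>[^\]]+)\] (?P<message>.*)$"
)
PEER = re.compile(r"Post commit failed on (\S+?)\. Please check")


def error_entries(text):
    """Return the parsed fields of every entry logged at level E."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match and match.group("level") == "E":
            entries.append(match.groupdict())
    return entries


def failed_peers(text):
    """Return the sorted, de-duplicated peers named in 'Post commit failed on' messages."""
    return sorted(set(PEER.findall(text)))


for entry in error_entries(LOG):
    print(entry["ts"], entry["msgid"], entry["message"])
print("peers reporting failure:", failed_peers(LOG))
```

The messages point at the peers' own logs, so the glusterd.log on each listed peer is the next place to look.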
I suspect this may be caused by the difference in major GlusterFS versions
between the nodes, but I am unsure. The other nodes are running
ovirt-node-ng-4.4.6.3.
At this point I am afraid to bring down any other node to attempt upgrading
it without the bricks in UP status on the first host. I do not want to lose
quorum and potentially disrupt running VMs.
Any idea why I can't seem to start the volumes on the upgraded host?
Thanks!
2 years, 4 months
(no subject)
by parallax
oVirt version: 4.4.4.7-1.el8
I have several servers in a cluster and I get this error:
Data Center is being initialized, please wait for initialization to
complete.
VDSM command GetStoragePoolInfoVDS failed: PKIX path validation failed:
java.security.cert.CertPathValidatorException: validity check failed
The SPM role is constantly being transferred between the servers and I can't do
anything.
Under Storage Domains, the storage domains are in inactive status, but the
virtual machines are running.
How can I fix it?
2 years, 4 months
Commissioning new host on CentOS 9
by David Johnson
Good evening all,
I am trying to commission an oVirt host on new hardware using CentOS 9.
I followed the pre-install instructions at
https://www.ovirt.org/download/install_on_rhel.html, and eventually figured
out that this line was missing from the preparation scripts:
dnf install rdo-openvswitch-2.15
With that, I was able to continue installation.
After completing the instructions, DNF update reports this:
[root@localhost administrator]# dnf update --nobest
Last metadata expiration check: 0:01:06 ago on Fri 26 Aug 2022 08:20:27 PM
CDT.
Dependencies resolved.
Problem 1: package ovirt-openvswitch-2.15-4.el9.noarch requires
openvswitch2.15, but none of the providers can be installed
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-99.el9s.x86_64
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-51.el9s.x86_64
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-56.el9s.x86_64
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-81.el9s.x86_64
- cannot install the best update candidate for package
ovirt-openvswitch-2.15-4.el9.noarch
- cannot install the best update candidate for package
openvswitch2.15-2.15.0-99.el9s.x86_64
Problem 2: problem with installed package
ovirt-openvswitch-2.15-4.el9.noarch
- package ovirt-openvswitch-2.15-4.el9.noarch requires openvswitch2.15,
but none of the providers can be installed
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-99.el9s.x86_64
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-51.el9s.x86_64
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-56.el9s.x86_64
- package rdo-openvswitch-2:2.17-2.el9s.noarch obsoletes openvswitch2.15
< 2.17 provided by openvswitch2.15-2.15.0-81.el9s.x86_64
- cannot install the best update candidate for package
rdo-openvswitch-1:2.15-2.el9s.noarch
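The repeated "obsoletes openvswitch2.15 < 2.17" lines mean that every available openvswitch2.15 build falls below the version bound declared by rdo-openvswitch 2.17. The sketch below illustrates that comparison in a deliberately simplified form: it compares only the dotted numeric version and ignores epochs, releases and non-numeric segments, so it is not RPM's real comparison algorithm. The helper names are hypothetical.

```python
def numeric_version(version):
    """Split a dotted, purely numeric version such as '2.15.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_below_bound(installed_version, bound):
    """Return True if installed_version sorts before bound (simplified comparison)."""
    return numeric_version(installed_version) < numeric_version(bound)


# Version-release strings of the openvswitch2.15 providers listed by dnf above.
PROVIDERS = [
    "2.15.0-99.el9s",
    "2.15.0-51.el9s",
    "2.15.0-56.el9s",
    "2.15.0-81.el9s",
]

for version_release in PROVIDERS:
    # Keep only the version part before the first '-' (the release is ignored).
    version = version_release.split("-", 1)[0]
    print(version_release, "below 2.17:", is_below_bound(version, "2.17"))
```

Under this simplification every provider is below 2.17, which matches dnf's report that none of them can satisfy ovirt-openvswitch once rdo-openvswitch 2.17 is selected.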
dnf list shows these installed openvswitch packages:
[root@localhost administrator]# dnf list installed |grep openvswitch
centos-release-nfv-openvswitch.noarch    1-4.el9s         @c9s-extras-common
network-scripts-openvswitch2.15.x86_64   2.15.0-99.el9s   @centos-nfv-openvswitch
openvswitch-selinux-extra-policy.noarch  1.0-31.el9s      @centos-nfv-openvswitch
openvswitch2.15.x86_64                   2.15.0-99.el9s   @centos-nfv-openvswitch
ovirt-openvswitch.noarch                 2.15-4.el9       @centos-ovirt45
python3-openvswitch2.15.x86_64           2.15.0-99.el9s   @centos-nfv-openvswitch
rdo-openvswitch.noarch                   1:2.15-2.el9s    @centos-openstack-yoga
*Questions:*
1. Is this a problem?
2. If this is a problem, is there an FAQ sheet or known steps to work
around/fix this?
3. Does there need to be an update to the pre-installation instructions?
*David Johnson*
*Director of Development, Maxis Technology*
844.696.2947 ext 702 (o) | 479.531.3590 (c)
2 years, 4 months
oVirt 4.5 self-hosted network not working
by Paul-Erik Törrönen
Specifically, the installation (ovirt-hosted-engine-setup) completes successfully, and the engine VM is up and running, but it cannot be reached from any machine in the same subnet other than the host machine.
I can ssh (from the host) to the engine VM, and confirm that initiating a connection from the engine VM to any machine other than the host fails. The routing looks correct, and stopping firewalld on the engine VM makes no difference. Stopping firewalld on the host makes no difference either.
This seems like a failure in correctly setting up the host routing for the engine VM.
How do I fix this?
Poltsi
2 years, 4 months
Ovirt 4.5.2 problem with change CD during VM installation
by Facundo Badaracco
Hi everyone,
I have a GlusterFS replica 3 setup. Everything works fine, except when I try
to install a Windows 10 x64 test VM. The Windows ISO loads fine, but when it is
time to insert the virtio-win ISO so that the VM recognizes the disk, the Change
CD feature does not change anything. I have tried several ISOs, but in the
Windows installer the CD never changes; the drive always appears empty.
Any ideas?
Thanks in advance.
2 years, 4 months
oVirt Engine Dashboard and information from Guest Agents
by markeczzz@gmail.com
Hi!
Recently a single-node oVirt setup lost its network connection to shared storage.
After the shared storage was reconnected, everything came back up as expected, but the oVirt Dashboard no longer shows information from the guest agents.
I tried restarting ovirt-engine, after which 2 VMs started sending guest agent information again, but the rest did not. The guest agent is running on those VMs. I also tried restarting those VMs.
The hosted engine is up and healthy, and the ovirt-engine service is running and also looks OK.
I have this error in vdsm.log:
2022-08-25 10:41:39,913+0200 INFO (periodic/1) [vdsm.api] FINISH repoStats return={'dff59a93-4dca-40cd-9c44-c0a74ebaa656': {'code': 0, 'lastCheck': '2.2', 'delay': '0.000840686', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '30c0c9e4-b634-4a05-b921-7a30b69f6dd2': {'code': 0, 'lastCheck': '3.9', 'delay': '0.000830941', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '6cb12d96-3a0a-4902-acb0-deb907ef945c': {'code': 0, 'lastCheck': '5.5', 'delay': '0.000857244', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '559569b9-c33e-4fb0-be6a-51a178a228a9': {'code': 0, 'lastCheck': '5.3', 'delay': '0.000463871', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=internal, task_id=8c638713-f010-4692-ba0b-1ac94052fe7c (api:54)
2022-08-25 10:41:40,778+0200 ERROR (qgapoller/2) [virt.periodic.Operation] <bound method QemuGuestAgentPoller._poller of <vdsm.virt.qemuguestagent.QemuGuestAgentPoller object at 0x7f9efc08e6d8>> operation failed (periodic:204)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/periodic.py", line 202, in __call__
    self._func()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/qemuguestagent.py", line 493, in _poller
    vm_id, self._qga_call_get_vcpus(vm_obj))
  File "/usr/lib/python3.6/site-packages/vdsm/virt/qemuguestagent.py", line 814, in _qga_call_get_vcpus
    if 'online' in vcpus:
TypeError: argument of type 'NoneType' is not iterable
2022-08-25 10:41:41,551+0200 INFO (jsonrpc/2) [api.virt] START getStats() from=::1,50396, vmId=a6e745c4-2e4e-4672-8625-08b787832020 (api:48)
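The TypeError in the traceback above is what Python raises when a membership test such as `'online' in vcpus` runs while `vcpus` is `None`. The sketch below reproduces that failure mode in isolation and shows a defensive check. It is not vdsm's code: the functions and the sample reply are hypothetical, and the assumption that the guest agent query returned no data is mine.

```python
def has_online_info_unguarded(vcpus):
    """Same style of membership test as in the traceback; fails when vcpus is None."""
    return 'online' in vcpus


def has_online_info(vcpus):
    """Guarded variant: treat a missing reply (None) as 'no information'."""
    return vcpus is not None and 'online' in vcpus


# Hypothetical reply shape, for illustration only.
sample_reply = {'online': '0-3'}

print(has_online_info(sample_reply))  # a reply containing the key
print(has_online_info(None))          # a missing reply is handled without raising

try:
    has_online_info_unguarded(None)
except TypeError as exc:
    print("unguarded check raised:", type(exc).__name__)
```

The guard only hides the symptom; why the agent query returns nothing for some VMs after the storage outage remains the open question.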
2 years, 4 months
oVirt 4.5.2 new ISO uploads are not usable
by Christoph Timm
Hi list,
We have uploaded new ISO files to the data domain that we use for
ISO images, and found that we cannot boot from these ISOs.
Older ISOs still work, but none of the new ones do.
I have no real idea what to check or where to look, so any kind
of help would be much appreciated.
Best regards
Christoph
2 years, 4 months
Self-hosted engine deploy failed
by Henry Wong
Hi,
I have been trying to deploy the engine from Cockpit on an oVirt Node 4.5.2 host. The system is freshly installed from the ISO. The deployment failed at step 3, Prepare VM. The last error in /var/log/ovirt-hosted-engine-setup/*log mentions "SSO authentication access_denied : Cannot authenticate user Invalid user credentials." Has anyone seen this before? Thanks.
```
2022-08-18 18:57:20,334-0500 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/he_ansible/trigger_role.yml",
    "ansible_result": {
        "_ansible_no_log": false,
        "attempts": 50,
        "changed": false,
        "exception": "Traceback (most recent call last):\n File \"/tmp/ansible_ovirt_auth_payload_1t3ixb8c/ansible_ovirt_auth_payload.zip/ansible_collections/ovirt/ovirt/plugins/modules/ovirt_auth.py\", line 287, in main\n File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/__init__.py\", line 382, in authenticate\n self._sso_token = self._get_access_token()\n File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/__init__.py\", line 627, in _get_access_token\n sso_error[1]\novirtsdk4.AuthError: Error during SSO authentication access_denied : Cannot authenticate user Invalid user credentials.\n",
        "invocation": {
            "module_args": {
                "ca_file": null,
                "compress": true,
                "headers": null,
                "hostname": null,
                "insecure": true,
                "kerberos": false,
                "ovirt_auth": null,
                "password": null,
                "state": "present",
                "timeout": 0,
                "token": null,
                "url": null,
                "username": null
            }
        },
        "msg": "Error during SSO authentication access_denied : Cannot authenticate user Invalid user credentials."
    },
    "ansible_task": "Obtain SSO token using username/password credentials",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 537
}
```
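Records like the one above are a log prefix followed by a JSON object, so the failing task, the retry count and the message can be extracted with the standard `json` module. This is a minimal sketch, assuming the record has the shape shown; the sample is an abbreviated copy of the log entry, and `summarize_failure` is a hypothetical helper name.

```python
import json

# Abbreviated copy of the log record above (fields omitted for brevity).
RECORD = """2022-08-18 18:57:20,334-0500 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_result": {
        "attempts": 50,
        "changed": false,
        "msg": "Error during SSO authentication access_denied : Cannot authenticate user Invalid user credentials."
    },
    "ansible_task": "Obtain SSO token using username/password credentials",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 537
}"""


def summarize_failure(record):
    """Split off the text before the first '{' and summarize the JSON payload."""
    _prefix, brace, body = record.partition("{")
    if not brace:
        raise ValueError("no JSON object found in record")
    data = json.loads(brace + body)
    result = data.get("ansible_result", {})
    return {
        "task": data.get("ansible_task"),
        "status": data.get("status"),
        "attempts": result.get("attempts"),
        "msg": result.get("msg"),
    }


for key, value in summarize_failure(RECORD).items():
    print(key + ":", value)
```

For this record the summary shows that the "Obtain SSO token" task was retried 50 times before being reported as failed.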
Static hostname: xxxxxxxxx
Icon name: computer-server
Chassis: server
Machine ID: 6090168dcd724b04be97d57b01c2a11c
Boot ID: 5fba59e1914e48a1aad23fc02641bf3c
Operating System: oVirt Node 4.5.2
CPE OS Name: cpe:/o:centos:centos:8
Kernel: Linux 4.18.0-408.el8.x86_64
Architecture: x86-64
2 years, 4 months