After upgrade to 4.5.4: host reboot fails and api-user unable to log in
by Thyen, Niko
Hi all,
I just upgraded a cluster from 4.3.10 to 4.5.4: 3 hosts, hosted-engine,
engine deployment via shell and with restore from backup. The engine update
worked fine, but the host update from 4.4.10 to 4.5.4 failed because the
engine could not initiate a host reboot:
2023-10-10 13:37:27,802+02 WARN
[org.ovirt.engine.core.dal.job.ExecutionMessageDirector]
(EE-ManagedThreadFactory-engine-Thread-2) [7d4d6a58] The message key
'SshHostReboot' is missing from 'bundles/ExecutionMessages'
2023-10-10 13:37:27,823+02 INFO
[org.ovirt.engine.core.bll.SshHostRebootCommand]
(EE-ManagedThreadFactory-engine-Thread-2) [7d4d6a58] Running command:
SshHostRebootCommand internal: true. Entities affected : ID:
b578519d-7f34-4509-8b8b-dbab1b2cf6d2 Type: VDSAction group
MANIPULATE_HOST with role type ADMIN
2023-10-10 13:37:27,827+02 INFO
[org.ovirt.engine.core.bll.SshHostRebootCommand]
(EE-ManagedThreadFactory-engine-Thread-2) [7d4d6a58] Opening SSH reboot
session on host <host fqdn>
2023-10-10 13:37:28,030+02 ERROR
[org.ovirt.engine.core.bll.SshHostRebootCommand]
(EE-ManagedThreadFactory-engine-Thread-2) [7d4d6a58] SSH reboot command
failed on host '<host fqdn>': SSH session closed during connection
'root@<host fqdn>'
Stdout:
Stderr:
2023-10-10 13:37:28,038+02 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-2) [7d4d6a58] EVENT_ID:
SYSTEM_FAILED_SSH_HOST_RESTART(198), A restart using SSH initiated by
the engine to Host <host fqdn> has failed.
This also happens when a host is reinstalled via the engine web interface.
How can I fix this? The only workaround I found is rebooting the host
via SSH and reactivating it via the engine web interface.
Also, I constantly find error messages like these in the engine.log:
2023-10-11 13:54:11,338+02 INFO
[org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default
task-815) [] locking user: api-user due to interval failures
2023-10-11 13:54:14,932+02 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-812) [] EVENT_ID: USER_VDC_LOGIN_FAILED(114), User
api-user@internal connecting from '<engine-ip>' failed to log in :
'Unable to log in because the password has expired. Please change the
password to proceed.'.
2023-10-11 13:54:14,933+02 ERROR
[org.ovirt.engine.core.sso.service.SsoService] (default task-815) []
OAuthException access_denied: Cannot authenticate user
'api-user@internal': Unable to log in because the password has expired.
Please change the password to proceed..
2023-10-11 13:54:14,934+02 ERROR
[org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default
task-814) [] Cannot authenticate using authentication Headers:
access_denied: Cannot authenticate user 'api-user@internal': Unable to
log in because the password has expired. Please change the password to
proceed..
This only affects the user "api-user". I am still new to oVirt and have
no idea what the api-user is trying to do and where.
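One way to narrow down where the api-user logins come from is to tally the failed-login events by user and source address. The following is only a minimal sketch: it assumes each event sits on a single line in the real engine.log (the lines quoted above are wrapped by the mail client), and the function name `count_failed_logins` is made up for illustration.

```python
import re
from collections import Counter

# Matches the USER_VDC_LOGIN_FAILED audit event as quoted in this post.
# Assumption: in the real log file the event is on one line.
LOGIN_FAILED = re.compile(
    r"USER_VDC_LOGIN_FAILED\(\d+\), User (?P<user>\S+) "
    r"connecting from '(?P<source>[^']*)' failed to log in"
)


def count_failed_logins(lines):
    """Count failed logins per (user, source address) pair."""
    counts = Counter()
    for line in lines:
        match = LOGIN_FAILED.search(line)
        if match:
            counts[(match["user"], match["source"])] += 1
    return counts
```

It could be fed the open log file, for example `count_failed_logins(open("/var/log/ovirt-engine/engine.log"))`, and the most frequent source address would show which machine keeps trying to authenticate as api-user.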
I found
https://www.ovirt.org/develop/release-management/features/infra/aaa-jdbc....
but haven't tried resetting yet because I don't know what will be affected
by that.
Can anyone point me in the right direction?
Many thanks in advance :)
Regards,
Niko
1 year, 6 months
Impossible to snapshot certain virtual machines
by nolhan.bertille@unige.ch
Hello all,
When performing snapshots I receive the following error:
2023-10-09 16:09:49,951: !!! No snapshot found !!!
2023-10-09 16:09:49,951: All backups done
2023-10-09 16:09:49,951: Backup failure for:
2023-10-09 16:09:49,951: *******
2023-10-09 16:09:49,951: Some errors occurred during the backup, please check the log file
Where can I look for more information? Where can I raise the verbosity?
Which process should I look at?
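If the backup tool is a Python script (an assumption; the tool and its options are not shown here), one generic way to raise verbosity is to lower the root logger's level to DEBUG and write everything to a file. This is a sketch using only the standard `logging` module; the helper name `enable_debug_logging` is hypothetical.

```python
import logging


def enable_debug_logging(log_path):
    """Send DEBUG-and-above records from all loggers to log_path."""
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)  # let DEBUG records through
    root.addHandler(handler)
    return handler
```

Calling this at the top of the script only helps if the script and its libraries log through `logging`. If the script uses the oVirt Python SDK, I believe its connection object can also be given `debug=True` and a `log` argument to dump the HTTP traffic, but please check that against the SDK documentation before relying on it.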
Thanks all
1 year, 6 months
Getting error when performing snapshots
by nolhan.bertille@unige.ch
Hello Guys,
I'm repeatedly getting the following error when running snapshots with the script backup.py:
2023-10-06 14:54:09,773: !!! No snapshot found !!!
2023-10-06 14:54:09,773: All backups done
2023-10-06 14:54:09,773: Backup failure for:
2023-10-06 14:54:09,773: *******
2023-10-06 14:54:09,773: Some errors occurred during the backup, please check the log file
I have no idea what to look at or where.
Could somebody help me?
1 year, 6 months
Hosted engine VM not coming up after Storage rebuild.
by Sumit Basu
Hi,
We are running oVirt 4.3 on IBM x3650 and x3550 servers, with our SAN on IBM Midrange storage DS5300. All the storage domains are separate LUNs in the storage, with a dedicated LUN for the HostedEngine VM. The storage had a failure due to a major power issue. One of the storage arrays that hold the storage domains had to be reconstructed using IBM's StorageManager tool. After booting the hosts, I can see all the storage domains as LUNs with multipath -ll from all the hosts, but after starting the hosted engine with # hosted-engine --vm-start and checking with --vm-status, I get
"Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}".
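For anyone scripting a check around this, the value after "Engine status :" is a JSON object and can be parsed directly. A minimal sketch follows; the function name `engine_is_healthy` is made up, and treating `"health": "good"` as the healthy value is an assumption based on what a working hosted engine usually reports.

```python
import json


def engine_is_healthy(status_json):
    """Return True only if the VM is up and the engine health is good.

    Assumption: a healthy engine reports "health": "good".
    """
    status = json.loads(status_json)
    return status.get("vm") == "up" and status.get("health") == "good"
```

With the status quoted above this returns False: the VM itself is up, but the engine inside it fails the liveliness check.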
Global maintenance mode is enabled. I tried "hosted-engine --console"; after logging in I checked boot.log and found:
[FAILED] Failed to mount /var/log.
See 'systemctl status var-log.mount' for details.
[DEPEND] Dependency failed for Update UTMP about System Boot/Shutdown.
[DEPEND] Dependency failed for Update UTMP about System Runlevel Changes.
[DEPEND] Dependency failed for Flush Journal to Persistent Storage.
[DEPEND] Dependency failed for /var/log/audit.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[DEPEND] Dependency failed for Migrate local... structure to the new structure.
[DEPEND] Dependency failed for Relabel all filesystems, if necessary.
The output of "systemctl status var-log.mount" is:
● var-log.mount - /var/log
Loaded: loaded (/etc/fstab; bad; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2023-10-06 14:38:58 IST; 10min ago
Where: /var/log
What: /dev/mapper/ovirt-log
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Process: 691 ExecMount=/bin/mount /dev/mapper/ovirt-log /var/log -t xfs -o nodev (code=exited, status=32)
Oct 06 14:38:57 ovman systemd[1]: Mounting /var/log...
Oct 06 14:38:58 ovman mount[691]: mount: mount /dev/mapper/ovirt-...g
Oct 06 14:38:58 ovman systemd[1]: var-log.mount mount process exi...2
Oct 06 14:38:58 ovman systemd[1]: Failed to mount /var/log.
Oct 06 14:38:58 ovman systemd[1]: Unit var-log.mount entered fail....
Hint: Some lines were ellipsized, use -l to show in full.
I need to recover from this quickly.
Sumit Basu
1 year, 6 months
keycloak Active Directory BufferOverFlowException
by karl.morgan@gmail.com
I have been trying to get a configuration working with oVirt 4.5.4, Keycloak and Windows Active Directory.
I have had partial success in that, with a little fiddling, I can use LDAP authentication for users. The fiddling involves going in and adding specific permissions to each ID after the first attempt has been made.
I am now trying to get groups working so that membership in specific AD user groups automatically determines a user's capabilities.
I have Keycloak configured so that, from within Keycloak, I can list users and see which groups a user is a member of. It looks like it should be working. However, in oVirt I'm getting an invalid data error in the web UI, and the logs print the following:
==> error_log <==
[Thu Oct 05 09:22:58.037130 2023] [proxy_ajp:error] [pid 75583:tid 140622648301312] AH03229: ajp_msg_append_cvt_string(): BufferOverflowException 4 5694
==> ssl_error_log <==
[Thu Oct 05 09:22:58.037162 2023] [proxy_ajp:error] [pid 75583:tid 140622648301312] [client 172.28.96.180:43980] AH00971: ajp_marshal_into_msgb: Error appending the header value, referer: https://ov2ctl01-mn.internal.shutterfly.com/ovirt-engine/
[Thu Oct 05 09:22:58.037171 2023] [proxy_ajp:error] [pid 75583:tid 140622648301312] [client 172.28.96.180:43980] AH00988: ajp_send_header: ajp_marshal_into_msgb failed, referer: https://ov2ctl01-mn.internal.shutterfly.com/ovirt-engine/
[Thu Oct 05 09:22:58.037176 2023] [proxy_ajp:error] [pid 75583:tid 140622648301312] (120001)APR does not understand this error code: [client 172.28.96.180:43980] AH00868: request failed to 127.0.0.1:8702 (127.0.0.1), referer: https://ov2ctl01-mn.internal.shutterfly.com/ovirt-engine/
I'm pretty sure I need to adjust a buffer size between Keycloak and the AJP module, but I am clueless about how to proceed. Any help would be appreciated.
1 year, 6 months
How to obtain vm snapshots status
by anton.alymov@cyberprotect.ru
Hi! I use the oVirt REST API to start a VM, back it up and then remove it.
I start the VM, wait for VM status up, then start the backup, wait for starting, finalize, wait for succeeded, and wait for the disk to unlock. From my side the backup looks finished at this point, because oVirt reports the succeeded status and unlocks the disk. But if I try to shut down and remove the VM, oVirt throws the error: Cannot remove VM. The VM is performing an operation on a Snapshot. Please wait for the operation to finish, and try again.
OK, oVirt is right here; I can see from the web interface that the operation hasn't finished yet. How can I obtain the correct status that tells me the VM can be removed? I also tried to get info about the VM's snapshots, but all of them had Status: ok
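Whatever the right indicator turns out to be, the waiting itself can be factored into a small polling helper. The sketch below is generic Python and does not call the oVirt API; `wait_until` and its parameters are hypothetical names, and the predicate you pass in has to perform the real check. Since the snapshot status already reads "ok", one candidate to try might be the engine's jobs collection, but that is an assumption and not verified for this case.

```python
import time


def wait_until(predicate, timeout=600.0, interval=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or timeout seconds pass.

    Returns True if the predicate succeeded, False on timeout.
    clock and sleep are injectable so the helper can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The removal request would then be sent only after `wait_until(no_snapshot_operation_running)` returns True, where `no_snapshot_operation_running` is a placeholder for the check you settle on.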
1 year, 6 months
Cannot remove template because a disk is based on it
by nicolas@devels.es
Hi,
We're running oVirt 4.5. We have a template which we'd like to get rid
of, there are no VMs based on it. However, trying to remove it oVirt
states that:
Cannot remove Template. The following Disk(s) are based on it:
(b54ee1cb-ed64-4db4-bd3d-eac8b22ea095) .
When opening the 'Disks' subtab in the template, a disk appears
(screenshot attached).
However, when opening the oVirt Storage->Disks option, I cannot find the
disk by the ID, not even by sorting the ID column and searching for it
visually.
Can anyone point to the problem and a possible solution/workaround?
Thanks.
1 year, 6 months
Image upload paused by system
by muchiri.maina@gmail.com
I recently renewed the certs for the manager and hosts in oVirt 4.4. Now when I try to upload an ISO, it gets paused by the system with the error: Unable to upload image to disk d840c8a3-21f5-4ad9-bd64-f569a93b4e74 due to a network error. Ensure ovirt-engine's CA certificate is registered as a trusted CA in the browser. I have registered the CA certificate, but it still won't upload.
1 year, 6 months
What certs need monitoring for expiration for Ovirt hosts ?
by morgan cox
Hi.
I am aware of the following certs
- "/etc/pki/vdsm/certs/vdsmcert.pem"
- "/etc/pki/vdsm/libvirt-spice/server-cert.pem"
- "/etc/pki/vdsm/libvirt-vnc/server-cert.pem"
- "/etc/pki/libvirt/clientcert.pem"
- "/etc/pki/vdsm/libvirt-migrate/server-cert.pem"
I am monitoring them so that they do not expire unnoticed.
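For reference, a check of this kind can be sketched in Python as below. It assumes the `openssl` command line tool is installed and that the files are PEM-encoded X.509 certificates; the function names `read_not_after` and `days_until` are made up for illustration.

```python
import ssl
import subprocess
from datetime import datetime, timezone


def read_not_after(cert_path):
    """Return the notAfter date of a PEM certificate, as printed by openssl.

    Example return value: 'Oct 10 12:00:00 2024 GMT'.
    """
    out = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", cert_path],
        check=True, capture_output=True, text=True,
    ).stdout
    # openssl prints a single line of the form 'notAfter=<date>'
    return out.strip().split("=", 1)[1]


def days_until(not_after, now=None):
    """Days (as a float) from now until the given notAfter date."""
    now = now or datetime.now(timezone.utc)
    expiry = ssl.cert_time_to_seconds(not_after)
    return (expiry - now.timestamp()) / 86400.0
```

A monitoring job could loop over the paths listed above and alert when `days_until(read_not_after(path))` drops below a chosen threshold.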
We have a 3rd party cert/CA - do I also need to monitor the following?
- "/etc/pki/ovirt-vmconsole/ca.pub"
- "/etc/pki/vdsm/certs/cacert.pem"
- "/etc/pki/vdsm/libvirt-migrate/ca-cert.pem"
- "/etc/pki/vdsm/libvirt-spice/ca-cert.pem"
- "/etc/pki/vdsm/libvirt-vnc/ca-cert.pem"
- "/etc/pki/CA/cacert.pem"
If the CA is updated on the engine, do the above CA certs get updated with an update or re-enroll?
Thanks
1 year, 6 months