oVirt 4.5.2 /var growing rapidly due to ovirt_engine_history db
by sohail_akhter3@hotmail.com
Hi,
We have recently upgraded our oVirt environment to version 4.5.2. The environment is based on hosted-engine. Since the upgrade we have noticed a rapid increase in /var partition usage on the engine VM. If we vacuum the ovirt_engine_history db, the /var usage drops, but by the next day it grows again by 5-10%. We have vacuumed the db a couple of times, but we are not sure why it is growing so rapidly.
Here is partial output from the vacuuming done on 26-08-22. The table "host_interface_hourly_history" had the most entries to be removed; the rest of the tables had few. Previously, the table "host_interface_samples_history" was the one with entries to be removed.
Any idea what the reason for this could be?
# dwh-vacuum -f -v
SELECT pg_catalog.set_config('search_path', '', false);
vacuumdb: vacuuming database "ovirt_engine_history"
RESET search_path;
SELECT c.relname, ns.nspname FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace ns ON c.relnamespace OPERATOR(pg_catalog.=) ns.oid
LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid
WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
ORDER BY c.relpages DESC;
SELECT pg_catalog.set_config('search_path', '', false);
VACUUM (FULL, VERBOSE) public.host_interface_samples_history;
INFO: vacuuming "public.host_interface_samples_history"
INFO: "host_interface_samples_history": found 3135 removable, 84609901 nonremovable row versions in 1564960 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 41.88 s, system: 14.93 s, elapsed: 422.83 s.
VACUUM (FULL, VERBOSE) public.host_interface_hourly_history;
INFO: vacuuming "public.host_interface_hourly_history"
INFO: "host_interface_hourly_history": found 252422 removable, 39904650 nonremovable row versions in 473269 pages
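(For anyone hitting the same growth: the samples and hourly tables shown above are the ones governed by the DWH retention settings, so if retention is the cause, it can be scaled down with a config fragment like the following. This is a sketch, not a recommendation — the variable names are the documented DWH_TABLES_KEEP_* knobs, values in hours, and the numbers below are the "basic" scale profile; check your existing ovirt-engine-dwhd.conf defaults first, then restart ovirt-engine-dwhd.)

```ini
# /etc/ovirt-engine-dwh/ovirt-engine-dwhd.conf.d/99-retention.conf
# Sketch: shrink DWH history retention (values are hours; 0 disables daily history)
DWH_TABLES_KEEP_SAMPLES=24
DWH_TABLES_KEEP_HOURLY=720
DWH_TABLES_KEEP_DAILY=0
```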
Please let me know if any further information is required.
Regards
Sohail
Notified that Engine's certification is about to expire but no documentation to renew it
by Guillaume Pavese
Hello
We are receiving the following notifications from our ovirt manager :
Message:Engine's certification is about to expire at 2022-05-03. Please
renew the engine's certification.
Severity:WARNING
And indeed:
# openssl x509 -in /etc/pki/ovirt-engine/certs/engine.cer -startdate
-enddate -noout
notBefore=Mar 30 04:48:15 2021 GMT
notAfter=May 3 04:48:15 2022 GMT
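(A quick way to see which of the engine's certificates are close to expiry is openssl's -checkend flag. A sketch — the paths are the usual ones on an engine VM and may differ on your install:)

```shell
# Report subject and expiry for the main oVirt engine certificates,
# and warn if any expires within 30 days (paths may differ per install).
for cert in /etc/pki/ovirt-engine/certs/engine.cer \
            /etc/pki/ovirt-engine/apache-ca.pem \
            /etc/pki/ovirt-engine/certs/apache.cer; do
    [ -f "$cert" ] || continue
    echo "== $cert"
    openssl x509 -in "$cert" -subject -enddate -noout
    # -checkend N exits non-zero if the cert expires within N seconds
    if ! openssl x509 -in "$cert" -checkend $((30*24*3600)) -noout; then
        echo "   WARNING: expires within 30 days"
    fi
done
```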
However, I cannot find any documentation on how to renew this certificate.
The following docs only cover replacing apache-ca.pem and apache.cer, not
engine.cer.
Doc oVirt :
https://ovirt.org/documentation/administration_guide/index.html#Replacing...
Doc RHV :
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/...
Any help?
Guillaume Pavese
Ingénieur Système et Réseau
Interactiv-Group
Certificate expiration
by Joseph Gelinas
Hi,
The certificates on our oVirt stack recently expired. All the VMs are still up, but I can't put the cluster into global maintenance via ovirt-engine, or do anything else via ovirt-engine for that matter; I just get event logs about cert validity.
VDSM ovirt-1.xxxxx.com command Get Host Capabilities failed: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
VDSM ovirt-2.xxxxx.com command Get Host Capabilities failed: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
VDSM ovirt-3.xxxxx.com command Get Host Capabilities failed: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
Under Compute -> Hosts, all are status Unassigned. Default data center is status Non Responsive.
I have tried a couple of solutions to regenerate the certificates without much luck and have copied the originals back in place.
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/...
https://access.redhat.com/solutions/2409751
I have seen suggestions that running engine-setup will generate new certs; however, the engine doesn't think the cluster is in global maintenance, so it won't run. I believe I can get around the check with `engine-setup --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True`, but is that the right thing to do? Will it deploy the certs onto the hosts as well, so things communicate properly? It looks like one is supposed to put a node into maintenance and re-enroll it after running engine-setup, but will I even be able to put the nodes into maintenance, given that I can't do anything with them now?
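(For reference, the sequence usually described for this situation is roughly the following. This is a non-runnable sketch assembled from the linked docs; treat it as pseudocode and verify each step against the documentation for your version before using it:)

```shell
# On one hosted-engine host: force global maintenance locally,
# which works through the HA agent even when the engine is unreachable
hosted-engine --set-maintenance --mode=global

# On the engine VM: re-run setup, which should offer to renew the expired PKI
# (the --otopi-environment override is the check-bypass quoted above)
engine-setup

# Afterwards, per host, from the admin portal once the engine answers again:
# Management -> Maintenance, then Installation -> Enroll Certificate
```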
Appreciate any ideas.
Re: Should I migrate existing oVirt Engine, or deploy new?
by David White
Hi Paul,
Thanks for the response.
I think you're suggesting that I take a hybrid approach, and do a restore of the current Engine onto the new VM. I hadn't thought about this option.
Essentially what I was considering was either:
- Export to OVA or something
OR
- Build a completely new oVirt engine with a completely new domain, etc... and try to live migrate the VMs from the old engine to the new engine.
Do I understand you correctly that you're suggesting I install the OS onto a new VM, and try to do a restore of the oVirt settings onto the new VM (after I put the cluster into Global maintenance mode and shutdown the old oVirt)?
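(For reference, the engine-backup flow being discussed would look roughly like this. A sketch only — the file names are placeholders, and the provisioning flags should be checked against `engine-backup --help` for the installed version:)

```shell
# On the old engine, with the cluster in global maintenance: full backup
engine-backup --mode=backup --scope=all --file=engine.bak --log=backup.log

# On the freshly installed OS of the new VM: restore, then re-run setup
engine-backup --mode=restore --file=engine.bak --log=restore.log \
              --provision-all-databases --restore-permissions
engine-setup
```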
Sent with Proton Mail secure email.
------- Original Message -------
On Friday, August 19th, 2022 at 10:46 AM, Staniforth, Paul <P.Staniforth(a)leedsbeckett.ac.uk> wrote:
> Hello David,
> I don't think there's a documented method to go from a Hosted Engine to standalone, just the other way around (standalone to HE).
>
> I would suggest doing a full backup of the engine, preparing the new VM, and restoring to that, rather than trying to export it.
> That way you can shut down the original engine and run the new engine VM to test that it works, and you will be able to restart the original engine if it doesn't.
>
> Regards,
> Paul S.
>
>
>
>
>
> From: David White via Users <users(a)ovirt.org>
> Sent: 19 August 2022 15:27
> To: David White <dmwhite823(a)protonmail.com>
> Cc: oVirt Users <users(a)ovirt.org>
> Subject: [ovirt-users] Re: Should I migrate existing oVirt Engine, or deploy new?
>
>
> In other words, I want to migrate the Engine from a hyperconverged environment into a stand-alone setup.
>
>
> Sent with Proton Mail secure email.
>
> ------- Original Message -------
> On Friday, August 19th, 2022 at 10:17 AM, David White via Users <users(a)ovirt.org> wrote:
>
>
> > Hello,
> > I have just purchased a Synology SA3400 which I plan to use for my oVirt storage domain(s) going forward. I'm currently using Gluster storage in a hyperconverged environment.
> >
> > My goal now is to:
> >
> > - Use the Synology Virtual Machine manager to host the oVirt Engine on the Synology
> > - Setup NFS storage on the Synology as the storage domain for all VMs in our environment
> > - Migrate all VM storage onto the new NFS domain
> > - Get rid of Gluster
> >
> >
> > My first step is to migrate the oVirt Engine off of Gluster storage / off the Hyperconverged hosts into the Synology Virtual Machine manager.
> >
> > Is it possible to migrate the existing oVirt Engine (put the cluster into Global Maintenance Mode, shutdown oVirt, export to VDI or something, and then import into Synology's virtualization)? Or would it be better for me to install a completely new Engine, and then somehow migrate all of the VMs from the old engine into the new engine?
> >
> > Thanks,
> > David
> >
> >
> > Sent with Proton Mail secure email.
>
hosted-engine --vm-status shows a ghost node that is no longer in the cluster: how to remove it?
by Diego Ercolani
engine 4.5.2.4
The issue appears in my cluster when I run:
[root@ovirt-node3 ~]# hosted-engine --vm-status
--== Host ovirt-node3.ovirt (id: 1) status ==--
Host ID : 1
Host timestamp : 1633143
Score : 3400
Engine status : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname : ovirt-node3.ovirt
Local maintenance : False
stopped : False
crc32 : 1cbfcd19
conf_on_shared_storage : True
local_conf_timestamp : 1633143
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1633143 (Wed Aug 31 14:37:53 2022)
host-id=1
score=3400
vm_conf_refresh_time=1633143 (Wed Aug 31 14:37:53 2022)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
--== Host ovirt-node1.ovirt (id: 2) status ==--
Host ID : 2
Host timestamp : 373629
Score : 0
Engine status : unknown stale-data
Hostname : ovirt-node1.ovirt
Local maintenance : True
stopped : False
crc32 : 12a6eb81
conf_on_shared_storage : True
local_conf_timestamp : 373630
Status up-to-date : False
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=373629 (Tue Jun 14 16:48:50 2022)
host-id=2
score=0
vm_conf_refresh_time=373630 (Tue Jun 14 16:48:50 2022)
conf_on_shared_storage=True
maintenance=True
state=LocalMaintenance
stopped=False
--== Host ovirt-node2.ovirt (id: 3) status ==--
Host ID : 3
Host timestamp : 434247
Score : 3400
Engine status : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname : ovirt-node2.ovirt
Local maintenance : False
stopped : False
crc32 : badb3751
conf_on_shared_storage : True
local_conf_timestamp : 434247
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=434247 (Wed Aug 31 14:37:45 2022)
host-id=3
score=3400
vm_conf_refresh_time=434247 (Wed Aug 31 14:37:45 2022)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
--== Host ovirt-node4.ovirt (id: 4) status ==--
Host ID : 4
Host timestamp : 1646655
Score : 3400
Engine status : {"vm": "up", "health": "good", "detail": "Up"}
Hostname : ovirt-node4.ovirt
Local maintenance : False
stopped : False
crc32 : 1a16027e
conf_on_shared_storage : True
local_conf_timestamp : 1646655
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1646655 (Wed Aug 31 14:37:43 2022)
host-id=4
score=3400
vm_conf_refresh_time=1646655 (Wed Aug 31 14:37:43 2022)
conf_on_shared_storage=True
maintenance=False
state=EngineUp
stopped=False
The problem is that ovirt-node1.ovirt is no longer in the cluster; in the host list presented by the UI there is, correctly, no ovirt-node1. It appears only in the command-line output.
I did a full-text search in the engine DB, but node1 doesn't appear anywhere; a grep on the filesystem doesn't find anything either.
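(In case it helps anyone searching later: the per-host state shown by --vm-status lives in the hosted-engine metadata whiteboard on the shared storage, not in the engine DB, which would explain why the search finds nothing. The usual way to drop a stale slot is hosted-engine's clean-metadata command. A sketch — run it from a healthy host, double-check the host-id against the status output, and verify the flags against `hosted-engine --help` for your version:)

```shell
# Remove the stale metadata slot for host id 2 (ovirt-node1 in the output above);
# --force-clean is needed when cleaning the slot of a host other than the local one
hosted-engine --clean-metadata --host-id=2 --force-clean
```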
How many oVirt cluster, hosts and VMs do you have running?
by jlhm@usa.net
Hi, just trying to understand how our oVirt deployment compare to others.
- 75 clusters (and 75 data centers, as we map them 1-to-1 due to security requirements) spanning 3+ physical data centers
- 328 oVirt servers
- 1,900+ VMs running
The majority are still on 4.3 (CentOS 7), but our engines (4 of them) run Red Hat 8 / oVirt 4.4. We are working to upgrade all hypervisors to Red Hat 8 (or Rocky 8) / oVirt 4.4.
When that is done we will start upgrading to the latest oVirt version, but it takes time given the size of our environment, and we move slowly to ensure stability.
Ubuntu NFS
by thilburn@generalpacific.com
Hello,
I was having trouble getting an Ubuntu 22.04 NFS share working, and after searching for hours I figured out what was needed. Below is what I found, in case anyone else runs into this.
My error was
engine.log
"...Unexpected return value: Status [code=701, message=Could not initialize cluster lock: ()]"
Host
supervdsm.log
-open error -13 EACCES: no permission to open /ThePath/ids
-check that daemon user sanlock *** group sanlock *** has access to disk or file.
The fix was
changing manage-gids=y in /etc/nfs.conf (which Ubuntu ships enabled) to # manage-gids=y (commenting it out falls back to the program default, which is no)
It looks like in the past the fix was to change the RPCMOUNTDOPTS="--manage-gids" line in /etc/default/nfs-kernel-server, which I didn't need to touch.
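(For concreteness, the relevant fragment of /etc/nfs.conf ends up looking like this. Shown as an illustration, not a complete config; on 22.04 the option lives in the [mountd] section:)

```ini
[mountd]
# manage-gids=y
# Commented out: with manage-gids enabled, mountd ignores the group list sent
# by the client and substitutes the server's own group database, which is what
# broke sanlock's group-based access to the /ThePath/ids lockspace file here.
```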
how to kill a backup operation
by Diego Ercolani
Hello, I saw there are other threads asking how to delete disk snapshots left over from backup operations.
We definitely need a tool to kill pending backup operations and unlock stuck snapshots.
I think this is very frustrating: oVirt is a good piece of software, but it is very immature in a dirty, asynchronous world.
We need a unified toolbox for manual cleanup and database housekeeping.
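(Until such a toolbox exists, one thing worth trying for a stuck backup is asking the engine to finalize it through the REST API. A non-runnable sketch — the FQDN, credentials, and UUIDs are placeholders, and this applies to the 4.4+ incremental-backup API, not to snapshot-based exports:)

```shell
# Placeholders throughout: engine FQDN, admin password, VM and backup UUIDs
curl -k -u 'admin@internal:PASSWORD' \
     -H 'Content-Type: application/xml' -H 'Accept: application/xml' \
     -d '<action/>' \
     'https://engine.example.com/ovirt-engine/api/vms/VM_UUID/backups/BACKUP_UUID/finalize'
```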
Interested in contributing a Spanish translation.
by Luis Pereida
Hello,
My name is Luis Pereida, I am Mexican, from Guadalajara and I am currently
an application security specialist.
Some time ago I came across the oVirt project, and it helped me a lot in many
situations where virtualization was the perfect option.
For some time I have been thinking about how to contribute to the project,
and, talking with some friends, they would like to have documentation in
Spanish. Although we can get by in English, the context or expressions are
often hard to understand.
I would like to help with that. How can I do it? I see that it is necessary
to use a Zanata account. How can I get an account?
Regards and thanks for being so supportive of the community.