I am new here. I have been searching the mailing list on a regular basis whenever I ran into problems, and until now I was always able to keep the system up & running.
As said... until now...
I have oVirt at home, with about 7 VMs running on it. Lately I have had some trouble with my electricity, which results in complete power outages at irregular intervals.
Yesterday I had a power outage which left the system unbootable with the following errors:
error: ../../grub-core/loader/i386/pc/linux.c:170:invalid magic number.
error: ../../grub-core/loader/i386/pc/linux.c:1418:you need to load the kernel first.
Press any key to continue
Normally this can be solved following https://access.redhat.com/solutions/5829141, but this time the files in /boot also had a size of 0 bytes. So basically I no longer had a working kernel on the system.
https://www.thegeekdiary.com/centos-rhel-7-how-to-install-kernel-from-res... does work for CentOS 8 as well, but the ovirt 4.8 ISO does not have the same directory structure. Using the CentOS 8 ISO installs a kernel, but not a bootable system.
Is there a way I can start the installer in troubleshooting mode so I can reinstall just the kernel on the system?
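For what it's worth, the usual rescue flow looks roughly like this. This is only a sketch: it assumes the installer ISO still offers "Troubleshooting -> Rescue a system" and mounts the installed root under /mnt/sysimage, and the package names assume a CentOS/oVirt Node 8 base (adjust to your layout):

```shell
# After booting the ISO, choose Troubleshooting -> Rescue a system
# and let it mount the installed system, then:
chroot /mnt/sysimage

# Reinstall the kernel packages so /boot is repopulated
# (assumed package set for EL8; pin a version if dnf picks a bad one)
dnf reinstall kernel kernel-core kernel-modules

# Regenerate the initramfs and the GRUB configuration
dracut --force --regenerate-all
grub2-mkconfig -o /boot/grub2/grub.cfg

# Leave the chroot and reboot
exit
```

If /boot is a separate filesystem that was damaged by the outage, it may need an fsck (or a mkfs plus reinstall of the kernel packages) before the steps above will stick.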
Yesterday I had a glitch and my second server, ovnode2, restarted.
Here are some errors in the events:
VDSM ovnode3.telecom.lan command SpmStatusVDS failed: Connection timeout for host 'ovnode3.telecom.lan', last response arrived 2455 ms ago.
Host ovnode3.telecom.lan is not responding. It will stay in Connecting state for a grace period of 86 seconds and after that an attempt to fence the host will be issued.
Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host ovnode3.telecom.lan, Error: Network error during communication with the Host.).
Executing power management status on Host ovnode3.telecom.lan using Proxy Host ovnode1.telecom.lan and Fence Agent ipmilan:10.5.1.16.
Now my 3 bricks report errors on my Gluster volume:
[root@ovnode2 ~]# gluster volume status
Status of volume: datassd
Gluster process                            TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
datassd/datassd                            49152     0         Y       4027
datassd/datassd                            49153     0         Y       2393
datassd/datassd                            49152     0         Y       2347
Self-heal Daemon on localhost              N/A       N/A       Y       2405
Self-heal Daemon on ovnode3s.telecom.lan   N/A       N/A       Y       2366
Self-heal Daemon on 172.16.70.91           N/A       N/A       Y       4043

Task Status of Volume datassd
------------------------------------------------------------------------------
There are no active volume tasks
gluster volume heal datassd info | grep -i "Number of entries:" | grep -v "entries: 0"
Number of entries: 5759
In the webadmin all the bricks are green, with comments for two of them:
ovnode1 Up, 5834 Unsynced entries present
ovnode3 Up, 5820 Unsynced entries present
I tried this without success:
gluster volume heal datassd
Launching heal operation to perform index self heal on volume datassd has been unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file for details.
What are the next steps?
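Not an authoritative answer, but the usual next steps for this message are to read the self-heal daemon log it points at and then retry the heal. A sketch using standard Gluster CLI commands (the log path assumes the default /var/log/glusterfs location):

```shell
# Look for the actual failure in the self-heal daemon log, on each node
tail -n 100 /var/log/glusterfs/glustershd.log

# Retry with a full heal instead of the default index heal
gluster volume heal datassd full

# Watch the pending-entry count; it should drop over time
gluster volume heal datassd info summary

# If the self-heal daemon itself is wedged, force-starting the volume
# respawns any missing brick/shd processes without touching data
gluster volume start datassd force
```

If entries remain stuck after that, `gluster volume heal datassd info split-brain` will show whether any of them are actually in split-brain and need manual resolution.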
As you know, there are many kinds of certificates in oVirt, used for
communication, authentication and so on.
However, in practice there is a security risk related to these
certificates: you need to generate a new certificate before the old one
expires, otherwise problems occur.
In addition, different certificates expire at different times, which brings
a lot of management trouble to users.
Especially in a production system, a huge virtualization cluster may run
thousands of VMs. If a cluster certificate has a problem, the impact is very
large.
So I felt there was an urgent need for a technical tool that could help
users quickly locate certificates, identify their expiration dates, and
renew them in time.
Even if there is no tool, there should be a way to solve the problems caused
by partial certificate expiration. I think it should include the following
steps:
First, how to list the certificates in detail.
Second, how to check the certificate expiration times.
Third, how to rebuild the certificates.
Does anyone else have this kind of confusion? What's a good solution?
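Even without a dedicated tool, the first two steps can be scripted with stock openssl. A minimal sketch, assuming the engine keeps its PKI under the default /etc/pki/ovirt-engine path (adjust to your deployment); the third step, rebuilding, is normally handled by re-running engine-setup, which detects close-to-expiry certificates and offers to renew them:

```shell
#!/bin/sh
# Step 1: list the certificates the engine keeps on disk
# (default engine-host path; hosts keep theirs under /etc/pki/vdsm)
find /etc/pki/ovirt-engine \( -name '*.cer' -o -name '*.pem' \) 2>/dev/null

# Step 2: print each certificate's expiry date, flagging any that
# expire within 30 days (2592000 seconds)
for cert in /etc/pki/ovirt-engine/certs/*.cer; do
    printf '%s: %s' "$cert" "$(openssl x509 -in "$cert" -noout -enddate)"
    if ! openssl x509 -in "$cert" -noout -checkend 2592000 >/dev/null; then
        printf ' <-- expires within 30 days!'
    fi
    printf '\n'
done
```

The same `openssl x509 -enddate` / `-checkend` pattern works for the VDSM certificates on each host, so it can be dropped into a cron job as a crude expiry monitor.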
I'm doing a bare-metal oVirt installation, version 4.4.8.
The 'ovirt-engine' setup ends well; however, we're using a third-party
certificate (from LetsEncrypt) for both the Apache server and the
ovirt-websocket-proxy, so we changed the configuration files for httpd
accordingly.
Once the configurations were changed, when I try to log in to the oVirt engine,
I get a "PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException: unable to
find valid certification path to requested target" error.
In prior versions we used to add the chain to the
/etc/pki/ovirt-engine/.truststore file; however, simply listing the
current certificates no longer seems to work on 4.4.8.
# LANG=C keytool -list -keystore /etc/pki/ovirt-engine/.truststore
-alias intermedia_le -storepass mypass
keytool error: java.io.IOException: Invalid keystore format
Is there something I'm missing here?
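One thing worth checking: as far as I know, on 4.4 the engine's keystores were switched to PKCS#12 format (the Java 11 default), so keytool's JKS autodetection can fail with exactly this "Invalid keystore format" error. A sketch of the listing and the chain import with the store type given explicitly; the alias and the file path here are just examples:

```shell
# List the truststore, telling keytool it is PKCS#12
LANG=C keytool -list \
    -keystore /etc/pki/ovirt-engine/.truststore \
    -storetype pkcs12 \
    -storepass mypass

# Import the LetsEncrypt intermediate the same way
# (alias and file name are illustrative)
LANG=C keytool -importcert \
    -keystore /etc/pki/ovirt-engine/.truststore \
    -storetype pkcs12 \
    -storepass mypass \
    -alias intermedia_le \
    -file /path/to/lets-encrypt-intermediate.pem
```

If the listing still fails with `-storetype pkcs12`, the store may genuinely be corrupt, in which case `openssl pkcs12 -info -in /etc/pki/ovirt-engine/.truststore` can help tell the two cases apart.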
I have a cluster composed of 4 hosts, with 2 hosts in site A and 2 hosts in
site B.
The version of engine and hosts is the latest, 4.4.8-6.
Site A is the primary site and its hosts have SPM priority high, while site
B hosts have SPM priority low.
For critical VMs I create a cluster affinity group so that they preferably
run on hosts in site A.
If I migrate a VM from a host in site A to a host in site B, the
migration completes, but then, after a few seconds (ranging from 10 to
30), the VM comes back again (live migrates) to one of the hosts in the site A pool.
When the VM comes back to site A and I'm connected to the web admin GUI, I
see in the bottom right the pop-up message about the balancing operation.
But if I then go to the VM, cluster, or general events pane, I don't see
any direct feedback about the balancing that took place.
I only see the VM migration events:
Oct 1, 2021, 2:47:01 PM Migration completed (VM: impoldsrvdbpbi, Source:
xxxx, Destination: yyyy, Duration: 15 seconds, Total: 27 seconds, Actual
Oct 1, 2021, 2:46:34 PM Migration initiated by system (VM: impoldsrvdbpbi,
Source: xxxx, Destination: yyyy, Reason: Affinity rules enforcement).
Oct 1, 2021, 2:45:45 PM Migration completed (VM: impoldsrvdbpbi, Source:
yyyy, Destination: xxxx, Duration: 2 seconds, Total: 14 seconds, Actual
Oct 1, 2021, 2:45:30 PM Migration started (VM: impoldsrvdbpbi, Source:
yyyy, Destination: xxxx, User: gian@internal).
Those do contain some information (Reason: Affinity rules enforcement),
but only in the migration line.
Could it be useful to add an independent event line for the balancing
trigger that then causes the migration?
In this case, could it also be useful to give the user a warning that the VM
will shortly be migrated back, so that he/she can think about it before
ending up with two migrations whose final state is the starting
host?
If I leave only one host in site A and put it into maintenance, the VMs are
correctly migrated to hosts in site B, and even when the host in site A
becomes available again, the migration back is not triggered. Is this
expected, or should they live migrate back to hosts in site A?