gluster 5834 Unsynced entries present
by Dominique D
Yesterday I had a glitch and my second server, ovnode2, restarted.
Here are some errors from the events:
VDSM ovnode3.telecom.lan command SpmStatusVDS failed: Connection timeout for host 'ovnode3.telecom.lan', last response arrived 2455 ms ago.
Host ovnode3.telecom.lan is not responding. It will stay in Connecting state for a grace period of 86 seconds and after that an attempt to fence the host will be issued.
Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host ovnode3.telecom.lan, Error: Network error during communication with the Host.).
Executing power management status on Host ovnode3.telecom.lan using Proxy Host ovnode1.telecom.lan and Fence Agent ipmilan:10.5.1.16.
Now the 3 bricks of my Gluster volume show errors:
[root@ovnode2 ~]# gluster volume status
Status of volume: datassd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ovnode1s.telecom.lan:/gluster_bricks/
datassd/datassd                             49152     0          Y       4027
Brick ovnode2s.telecom.lan:/gluster_bricks/
datassd/datassd                             49153     0          Y       2393
Brick ovnode3s.telecom.lan:/gluster_bricks/
datassd/datassd                             49152     0          Y       2347
Self-heal Daemon on localhost               N/A       N/A        Y       2405
Self-heal Daemon on ovnode3s.telecom.lan    N/A       N/A        Y       2366
Self-heal Daemon on 172.16.70.91            N/A       N/A        Y       4043
Task Status of Volume datassd
------------------------------------------------------------------------------
There are no active volume tasks
gluster volume heal datassd info | grep -i "Number of entries:" | grep -v "entries: 0"
Number of entries: 5759
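The grep above shows only the non-zero totals, not which brick each one belongs to. A minimal sketch of a per-brick summary follows. It assumes the usual heal info layout, where each section starts with a "Brick <host>:<path>" line and ends with a "Number of entries: <n>" line. The function name heal_summary is hypothetical, not part of the gluster CLI.

```shell
# heal_summary: hypothetical helper, not a gluster command.
# Reads `gluster volume heal <volume> info` output on stdin and prints
# one "<brick> <pending entries>" line per brick.
heal_summary() {
    awk '
        # Remember the brick named at the start of each section.
        /^Brick /              { brick = $2 }
        # The count is the fourth whitespace-separated field on this line.
        /^Number of entries: / { print brick, $4 }
    '
}
```

It could be used as: gluster volume heal datassd info | heal_summary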
In the webadmin, all the bricks are green, with comments on two of them:
ovnode1 Up, 5834 Unsynced entries present
ovnode2 Up,
ovnode3 Up, 5820 Unsynced entries present
I tried this, without success:
gluster volume heal datassd
Launching heal operation to perform index self heal on volume datassd has been unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file for details.
What are the next steps?
Thank you
3 years, 2 months
about the expiration time of the oVirt certs
by Tommy Sway
As you know, oVirt uses many kinds of certificates, for communication,
authentication and so on.
In practice, however, these certificates carry a risk: each one must be
regenerated when it expires, otherwise problems occur.
In addition, different certificates expire at different times, which makes
them hard for users to manage.
This matters most in production, where a large virtualization cluster may
run thousands of VMs. If a cluster certificate has a problem, the impact is
very serious.
So I feel there is an urgent need for a tool that helps users quickly
locate certificates, identify their expiration dates, and rebuild them.
Even without such a tool, there should be a documented way to solve the
problems caused by partial certificate expiration. I think it should cover
the following points:
First, how to list the certificates in detail.
Second, how to check each certificate's expiration time.
Third, how to rebuild a certificate.
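The first two points (listing certificates and reading their expiration dates) can be sketched with the standard openssl CLI. This is only an illustration, not an oVirt tool. The helper name cert_expiry is hypothetical, and the sketch assumes the certificate files are PEM-encoded. It does not cover the third point, rebuilding.

```shell
# cert_expiry: hypothetical helper, not shipped with oVirt.
# Prints "<file><TAB><notAfter date>" for each PEM certificate file
# passed as an argument.
cert_expiry() {
    local cert end
    for cert in "$@"; do
        # `openssl x509 -noout -enddate` prints one line: "notAfter=<date>".
        if end="$(openssl x509 -noout -enddate -in "$cert" 2>/dev/null)"; then
            printf '%s\t%s\n' "$cert" "${end#notAfter=}"
        else
            # openssl could not parse the file (or is not installed).
            printf '%s\tnot a readable PEM certificate\n' "$cert"
        fi
    done
}
```

On an engine machine this might be called as cert_expiry /etc/pki/ovirt-engine/ca.pem /etc/pki/ovirt-engine/certs/*.cer, but those paths are an assumption here and should be checked against the actual installation. For a yes/no check, openssl x509 -checkend 2592000 -noout -in <file> exits non-zero if the certificate expires within 30 days.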
Does anyone else have this kind of confusion? What's a good solution?
Thanks.
3 years, 2 months
Using third-party certificate: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
by nicolas@devels.es
Hi,
I'm doing a bare metal oVirt installation, version 4.4.8.
The 'ovirt-engine' command ends well; however, we're using a third-party
certificate (from LetsEncrypt) for both the Apache server and the
ovirt-websocket-proxy, so we changed the configuration files for httpd
and ovirt-websocket-proxy.
Once the configurations are changed, if I try to log in to the oVirt engine,
I get a "PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException: unable to
find valid certification path to requested target" error.
In prior versions we used to add the chain to the
/etc/pki/ovirt-engine/.truststore file; however, even simply listing the
current certificates does not seem to work on 4.4.8.
# LANG=C keytool -list -keystore /etc/pki/ovirt-engine/.truststore -alias intermedia_le -storepass mypass
keytool error: java.io.IOException: Invalid keystore format
Is there something I'm missing here?
Thanks.
3 years, 2 months
Balancing actions not shown in events list
by Gianluca Cecchi
Hello,
I have a cluster composed of 4 hosts, with 2 hosts in site A and 2 hosts in
site B.
Version of engine and hosts is latest 4.4.8-6.
Site A is the primary site and its hosts have SPM priority high, while site
B hosts have SPM priority low.
For critical VMs I create a cluster affinity group so that they preferably
run on hosts in site A.
If I migrate a VM from one host in site A to one host in site B, the
migration completes but suddenly, after a few seconds (ranging from 10 to
30) the VM comes back again (live migrates) to one host of the site A pool.
Two considerations:
. When the VM comes back to site A and I'm connected to the web admin GUI, I
see in the bottom right the pop-up message regarding the balancing operation:
https://drive.google.com/file/d/1lfm0AVwYKyyRL1qHh94AySpr3XAtV7lO/view?us...
But then if I go in the VM, or cluster, or general events pane I don't see
any direct feedback regarding this balancing that took place.
I only see the VM migration events:
Oct 1, 2021, 2:47:01 PM Migration completed (VM: impoldsrvdbpbi, Source:
xxxx, Destination: yyyy, Duration: 15 seconds, Total: 27 seconds, Actual
downtime: 67ms)
Oct 1, 2021, 2:46:34 PM Migration initiated by system (VM: impoldsrvdbpbi,
Source: xxxx, Destination: yyyy, Reason: Affinity rules enforcement).
Oct 1, 2021, 2:45:45 PM Migration completed (VM: impoldsrvdbpbi, Source:
yyyy, Destination: xxxx, Duration: 2 seconds, Total: 14 seconds, Actual
downtime: (N/A))
Oct 1, 2021, 2:45:30 PM Migration started (VM: impoldsrvdbpbi, Source:
yyyy, Destination: xxxx, User: gian@internal).
These do contain some information (Reason: Affinity rules enforcement),
but only in the VM migration line.
Could it be useful to add a separate event line for the balancing trigger
that then causes the migration?
. In this case, could it be useful to warn the user that the VM will be
migrated straight back, so that he/she can reconsider before ending up
with two migrations whose final state is the starting point itself?
If I leave only one host in site A and put it into maintenance, the VMs are
correctly migrated to hosts in site B, but when the host in site A becomes
available again, the migration back is not triggered. Is this expected, or
should the VMs live migrate back to hosts in site A?
Thanks,
Gianluca
3 years, 2 months