local on host storage domain full, how to clean up
by Gianluca Cecchi
Hello,
I have a local storage domain that has become full, so it is now
inactive and the virtual machines on it are paused ("vm paused due to
lack of storage space").
Any advice on how to clean up, possibly by deleting some of them?
# df -h /2t_2
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_2t_2-2t_2 1.8T 1.7T 0 100% /2t_2
[root@ovirt01 fdf9546c-68fa-42c9-8a10-78ef3ee534b8]#
Is there any easy way to map the directories inside
/2t_2/images/dbf9611d-9090-42d6-81e0-58105bc20011/images/ to their VMs, so
that I can "sacrifice" some VMs by deleting the corresponding disks'
directories, to at least be able to activate the storage domain again and
make a cleaner check?
Or any other suggestions? I'm not able to expand it.
It is not directly managed by me and I suppose too much storage
over-provisioning has been done.
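The best I've come up with so far is a sketch against the engine database
(the directory names under images/ are disk IDs; table and column names are
assumed from the 4.x schema, so they may be off):
# on the engine machine: map each directory name under images/ (a disk ID)
# to the VM it is attached to
su - postgres
psql engine
engine=# SELECT DISTINCT i.image_group_id AS disk_dir, vs.vm_name
         FROM images i
         JOIN vm_device dev ON dev.device_id = i.image_group_id AND dev.type = 'disk'
         JOIN vm_static vs ON vs.vm_guid = dev.vm_id;
Would that be the right approach? And would removing the disks from the
Administration Portal afterwards be cleaner than deleting the directories
by hand?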
Thanks in advance.
Gianluca
2 years, 5 months
Q: How to Fix Frozen "Reboot in progress" VM Status
by Andrei Verovski
Hi,
I have a VM which has restarted successfully, yet in the oVirt web UI it is shown with a "Rebooting" status for a very long time.
I did:
su - postgres
psql engine
select vm_guid from vm_static where vm_name='WInServerTerminal-2022';
engine=# select status from vm_dynamic where vm_guid='7871067f-221c-48ed-a046-f49499ce9be4';
status
--------
10
(1 row)
How do I properly correct the status from "Rebooting"?
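Would something like the following sketch be the proper way? This assumes
status 10 is RebootInProgress and 1 is Up in the engine's VMStatus enum,
that direct DB edits are acceptable, and that the database is backed up
first (the engine may also just re-sync this field from VDSM on its own):
# stop the engine so it does not overwrite the change mid-edit
systemctl stop ovirt-engine
# set the VM back to Up (status 1; 0 would be Down)
su - postgres -c "psql engine -c \"UPDATE vm_dynamic SET status = 1 WHERE vm_guid = '7871067f-221c-48ed-a046-f49499ce9be4';\""
systemctl start ovirt-engine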
Thanks in advance
Andrei
2 years, 5 months
how to force engine certificate renewal
by Gianluca Cecchi
Hello,
I'm currently still on 4.4.x.
Suppose I have an engine certificate expiring in mid-August and I want to
force-renew it now using the "engine-setup --offline" command.
How can I do that, if it is possible?
How many days before expiration do I get the message that it is expiring
soon, with a proposal to renew it, when running "engine-setup"?
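For context, I found hints that renewal can be forced with an otopi
environment override like the sketch below; I'm not certain the
OVESETUP_PKI/renew key is correct for my version, so treat it as an
assumption:
# force engine-setup to perform PKI renewal regardless of the expiry warning window
engine-setup --offline --otopi-environment="OVESETUP_PKI/renew=bool:True"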
Thanks,
Gianluca
2 years, 5 months
new cluster, 6 nodes
by bpbp@fastmail.com
Hi all, I'm planning a new 6-node hyper-converged cluster and have a couple of questions.
1) Storage: I think we want to do 2 replicas plus 1 arbiter, in the chained configuration seen here (example 5.7):
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5...
Any suggestions on how that looks from the bottom up? For example, does each host have all of its disks in a single hardware RAID 6 volume, with the bricks thinly provisioned via LVM on top, so that each node has 2 data bricks and 1 arbiter brick? Or is something else recommended? (See the sketch after this list.)
2) Setup: do I start with a 3-node pool and extend to 6, or use Ansible to set up all 6 from the start?
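To make question 1 concrete, here is a sketch of how I imagine the chained
layout as a volume create command (volume name, host names, and brick paths
are made up; each node ends up with 2 data bricks and 1 arbiter brick):
# 6-node chained replica-2 + arbiter layout; the third brick in each
# triplet is the arbiter (hypothetical hosts n1..n6)
gluster volume create vmstore replica 3 arbiter 1 \
  n1:/gluster/d1/brick n2:/gluster/d1/brick n3:/gluster/a1/brick \
  n2:/gluster/d2/brick n3:/gluster/d2/brick n4:/gluster/a2/brick \
  n3:/gluster/d3/brick n4:/gluster/d3/brick n5:/gluster/a3/brick \
  n4:/gluster/d4/brick n5:/gluster/d4/brick n6:/gluster/a4/brick \
  n5:/gluster/d5/brick n6:/gluster/d5/brick n1:/gluster/a5/brick \
  n6:/gluster/d6/brick n1:/gluster/d6/brick n2:/gluster/a6/brick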
Thanks
2 years, 5 months
Re: Getting error on oVirt installation
by dwayne.morton@cment.com
I've installed oVirt Node (fresh 4.4) and am trying to deploy the engine, but it fails each time and seems to be stuck in a loop. I saw this in RHV 4.3, where it was due to IPv6 being enabled. Any help is appreciated.
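If it is the same IPv6 issue (just an assumption), would disabling IPv6 via sysctl before retrying the deploy be the right workaround? A sketch:
# disable IPv6 on all interfaces (assumes the deploy loop is IPv6-related)
cat > /etc/sysctl.d/99-disable-ipv6.conf <<'EOF'
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
sysctl -p /etc/sysctl.d/99-disable-ipv6.conf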
2 years, 5 months
storage high latency, sanlock errors, cluster instability
by Jonathan Baecker
Hello everybody,
we run a 3-node self-hosted cluster with GlusterFS. I had a lot of
problems upgrading oVirt from 4.4.10 to 4.5.0.2, and now we have cluster
instability.
First I will write down the problems I had with upgrading, so you get a
bigger picture:
* The engine update went fine.
* But I could not update the nodes because of a wrong imgbase version, so
  I did a manual update to 4.5.0.1 and later to 4.5.0.2. The first time
  after updating it still booted into 4.4.10, so I did a reinstall.
* Then, after the second reboot, I ended up in emergency mode. After a
  long search I figured out that lvm.conf now uses *use_devicesfile*
  but with the wrong filters. So I commented this out and added the old
  filters back (roughly as in the lvm.conf sketch after this list). I did
  this procedure on all 3 nodes.
* Then in cockpit on all nodes I saw errors about:
  |ovs|00077|stream_ssl|ERR|Private key must be configured to use SSL|
  To fix that I ran *vdsm-tool ovn-config [engine IP] ovirtmgmt*, and
  later in the web interface I chose "Enroll certificate" for every node.
* Between upgrading the nodes, I was a bit too quick migrating all
  running VMs, including the HostedEngine, from one host to another, and
  the hosted engine crashed once. But it came back after a few minutes,
  and since then the engine has run normally.
* Then I finished the installation by updating the cluster compatibility
  version to 4.7.
* I noticed some unsynced volume warnings, but because I had also seen
  these after upgrading in the past, I thought they would disappear after
  some time. The next day they were still there, so I put the nodes into
  maintenance mode again and restarted the glusterd service. After some
  time the sync warnings were gone.
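For completeness, this is roughly what I changed in lvm.conf (the filter
is only an example; the real one matches our devices):
# lvm.conf sketch: disable the devices file and restore an explicit filter
devices {
    use_devicesfile = 0
    filter = ["a|^/dev/disk/by-id/lvm-pv-uuid-.*$|", "r|.*|"]
}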
So now to the actual problem:
Since then the cluster has been unstable. I get various errors and
warnings, like:
* VM [name] is not responding
* Out of nowhere, HA VMs get migrated
* VM migrations can fail
* VM backups with snapshotting and export take very long
* VMs sometimes get very slow
* Storage domain vmstore experienced a high latency of 9.14251
* ovs|00001|db_ctl_base|ERR|no key "dpdk-init" in Open_vSwitch record
  "." column other_config
* 489279 [1064359]: s8 renewal error -202 delta_length 10 last_success
489249
* 444853 [2243175]: s27 delta_renew read timeout 10 sec offset 0
/rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids
* 471099 [2243175]: s27 delta_renew read timeout 10 sec offset 0
/rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids
* many of: 424035 [2243175]: s27 delta_renew long write time XX sec
I will attach the sanlock.log messages and vdsm.log here.
Is there a way I can fix these issues?
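I can also run checks like these on each node and post the output if it
helps (volume name taken from the log paths above):
# pending self-heals on the vmstore volume
gluster volume heal vmstore info summary
# state of sanlock lockspaces and leases
sanlock client status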
Regards!
Jonathan
2 years, 5 months
Install OKD 4.10 with Custom oVirt Certificate
by Fredrik Arneving
Hi,
I've set up and run Installer-Provisioned Installations of OKD on several occasions, with OKD versions 4.4 - 4.8, on my oVirt (4.3?)/4.4 platform. However, after installing a custom certificate for my self-hosted oVirt engine, I have had problems getting the installation of OKD 4.10 (and 4.8) to complete. Is this a known problem with a known solution I can read up on somewhere?
The install takes three times as long as the working ones did before, and when I look at pods and cluster operators, the "authentication" ones are in a bad state. I can use the KUBECONFIG environment variable to list pods and interact with the environment, but "oc login" fails with "unknown issuer".
I had the choice of a "full install" of my custom cert or just the GUI/web, and I chose the latter. When installing the custom cert I followed the official RHV documentation that some oVirt user pointed to in a forum. Whatever certs I didn't change seemed to have worked before, so I would be surprised if the solution is to go for the "full install". In all other cases (like my Foreman server and my FreeIPA server) oVirt works just fine with its custom cert.
Since I've done this before, I'm pretty sure I've correctly followed the OKD installation instructions. What's new is the custom oVirt hosted-engine cert. Is there detailed documentation on exactly which certificates from my oVirt installation should be added to my "additionalTrustBundle" in OKD to make it work? In my previous working installations I added the custom root CA, since I needed it for other purposes, but maybe I need to add some other internal oVirt CA?
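For what it's worth, I could fetch the engine's internal CA like this, on
the (unverified) assumption that it, and not just the custom web
certificate's CA, has to go into the trust bundle; the engine hostname is
a placeholder:
# download the oVirt internal CA via the engine's pki-resource service
curl -k 'https://engine.example.com/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA' -o ovirt-internal-ca.pem
# then append it (together with the custom root CA) to additionalTrustBundle in install-config.yaml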
I'm currently running oVirt version "4.4.10.7-1.el8" on CentOS Stream release 8 and OKD version "4.10.0-0.okd-2022-03-07-131213". No hardware changes between working installations and failed ones.
Any hints on how to solve this would be appreciated
2 years, 6 months
why so many such logs ?
by tommy
In the new 4.5 version we can see a lot of OVN synchronization entries in the engine logs, very frequently, which we did not see in previous versions.
Is it a new feature?
2 years, 6 months
about the bridge of the host
by tommy
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovirtmgmt state UP group default qlen 1000
link/ether 08:00:27:94:4d:e8 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 9e:5d:8f:94:00:86 brd ff:ff:ff:ff:ff:ff
4: br-int: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether ea:20:e5:c3:d6:31 brd ff:ff:ff:ff:ff:ff
5: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 08:00:27:94:4d:e8 brd ff:ff:ff:ff:ff:ff
inet 10.1.1.7/24 brd 10.1.1.255 scope global noprefixroute ovirtmgmt
valid_lft forever preferred_lft forever
21: ip_vti0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
22: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 1e:cb:bf:02:f7:33 brd ff:ff:ff:ff:ff:ff
What are interfaces 3/4/5/21/22 used for? (I know item 5.)
Are they all bridges?
The output of brctl show suggests that only ovirtmgmt and ;vdsmdummy; are bridges:
[root@host1 ~]# brctl show
bridge name bridge id STP enabled interfaces
;vdsmdummy; 8000.000000000000 no
ovirtmgmt 8000.080027944de8 no enp0s3
[root@host1 ~]#
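I guess ovs-system and br-int might be Open vSwitch devices rather than
Linux bridges, which would explain why brctl does not list them. Would
this be the right way to check?
# list Open vSwitch bridges, ports, and interfaces
ovs-vsctl show
# show detailed link info, including the device type
ip -d link show br-int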
2 years, 6 months