[ANN] oVirt 4.5.1 First Release Candidate is now available for testing
by Lev Veyde
The oVirt Project is pleased to announce the availability of oVirt 4.5.1
First Release Candidate for testing, as of June 9th, 2022.
This update is the first in a series of stabilization updates to the 4.5
series.
Documentation
- If you want to try oVirt as quickly as possible, follow the instructions on the Download <https://ovirt.org/download/> page.
- For complete installation, administration, and usage instructions, see the oVirt Documentation <https://ovirt.org/documentation/>.
- For upgrading from a previous version, see the oVirt Upgrade Guide <https://ovirt.org/documentation/upgrade_guide/>.
- For a general overview of oVirt, see About oVirt <https://ovirt.org/community/about.html>.
Important notes before you try it
Please note this is a pre-release build.
The oVirt Project makes no guarantees as to its suitability or usefulness.
This pre-release must not be used in production.
Installation instructions
For installation instructions and additional information please refer to:
https://ovirt.org/documentation/
This release is available now on x86_64 architecture for:
- CentOS Stream 8
- RHEL 8.6 Beta and derivatives
This release supports Hypervisor Hosts on x86_64:
- oVirt Node NG (based on CentOS Stream 8)
- CentOS Stream 8
- RHEL 8.6 Beta and derivatives
Builds are also available for ppc64le and aarch64.
Experimental builds for CentOS Stream 9 are also provided for Hypervisor Hosts.
See the release notes [1] for installation instructions and a list of new
features and bugs fixed.
Notes:
- oVirt Appliance is already available based on CentOS Stream 8
- oVirt Node NG is already available based on CentOS Stream 8
Additional Resources:
* Read more about the oVirt 4.5.1 pre-release highlights:
http://www.ovirt.org/release/4.5.1/
* Get more oVirt project updates on Twitter: https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/
[1] http://www.ovirt.org/release/4.5.1/
[2] http://resources.ovirt.org/pub/ovirt-4.5-pre/iso/
Thanks in advance,
--
Lev Veyde
Senior Software Engineer, RHCE | RHCVA | MCITP
Red Hat Israel
<https://www.redhat.com>
lev(a)redhat.com | lveyde(a)redhat.com
<https://red.ht/sig>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
VmMediatedDevices help for vGPU in oVirt 4.5
by Don Dupuis
Hello
I am looking for an example of how to use the new VmMediatedDevices service
to add an Nvidia vGPU to guest VMs in oVirt. I had it working just fine in
oVirt 4.4 using the custom_properties method. I just need to understand
how to do it correctly the new way using python3-ovirt-engine-sdk.
Thanks
Don
Re: list-view instead of tiled-view in oVirt VM Portal?
by Frank Coons
Please note that (a) there are people who use more than 20 VMs and do
not need admin access, and (b) some people do not LIKE looking at big gaudy
buttons, even if there are only 15 of them.
I put in an RFE to bring back the list view YEARS ago and was basically
told that "we know what you want better than you do." I am willing to bet
that many more people want the list view than you realize, but you don't
seem to be willing to listen.
Disgruntled.
ovirt 4.4.10 ansible version
by Kapetanakis Giannis
Could someone verify the correct Ansible version for oVirt 4.4.10?
I'm having a dependency problem:
# rpm -q ansible
ansible-2.9.27-3.el8.noarch
# dnf update
Last metadata expiration check: 1:14:38 ago on Thu 09 Jun 2022 10:59:53 EEST.
Error:
Problem: package ovirt-engine-4.4.10.7-1.el8.noarch requires ansible < 2.10.0, but none of the providers can be installed
- cannot install both ansible-5.4.0-2.el8.noarch and ansible-2.9.27-3.el8.noarch
- cannot install both ansible-2.9.17-1.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install both ansible-2.9.18-2.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install both ansible-2.9.20-2.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install both ansible-2.9.21-2.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install both ansible-2.9.23-2.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install both ansible-2.9.24-2.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install both ansible-2.9.27-2.el8.noarch and ansible-5.4.0-2.el8.noarch
- cannot install the best update candidate for package ovirt-engine-4.4.10.7-1.el8.noarch
- cannot install the best update candidate for package ansible-2.9.27-3.el8.noarch
- package ansible-2.9.20-1.el8.noarch is filtered out by exclude filtering
- package ansible-2.9.16-1.el8.noarch is filtered out by exclude filtering
- package ansible-2.9.19-1.el8.noarch is filtered out by exclude filtering
- package ansible-2.9.23-1.el8.noarch is filtered out by exclude filtering
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
# dnf list --showduplicates ansible
Last metadata expiration check: 1:15:11 ago on Thu 09 Jun 2022 10:59:53 EEST.
Installed Packages
ansible.noarch 2.9.27-3.el8 @epel
Available Packages
ansible.noarch 2.9.17-1.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.18-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.20-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.21-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.23-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.24-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.27-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 5.4.0-2.el8 epel
ansible.noarch 5.4.0-2.el8 ovirt-4.4-epel
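The error above says that ovirt-engine-4.4.10.7 requires ansible < 2.10.0, while the best update candidate is 5.4.0. As a rough illustration of which of the listed builds satisfy that constraint, here is a minimal Python sketch. The helper names are hypothetical, and the version comparison is a simplified numeric tuple compare, not RPM's real comparison algorithm.

```python
# Sketch only: filter the `dnf list --showduplicates ansible` rows shown above
# down to the builds that satisfy "ansible < 2.10.0".
# Assumption: comparing numeric tuples is good enough for these version
# strings; real RPM version comparison has more rules.

DNF_LIST = """\
ansible.noarch 2.9.27-3.el8 @epel
ansible.noarch 2.9.17-1.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.18-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.20-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.21-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.23-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.24-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 2.9.27-2.el8 ovirt-4.4-centos-ovirt44
ansible.noarch 5.4.0-2.el8 epel
ansible.noarch 5.4.0-2.el8 ovirt-4.4-epel
"""


def parse_version(version_release):
    """Turn '2.9.27-3.el8' into (2, 9, 27), ignoring the release part."""
    version = version_release.split("-", 1)[0]
    return tuple(int(part) for part in version.split("."))


def satisfying_builds(listing, upper_bound=(2, 10, 0)):
    """Return (version-release, repo) pairs whose version is below upper_bound."""
    builds = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].startswith("ansible."):
            continue
        if parse_version(fields[1]) < upper_bound:
            builds.append((fields[1], fields[2]))
    return builds


if __name__ == "__main__":
    for version_release, repo in satisfying_builds(DNF_LIST):
        print(version_release, repo)
```

In this listing every 2.9.x build passes the check and both 5.4.0 entries (from the epel and ovirt-4.4-epel repositories) do not, which matches the conflict dnf reports.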
thanks,
G
Re: Self-hosted engine failing liveliness check
by McNamara, Bradley
When I run "hosted-engine --check-liveliness" it returns "Hosted Engine is not up!" When I run this command I can see it hitting the httpd server, with success, in the httpd logs. Accessing the URL returns this: "DB Up!Welcome to Health Status!".
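For comparing a manual request against what the check might expect, a small Python sketch follows. It is only an assumption that the check looks for a 200 response containing the text quoted above; the thread does not say what criteria the HA agent actually applies, and the URL is left as a parameter because it is not given here.

```python
# Sketch only: fetch a health URL and decide whether the response looks like
# the "DB Up!Welcome to Health Status!" page quoted above.
# Assumption: the real liveliness check may use different criteria.
import urllib.request

EXPECTED_MARKER = "Welcome to Health Status"  # text taken from the message above


def looks_healthy(status_code, body):
    """True if the response is a 200 carrying both expected strings."""
    return status_code == 200 and "DB Up!" in body and EXPECTED_MARKER in body


def fetch(url, timeout=10):
    """Return (status, body) for url; urlopen raises on HTTP error statuses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    import sys

    # Usage: python check_health.py <health-url>
    status, body = fetch(sys.argv[1])
    print("healthy" if looks_healthy(status, body) else "not healthy", status)
```

Running the same request from the host that runs the HA agent, rather than from a workstation, would show whether that host sees the same response.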
________________________________
From: McNamara, Bradley <Bradley.McNamara(a)seattle.gov>
Sent: Wednesday, June 8, 2022 12:54 PM
To: users(a)ovirt.org <users(a)ovirt.org>
Subject: [ovirt-users] Self-hosted engine failing liveliness check
Hello, and thank you all for your help.
I'm running Oracle's rebranded oVirt 4.3.10. All had been good until I patched my self-hosted engine. I ran through the normal process: backup, global maintenance mode, update the oVirt packages, run engine-setup, etc. All completed normally without issues. I rebooted the self-hosted engine VM, and now it constantly fails liveliness checks and the HA agent reboots it every five minutes or so. I put it back in global maintenance so the HA agent would not reboot it. The VM is up and works correctly. I can do everything normally.
From what I can tell, the HA agent liveliness check is just an HTTP GET to the web portal. I can see that happening with success. What is the liveliness check actually doing? All services on the VM are up and running without issue. Where can I look to figure this out?
Here is the output of hosted-engine --vm-status:
[root@itdlolv101 ~]# hosted-engine --vm-status
!! Cluster is in GLOBAL MAINTENANCE mode !!
--== Host itdlolv100.ci.seattle.wa.us (id: 1) status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : itdlolv100.ci.seattle.wa.us
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 855e161f
local_conf_timestamp : 55128
Host timestamp : 55128
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=55128 (Wed Jun 8 12:52:20 2022)
host-id=1
score=3400
vm_conf_refresh_time=55128 (Wed Jun 8 12:52:20 2022)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False
--== Host itdlolv101.ci.seattle.wa.us (id: 2) status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : itdlolv101.ci.seattle.wa.us
Host ID : 2
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : cc1c2261
local_conf_timestamp : 45453
Host timestamp : 45453
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=45453 (Wed Jun 8 12:55:15 2022)
host-id=2
score=3400
vm_conf_refresh_time=45453 (Wed Jun 8 12:55:15 2022)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False
!! Cluster is in GLOBAL MAINTENANCE mode !!
[root@itdlolv101 ~]#
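The "Engine status" field in the output above is a JSON object, so the per-host state can be pulled out mechanically. The following Python sketch does that on an excerpt of the output; the function name and regular expressions are illustrative assumptions, not part of any oVirt tool.

```python
# Sketch only: extract the per-host "Engine status" JSON from text shaped like
# the `hosted-engine --vm-status` output above.
import json
import re

SAMPLE = """\
--== Host itdlolv100.ci.seattle.wa.us (id: 1) status ==--
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
--== Host itdlolv101.ci.seattle.wa.us (id: 2) status ==--
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
"""

HOST_RE = re.compile(r"--== Host (\S+) \(id: (\d+)\) status ==--")
ENGINE_RE = re.compile(r"Engine status\s*:\s*(\{.*\})")


def engine_status_by_host(text):
    """Map each host name to its parsed 'Engine status' dictionary."""
    statuses = {}
    host = None
    for line in text.splitlines():
        host_match = HOST_RE.search(line)
        if host_match:
            host = host_match.group(1)
            continue
        engine_match = ENGINE_RE.search(line)
        if engine_match and host is not None:
            statuses[host] = json.loads(engine_match.group(1))
    return statuses


if __name__ == "__main__":
    for host, status in engine_status_by_host(SAMPLE).items():
        print(host, status["vm"], status["health"], status["reason"])
```

For the output above this makes the situation explicit: the VM is up on itdlolv101, yet its health is reported as bad with the reason "failed liveliness check".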
Re: storage high latency, sanlock errors, cluster instability
by Nir Soffer
On Sun, May 29, 2022 at 9:03 PM Jonathan Baecker <jonbae77(a)gmail.com> wrote:
>
> Am 29.05.22 um 19:24 schrieb Nir Soffer:
>
> On Sun, May 29, 2022 at 7:50 PM Jonathan Baecker <jonbae77(a)gmail.com> wrote:
>
> Hello everybody,
>
> We run a 3-node self-hosted cluster with GlusterFS. I had a lot of problems upgrading oVirt from 4.4.10 to 4.5.0.2, and now we have cluster instability.
>
> First I will write down the problems I had with upgrading, so you get a bigger picture:
>
> The engine update went fine.
> But I could not update the nodes because of a wrong version of imgbase, so I did a manual update to 4.5.0.1 and later to 4.5.0.2. The first time after updating, it was still booting into 4.4.10, so I did a reinstall.
> Then after the second reboot I ended up in emergency mode. After a long search I figured out that lvm.conf now uses use_devicesfile, but there it uses the wrong filters. So I commented this out and added the old filters back. I have done this procedure on all 3 nodes.
>
> When use_devicesfile (default in 4.5) is enabled, the lvm filter is not used.
> During installation the old lvm filter is removed.
>
> Can you share more info on why it does not work for you?
>
> The problem was that the node could not mount the gluster volumes anymore and ended up in emergency mode.
>
> - output of lsblk
>
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.8T 0 disk
> `-XA1920LE10063_HKS028AV 253:0 0 1.8T 0 mpath
> |-gluster_vg_sda-gluster_thinpool_gluster_vg_sda_tmeta 253:16 0 9G 0 lvm
> | `-gluster_vg_sda-gluster_thinpool_gluster_vg_sda-tpool 253:18 0 1.7T 0 lvm
> | |-gluster_vg_sda-gluster_thinpool_gluster_vg_sda 253:19 0 1.7T 1 lvm
> | |-gluster_vg_sda-gluster_lv_data 253:20 0 100G 0 lvm /gluster_bricks/data
> | `-gluster_vg_sda-gluster_lv_vmstore 253:21 0 1.6T 0 lvm /gluster_bricks/vmstore
> `-gluster_vg_sda-gluster_thinpool_gluster_vg_sda_tdata 253:17 0 1.7T 0 lvm
> `-gluster_vg_sda-gluster_thinpool_gluster_vg_sda-tpool 253:18 0 1.7T 0 lvm
> |-gluster_vg_sda-gluster_thinpool_gluster_vg_sda 253:19 0 1.7T 1 lvm
> |-gluster_vg_sda-gluster_lv_data 253:20 0 100G 0 lvm /gluster_bricks/data
> `-gluster_vg_sda-gluster_lv_vmstore 253:21 0 1.6T 0 lvm /gluster_bricks/vmstore
> sr0 11:0 1 1024M 0 rom
> nvme0n1 259:0 0 238.5G 0 disk
> |-nvme0n1p1 259:1 0 1G 0 part /boot
> |-nvme0n1p2 259:2 0 134G 0 part
> | |-onn-pool00_tmeta 253:1 0 1G 0 lvm
> | | `-onn-pool00-tpool 253:3 0 87G 0 lvm
> | | |-onn-ovirt--node--ng--4.5.0.2--0.20220513.0+1 253:4 0 50G 0 lvm /
> | | |-onn-pool00 253:7 0 87G 1 lvm
> | | |-onn-home 253:8 0 1G 0 lvm /home
> | | |-onn-tmp 253:9 0 1G 0 lvm /tmp
> | | |-onn-var 253:10 0 15G 0 lvm /var
> | | |-onn-var_crash 253:11 0 10G 0 lvm /var/crash
> | | |-onn-var_log 253:12 0 8G 0 lvm /var/log
> | | |-onn-var_log_audit 253:13 0 2G 0 lvm /var/log/audit
> | | |-onn-ovirt--node--ng--4.5.0.1--0.20220511.0+1 253:14 0 50G 0 lvm
> | | `-onn-var_tmp 253:15 0 10G 0 lvm /var/tmp
> | |-onn-pool00_tdata 253:2 0 87G 0 lvm
> | | `-onn-pool00-tpool 253:3 0 87G 0 lvm
> | | |-onn-ovirt--node--ng--4.5.0.2--0.20220513.0+1 253:4 0 50G 0 lvm /
> | | |-onn-pool00 253:7 0 87G 1 lvm
> | | |-onn-home 253:8 0 1G 0 lvm /home
> | | |-onn-tmp 253:9 0 1G 0 lvm /tmp
> | | |-onn-var 253:10 0 15G 0 lvm /var
> | | |-onn-var_crash 253:11 0 10G 0 lvm /var/crash
> | | |-onn-var_log 253:12 0 8G 0 lvm /var/log
> | | |-onn-var_log_audit 253:13 0 2G 0 lvm /var/log/audit
> | | |-onn-ovirt--node--ng--4.5.0.1--0.20220511.0+1 253:14 0 50G 0 lvm
> | | `-onn-var_tmp 253:15 0 10G 0 lvm /var/tmp
> | `-onn-swap 253:5 0 20G 0 lvm [SWAP]
> `-nvme0n1p3 259:3 0 95G 0 part
> `-gluster_vg_nvme0n1p3-gluster_lv_engine 253:6 0 94G 0 lvm /gluster_bricks/engine
>
> - The old lvm filter used, and why it was needed
>
> filter = ["a|^/dev/disk/by-id/lvm-pv-uuid-Nn7tZl-TFdY-BujO-VZG5-EaGW-5YFd-Lo5pwa$|", "a|^/dev/disk/by-id/lvm-pv-uuid-Wcbxnx-2RhC-s1Re-s148-nLj9-Tr3f-jj4VvE$|", "a|^/dev/disk/by-id/lvm-pv-uuid-lX51wm-H7V4-3CTn-qYob-Rkpx-Tptd-t94jNL$|", "r|.*|"]
>
> I don't remember exactly any more why it was needed, but without it the node was not working correctly. I think I even used vdsm-tool config-lvm-filter.
I think that if you list the devices in this filter:
ls -lh /dev/disk/by-id/lvm-pv-uuid-Nn7tZl-TFdY-BujO-VZG5-EaGW-5YFd-Lo5pwa \
    /dev/disk/by-id/lvm-pv-uuid-Wcbxnx-2RhC-s1Re-s148-nLj9-Tr3f-jj4VvE \
    /dev/disk/by-id/lvm-pv-uuid-lX51wm-H7V4-3CTn-qYob-Rkpx-Tptd-t94jNL
You will see that these are the devices used by these vgs:
gluster_vg_sda, gluster_vg_nvme0n1p3, onn
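To collect the device paths for that listing without copying them by hand, the accept entries of the old filter can be parsed. This is a minimal Python sketch with a hypothetical helper name; it assumes the filter uses "|" as the delimiter and that accept entries are full paths anchored with ^ and $, as in the filter quoted above.

```python
# Sketch only: pull the literal device paths out of the accept ("a") entries
# of the lvm filter quoted above. Reject ("r") entries are ignored.
# Assumption: entries look like "a|^<path>$|" with "|" as the delimiter.
import re

OLD_FILTER = [
    "a|^/dev/disk/by-id/lvm-pv-uuid-Nn7tZl-TFdY-BujO-VZG5-EaGW-5YFd-Lo5pwa$|",
    "a|^/dev/disk/by-id/lvm-pv-uuid-Wcbxnx-2RhC-s1Re-s148-nLj9-Tr3f-jj4VvE$|",
    "a|^/dev/disk/by-id/lvm-pv-uuid-lX51wm-H7V4-3CTn-qYob-Rkpx-Tptd-t94jNL$|",
    "r|.*|",
]

ACCEPT_RE = re.compile(r"^a\|\^(?P<path>/[^|]+)\$\|$")


def accepted_paths(filter_items):
    """Return the device paths named by anchored accept entries."""
    paths = []
    for item in filter_items:
        match = ACCEPT_RE.match(item)
        if match:
            paths.append(match.group("path"))
    return paths


if __name__ == "__main__":
    for path in accepted_paths(OLD_FILTER):
        print(path)
```

The printed paths can then be passed to ls -lh to see which block devices the by-id links resolve to.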
>
> - output of vdsm-tool config-lvm-filter
>
> Analyzing host...
> Found these mounted logical volumes on this host:
>
> logical volume: /dev/mapper/gluster_vg_nvme0n1p3-gluster_lv_engine
> mountpoint: /gluster_bricks/engine
> devices: /dev/nvme0n1p3
>
> logical volume: /dev/mapper/gluster_vg_sda-gluster_lv_data
> mountpoint: /gluster_bricks/data
> devices: /dev/mapper/XA1920LE10063_HKS028AV
>
> logical volume: /dev/mapper/gluster_vg_sda-gluster_lv_vmstore
> mountpoint: /gluster_bricks/vmstore
> devices: /dev/mapper/XA1920LE10063_HKS028AV
>
> logical volume: /dev/mapper/onn-home
> mountpoint: /home
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-ovirt--node--ng--4.5.0.2--0.20220513.0+1
> mountpoint: /
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-swap
> mountpoint: [SWAP]
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-tmp
> mountpoint: /tmp
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-var
> mountpoint: /var
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-var_crash
> mountpoint: /var/crash
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-var_log
> mountpoint: /var/log
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-var_log_audit
> mountpoint: /var/log/audit
> devices: /dev/nvme0n1p2
>
> logical volume: /dev/mapper/onn-var_tmp
> mountpoint: /var/tmp
> devices: /dev/nvme0n1p2
>
> Configuring LVM system.devices.
> Devices for following VGs will be imported:
>
> gluster_vg_sda, gluster_vg_nvme0n1p3, onn
>
> To properly configure the host, we need to add multipath
> blacklist in /etc/multipath/conf.d/vdsm_blacklist.conf:
>
> blacklist {
> wwid "eui.0025388901b1e26f"
> }
>
>
> Configure host? [yes,NO]
If you run "vdsm-tool config-lvm-filter" and confirm with "yes", I think all the vgs
will be imported properly into the lvm devices file.
I don't think it will solve the storage issues you have had since Feb 2022, but at least
you will have a standard configuration and the next upgrade will not revert your
local settings.
> If using lvm devices does not work for you, you can enable the lvm filter
> in vdsm configuration by adding a drop-in file:
>
> $ cat /etc/vdsm/vdsm.conf.d/99-local.conf
> [lvm]
> config_method = filter
>
> And run:
>
> vdsm-tool config-lvm-filter
>
> to configure the lvm filter in the best way for vdsm. If this does not create
> the right filter we would like to know why, but in general you should use
> lvm devices since it avoids the trouble of maintaining the filter and dealing
> with upgrades and user edited lvm filter.
>
> If you disable use_devicesfile, the next vdsm upgrade will enable it
> back unless you change the configuration.
>
> I would be happy to just use the default, if there is a way to make use_devicesfile work.
>
> Also even if you disable use_devicesfile in lvm.conf, vdsm still uses
> --devices instead of filter when running lvm commands, and lvm commands
> run by vdsm ignore your lvm filter since the --devices option overrides
> the system settings.
>
> ...
>
> I noticed some unsynced volume warnings, but because I had seen this in the past too after upgrading, I thought they would disappear after some time. The next day they were still there, so I decided to put the nodes into maintenance mode again and restart the glusterd service. After some time the sync warnings were gone.
>
> Not clear what these warnings are, I guess Gluster warnings?
>
> Yes, these were Gluster warnings; under Storage -> Volumes it was saying that some entries are unsynced.
>
> So now the actual problem:
>
> Since then the cluster has been unstable. I get different errors and warnings, like:
>
> VM [name] is not responding
> HA VMs get migrated out of nowhere
> VM migration can fail
> VM backup with snapshotting and export takes very long
>
> How do you back up the VMs? Do you use a backup application? How is it configured?
>
> I use a self-made Python script, which uses the REST API. I create a snapshot of the VM, build a new VM from that snapshot, and move the new one to the export domain.
This is not very efficient: it copies the entire VM as of the time of the
snapshot and then copies it again to the export domain.
If you use a backup application supporting the incremental backup API, the
first full backup will copy the entire VM once, but later incremental backups
will copy only the changes since the last backup.
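To make the difference concrete, here is a back-of-the-envelope Python sketch. The VM size, change rate, and number of runs are hypothetical numbers chosen only for illustration, and the model ignores thin provisioning, compression, and snapshot overhead.

```python
# Sketch only: compare how much data each backup approach copies.
# All figures are hypothetical; nothing here is measured from the cluster
# discussed in this thread.


def copied_clone_and_export(vm_size_gib, runs):
    """Snapshot -> new VM -> export domain: the whole VM is copied twice per run."""
    return 2 * vm_size_gib * runs


def copied_incremental(vm_size_gib, changed_gib_per_run, runs):
    """One full backup, then only the changed data on each later run."""
    if runs <= 0:
        return 0
    return vm_size_gib + changed_gib_per_run * (runs - 1)


if __name__ == "__main__":
    vm_size = 100   # GiB, hypothetical
    changed = 2     # GiB changed between backups, hypothetical
    runs = 7        # e.g. one week of daily backups

    print("clone + export:", copied_clone_and_export(vm_size, runs), "GiB")
    print("incremental:   ", copied_incremental(vm_size, changed, runs), "GiB")
```

With these made-up figures the clone-and-export approach moves 1400 GiB in a week against 112 GiB for incremental backups, which matters on storage that is already showing high latency.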
>
> VMs are getting very slow sometimes
> Storage domain vmstore experienced a high latency of 9.14251
> ovs|00001|db_ctl_base|ERR|no key "dpdk-init" in Open_vSwitch record "." column other_config
> 489279 [1064359]: s8 renewal error -202 delta_length 10 last_success 489249
> 444853 [2243175]: s27 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids
> 471099 [2243175]: s27 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/ids
> many of: 424035 [2243175]: s27 delta_renew long write time XX sec
>
> All these issues tell us that your storage is not working correctly.
>
> sanlock.log is full of renewal errors from May:
>
> $ grep 2022-05- sanlock.log | wc -l
> 4844
>
> $ grep 2022-05- sanlock.log | grep 'renewal error' | wc -l
> 631
>
> But there is a lot of trouble from earlier months:
>
> $ grep 2022-04- sanlock.log | wc -l
> 844
> $ grep 2022-04- sanlock.log | grep 'renewal error' | wc -l
> 29
>
> $ grep 2022-03- sanlock.log | wc -l
> 1609
> $ grep 2022-03- sanlock.log | grep 'renewal error' | wc -l
> 483
>
> $ grep 2022-02- sanlock.log | wc -l
> 826
> $ grep 2022-02- sanlock.log | grep 'renewal error' | wc -l
> 242
>
> Here sanlock log looks healthy:
>
> $ grep 2022-01- sanlock.log | wc -l
> 3
> $ grep 2022-01- sanlock.log | grep 'renewal error' | wc -l
> 0
>
> $ grep 2021-12- sanlock.log | wc -l
> 48
> $ grep 2021-12- sanlock.log | grep 'renewal error' | wc -l
> 0
>
> vdsm log shows that 2 domains are not accessible:
>
> $ grep ERROR vdsm.log
> 2022-05-29 15:07:19,048+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata (monitor:511)
> 2022-05-29 16:33:59,049+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata (monitor:511)
> 2022-05-29 16:34:39,049+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata (monitor:511)
> 2022-05-29 17:21:39,050+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata (monitor:511)
> 2022-05-29 17:55:59,712+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/metadata (monitor:511)
> 2022-05-29 17:56:19,711+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/metadata (monitor:511)
> 2022-05-29 17:56:39,050+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_data/de5f4123-0fac-4238-abcf-a329c142bd47/dom_md/metadata (monitor:511)
> 2022-05-29 17:56:39,711+0200 ERROR (check/loop) [storage.monitor] Error checking path /rhev/data-center/mnt/glusterSD/onode1.example.org:_vmstore/3cf83851-1cc8-4f97-8960-08a60b9e25db/dom_md/metadata (monitor:511)
>
> You need to find what is the issue with your Gluster storage.
>
> I hope that Ritesh can help debug the issue with Gluster.
>
> Nir
>
> I'm worried that I might do something that makes it even worse, and I have no idea what the problem is. To me it does not look exactly like a problem with data inconsistencies.
The problem is that your Gluster storage is not healthy, and reading and
writing to it times out.
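The per-month grep counts quoted earlier in this message can be gathered in one pass. The following Python sketch is an approximation under one assumption: each log line starts with a YYYY-MM-DD date, which is what the grep patterns imply. The sample lines reuse message texts from this thread, but their timestamps and the shortened path are made up for illustration.

```python
# Sketch only: count all lines and "renewal error" lines per month, roughly
# what the grep | wc -l pipelines above compute.
# Assumption: every log line of interest begins with "YYYY-MM-DD ".
import re
from collections import Counter

LINE_RE = re.compile(r"^(\d{4}-\d{2})-\d{2} ")

# Illustrative lines only: the timestamps and the path are hypothetical.
SAMPLE = [
    "2022-05-29 15:07:19 489279 [1064359]: s8 renewal error -202 delta_length 10 last_success 489249",
    "2022-05-29 15:07:29 444853 [2243175]: s27 delta_renew read timeout 10 sec offset 0 /path/to/dom_md/ids",
    "2022-04-02 10:00:00 424035 [2243175]: s27 delta_renew long write time 12 sec",
]


def monthly_counts(lines):
    """Return (total lines per month, renewal-error lines per month)."""
    total, errors = Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        month = match.group(1)
        total[month] += 1
        if "renewal error" in line:
            errors[month] += 1
    return total, errors


if __name__ == "__main__":
    total, errors = monthly_counts(SAMPLE)
    for month in sorted(total):
        print(month, total[month], errors[month])
```

To run it on a real log, pass an open file object instead of SAMPLE, for example monthly_counts(open("sanlock.log")).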
Please keep users(a)ovirt.org CC when you reply. Gluster storage is very
popular in this mailing list and you may get useful help from other users.
Nir
How to run virt-sysprep / How to troubleshoot template creation failure
by jeremy_tourville@hotmail.com
When attempting to create a template, the process fails. It has been suggested to run virt-sysprep manually and see why it failed. Specifically, I have an Ubuntu 20.04 machine that doesn't boot properly even if the template process does finish without error. I have tried creating the template several times as a test. About 80% of the time the template creation fails outright with qemu errors. The other 20% "appear" to work, but the system boots to a grub emergency prompt.
Can someone clarify the process?
I know you can run the command: virt-sysprep -d <name_of_vm>
1. Where do you run it from? The hypervisor host or the management engine?
2. What account do you need to use? What is the authentication username and password?
I also think I read somewhere that you should refrain from using root. Anything else to know?
Anything else to try regarding the grub prompt?
Thanks.
Multiple displays
by mblecha@flagshipbio.com
Our lab environment was set up on 4.4.10, then upgraded to 4.5, and our production environment was a clean install of 4.5.
In our lab environment, I remember having to alter a setting back in 4.4.10, through 'engine-config' or something similar, to get the multiple monitor option to cooperate in the VM settings.
I googled around for the original article that allowed me to fix this on 4.4.10, but I cannot find what I had earlier.
On the lab environment, the VM displays behave as expected; X starts and the second display becomes available in virt-viewer. In our production environment, same configurations, OS, etcetera on the VM, but X starts and the second display is still not available.