Cannot restart ovirt after massive failure.
by Gilboa Davara
Hello all,
During the night, one of my (smaller) setups, a single node self hosted
engine (localhost NFS) crashed due to what-looks-like a massive disk
failure (Software RAID6, with 10 drives + spare).
After a reboot, I let the RAID resync with a fresh drive) and went on to
start oVirt.
However, no such luck.
Two issues:
1. ovirt-ha-broker fails due to broken hosted engine state (log attached).
2. ovirt-ha-agent fails due to network test (tcp) even though both
remote-host and DNS servers are active. (log attached).
Two questions:
1. Can I somehow force the agent to disable the network liveliness test?
2. Can I somehow force the broker to rebuild / fix the hosted engine state?
- Gilboa
10 months, 1 week
Please, Please Help - New oVirt Install/Deployment Failing - "Host is not up..."
by Matthew J Black
Hi Everyone,
Could someone please help me - I've been trying to do an install of oVirt for *weeks* (including false starts and self-inflicted wounds/errors) and it is still not working.
My setup:
- oVirt v4.5.3
- A brand new fresh vanilla install of RockyLinux 8.6 - all working AOK
- 2*NICs in a bond (802.3ad) with a couple of sub-Interfaces/VLANs - all working AOK
- All relevant IPv4 Address in DNS with Reverse Lookups - all working AOK
- All relevant IPv4 Address in "/etc/hosts" file - all working AOK
- IPv6 (using "method=auto" in the interface config file) enabled on the relevant sub-Interface/VLAN - I'm not using IPv6 on the network, only IPv4, but I'm trying to cover all the bases.
- All relevant Ports (as per the oVirt documentation) set up on the firewall
- ie firewall-cmd --add-service={{ libvirt-tls | ovirt-imageio | ovirt-vmconsole | vdsm }}
- All the relevant Repositories installed (ie RockyLinux BaseOS, AppStream, & PowerTools, and the EPEL, plus the ones from the oVirt documentation)
I have followed the oVirt documentation (including the special RHEL-instructions and RockyLinux-instructions) to the letter - no deviations, no special settings, exactly as they are written.
All the dnf installs, etc, went off without a hitch, including the "dnf install centos-release-ovirt45", "dnf install ovirt-engine-appliance", and "dnf install ovirt-hosted-engine-setup" - no errors anywhere.
Here is the results of a "dnf repolist":
- appstream Rocky Linux 8 - AppStream
- baseos Rocky Linux 8 - BaseOS
- centos-ceph-pacific CentOS-8-stream - Ceph Pacific
- centos-gluster10 CentOS-8-stream - Gluster 10
- centos-nfv-openvswitch CentOS-8 - NFV OpenvSwitch
- centos-opstools CentOS-OpsTools - collectd
- centos-ovirt45 CentOS Stream 8 - oVirt 4.5
- cs8-extras CentOS Stream 8 - Extras
- cs8-extras-common CentOS Stream 8 - Extras common packages
- epel Extra Packages for Enterprise Linux 8 - x86_64
- epel-modular Extra Packages for Enterprise Linux Modular 8 - x86_64
- ovirt-45-centos-stream-openstack-yoga CentOS Stream 8 - oVirt 4.5 - OpenStack Yoga Repository
- ovirt-45-upstream oVirt upstream for CentOS Stream 8 - oVirt 4.5
- powertools Rocky Linux 8 - PowerTools
So I kicked-off the oVirt deployment with: "hosted-engine --deploy --4 --ansible-extra-vars=he_offline_deployment=true".
I used "--ansible-extra-vars=he_offline_deployment=true" because without that flag I was getting "DNF timout" issues (see my previous post `Local (Deployment) VM Can't Reach "centos-ceph-pacific" Repo`).
I answer the defaults to all of questions the script asked, or entered the deployment-relevant answers where appropriate. In doing this I double-checked every answer before hitting <Enter>. Everything progressed smoothly until the deployment reached the "Wait for the host to be up" task... which then hung for more than 30 minutes before failing.
From the ovirt-hosted-engine-setup... log file:
- 2022-10-20 17:54:26,285+1100 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}
I checked the following log files and found all of the relevant ERROR lines, then checked several 10s of proceeding and succeeding lines trying to determine what was going wrong, but I could not determine anything.
- ovirt-hosted-engine-setup...
- ovirt-hosted-engine-setup-ansible-bootstrap_local_vm...
- ovirt-hosted-engine-setup-ansible-final_clean... - not really relevant, I believe
I can include the log files (or the relevant parts of the log files) if people want - but that are very large: several 100 kilobytes each.
I also googled "oVirt Host is not up" and found several entries, but after reading them all the most relevant seems to be a thread from these mailing list: `Install of RHV 4.4 failing - "Host is not up, please check logs, perhaps also on the engine machine"` - but this seems to be talking about an upgrade and I didn't gleam anything useful from it - I could, of course, be wrong about that.
So my questions are:
- Where else should I be looking (ie other log files, etc, and possible where to find them)?
- Does anyone have any idea why this isn't working?
- Does anyone have a work-around (including a completely manual process to get things working - I don't mind working in the CLI with virsh, etc)?
- What am I doing wrong?
Please, I'm really stumped with this, and I really do need help.
Cheers
Dulux-Oz
10 months, 1 week
Configure OVN for oVirt failing - vdsm.tool.ovn_config.NetworkNotFoundError: hostname
by huw.m@twinstream.com
Hello,
When installing the self-hosted engine using rocky 9 as a host (using nightly builds), the install gets as far as running the below ansible task from ovirt-engine
- name: Configure OVN for oVirt
ansible.builtin.command: >
vdsm-tool ovn-config {{ ovn_central }} {{ ovn_tunneling_interface }} {{ ovn_host_fqdn }}
This command gets executed as vdsm-tool ovn-config 192.168.57.4 hostname.my.project.com
and fails with error
"stderr" : "Traceback (most recent call last):\n File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 117, in get_network\n return networks[net_name]\nKeyError: 'virt-1.local.hyp.twinstream.com'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/bin/vdsm-tool\", line 195, in main\n return tool_command[cmd][\"command\"](*args)\n File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 63, in ovn_config\n ip_address = get_ip_addr(get_network(network_caps(), net_name))\n File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 119, in get_network\n raise NetworkNotFoundError(net_name)\nvdsm.tool.ovn_config.NetworkNotFoundError: hostname.my.project.com"
Running `vdsm-tool list-nets` on the host gives an empty list.
`ip a` gives
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:6d:16:65 brd ff:ff:ff:ff:ff:ff
altname enp0s6
altname ens6
inet 192.168.121.29/24 brd 192.168.121.255 scope global dynamic noprefixroute eth0
valid_lft 2482sec preferred_lft 2482sec
inet6 fe80::5054:ff:fe6d:1665/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:6b:f4:7b brd ff:ff:ff:ff:ff:ff
altname enp0s7
altname ens7
inet 192.168.56.151/24 brd 192.168.56.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe6b:f47b/64 scope link
valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
link/ether 52:54:00:8f:40:45 brd ff:ff:ff:ff:ff:ff
altname enp0s8
altname ens8
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:2f:27:9d brd ff:ff:ff:ff:ff:ff
altname enp0s9
altname ens9
6: eth4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bondstorage state UP group default qlen 1000
link/ether 52:54:00:b8:9b:d7 brd ff:ff:ff:ff:ff:ff
altname enp0s10
altname ens10
7: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:c2:9a:bd brd ff:ff:ff:ff:ff:ff
altname enp0s11
altname ens11
8: eth6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bondvm state UP group default qlen 1000
link/ether 52:54:00:ed:f7:cc brd ff:ff:ff:ff:ff:ff
altname enp0s12
altname ens12
9: eth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:de:8a:48 brd ff:ff:ff:ff:ff:ff
altname enp0s13
altname ens13
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:8f:40:45 brd ff:ff:ff:ff:ff:ff
inet 192.168.57.4/24 brd 192.168.57.255 scope global noprefixroute bond0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe8f:4045/64 scope link
valid_lft forever preferred_lft forever
11: bondvm: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:ed:f7:cc brd ff:ff:ff:ff:ff:ff
inet6 fe80::5054:ff:feed:f7cc/64 scope link
valid_lft forever preferred_lft forever
12: bondstorage: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:b8:9b:d7 brd ff:ff:ff:ff:ff:ff
inet 192.168.59.4/24 brd 192.168.59.255 scope global noprefixroute bondstorage
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:feb8:9bd7/64 scope link
valid_lft forever preferred_lft forever
13: bondvm.20@bondvm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:ed:f7:cc brd ff:ff:ff:ff:ff:ff
inet6 fe80::5054:ff:feed:f7cc/64 scope link
valid_lft forever preferred_lft forever
15: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:b2:5f:e2 brd ff:ff:ff:ff:ff:ff
inet 192.168.222.1/24 brd 192.168.222.255 scope global virbr0
valid_lft forever preferred_lft forever
16: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master virbr0 state UNKNOWN group default qlen 1000
link/ether fe:16:3e:34:3d:ea brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc16:3eff:fe34:3dea/64 scope link
valid_lft forever preferred_lft forever
47: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 6e:27:5f:fa:e3:3a brd ff:ff:ff:ff:ff:ff
48: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 12:7c:d9:2e:cf:26 brd ff:ff:ff:ff:ff:ff
49: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether a2:35:6e:5e:4c:60 brd ff:ff:ff:ff:ff:ff
bond0 was selected as the ovirtmgmt bridge NIC. It currently only has one member interface eth2 using balance-xor. In the ovirt management console I can the see host in a down state and given the rest of the playbook ran which requires ssh connectivity between hosted-engine and host, I believe the network is generally setup correctly.
No other immediate errors I can. As vdsm-tool ovn-config expects a network to exist with value of the hostname, what is meant to be creating this on the host?
Thanks,
Huw
10 months, 2 weeks
how to renew expired ovirt node vdsm cert manually ?
by dhanaraj.ramesh@yahoo.com
below are the steps to renew the expired vdsm cert of ovirt node
# To check CERT expired
# openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -noout -dates
1. Backup vdsm folder
# cd /etc/pki
# mv vdsm vdsm.orig
# mkdir vdsm ; chown vdsm:kvm vdsm
# cd vdsm
# mkdir libvirt-vnc certs keys libvirt-spice libvirt-migrate
# chown vdsm:kvm libvirt-vnc certs keys libvirt-spice libvirt-migrate
2. Regenerate cert & keys
# vdsm-tool configure --module certificates
3. Copy the cert to destination location
chmod 440 /etc/pki/vdsm/keys/vdsmkey.pem
chown root /etc/pki/vdsmcerts/*pem
chmod 644 /etc/pki/vdsmcerts/*pem
cp /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/libvirt-spice/ca-cert.pem
cp /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/libvirt-spice/server-key.pem
cp /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/libvirt-spice/server-cert.pem
cp /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/libvirt-vnc/ca-cert.pem
cp /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/libvirt-vnc/server-key.pem
cp /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/libvirt-vnc/server-cert.pem
cp -p /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/libvirt-migrate/ca-cert.pem
cp -p /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/libvirt-migrate/server-key.pem
cp -p /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/libvirt-migrate/server-cert.pem
chown root:qemu /etc/pki/vdsm/libvirt-migrate/server-key.pem
cp -p /etc/pki/vdsm.orig/keys/libvirt_password /etc/pki/vdsm/keys/
mv /etc/pki/libvirt/clientcert.pem /etc/pki/libvirt/clientcert.pem.orig
mv /etc/pki/libvirt/private/clientkey.pem /etc/pki/libvirt/private/clientkey.pem.orig
mv /etc/pki/CA/cacert.pem /etc/pki/CA/cacert.pem.orig
cp -p /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/libvirt/clientcert.pem
cp -p /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/libvirt/private/clientkey.pem
cp -p /etc/pki/vdsm/certs/cacert.pem /etc/pki/CA/cacert.pem
3. cross check the backup folder /etc/pki/vdsm.orig vs /etc/pki/vdsm
# refer to /etc/pki/vdsm.orig/*/ and set the correct owner & group permission in /etc/pki/vdsm/*/
4. restart services # Make sure both services are up
systemctl restart vdsmd libvirtd
10 months, 2 weeks
Unable to install oVirt on RHEL7.5
by SS00514758@techmahindra.com
Hi All,
I am unable to install oVirt on RHEL7.5, to install it I am taking reference of below link,
https://www.ovirt.org/documentation/install-guide/chap-Installing_oVirt.html
But though it is not working for me, couple of dependencies is not getting installed, and because of this I am not able to run the ovirt-engine, below are the depencies packages that unable to install,
Error: Package: collectd-write_http-5.8.0-6.1.el7.x86_64 (@ovirt-4.2-centos-opstools)
Requires: collectd(x86-64) = 5.8.0-6.1.el7
Removing: collectd-5.8.0-6.1.el7.x86_64 (@ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-6.1.el7
Updated By: collectd-5.8.1-1.el7.x86_64 (epel)
collectd(x86-64) = 5.8.1-1.el7
Available: collectd-5.7.2-1.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.7.2-1.el7
Available: collectd-5.7.2-3.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.7.2-3.el7
Available: collectd-5.8.0-2.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-2.el7
Available: collectd-5.8.0-3.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-3.el7
Available: collectd-5.8.0-5.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-5.el7
Help me to install this.
Looking forward to resolve this issue.
Regards
Sumit Sahay
10 months, 3 weeks
Grafana - Origin Not Allowed
by Maton, Brett
oVirt 4.5.0.8-1.el8
I tried to connect to grafana via the monitoring portal link from the dash
and all panels are failing to display any data with varying error messages,
but all include 'Origin Not Allowed'
I navigated to Data Sources and ran a test on the PostgreSQL connection
(localhost) which threw the same Origin Not Allowed error message.
Any suggestions?
11 months, 2 weeks
Cannot successfully import Windows vm to new ovirt deployment
by netracerx@mac.com
I am running into an issue that, from what little Google-fu I've been able to use, should be solved in oVirt 4.5. I'm trying to import WS2019 VMs from ESXi 7 as OVFs or even OVAs. But when I do the import and set the OS to Windows Server 2019 x64 in the admin portal, I get the error "Invalid time zone for the given OS type ". If I leave it at Other OS, the import fails with event ID 1153. I've been banging my head against this for over a week (previously couldn't get the management VM to complete setup), so any guidance is appreciated. Let me know what to pull to help me pin this down?
BTW, yes this is a self hosted install as a nested VM on ESXi. Kinda have to right now to test everything (easiest for me over doing a bare metal on my old PowerEdge 2900 at the moment). Running CentOS 9 Stream.
12 months
Hosted-engine restore failing
by Devin A. Bougie
Hi, All. We are attempting to migrate to a new storage domain for our oVirt 4.5.4 self-hosted engine setup, and are failing with "cannot import name 'Callable' from 'collections'"
Please see below for the errors on the console.
Many thanks,
Devin
------
hosted-engine --deploy --restore-from-file=backup.bck --4
...
[ INFO ] Checking available network interfaces:
[ ERROR ] b'[WARNING]: Skipping plugin (/usr/share/ovirt-hosted-engine-\n'
[ ERROR ] b'setup/he_ansible/callback_plugins/2_ovirt_logger.py), cannot load: cannot\n'
[ ERROR ] b"import name 'Callable' from 'collections'\n"
[ ERROR ] b'(/usr/lib64/python3.11/collections/__init__.py)\n'
[ ERROR ] b"ERROR! Unexpected Exception, this is probably a bug: cannot import name 'Callable' from 'collections' (/usr/lib64/python3.11/collections/__init__.py)\n"
[ ERROR ] Failed to execute stage 'Environment customization': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ ERROR ] b'[WARNING]: Skipping plugin (/usr/share/ovirt-hosted-engine-\n'
[ ERROR ] b'setup/he_ansible/callback_plugins/2_ovirt_logger.py), cannot load: cannot\n'
[ ERROR ] b"import name 'Callable' from 'collections'\n"
[ ERROR ] b'(/usr/lib64/python3.11/collections/__init__.py)\n'
[ ERROR ] b"ERROR! Unexpected Exception, this is probably a bug: cannot import name 'Callable' from 'collections' (/usr/lib64/python3.11/collections/__init__.py)\n"
[ ERROR ] Failed to execute stage 'Clean up': Failed executing ansible-playbook
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20231011110358.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Log file is located at
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20231011110352-raupj9.log
1 year
Gluster: Ideas for migration
by jonas@rabe.ch
Hello
I have to migrate the Gluster volumes from an old oVirt cluster to a newly built one. I looked into migration strategies, but everything that Red Hat recommends is related to replacing old bricks. In a testing environment I created two clusters and wanted to migrate one volume after the other. Unfortunately that fails because a node cannot be part of two clusters at the same time.
The next thing I see, is to recreate the volumes on the new cluster, then constantly rsync the files from the old cluster to the new one and at a specified point in time make the cut over where I stop the applicaiton, do a final rsync and remount the new volume under the old path.
Is there any other, nicer way I could accomplish migrating a volume from one Gluster cluster to another?
1 year
Engine on EL 9
by David Carvalho
Hello, good morning.
I’m using Oracle Linux and I intended to install a virtualization platffom with KVM and oracle VM. The Oracle documention only mentions Oracle Linux 8 and there are no oVirt repositories available for OL 9.
I visited ovirt.org site and at the download page it only mentions:
Engine:
* Red Hat Enterprise Linux 8.7 (or similar)
* CentOS Stream 8
I still had no reply at Oracle foruns. Will there be a possibility to use this with Oracle Linux 9 soon?
I have 3 servers to install and I also intend to use Gluster FS.
Thanks and regards.
Os melhores cumprimentos
David Alexandre M. de Carvalho
═══════════════════
Especialista de Informática
Departamento de Informática
Universidade da Beira Interior
1 year