Lots of problems with deploying the hosted-engine (ovirt 4.4 | CentOS 8.2.2004)
by jonas
Hi!
I have been banging my head against deploying the oVirt 4.4 self-hosted
engine on CentOS 8.2 for the last couple of days.
First I was astonished that resources.ovirt.org has no IPv6
connectivity, which made my initial plan for a mostly IPv6-only
deployment impossible.
CentOS was installed from scratch using the ks.cfg Kickstart file below,
which also adds the oVirt 4.4 repo and installs cockpit-ovirt-dashboard
and ovirt-engine-appliance.
When deploying the hosted-engine from cockpit while logged in as a
non-root (although privileged) user, the "(3) Prepare VM" step instantly
fails with a nondescript error message and without generating any logs.
Using the browser dev tools I found that this happens because the Ansible
vars file cannot be created: the non-root user has no write permission in
'/var/lib/ovirt-hosted-engine-setup/cockpit/'. Shouldn't cockpit be
capable of using sudo when appropriate, or at least give a more
descriptive error message?
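For reference, the check and the workaround boil down to something like this (the workaround is simply what the next paragraph describes, i.e. running the deployment as root):

# The directory the wizard wants to write the Ansible vars file into is not
# writable by my unprivileged user:
ls -ld /var/lib/ovirt-hosted-engine-setup/cockpit/

# Workaround: log into cockpit as root, or run the setup from the CLI instead:
sudo ovirt-hosted-engine-setup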
After logging into cockpit as root, or when using the command-line
ovirt-hosted-engine-setup tool, the deployment fails with "Failed to
download metadata for repo 'AppStream'".
This seems to be because a) the dnsmasq running on the host does not
forward DNS queries, even though the host itself can resolve DNS just
fine, and b) there also does not seem to be any functioning routing set
up to reach anything outside the host.
Regarding a), it is strange that dnsmasq is running with a config file
'/var/lib/libvirt/dnsmasq/default.conf' containing the 'no-resolv'
option. Could systemd-resolved be interfering with dnsmasq (see the ss
-tulpen output below)? I tried manually stopping systemd-resolved, but
got the same behaviour as before.
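To illustrate where I am stuck, this is roughly what I would test and try next (the forwarder address is the redacted DNS server from my setup, and I have not verified that hosted-engine-setup tolerates editing the 'default' libvirt network mid-deploy, so please treat it as a sketch):

# With 'no-resolv' and no 'server=' lines in default.conf, this dnsmasq has no
# upstream to forward to, so a query against it should fail:
dig +short @fd00:1234:5678:900::1 resources.ovirt.org AAAA

# Possible workaround: add <dns><forwarder addr='*REDACTED*:1052::11'/></dns>
# to the network definition and restart the network:
virsh net-edit default
virsh net-destroy default && virsh net-start default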
I hope someone can give me a hint on how to get past this problem, as so
far my oVirt experience has been a bit sub-par. :D
Also, when running ovirt-hosted-engine-cleanup, the extracted engine VM
images in /var/tmp/localvm* are not removed, leading to a "disk space
leak" with subsequent runs.
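For now I remove the leftovers by hand, which seems safe as long as no deployment is currently running:

# ovirt-hosted-engine-cleanup leaves the extracted appliance of the bootstrap VM behind:
rm -rf /var/tmp/localvm*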
Best regards
Jonas
--- ss -tulpen output post deploy-run ---
[root@nxtvirt ~]# ss -tulpen | grep ':53 '
udp UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=1379,fd=18)) uid:193 ino:32910 sk:6 <->
udp UNCONN 0 0 [fd00:1234:5678:900::1]:53 [::]:* users:(("dnsmasq",pid=13525,fd=15)) uid:979 ino:113580 sk:d v6only:1 <->
udp UNCONN 0 0 [fe80::5054:ff:fe94:f314]%virbr0:53 [::]:* users:(("dnsmasq",pid=13525,fd=12)) uid:979 ino:113575 sk:e v6only:1 <->
tcp LISTEN 0 32 [fd00:1234:5678:900::1]:53 [::]:* users:(("dnsmasq",pid=13525,fd=16)) uid:979 ino:113581 sk:20 v6only:1 <->
tcp LISTEN 0 32 [fe80::5054:ff:fe94:f314]%virbr0:53 [::]:* users:(("dnsmasq",pid=13525,fd=13)) uid:979 ino:113576 sk:21 v6only:1 <->
--- running dnsmasq processes on host ('nxtvirt') post deploy-run ---
dnsmasq 13525 0.0 0.0 71888 2344 ? S 12:31 0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
root 13526 0.0 0.0 71860 436 ? S 12:31 0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
--- /var/lib/libvirt/dnsmasq/default.conf ---
##WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
##OVERWRITTEN AND LOST. Changes to this configuration should be made using:
## virsh net-edit default
## or other application using the libvirt API.
##
## dnsmasq conf file created by libvirt
strict-order
pid-file=/run/libvirt/network/default.pid
except-interface=lo
bind-dynamic
interface=virbr0
dhcp-option=3
no-resolv
ra-param=*,0,0
dhcp-range=fd00:1234:5678:900::10,fd00:1234:5678:900::ff,64
dhcp-lease-max=240
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
enable-ra
--- cockpit wizard overview before the 'Prepare VM' step ---
VM
Engine FQDN: engine.*REDACTED*
MAC Address: 00:16:3e:20:13:b3
Network Configuration: Static
VM IP Address: *REDACTED*:1099:babe::3/64
Gateway Address: *REDACTED*:1099::1
DNS Servers: *REDACTED*:1052::11
Root User SSH Access: yes
Number of Virtual CPUs: 4
Memory Size (MiB): 4096
Root User SSH Public Key: (None)
Add Lines to /etc/hosts: yes
Bridge Name: ovirtmgmt
Apply OpenSCAP profile: no
Engine
SMTP Server Name: localhost
SMTP Server Port Number: 25
Sender E-Mail Address: root@localhost
Recipient E-Mail Addresses: root@localhost
--- ks.cfg ---
#version=RHEL8
ignoredisk --only-use=vda
autopart --type=lvm
# Partition clearing information
clearpart --drives=vda --all --initlabel
# Use graphical install
#graphical
text
# Use CDROM installation media
cdrom
# Keyboard layouts
keyboard --vckeymap=de --xlayouts='de','us'
# System language
lang en_US.UTF-8
# Network information
network --bootproto=static --device=enp1s0 --ip=192.168.199.250 --netmask=255.255.255.0 --gateway=192.168.199.10 --ipv6=*REDACTED*:1090:babe::250/64 --ipv6gateway=*REDACTED*:1090::1 --hostname=nxtvirt.*REDACTED* --nameserver=*REDACTED*:1052::11 --activate
network --hostname=nxtvirt.*REDACTED*
# Root password
rootpw --iscrypted $6$*REDACTED*
firewall --enabled --service=cockpit --service=ssh
# Run the Setup Agent on first boot
firstboot --enable
# Do not configure the X Window System
skipx
# System services
services --enabled="chronyd"
# System timezone
timezone Etc/UTC --isUtc --ntpservers=ntp.*REDACTED*,ntp2.*REDACTED*
user --name=nonrootuser --groups=wheel --password=$6$*REDACTED* --iscrypted
# KVM Users/Groups
group --name=kvm --gid=36
user --name=vdsm --uid=36 --gid=36
%packages
@^server-product-environment
#@graphical-admin-tools
@headless-management
kexec-tools
cockpit
%end
%addon com_redhat_kdump --enable --reserve-mb='auto'
%end
%anaconda
pwpolicy root --minlen=6 --minquality=1 --notstrict --nochanges --notempty
pwpolicy user --minlen=6 --minquality=1 --notstrict --nochanges --emptyok
pwpolicy luks --minlen=6 --minquality=1 --notstrict --nochanges --notempty
%end
%post --erroronfail --log=/root/ks-post.log
#!/bin/sh
dnf update -y
# NFS storage
mkdir -p /opt/ovirt/nfs-storage
chown -R 36:36 /opt/ovirt/nfs-storage
chmod 0755 /opt/ovirt/nfs-storage
echo "/opt/ovirt/nfs-storage localhost" > /etc/exports
echo "/opt/ovirt/nfs-storage engine.*REDACTED*" >> /etc/exports
dnf install -y nfs-utils
systemctl enable nfs-server.service
# Install ovirt packages
dnf install -y https://resources.ovirt.org/pub/yum-repo/ovirt-release44.rpm
dnf install -y cockpit-ovirt-dashboard ovirt-engine-appliance
# Enable cockpit
systemctl enable cockpit.socket
%end
#reboot --eject --kexec
reboot --eject
--- Host (nxtvirt) ip -a post deploy-run ---
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:ad:79:1b brd ff:ff:ff:ff:ff:ff
    inet 192.168.199.250/24 brd 192.168.199.255 scope global noprefixroute enp1s0
       valid_lft forever preferred_lft forever
    inet6 *REDACTED*:1099:babe::250/64 scope global noprefixroute
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fead:791b/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
5: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:94:f3:14 brd ff:ff:ff:ff:ff:ff
    inet6 fd00:1234:5678:900::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe94:f314/64 scope link
       valid_lft forever preferred_lft forever
6: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:94:f3:14 brd ff:ff:ff:ff:ff:ff
7: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:68:d3:8a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe68:d38a/64 scope link
       valid_lft forever preferred_lft forever
--- iptables-save post deploy-run ---
# Generated by iptables-save v1.8.4 on Sun Jun 28 13:20:53 2020
*filter
:INPUT ACCEPT [4007:8578553]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [3920:7633249]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWX - [0:0]
-A INPUT -j LIBVIRT_INP
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A OUTPUT -j LIBVIRT_OUT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p tcp -m tcp --dport 68 -j ACCEPT
-A LIBVIRT_FWO -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWI -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWX -i virbr0 -o virbr0 -j ACCEPT
COMMIT
# Completed on Sun Jun 28 13:20:53 2020
# Generated by iptables-save v1.8.4 on Sun Jun 28 13:20:53 2020
*security
:INPUT ACCEPT [3959:8576054]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [3920:7633249]
COMMIT
# Completed on Sun Jun 28 13:20:53 2020
# Generated by iptables-save v1.8.4 on Sun Jun 28 13:20:53 2020
*raw
:PREROUTING ACCEPT [4299:8608260]
:OUTPUT ACCEPT [3920:7633249]
COMMIT
# Completed on Sun Jun 28 13:20:53 2020
# Generated by iptables-save v1.8.4 on Sun Jun 28 13:20:53 2020
*mangle
:PREROUTING ACCEPT [4299:8608260]
:INPUT ACCEPT [4007:8578553]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [3920:7633249]
:POSTROUTING ACCEPT [3923:7633408]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
COMMIT
# Completed on Sun Jun 28 13:20:53 2020
# Generated by iptables-save v1.8.4 on Sun Jun 28 13:20:53 2020
*nat
:PREROUTING ACCEPT [337:32047]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [159:9351]
:OUTPUT ACCEPT [159:9351]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
COMMIT
# Completed on Sun Jun 28 13:20:53 2020
Weird problem starting VMs in oVirt-4.4
by Joop
Hi All,
Just had a rather new experience: starting a VM worked, but the VM
dropped into the grub2 rescue console because something was wrong with
its virtio-scsi disk.
The message is: Booting from Hard Disk ....
error: ../../grub-core/kern/dl.c:266:invalid arch-independent ELF magic.
entering rescue mode...
Doing a Ctrl-Alt-Del through the SPICE console lets the VM boot
correctly. Shutting it down and repeating the procedure, I get the disk
problem every time. The weird thing is that if I activate the boot menu
and then start the VM straight away, all is OK.
I don't see any ERROR messages in either vdsm.log or engine.log.
If I had to guess, it looks like the disk image isn't connected yet when
the VM boots, but that's weird, isn't it?
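For reference, this is roughly where I have been looking so far, without finding anything obvious (VMNAME is a placeholder):

# Disk/controller related lines in the libvirt domain log for the VM:
grep -i 'scsi' /var/log/libvirt/qemu/VMNAME.log | tail -n 20

# VM start-up as seen by VDSM around the same time:
grep -i 'VMNAME' /var/log/vdsm/vdsm.log | tail -n 50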
Regards,
Joop
Ovirt 4.3.10 Glusterfs SSD slow performance over 10GE
by jury cat
Hello all,
I am using oVirt 4.3.10 on CentOS 7.8 with GlusterFS 6.9.
My Gluster setup consists of 3 hosts in replica 3 (2 data hosts + 1 arbiter).
All 3 hosts are Dell R720s with a PERC H710 Mini RAID controller (maximum
throughput 6 Gb/s) and 2 x 1 TB Samsung SSDs in RAID 0. The volume is
partitioned using LVM thin provisioning and formatted as XFS.
The hosts have separate 10GbE network cards for storage traffic.
The Gluster network is connected to these 10GbE network cards and the
volume is mounted using FUSE GlusterFS (NFS is disabled). The migration
network is also activated on the same storage network.
The problem is that the 10GbE network is not used to its full potential
by Gluster.
If I do live migration of VMs I see speeds of 7-9 Gbit/s.
The same network tested with iperf3 reported 9.9 Gbit/s, which excludes
the network setup as a bottleneck (I will not paste all the iperf3 tests
here for now).
I did not enable all the volume options from "Optimize for Virt Store",
because of the bug that prevents setting the volume option
cluster.granular-entry-heal to enable (this was fixed in vdsm-4.40, but
that only works on CentOS 8 with oVirt 4.4).
I would be happy to know what all these "Optimize for Virt Store" options
are, so I can set them manually.
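As far as I know (please correct me if I am wrong), the "Optimize for Virt Store" checkbox is backed by the 'virt' option group that ships with the glusterfs packages, so something like this should show and apply the same options manually (VOLNAME is a placeholder):

# The option group file shipped on each gluster host:
cat /var/lib/glusterd/groups/virt

# Apply the whole group to the volume, then verify:
gluster volume set VOLNAME group virt
gluster volume info VOLNAME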
The write speed of the disks inside the host, measured with dd, is
between 700 MB/s and 1 GB/s:
[root@host1 ~]# dd if=/dev/zero of=test bs=100M count=40 count=80 status=progress
8074035200 bytes (8.1 GB) copied, 11.059372 s, 730 MB/s
80+0 records in
80+0 records out
8388608000 bytes (8.4 GB) copied, 11.9928 s, 699 MB/s
The dd write test on the gluster volume inside the host is poor, only
~120 MB/s.
During the dd test, if I look at Networks -> Gluster network -> Hosts,
the Tx and Rx network speed barely reaches over 1 Gbit/s (~1073 Mbit/s)
out of a maximum of 10000 Mbit/s.
dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/gluster1.domain.local\:_data/test bs=100M count=80 status=progress
8283750400 bytes (8.3 GB) copied, 71.297942 s, 116 MB/s
80+0 records in
80+0 records out
8388608000 bytes (8.4 GB) copied, 71.9545 s, 117 MB/s
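One caveat I am aware of: without oflag=direct these dd runs partly measure the page cache, so for comparing the brick with the FUSE mount it is probably fairer to repeat both tests roughly like this (same paths as above; if the FUSE mount refuses O_DIRECT, conv=fdatasync is a rougher alternative):

dd if=/dev/zero of=test bs=1M count=4096 oflag=direct status=progress
dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/gluster1.domain.local\:_data/test bs=1M count=4096 oflag=direct status=progress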
I have attached my Gluster volume settings and mount options.
Thanks,
Emy
oVirt-node 4.4.0 - Hosted engine deployment fails when host is unable to download updates
by Marco Fais
Hi,
fresh installation of oVirt-node 4.4.0 on a cluster -- the hosted-engine
--deploy command fails if DNF is unable to download updates.
This cluster is not connected to the public network at the moment.
If I use a proxy (setting the relevant env. variables) it fails at a later
stage (I think the engine VM is trying to download updates as well, but
encounters the same issue and doesn't seem to use the proxy).
With oVirt-node 4.3.x I didn't have this issue -- any suggestions?
[~]# hosted-engine --deploy
[ INFO ] Stage: Initializing
[ INFO ] Stage: Environment setup
During customization use CTRL-D to abort.
Continuing will configure this host for serving as hypervisor and
will create a local VM with a running engine.
The locally running engine will be used to configure a new
storage domain and create a VM there.
At the end the disk of the local VM will be moved to the shared
storage.
Are you sure you want to continue? (Yes, No)[Yes]:
Configuration files:
Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20200604214638-vae2wf.log
Version: otopi-1.9.1 (otopi-1.9.1-1.el8)
[ INFO ] DNF Downloading 1 files, 0.00KB
[ INFO ] DNF Downloaded Extra Packages for Enterprise Linux 8 - x86_64
[ ERROR ] DNF Failed to download metadata for repo 'ovirt-4.4-epel'
[ ERROR ] DNF Failed to download metadata for repo 'ovirt-4.4-epel'
[ ERROR ] Failed to execute stage 'Environment setup': Failed to download metadata for repo 'ovirt-4.4-epel'
[ INFO ] Stage: Clean up
[...]
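One thing I have not tried yet (and I am not sure whether the locally built engine VM would inherit it) is pointing dnf itself at the proxy instead of relying on environment variables; the proxy URL below is a placeholder:

# dnf reads this for every repo operation on the host, including the ones
# triggered by hosted-engine --deploy:
echo 'proxy=http://proxy.example.com:3128' >> /etc/dnf/dnf.conf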
Thanks,
Marco
New fenceType in oVirt code for IBM OpenBMC
by Vinícius Ferrão
Hello,
After some days of scratching my head I found that oVirt is probably missing a fenceType for IBM's implementation of OpenBMC in the Power Management section. The host machine is an OpenPOWER AC922 (ppc64le).
The BMC is basically an "ipmilan" device, but the cipher must be set to 3 or 17:
[root@h01 ~]# ipmitool -I lanplus -H 10.20.10.2 -U root -P 0penBmc -L operator -C 3 channel getciphers ipmi
ID IANA Auth Alg Integrity Alg Confidentiality Alg
3 N/A hmac_sha1 hmac_sha1_96 aes_cbc_128
17 N/A hmac_sha256 sha256_128 aes_cbc_128
The default ipmilan connector forces the option cipher=1 which breaks the communication.
So I was reading the code and found this "fenceType" class, but I wasn't able to find where those classes are defined, so that I could create another one, called something like openbmc, that sets cipher=17 by default.
Another issue is how unhelpful the error output is: it only returns a generic JSON-RPC error. I don't know how to suggest a fix for this either.
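In the meantime, my plan is to test the fence agent directly from one of the hosts with the cipher overridden (same BMC address and credentials as above), and, if that works, to try passing cipher=17 in the Power Management "Options" field, although I am not sure the engine forwards arbitrary options to fence_ipmilan:

# Manual test of the stock agent with a non-default cipher:
fence_ipmilan --ip=10.20.10.2 --username=root --password=0penBmc --lanplus --cipher=17 --action=status --verbose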
Thanks,
VMs shutdown mysteriously
by Bobby
Hello,
All 4 VMs on one of my oVirt cluster nodes shut down for an unknown
reason, almost simultaneously.
Please help me find the root cause.
Thanks.
Please note that the host seems to be doing fine, it never crashed or
hung, and I could migrate the VMs back to it later.
Here is the exact timeline of all the related events combined from the host
and the VM(s):
On oVirt host:
/var/log/vdsm/vdsm.log:
2020-06-25 15:25:16,944-0500 WARN (qgapoller/3)
[virt.periodic.VmDispatcher] could not run <function <lambda> at
0x7f4ed2f9f5f0> on ['e0257b06-28fd-4d41-83a9-adf1904d3622'] (periodic:289)
2020-06-25 15:25:19,203-0500 WARN (libvirt/events) [root] File:
/var/lib/libvirt/qemu/channels/e0257b06-28fd-4d41-83a9-adf1904d3622.ovirt-guest-agent.0
already removed (fileutils:54)
2020-06-25 15:25:19,203-0500 WARN (libvirt/events) [root] File:
/var/lib/libvirt/qemu/channels/e0257b06-28fd-4d41-83a9-adf1904d3622.org.qemu.guest_agent.0
already removed (fileutils:54)
[root@athos log]# journalctl -u NetworkManager --since=today
-- Logs begin at Wed 2020-05-20 22:07:33 CDT, end at Thu 2020-06-25
16:36:05 CDT. --
Jun 25 15:25:18 athos NetworkManager[1600]: <info> [1593116718.1136]
device (vnet0): state change: disconnected -> unmanaged (reason
'unmanaged', sys-iface-state: 'removed')
Jun 25 15:25:18 athos NetworkManager[1600]: <info> [1593116718.1146]
device (vnet0): released from master device SRV-VL
/var/log/messages:
Jun 25 15:25:18 athos kernel: SRV-VL: port 2(vnet0) entered disabled state
Jun 25 15:25:18 athos NetworkManager[1600]: <info> [1593116718.1136]
device (vnet0): state change: disconnected -> unmanaged (reason
'unmanaged', sys-iface-state: 'removed')
Jun 25 15:25:18 athos NetworkManager[1600]: <info> [1593116718.1146]
device (vnet0): released from master device SRV-VL
Jun 25 15:25:18 athos libvirtd: 2020-06-25 20:25:18.122+0000: 2713: error :
qemuMonitorIO:718 : internal error: End of file from qemu monitor
/var/log/libvirt/qemu/aries.log:
2020-06-25T20:25:28.353975Z qemu-kvm: terminating on signal 15 from pid
2713 (/usr/sbin/libvirtd)
2020-06-25 20:25:28.584+0000: shutting down, reason=shutdown
=============================================================================================
On the first VM effected (same thing on others):
/var/log/ovirt-guest-agent/ovirt-guest-agent.log:
MainThread::INFO::2020-06-25
15:25:20,270::ovirt-guest-agent::104::root::Stopping oVirt guest agent
CredServer::INFO::2020-06-25
15:25:20,626::CredServer::262::root::CredServer has stopped.
MainThread::INFO::2020-06-25
15:25:21,150::ovirt-guest-agent::78::root::oVirt guest agent is down.
=============================================================================================
Package versions installed:
Host OS version: CentOS 7.7.1908:
ovirt-hosted-engine-ha-2.3.5-1.el7.noarch
ovirt-provider-ovn-driver-1.2.22-1.el7.noarch
ovirt-release43-4.3.6-1.el7.noarch
ovirt-imageio-daemon-1.5.2-0.el7.noarch
ovirt-vmconsole-1.0.7-2.el7.noarch
ovirt-imageio-common-1.5.2-0.el7.x86_64
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
ovirt-vmconsole-host-1.0.7-2.el7.noarch
ovirt-host-4.3.4-1.el7.x86_64
libvirt-4.5.0-23.el7_7.1.x86_64
libvirt-daemon-4.5.0-23.el7_7.1.x86_64
qemu-kvm-ev-2.12.0-33.1.el7.x86_64
qemu-kvm-common-ev-2.12.0-33.1.el7.x86_64
On guest VM:
ovirt-guest-agent-1.0.13-1.el6.noarch
qemu-guest-agent-0.12.1.2-2.491.el6_8.3.x86_64
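One more data point I can collect if it helps: the qemu log above says the guests were terminated with signal 15 by libvirtd (pid 2713), so something asked libvirtd (or libvirt-guests) to stop them. This is what I plan to check around the 15:25 window:

journalctl -u libvirtd --since "2020-06-25 15:20" --until "2020-06-25 15:30"
journalctl --since "2020-06-25 15:20" --until "2020-06-25 15:30" | grep -iE 'oom|libvirt|shutdown|sanlock'
last -x | head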
status of oVirt 4.4.x and CentOS 8.2
by Gianluca Cecchi
Hello,
what is the current status, both when using plain CentOS based nodes and
when using ovirt-node-ng?
Does the release of CentOS 8.2 impact new installations of 4.4.0 and/or
4.4.1 RC?
Thanks,
Gianluca
Localdisk hook not working
by tim-nospam@bordemann.com
Hi,
I'd like to use the localdisk hook of vdsm and have configured everything according to the readme: https://github.com/oVirt/vdsm/tree/master/vdsm_hooks/localdisk
After installing the hook, configuring the ovirt-engine, creating the volume group, adding the custom property 'localdisk' to the virtual machine and fixing a small bug in the localdisk-helper, vdsm creates the logical volume in the 'ovirt-local' volume group when the virtual machine is started. Unfortunately, nothing more seems to happen after that. There is no activity on the NAS that would indicate the disk image is being pulled to the host, and I see no errors on the host either.
Is anyone currently running ovirt 4.4 with the localdisk hook?
What else can I do to find out why the image is not being copied to my host?
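For the record, this is roughly how I looked for errors so far, assuming the hook logs through VDSM like the other hooks do:

# Hook and helper messages should end up in the VDSM log:
grep -iE 'localdisk|hook' /var/log/vdsm/vdsm.log | tail -n 50

# The logical volume that was created, with its tags and size:
lvs -a -o +lv_tags ovirt-local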
Thanks,
Tim
Re: VDSM not binding on IPv4 after ovirt-engine restart
by Dominik Holler
On Tue, Jun 30, 2020 at 6:13 PM Erez Zarum <erezz(a)nanosek.com> wrote:
> While troubleshooting a fresh installation of (after a failed one) that
> caused all the hosts but the one running the hosted-engine to become in
> “Unassigned” state I noticed that the ovirt-engine complains about not
> being able to contact the VDSM.
> I noticed that VDSM has stopped listening on IPv4.
>
>
Thanks for sharing the details.
> I didn’t disable any IPv6 as it states not to disable it on hosts that are
> capable running the hosted-engine and it seems that the reason behind it
> is that the hosted-engine talks to the host it runs on through “localhost”,
> this also explains why the host which the hosted-engine runs on is “OK”.
>
> Below is from a host that does not run the hosted-engine:
> # ss -atn | grep 543
> LISTEN 0 5 *:54322 *:*
> ESTAB 0 0 127.0.0.1:54792 127.0.0.1:54321
> ESTAB 0 0 127.0.0.1:54798 127.0.0.1:54321
> LISTEN 0 5 [::]:54321 [::]:*
> ESTAB 0 0 [::ffff:127.0.0.1]:54321
> [::ffff:127.0.0.1]:54798
> ESTAB 0 0 [::ffff:127.0.0.1]:54321
> [::ffff:127.0.0.1]:54792
> ESTAB 0 0 [::1]:54321 [::1]:50238
> ESTAB 0 0 [::1]:50238 [::1]:54321
>
> Below is from a host that runs the hosted-engine at the moment:
> # ss -atn | grep 543
> LISTEN 0 5 *:54322 *:*
> LISTEN 0 5 [::]:54321 [::]:*
> ESTAB 0 0 [::1]:51230 [::1]:54321
> ESTAB 0 0 [::1]:54321 [::1]:51242
> ESTAB 0 0 [::ffff:10.46.20.23]:54321
> [::ffff:10.46.20.20]:45706
> ESTAB 0 0 [::ffff:10.46.20.23]:54321
> [::ffff:10.46.20.20]:45746
> ESTAB 0 0 [::1]:51240 [::1]:54321
> ESTAB 0 0 [::1]:54321 [::1]:51230
> ESTAB 0 0 [::1]:51242 [::1]:54321
> ESTAB 0 0 [::1]:54321 [::1]:51240
>
> The hosted-engine IP is 10.46.20.20 and the host is 10.46.20.23.
>
>
Why do you think the host does not listen on IPv4 anymore?
Can you please share the output of
"nc -vz 10.46.20.23 54321"
executed on the engine VM or on another host?
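Just to explain the question: a socket bound to [::]:54321 without the v6only flag is dual-stack, and the established sessions from [::ffff:10.46.20.20] in your output are IPv4 connections shown as IPv4-mapped addresses. Two quick checks on the host (purely a sketch):

# 1 would force v6-only sockets; the default 0 lets [::] listeners accept IPv4:
sysctl net.ipv6.bindv6only

# The vdsm listener with flags -- look for the absence of a v6only marker:
ss -tlnpe | grep 54321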
> /etc/hosts on all hosts:
> 127.0.0.1 localhost localhost.localdomain localhost4
> localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6
> localhost6.localdomain6
>
> Perhaps this is relevant but all hosts are enrolled into IDM (FreeIPA) and
> as an outcome they all have a DNS record and a PTR record as well as the
> ovirt-engine VM.
>
> # cat /etc/vdsm/vdsm.conf
> [vars]
> ssl = true
> ssl_ciphers = HIGH:!aNULL
> ssl_excludes = OP_NO_TLSv1,OP_NO_TLSv1_1
>
> [addresses]
> management_port = 54321
>
> I have tried adding “management_ip = 0.0.0.0” but then it only binds to
> IPv4 and yet, the host still shows as Unassigned, sometimes it switches to
> “NonResponsive” and trying to “Reinstall” the host fails, the ovirt-engine
> complains it can't contact/reach the VDSM, while using netcat from the
> ovirt-engine it works.
>
> I have KSM and Memory Ballooning enabled on the Cluster as well.
>
> oVirt 4.3.10 installed on CentOS 7.8.2003
> The self-hosted Engine runs on an external GlusterFS, before reinstalling
> everything (fresh start of OS, etc..) I tried iSCSI as well.
>
>
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/NCTWZLS2VPI...
>
VDSM not binding on IPv4 after ovirt-engine restart
by Erez Zarum
While troubleshooting a fresh installation (after a failed one) in which all the hosts except the one running the hosted-engine ended up in the “Unassigned” state, I noticed that the ovirt-engine complains about not being able to contact VDSM.
I noticed that VDSM has stopped listening on IPv4.
I didn't disable IPv6 anywhere, as the documentation states not to disable it on hosts that are capable of running the hosted-engine. The reason seems to be that the hosted-engine talks to the host it runs on through “localhost”; this also explains why the host that the hosted-engine currently runs on is “OK”.
Below is from a host that does not run the hosted-engine:
# ss -atn | grep 543
LISTEN 0 5 *:54322 *:*
ESTAB 0 0 127.0.0.1:54792 127.0.0.1:54321
ESTAB 0 0 127.0.0.1:54798 127.0.0.1:54321
LISTEN 0 5 [::]:54321 [::]:*
ESTAB 0 0 [::ffff:127.0.0.1]:54321 [::ffff:127.0.0.1]:54798
ESTAB 0 0 [::ffff:127.0.0.1]:54321 [::ffff:127.0.0.1]:54792
ESTAB 0 0 [::1]:54321 [::1]:50238
ESTAB 0 0 [::1]:50238 [::1]:54321
Below is from a host that runs the hosted-engine at the moment:
# ss -atn | grep 543
LISTEN 0 5 *:54322 *:*
LISTEN 0 5 [::]:54321 [::]:*
ESTAB 0 0 [::1]:51230 [::1]:54321
ESTAB 0 0 [::1]:54321 [::1]:51242
ESTAB 0 0 [::ffff:10.46.20.23]:54321 [::ffff:10.46.20.20]:45706
ESTAB 0 0 [::ffff:10.46.20.23]:54321 [::ffff:10.46.20.20]:45746
ESTAB 0 0 [::1]:51240 [::1]:54321
ESTAB 0 0 [::1]:54321 [::1]:51230
ESTAB 0 0 [::1]:51242 [::1]:54321
ESTAB 0 0 [::1]:54321 [::1]:51240
The hosted-engine IP is 10.46.20.20 and the host is 10.46.20.23.
/etc/hosts on all hosts:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Perhaps this is relevant but all hosts are enrolled into IDM (FreeIPA) and as an outcome they all have a DNS record and a PTR record as well as the ovirt-engine VM.
# cat /etc/vdsm/vdsm.conf
[vars]
ssl = true
ssl_ciphers = HIGH:!aNULL
ssl_excludes = OP_NO_TLSv1,OP_NO_TLSv1_1
[addresses]
management_port = 54321
I have tried adding “management_ip = 0.0.0.0”, but then it only binds to IPv4 and the host still shows as Unassigned, sometimes switching to “NonResponsive”, and trying to “Reinstall” the host fails: the ovirt-engine complains it can't contact/reach the VDSM, while using netcat from the ovirt-engine works.
I have KSM and Memory Ballooning enabled on the Cluster as well.
oVirt 4.3.10 installed on CentOS 7.8.2003
The self-hosted Engine runs on an external GlusterFS, before reinstalling everything (fresh start of OS, etc..) I tried iSCSI as well.