oVirt 4.4.10 lab unstable on ESXi
by Mohamed Roushdy
Hello,
I’m used to running nested ESXi on another ESXi host, but we are testing oVirt these days, and we are facing strange behavior with the engine VM. I have a hyper-converged oVirt cluster with a hosted engine, and whenever I reboot or shut down one of the nodes for testing (a node that is not hosting the engine VM, of course), the engine becomes extremely slow, starts losing network connectivity to the remaining (healthy) nodes in the cluster, and I can hardly SSH into it. Ping also drops many packets while the other node is down.
Once the faulty node is up again the engine works fine, but this is really limiting my ability to run further oVirt HA tests before moving to production. I thought maybe promiscuous mode on the physical host causes this, but I never faced this with nested virtualization on VMware, and turning off promiscuous mode isn’t an option either. What do you think, please?
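For anyone comparing notes: nested hypervisors on ESXi usually need the port group (or vSwitch) carrying the nested hosts' traffic to accept promiscuous mode, MAC address changes and forged transmits, not just promiscuous mode alone. A minimal sketch of checking and setting that on a standard vSwitch from the ESXi shell (vSwitch0 is a placeholder for the actual switch name; this is a generic check, not a confirmed fix for this setup):
```bash
# Show the current security policy of the vSwitch carrying the nested hosts' traffic
esxcli network vswitch standard policy security get --vswitch-name=vSwitch0

# Accept promiscuous mode, MAC address changes and forged transmits
# (all three are commonly required for nested virtualization to behave)
esxcli network vswitch standard policy security set --vswitch-name=vSwitch0 \
    --allow-promiscuous=true --allow-mac-change=true --allow-forged-transmits=true
```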
Mohamed Roushdy
Team Member – Systems Administrator
M: +31 61 55 94 300
3 years
VM failed to start when host's network is down
by lizhijian@fujitsu.com
Posting again after subscribing to the mailing list.
Hi guys
I have an all-in-one oVirt environment where the node has both
vdsm and ovirt-engine installed.
I have set up the oVirt environment and it works well.
For some reasons, I have to use this oVirt setup with the node's networking down (I unplugged the network cable).
In that case, I noticed that I cannot start a VM anymore.
I wonder if there is a configuration switch that enables oVirt to keep working with the node's networking down?
If not, is it possible to make it work in an easy way?
When I try to start a VM with the oVirt API, it responds with:
```bash
[root@74d2ab9cb0 ~]# sh start.sh
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<action>
<async>false</async>
<fault>
<detail>[Cannot run VM. Unknown Data Center status.]</detail>
<reason>Operation Failed</reason>
</fault>
<status>failed</status>
</action>
[root@74d2ab9cb0 ~]# sh start.sh
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<action>
<async>false</async>
<fault>
<detail>[Cannot run VM. Unknown Data Center status.]</detail>
<reason>Operation Failed</reason>
</fault>
<status>failed</status>
</action>
[root@74d2ab9cb0 ~]#
```
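For context, start.sh itself is not shown; a minimal sketch of what such a call usually looks like against the oVirt REST API (the engine FQDN, password and VM id below are placeholders, and this is an assumption about the script, not the actual attachment):
```bash
#!/bin/bash
# Start a VM through the oVirt REST API.
# Placeholders: engine FQDN, admin password and VM id.
ENGINE="https://engine.example.com/ovirt-engine/api"
VM_ID="00000000-0000-0000-0000-000000000000"

curl -k -s -X POST \
     -u "admin@internal:password" \
     -H "Content-Type: application/xml" \
     -H "Accept: application/xml" \
     -d "<action/>" \
     "${ENGINE}/vms/${VM_ID}/start"
```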
Attached are the vdsm and ovirt-engine logs.
Thanks
Zhijian
3 years
Wait for the engine to come up on the target VM
by Vladimir Belov
I'm trying to deploy oVirt with a self-hosted engine, but at the last step I get an engine startup error.
[ INFO ] TASK [Wait for the engine to come up on the target VM]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.181846", "end": "2022-03-28 15:41:28.853150", "rc": 0, "start": "2022-03-28 15:41:28.671304", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5537 (Mon Mar 28 15:41:20 2022)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5537 (Mon Mar 28 15:41:20 2022)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"v2.test.ru\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"4d2eeaea\", \"local_conf_timestamp\": 5537, \"host-ts\": 5537}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5537 (Mon Mar 28 15:41:20 2022)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5537 (Mon Mar 28 15:41:20 2022)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"v2.test.ru\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"4d2eeaea\", \"local_conf_timestamp\": 5537, \"host-ts\": 5537}, \"global_maintenance\": false}"]}
After the installation completed, the status of the engine is as follows:
Engine status: {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
After reading the vdsm logs, I found that the QEMU guest agent in the engine VM is not connecting for some reason.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5400, in qemuGuestAgentShutdown
self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)
File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2517, in shutdownFlags
if ret == -1: raise libvirtError ('virDomainShutdownFlags() failed', dom=self)
libvirtError: Guest agent is not responding: QEMU guest agent is not connected
During the installation phase, qemu-guest-agent on the guest VM is running.
Setting a temporary password (hosted-engine --add-console-password --password) and connecting via VNC also failed.
Using "hosted-engine --console" also failed to connect
The engine VM is running on this host
Connected to HostedEngine domain
Escaping character: ^]
error: internal error: character device <null> not found
The network settings are configured using static addressing, without DHCP.
It seems to me that this happens because the engine VM receives an IP address that does not match its entry in /etc/hosts, but I do not know how to fix it. Any help is welcome; I will provide any necessary logs. Thanks
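In case it helps others debug the same state: the "failed liveliness check" is the HA agent polling the engine's health page via the engine FQDN, so it is worth confirming from the host that the FQDN resolves to the address the engine VM actually got and that the health page answers. A rough sketch (engine.test.ru is a placeholder for the real engine FQDN):
```bash
# The FQDN the host resolves must match the IP actually configured inside the engine VM
getent hosts engine.test.ru

# The liveliness check polls the engine health servlet; try it from the host
curl -sk https://engine.test.ru/ovirt-engine/services/health

# Inside the engine VM console, compare with the address it really received:
#   ip addr show
```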
3 years
Gluster storage and TRIM VDO
by Oleh Horbachov
Hello everyone. I have a Gluster distributed-replicated cluster deployed. The cluster is the storage backend for oVirt, and the bricks are VDO volumes on top of raw disks. When discarding via 'fstrim -av', the storage hangs for a few seconds and the connection is lost. Does anyone know the best practices for using TRIM with VDO in the context of oVirt?
ovirt - v4.4.10
gluster - v8.6
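Not a validated recommendation, but one workaround sometimes tried while investigating is to avoid 'fstrim -av' and discard the brick filesystems one at a time, so only one VDO volume is processing discards at any moment. A sketch, assuming the bricks are mounted under /gluster_bricks:
```bash
#!/bin/bash
# Trim Gluster brick mounts one by one instead of all at once with 'fstrim -av'.
# Assumes bricks are mounted under /gluster_bricks; adjust to your layout.
for brick in /gluster_bricks/*; do
    echo "Trimming ${brick}"
    fstrim -v "${brick}"
    # Give VDO time to work through the discards before the next brick
    sleep 30
done
```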
3 years
MAC address pool issues
by Nicolas MAIRE
Hi,
We're encountering some issues on one of our production clusters running oVirt 4.2. We had an incident with the engine's database a few weeks back that we were able to recover from; however, since then we've been having a bunch of weird issues, mostly around MAC addresses.
It started with the engine being unable to find a free MAC address when creating a VM, despite there being significantly fewer virtual interfaces (around 250) than the total number of MACs in the default pool (default configuration, so 65536 addresses). It then escalated into the engine creating duplicate MACs (despite the pool not allowing them), and now we can't even modify the pool or remove VMs (since deleting the attached vNICs fails). So we're stuck with a cluster whose running VMs are fine as long as we don't touch them, but on which we can't create new VMs or modify the existing ones.
In the engine's log we can see an "Unable to initialize MAC pool due to existing duplicates (Failed with error MAC_POOL_INITIALIZATION_FAILED and code 5010)" error from when we tried to reconfigure the pool this morning (full error stack here: https://pastebin.com/6bKMfbLn). Now, whenever we try to delete a VM or reconfigure the pool, we get a 'Pool for id="58ca604b-017d-0374-0220-00000000014e" does not exist' error (full error stack here: https://pastebin.com/Huy91iig). Yet if we check the engine's mac_pools table, we can see that it's there:
engine=# select * from mac_pools;
id | name | description | allow_duplicate_mac_addresses | default_pool
--------------------------------------+---------+------------------+-------------------------------+--------------
58ca604b-017d-0374-0220-00000000014e | Default | Default MAC pool | f | t
(1 row)
engine=# select * from mac_pool_ranges;
mac_pool_id | from_mac | to_mac
--------------------------------------+-------------------+-------------------
58ca604b-017d-0374-0220-00000000014e | 56:6f:1a:1a:00:00 | 56:6f:1a:1a:ff:ff
(1 row)
I found this Bugzilla entry that seems to apply: https://bugzilla.redhat.com/show_bug.cgi?id=1554180. However, I don't really know how to "reinitialize the engine", especially considering that the MAC pool was not configured to allow duplicate MACs to begin with, and I have no idea what the impact of that reinitialization would be on the current VMs.
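For what it's worth, before reinitializing anything it may help to see which vNICs actually hold duplicate MACs. A read-only sketch against the engine database, assuming the usual vm_interface table layout (run on the engine host with access to the engine DB):
```bash
# List MAC addresses used by more than one vNIC
psql -d engine -c "SELECT mac_addr, COUNT(*) AS uses
                   FROM vm_interface
                   GROUP BY mac_addr
                   HAVING COUNT(*) > 1;"

# Show which VMs/vNICs hold those duplicate MACs
psql -d engine -c "SELECT vm_guid, name, mac_addr
                   FROM vm_interface
                   WHERE mac_addr IN (SELECT mac_addr FROM vm_interface
                                      GROUP BY mac_addr HAVING COUNT(*) > 1)
                   ORDER BY mac_addr;"
```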
I'm quite new to oVirt (I've only been using it for one year), so any help would be greatly appreciated.
3 years
How to create a user in the web UI?
by jihwahn1018@naver.com
Hello,
According to the oVirt guide, we need to create users with ovirt-aaa-jdbc-tool,
and only users created by ovirt-aaa-jdbc-tool can be added.
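For reference, the CLI flow the guide describes looks roughly like this (the user name, attributes and validity date are placeholders):
```bash
# Create a user in the internal domain and set an initial password
ovirt-aaa-jdbc-tool user add testuser \
    --attribute=firstName=Test --attribute=lastName=User

ovirt-aaa-jdbc-tool user password-reset testuser \
    --password-valid-to="2030-01-01 00:00:00Z"

# The user still needs permissions afterwards, which can be granted
# in the web UI (Administration -> Users -> Add) or via the API.
```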
Is there any way to create a user in the web UI?
And if not, is there a reason why creating users in the web UI is blocked?
Thank you.
3 years
Duplicate nameserver on host causing unassigned state when adding. Possible bug?
by ravi k
Hello all,
We are running oVirt 4.3.10.4-1.0.22.el7. I noticed an interesting issue, possibly a bug, yesterday. I was trying to add a host and it kept failing, with the host status going into the 'Unassigned' state.
I saw the below error in the engine log.
/var/log/ovirt-engine/engine.log
2022-04-07 15:17:07,739+04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-24723) [4917a348] HostName = olvsrv005u
2022-04-07 15:17:07,739+04 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-24723) [4917a348] Failed in 'CollectVdsNetworkDataAfterInstallationVDS' method, for vds: 'olvsrv005u'; host: '10.119.6.232': CallableStatementCallback; SQL [{call insertnameserver(?, ?, ?)}ERROR: duplicate key value violates unique constraint "name_server_pkey"
Detail: Key (dns_resolver_configuration_id, address)=(459b68e6-b684-4cf6-8834-755249a6bd3a, 10.119.10.212) already exists.
Where: SQL statement "INSERT INTO
name_server(
address,
position,
dns_resolver_configuration_id)
VALUES (
v_address,
v_position,
v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "name_server_pkey"
Detail: Key (dns_resolver_configuration_id, address)=(459b68e6-b684-4cf6-8834-755249a6bd3a, 10.119.10.212) already exists.
Then I checked resolv.conf on the host:
[root@olvsrv005u ~]# cat /etc/resolv.conf
# Version: 1.00
search uat.abc.com
nameserver 10.119.10.212
nameserver 10.119.10.212
Well, ideally there is no point in having a duplicate nameserver, but it was not affecting the functionality of the host. However, it was failing the addition of the host, probably because updating the host's configuration in the engine DB failed due to the duplicate nameserver.
To test this, I commented out the duplicate value and tried again. The host was then added successfully.
2022-04-07 15:33:37,301+04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoAsyncVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [] START, GetHardwareInfoAsyncVDSCommand(HostName = olvsrv005u, VdsIdA
ndVdsVDSCommandParametersBase:{hostId='459b68e6-b684-4cf6-8834-755249a6bd3a', vds='Host[olvsrv005u,459b68e6-b684-4cf6-8834-755249a6bd3a]'}), log id: 52e7ec52
2022-04-07 15:33:37,301+04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoAsyncVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [] FINISH, GetHardwareInfoAsyncVDSCommand, return: , log id: 52e7ec52
2022-04-07 15:33:37,356+04 INFO [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [3de72cb7] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :
ID: 459b68e6-b684-4cf6-8834-755249a6bd3a Type: VDS
2022-04-07 15:33:37,360+04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [3de72cb7] START, SetVdsStatusVDSCommand(HostName = olvsrv005u, SetVdsStatusVDSCommandPa
rameters:{hostId='459b68e6-b684-4cf6-8834-755249a6bd3a', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 1bfc90a3
2022-04-07 15:33:37,363+04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [3de72cb7] FINISH, SetVdsStatusVDSCommand, return: , log id: 1bfc90a3
2022-04-07 15:33:37,404+04 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [3de72cb7] Host 'olvsrv005u' is set to Non-Operational, it is missing the following networks
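For anyone hitting the same thing, the rows the engine was complaining about can be checked directly before retrying. A sketch using the dns_resolver_configuration_id from the error above (run on the engine host with access to the engine database):
```bash
# Inspect what the engine has recorded for this host's DNS resolver configuration
psql -d engine -c "SELECT * FROM name_server
                   WHERE dns_resolver_configuration_id =
                         '459b68e6-b684-4cf6-8834-755249a6bd3a';"
```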
Should I raise this as a bug? I'm of the opinion that it is one, because if a duplicate nameserver isn't breaking the host's functionality, then it shouldn't block adding the host either.
Regards,
Ravi
3 years
10Gbps iSCSI Bonding issue on HPE Gen10 server
by michael.li@hactlsolutions.com
Hi support,
I have an issue configuring the 10Gbps iSCSI ports on an HPE Gen10 server in oVirt 4.4.10. The behavior is as below:
1. The two 10Gbps ports are running at 10000Mb/s and are up, verified with ethtool.
2. I configure an iSCSI active-standby (active-backup) bond (bond0) in the oVirt manager.
3. When the server is rebooted, one of the ports is down (no carrier).
Temporary solution: I manually bring the port up with 'nmcli conn up "port name"' to restore the bond. Does anyone know how to resolve this? Let me know if any information is needed to investigate.
I enclose my configuration below.
[root@ovrphv04 network-scripts]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: ens1f1np1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0
Slave Interface: ens1f1np1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: f4:03:43:e7:da:38
Slave queue ID: 0
Slave Interface: ens4f1np1
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: f4:03:43:e7:d7:68
Slave queue ID: 0
[root@ovrphv04 network-scripts]# ip link |grep ens1f1np1
7: ens1f1np1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
[root@ovrphv04 network-scripts]# ip link |grep ens4f1np1
10: ens4f1np1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond0 state DOWN mode DEFAULT group default qlen 1000
[root@ovrphv04 network-scripts]# cat ifcfg-ens1f1np1
TYPE=Ethernet
MTU=9000
SRIOV_TOTAL_VFS=0
NAME=ens1f1np1
UUID=6072cd16-a45a-433e-bc9e-817557706fb2
DEVICE=ens1f1np1
ONBOOT=yes
LLDP=no
ETHTOOL_OPTS="speed 10000 duplex full autoneg on"
MASTER_UUID=77e39d1b-cf14-406f-87f6-fe954fde40f0
MASTER=bond0
SLAVE=yes
[root@ovrphv04 network-scripts]# cat ifcfg-ens4f1np1
TYPE=Ethernet
MTU=9000
SRIOV_TOTAL_VFS=0
NAME=ens4f1np1
UUID=cf6b28e7-8fdb-472f-bbe9-0eecc95c5c3c
DEVICE=ens4f1np1
ONBOOT=yes
LLDP=no
ETHTOOL_OPTS="speed 10000 duplex full autoneg on"
MASTER_UUID=77e39d1b-cf14-406f-87f6-fe954fde40f0
MASTER=bond0
SLAVE=yes
[root@ovrphv04 network-scripts]# cat ifcfg-bond0
BONDING_OPTS="mode=active-backup miimon=100"
TYPE=Bond
BONDING_MASTER=yes
HWADDR=
MTU=9000
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
IPADDR=172.26.14.139
PREFIX=24
DEFROUTE=yes
DHCP_CLIENT_ID=mac
IPV4_FAILURE_FATAL=no
IPV6_DISABLED=yes
IPV6INIT=no
DHCPV6_DUID=ll
DHCPV6_IAID=mac
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
NAME=bond0
UUID=77e39d1b-cf14-406f-87f6-fe954fde40f0
DEVICE=bond0
ONBOOT=yes
AUTOCONNECT_SLAVES=yes
LLDP=no
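Not a confirmed fix, but since bringing the port up manually with nmcli restores the bond, it may be worth checking how NetworkManager activates that port profile at boot. A purely diagnostic sketch, using the connection names from the config above and assuming the profiles are managed by NetworkManager:
```bash
# Check whether the bond and both port profiles are set to autoconnect at boot
nmcli -f NAME,UUID,AUTOCONNECT,DEVICE connection show

# If the ens4f1np1 profile is not set to autoconnect, enable it and re-activate it
nmcli connection modify ens4f1np1 connection.autoconnect yes
nmcli connection up ens4f1np1

# Confirm the port rejoined the bond
cat /proc/net/bonding/bond0
```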
3 years