February 2021 - Users - Ovirt List Archives

oVirt 4.4.4 not dislaying Sign-on Screen

by louisb＠ameritech.net

I recently installed oVirt 4.4.4, everything appeared to run successfully, however, I’m unable to get the sign-on screen to display on my RHEL 8.3 machine. I did do an install using the remote database installation. I encountered no issue with the install, I can see the “engine” and “ovirt_engine_history” in PostgreSQL 13.3. The tables and table spaces have all been created as designed by the installation process. When I enter the URL that was displayed once the installation completed I get the following information displayed on the screen: Internal Error I’ve tried using the port 443 and port 80 URL and get the same results. The URL are as follows: http://WorkstationName:80/ovirt-engine https://WorkstationName:443/ovirt-engine For three days I’ve been trying to locate logs that will help me located and resolve the problem, but so far I’ve had no luck. Can anyone tell where I should start looking to resolve my issue? I’ve checked several system logs and logs for Apache with luck. What should I be looking for within the logs? What logs should I be looking at? I don’t believe my issue is with the back-end database, I thinks its something with Apache not being able to find the web pages to display. How can I start to investigate my problem an develop a solution? My Linux/RHEL 8.3 OS has all of the latest patches and fixes applied. Thanks

4 years, 3 months

3
10
0 / 0

Problems after upgrade from 4.4.3 to 4.4.4

by tferic＠swissonline.ch

Hi I have problems after upgrading my 2-node cluster from 4.4.3 to 4.4.4. Initially, I performed the upgrade of the oVirt hosts using the oVirt GUI (I wasn't planning any changes). It appears that the upgrade broke the system. On host1, the ovirt-engine was configured to run on the oVirt host itself (not self-hosted engine). After the upgrade, the oVirt GUI didn't load in the Browser anymore. I tried to fix the issue by migrating to self-hosted engine, which did not work, so I ran engine restore and engine-setup in order to get back to the initial state. I am now able to login to the oVirt GUI again, but I am having the following problems: host1 is in status "Unassigned", and it has the SPM role. It cannot be set to maintenance mode, nor re-installed from GUI, but I am able to reboot the host from oVirt. All Storage Domains are inactive. (all NFS) In the /var/log/messages log, I can see the following message appearing frequently: "vdsm[5935]: ERROR ssl handshake: socket error, address: ::ffff:192.168.100.61" The cluster is down and no VM's can be run. I don't know how to fix either of the issues. Does anyone have an idea? I am appending a tar file containing log files to this email. http://gofile.me/5fp92/d7iGEqh3H <http://gofile.me/5fp92/d7iGEqh3H> Many thanks Toni

4 years, 3 months

2
7
0 / 0

Locked disks

by Giulio Casella

Since yesterday I found a couple VMs with locked disk. I don't know the reason, I suspect some interaction made by our backup system (vprotect, snapshot based), despite it's working for more than a year. I'd give a chance to unlock_entity.sh script, but it reports: CAUTION, this operation may lead to data corruption and should be used with care. Please contact support prior to running this command Do you think I should trust? Is it safe? VMs are in production... My manager is 4.4.4.7-1.el8 (CentOS stream 8), hosts are oVirt Node 4.4.4 TIA, Giulio

4 years, 3 months

3
7
0 / 0

supervdsm failing during network_caps

by Alan G

Hi, I have issues with one host where supervdsm is failing in network_caps. I see the following trace in the log. MainProcess|jsonrpc/1::ERROR::2020-01-06 03:01:05,558::supervdsm_server::100::SuperVdsm.ServerCallback::(wrapper) Error in network_caps Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 98, in wrapper res = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 56, in network_caps return netswitch.configurator.netcaps(compatibility=30600) File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 317, in netcaps net_caps = netinfo(compatibility=compatibility) File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 325, in netinfo _netinfo = netinfo_get(vdsmnets, compatibility) File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py", line 150, in get return _stringify_mtus(_get(vdsmnets)) File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py", line 59, in _get ipaddrs = getIpAddrs() File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/addresses.py", line 72, in getIpAddrs for addr in nl_addr.iter_addrs(): File "/usr/lib/python2.7/site-packages/vdsm/network/netlink/addr.py", line 33, in iter_addrs with _nl_addr_cache(sock) as addr_cache: File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__ return self.gen.next() File "/usr/lib/python2.7/site-packages/vdsm/network/netlink/__init__.py", line 92, in _cache_manager cache = cache_allocator(sock) File "/usr/lib/python2.7/site-packages/vdsm/network/netlink/libnl.py", line 469, in rtnl_addr_alloc_cache raise IOError(-err, nl_geterror(err)) IOError: [Errno 16] Message sequence number mismatch A restart of supervdsm will resolve the issue for a period, maybe 24 hours, then it will occur again. So I'm thinking it's resource exhaustion or a leak of some kind? Running 4.2.8.2 with VDSM at 4.20.46. I've had a look through the bugzilla and can't find an exact match, closest was this one https://bugzilla.redhat.com/show_bug.cgi?id=1666123 which seems to be a RHV only fix. Thanks, Alan

4 years, 3 months

2
3
0 / 0

Migrate windows 2003 server 64bits from libvirt to ovirt

by Fernando Hallberg

Hi, I have a VM with 2003 server x64, and I upload the vm image to oVirt. The VM boot on the oVirt, but, the blue screen appear with a error message: [image: image.png] Anybody has some information about this? I try to convert de img file from raw to qcow2, but the error persists. Regards, Fernando Hallberg

4 years, 3 months

5
8
0 / 0

Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

by souvaliotimaria＠mail.com

Hello everyone, Any help would be greatly appreciated in the following problem. In my lab, the day before yesterday, we had power issues, with a UPS going off-line and following the power outage of the NFS/DNS server I have set up to serve ovirt with isos and as a DNS server (our other DNS servers are located as VMs within the oVirt environment). We found a broadcast storm on the switch (due to a faulty NIC on the aformentioned UPS) that the ovirt nodes are connected and later on had to re-establish several of the virtual connections as well. The above led to one of the hosts becoming NonResponsive, two machines becoming unresponsive and three VMs shuting down. The oVirt environment, version 4.3.5.2, is a replica 2 + arbiter 1 environment and runs GlusterFS with the recommended volumes of data, engine and vmstore. So far, the times there was some kind of a problem, usually oVirt was able to solve it by its own. This time, however, after we recovered from the above state, the volumes of data and vmstore successfully healing , the volume engine became stuck to the healing process (Up, unsynched entries, needs healing), and from the web GUI I see that the VM HostedEngine is paused due to a storage I/O error while the output of virsh list --all command shows that the HostedEngine is running.. How is that happening? I tried to manually trigger the healing process for the volume but nothing with gluster volume heal engine The command gluster volume heal engine info shows the following [root@ov-no3 ~]# gluster volume heal engine info Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine Status: Connected Number of entries: 0 Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine /80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7 Status: Connected Number of entries: 1 Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine /80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7 Status: Connected Number of entries: 1 This morning I came upon this Reddit post https://www.reddit.com/r/gluster/comments/fl3yb7/entries_stuck_in_heal_pe... where it seems that after a graceful reboot one of the ovirt hosts, the gluster came back online after it completed the appropriate healing processes. The thing is from what I have read that when there are unsynched entries in the gluster a host cannot be put into maintenance mode so that it can be rebooted, correct? Should I try to restart the glusterd service. Could someone tell me what I should do? Thank you all for your time and help, Maria Souvalioti

4 years, 4 months

5
21
0 / 0

deploy oVirt 4.4 errors

by grig.4n＠gmail.com

grep ERROR /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20210227232700-gil2fj.log ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9786 2021-02-28 00:00:12,059+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 ovirtsdk4.ConnectionError: Error while sending HTTP request: (7, 'Failed to connect to ovirt4-adm.domain.local port 443: No route to host') 2021-02-28 00:00:12,160+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"attempts": 50, "changed": false, "msg": "Error while sending HTTP request: (7, 'Failed to connect to ovirt4-adm.domain.local port 443: No route to host')"} 2021-02-28 00:00:58,055+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"} 2021-02-28 00:00:58,759+0600 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Closing up': Failed executing ansible-playbook 2021-02-28 00:01:05,984+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host ovirt4-adm.domain.local. port 22: No route to host", "skip_reason": "Host localhost is unreachable", "unreachable": true} 2021-02-28 00:01:22,146+0600 ERROR otopi.plugins.gr_he_common.core.misc misc._terminate:167 Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.

4 years, 4 months

3
17
0 / 0

CVE-2021-3156 && ovirt-node-ng 4.3 && 4.4 (sudo)

by Renaud RAKOTOMALALA

Hello everyone, I operate several oVirt clusters including pre-productions using ovirt-node-ng images. For our traditional clusters we manage the incident in a unitary way with a dedicated rpm, however for ovirt-node-ng I am not yet up to date with critical package updates process. Do you have any advice or tips? Nice day, Renaud

4 years, 4 months

3
2
1 / 0

Right Glusterfs Config on HA Server

by Marcel d'Heureuse

Hi Guys or Girls, I have a problem in a productive system. We use three Servers which have a Raid 5 with four Harddrives SATA 8 TB 7200 rpms. The Systems are interconnected for Glusterfs with 1 GB/s. The Network Load between all three servers are between 5 and 8 % of the network. The Glusterfs has a separate VLAN which complete isolated from other networks. Find attached a small drawing. On that server we rum as HA and there are 8 VMs installed. Most of the VMs working well but if there is high load on the system some of the VM can't write the Data in a "good" speed on the disks. The Glusterfs is configured as Replica on the Raid5 disk which is as Jbob in Glusterfs defined. If i take a look on the iostats i can see that the Harddisk have round 50 open connections some of the glusterfs and a lot of the vms. In side the VMs i can see long iowaiting times from 300 up to 600 ms. With is on case of a Posgresql database very long. Should we configure the Raid5 controller out that the controller only should 4 disks with 8 TB and Glusterfs should connect directly to the /dev/sd[b-e]? What do you think? Will SSD HDD will help on performance but the same config with Raid5 and Jbod could be the bigger problem. Thanks Marcel

4 years, 4 months

2
1
0 / 0

Multipath flapping with SAS via FCP

by Benoit Chatelain

Hi, I have some troubles with multipath. When I add SAS disk over FCP as Storage Domain via oVirt WebUI, The first link as active, but the second is stuck as failed. Volum disk is provided from Dell Compellent via FCP, and disk is transported in SAS. multipath is flapping in all hypervisor from the same domain disk: [root@isildur-adm ~]# tail -f /var/log/messages Feb 25 11:48:21 isildur-adm kernel: device-mapper: multipath: 253:3: Failing path 8:32. Feb 25 11:48:24 isildur-adm multipathd[659460]: 36000d31003d5c2000000000000000010: sdc - tur checker reports path is up Feb 25 11:48:24 isildur-adm multipathd[659460]: 8:32: reinstated Feb 25 11:48:24 isildur-adm multipathd[659460]: 36000d31003d5c2000000000000000010: remaining active paths: 2 Feb 25 11:48:24 isildur-adm kernel: device-mapper: multipath: 253:3: Reinstating path 8:32. Feb 25 11:48:24 isildur-adm kernel: sd 1:0:1:2: alua: port group f01c state S non-preferred supports toluSNA Feb 25 11:48:24 isildur-adm kernel: sd 1:0:1:2: alua: port group f01c state S non-preferred supports toluSNA Feb 25 11:48:24 isildur-adm kernel: device-mapper: multipath: 253:3: Failing path 8:32. Feb 25 11:48:25 isildur-adm multipathd[659460]: sdc: mark as failed Feb 25 11:48:25 isildur-adm multipathd[659460]: 36000d31003d5c2000000000000000010: remaining active paths: 1 --- [root@isildur-adm ~]# multipath -ll 36000d31003d5c2000000000000000010 dm-3 COMPELNT,Compellent Vol size=1.5T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw `-+- policy='service-time 0' prio=25 status=active |- 1:0:0:2 sdb 8:16 active ready running `- 1:0:1:2 sdc 8:32 failed ready running --- VDSM generate multipath.conf like this ( I have remove commented lines for read confort ) : [root@isildur-adm ~]# cat /etc/multipath.conf # VDSM REVISION 2.0 # This file is managed by vdsm. defaults { polling_interval 5 no_path_retry 16 user_friendly_names no flush_on_last_del yes fast_io_fail_tmo 5 dev_loss_tmo 30 max_fds 4096 } blacklist { protocol "(scsi:adt|scsi:sbp)" } no_path_retry 16 } Have you some idea why this link is flapping on my two hypervisor? Thanks a lot in advance. - Benoit Chatelain

4 years, 4 months

4
5
0 / 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Users February 2021