Grafana - Origin Not Allowed
by Maton, Brett
oVirt 4.5.0.8-1.el8
I tried to connect to Grafana via the monitoring portal link from the dashboard,
and all panels fail to display any data, with varying error messages
that all include 'Origin Not Allowed'.
I navigated to Data Sources and ran a test on the PostgreSQL connection
(localhost) which threw the same Origin Not Allowed error message.
Any suggestions?
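For what it's worth, a first thing worth checking (a hedged sketch; the path is the stock Grafana package location and the example hostname is a placeholder): Grafana 8+ answers "Origin Not Allowed" when a request's Origin/Host headers don't match its configured root_url/domain, which is easy to end up with behind the engine's reverse proxy.

```shell
#!/bin/sh
# Hedged diagnostic: show what Grafana believes its own URL is. A mismatch
# between this and the URL in the browser's address bar (or a proxy that
# does not forward the original Host header) commonly triggers
# "Origin Not Allowed" in Grafana 8+.
INI=${INI:-/etc/grafana/grafana.ini}
grep -E '^[[:space:]]*(domain|root_url)[[:space:]]*=' "$INI" 2>/dev/null \
  || echo "no readable ini at $INI"
# If the values look wrong: root_url should match the externally visible
# URL, e.g. https://engine.example.com/ovirt-engine-grafana/ (placeholder).
```

Also worth confirming is that the proxy in front of Grafana forwards the original Host header (X-Forwarded-Host).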
1 year, 4 months
Cannot successfully import Windows vm to new ovirt deployment
by netracerx@mac.com
I am running into an issue that, from what little Google-fu I've been able to muster, should be solved in oVirt 4.5. I'm trying to import WS2019 VMs from ESXi 7 as OVFs or even OVAs. But when I do the import and set the OS to Windows Server 2019 x64 in the admin portal, I get the error "Invalid time zone for the given OS type". If I leave it at Other OS, the import fails with event ID 1153. I've been banging my head against this for over a week (previously I couldn't get the management VM to complete setup), so any guidance is appreciated. Let me know what to pull to help me pin this down.
BTW, yes, this is a self-hosted install as a nested VM on ESXi. I kind of have to do it that way right now to test everything (easier for me than going bare metal on my old PowerEdge 2900 at the moment). Running CentOS Stream 9.
1 year, 4 months
Hosted-engine restore failing
by Devin A. Bougie
Hi, All. We are attempting to migrate to a new storage domain for our oVirt 4.5.4 self-hosted engine setup, and are failing with "cannot import name 'Callable' from 'collections'"
Please see below for the errors on the console.
Many thanks,
Devin
------
hosted-engine --deploy --restore-from-file=backup.bck --4
...
[ INFO ] Checking available network interfaces:
[ ERROR ] b'[WARNING]: Skipping plugin (/usr/share/ovirt-hosted-engine-\n'
[ ERROR ] b'setup/he_ansible/callback_plugins/2_ovirt_logger.py), cannot load: cannot\n'
[ ERROR ] b"import name 'Callable' from 'collections'\n"
[ ERROR ] b'(/usr/lib64/python3.11/collections/__init__.py)\n'
[ ERROR ] b"ERROR! Unexpected Exception, this is probably a bug: cannot import name 'Callable' from 'collections' (/usr/lib64/python3.11/collections/__init__.py)\n"
[ ERROR ] Failed to execute stage 'Environment customization': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ ERROR ] b'[WARNING]: Skipping plugin (/usr/share/ovirt-hosted-engine-\n'
[ ERROR ] b'setup/he_ansible/callback_plugins/2_ovirt_logger.py), cannot load: cannot\n'
[ ERROR ] b"import name 'Callable' from 'collections'\n"
[ ERROR ] b'(/usr/lib64/python3.11/collections/__init__.py)\n'
[ ERROR ] b"ERROR! Unexpected Exception, this is probably a bug: cannot import name 'Callable' from 'collections' (/usr/lib64/python3.11/collections/__init__.py)\n"
[ ERROR ] Failed to execute stage 'Clean up': Failed executing ansible-playbook
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20231011110358.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Log file is located at
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20231011110352-raupj9.log
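The error itself is reproducible outside the installer. Python 3.10 removed the ABC aliases (Callable, Iterable, ...) from the top-level collections namespace, and the log shows the playbook running under Python 3.11, so anything that still does "from collections import Callable" (here apparently the 2_ovirt_logger.py callback plugin, or a library it pulls in) breaks. A minimal demonstration:

```shell
#!/bin/sh
# On Python >= 3.10 the first import fails with exactly the message seen in
# the installer log; the second is the correct spelling on all supported
# versions.
python3 -c "from collections import Callable" 2>&1 | tail -n 1
python3 -c "from collections.abc import Callable; print('collections.abc import ok')"
```

The practical fix is on the packaging side (the offending import needs to move to collections.abc); the demonstration just confirms the interpreter behavior.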
1 year, 4 months
Gluster: Ideas for migration
by jonas@rabe.ch
Hello
I have to migrate the Gluster volumes from an old oVirt cluster to a newly built one. I looked into migration strategies, but everything that Red Hat recommends is related to replacing old bricks. In a testing environment I created two clusters and wanted to migrate one volume after the other. Unfortunately that fails because a node cannot be part of two clusters at the same time.
The next approach I see is to recreate the volumes on the new cluster, continuously rsync the files from the old cluster to the new one, and at a specified point in time make the cut-over: stop the application, do a final rsync, and remount the new volume under the old path.
Is there any other, nicer way I could accomplish migrating a volume from one Gluster cluster to another?
1 year, 5 months
Engine on EL 9
by David Carvalho
Hello, good morning.
I’m using Oracle Linux and I intended to install a virtualization platform with KVM and Oracle VM. The Oracle documentation only mentions Oracle Linux 8, and there are no oVirt repositories available for OL 9.
I visited ovirt.org site and at the download page it only mentions:
Engine:
* Red Hat Enterprise Linux 8.7 (or similar)
* CentOS Stream 8
I have had no reply yet on the Oracle forums. Will it be possible to use this with Oracle Linux 9 soon?
I have 3 servers to install and I also intend to use Gluster FS.
Thanks and regards.
Best regards,
David Alexandre M. de Carvalho
═══════════════════
IT Specialist
Department of Informatics
Universidade da Beira Interior
1 year, 5 months
VMs randomly pause due to unknown storage error, unable to resume
by Jon Sattelberger
Hi,
> VM xxx has been paused due to unknown storage error.
> Migration failed due to a failed validation: [Migrating a VM in paused status due to I/O error is not supported.] (VM: xxx, Source: yyy).
Up until recently, oVirt 4.5.4 had been running fine on our RHEL 8 hypervisors with primarily Linux guests (plus a few appliances). We started to add Windows 2019 VMs to the cluster with the guest agent installed. They seem to run fine at first, but some of the Windows VMs randomly pause due to an unknown storage error. The VM cannot be resumed through the UI or virsh, and the paused VM cannot be migrated to another hypervisor. The GlusterFS storage volumes seem fine. Resetting the VM works, but eventually it becomes paused again. The only thing that comes to mind is that the virtual hard disks are thin provisioned. Is a preallocated disk necessary for Windows VMs? Any helpful hints on where to look next are greatly appreciated.
Thank you,
Jon
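Not an answer, but a hedged pointer at where the reason usually surfaces: when vdsm pauses a VM it logs the libvirt I/O error reason, and ENOSPC (a thin image that could not be extended fast enough, which preallocation would sidestep) reads very differently from EIO (a real error on the Gluster path). The exact message wording varies between versions; the path is the standard vdsm log location:

```shell
#!/bin/sh
# Pull the most recent pause/IO-error lines out of the vdsm log. LOG is
# overridable so the same one-liner can be pointed at a copied log file.
LOG=${LOG:-/var/log/vdsm/vdsm.log}
grep -iE 'abnormal|enospc|eio' "$LOG" 2>/dev/null | tail -n 20
```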
1 year, 5 months
Hosted-engine restore failing when migrating to new storage domain
by Devin A. Bougie
Hello,
We have a functioning oVirt 4.5.4 cluster running on fully-updated EL9.2 hosts. We are trying to migrate the self-hosted engine to a new iSCSI storage domain using the existing hosts, following the documented procedure:
- set the cluster into global maintenance mode
- backup the engine using "engine-backup --scope=all --mode=backup --file=backup.bck --log=backuplog.log"
- shutdown the engine
- restore the engine using "hosted-engine --deploy --4 --restore-from-file=backup.bck"
This almost works, but fails with the attached log file. Any help or suggestions would be greatly appreciated, including alternate procedures for migrating a self-hosted engine from one domain to another.
Many thanks,
Devin
1 year, 5 months
How to start QA and testing work in the oVirt project?
by song_chao@massclouds.com
Hello everyone, I am a testing and development engineer with 10 years of experience. I want to learn about and participate in oVirt's testing work and make my own contribution; I believe that in the near future I can contribute to the oVirt community. For now, I would like some information about QA and testing processes and methods. Can anyone provide it? Thank you.
1 year, 5 months
Multiple hosts stuck in Connecting state waiting for storage pool to go up.
by ivan.lezhnjov.iv@gmail.com
Hi!
We have a problem with multiple hosts stuck in Connecting state, which I hoped somebody here could help us wrap our heads around.
All hosts, except one, seem to have very similar symptoms but I'll focus on one host that represents the rest.
So, the host is stuck in Connecting state, and this is what we see in the oVirt log files.
/var/log/ovirt-engine/engine.log:
2023-04-20 09:51:53,021+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesAsyncVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [] Command 'GetCapabilitiesAsyncVDSCommand(HostName = ABC010-176-XYZ, VdsIdAndVdsVDSCommandParametersBase:{hostId='2c458562-3d4d-4408-afc9-9a9484984a91', vds='Host[ABC010-176-XYZ,2c458562-3d4d-4408-afc9-9a9484984a91]'})' execution failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: SSL session is invalid
2023-04-20 09:55:16,556+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-67) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ABC010-176-XYZ command Get Host Capabilities failed: Message timeout which can be caused by communication issues
/var/log/vdsm/vdsm.log:
2023-04-20 17:48:51,977+0300 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList() from=internal, task_id=ebce7c8c-6ded-454e-9aee-86edf72764ef (api:31)
2023-04-20 17:48:51,977+0300 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=ebce7c8c-6ded-454e-9aee-86edf72764ef (api:37)
2023-04-20 17:48:51,978+0300 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:723)
Both engine.log and vdsm.log are flooded with these messages, repeated at regular intervals ad infinitum. This is one common symptom shared by multiple hosts in our deployment: they all have these message loops in engine.log and vdsm.log.
Running vdsm-client Host getConnectedStoragePools also returns an empty list, represented by [], on all hosts (though interestingly there is one that showed a Storage Pool UUID and yet was still stuck in Connecting state).
This particular host (ABC010-176-XYZ) is connected to 3 Ceph iSCSI storage domains, and lsblk shows 3 block devices with matching UUIDs in their device components. So the storage seems to be connected, but the storage pool is not? How is that even possible?
Now, what's even more weird is that we tried rebooting the host (via Administrator Portal) and it didn't help. We even tried removing and re-adding the host in Administrator Portal but to no avail.
Additionally, the host refused to go into Maintenance mode so we had to enforce it by manually updating Engine DB.
We also tried reinstalling the host via Administrator Portal and ran into another weird problem, which I'm not sure if it's a related one or a problem that deserves a dedicated discussion thread but, basically, the underlying Ansible playbook exited with the following error message:
"stdout" : "fatal: [10.10.10.176]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Data could not be sent to remote host \\\"10.10.10.176\\\". Make sure this host can be reached over ssh: \", \"unreachable\": true}",
Counterintuitively, just before running Reinstall via Administrator Portal we had been able to reboot the same host (which, as you know, oVirt does via Ansible as well). So, no changes on the host in between, just different Ansible playbooks. To confirm that we actually had access to the host over ssh, we successfully ran ssh -p $PORT root@10.10.10.176 -i /etc/pki/ovirt-engine/keys/engine_id_rsa and it worked.
That made us scratch our heads for a while, but what seems to have fixed Ansible's ssh access problems was a manual full stop of all VDSM-related systemd services on the host. It was just a wild guess, but as soon as we stopped all VDSM services, Ansible stopped complaining about not being able to reach the target host and successfully did its job.
I'm sure you'd like to see more logs but I'm not certain what exactly is relevant. There are a ton of logs as this deployment is comprised of nearly 80 hosts. So, I guess it's best if you just request to see specific logs, messages or configuration details and I'll cherry-pick what's relevant.
We don't really understand what's going on and would appreciate any help. We tried just about anything we could think of to resolve this issue and are running out of ideas what to do next.
If you have any questions just ask and I'll do my best to answer them.
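One hedged thing worth ruling out for the "SSL session is invalid" part: engine-to-vdsm SSL failures often accompany an expired or regenerated host certificate. Checking the validity window of the vdsm certificate is cheap (the path is the standard vdsm location; CERT is overridable):

```shell
#!/bin/sh
# Print subject and expiry of the vdsm certificate the engine talks to.
# If "notAfter" is in the past, re-enrolling the host certificate is the
# usual direction.
CERT=${CERT:-/etc/pki/vdsm/certs/vdsmcert.pem}
if [ -r "$CERT" ]; then
    openssl x509 -in "$CERT" -noout -subject -enddate
else
    echo "no readable certificate at $CERT"
fi
```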
1 year, 5 months
Direct LUN I/O errors with SCSI Pass-through enabled
by mgs@ordix.de
Hi,
In our environment (version 4.4.10.7) we use Fibre Channel LUNs, which we attach directly to the VMs (as Direct LUNs) with VirtIO-SCSI and SCSI pass-through enabled. The virtual machines run an application that requires 4096 as physical_block_size and 512 as logical_block_size. For this reason, we had to enable SCSI pass-through: only with SCSI pass-through is the correct physical_block_size passed through to the VM.
Now we have the following problem on just about every VM:
Error messages of the following form occur in the VMs (in /var/log/messages):
kernel: blk_update_request: I/O error, dev sdd, sector 352194592 op 0x1:(WRITE) flags 0xc800 phys_seg 16 prio class 0
This error message coincides with a crash of the application and appears to come from the SCSI layer.
We are currently trying to find an alternative to SCSI pass-through. We want to use VirtIO and somehow pass the physical_block_size. Since the XML files of the VMs are transient, we cannot make any changes there.
Does anyone have an idea what the error could be or how to pass the correct physical_block_size? Could VDSM hooks help with this?
Thank you and regards
Miguel
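On the VDSM-hook question: this is the kind of thing hooks are for. A before_vm_start hook can rewrite the domain XML before libvirt sees it; as I understand the hook interface, non-Python hooks get the XML file path in the _hook_domxml environment variable. A deliberately naive sketch (the hook path/name is an assumption, it tags every disk rather than matching the specific Direct LUN, and the block sizes are taken from the requirement above):

```shell
#!/bin/sh
# Hypothetical hook, e.g. /usr/libexec/vdsm/hooks/before_vm_start/50_blockio
# (assumed path/naming). Adds a <blockio> element so libvirt reports
# physical_block_size=4096 / logical_block_size=512 without SCSI
# pass-through. Naive: applies to every <disk> in the domain XML.
if [ -n "${_hook_domxml:-}" ] && [ -w "$_hook_domxml" ]; then
    sed -i "s|</disk>|<blockio logical_block_size='512' physical_block_size='4096'/></disk>|g" \
        "$_hook_domxml"
fi
```

The <blockio> element itself is standard libvirt domain XML; the hook location and the edit-everything matching are the assumptions here.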
1 year, 5 months