gluster in ovirt-node in 4.5
by Yedidyah Bar David
Hi all,
In relation to a recent question here (thread "[ovirt-devel] [ANN]
Schedule for oVirt 4.5.0"), we are now blocked by the following
chain of changes/dependencies:
1. ovirt-ansible-collection recently moved from ansible-2.9 to
ansible-core 2.12.
2. ovirt-hosted-engine-setup followed it.
3. ovirt-release-host-node (the package including dependencies for
ovirt-node) requires gluster-ansible-roles.
4. gluster-ansible-roles currently requires 'ansible >= 2.9' (not
ansible-core); I only checked one of its dependencies,
gluster-ansible-infra, which requires 'ansible >= 2.5'.
5. ansible-core does not have 'Provides: ansible', intentionally IIUC.
So we should do one of:
1. Fix gluster-ansible* packages to work with ansible-core 2.12.
2. Only patch gluster-ansible* packages to require ansible-core,
without making sure they actually work with it. This will satisfy all
deps (I guess), make the thing installable, but will likely break when
actually used. Not sure it's such a good option, but it's nonetheless
relevant. It might make sense if someone is going to work on (1.) soon
but not immediately. This is what would have happened in practice if
ansible-core had 'Provides:'-ed ansible.
3. Patch ovirt-release-host-node to not require gluster-ansible*
anymore. This means it will not be included in ovirt-node. Users who
want to use it will have to install the dependencies manually,
somehow, presumably after (1.) is done independently.
Our team (RHV integration) does not have capacity for (1.). I intend
to do (3.) very soon, unless we get volunteers for doing (1.) or
strong voices for (2.).
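For reference, the mismatch can be verified with plain rpm queries,
assuming the packages are installed on the host being checked; a small
sketch (exact output will of course vary):

# ansible-core intentionally does not carry 'Provides: ansible'
rpm -q --provides ansible-core

# the gluster-ansible packages still require the old name
rpm -q --requires gluster-ansible-roles | grep -i ansible
rpm -q --requires gluster-ansible-infra | grep -i ansible

If the packages are not installed locally, 'dnf repoquery --requires
gluster-ansible-roles' queries the repositories instead.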
Best regards,
--
Didi
Commvault Backup fails to attach disk snapshots
by martin.kaufmann@snt.at
Hello all,
Since we updated our Commvault installation to "Service Pack 26 Hotfix 16", random VM backups have been failing to attach disk snapshots to the backup proxy VM.
We are running oVirt 4.4.10.6-1.el8 and the hypervisor is oVirt Node 4.4.9.
These are the logs from the manager VM:
##############
2022-03-24 11:13:59,390+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Disk hot-plug: <?xml version="1.0" encoding="UTF-8"?><hotplug>
<devices>
<disk snapshot="no" type="file" device="disk">
<target dev="sda" bus="scsi"/>
<source file="/rhev/data-center/mnt/blockSD/6da85fb1-ff4b-4665-ade3-0f32f83250bc/images/e3c292e0-dfd2-4cd5-a27f-f87308a18e64/8b524896-902c-445f-a33a-65335cb75eff">
<seclabel model="dac" type="none" relabel="no"/>
</source>
<driver name="qemu" io="threads" type="qcow2" error_policy="stop" cache="writethrough"/>
<alias name="ua-e3c292e0-dfd2-4cd5-a27f-f87308a18e64"/>
<address bus="0" controller="0" unit="0" type="drive" target="0"/>
<serial>e3c292e0-dfd2-4cd5-a27f-f87308a18e64</serial>
</disk>
</devices>
<metadata xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
<ovirt-vm:vm>
<ovirt-vm:device devtype="disk" name="sda">
<ovirt-vm:poolID>98d981da-d010-11ea-9e4d-00163e1be424</ovirt-vm:poolID>
<ovirt-vm:volumeID>8b524896-902c-445f-a33a-65335cb75eff</ovirt-vm:volumeID>
<ovirt-vm:shared>transient</ovirt-vm:shared>
<ovirt-vm:imageID>e3c292e0-dfd2-4cd5-a27f-f87308a18e64</ovirt-vm:imageID>
<ovirt-vm:domainID>6da85fb1-ff4b-4665-ade3-0f32f83250bc</ovirt-vm:domainID>
</ovirt-vm:device>
</ovirt-vm:vm>
</metadata>
</hotplug>
2022-03-24 11:13:59,898+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Failed in 'HotPlugDiskVDS' method
2022-03-24 11:13:59,898+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand' return value 'StatusOnlyReturn [status=Status [code=45, message=Requested operation is not valid: Domain already contains a disk with that address]]'
2022-03-24 11:13:59,898+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] HostName = hypervisor1.domain
2022-03-24 11:13:59,898+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Command 'HotPlugDiskVDSCommand(HostName = hypervisor1.domain, HotPlugDiskVDSParameters:{hostId='c2fd81cd-8fc8-410e-8333-a0a36b87ab2b', vmId='890c105d-333b-435e-8d8b-0b889f4f9c14', diskId='e3c292e0-dfd2-4cd5-a27f-f87308a18e64'})' execution failed: VDSGenericException: VDSErrorException: Failed to HotPlugDiskVDS, error = Requested operation is not valid: Domain already contains a disk with that address, code = 45
2022-03-24 11:13:59,898+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] FINISH, HotPlugDiskVDSCommand, return: , log id: 536a4414
2022-03-24 11:13:59,898+01 ERROR [org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Command 'org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to HotPlugDiskVDS, error = Requested operation is not valid: Domain already contains a disk with that address, code = 45 (Failed with error FailedToPlugDisk and code 45)
2022-03-24 11:13:59,899+01 INFO [org.ovirt.engine.core.bll.CommandCompensator] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Command [id=87712152-4660-40ca-b292-397796c735d5]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.storage.DiskVmElement; snapshot: VmDeviceId:{deviceId='e3c292e0-dfd2-4cd5-a27f-f87308a18e64', vmId='890c105d-333b-435e-8d8b-0b889f4f9c14'}.
2022-03-24 11:13:59,900+01 INFO [org.ovirt.engine.core.bll.CommandCompensator] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Command [id=87712152-4660-40ca-b292-397796c735d5]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.VmDevice; snapshot: VmDeviceId:{deviceId='e3c292e0-dfd2-4cd5-a27f-f87308a18e64', vmId='890c105d-333b-435e-8d8b-0b889f4f9c14'}.
2022-03-24 11:13:59,910+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] EVENT_ID: USER_FAILED_ATTACH_DISK_TO_VM(2,017), Failed to attach Disk VM1_Disk1 to VM backupproxyVM (User: backupagent@internal-authz).
2022-03-24 11:13:59,910+01 INFO [org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand] (default task-1578) [0aeb5794-5c2e-44c6-8fc5-34ec0e81d16b] Lock freed to object 'EngineLock:{exclusiveLocks='[e3c292e0-dfd2-4cd5-a27f-f87308a18e64=DISK]', sharedLocks=''}'
2022-03-24 11:13:59,910+01 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1578) [] Operation Failed: [Failed to hot-plug disk]
###################
Here are the logs from the hypervisor:
##################
2022-03-24 11:13:59,712+0100 INFO (jsonrpc/4) [virt.vm] (vmId='890c105d-333b-435e-8d8b-0b889f4f9c14') Hotplug disk xml: <?xml version='1.0' encoding='utf-8'?>
<disk device="disk" snapshot="no" type="file">
<address bus="0" controller="0" target="0" type="drive" unit="0" />
<source file="/var/lib/vdsm/transient/6da85fb1-ff4b-4665-ade3-0f32f83250bc-8b524896-902c-445f-a33a-65335cb75eff.pxhtm0_4">
<seclabel model="dac" relabel="no" type="none" />
</source>
<target bus="scsi" dev="sdbzk" />
<serial>e3c292e0-dfd2-4cd5-a27f-f87308a18e64</serial>
<driver cache="writethrough" error_policy="stop" io="threads" name="qemu" type="qcow2" />
<alias name="ua-e3c292e0-dfd2-4cd5-a27f-f87308a18e64" />
</disk>
(vm:3851)
2022-03-24 11:13:59,721+0100 ERROR (jsonrpc/4) [virt.vm] (vmId='890c105d-333b-435e-8d8b-0b889f4f9c14') Hotplug failed (vm:3859)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 3857, in hotplugDisk
self._dom.attachDevice(driveXml)
File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python3.6/site-packages/libvirt.py", line 682, in attachDevice
raise libvirtError('virDomainAttachDevice() failed')
libvirt.libvirtError: Requested operation is not valid: Domain already contains a disk with that address
#####################
Any idea what could cause these errors?
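The libvirt error suggests the backup proxy VM already has a device at that drive address (controller 0, bus 0, target 0, unit 0), possibly a leftover from an earlier backup run that was not cleaned up. A quick check on the host currently running the proxy VM (the VM name below is just a placeholder for ours):

# show the disks and drive addresses libvirt currently has for the proxy VM
virsh -r dumpxml backupproxyVM | grep -E "<target dev=|<address type='drive'"

# leftover transient volumes from previous attach attempts would show up here
ls -l /var/lib/vdsm/transient/

If a stale disk with the same address shows up there, detaching it from the proxy VM before the next backup run might work around the problem, but we have not confirmed this yet.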
Best regards,
Martin
Re: No bootable device
by nicolas@devels.es
Hi,
The checkbox is already checked; when you mark a disk as "OS", it gets
checked automatically. Still, the VM won't boot.
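One thing we could still rule out on our side is a mismatch between what the
image actually contains and the format selected in the upload dialog.
Checking the file before uploading shows what it really is (the path below is
just an example, and virt-filesystems needs libguestfs-tools installed):

# 'file format' here should match the format chosen in the upload dialog
qemu-img info /path/to/image.qcow2

# confirm the image actually has a partition table / bootable content
virt-filesystems -a /path/to/image.qcow2 --long --parts --blkdevs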
Thanks.
On 2022-03-23 15:23, Angus Clarke wrote:
> Hi Nicolas
>
> In oVirt 4.3:
>
> Compute -> Virtual Machines -> Select VM
>
> On the VM screen:
>
> Disks -> Highlight disk -> Edit
>
> Check the bootable tick box
>
> Hope that helps
> Angus
>
> -------------------------
>
> FROM: nicolas(a)devels.es <nicolas(a)devels.es>
> SENT: 23 March 2022 14:00
> TO: users(a)ovirt.org <users(a)ovirt.org>
> SUBJECT: [ovirt-users] No bootable device
>
> Hi,
>
> We're running oVirt 4.4.8.6. We have uploaded a qcow2 image (metasploit
> v.3, FWIW) using the GUI (Storage -> Disks -> Upload -> Start). The
> image is in qcow2 format. No options on the right side were checked. The
> upload went smoothly, so we now tried to attach the disk to a VM.
>
> To do that, we opened the VM -> Disks -> Attach and selected the disk.
>
> As interface, VirtIO-iSCSI was chosen, and the disk was marked as OS, so
> the "bootable" checkbox was selected.
>
> The VM was later powered on, but when accessing the console the message
> "No bootable device." appears. We're pretty sure this is a bootable
> image, because it was tested on other virtualization infrastructure and
> it boots well. We also tried to upload the image in RAW format but the
> result is the same.
>
> What are we missing here? Is anything else needed so that the disk is
> bootable?
>
> Thanks.
VDSM Issue after Upgrade of Node in HCI
by Abe E
So the 2nd node in my cluster showed an upgrade option in oVirt.
I put it in maintenance mode and ran the upgrade. It went through, but at one point the node lost its internet connection or its connection within the Gluster network; it didn't get to the reboot step and simply lost its connection to the engine from there.
I can see that Gluster is still running and I was able to keep all 3 Gluster peers syncing, but it seems VDSM may be the culprit here.
ovirt-ha-agent won't start, and hosted-engine --connect-storage returns:
Traceback (most recent call last):
File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/connect_storage_server.py", line 30, in <module>
timeout=ohostedcons.Const.STORAGE_SERVER_TIMEOUT,
File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 312, in connect_storage_server
sserver.connect_storage_server(timeout=timeout)
File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_server.py", line 411, in connect_storage_server
timeout=timeout,
File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 474, in connect_vdsm_json_rpc
__vdsm_json_rpc_connect(logger, timeout)
File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 415, in __vdsm_json_rpc_connect
timeout=VDSM_MAX_RETRY * VDSM_DELAY
RuntimeError: Couldn't connect to VDSM within 60 seconds
VDSM just keeps restarting in a loop and failing, and vdsm-tool configure --force throws this:
[root@ovirt-2 ~]# vdsm-tool configure --force
Checking configuration status...
sanlock is configured for vdsm
abrt is already configured for vdsm
Current revision of multipath.conf detected, preserving
lvm is configured for vdsm
Managed volume database is already configured
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Running configure...
libsepol.context_from_record: type insights_client_var_lib_t is not defined
libsepol.context_from_record: could not create context structure
libsepol.context_from_string: could not create context structure
libsepol.sepol_context_to_sid: could not convert system_u:object_r:insights_client_var_lib_t:s0 to sid
invalid context system_u:object_r:insights_client_var_lib_t:s0
libsemanage.semanage_validate_and_compile_fcontexts: setfiles returned error code 255.
Traceback (most recent call last):
File "/usr/bin/vdsm-tool", line 209, in main
return tool_command[cmd]["command"](*args)
File "/usr/lib/python3.6/site-packages/vdsm/tool/__init__.py", line 40, in wrapper
func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/tool/configurator.py", line 145, in configure
_configure(c)
File "/usr/lib/python3.6/site-packages/vdsm/tool/configurator.py", line 92, in _configure
getattr(module, 'configure', lambda: None)()
File "/usr/lib/python3.6/site-packages/vdsm/tool/configurators/sebool.py", line 88, in configure
_setup_booleans(True)
File "/usr/lib/python3.6/site-packages/vdsm/tool/configurators/sebool.py", line 60, in _setup_booleans
sebool_obj.finish()
File "/usr/lib/python3.6/site-packages/seobject.py", line 340, in finish
self.commit()
File "/usr/lib/python3.6/site-packages/seobject.py", line 330, in commit
rc = semanage_commit(self.sh)
OSError: [Errno 0] Error
Does anyone have ideas on how I could recover from this? I am not sure if something got corrupted during the update or on a reboot. I would prefer updating nodes from the CLI next time, but unfortunately I have not looked that far into it; it would have made it much easier to see what failed and where.
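For completeness, the part that fails is SELinux-related: the traceback points at a file-context type (insights_client_var_lib_t) that is not defined in the loaded policy, which usually means the policy got out of sync with the installed packages during the interrupted upgrade. What I intend to try next, with the host still in maintenance mode (package names are my assumption, not a confirmed fix):

# check whether the type exists in the loaded policy (needs setools-console)
seinfo -t insights_client_var_lib_t

# refresh the policy packages that should define it, then rebuild the policy modules
dnf reinstall selinux-policy selinux-policy-targeted insights-client
semodule -B

# and only then retry
vdsm-tool configure --force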
Cloning VM selecting part of the disks
by Gianluca Cecchi
Hello,
in recent versions of oVirt (e.g. my latest, 4.4.10) there is a feature to
make a clone of a running VM.
This operation works through a temporary VM snapshot (which is then
automatically deleted) and cloning of that snapshot.
Sometimes there is a need to clone a VM but only a subset of its disks is
required (e.g. in my case I want to retain the boot disk, 20 GB, and the
dedicated sw disk, 20 GB, but not the data disk, which is usually big... in
my case 200 GB).
In this scenario I have to go the old way: explicitly create a snapshot of
the VM, where I can select a subset of the disks, then clone the snapshot
and finally delete the snapshot.
Do you think it would be interesting to have the option of selecting disks
when you clone a running VM, with the rest handled automatically?
If I want to open a Bugzilla RFE for this, which component and options
should I select?
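In the meantime, the manual path (snapshot with only the selected disks,
clone from that snapshot, delete the snapshot) can be scripted against the
REST API. A rough, untested sketch, with engine URL, credentials, cluster
name and all IDs as placeholders:

ENGINE=https://engine.example.com/ovirt-engine/api
AUTH='admin@internal:secret'          # placeholder credentials
VM_ID=SOURCE_VM_UUID                  # placeholder
BOOT_DISK=BOOT_DISK_UUID              # placeholder
SW_DISK=SW_DISK_UUID                  # placeholder

# 1. create a snapshot containing only the disks to keep
curl -s -k -u "$AUTH" -H 'Content-Type: application/xml' -X POST \
  -d "<snapshot><description>partial-clone-src</description><disk_attachments>
        <disk_attachment><disk id=\"$BOOT_DISK\"/></disk_attachment>
        <disk_attachment><disk id=\"$SW_DISK\"/></disk_attachment>
      </disk_attachments></snapshot>" \
  "$ENGINE/vms/$VM_ID/snapshots"

# 2. once the snapshot is in state OK, clone a new VM from it
SNAP_ID=SNAPSHOT_UUID_FROM_STEP_1     # placeholder, taken from the reply above
curl -s -k -u "$AUTH" -H 'Content-Type: application/xml' -X POST \
  -d "<vm><name>myvm-partial-clone</name><cluster><name>Default</name></cluster>
        <snapshots><snapshot id=\"$SNAP_ID\"/></snapshots></vm>" \
  "$ENGINE/vms"

# 3. when the clone has been created, delete the temporary snapshot
curl -s -k -u "$AUTH" -X DELETE "$ENGINE/vms/$VM_ID/snapshots/$SNAP_ID"

Still, a disk selection directly in the "Clone VM" dialog would avoid all of
this, hence the RFE question above.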
Thanks,
Gianluca
Engine across Clusters
by Abe E
Has anyone set up hyperconverged Gluster (3 nodes) and then added more nodes afterwards while maintaining access to the engine?
An oversight on my end was twofold: the engine Gluster volume being on the engine nodes, and the new nodes requiring their own cluster due to a different CPU type.
So basically I am trying to see if I can set up a new cluster for the other nodes that require it, while still giving them the ability to run the engine; of course, because they aren't part of the engine cluster, we all know how that goes. Has anyone dealt with this or worked around it? Any advice?
Gluster issue with brick going down
by Chris Adams
I have a hyper-converged cluster running oVirt 4.4.10 and Gluster 8.6.
Periodically, one brick of one volume will drop out, but it's seemingly
random as to which volume and brick is affected. All I see in the brick
log is:
[2022-03-19 13:27:36.360727] W [MSGID: 113075] [posix-helpers.c:2135:posix_fs_health_check] 0-vmstore-posix: aio_read_cmp_buf() on /gluster_bricks/vmstore/vmstore/.glusterfs/health_check returned ret is -1 error is Structure needs cleaning
[2022-03-19 13:27:36.361160] M [MSGID: 113075] [posix-helpers.c:2214:posix_health_check_thread_proc] 0-vmstore-posix: health-check failed, going down
[2022-03-19 13:27:36.361395] M [MSGID: 113075] [posix-helpers.c:2232:posix_health_check_thread_proc] 0-vmstore-posix: still alive! -> SIGTERM
Searching around, I see references to similar issues, but no real
solutions. I see a suggestion that changing the health-check-interval
from 10 to 30 seconds helps, but it looks like 30 seconds is the default
with this version of Gluster (and I don't see it explicitly set for any
of my volumes).
While "Structure needs cleaning" appears to be an XFS filesystem error,
I don't see any XFS errors from the kernel.
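For what it's worth, here is how I am checking the effective value of that
option and the underlying filesystem; the brick device path below is a
placeholder for mine, and xfs_repair -n needs the brick filesystem unmounted
(host in maintenance):

# effective health-check interval for the volume (0 would disable the check)
gluster volume get vmstore storage.health-check-interval

# it can also be raised explicitly, e.g.:
# gluster volume set vmstore storage.health-check-interval 30

# read-only XFS check of the brick, with the brick unmounted
umount /gluster_bricks/vmstore
xfs_repair -n /dev/mapper/GLUSTER_VG-GLUSTER_LV_VMSTORE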
This is a low I/O cluster - the storage network is on two 10 gig
switches with a two-port LAG to each server, but typically is only
seeing a few tens of megabits per second.
--
Chris Adams <cma(a)cmadams.net>
Unable to deploy ovirt 4.4 on alma 8.5
by Richa Gupta
Hi Team,
While installing oVirt 4.4 on AlmaLinux 8.5, we are facing the following issue:
[ INFO ] TASK [ovirt.ovirt.engine_setup : Install oVirt Engine package]
[ ERROR ] fatal: [localhost -> 192.168.222.56]: FAILED! => {"changed": false, "msg": "Failed to download metadata for repo 'ovirt-4.4': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried", "rc": 1, "results": []}
We are using the following repo configuration:
[ovirt-4.4]
name=Latest oVirt 4.4 Release
#baseurl=https://resources.ovirt.org/pub/ovirt-4.4/rpm/el$releasever/
mirrorlist=https://mirrorlist.ovirt.org/mirrorlist-ovirt-4.4-el$releasever
enabled=1
countme=1
fastestmirror=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-ovirt-4.4
Can someone please help in resolving this issue?
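A workaround that may be worth trying (not verified here) is to bypass the mirrorlist and use the baseurl that is already in the repo file, then refresh the metadata on the engine VM; the repo file name below is an assumption and may differ:

# switch the ovirt-4.4 repo from mirrorlist to baseurl
sed -i -e 's/^mirrorlist=/#mirrorlist=/' -e 's|^#baseurl=|baseurl=|' /etc/yum.repos.d/ovirt-4.4.repo

# verify that the metadata can now be downloaded
dnf clean metadata
dnf --disablerepo='*' --enablerepo=ovirt-4.4 makecache

If that works, the original failure was most likely just the mirrorlist endpoint being unreachable from the engine VM.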