Ovirt Hosted Engine Setup
by Vijay Sachdeva
Hi All,
I am trying to deploy the hosted engine setup, and it has been stuck at the step below for hours:
Is this a known bug?
Thanks
Vijay Sachdeva
4 years, 5 months
Slow ova export performance
by francesco@shellrent.com
Hi All,
I'm facing really slow exports of VMs hosted on a single-node cluster with local storage. The VM disk is 600 GB and the effective usage is around 300 GB. I estimated that the following process would take about 15 hours to complete:
vdsm 25338 25332 99 04:14 pts/0 07:40:09 qemu-img measure -O qcow2 /rhev/data-center/mnt/_data/6775c41c-7d67-451b-8beb-4fd086eade2e/images/a084fa36-0f93-45c2-a323-ea9ca2d16677/55b3eac5-05b2-4bae-be50-37cde7050697
A strace -p of the pid shows slow progression toward the effective size.
lseek(11, 3056795648, SEEK_DATA) = 3056795648
lseek(11, 3056795648, SEEK_HOLE) = 13407092736
lseek(14, 128637468672, SEEK_DATA) = 128637468672
lseek(14, 128637468672, SEEK_HOLE) = 317708828672
lseek(14, 128646250496, SEEK_DATA) = 128646250496
lseek(14, 128646250496, SEEK_HOLE) = 317708828672
lseek(14, 128637730816, SEEK_DATA) = 128637730816
lseek(14, 128637730816, SEEK_HOLE) = 317708828672
lseek(14, 128646774784, SEEK_DATA) = 128646774784
lseek(14, 128646774784, SEEK_HOLE) = 317708828672
lseek(14, 128646709248, SEEK_DATA) = 128646709248
The process takes a full core, but I don't think that is the problem. The I/O is almost nothing.
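For context, `qemu-img measure` discovers which parts of the source image are actually allocated by repeatedly seeking with SEEK_DATA/SEEK_HOLE, exactly the pattern visible in the strace above. A minimal sketch of that extent walk (hypothetical illustration, not oVirt/qemu code; Linux only):

```python
import os

def data_extents(path):
    """Walk a file's allocated (data) extents using SEEK_DATA/SEEK_HOLE,
    the same lseek() pattern shown in the strace output (Linux only)."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        end = os.fstat(fd).st_size
        offset = 0
        while offset < end:
            try:
                # jump to the next byte that is backed by data
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError:
                # ENXIO: only a trailing hole remains past offset
                break
            # find where that run of data ends (next hole, or EOF)
            hole = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, hole - start))
            offset = hole
    finally:
        os.close(fd)
    return extents
```

On a filesystem where these seeks are slow, a 600 GB image means a very large number of such calls, which would match the single-core, almost-no-I/O behaviour you observed.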
Any idea/suggestion?
Thank you for your time
Regards
VDSM HOST ISSUE - Message timeout which can be caused by communication issues
by lu.alfonsi@almaviva.it
Good morning, I have installed a new oVirt 4.3.10 environment, but sometimes in the events of some hosts I see this error message:
VDSM node-1-ra command Get Host Capabilities failed: Message timeout which can be caused by communication issues
Due to this issue the hosts end up in an unresponsive state, and the only way to resolve it and reactivate the hosts in the cluster is to restart the ovirt-engine service. Can anyone help me?
Thanks in advance
Luigi
oVirt install questions
by David White
I'm reading through all of the documentation at https://ovirt.org/documentation/, and am a bit overwhelmed with all of the different options for installing oVirt.
My particular use case is that I'm looking for a way to manage VMs on multiple physical servers from one interface, and to be able to deploy new VMs (or delete VMs) as necessary. Ideally, it would be great if I could also move a VM from one host to another, particularly in the event that a host becomes degraded (bad HDD, bad processor, etc.).
I'm trying to figure out what the difference is between an oVirt Node and the oVirt Engine, and how the engine differs from the Manager.
I get the feeling that `Engine` = `Manager`. Same thing. I further think I understand the Engine to be essentially synonymous with a vCenter VM for ESXi hosts. Is this correct?
If so, then what's the difference between the `self-hosted` vs the `stand-alone` engines?
oVirt Engine requirements look to be a minimum of 4GB RAM and 2CPUs.
oVirt Nodes, on the other hand, require only 2GB RAM.
Is this a requirement just for the physical host, or is that how much RAM each oVirt Node process requires? In other words, if I have a physical host with 12GB of physical RAM, will I only be able to allocate 10GB of that to guest VMs? How much of that should I dedicate to the oVirt Node processes?
Can you install the oVirt Engine as a VM onto an existing oVirt Node? And then connect that same node to the Engine, once the Engine is installed?
Reading through the documentation, it also sounds like oVirt Engine and oVirt Node require different versions of RHEL or CentOS.
I read that the Engine for oVirt 4.4.0 requires RHEL (or CentOS) 8.2, whereas each Node requires 7.x (although I'll plan to just use the oVirt Node ISO).
I'm also wondering about storage.
I don't really like the idea of using local storage, but a single NFS server would also be a single point of failure, and Gluster would be too expensive to deploy, so at this point, I'm leaning towards using local storage.
Any advice or clarity would be greatly appreciated.
Thanks,
David
Sent with ProtonMail Secure Email.
about ovirt fence
by 崔涛的个人邮箱
If I configure fencing for oVirt hosts, is the fence system used to fence the host, or to fence the VMs running on the host?
Info on transporting a gluster storage domain
by Gianluca Cecchi
Hello,
I have env1 and env2, both in 4.3 and both configured as single host HCI
environments.
On both I have the predefined hosted_storage, data and vmstore gluster
storage domains.
On env1 I have hosted_storage and data on one disk, and vmstore on a whole
different disk. I also have a "big2" gluster storage domain that was created
later on another disk.
I want to scratch env1 but preserve, and then import, the vmstore and big2
storage domains into env2.
This "big2" is configured as a PV on the whole 4Tb disk on env1.
- nvme1n1 4Tb gluster_vg_4t2 serial: BTLF72840DVK4P0DGN
eui.01000000010000005cd2e45c02de4d51 dm-3 NVME,INTEL SSDPEDKX040T7
size=3.6T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
`- 1:0:1:0 nvme1n1 259:3 active undef running
[root@ovirt ~]# pvs /dev/mapper/eui.01000000010000005cd2e45c02de4d51
  PV                                               VG             Fmt  Attr PSize  PFree
  /dev/mapper/eui.01000000010000005cd2e45c02de4d51 gluster_vg_4t2 lvm2 a--  <3.64t    0
[root@ovirt ~]#
[root@ovirt ~]# lvs gluster_vg_4t2
  LV              VG             Attr       LSize  Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_big2 gluster_vg_4t2 Vwi-aot--- <4.35t my_pool        1.36
  my_pool         gluster_vg_4t2 twi-aot--- <3.61t                1.64   0.18
[root@ovirt ~]#
In similar way the vmstore storage domain has:
- nvme2n1 1Tb gluster_vg_nvme1n1 serial: PHLF8125037R1P0GGN
eui.01000000010000005cd2e4e359284f51 dm-1 NVME,INTEL SSDPE2KX010T7
size=932G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
`- 2:0:1:0 nvme2n1 259:2 active undef running
[root@ovirt ~]# pvs /dev/mapper/eui.01000000010000005cd2e4e359284f51
  PV                                               VG                 Fmt  Attr PSize   PFree
  /dev/mapper/eui.01000000010000005cd2e4e359284f51 gluster_vg_nvme1n1 lvm2 a--  931.51g    0
[root@ovirt ~]#
[root@ovirt ~]# lvs gluster_vg_nvme1n1
  LV                                  VG                 Attr       LSize   Pool                                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_vmstore                  gluster_vg_nvme1n1 Vwi-aot--- 930.00g gluster_thinpool_gluster_vg_nvme1n1        63.99
  gluster_thinpool_gluster_vg_nvme1n1 gluster_vg_nvme1n1 twi-aot--- 921.51g                                            64.58  1.80
[root@ovirt ~]#
I presume the correct way is:
- env1
delete and detach vmstore and big2 (without formatting/zeroing)
- env2
attach the two disks to the system
and then?
What should I do at gluster commands level to "import" the already setup
gluster bricks/volumes and then at oVirt side to import the corresponding
storage domain?
BTW: when importing, can I give the previous storage domain named "vmstore"
another name such as vmstore2, to avoid a conflict with the already existing
"vmstore" storage domain, or is the name hard-coded at import time, creating
a possible conflict?
Thanks,
Gianluca
Strange SD problem
by Arsène Gschwind
HI,
I'm having strange behavior with an SD. When trying to manage the SD, I see the "Add" button for the LUN which should already be the one in use for that SD.
In the Logs I see the following:
2020-07-13 17:48:07,292+02 ERROR [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] Can't execute batch: Batch entry 0 select * from public.insertluns(CAST ('repl_HanaLogs_osd_01' AS varchar),CAST ('DPUtaW-Q5zp-aZos-HriP-5Z0v-hiWO-w7rmwG' AS varchar),CAST ('4TCXZ7-R1l1-xkdU-u0vx-S3n4-JWcE-qksPd1' AS varchar),CAST ('SHUAWEI_XSG1_2102350RMG10HC0000200035' AS varchar),CAST (7 AS int4),CAST ('HUAWEI' AS varchar),CAST ('XSG1' AS varchar),CAST (2548 AS int4),CAST (268435456 AS int8)) as result was aborted: ERROR: duplicate key value violates unique constraint "pk_luns"
Detail: Key (lun_id)=(repl_HanaLogs_osd_01) already exists.
Where: SQL statement "INSERT INTO LUNs (
LUN_id,
physical_volume_id,
volume_group_id,
serial,
lun_mapping,
vendor_id,
product_id,
device_size,
discard_max_size
)
VALUES (
v_LUN_id,
v_physical_volume_id,
v_volume_group_id,
v_serial,
v_lun_mapping,
v_vendor_id,
v_product_id,
v_device_size,
v_discard_max_size
)"
PL/pgSQL function insertluns(character varying,character varying,character varying,character varying,integer,character varying,character varying,integer,bigint) line 3 at SQL statement Call getNextException to see other errors in the batch.
2020-07-13 17:48:07,292+02 ERROR [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] Can't execute batch. Next exception is: ERROR: duplicate key value violates unique constraint "pk_luns"
Detail: Key (lun_id)=(repl_HanaLogs_osd_01) already exists.
Where: SQL statement "INSERT INTO LUNs (
LUN_id,
physical_volume_id,
volume_group_id,
serial,
lun_mapping,
vendor_id,
product_id,
device_size,
discard_max_size
)
VALUES (
v_LUN_id,
v_physical_volume_id,
v_volume_group_id,
v_serial,
v_lun_mapping,
v_vendor_id,
v_product_id,
v_device_size,
v_discard_max_size
)"
PL/pgSQL function insertluns(character varying,character varying,character varying,character varying,integer,character varying,character varying,integer,bigint) line 3 at SQL statement
2020-07-13 17:48:07,293+02 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] transaction rolled back
2020-07-13 17:48:07,293+02 ERROR [org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] Command 'org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand' failed: ConnectionCallback; ]; ERROR: duplicate key value violates unique constraint "pk_luns"
Detail: Key (lun_id)=(repl_HanaLogs_osd_01) already exists.
Where: SQL statement "INSERT INTO LUNs (
LUN_id,
physical_volume_id,
volume_group_id,
serial,
lun_mapping,
vendor_id,
product_id,
device_size,
discard_max_size
)
VALUES (
v_LUN_id,
v_physical_volume_id,
v_volume_group_id,
v_serial,
v_lun_mapping,
v_vendor_id,
v_product_id,
v_device_size,
v_discard_max_size
)"
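Since the duplicate key is on lun_id, it may help to inspect the stored row directly before attempting a fix. A diagnostic sketch, assuming shell access to the engine host and the default `engine` database name (both assumptions); the column names come from the INSERT statement in the log above:

```sql
-- Run via: sudo -u postgres psql engine
-- Show the existing row that collides with the failed insert
SELECT lun_id, physical_volume_id, volume_group_id, serial, device_size
FROM   luns
WHERE  lun_id = 'repl_HanaLogs_osd_01';
```

Comparing that row's volume_group_id and serial with the values in the failed insert should show whether the engine is seeing the same LUN under a second identity.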
It looks like the engine tries to add to the SD a LUN that already exists...
Any idea how to resolve this problem?
Thanks a lot
--
Arsène Gschwind <arsene.gschwind@unibas.ch>
Universitaet Basel
ovirt 4.4.1.1 hci and problems with ansible 2.9.10 and/or missing python2
by Gianluca Cecchi
Hello,
today I wanted to test a single host hci install using
ovirt-node-ng-installer-4.4.1-2020071311.el8.iso
On this same environment the 4.4.0 GUI wizard worked OK, apart from a final
problem with CPU flags and the final engine boot-up, so I wanted to verify
whether everything is OK now, because that problem should have been fixed.
But I ran into the same problems recently discussed here:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3AARZD4VBNN...
The first part failed because of
TASK [gluster.infra/roles/backend_setup : Change to Install lvm tools for
RHEL systems.] ***
fatal: [novirt2st.storage.local]: FAILED! => {"changed": false, "msg": "The
Python 2 yum module is needed for this module. If you require Python 3
support use the `dnf` Ansible module instead."}
To make the first stage work (now I have to continue with the "Continue to
Hosted Engine Deployment" step) I had to modify:
- /etc/ansible/roles/gluster.infra/roles/backend_setup/tasks/main.yml
forcing it to use dnf as the package manager, which for some reason is not
automatically detected by the "package" Ansible module:
from
  - name: Change to Install lvm tools for RHEL systems.
    package:
      name: device-mapper-persistent-data
      state: present
    when: ansible_os_family == 'RedHat'
to
  - name: Change to Install lvm tools for RHEL systems.
    package:
      name: device-mapper-persistent-data
      state: present
      use: dnf
    when: ansible_os_family == 'RedHat'
- /etc/ansible/roles/gluster.infra/roles/backend_setup/tasks/vdo_create.yml
change from yum to package and specify to use dnf
from:
  - name: Install VDO dependencies
    #maybe use package module?
    yum:
      name: "{{ packages }}"
    register: vdo_deps
  ...
to:
  - name: Install VDO dependencies
    #maybe use package module?
    package:
      name: "{{ packages }}"
      use: dnf
    register: vdo_deps
  ...
So in my opinion a bug should be opened, if not already done.
The root cause is either ansible 2.9.10, which recently replaced 2.9.9, or
the removal of the python2 modules from the distribution: before, python2
was for some reason silently used with yum, while the "package" module was
not picking the expected dnf module.
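As a side note, instead of hard-coding `use: dnf`, a more portable workaround (a sketch, assuming the `ansible_pkg_mgr` fact is populated on the target) would be to pass the detected package manager through explicitly:

```yaml
- name: Change to Install lvm tools for RHEL systems.
  package:
    name: device-mapper-persistent-data
    state: present
    use: "{{ ansible_pkg_mgr | default('auto') }}"
  when: ansible_os_family == 'RedHat'
```

That keeps the task working on EL7 (yum) hosts too, should the role ever run there again.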
I'm going to test if in the next phase any further modification has to be
done.
Gianluca
oVirt Node 4.4.1.1 Cockpit Hyperconverged Gluster deploy fails insufficient free space no matter how small the volume is set
by clam2718@gmail.com
Hi,
Deploying oVirt 4.4.1.1 via Cockpit --> Hosted Engine --> Hyperconverged fails at Gluster deployment:
TASK [gluster.infra/roles/backend_setup : Create thick logical volume] *********
failed: [fmov1n3.sn.dtcorp.com] (item={'vgname': 'gluster_vg_nvme0n1', 'lvname': 'gluster_lv_engine', 'size': '100G'}) => {"ansible_index_var": "index", "ansible_loop_var": "item", "changed": false, "err": " Volume group \"gluster_vg_nvme0n1\" has insufficient free space (25599 extents): 25600 required.\n", "index": 0, "item": {"lvname": "gluster_lv_engine", "size": "100G", "vgname": "gluster_vg_nvme0n1"}, "msg": "Creating logical volume 'gluster_lv_engine' failed", "rc": 5}
failed: [fmov1n1.sn.dtcorp.com] (item={'vgname': 'gluster_vg_nvme0n1', 'lvname': 'gluster_lv_engine', 'size': '100G'}) => {"ansible_index_var": "index", "ansible_loop_var": "item", "changed": false, "err": " Volume group \"gluster_vg_nvme0n1\" has insufficient free space (25599 extents): 25600 required.\n", "index": 0, "item": {"lvname": "gluster_lv_engine", "size": "100G", "vgname": "gluster_vg_nvme0n1"}, "msg": "Creating logical volume 'gluster_lv_engine' failed", "rc": 5}
failed: [fmov1n2.sn.dtcorp.com] (item={'vgname': 'gluster_vg_nvme0n1', 'lvname': 'gluster_lv_engine', 'size': '100G'}) => {"ansible_index_var": "index", "ansible_loop_var": "item", "changed": false, "err": " Volume group \"gluster_vg_nvme0n1\" has insufficient free space (25599 extents): 25600 required.\n", "index": 0, "item": {"lvname": "gluster_lv_engine", "size": "100G", "vgname": "gluster_vg_nvme0n1"}, "msg": "Creating logical volume 'gluster_lv_engine' failed", "rc": 5}
Deployment is on 3 Dell PowerEdge R740xd hosts with 5 1.6TB NVMe drives per host. The deployment uses only three of the drives, as JBOD, one drive per node per volume (engine, data, vmstore), utilizing VDO.
Thus, deploying even a 100G volume to a 1.6TB drive fails with an "insufficient free space" error.
I suspect this might have to do with the Ansible playbook deploying Gluster mishandling the logical volume creation due to the rounding error as described here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/...
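The error message itself supports that theory: LVM allocates in physical extents (4 MiB by default), so a 100G LV needs exactly 25600 extents, while the VG reports only 25599 free, i.e. the VG came out one extent short. A quick check of the arithmetic (assuming the default 4 MiB extent size):

```python
# LVM's default physical extent size is 4 MiB
EXTENT_MIB = 4

def extents_needed(size_gib):
    """Number of 4 MiB extents required for an LV of size_gib GiB."""
    return size_gib * 1024 // EXTENT_MIB

required = extents_needed(100)   # what the playbook's 'size: 100G' asks for
available = 25599                # what the VG reports free in the error
shortfall_mib = (required - available) * EXTENT_MIB
print(required, available, shortfall_mib)
# The VG is a single 4 MiB extent short of the requested 100G, so a
# slightly smaller LV size (or sizing the LV from the VG's actual free
# extents) would avoid the failure.
```

This matches the rounding behaviour described in the linked Red Hat article: the usable VG size after VDO and metadata overhead rounds down to just under the requested LV size.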
If I can provide any additional information, logs, etc. please ask. Also, if anyone has experience/suggestions with Gluster config for hyperconverged setup on NVMe drives I would greatly appreciate any pearls of wisdom.
Thank you so very much for any assistance!
Charles
Parent checkpoint ID does not match the actual leaf checkpoint
by Łukasz Kołaciński
Hello,
Thanks to previous answers, I was able to make backups. Unfortunately, we had some infrastructure issues, and after the host rebooted new problems appeared. I am not able to do any backup using the commands that worked yesterday. I looked through the logs and there is something like this:
2020-07-17 15:06:30,644+02 ERROR [org.ovirt.engine.core.bll.StartVmBackupCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-54) [944a1447-4ea5-4a1c-b971-0bc612b6e45e] Failed to execute VM backup operation 'StartVmBackup': {}: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to StartVmBackupVDS, error = Checkpoint Error: {'parent_checkpoint_id': None, 'leaf_checkpoint_id': 'cd078706-84c0-4370-a6ec-654ccd6a21aa', 'vm_id': '116aa6eb-31a1-43db-9b1e-ad6e32fb9260', 'reason': 'Parent checkpoint ID does not match the actual leaf checkpoint'}, code = 1610 (Failed with error unexpected and code 16)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:114)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2114)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.StartVmBackupCommand.performVmBackupOperation(StartVmBackupCommand.java:368)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.StartVmBackupCommand.runVmBackup(StartVmBackupCommand.java:225)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.StartVmBackupCommand.performNextOperation(StartVmBackupCommand.java:199)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
at org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
at org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
And the last error is:
2020-07-17 15:13:45,835+02 ERROR [org.ovirt.engine.core.bll.StartVmBackupCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-14) [f553c1f2-1c99-4118-9365-ba6b862da936] Failed to execute VM backup operation 'GetVmBackupInfo': {}: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetVmBackupInfoVDS, error = No such backup Error: {'vm_id': '116aa6eb-31a1-43db-9b1e-ad6e32fb9260', 'backup_id': 'bf1c26f7-c3e5-437c-bb5a-255b8c1b3b73', 'reason': 'VM backup not exists: Domain backup job id not found: no domain backup job present'}, code = 1601 (Failed with error unexpected and code 16)
(these errors are from full backup)
As I said, this is very strange because everything was working correctly before.
Regards
Łukasz Kołaciński
Junior Java Developer
e-mail: l.kolacinski@storware.eu
ul. Leszno 8/44
01-192 Warszawa
www.storware.eu <https://www.storware.eu/>