Error: Adding new Host to ovirt-engine
by Ahmad Khiet
Hi,
I can't add a new host to ovirt-engine because of the following error:
2019-06-12 12:23:09,664 p=4134 u=engine | TASK [ovirt-host-deploy-facts :
Set facts] *************************************
2019-06-12 12:23:09,684 p=4134 u=engine | ok: [10.35.1.17] => {
"ansible_facts": {
"ansible_python_interpreter": "/usr/bin/python2",
"host_deploy_vdsm_version": "4.40.0"
},
"changed": false
}
2019-06-12 12:23:09,697 p=4134 u=engine | TASK [ovirt-provider-ovn-driver
: Install ovs] *********************************
2019-06-12 12:23:09,726 p=4134 u=engine | fatal: [10.35.1.17]: FAILED! =>
{}
MSG:
The conditional check 'cluster_switch == "ovs" or (ovn_central is defined
and ovn_central | ipaddr and ovn_engine_cluster_version is
version_compare('4.2', '>='))' failed. The error was: The ipaddr filter
requires python's netaddr be installed on the ansible controller
The error appears to be in
'/home/engine/apps/engine/share/ovirt-engine/playbooks/roles/ovirt-provider-ovn-driver/tasks/configure.yml':
line 3, column 5, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
- block:
- name: Install ovs
^ here
2019-06-12 12:23:09,728 p=4134 u=engine | PLAY RECAP
*********************************************************************
2019-06-12 12:23:09,728 p=4134 u=engine | 10.35.1.17 :
ok=3 changed=0 unreachable=0 failed=1 skipped=0 rescued=0
ignored=0
What's missing?
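The error text names the missing piece: the ipaddr filter needs the Python netaddr library on the Ansible controller (here, the engine machine), not on the host being added. Below is a minimal sketch for checking whether a module is importable. It only tells you something if it is run with the same Python interpreter that Ansible uses on the controller, which is an assumption to verify for your setup. The helper name module_available is made up for this example.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported
    by the interpreter running this script."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "netaddr" is the library named in the error message above.
    for mod in ("netaddr",):
        state = "installed" if module_available(mod) else "MISSING"
        print(f"{mod}: {state}")
```

If it reports MISSING, installing netaddr for that interpreter (from a distribution package or with pip) should let the conditional evaluate.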
Thanks
--
Ahmad Khiet
Red Hat <https://www.redhat.com/>
akhiet(a)redhat.com
M: +972-54-6225629
<https://red.ht/sig>
1 year, 5 months
Error Java SDK Issue??
by Geschwentner, Patrick
Dear Ladies and Gentlemen!
I am currently working with the java-sdk and I encountered a problem.
When I try to retrieve the disk details, I get an error on the following line:
Disk currDisk = ovirtConnection.followLink(diskAttachment.disk());
[screenshot of the failing line not preserved]
The getResponse result looks quite OK (I inspected it and it looks fine; screenshot not preserved).
Error:
wrong number of arguments
The code is quite similar to what you published on GitHub (https://github.com/oVirt/ovirt-engine-sdk-java/blob/master/sdk/src/test/j... ).
Can you confirm the defect?
Best regards
Patrick
3 years, 10 months
CentOS Stream support
by Michal Skrivanek
Hi all,
We would like to gauge the community's interest in oVirt moving to CentOS Stream.
There have been some requests before, but it’s hard to see how many people would really like to see that.
With CentOS releases lagging behind RHEL by months, it’s interesting to consider moving to CentOS Stream, as it is much more up to date and allows us to fix bugs faster, with fewer workarounds and less overhead for maintaining old code. E.g. our current integration tests do not really pass on CentOS 8.1, and we can’t do much about that other than wait for more up-to-date packages. It would also bring us closer to making oVirt run smoothly on RHEL, as RHEL is also much closer to Stream than it is to an outdated CentOS.
So, would you like us to support CentOS Stream?
We don’t really have the capacity to run 3 different platforms. Would you still want oVirt to support CentOS Stream if it means “less support” for regular CentOS?
There are some concerns about Stream being a bit less stable. Do you share those concerns?
Thank you for your comments,
michal
3 years, 12 months
implementing hotplugCd/hotunplugCd in vdsm
by Fedor Gavrilov
Hey,
While trying to fix the change CD functionality, we discovered a few other potential issues. Nir suggested implementing two [somewhat] new functions in VDSM: hotplug and hotunplug for CDs, similar to how they work for normal disks now. The existing changeCD function will be left as is for backwards compatibility.
As I found out, the engine already calculates iface and index before invoking VDSM functions, so we will just pass these along with the PDIV to VDSM.
The suggested flow is, to quote:
>So the complete change CD flow should be:
>
>1. get the previous drivespec from vm metadata
>2. prepare new drivespec
>3. add new drivespec to vm metadata
>4. attach a new device to vm
>5. teardown the previous drivespec
>6. remove previous drivespec from vm metadata
>
>When the vm is stopped, it must do:
>
>1. get drive spec from vm metadata
>2. teardown drivespec
>
>During attach, there are interesting races:
>- what happens if vdsm crashes after step 2? who will teardown the volume?
> maybe you need to add the new drivespec to the metadata first,
>before preparing it.
>- what happens if attach failed? who will remove the new drive from
>the metadata?
Now, what makes hotplugDisk/hotunplugDisk different? From what I understand, the flow is the same there, so what difference is there as far as VDSM is concerned? If there is none, then more or less copying that code and adjusting minor details and data for CDs should work, shouldn't it?
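To make the quoted flow and its two races concrete, here is a small in-memory sketch. It is not VDSM code: FakeVm, change_cd and stop_vm are hypothetical names, drivespecs are plain strings, and "metadata" is just a list. It follows the ordering hinted at in the quote (record the new drivespec in the metadata before preparing it) and rolls back if the attach fails.

```python
class FakeVm:
    """In-memory stand-in for a VM. All names are hypothetical, not VDSM APIs."""

    def __init__(self, metadata=None, fail_attach=False):
        self.metadata = list(metadata or [])  # drivespecs recorded in vm metadata
        self.prepared = set(self.metadata)    # volumes currently prepared
        self.attached = self.metadata[-1] if self.metadata else None
        self.fail_attach = fail_attach

    def prepare(self, spec):
        self.prepared.add(spec)

    def teardown(self, spec):
        self.prepared.discard(spec)

    def attach(self, spec):
        if self.fail_attach:
            raise RuntimeError("attach failed")
        self.attached = spec


def change_cd(vm, new_spec):
    """Swap the CD, keeping metadata consistent on failure."""
    # Get the previous drivespec from the vm metadata.
    previous = vm.metadata[-1] if vm.metadata else None
    # Record the new drivespec first, so that a crash after prepare
    # still leaves a trace telling someone what to tear down.
    vm.metadata.append(new_spec)
    try:
        vm.prepare(new_spec)
        vm.attach(new_spec)
    except Exception:
        # Attach failed: undo the prepare and drop the metadata entry.
        vm.teardown(new_spec)
        vm.metadata.remove(new_spec)
        raise
    # Only after a successful attach is the previous drivespec released.
    if previous is not None:
        vm.teardown(previous)
        vm.metadata.remove(previous)


def stop_vm(vm):
    """On VM stop, tear down every drivespec still recorded in metadata."""
    for spec in list(vm.metadata):
        vm.teardown(spec)
        vm.metadata.remove(spec)
```

The sketch says nothing about how hotplugDisk handles the same races. It only shows that the ordering in the quote is enough to leave either the old or the new drivespec fully in place.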
Thanks!
Fedor
4 years, 5 months
vdsm.storage.exception.UnknownTask: Task id unknown (was: [oVirt Jenkins] ovirt-system-tests_he-basic-suite-master - Build # 1641 - Still Failing!)
by Yedidyah Bar David
On Wed, Jun 17, 2020 at 6:28 AM <jenkins(a)jenkins.phx.ovirt.org> wrote:
>
> Project: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
> Build: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1641/
This one failed while trying to create the disk image for the hosted-engine VM:
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/16...
:
2020-06-16 23:03:20,527-0400 INFO ansible task start {'status': 'OK',
'ansible_type': 'task', 'ansible_playbook':
'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
'ansible_task': 'ovirt.hosted_engine_setup : Add HE disks'}
...
2020-06-16 23:14:12,702-0400 DEBUG var changed: host "localhost" var
"add_disks" type "<class 'dict'>" value: "{
...
"msg": "Timeout exceed while waiting on result state of the entity."
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/16...
:
2020-06-16 23:03:22,612-04 INFO
[org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (default task-1)
[16c24599-0048-44eb-a410-d39b7ce98712]
CommandMultiAsyncTasks::attachTask: Attaching task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' to command
'fa81759d-c57a-4237-81e0-beb210faa64d'.
2020-06-16 23:03:22,659-04 INFO
[org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (default task-1)
[16c24599-0048-44eb-a410-d39b7ce98712] Adding task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' (Parent Command
'AddImageFromScratch', Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'),
polling hasn't started yet..
2020-06-16 23:03:22,699-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (default task-1)
[16c24599-0048-44eb-a410-d39b7ce98712]
BaseAsyncTask::startPollingTask: Starting to poll task
'6b2a7648-748c-430b-94b6-5e3f719df2ac'.
...
2020-06-16 23:03:25,835-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-25)
[] SPMAsyncTask::PollTask: Polling task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' (Parent Command
'AddImageFromScratch', Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters')
returned status 'finished', result 'success'.
2020-06-16 23:03:25,863-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-25)
[] BaseAsyncTask::onTaskEndSuccess: Task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' (Parent Command
'AddImageFromScratch', Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
successfully.
But then:
2020-06-16 23:03:25,897-04 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712]
CommandAsyncTask::HandleEndActionResult [within thread]: endAction for
action type 'AddImageFromScratch' succeeded, clearing tasks.
2020-06-16 23:03:25,897-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712] SPMAsyncTask::ClearAsyncTask:
Attempting to clear task '6b2a7648-748c-430b-94b6-5e3f719df2ac'
2020-06-16 23:03:25,899-04 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712] START, SPMClearTaskVDSCommand(
SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='3bcde3b4-b044-11ea-bbb6-5452c0a8c863',
ignoreFailoverLimit='false',
taskId='6b2a7648-748c-430b-94b6-5e3f719df2ac'}), log id: 481c2d3d
2020-06-16 23:03:25,900-04 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712] START,
HSMClearTaskVDSCommand(HostName = lago-he-basic-suite-master-host-0,
HSMTaskGuidBaseVDSCommandParameters:{hostId='85ecc51c-f2cb-46a1-9452-fd487399d8dd',
taskId='6b2a7648-748c-430b-94b6-5e3f719df2ac'}), log id: 17360b3d
...
2020-06-16 23:03:26,054-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712]
BaseAsyncTask::removeTaskFromDB: Removed task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' from DataBase
But then:
2020-06-16 23:03:26,315-04 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-55)
[7fe7b467] Command 'UploadStreamVDSCommand(HostName =
lago-he-basic-suite-master-host-0,
UploadStreamVDSCommandParameters:{hostId='85ecc51c-f2cb-46a1-9452-fd487399d8dd'})'
execution failed: javax.net.ssl.SSLPeerUnverifiedException:
Certificate for <lago-he-basic-suite-master-host-0.lago.local> doesn't
match any of the subject alternative names:
[lago-he-basic-suite-master-host-0.lago.local]
2020-06-16 23:03:26,315-04 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-55)
[7fe7b467] FINISH, UploadStreamVDSCommand, return: , log id: 7e3a3e80
2020-06-16 23:03:26,316-04 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-55)
[7fe7b467] Command
'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand'
failed: EngineException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
javax.net.ssl.SSLPeerUnverifiedException: Certificate for
<lago-he-basic-suite-master-host-0.lago.local> doesn't match any of
the subject alternative names:
[lago-he-basic-suite-master-host-0.lago.local] (Failed with error
VDS_NETWORK_ERROR and code 5022)
Any idea why?
Has anything changed in how we check the certificate?
Perhaps related to upgrade to CentOS 8.2?
And, how come it failed only this late? Don't we check the certificate earlier?
Anyway, this left the host in "not responding" state, so:
2020-06-16 23:03:29,994-04 ERROR
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommandCallback]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-79)
[16c24599-0048-44eb-a410-d39b7ce98712] Failed to get volume info:
org.ovirt.engine.core.common.errors.EngineException: EngineException:
No host was found to perform the operation (Failed with error
RESOURCE_MANAGER_VDS_NOT_FOUND and code 5004)
And perhaps due to an unrelated issue, also:
2020-06-16 23:03:31,177-04 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-43)
[16c24599-0048-44eb-a410-d39b7ce98712] Trying to revert unknown task
'6b2a7648-748c-430b-94b6-5e3f719df2ac'
I also looked a bit at:
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/16...
and see some relevant entries there, but nothing that points me to the
root cause (e.g. the word "cert" does not appear there).
Can anyone please have a look? Thanks.
> Build Number: 1641
> Build Status: Still Failing
> Triggered By: Started by timer
>
> -------------------------------------
> Changes Since Last Success:
> -------------------------------------
> Changes for Build #1633
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
> [Ehud Yonasi] mock: fix yum repos injection.
>
> [Ehud Yonasi] onboard ost-images to stdci.
>
>
> Changes for Build #1634
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1635
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1636
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1637
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1638
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
> [Ehud Yonasi] stdci_runner: update templates node to ost-images.
>
>
> Changes for Build #1639
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1640
> [Yedidyah Bar David] Allow engine 20 minutes to come up after VM restart
>
>
> Changes for Build #1641
> [Michal Skrivanek] test live storage migration again
>
> [Ehud Yonasi] poll: add ost-images to nightly.
>
>
>
>
> -----------------
> Failed Tests:
> -----------------
> No tests ran.
--
Didi
4 years, 6 months
Re: [oVirt Jenkins] ovirt-system-tests_he-basic-suite-master - Build # 1655 - Still Failing!
by Yedidyah Bar David
On Sun, Jun 28, 2020 at 6:23 AM <jenkins(a)jenkins.phx.ovirt.org> wrote:
>
> Project: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
> Build: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1655/
This has been failing for some time now. I checked the last one (above), and it failed in:
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/16...
2020-06-27 23:05:02,253-0400 INFO ansible task start {'status': 'OK',
'ansible_type': 'task', 'ansible_playbook':
'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
'ansible_task': 'ovirt.hosted_engine_setup : Check OVF_STORE volume
status'}
2020-06-27 23:05:02,253-0400 DEBUG ansible on_any args TASK:
ovirt.hosted_engine_setup : Check OVF_STORE volume status kwargs
is_conditional:False
2020-06-27 23:05:02,254-0400 DEBUG ansible on_any args localhostTASK:
ovirt.hosted_engine_setup : Check OVF_STORE volume status kwargs
2020-06-27 23:05:03,816-0400 DEBUG ansible on_any args
<ansible.executor.task_result.TaskResult object at 0x7f454e3d1eb8>
kwargs
...
2020-06-27 23:09:39,166-0400 DEBUG var changed: host "localhost" var
"ovf_store_status" type "<class 'dict'>" value: "{
"changed": true,
"failed": true,
"msg": "All items completed",
"results": [
{
"ansible_loop_var": "item",
"attempts": 12,
Meaning, it ran the following command 12 times:
vdsm-client Volume getInfo
storagepoolID=41c9fdea-b8e9-11ea-ae2a-5452c0a8c863
storagedomainID=10a69775-8fb6-437d-9e78-2ecfd77c0a45
imageID=c2ad2065-1c8b-4ec1-afdd-f7cefc708cf9
volumeID=6b835f55-a512-4f83-9d25-f6837d8b5cb1
and never got a result with output including "Updated". Any idea?
Perhaps it's just a bug in the test, and we should check for something
other than "Updated"?
Or perhaps there was some other problem but I failed to find it?
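For reference, the retry behaviour described above (run a command up to 12 times and look for "Updated" in its output) can be sketched as follows. This is not the code of the hosted-engine setup role: wait_for_marker and its parameters are hypothetical, the delay between attempts is unknown from the log, and the command is injected as a callable so the loop can be exercised without a host.

```python
import time


def wait_for_marker(run_command, marker="Updated", attempts=12, delay=0.0):
    """Call run_command() until its output contains `marker`.

    Returns the 1-based number of the attempt that succeeded,
    or None if all attempts were used up.
    """
    for attempt in range(1, attempts + 1):
        output = run_command()
        if marker in output:
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # assumed pause between retries
    return None


if __name__ == "__main__":
    # Fake outputs standing in for the getInfo results.
    outputs = iter(["description: pending", "description: Updated"])
    print(wait_for_marker(lambda: next(outputs)))
```

Logging the full output of the last failed attempt would show directly whether the volume info simply never contains the expected text or whether the command itself is failing.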
Thanks,
> Build Number: 1655
> Build Status: Still Failing
> Triggered By: Started by timer
>
> -------------------------------------
> Changes Since Last Success:
> -------------------------------------
> Changes for Build #1643
> [Galit] Add missing repo which has collectd-write_syslog
>
> [Sandro Bonazzola] Revert "ovirt-release: run only on fc29 nodes"
>
>
> Changes for Build #1644
> [Michal Skrivanek] make GLANCE failures fatal
>
>
> Changes for Build #1645
> [Michal Skrivanek] use // in glance URL
>
>
> Changes for Build #1646
> [Galit] Fix: No module named ost_utils.memoized
>
>
> Changes for Build #1647
> [Martin Necas] ansible: fix deploy scripts to el8
>
> [arachmani] Add arachman(a)redhat.com to jenkins recipient lists
>
>
> Changes for Build #1648
> [Martin Necas] ansible: fix deploy scripts to el8
>
> [Evgheni Dereveanchin] Do not run jenkins check-patch job on fc29
>
>
> Changes for Build #1649
> [Martin Necas] ansible: fix deploy scripts to el8
>
>
> Changes for Build #1650
> [Martin Necas] ansible: fix deploy scripts to el8
>
>
> Changes for Build #1651
> [Galit] Add centos8.2 official to templates
>
>
> Changes for Build #1652
> [Evgeny Slutsky] he-basic-suite-master: Add Domain_name to the host when vm_run
>
> [Gal Ben Haim] Add gcc to fcraw
>
> [Sandro Bonazzola] pipelines: add ovirt-dependencies
>
>
> Changes for Build #1653
> [Evgeny Slutsky] he-basic-suite-master: Add Domain_name to the host when vm_run
>
>
> Changes for Build #1654
> [Galit] Add centos 8.2 image: el8.2-base
>
> [Evgheni Dereveanchin] do not expand values in run_oc_playbook
>
>
> Changes for Build #1655
> [Galit] Add centos 8.2 image: el8.2-base
>
>
>
>
> -----------------
> Failed Tests:
> -----------------
> No tests ran.
--
Didi
4 years, 6 months
Backup: how to download only used extents from imageio backend
by Michael Ablassmeier
hi,
I'm currently looking at the new incremental backup API that has been
part of the 4.4 and RHV 4.4-beta releases. So far I was able to create
full/incremental backups and restore them without any problem.
Now, using the backup_vm.py example from the ovirt-engine-sdk, I see that
the following happens during a full backup:
1) imageio client api requests transfer
2) starts qemu-img to create a local qemu image with same size
3) starts qemu-nbd to serve this image
4) reads used extents from provided imageio source, passes data to
qemu-nbd process
5) resulting file is a thin provisioned qcow image with the actual
data of the VM's used space.
While this works great, it has one downside: if I back up a virtual
machine with lots of used extents, or multiple virtual machines at the
same time, I may run out of space if my primary backup target is
not a regular disk.
Imagine I want to stream the FULL backup directly to tape, like this:
backup_vm.py full [..] <vm_uuid> /dev/nst0
That's currently not possible, because qemu-img is not able to open
a tape device directly, given the nature of the qcow2 format.
So what I am basically looking for is a way to download only the extents
from the imageio server that are really in use, without depending on the
qemu-* tools, so that I can pipe the data somewhere else.
Standard tools, like curl for example, will always download the full
provisioned image from the imageio backend (of course).
I noticed that it is possible to query the extents via:
https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6/ex...
As I failed to find them: are there any existing functions/API calls
that could be used to download only the used extents to a file/FIFO
pipe?
So far, I have played around with the _internal.io.copy function and was
able to at least read the data into an in-memory BytesIO stream, but that's
not the solution to my "problem" :)
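To illustrate what "download only the used extents" could look like without the qemu-* tools, here is a rough sketch of the client-side bookkeeping. It rests on two assumptions that should be checked against the imageio documentation: that the extents query returns a JSON list of objects with "start", "length" and "zero" fields, and that the image URL accepts HTTP Range requests. The helper names data_ranges and range_header are made up for this example, and no network access is performed.

```python
def data_ranges(extents):
    """Merge adjacent non-zero extents into (start, length) byte ranges."""
    ranges = []
    for ext in sorted(extents, key=lambda e: e["start"]):
        if ext["zero"]:
            continue  # zero extents carry no data worth transferring
        if ranges and ranges[-1][0] + ranges[-1][1] == ext["start"]:
            # This extent directly follows the previous range: extend it.
            start, length = ranges[-1]
            ranges[-1] = (start, length + ext["length"])
        else:
            ranges.append((ext["start"], ext["length"]))
    return ranges


def range_header(start, length):
    """Value for an HTTP Range header; the end offset is inclusive."""
    return f"bytes={start}-{start + length - 1}"


if __name__ == "__main__":
    # Example extents in the assumed format.
    extents = [
        {"start": 0, "length": 4096, "zero": False},
        {"start": 4096, "length": 4096, "zero": False},
        {"start": 8192, "length": 8192, "zero": True},
        {"start": 16384, "length": 4096, "zero": False},
    ]
    for start, length in data_ranges(extents):
        print(start, length, range_header(start, length))
```

A stream that contains only the data ranges loses the offsets, so the (start, length) pairs would have to be written alongside the data (for example as a small header per range) to make a restore possible.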
bye,
- michael
4 years, 6 months