Failed to configure management network on the host. - oVirt in a KVM
by lejeczek
Hi guys,
I'm trying oVirt in a KVM and I get this:
...
[ INFO ] The host has been set in non_operational status,
deployment errors: code 505: Host c8ovirt1.private.road
installation failed. Failed to configure management network
on the host., code 519: Host c8ovirt1.private.road does
not comply with the cluster road networks, the following
networks are missing on host: 'ovirtmgmt', code 1120:
Failed to configure management network on host
c8ovirt1.private.road due to setup networks failure.,
code 9000: Failed to verify Power Management configuration
for Host c8ovirt1.private.road., code 10802: VDSM
c8ovirt1.private.road command HostSetupNetworksVDS failed:
Internal JSON-RPC error: {'reason': 'Failed to find
interface to with route table ID 254 to store route rules'},
...
How much of a worry is this, and how do I fix the KVM VM so that
oVirt is happy?
Many thanks, L.
3 years, 10 months
Re: Newer kernel for oVirt Node NG
by Shantur Rathore
Tried it again, it just breaks the boot.
After a lot of firefighting, I found that the elrepo kernel-lt doesn't have `xfs`
in its initramfs image, so it fails to mount the LVM volume, which uses xfs.
# lsinitrd /boot/initramfs-4.18.0-240.1.1.el8_3.x86_64.img | grep xfs.ko
-rw-r--r-- 1 root root 442760 Aug 11 2020
usr/lib/modules/4.18.0-240.1.1.el8_3.x86_64/kernel/fs/xfs/xfs.ko.xz
I tested kernel-ml, and it has the needed xfs module.
After installing the new kernel, I needed to copy over the "options" line
from /boot/loader/entries/ovirt*.conf to the new kernel's entry.
Hope this saves a lot of hair pulling for someone.
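The "options" copy described above could be scripted roughly as follows. This is only a minimal sketch, assuming the boot entries are plain-text files in which the kernel command line sits on a single line starting with `options`; the helper name `copy_options_line` and both sample entries are hypothetical and not taken from a real host.

```python
def copy_options_line(source_entry: str, target_entry: str) -> str:
    """Return target_entry with its 'options' line taken from source_entry."""

    def is_options(line: str) -> bool:
        # True when the first whitespace-separated token is "options".
        return line.split(None, 1)[:1] == ["options"]

    source_options = [l for l in source_entry.splitlines() if is_options(l)]
    if not source_options:
        raise ValueError("source entry has no 'options' line")
    options = source_options[0]

    out, replaced = [], False
    for line in target_entry.splitlines():
        if is_options(line):
            if not replaced:
                out.append(options)  # swap in the options from the source entry
                replaced = True
            # any further 'options' lines in the target are dropped
        else:
            out.append(line)
    if not replaced:
        out.append(options)  # target had no options line at all
    return "\n".join(out) + "\n"


# Hypothetical sample entries, for illustration only.
OVIRT_ENTRY = """title oVirt Node (hypothetical sample)
linux /vmlinuz-old
options root=/dev/mapper/example ro
"""
NEW_ENTRY = """title New kernel (hypothetical sample)
linux /vmlinuz-new
options root=UUID=placeholder ro
"""

merged = copy_options_line(OVIRT_ENTRY, NEW_ENTRY)
print(merged)
```

On a real node the two strings would be read from the files under /boot/loader/entries, and it would be prudent to keep a backup of the new entry before overwriting it.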
On Tue, Feb 9, 2021 at 7:23 PM Matthew.Stier(a)fujitsu.com <
Matthew.Stier(a)fujitsu.com> wrote:
> ‘Node’ images are just minimal CentOS plus all the packages for this
> release, in a sub-DVD sized ISO. Once it is installed, the operating
> system is still CentOS, and can be patched/modified, including installing
> alternative kernels.
>
> If it fails, boot back to the original kernel, and remove the package from
> Elrepo, or at worst, re-install.
>
> *From:* Shantur Rathore <rathore4u(a)gmail.com>
> *Sent:* Tuesday, February 9, 2021 11:11 AM
> *To:* Stier, Matthew <Matthew.Stier(a)fujitsu.com>; users <users(a)ovirt.org>
> *Subject:* Re: [ovirt-users] Newer kernel for oVirt Node NG
>
> Thanks Matthew,
>
> Are you sure about this?
>
> I thought oVirt Node NG images are immutable
>
> On Tue, Feb 9, 2021 at 3:42 PM Matthew.Stier(a)fujitsu.com <
> Matthew.Stier(a)fujitsu.com> wrote:
>
> Elrepo.org
>
> *From:* Shantur Rathore <rathore4u(a)gmail.com>
> *Sent:* Tuesday, February 9, 2021 9:28 AM
> *To:* users <users(a)ovirt.org>
> *Subject:* [ovirt-users] Newer kernel for oVirt Node NG
>
> Hi oVirt Users,
>
> I am trying to test some vfio related stuff on oVirt Node NG 4.4.4 based
> host.
>
> What would be the easiest way to have a 5.x kernel on this node?
>
> I don't mind compiling if I need to.
>
> Cheers,
>
> Shantur
3 years, 10 months
Account on Zanata
by Vasiliy Kovrizhkin
Good day!
My name is Vasiliy, and I am looking for a localization of oVirt (or
I may make my own translation of the user interface).
Could you please create an account for me on Zanata?
Thank you!
--
3 years, 10 months
OVirt rest api 4.3. How do you get the job id started by the async parameter
by pascal@butterflyit.com
I am using the REST API to create a VM. Because the VM is cloned from a template and cloning takes a long time, I am also passing the async parameter, hoping to receive back a job id that I could then query:
https://xxxxx/ovirt-engine/api/vms?async=true&clone=true
However, I get back the new VM record, which is fine, but then I have no way of knowing which job id I should query to find out when it has finished. Looking at all the jobs, there is no reference back to the VM except in the description:
<job href="/ovirt-engine/api/jobs/d17125c7-6668-4b6c-ad22-95121cb66a31" id="d17125c7-6668-4b6c-ad22-95121cb66a31">
<actions>
<link href="/ovirt-engine/api/jobs/d17125c7-6668-4b6c-ad22-95121cb66a31/clear" rel="clear"/>
<link href="/ovirt-engine/api/jobs/d17125c7-6668-4b6c-ad22-95121cb66a31/end" rel="end"/>
</actions>
<description>Creating VM DEMO-PCC-4 from Template MASTER-W10-20H2-CDrive in Cluster d1-c2</description>
<link href="/ovirt-engine/api/jobs/d17125c7-6668-4b6c-ad22-95121cb66a31/steps" rel="steps"/>
<auto_cleared>true</auto_cleared>
<external>false</external>
<last_updated>2021-01-21T12:49:06.700-08:00</last_updated>
<start_time>2021-01-21T12:48:59.453-08:00</start_time>
<status>started</status>
<owner href="/ovirt-engine/api/users/0f2291fa-872a-11e9-b13c-00163e449339" id="0f2291fa-872a-11e9-b13c-00163e449339"/>
</job>
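Since the description is the only field in the job record above that mentions the VM, one possible workaround is to list the jobs and match on that text. The following is a minimal sketch, not a documented API feature: it assumes the jobs listing is returned as a `<jobs>` element wrapping `<job>` elements shaped like the one above, and the helper name `find_jobs_for_vm` is hypothetical.

```python
import xml.etree.ElementTree as ET


def find_jobs_for_vm(jobs_xml: str, vm_name: str):
    """Return (job id, status) pairs whose description names vm_name."""
    root = ET.fromstring(jobs_xml)
    matches = []
    for job in root.iter("job"):
        description = job.findtext("description", default="")
        # The trailing space avoids matching a longer VM name with the same prefix.
        if f"Creating VM {vm_name} " in description:
            matches.append((job.get("id"), job.findtext("status", default="")))
    return matches


# Sample built from the job record quoted above; the <jobs> wrapper is an assumption.
SAMPLE = """<jobs>
<job href="/ovirt-engine/api/jobs/d17125c7-6668-4b6c-ad22-95121cb66a31" id="d17125c7-6668-4b6c-ad22-95121cb66a31">
<description>Creating VM DEMO-PCC-4 from Template MASTER-W10-20H2-CDrive in Cluster d1-c2</description>
<status>started</status>
</job>
</jobs>"""

print(find_jobs_for_vm(SAMPLE, "DEMO-PCC-4"))
```

Matching on free text is fragile (the wording may differ between versions or locales), so this should be treated as a stopgap until a proper job reference is available.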
3 years, 10 months
Re: HostedEngine VM Paused after power failure
by Ian Easter
Robert,
I understand the sentiment about the difficulty here. The recovery feels
brutal, but the monolithic nature and the dense ecosystem are understandable
for the purpose they serve.
I am able to mount the raw disk image for the HostedEngine VM cleanly,
without any errors, and it seems to check out, so I don't believe there is
any corruption.
Everything looks to operate as expected, and then it just seems to snag
somewhere during startup. I suppose I'm just trying to track down the
hiccup, clear it out of the way, and let the VM boot up. My knowledge is
a bit limited when it comes to digging in and troubleshooting the components here.
Additional snippet:
MainThread::INFO::2021-02-09
21:00:07,357::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
stderr: Command VM.getStats with args {'vmID':
'74b3c839-c89c-4857-ada0-95715672348a'} failed:
(code=1, message=Virtual machine does not exist: {'vmId':
'74b3c839-c89c-4857-ada0-95715672348a'})
MainThread::INFO::2021-02-09
21:00:07,357::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
Engine VM started on localhost
MainThread::INFO::2021-02-09
21:00:07,389::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineStart-EngineStarting)
sent? ignored
MainThread::INFO::2021-02-09
21:00:07,406::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state EngineStarting (score: 3400)
MainThread::INFO::2021-02-09
21:00:17,427::states::740::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Another host already took over..
*Thank you,*
*Ian Easter*
On Tue, Feb 9, 2021 at 6:31 PM Robert Tongue <phunyguy(a)neverserio.us> wrote:
> I've seen this happen with the VM disk itself becoming corrupt. If you
> try to read the contents of the file, and it gives you "Input/Output
> Error", then it is not good news. I've been testing oVirt recently, and
> these issues alone are preventing me from using it full time. I cannot
> help further, unfortunately, as I have no idea how to fix it. So best I
> can say is, hopefully someone else chimes in and helps both of us.
>
> -phunyguy
> ------------------------------
> *From:* ieaster(a)telvue.com <ieaster(a)telvue.com>
> *Sent:* Tuesday, February 9, 2021 6:25 PM
> *To:* users(a)ovirt.org <users(a)ovirt.org>
> *Subject:* [ovirt-users] Re: HostedEngine VM Paused after power failure
>
> Attempting to resume or start the VM doesn't yield any results.
>
> Here is the status of the VM:
> Host ID : 1
> Host timestamp : 115601
> Score : 3400
> Engine status : {"vm": "up", "health": "bad",
> "detail": "Paused", "reason": "bad vm status"}
> Hostname :
> Local maintenance : False
> stopped : False
> crc32 : 68efbf40
> conf_on_shared_storage : True
> local_conf_timestamp : 115601
> Status up-to-date : True
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=115601 (Tue Feb 9 18:25:48 2021)
> host-id=1
> score=3400
> vm_conf_refresh_time=115601 (Tue Feb 9 18:25:48 2021)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineStarting
> stopped=False
>
>
> Here is a chunk in agent.log that is a bit perplexing. I'm not too sure
> what it means that the VM doesn't exist. Storage is correctly mounted,
> everything looks fully operational. I can see the HostedEngine disk
> available to the Host.
>
> MainThread::INFO::2021-02-09
> 18:08:13,843::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineDown (score: 3400)
> MainThread::INFO::2021-02-09
> 18:08:23,864::states::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine down and local host has best score (3400), attempting to start
> engine VM
> MainThread::INFO::2021-02-09
> 18:08:23,894::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineDown-EngineStart)
> sent? ignored
> MainThread::INFO::2021-02-09
> 18:08:23,983::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineStart (score: 3400)
> MainThread::INFO::2021-02-09
> 18:08:24,000::hosted_engine::895::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
> Ensuring VDSM state is clear for engine VM
> MainThread::INFO::2021-02-09
> 18:08:24,005::hosted_engine::907::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
> Vdsm state for VM clean
> MainThread::INFO::2021-02-09
> 18:08:24,005::hosted_engine::853::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> Starting vm using `/usr/sbin/hosted-engine --vm-start`
> MainThread::INFO::2021-02-09
> 18:08:24,519::hosted_engine::862::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> stdout: VM in WaitForLaunch
>
> MainThread::INFO::2021-02-09
> 18:08:24,519::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> stderr: Command VM.getStats with args {'vmID':
> '74b3c839-c89c-4857-ada0-95715672348a'} failed:
> (code=1, message=Virtual machine does not exist: {'vmId':
> '74b3c839-c89c-4857-ada0-95715672348a'})
>
> MainThread::INFO::2021-02-09
> 18:08:24,519::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> Engine VM started on localhost
> MainThread::INFO::2021-02-09
> 18:08:24,552::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineStart-EngineStarting)
> sent? ignored
> MainThread::INFO::2021-02-09
> 18:08:24,565::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineStarting (score: 3400)
> MainThread::INFO::2021-02-09
> 18:08:34,585::states::736::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> VM is powering up..
> MainThread::INFO::2021-02-09
> 18:08:34,590::state_decorators::99::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Timeout set to Tue Feb 9 18:18:34 2021 while transitioning <class
> 'ovirt_hosted_engine_ha.agent.states.EngineStarting'> -> <class
> 'ovirt_hosted_engine_ha.agent.states.EngineStarting'>
3 years, 10 months
Single Node Hyperconverged - Failing Gluster Setup
by jhamiltonactually@gmail.com
oVirt newbie here, using v4.4.4.
I have been trying for days to get this installed on my HP DL380p G6. I have a 2-disk 170GB RAID 0 for the OS and a 6 x 330GB disk RAID 5 for Gluster. DNS is all set up (that took some working out), but I just can't fathom what's (not) happening here. Block size is returned as 512.
I've had some help on Reddit, where I was told that oVirt is seeing my single local disk as a multipath device, which it is not. I think I removed the flag, but it still fails here.
The Gluster install fails quite early on, though it carries on creating all the volumes (with default settings) and then gives me the 'Deployment Failed' message :( Here is where it fails:
Any help gratefully received!
TASK [fail] ********************************************************************
task path: /usr/share/cockpit/ovirt-dashboard/ansible/hc_wizard.yml:62
skipping: [ovirt-gluster.whichelo.com] => (item=[{'cmd': 'blockdev --getss /dev/sdb | grep -Po -q "512" && echo true || echo false\n', 'stdout': 'true', 'stderr': '', 'rc': 0, 'start': '2021-02-07 13:21:10.237701', 'end': '2021-02-07 13:21:10.243111', 'delta': '0:00:00.005410', 'changed': True, 'invocation': {'module_args': {'_raw_params': 'blockdev --getss /dev/sdb | grep -Po -q "512" && echo true || echo false\n', '_uses_shell': True, 'warn': True, 'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None, 'chdir': None, 'executable': None, 'creates': None, 'removes': None, 'stdin': None}}, 'stdout_lines': ['true'], 'stderr_lines': [], 'failed': False, 'item': {'vgname': 'gluster_vg_sdb', 'pvname': '/dev/sdb'}, 'ansible_loop_var': 'item'}, {'cmd': 'blockdev --getss /dev/sdb | grep -Po -q "4096" && echo true || echo false\n', 'stdout': 'false', 'stderr': '', 'rc': 0, 'start': '2021-02-07 13:21:14.760897', 'end': '2021-02-07 13:21:14.766395', 'delta': '0:00:00.005498', 'changed': True, 'invocation': {'module_args': {'_raw_params': 'blockdev --getss /dev/sdb | grep -Po -q "4096" && echo true || echo false\n', '_uses_shell': True, 'warn': True, 'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None, 'chdir': None, 'executable': None, 'creates': None, 'removes': None, 'stdin': None}}, 'stdout_lines': ['false'], 'stderr_lines': [], 'failed': False, 'item': {'vgname': 'gluster_vg_sdb', 'pvname': '/dev/sdb'}, 'ansible_loop_var': 'item'}]) => {"ansible_loop_var": "item", "changed": false, "item": [{"ansible_loop_var": "item", "changed": true, "cmd": "blockdev --getss /dev/sdb | grep -Po -q \"512\" && echo true || echo false\n", "delta": "0:00:00.005410", "end": "2021-02-07 13:21:10.243111", "failed": false, "invocation": {"module_args": {"_raw_params": "blockdev --getss /dev/sdb | grep -Po -q \"512\" && echo true || echo false\n", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "stdin_add_newline": true, "strip_empty_ends": true, "warn": true}}, "item": {"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"}, "rc": 0, "start": "2021-02-07 13:21:10.237701", "stderr": "", "stderr_lines": [], "stdout": "true", "stdout_lines": ["true"]}, {"ansible_loop_var": "item", "changed": true, "cmd": "blockdev --getss /dev/sdb | grep -Po -q \"4096\" && echo true || echo false\n", "delta": "0:00:00.005498", "end": "2021-02-07 13:21:14.766395", "failed": false, "invocation": {"module_args": {"_raw_params": "blockdev --getss /dev/sdb | grep -Po -q \"4096\" && echo true || echo false\n", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "stdin_add_newline": true, "strip_empty_ends": true, "warn": true}}, "item": {"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"}, "rc": 0, "start": "2021-02-07 13:21:14.760897", "stderr": "", "stderr_lines": [], "stdout": "false", "stdout_lines": ["false"]}], "skip_reason": "Conditional result was False"}
hc_wizard.yml excerpt:
- name: Check if block device is 4KN
  shell: >
    blockdev --getss {{ item.pvname }} | grep -Po -q "4096" && echo true || echo false
  register: is4KN
  with_items: "{{ gluster_infra_volume_groups }}"

- fail: ################ THIS IS LINE 62 #####################################
    msg: "Mix of 4K and 512 Block devices are not allowed"
  with_nested:
    - "{{ is512.results }}"
    - "{{ is4KN.results }}"
  when: item[0].stdout|bool and item[1].stdout|bool

# logical block size of 512 bytes. To disable the check set
# gluster_features_512B_check to false. DELETE the below task once
# OVirt limitation is fixed
- name: Check if disks have logical block size of 512B
  command: blockdev --getss {{ item.pvname }}
  register: logical_blk_size
  when: gluster_infra_volume_groups is defined and
        item.pvname is not search("/dev/mapper") and
        gluster_features_512B_check|default(false)
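To make the condition on the `fail` task at line 62 easier to follow, here is a minimal Python sketch of what `with_nested` plus the `when` clause evaluates. It is an illustration under assumptions, not the playbook's actual implementation: `to_bool` is a simplified stand-in for the Jinja `bool` filter, the function names are hypothetical, and the sample values are the `stdout` results visible in the output above ('true' for the 512 check, 'false' for the 4096 check).

```python
from itertools import product


def to_bool(value: str) -> bool:
    """Simplified stand-in for the Jinja 'bool' filter (assumption)."""
    return value.strip().lower() in {"true", "yes", "on", "1"}


def mixed_block_sizes(is512_results, is4kn_results):
    """Return device pairs for which the 'fail' condition would be true."""
    return [
        (a["item"]["pvname"], b["item"]["pvname"])
        # with_nested pairs every 512 result with every 4KN result
        for a, b in product(is512_results, is4kn_results)
        if to_bool(a["stdout"]) and to_bool(b["stdout"])
    ]


# stdout values as reported in the pasted task output for /dev/sdb
is512 = [{"item": {"pvname": "/dev/sdb"}, "stdout": "true"}]
is4kn = [{"item": {"pvname": "/dev/sdb"}, "stdout": "false"}]

offending = mixed_block_sizes(is512, is4kn)
print("fail" if offending else "skip")
```

With these values no pair satisfies the condition, which is consistent with the "skipping ... Conditional result was False" output. That suggests this particular task was skipped rather than failed, so the actual failure may be in a later task of the deployment log.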
Can anyone help?
3 years, 10 months
HostedEngine VM Paused after power failure
by Ian Easter
Hello Users,
We have an oVirt (4.4) environment that had 2 hosts in the cluster. We
suffered a power failure that caused the servers to be offline for
some time. Once power was restored, one of the hosts in the cluster had lost its OS
RAID and is not accessible.
The other server has the HostedEngine VM on it, but in a paused state. I
have tried to manually start the VM with the hosted-engine CLI tool, but it
indicates that HostedEngine is running on another host.
Is there any manual intervention I can perform here to start the
HostedEngine on the second, active host?
*Thank you,*
*Ian Easter*
*DevOps Engineer*
*TelVue Support*
https://www.telvue.com/support/
3 years, 10 months
Re: Newer kernel for oVirt Node NG
by Shantur Rathore
Thanks Matthew,
Are you sure about this?
I thought oVirt Node NG images are immutable
On Tue, Feb 9, 2021 at 3:42 PM Matthew.Stier(a)fujitsu.com <
Matthew.Stier(a)fujitsu.com> wrote:
> Elrepo.org
>
> *From:* Shantur Rathore <rathore4u(a)gmail.com>
> *Sent:* Tuesday, February 9, 2021 9:28 AM
> *To:* users <users(a)ovirt.org>
> *Subject:* [ovirt-users] Newer kernel for oVirt Node NG
>
> Hi oVirt Users,
>
> I am trying to test some vfio related stuff on oVirt Node NG 4.4.4 based
> host.
>
> What would be the easiest way to have a 5.x kernel on this node?
>
> I don't mind compiling if I need to.
>
> Cheers,
>
> Shantur
3 years, 10 months
Random Crash
by francesco@shellrent.com
Hi all,
I'm experiencing random reboots on several oVirt nodes (CentOS 7/8, on both oVirt 4.3 and 4.4). Sometimes it happens three times in a day, and the more hosts I add to my pool, the more often I notice it.
The logs are not helpful: it looks like an abrupt power-off, because there are no entries at all in messages, vdsm, or secure (I looked through all the logs) between the last "normal" entry (user logged in/out, normal vdsm log, etc.) and the first entry of the boot. kdump is enabled and /var/crash is empty. I used to run Xen on servers from the same provider and did not have these frequent reboots, which is why I'm not sure it is a hardware-related issue.
Any advice on what to enable to get more information about these crashes?
Thank you for your time,
Francesco
3 years, 10 months