oVirt 4.4 Upgrade issue?
by lee.hanel@gmail.com
Greetings,
I'm trying to perform an upgrade from 4.3 to 4.4 using the hosted engine option of https://github.com/ovirt/ovirt-ansible-collection.
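For context, the flow I'm running is essentially the documented backup-and-restore upgrade; roughly (a minimal sketch, file names are placeholders, and as far as I understand the hosted-engine CLI drives the same collection roles under the hood):

# on the 4.3 engine: take a full engine backup
engine-backup --mode=backup --file=engine-43-backup.tar.gz --log=engine-43-backup.log

# on the freshly installed 4.4 host: redeploy the hosted engine, restoring from that backup
hosted-engine --deploy --restore-from-file=engine-43-backup.tar.gz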
Unfortunately, when it goes to create the hosted engine disk images I get the following:
[Cannot move Virtual Disk. The operation is not supported for HOSTED_ENGINE_METADATA disks.]
This appears to be related to https://bugzilla.redhat.com/show_bug.cgi?id=1883817, BUT I've manually applied that patch. Shouldn't it be creating a NEW disk image instead of trying to move the existing one?
Any help would be appreciated.
Thanks,
Lee
4 years, 5 months
vdsm host with NFS storage takes more than 15 minutes to reboot or shut down, with error: failed to unmount /rhev/data-center/mnt/172.18.81.14:_home_nfs_data: Device or resource busy
by lifuqiong@sunyainfo.com
Hi everyone:
I ran into the following problem:
Description of problem:
When I run "reboot" or "shutdown -h 0" on the vdsm server, it takes more than 30 minutes to reboot or shut down. The console shows '[FAILED] Failed unmounting /rhev/data-center/mnt/172.18.81.41:_home_nfs_data'.
Other messages that may be useful (shown on the console):
[] watchdog: watchdog0: watchdog did not stop! []systemd-shutdown[5594]: Failed to unmount /rhev/data-center/mnt/172.18.81.14:_home_nfs_data: Device or resource busy
[]systemd-shutdown[1]: Failed to wait for process: Protocol error
[]systemd-shutdown[5595]: Failed to remount '/' read-only: Device or resource busy
[]systemd-shutdown[1]: Failed to wait for process: Protocol error
dracut Warning: Killing all remaining processes
dracut Warning: Killing all remaining processes
Version-Release number of selected component (if applicable):
Software Version: 4.2.8.2-1.el7
OS: CentOS Linux release 7.5.1804 (Core)
How reproducible:
100%
Steps to Reproduce:
1. My test environment is one oVirt engine (172.17.81.17) with 4 vdsm servers. Run "reboot" on one of the vdsm servers (172.17.99.105); that server takes more than 30 minutes to reboot.
ovirt-engine: 172.17.81.17/16
vdsm: 172.17.99.105/16
nfs server: 172.17.81.14/16
Actual results:
As above: the server takes more than 30 minutes to reboot.
Expected results:
The server should reboot in a short time.
What I have done:
I captured packets on the NFS server while the vdsm host was rebooting and found that the vdsm server keeps sending NFS packets to the NFS server in a loop. The attached log files were taken while I rebooted vdsm 172.17.99.105 at 2020-10-26 22:12:34. Some conclusions:
1. vdsm.log says: 2020-10-26 22:12:34,461+0800 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/172.18.81.14:_home_nfs_data/02c4c6ea-7ca9-40f1-a1d0-f1636bc1824e/dom_md/metadata
2. sanlock.log says: 2020-10-26 22:13:05 1454 [3301]: s1 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/172.18.81.14:_home_nfs_data/02c4c6ea-7ca9-40f1-a1d0-f1636bc1824e/dom_md/ids
3. There are no other messages relevant to this issue.
The logs are in the attachment. I would really appreciate any help. Thank you.
Yours sincerely,
Mark Lee
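As a workaround I am considering (not sure it is the supported way; putting the host into maintenance from the engine first is probably cleaner) stopping the storage-related services and unmounting the domain by hand before rebooting, and checking what still holds the mount:

# see which processes still hold the NFS mount
fuser -vm /rhev/data-center/mnt/172.18.81.14:_home_nfs_data

# stop the services that keep the storage domain busy, then unmount
systemctl stop vdsmd supervdsmd sanlock
umount /rhev/data-center/mnt/172.18.81.14:_home_nfs_data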
4 years, 5 months
update host to 4.4: manual ovn config necessary?
by Gianluca Cecchi
Hello,
I have updated an external engine from 4.3 to 4.4 and the OVN configuration
seems to have been retained:
[root@ovmgr1 ovirt-engine]# ovn-nbctl show
switch fc2fc4e8-ff71-4ec3-ba03-536a870cd483
(ovirt-ovn192-1e252228-ade7-47c8-acda-5209be358fcf)
switch 101d686d-7930-4176-b41a-b306d7c30a1a
(ovirt-ovn17217-4bb1d1a7-020d-4843-9ac7-dc4204b528e5)
port c1ec60a4-b4f3-4cb5-8985-43c086156e83
addresses: ["00:1a:4a:19:01:89 dynamic"]
port 174b69f8-00ed-4e25-96fc-7db11ea8a8b9
addresses: ["00:1a:4a:19:01:59 dynamic"]
port ccbd6188-78eb-437b-9df9-9929e272974b
addresses: ["00:1a:4a:19:01:88 dynamic"]
port 7e96ca70-c9e3-4efe-9ac5-e56c18476437
addresses: ["00:1a:4a:19:01:83 dynamic"]
port d2c2d9f1-8fc3-4f17-9ada-76fe3a168e65
addresses: ["00:1a:4a:19:01:5e dynamic"]
port 4d13d63e-5ff3-41c1-9b6b-feac343b514b
addresses: ["00:1a:4a:19:01:60 dynamic"]
port 66359e79-56c4-47e0-8196-2241706329f6
addresses: ["00:1a:4a:19:01:68 dynamic"]
switch 87012fa6-ffaa-4fb0-bd91-b3eb7c0a2fc1
(ovirt-ovn193-d43a7928-0dc8-49d3-8755-5d766dff821a)
port 2ae7391b-4297-4247-a315-99312f6392e6
addresses: ["00:1a:4a:19:01:51 dynamic"]
switch 9e77163a-c4e4-4abf-a554-0388e6b5e4ce
(ovirt-ovn172-4ac7ba24-aad5-432d-b1d2-672eaeea7d63)
[root@ovmgr1 ovirt-engine]#
Then I updated one of the 3 Linux hosts (not node ng): I removed it from the
web admin GUI, installed CentOS 8.2 from scratch, configured the repos and
then added it back as a new host (with the same name) in the engine. I was
able to connect to storage (iSCSI) and start VMs on the host in general.
Coming to the OVN part, it seems it has not been configured on the upgraded host.
Is that expected?
E.g. on the engine I only see chassis entries for the 2 hosts still in 4.3:
[root@ovmgr1 ovirt-engine]# ovn-sbctl show
Chassis "b8872ab5-4606-4a79-b77d-9d956a18d349"
hostname: "ov301.mydomain"
Encap geneve
ip: "10.4.192.34"
options: {csum="true"}
Port_Binding "174b69f8-00ed-4e25-96fc-7db11ea8a8b9"
Port_Binding "66359e79-56c4-47e0-8196-2241706329f6"
Chassis "ddecf0da-4708-4f93-958b-6af365a5eeca"
hostname: "ov300.mydomain"
Encap geneve
ip: "10.4.192.33"
options: {csum="true"}
Port_Binding "ccbd6188-78eb-437b-9df9-9929e272974b"
[root@ovmgr1 ovirt-engine]#
What should I do to add the upgraded 4.4 host? Can 4.3 and 4.4 hosts live
together for the OVN part?
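If manual configuration is needed, my guess (and it is only a guess, I am not sure this is the supported way) is that the host's ovn-controller has to be pointed at the engine's OVN central again, something like:

# on the upgraded host; <engine-ip> is the OVN central (the engine) and
# <host-tunnel-ip> is the host's local IP used for the geneve tunnels
vdsm-tool ovn-config <engine-ip> <host-tunnel-ip>

# then check from the engine whether the new chassis shows up
ovn-sbctl show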
Thanks,
Gianluca
4 years, 5 months
when configuring multipath, the logical network selection area is empty, hence not able to configure multipathing
by dhanaraj.ramesh@yahoo.com
Hi team,
I have a 4-node cluster where on each node I configured 2 dedicated 10 GbE NICs, each on its own subnet (NIC 1 = 10.10.10.0/24, NIC 2 = 10.10.20.0/24), and on the array side I configured 2 targets on the 10.10.10.0/24 subnet and another 2 targets on the 10.10.20.0/24 subnet. Without any errors I could log in to all four paths and mount the iSCSI LUNs on all 4 nodes. However, when I try to configure multipathing at the Data Center level I can see all the paths but not the logical networks; that selection area stays empty, although I configured logical network labels for both NICs with dedicated names (ISCSI1 and ISCSI2). The logical networks are visible and green at the host network level, with no errors; they are just L2 IP configs.
Am I missing something here? What else should I do to enable multipathing?
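For reference, on each node the paths themselves can be double-checked with the usual iscsiadm/multipath commands (nothing oVirt-specific):

# list the active iSCSI sessions (expecting four here, two per subnet)
iscsiadm -m session

# show the multipath topology of the LUNs
multipath -ll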
4 years, 5 months
problems installing standard Linux as nodes in 4.4
by Gianluca Cecchi
Hello,
due to the missing megaraid_sas kernel module in RHEL 8 and CentOS 8, I'm
deploying a new oVirt host using CentOS 8 with the elrepo kernel driver, and
not oVirt Node NG.
Based on the installation guide:
- install CentOS 8.2 ("Server" chosen as base environment)
- yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release44.rpm
- yum install cockpit-ovirt-dashboard
- yum update
- reboot
When I try to add the host from the engine web admin GUI, I get:
Host ov200 installation failed. Failed to execute Ansible host-deploy role:
Failed to execute call to start playbook. . Please check logs for more
details: /var/log/ovirt-engine/ansible-runner-service.log.
Inside the log file above on the engine:
2020-10-08 11:58:43,389 - runner_service.controllers.hosts - DEBUG -
Request received, content-type :None
2020-10-08 11:58:43,390 - runner_service.controllers.hosts - INFO -
127.0.0.1 - GET /api/v1/hosts/ov200
2020-10-08 11:58:43,398 - runner_service.controllers.playbooks - DEBUG -
Request received, content-type :application/json; charset=UTF-8
2020-10-08 11:58:43,398 - runner_service.controllers.playbooks - INFO -
127.0.0.1 - POST /api/v1/playbooks/ovirt-host-deploy.yml
Do I have to enable any module or pre-install anything else before adding
it?
BTW: on host
[root@ov200 ~]# rpm -q ansible
ansible-2.9.13-2.el8.noarch
[root@ov200 ~]#
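In case it helps, these are the logs I'm looking at on the engine side: the runner-service log from the error message, the per-host deploy logs, and engine.log (standard locations, as far as I know):

# on the engine
tail -n 200 /var/log/ovirt-engine/ansible-runner-service.log
ls -ltr /var/log/ovirt-engine/host-deploy/
tail -n 200 /var/log/ovirt-engine/engine.log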
Thanks,
Gianluca
4 years, 5 months
Left over hibernation disks that we can't delete
by james@deanimaconsulting.com
After putting the VMs in our environment into hibernation, one of the VMs that came out of hibernation still has the metadata and memory dump hanging around as 2 disks with status OK. They are not attached to any VM, but we are unable to delete them. The VM in question came out of hibernation without any issues.
If they are no longer required, how can we tidy them up?
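If it really is just a matter of removing them, I assume something like a plain REST delete on the disk IDs would do it (untested on our side; engine FQDN, credentials and disk ID below are placeholders):

# list the disks to confirm the IDs of the leftover metadata/memory-dump disks
curl -s -k -u admin@internal:PASSWORD https://engine.example.com/ovirt-engine/api/disks

# then delete a specific leftover disk by ID
curl -s -k -X DELETE -u admin@internal:PASSWORD https://engine.example.com/ovirt-engine/api/disks/DISK-ID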
Thanks, James
4 years, 5 months
administration portal won't complete load, looping
by Philip Brown
I have an odd situation:
When I go to
https://ovengine/ovirt-engine/webadmin/?locale=en_US
after authentication passes...
it shows the top banner of
oVirt OPEN VIRTUALIZATION MANAGER
and the
Loading ...
in the center, but never gets past that. Any suggestions on how I could investigate and fix this?
background:
I recently updated certs to be signed wildcard certs, but this broke consoles somehow.
So I restored the original certs, and restarted things... but got stuck with this.
Interestingly, the VM portal loads fine. But not the admin portal.
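If it helps narrow things down, my plan is to watch the engine-side UI logs while reloading the page, and restart the engine once more after the cert rollback (standard log locations, as far as I know):

# watch the UI and engine logs while reloading the webadmin page
tail -f /var/log/ovirt-engine/ui.log /var/log/ovirt-engine/engine.log

# restart the engine in case something from the cert change is still cached
systemctl restart ovirt-engine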
--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbrown(a)medata.com| www.medata.com
4 years, 5 months
reinstall 4.3 host from 4.4 engine
by Gianluca Cecchi
Hello,
supposing I have an already upgraded 4.4.2 engine and in my environment I
still have some 4.3.10 hosts based on CentOS Linux 7.x (or 4.3.10 ng
nodes), is it supported to reinstall such a host in case of any modification
in configuration?
Or will the engine try to install them in a "4.4" way?
Thanks in advance,
Gianluca
4 years, 5 months