ERROR: Installing oVirt Node & Hosted-Engine on one physical server
by ivanpashchuk@ipoft.com
oVirt node 4.4.4
Question 1: In general, can I install oVirt Node and the hosted engine on one physical server?
Question 2: Which domain names should I specify in the DNS server, and which IP addresses should I reserve?
Question 3: "Failed to connect to the host via ssh: ssh: connect to host ovirt-engine-01.local port 22: No route to host" - what is trying to establish the SSH connection, and from where?
Can someone reply to questions 1-3 above? Any help will be much appreciated.
Thanks for the help,
Ivan.
Logs:
ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"attempts": 180, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.565239", "end": "2021-03-03 13:36:54.775239", "rc": 0, "start": "2021-03-03 13:36:54.210000", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"host-id\": 1, \"host-ts\": 16377, \"score\": 3400, \"engine-status\": {\"vm\": \"up\", \"health\": \"bad\", \"detail\": \"Up\", \"reason\": \"failed liveliness check\"}, \"hostname\": \"mng-ovirt-engine-01.local\", \"maintenance\": false, \"stopped\": false, \"crc32\": \"2b3ee5d1\", \"conf_on_shared_storage\": true, \"local_conf_timestamp\": 16377, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=16377 (Wed Mar 3 13:36:48 2021)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=16377 (Wed Mar 3 13:36:48 2021)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=
False\\n\", \"live-data\": true}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"host-id\": 1, \"host-ts\": 16377, \"score\": 3400, \"engine-status\": {\"vm\": \"up\", \"health\": \"bad\", \"detail\": \"Up\", \"reason\": \"failed liveliness check\"}, \"hostname\": \"mng-ovirt-engine-01.local\", \"maintenance\": false, \"stopped\": false, \"crc32\": \"2b3ee5d1\", \"conf_on_shared_storage\": true, \"local_conf_timestamp\": 16377, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=16377 (Wed Mar 3 13:36:48 2021)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=16377 (Wed Mar 3 13:36:48 2021)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"live-data\": true}, \"global_maintenance\": false}"]}
ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn ovirt-engine-01.local resolves to 10.0.2.250. If you are using DHCP, check your DHCP reservation configuration"}
ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host ovirt-engine-01.local port 22: No route to host", "skip_reason": "Host localhost is unreachable", "unreachable": true}
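For context on Question 2, my rough understanding is that the deployment needs a forward and a reverse DNS record for both the host and the engine VM, something like the following (placeholder names and addresses, reusing the 10.0.2.0/24 network from the log above):
; forward records (placeholders)
ovirt-host-01.local.     IN A    10.0.2.10    ; the physical oVirt Node
ovirt-engine-01.local.   IN A    10.0.2.250   ; the hosted-engine VM (static IP or DHCP reservation)
; reverse records (zone 2.0.10.in-addr.arpa, placeholders)
10   IN PTR ovirt-host-01.local.
250  IN PTR ovirt-engine-01.local.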
3 years, 9 months
Strange Glitches with VM and Network (oVirt 4.4.4.7-1.el8)
by Andrei Verovski
Hi,
I’m running oVirt 4.4.4.7-1.el8 and need to connect one of the VMs straight to the ISP link via Ethernet cable.
oVirt already has 2 networks (the local ovirtmgmt plus a DMZ).
I created a new logical network and assigned it to an available physical interface of the HP ProLiant, connected via cable to the ISP switch with a public IP.
Options of this network: (VM Network = on, Port Isolation = off, NIC Type = VirtIO, the rest are defaults).
VM is Debian 10.
The link works, but with strange artefacts: if the VM is left idle for a while, it can't be reached or pinged from outside until I initiate pings from the VM itself.
I have only 2 IPs from this ISP, so I'm sure there are no IP address conflicts.
The other port and public IP go to our VyOS router, which handles the internal and DMZ zones.
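As a stop-gap I can keep the link alive with a periodic ping from inside the VM, which hides the symptom (a minimal sketch; 203.0.113.1 is a placeholder for the real ISP gateway):
# /etc/cron.d/isp-keepalive on the Debian 10 VM
* * * * * root /bin/ping -c 3 -W 2 203.0.113.1 >/dev/null 2>&1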
How can I fix this problem?
Thanks in advance.
Andrei
3 years, 9 months
which functionality replaces vdsm-hook-vmdisk in ovirt 4.4 / CentOS 8 (and is it really deprecated?)
by Andrejs Baulins
Hi there!
I'm a newbie, so please don't judge me too harshly :)
I have a few questions about strategies for passing a hypervisor's local disks through to VMs.
Preamble: we found the vmdisk hook to be the most convenient "gateway to local storage" in shared clusters.
Let's assume you have:
A. A large 2 or 4 TB SSD that you need to split across different uses: some disk space for SQL / etcd / MongoDB, etc. It must be fast, so no network in between. You have only one physical disk, so you can't pass the whole thing through to a single VM. So you carve it up with LVM and pass the logical volumes through using vmdisk; this is how we did it on some setups (see the LVM sketch after point B).
B. A 2 or 4 TB HDD for Gluster bricks.
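To make point A concrete, the carving itself is plain LVM (a rough sketch; device, VG, and LV names and sizes are placeholders):
pvcreate /dev/sdb                        # the large local SSD
vgcreate vg_local /dev/sdb
lvcreate -L 500G -n lv_sql  vg_local     # to be handed to the SQL VM
lvcreate -L 200G -n lv_etcd vg_local     # to be handed to the etcd VM
# each /dev/vg_local/lv_* is then passed to a single VM via the vmdisk custom property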
In the end, you have a few VMs that "die" together with the hypervisor, while most VMs can still live-migrate. Combined with L7 application-level failover, this gives you a cluster for non-critical services that is not so hard to put into maintenance.
Yes, it does not run at native device speed: a simple dd operation on the unpartitioned block device is 3-4 times slower in the VM than on the hypervisor. But the ability to mix workloads (migratable VMs with non-migratable ones) within a small cluster of hypervisors is more important in some setups.
So, here are the questions...
1. As far as I understand, it should be possible to replace the vmdisk hook with the cmdline hook, but I haven't found a way to get it working. I always end up with "Could not open '/dev/***': No such file or directory.", which I could not avoid. Maybe someone can point me to an article or another thread with clues on how to do that?
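For reference, what I tried follows my understanding of the qemucmdline hook, roughly as below (treat this as a sketch, not a verified recipe; the property syntax and the -drive arguments are exactly where I may be going wrong):
# on the engine: allow the custom property (note: this overwrites any existing UserDefinedVMProperties)
engine-config -s "UserDefinedVMProperties=qemu_cmdline=^.*$"
systemctl restart ovirt-engine
# on the host: install the hook
yum install vdsm-hook-qemucmdline
# then set the VM's custom property to something like:
# qemu_cmdline=["-drive", "file=/dev/vg_local/lv_sql,if=virtio,format=raw,cache=none"]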
2. What's the story behind the vmdisk hook? As far as I understand from https://github.com/oVirt/vdsm/blob/master/vdsm.spec.in it is deprecated. Why? Is it replaced by some other functionality? Why was VDSM's hook repo removed from GitHub (to avoid confusion?)?
3. Is it possible, with any hook or some other instrument, to get close to the hypervisor's local-disk performance? We are ready for inconveniences like custom setup handling, patches, and so on.
Thanks in advance!
It worked in 4.3.10 / CentOS 7, but the
3 years, 9 months
HCI upgrade path from single to three-nodes setup
by s.buelow@messaufficio.com
Hello
I'm planning an HCI configuration on a single server.
My wish list also includes the option to later upgrade the cluster from this single node to a fully fledged three-node setup.
Is there any supported and tested migration path available?
Thanks in advance
Best regards
3 years, 9 months
oVirt 4.3.9 readonly filesystem on multi path link failure
by alexio79@gmail.com
Hello everyone, I have a cluster with 4 oVirt Nodes and a self-hosted engine, with Fibre Channel storage for the storage volumes and for boot-from-SAN of the oVirt Nodes themselves (the nodes don't have physical disks but boot from the SAN). We performed a software update of the SAN storage, and one of the multipath links went down for a few seconds before coming up again. There were no issues with the VMs, but oVirt Node marked a couple of mount points as read-only:
/dev/mapper/onn-var on /var type ext4 (ro,relatime,seclabel,discard,stripe=16,data=ordered)
/dev/mapper/onn-var_log on /var/log type ext4 (ro,relatime,seclabel,discard,stripe=16,data=ordered)
Every other mount point was r/w, but not /var... We had to reboot every single node to restore the r/w mounts. These are the last log lines written before the filesystems went read-only:
Mar 4 11:15:59 ovirt-node1 kernel: sd 1:0:0:4: [sds] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Mar 4 11:15:59 ovirt-node1 kernel: sd 1:0:0:4: [sds] CDB: Read(16) 88 00 00 00 00 01 64 5f 9a 18 00 00 00 20 00 00
Mar 4 11:15:59 ovirt-node1 kernel: blk_update_request: I/O error, dev sds, sector 5978954264
Mar 4 11:15:59 ovirt-node1 kernel: device-mapper: multipath: Failing path 65:32.
Mar 4 11:15:59 ovirt-node1 multipathd: sds: mark as failed
Mar 4 11:15:59 ovirt-node1 multipathd: 36001738c7c8069bb00000000000135eb: remaining active paths: 3
Mar 4 11:16:11 ovirt-node1 sanlock[10535]: 2021-03-04 11:16:11 28402637 [13668]: s4 delta_renew read timeout 10 sec offset 0 /dev/58b54b5f-993b-4710-b107-7744018d22b
And this is the multipath -ll output:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX dm-2 IBM ,XXXXXX
size=237G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 0:0:0:2 sdc 8:32 active ready running
|- 1:0:0:2 sdg 8:96 active ready running
|- 1:0:1:2 sdn 8:208 active ready running
`- 0:0:1:2 sdr 65:16 active ready running
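One mitigation I am considering (a sketch only, not tested, and I am not sure it even covers a single-path flap like this one) is a multipath override that queues I/O on the boot LUN for a while instead of failing it; since VDSM manages /etc/multipath.conf, the override would go into a drop-in:
# /etc/multipath/conf.d/boot-lun.conf (WWID masked as above)
multipaths {
    multipath {
        wwid XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        no_path_retry 16    # queue I/O for up to 16 polling intervals before failing
    }
}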
Any idea what happened and how to prevent it? Thank you in advance for your help.
3 years, 9 months
Public IP routing question
by David White
If I have a private network (10.1.0.0/24) that is being used by the cluster for inter-host communication & replication, how do I get a block of public IP addresses routed to the virtual cluster?
For example, let's say I have a public /28, and let's use 1.1.1.0/28 for example purposes.
I'll assign 1.1.1.1 to the router.
How can I then route 1.1.1.2 - 1.1.1.14 down to the virtualized oVirt cluster?
Do I need to assign a public IP address to a 2nd physical NIC on each host, and put that network onto a totally different physical switch?
Or should I instead set up default routes on the 10.1.0.0/24 network?
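For context, if I go the bridged-network route, what I picture on the VM side is simply a static public address pointing at the router (a sketch using my example /28; the connection name is hypothetical):
# inside a VM attached to the public oVirt logical network
nmcli con mod "public-nic" ipv4.method manual ipv4.addresses 1.1.1.2/28 ipv4.gateway 1.1.1.1
nmcli con up "public-nic"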
I also wanted to follow up on my question below to see if anyone had any thoughts on how things would function when a portion of the network is lost.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, March 4, 2021 4:53 AM, David White <dmwhite823(a)protonmail.com> wrote:
> I tested oVirt (4.3? I can't remember) last fall on a single host (hyperconverged).
>
> Now, I'm getting ready to deploy to a 3 physical node (possibly 4) hyperconverged cluster, and I guess I'll go ahead and go with 4.4.
> Although Red Hat's recent shift of CentOS 8 to the Stream model, as well as the announcement that RHV is going away makes me nervous. I really don't see any other virtualization software doing quite the same stuff as oVirt at the moment.
>
> One of my questions is around the back end out-of-band network for data replication.
> What happens if all 3 servers are healthy and the normal network is fine for serving traffic to the VM consumers, but the switching network for data replication goes down? Is it possible to configure oVirt to "fail over" to the front-end network?
> I'm also wondering if it's possible to do away with a switch altogether, and just link the physical hosts together directly (with crossover cables) for the data replication.
>
> I'm also wondering what would happen in the following scenario:
>
> - All 3 servers are healthy
> - The out-of-band data replication network is healthy
> - 1 or 2 of the servers suddenly lost network connectivity on the front-end network
>
> What then? Would everything just keep working, and network traffic be forced to go out the healthy interface(s) on the remaining hosts?
>
3 years, 9 months
deploy oVirt 4.4 errors
by grig.4n@gmail.com
grep ERROR /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20210227232700-gil2fj.log
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9786
2021-02-28 00:00:12,059+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 ovirtsdk4.ConnectionError: Error while sending HTTP request: (7, 'Failed to connect to ovirt4-adm.domain.local port 443: No route to host')
2021-02-28 00:00:12,160+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"attempts": 50, "changed": false, "msg": "Error while sending HTTP request: (7, 'Failed to connect to ovirt4-adm.domain.local port 443: No route to host')"}
2021-02-28 00:00:58,055+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
2021-02-28 00:00:58,759+0600 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Closing up': Failed executing ansible-playbook
2021-02-28 00:01:05,984+0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host ovirt4-adm.domain.local. port 22: No route to host", "skip_reason": "Host localhost is unreachable", "unreachable": true}
2021-02-28 00:01:22,146+0600 ERROR otopi.plugins.gr_he_common.core.misc misc._terminate:167 Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
3 years, 9 months
HCI Fresh Deploy - Cannot upgrade 4.4. to 4.5 or add more hosts
by penguin pages
I reinstalled the OS on all nodes with CentOS 8 Stream.
I installed Cockpit with the oVirt deployment plugin and ran through the HCI deploy wizard with Gluster; this deployed the 4.4 version of the engine, which then said version 4.5 was available.
I try to add hosts and get the error "Error while executing action: Server thor.penguinpages.local is already part of another cluster." (or the same for the other nodes).
I can SSH into the oVirt engine host and from there SSH to the three nodes of the HCI cluster without problems. I also added SSH keys for passwordless login, but no change.
I changed the cluster and then the data center to 4.5, then attempted to run the upgrade, but got the same error.
It seems like a chicken-and-egg issue: I can't add nodes, and I can't upgrade 4.4 to 4.5 because I have no nodes to put into maintenance mode, and the engine shows no updates are needed:
[root@ovirte01 ~]# yum update
Last metadata expiration check: 1:39:16 ago on Mon 08 Mar 2021 08:36:40 AM EST.
Dependencies resolved.
Nothing to do.
Complete!
Suggestions?
3 years, 9 months
[security-updates] oVirt node
by Thiago Linhares
Hello there,
I wonder what the right approach is to get security updates for oVirt Nodes (installed using the oVirt Node ISO image)?
E.g.:
The 'sudo' package has a known vulnerability up to version sudo-1.8.23-9.el7.x86_64.
When I try to update this package on an oVirt Node, it will not update.
Checking versionlock, I confirmed it is listed there:
# grep sudo /etc/yum/pluginconf.d/versionlock.list
0:libsss_sudo-1.16.4-37.el7_8.3.x86_64
0:sudo-1.8.23-9.el7.x86_64
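My understanding (which I would like confirmed) is that oVirt Node is an image-based distribution, so individual RPMs are version-locked on purpose and security fixes arrive through a new node image rather than per-package updates, roughly:
# on the node (or via the engine UI host upgrade flow)
yum update ovirt-node-ng-image-update
reboot    # the node boots into the new image layer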
Regards,
3 years, 9 months