July 2019 - Users - oVirt List Archives

engine-setup failure on 4.1 -> 4.2
by Alex K 26 Jul '19

26 Jul '19

Hi all, I have a self hosted engine setup, with 3 servers. I have successfully upgraded several other installations, from 4.1 to 4.2. On one of them I am encountering and issue with the engine-setup. I get the following warning: Found the following problems in PostgreSQL configuration for the Engine database: It is required to be at least '8192' Please note the following required changes in postgresql.conf on 'localhost': 'work_mem' is currently '1024'. It is required to be at least '8192'. postgresql.conf is usually in /var/lib/pgsql/data, /var/opt/rh/rh-postgresql95/lib/pgsql/data, or somewhere under /etc/postgresql* . You have to restart PostgreSQL after making these changes. The database requires these configurations values to be changed. Setup can fix them for you or abort. Fix automatically? (Yes, No) [Yes]: Then, if I select Yes to proceed, I get: [WARNING] This release requires PostgreSQL server 9.5.14 but the engine database is currently hosted on PostgreSQL server 9.2.24. Then finally: [ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data. [ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup detected, but Global Maintenance is not set. [ INFO ] Stage: Clean up Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20190725124653-hvekp2.log [ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20190725125154-setup.conf' [ INFO ] Stage: Pre-termination [ INFO ] Stage: Termination [ ERROR ] Execution of setup failed I have put the cluster on global maintenance, though the engine thinks it is not. Are any steps that I may follow to avoid above? I am attaching also the last full setup log. Thank you! Alex

1 1

Clone function seem broken - regarding disk virtual size
by Vrgotic, Marko 26 Jul '19

26 Jul '19

Dear oVirt, All our CentOS templates are “baked” with 40GB disk (sparse) QCOW. When Creating new VM from template, using Thin storage allocation: * VM is created with 40GB disk (reading from oVirt GUI) and * from OS fdisk and/or parted report 40GB disk size. * The / partition is also matching expected size When Creating VM from template, using default Clone storage allocation: * VM is created with 40GB disk (reading from oVIrt GUI) * However: * from OS fdsik and/or parted report only actual used space, which in our case is 8GB. * The / partition is matching only actual disk size, not virtual one Why is Clone not taking into account Virtual size of the Template disk? Is it intended behavior that when image is cloned from template, only actual disk size information is copied, not the virtual? Please assist. Kindly awaiting your reply. oVirt 4.3.4 with SHE and NFS based storage and Local. — — — Met vriendelijke groet / Kind regards, Marko Vrgotic Sr. System Engineer @ System Administration m.vrgotic(a)activevideo.com<mailto:m.vrgotic@activevideo.com> tel. +31 (0)35 677 4131 ActiveVideo BV Mediacentrum 3741 Joop van den Endeplein 1 1217 WJ Hilversum

1 0

Re: major network changes
by Strahil 25 Jul '19

25 Jul '19

The CA can be downloaded via the web , or you can tell curl to just ignore the engine's cert via the '-k' flag. It will show you if the health page is working. Best Regards, Strahil NikolovOn Jul 24, 2019 19:39, carl langlois <crl.langlois(a)gmail.com> wrote: > > Strahil, not sure what to put for the --cacert. > > Yes Derek your are right at one point the port 8702 stop listening. > > tcp6 0 0 127.0.0.1:8702 :::* LISTEN 1607/ovirt-engine > > After some time the line above disappear. I am trying to figure why this port is being close after some time when the engine is running on the host on the 248.x network. On the 236.x network this port is kept alive all the time. > If you have any hint on why this port is closing do not hesitate because i am starting to be out of ideas. :-) > > > Thanks & Regards > > Carl > > > > > > > On Wed, Jul 24, 2019 at 11:11 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote: >> >> A healthy engine should report: >> [root@ovirt1 ~]# curl --cacert CA https://engine.localdomain/ovirt-engine/services/health;echo >> DB Up!Welcome to Health Status! >> >> Of course you can use the '-k' switch to verify the situation. >> >> Best Regards, >> Strahil Nikolov >> >> В сряда, 24 юли 2019 г., 17:43:59 ч. Гринуич+3, Derek Atkins <derek(a)ihtfp.com> написа: >> >> >> Hi, >> >> carl langlois <crl.langlois(a)gmail.com> writes: >> >> > If i try to access http://ovengine/ovirt-engine/services/health >> > i always get "Service Unavailable" in the browser and each time i it reload in >> > the browser i get in the error_log >> > >> > [proxy_ajp:error] [pid 1868] [client 10.8.1.76:63512] AH00896: failed to make >> > connection to backend: 127.0.0.1 >> > [Tue Jul 23 14:04:10.074023 2019] [proxy:error] [pid 1416] (111)Connection >> > refused: AH00957: AJP: attempt to connect to 127.0.0.1:8702 (127.0.0.1) failed >> >> Sounds like a service isn't running on port 8702. >> >> > Thanks & Regards >> > >> > Carl >> >> -derek >> >> -- >> Derek Atkins 617-623-3745 >> >> derek(a)ihtfp.com >> www.ihtfp.com >> Computer and Internet Security Consultant

3 3

Hosted Engine VM - CPU and Memory Sizing
by paul.christian.suba＠cevalogistics.com 25 Jul '19

25 Jul '19

Hi, Is there a recommended CPU and memory size for the Hosted Engine VM? We have what started as a 4 node physical cluster lab with 4 vms that has now grown to 44vms. The dashboard is slow to load information and the HE VM is consistently seen with 99% CPU with the breakdown below. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 12076 postgres 20 0 506516 276172 143772 R 99.3 1.7 206:17.15 postmaster 10337 postgres 20 0 484756 254156 144000 R 99.0 1.6 179:52.35 postmaster 38603 postgres 20 0 500068 267836 143992 S 70.1 1.6 528:04.13 postmaster 49217 postgres 20 0 468736 235484 143624 S 19.3 1.4 41:57.55 postmaster 5569 ovirt 20 0 6430912 2.3g 6368 S 1.3 14.5 894:20.75 java We used the default 4 CPU and 16 GB RAM. This is oVirt 4.3.1. I am curious to find out as well if the postgres processes using 99% CPU is normal?

2 1

ovirt-engine-appliance ova
by jingjie.jiang＠oracle.com 25 Jul '19

25 Jul '19

Hi, Can someone tell me how to generate ovirt-engine-appliance ova file in ovirt-engine-appliance-4.3-20190610.1.el7.x86_64.rpm? I tried to import ovirt-engine-appliance ova(ovirt-engine-appliance-4.3-20190610.1.el7.ova) from ovirt-engine, but I got error as following: Failed to load VM configuration from OVA file: /var/tmp/ovirt-engine-appliance-4.2-20190121.1.el7.ova I guess ovirt-engine-appliance-4.2-20190121.1.el7.ova has more than CentOS7.6. Thanks, Jingjie

5 7

Re: major network changes
by Strahil 25 Jul '19

25 Jul '19

According to another post in the mailing list, the Engine Hosts (that has ovirt-ha-agent/ovirt-ha-broker running) is checking http://{fqdn}/ovirt-engine/services/health As the IP is changed, I think you need to check the URL before and after thr mifgration. Best Regards, Strahil NikolovOn Jul 23, 2019 16:41, Derek Atkins <derek(a)ihtfp.com> wrote: > > Hi, > > If I understand it correctly, the HE Hosts try to ping (or SSH, or > otherwise reach) the Engine host. If it reaches it, then it passes the > liveness check. If it cannot reach it, then it fails. So to me this error > means that there is some configuration, somewhere, that is trying to reach > the engine on the old address (which fails when the engine has the new > address). > > I do not know where in the *host* configuration this data lives, so I > cannot suggest where you need to change it. > > Can 10.16.248.x reach 10.8.236.x and vice-versa? > > Maybe multi-home the engine on both networks for now until you figure it out? > > -derek > > On Tue, July 23, 2019 9:13 am, carl langlois wrote: > > Hi, > > > > We have managed to stabilize the DNS udpate in out network. Now the > > current > > situation is. > > I have 3 hosts that can run the engine (hosted-engine). > > They were all in the 10.8.236.x. Now i have moved one of them in the > > 10.16.248.x. > > > > If i boot the engine on one of the host that is in the 10.8.236.x the > > engine is going up with status "good". I can access the engine UI. I can > > see all my hosts even the one in the 10.16.248.x network. > > > > But if i boot the engine on the hosted-engine host that was switch to the > > 10.16.248.x the engine is booting. I can ssh to it but the status is > > always > > " fail for liveliness check". > > The main difference is that when i boot on the host that is in the > > 10.16.248.x network the engine gets a address in the 248.x network. > > > > On the engine i have this in the > > /var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log > > 019-07-23 > > 09:05:30|MFzehi|YYTDiS|jTq2w8|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can > > not sample data, oVirt Engine is not updating the statistics. Please check > > your oVirt Engine status.|9704 > > the engine.log seems okey. > > > > So i need to understand what this " liveliness check" do(or try to do) so > > i > > can investigate why the engine status is not becoming good. > > > > The initial deployment was done in the 10.8.236.x network. Maybe is as > > something to do with that. > > > > Thanks & Regards > > > > Carl > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Jul 18, 2019 at 8:53 AM Miguel Duarte de Mora Barroso < > > mdbarroso(a)redhat.com> wrote: > > > >> On Thu, Jul 18, 2019 at 2:50 PM Miguel Duarte de Mora Barroso > >> <mdbarroso(a)redhat.com> wrote: > >> > > >> > On Thu, Jul 18, 2019 at 1:57 PM carl langlois <crl.langlois(a)gmail.com> > >> wrote: > >> > > > >> > > Hi Miguel, > >> > > > >> > > I have managed to change the config for the ovn-controler. > >> > > with those commands > >> > > ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=ssl: > >> 10.16.248.74:6642 > >> > > ovs-vsctl set Open_vSwitch . external-ids:ovn-encap-ip=10.16.248.65 > >> > > and restating the services > >> > > >> > Yes, that's what the script is supposed to do, check [0]. > >> > > >> > Not sure why running vdsm-tool didn't work for you. > >> > > >> > > > >> > > But even with this i still have the "fail for liveliness check" when > >> starting the ovirt engine. But one thing i notice with our new network > >> is > >> that the reverse DNS does not work(IP -> hostname). The forward is > >> working > >> fine. I am trying to see with our IT why it is not working. > >> > > >> > Do you guys use OVN? If not, you could disable the provider, install > >> > the hosted-engine VM, then, if needed, re-add / re-activate it . > >> > >> I'm assuming it fails for the same reason you've stated initially - > >> i.e. ovn-controller is involved; if it is not, disregard this msg :) > >> > > >> > [0] - > >> https://github.com/oVirt/ovirt-provider-ovn/blob/master/driver/scripts/setu… > >> > > >> > > > >> > > Regards. > >> > > Carl > >>

4 6

Active-Passive DR: mutual for different storage domains possible?
by Gianluca Cecchi 25 Jul '19

25 Jul '19

Hello, suppose I want to implement Active-Passive DR between 2 sites. Sites are SiteA and SiteB. I have 2 storage domains SD1 and SD2, that I can configure so that SD1 is active in storage array installed in SiteA with replica in SiteB and SD2 the reverse. I have 4 hosts: host1 and host2 in SiteA and host3 and host4 in SiteB. I would like to optimize compute resources and workload so that: oVirt env OV1 with ovmgr1 in SiteA (external engine) is composed by host1 and host2 and configured with SD1 oVirt env OV2 with ovmgr2 in SiteB (external engine) is composed by host3 and host4 and configured with SD2. Can I use OV2 as DR for OV1 for VMs installed on SD1 and at the same time OV1 as DR for OV2 for VMs installed on SD2? Thanks in advance, Gianluca Cecchi

2 5

Failed Installation
by Marcello Gentile 25 Jul '19

25 Jul '19

I’m installing a self-hosted engine on CentOS7.6 using cockpit. Keep getting this error [ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Check host status] [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"} [ INFO ] TASK [ovirt.hosted_engine_setup : Fetch logs from the engine VM] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Set destination directory path] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Create destination directory] [ INFO ] changed: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Find the local appliance image] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Set local_vm_disk_path] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Give the vm time to flush dirty buffers] [ INFO ] ok: [localhost -> localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Copy engine logs] [ INFO ] changed: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks] [ INFO ] ok: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Remove local vm dir] [ INFO ] changed: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Remove temporary entry in /etc/hosts for the local VM] [ INFO ] changed: [localhost] [ INFO ] TASK [ovirt.hosted_engine_setup : Notify the user about a failure] [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n”} The logs don’t tell me much or at least the ones I’m checking Any help would be greatly appreciated Thanks

2 1

iSCSI-based storages won't login to the portals with all IPs on reboot
by nicolas＠devels.es 25 Jul '19

25 Jul '19

Hi, We're running oVirt 4.3.2. Currently, we have one storage backend (cabinet) with two controllers, each of them with 2 network interfaces (4 network interfaces in total). When we added the Storage Domain, we discovered the target for each of the 4 IPs and marked the LUN so it would be added with 4 different IPs. When we put a host on maintenance, all the paths are deactivated, and when we activate it back it discovers all the 4 paths for the storage backend. However, if we reboot the host, on activation it only activates one path. We can see this running 'multipath -ll'. We can manually activate the rest of the paths using this command for each of the IPs: # iscsiadm --mode discovery --type sendtargets --portal 10.X.X.X --login However, we wonder why oVirt wouldn't log into each of the IPs upon a boot. Is there something we're missing? Can this be fixed manually? Currently we're running a script on boot that will issue the command above for each of the IPs of the cabinet. Thanks for any help!

2 1

template permissions not inherited (4.3.4)
by Timmi 24 Jul '19

24 Jul '19

Hi oVirt List, I have just a quick question if I should open a ticket for this or if I'm doing something wrong. I created a new VM template with specific permissions in addition to the system wide permissions. If I create a new VM with the template I notices that only system permissions are copied to the permission of the new VM. Is this the intended behavior? I was somehow under the impression that the permission from the template should have been copied to the newly created VM. Tested with Version 4.3.4.3-1.el7 Best regards Christoph

4 6