Migrating from RHV to oVirt
by Andrés Jiménez
Hi all,
I have a 5-host cluster with over 300 VMs in production on Red Hat Virtualization 4.2.8, and I would like
to switch to a pure CentOS + oVirt 4.2 setup.
Management is based on an RHVM hosted engine on a dedicated iSCSI storage domain, and our hosts are a mix of RHVH nodes (3) and
CentOS hosts with oVirt 4.2 repositories (2).
I was wondering if I can just set up a new hosted engine on a CentOS host and restore a backup of our current ovirt-
engine (RHVM) into it. Would that work straight away without affecting my running VMs?
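For reference, the procedure I have in mind is roughly the following. This is only a sketch pieced together from documentation, and the --restore-from-file option is an assumption on my part, so please correct me if it does not exist in this version:

  # On the current RHVM engine VM: take a full engine backup
  engine-backup --mode=backup --scope=all --file=engine-backup.tar.gz --log=engine-backup.log

  # On the new CentOS host: deploy a fresh hosted engine and restore the backup into it
  hosted-engine --deploy --restore-from-file=engine-backup.tar.gz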
Cheers,
--
Andrés Jiménez Gómez
Linube
e-mail: soporte(a)linube.com
www.linube.com
Please Help, oVirt Node Hosted Engine Deployment Problems 4.3.2
by Todd Barton
I'm having to rebuild an environment that dates back to the early 3.x days. A lot has changed, and I'm attempting to use the oVirt Node based setup to build a new environment, but I can't get through the hosted engine deployment process via the cockpit (I've tried the command line as well). I've tried static DHCP addresses and static IPs, and confirmed I have resolvable host names. This is a test environment, so I can work through any issues in deployment.
When the cockpit is displaying the "waiting for host to come up" task, the cockpit gets disconnected. It appears to happen when the bridge network is set up. At that point the deployment is messed up and I can't return to the cockpit. I've tried this with one or two NICs/interfaces and every permutation of static and dynamic IP addresses. I've spent a week trying different setups, so I've got to be doing something stupid.
Attached is a screen capture of the resulting IP info after my latest failed attempt. I used two NICs, one for the Gluster and bridge network and the other for oVirt cockpit access. I can't access the cockpit on either IP address after the failure.
I've attempted this as both a single-host and a three-host hyper-converged environment... same issue in both.
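In case it's useful, this is how I've been resetting the host between attempts and where I've been looking for errors (commands from memory, so double-check them):

  # Reset a failed hosted-engine deployment attempt on the host
  ovirt-hosted-engine-cleanup

  # Logs I've been checking after each failure
  less /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-*.log
  less /var/log/vdsm/vdsm.log
  journalctl -u cockpit --since "1 hour ago"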
Can someone please help me or give me some thoughts on what is wrong?
Thanks!
Todd Barton
Stale hosted engine node information harmful?
by Andreas Elvers
I have 5 nodes (node01 to node05). Originally all those nodes were part of our default datacenter/cluster, with an NFS storage domain for VM disks, engine and ISO images. All five nodes were engine HA nodes.
Later, node01, node02 and node03 were re-installed to have engine HA removed. Then those nodes were removed from the default cluster. Eventually node01, node02 and node03 were completely re-installed to host our new Ceph/Gluster based datacenter. The engine is still running on the old default datacenter. Now I wish to move it over to the Ceph/Gluster datacenter.
When I look at the current output of "hosted-engine --vm-status" I see:
--== Host node01.infra.solutions.work (id: 1) status ==--
conf_on_shared_storage : True
Status up-to-date : False
Hostname : node01.infra.solutions.work
Host ID : 1
Engine status : unknown stale-data
Score : 0
stopped : True
Local maintenance : False
crc32 : e437bff4
local_conf_timestamp : 155627
Host timestamp : 155877
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=155877 (Fri Aug 3 13:09:19 2018)
host-id=1
score=0
vm_conf_refresh_time=155627 (Fri Aug 3 13:05:08 2018)
conf_on_shared_storage=True
maintenance=False
state=AgentStopped
stopped=True
--== Host node02.infra.solutions.work (id: 2) status ==--
conf_on_shared_storage : True
Status up-to-date : False
Hostname : node02.infra.solutions.work
Host ID : 2
Engine status : unknown stale-data
Score : 0
stopped : True
Local maintenance : False
crc32 : 11185b04
local_conf_timestamp : 154757
Host timestamp : 154856
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=154856 (Fri Aug 3 13:22:19 2018)
host-id=2
score=0
vm_conf_refresh_time=154757 (Fri Aug 3 13:20:40 2018)
conf_on_shared_storage=True
maintenance=False
state=AgentStopped
stopped=True
--== Host node03.infra.solutions.work (id: 3) status ==--
conf_on_shared_storage : True
Status up-to-date : False
Hostname : node03.infra.solutions.work
Host ID : 3
Engine status : unknown stale-data
Score : 0
stopped : False
Local maintenance : True
crc32 : 9595bed9
local_conf_timestamp : 14363
Host timestamp : 14362
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=14362 (Thu Aug 2 18:03:25 2018)
host-id=3
score=0
vm_conf_refresh_time=14363 (Thu Aug 2 18:03:25 2018)
conf_on_shared_storage=True
maintenance=True
state=LocalMaintenance
stopped=False
--== Host node04.infra.solutions.work (id: 4) status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : node04.infra.solutions.work
Host ID : 4
Engine status : {"health": "good", "vm": "up", "detail": "Up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 245854b1
local_conf_timestamp : 317498
Host timestamp : 317498
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=317498 (Thu May 2 09:44:47 2019)
host-id=4
score=3400
vm_conf_refresh_time=317498 (Thu May 2 09:44:47 2019)
conf_on_shared_storage=True
maintenance=False
state=EngineUp
stopped=False
--== Host node05.infra.solutions.work (id: 5) status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : node05.infra.solutions.work
Host ID : 5
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 0711afa0
local_conf_timestamp : 318044
Host timestamp : 318044
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=318044 (Thu May 2 09:44:45 2019)
host-id=5
score=3400
vm_conf_refresh_time=318044 (Thu May 2 09:44:45 2019)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
The old node01, node02 and node03 entries are still present in the output.
The new incarnations of node01, node02 and node03 will be the destination for deploying the new home of our engine, to which I wish to restore the backup. But I'm not sure if (and how) the old data should be removed first.
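From what I have read, cleaning the stale entries might be something like the following, run from one of the current HA hosts (node04 or node05), but I am only guessing here:

  # Guessing: drop the stale HA metadata for the old host IDs 1-3
  hosted-engine --clean-metadata --host-id=1 --force-clean
  hosted-engine --clean-metadata --host-id=2 --force-clean
  hosted-engine --clean-metadata --host-id=3 --force-clean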
oVirt 4.3.3 - servers cannot be reached
by Wood Peter
Hi,
I set up AD authentication, and from the command line everything looks good.
Unfortunately, in the Web UI users sometimes log in successfully, but most of
the time the login screen just hangs and after 2-3 minutes it displays
"Unable to log in because servers cannot be reached. Try again later."
In engine.log I see this:
2019-05-02 11:12:11,581-07 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-72) [] EVENT_ID: USER_VDC_LOGIN_FAILED(114), User
peter(a)ad.mycompany.com connecting from '10.12.29.48' failed to log in :
'Unable to log in because servers cannot be reached. Try again later.'.
2019-05-02 11:12:11,583-07 ERROR
[org.ovirt.engine.core.sso.servlets.InteractiveAuthServlet] (default
task-68) [] Cannot authenticate user 'peter(a)ad.mycompany.com' connecting
from '10.12.29.48': Unable to log in because servers cannot be reached. Try
again later.
Even when my login attempt in the Web UI is hanging, I can still
successfully run the login test from the shell:
ovirt-engine-extensions-tool aaa login-user --profile=ad.mycompany.com --user-name=peter
The above command never fails, which makes me wonder why I am getting the
"servers cannot be reached" error.
I assume the AD servers cannot be reached, yet from the command line it
works perfectly every time.
Any idea what could be the problem, or where to look for the error?
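In case it helps, these are the kinds of checks I've been running from the engine host (the "-authz" extension name below is just my guess at what the LDAP setup tool named it for our profile):

  # Confirm the engine host can resolve the AD domain controllers
  dig +short _ldap._tcp.ad.mycompany.com SRV

  # Look the user up through the aaa extension itself
  ovirt-engine-extensions-tool aaa search --extension-name=ad.mycompany.com-authz \
    --entity=principal --entity-name=peter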
Thank you,
-- Peter
Re: Stale hosted engine node information harmful?
by Strahil
A Red Hat Solution recommends restarting ovirt-ha-agent & ovirt-ha-broker.
I usually set global maintenance and wait 20-30 seconds. Then I stop ovirt-ha-agent.service & ovirt-ha-broker.service on all nodes. Once they are stopped everywhere, I start the two services on all nodes and wait 4-5 minutes.
Finally, verify the status from each host before removing global maintenance.
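Roughly, the sequence is (from memory, so adapt to your setup):

  hosted-engine --set-maintenance --mode=global    # on one host
  systemctl stop ovirt-ha-agent ovirt-ha-broker    # on every node
  systemctl start ovirt-ha-broker ovirt-ha-agent   # on every node, once all are stopped
  hosted-engine --vm-status                        # check on each host
  hosted-engine --set-maintenance --mode=none      # once everything looks sane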
Best Regards,
Strahil Nikolov

On May 2, 2019 12:30, Andreas Elvers <andreas.elvers+ovirtforum(a)solutions.work> wrote:
>
> [...]
Re: All hosts non-operational after upgrading from 4.2 to 4.3
by Strahil
Are you able to access your iSCSI storage via the /rhev/data-center/mnt... mount point?
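For example, something along these lines on one of the hosts (the UUID is the one from your log; the paths are only indicative):

  iscsiadm -m session -P 1    # confirm the iSCSI sessions are really logged in
  vgs | grep 07bb1bf8         # a block storage domain should show up as a VG named after its UUID
  ls -l /rhev/data-center/mnt/blockSD/07bb1bf8-3b3e-4dc0-bc43-375b09e06683/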
Best Regards,
Strahil Nikolov

On Apr 5, 2019 19:04, John Florian <jflorian(a)doubledog.org> wrote:
>
> I am in a severe pinch here. A while back I upgraded from 4.2.8 to 4.3.3 and had only one step remaining: setting the cluster compatibility level to 4.3 (from 4.2). When I tried this it gave the usual warning that each VM would have to be rebooted to complete, but then came the first surprise: it told me this could not be completed until each host was in maintenance mode. Quirky, I thought, but I stopped all VMs and put both hosts into maintenance mode. I then set the cluster to 4.3. Things didn't want to become active again, and I eventually noticed I was being told the DC needed to be 4.3 as well. I don't remember that from before, but oh well, that was easy.
>
> However, the DC and SD remain down. The hosts are non-operational. I've powered everything off and started fresh but still wind up in the same state. Hosts will look like they're active for a bit (green triangle) but then go non-operational after about a minute. It appears that my iSCSI sessions are active/logged in. The one glaring thing I see in the logs is this in vdsm.log:
>
> 2019-04-05 12:03:30,225-0400 ERROR (monitor/07bb1bf) [storage.Monitor] Setting up monitor for 07bb1bf8-3b3e-4dc0-bc43-375b09e06683 failed (monitor:329)
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 326, in _setupLoop
> self._setupMonitor()
> File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 348, in _setupMonitor
> self._produceDomain()
> File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 158, in wrapper
> value = meth(self, *a, **kw)
> File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 366, in _produceDomain
> self.domain = sdCache.produce(self.sdUUID)
> File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
> domain.getRealDomain()
> File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
> return self._cache._realProduce(self._sdUUID)
> File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
> domain = self._findDomain(sdUUID)
> File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
> return findMethod(sdUUID)
> File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist: (u'07bb1bf8-3b3e-4dc0-bc43-375b09e06683',)
>
> How do I proceed to get back operational?
Re: Fwd: [Gluster-users] Announcing Gluster release 5.5
by Strahil
Hi Darrell,
Will it fix the cluster brick sudden-death issue?
Best Regards,
Strahil Nikolov

On Mar 21, 2019 21:56, Darrell Budic <budic(a)onholyground.com> wrote:
>
> This release of Gluster 5.5 appears to fix the Gluster 3.12->5.3 migration problems many oVirt users have encountered.
>
> I’ll try and test it out this weekend and report back. If anyone else gets a chance to check it out, let us know how it goes!
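> For anyone testing, the sanity checks I'd run after upgrading each node are roughly the following (replace <volname> with your volume name):
>
>   gluster --version
>   gluster peer status
>   gluster volume status
>   gluster volume heal <volname> info    # make sure heals finish before moving to the next node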
>
> -Darrell
>
>> Begin forwarded message:
>>
>> From: Shyam Ranganathan <srangana(a)redhat.com>
>> Subject: [Gluster-users] Announcing Gluster release 5.5
>> Date: March 21, 2019 at 6:06:33 AM CDT
>> To: announce(a)gluster.org, gluster-users Discussion List <gluster-users(a)gluster.org>
>> Cc: GlusterFS Maintainers <maintainers(a)gluster.org>
>>
>> The Gluster community is pleased to announce the release of Gluster
>> 5.5 (packages available at [1]).
>>
>> Release notes for the release can be found at [2].
>>
>> Major changes, features and limitations addressed in this release:
>>
>> - Release 5.4 introduced an incompatible change that prevented rolling
>> upgrades, and hence was never announced to the lists. As a result we are
>> jumping a release version and going to 5.5 from 5.3, that does not have
>> the problem.
>>
>> Thanks,
>> Gluster community
>>
>> [1] Packages for 5.5:
>> https://download.gluster.org/pub/gluster/glusterfs/5/5.5/
>>
>> [2] Release notes for 5.5:
>> https://docs.gluster.org/en/latest/release-notes/5.5/