[ovirt-users] [Call for feedback] did you install/update to 4.1.0?

Ramesh Nachimuthu rnachimu at redhat.com
Fri Feb 3 10:18:28 UTC 2017





----- Original Message -----
> From: "Ralf Schenk" <rs at databay.de>
> To: users at ovirt.org
> Sent: Friday, February 3, 2017 3:24:55 PM
> Subject: Re: [ovirt-users] [Call for feedback] did you install/update to 4.1.0?
> 
> 
> 
> Hello,
> 
> I upgraded my cluster of 8 hosts with gluster storage and hosted-engine HA.
> They were already on CentOS 7.3 and using oVirt 4.0.6 and Gluster 3.7.x
> packages from storage-sig testing.
> 
> 
> The storage is missing under the Storage tab, but a bug has already been filed
> for that. Increasing the cluster and storage compatibility level and also doing
> "reset emulated machine" after upgrading one host after another, without the
> need to shut down VMs, works well. (VMs get a flag indicating the changes will
> take effect after the next reboot.)
> 
> Important: you also have to issue a yum update on the host to upgrade additional
> components, e.g. Gluster to 3.8.x. I was wary of this step, but it worked well
> except for a configuration issue in gluster.vol that I was responsible for
> (I had "transport socket, rdma").
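> 
> A rough sketch of that step, for anyone in the same situation (assuming the
> Storage SIG repository is already enabled and that the stock
> /etc/glusterfs/glusterd.vol is the file in question; the transport line is what
> I ended up with, not an official recommendation):
> 
>     # upgrade the remaining host packages, including glusterfs-* to 3.8.x
>     yum update
>     # /etc/glusterfs/glusterd.vol: without RDMA hardware/packages installed,
>     # the transport option should list only socket, i.e.
>     #     option transport-type socket
>     systemctl restart glusterd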
> 
> 
> Bugs/Quirks so far:
> 
> 
> 1. After restarting a single VM that used an RNG device I got an error (it was
> in German), something like "RNG device not supported by cluster". I had to
> disable the RNG device and save the settings, then open the settings again and
> re-enable the RNG device. Then the machine boots up.
> I think there is a migration step missing from /dev/random to /dev/urandom
> for existing VMs.
> 
> 2. I'm missing any Gluster-specific management features, as my gluster is not
> manageable in any way from the GUI. I expected to see my gluster in the
> dashboard now and to be able to add volumes etc. What do I need to do to
> "import" my existing gluster (only one volume so far) so it becomes manageable?
> 
> 

If it is a hyperconverged cluster, then all your hosts are already managed by oVirt. So you just need to enable 'Gluster Service' in the cluster; the gluster volume will be imported automatically when you enable the gluster service.
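
As a quick sanity check (plain gluster CLI on any of the hosts, nothing oVirt-specific), you can verify what the engine is expected to pick up:

    gluster peer status    # all cluster hosts should show up as connected peers
    gluster volume info    # the volume(s) that should appear in the UI after the import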

If it is not a hyperconverged cluster, then you have to create a new cluster and enable only 'Gluster Service'. Then you can import or add the gluster hosts to this Gluster cluster. 

You may also need to define a gluster network if you are using a separate network for gluster data traffic. More at http://www.ovirt.org/develop/release-management/features/network/select-network-for-gluster/



> 3. Three of my hosts have the hosted engine deployed for HA. At first all three
> were marked by a crown (the one running the engine was gold, the others silver).
> After upgrading the third host, its deployed hosted-engine HA is not active
> anymore.
> 
> I can't get this host back with a working ovirt-ha-agent/broker. I already
> rebooted and manually restarted the services, but it isn't able to get the
> cluster state according to "hosted-engine --vm-status". The other hosts report
> this host's status as "unknown stale-data".
> 
> I already shut down all agents on all hosts and issued a "hosted-engine
> --reinitialize-lockspace" but that didn't help.
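> 
> For completeness, the steps above correspond roughly to the following (global
> maintenance added as a precaution on my side, I'm not sure it is strictly
> required; the commands themselves are the standard hosted-engine/systemctl ones):
> 
>     hosted-engine --set-maintenance --mode=global   # on one HA host
>     systemctl stop ovirt-ha-agent ovirt-ha-broker   # on every HA host
>     hosted-engine --reinitialize-lockspace          # on one HA host
>     systemctl start ovirt-ha-broker ovirt-ha-agent  # on every HA host
>     hosted-engine --vm-status                       # check whether stale-data clears
>     hosted-engine --set-maintenance --mode=none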
> 
> 
> The agents stop working after a timeout error, according to the log:
> 
> MainThread::INFO::2017-02-02
> 19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02
> 19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02
> 19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02
> 19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02
> 19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02
> 19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::ERROR::2017-02-02
> 19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
> Failed to start monitoring domain
> (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during
> domain acquisition
> MainThread::WARNING::2017-02-02
> 19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Error while monitoring engine: Failed to start monitoring domain
> (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during
> domain acquisition
> MainThread::WARNING::2017-02-02
> 19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Unexpected error
> Traceback (most recent call last):
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 443, in start_monitoring
> self._initialize_domain_monitor()
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 816, in _initialize_domain_monitor
> raise Exception(msg)
> Exception: Failed to start monitoring domain
> (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during
> domain acquisition
> MainThread::ERROR::2017-02-02
> 19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Shutting down the agent because of 3 failures in a row!
> MainThread::INFO::2017-02-02
> 19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
> VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02
> 19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor)
> Failed to stop monitoring domain
> (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage domain is member of
> pool: u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
> MainThread::INFO::2017-02-02
> 19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
> Agent shutting down
> 
> 
> 
> The gluster volume of the engine is mounted correctly on the host and
> accessible. Files are also readable etc. No clue what to do.
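> 
> Generic things that seem worth checking when the domain monitor hangs in
> PENDING (my own checklist, not from any documentation):
> 
>     sanlock client status                            # is a lockspace held for the sd_uuid above?
>     systemctl status vdsmd                           # VDSM health on the affected host
>     grep 7c8deaa8 /var/log/vdsm/vdsm.log | tail -50  # domain monitor errors on the VDSM side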
> 
> 
> 4. Last but not least: oVirt is still using FUSE to access VM disks on
> Gluster. I know - it's scheduled for 4.1.1 - but it was already planned back in
> 3.5.x and has been scheduled for every release since then. I had this feature
> with OpenNebula two years ago already, and performance is so much better... So
> please GET IT IN!
> 
> 

This is blocked because of various changes required in the libvirt/QEMU layers. But I hope it will be fixed now :-)

Regards,
Ramesh

> Bye
> 
> 
> 
> 
> On 02.02.2017 at 13:19, Sandro Bonazzola wrote:
> 
> 
> Hi,
> did you install/update to 4.1.0? Let us know your experience!
> We end up knowing only when things don't work well, so let us know if it works
> fine for you :-)
> 
> --
> 
> 
> 
> Ralf Schenk
> fon +49 (0) 24 05 / 40 83 70
> fax +49 (0) 24 05 / 40 83 759
> mail rs at databay.de
> Databay AG
> Jens-Otto-Krag-Straße 11
> D-52146 Würselen
> www.databay.de
> Registered office/District court: Aachen • HRB: 8437 • VAT ID: DE 210844202
> Management board: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
> Philipp Hermanns
> Chairman of the supervisory board: Wilhelm Dohmen
> 
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
> 

