[ovirt-users] [Call for feedback] did you install/update to 4.1.0?

Ralf Schenk rs at databay.de
Fri Feb 3 09:54:55 UTC 2017


Hello,

I upgraded my cluster of 8 hosts with gluster storage and
hosted-engine-ha. They were already on CentOS 7.3, running oVirt 4.0.6 and
gluster 3.7.x packages from storage-sig testing.

The storage is missing from the list under the Storage tab, but a bug
has already been filed for this. Increasing the Cluster and Storage
Compatibility level, and also "reset emulated machine", after upgrading
one host after another without having to shut down VMs works well. (VMs
get a sign that there will be changes after reboot.)

Important: you also have to issue a yum update on the host to upgrade
additional components, e.g. gluster to 3.8.x. I was frightened of
this step, but it worked well except for a configuration issue I was
responsible for in gluster.vol (I had "transport socket, rdma").
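
A minimal sketch for spotting such a transport setting before upgrading.
It assumes the setting is an "option transport-type" line as found in a
stock glusterd volfile; the post does not quote the exact line, nor does
it say which part of the value was wrong, so treating "rdma" as the
thing to flag is an assumption. The sample text and function names are
hypothetical.

```python
import re

# Assumption: the setting looks like "option transport-type socket,rdma".
TRANSPORT_OPTION = re.compile(r"^\s*option\s+transport-type\s+(?P<value>.+?)\s*$")

# Hypothetical sample, modelled on a stock glusterd volfile.
SAMPLE_VOLFILE = """\
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket, rdma
end-volume
"""


def transport_types(volfile_text):
    """Return one list of transports per 'option transport-type' line."""
    found = []
    for line in volfile_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = TRANSPORT_OPTION.match(line)
        if match:
            names = [t.strip() for t in match.group("value").split(",")]
            found.append([t for t in names if t])
    return found


def uses_rdma(volfile_text):
    """True if any transport-type line names rdma."""
    return any("rdma" in names for names in transport_types(volfile_text))


if __name__ == "__main__":
    print(transport_types(SAMPLE_VOLFILE), uses_rdma(SAMPLE_VOLFILE))
```

To check a real host, read the volfile from disk and pass its text to
uses_rdma(); whether rdma is actually a problem depends on the setup.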

Bugs/Quirks so far:

1. After restarting a single VM that used an RNG device I got an error
(it was in German) along the lines of "RNG Device not supported by
cluster". I had to disable the RNG device and save the settings, then
edit the settings again and re-enable the RNG device. After that the
machine boots.
I think there is a migration step missing from /dev/random to
/dev/urandom for existing VMs.

2. I'm missing any gluster-specific management features, as my gluster is
not manageable in any way from the GUI. I expected to see my gluster now
in the dashboard and be able to add volumes etc. What do I need to do to
"import" my existing gluster (only one volume so far) to make it
manageable?

3. Three of my hosts have the hosted engine deployed for HA. At first all
three were marked by a crown (the running one was gold and the others
were silver). After upgrading the third host, its hosted-engine HA is
not active anymore.

I can't get this host back with a working ovirt-ha-agent/broker. I already
rebooted and manually restarted the services, but it isn't able to get
the cluster state according to
"hosted-engine --vm-status". The other hosts report this host's status as
"unknown stale-data".

I already shut down all agents on all hosts and issued a "hosted-engine
--reinitialize-lockspace" but that didn't help.

The agent stops working after a timeout error, according to the log:

MainThread::INFO::2017-02-02
19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02
19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02
19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02
19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02
19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02
19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::ERROR::2017-02-02
19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
Failed to start monitoring domain
(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout
during domain acquisition
MainThread::WARNING::2017-02-02
19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Error while monitoring engine: Failed to start monitoring domain
(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout
during domain acquisition
MainThread::WARNING::2017-02-02
19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Unexpected error
Traceback (most recent call last):
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 443, in start_monitoring
    self._initialize_domain_monitor()
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 816, in _initialize_domain_monitor
    raise Exception(msg)
Exception: Failed to start monitoring domain
(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout
during domain acquisition
MainThread::ERROR::2017-02-02
19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Shutting down the agent because of 3 failures in a row!
MainThread::INFO::2017-02-02
19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02
19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor)
Failed to stop monitoring domain
(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage domain is member
of pool: u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
MainThread::INFO::2017-02-02
19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
Agent shutting down
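
The excerpt above can be summarised with a small script. This is only a
sketch: the record layout (thread::LEVEL::timestamp::module::line::
logger::(function) message) is inferred from the lines quoted in this
mail, not from the agent's logging configuration, and the function names
are made up for illustration. It first rejoins records that the mail
client wrapped, then counts levels and collects the ERROR messages.

```python
import re
from collections import Counter

# Layout inferred from the excerpt in this mail (an assumption).
RECORD_START = re.compile(r"^\w+::(?:DEBUG|INFO|WARNING|ERROR|CRITICAL)::")
RECORD = re.compile(
    r"^(?P<thread>\w+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<line>\d+)::"
    r"(?P<logger>[\w.]+)::\((?P<func>\w+)\) (?P<message>.*)$"
)

# Three records copied from the excerpt, wrapped as they appear in the mail.
SAMPLE = """\
MainThread::INFO::2017-02-02
19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
VDSM domain monitor status: PENDING
MainThread::ERROR::2017-02-02
19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
Failed to start monitoring domain
(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout
during domain acquisition
MainThread::ERROR::2017-02-02
19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Shutting down the agent because of 3 failures in a row!
"""


def join_wrapped(lines):
    """Rejoin records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation (wrapped text or traceback): append to last record.
            records[-1] += " " + line.strip()
    return records


def summarize(lines):
    """Return (Counter of levels, list of (timestamp, func, message) for ERRORs)."""
    levels = Counter()
    errors = []
    for record in join_wrapped(lines):
        match = RECORD.match(record)
        if match is None:
            continue  # not in the expected layout, skip it
        levels[match.group("level")] += 1
        if match.group("level") == "ERROR":
            errors.append(
                (match.group("timestamp"), match.group("func"), match.group("message"))
            )
    return levels, errors


if __name__ == "__main__":
    levels, errors = summarize(SAMPLE.splitlines())
    print(dict(levels))
    for timestamp, func, message in errors:
        print(timestamp, func, message)
```

On a real host the same function can be fed the lines of the agent log
file directly, since unwrapped records pass through join_wrapped()
unchanged.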

The gluster volume of the engine is mounted correctly on the host and
accessible. Files are also readable etc. No clue what to do.

4. Last but not least: oVirt is still using fuse to access VM disks on
Gluster. I know - scheduled for 4.1.1 - but it was already there in
3.5.x and was scheduled for every release since then. I had this feature
with OpenNebula already two years ago and performance is so much
better. So please GET IT IN!

Bye



On 02.02.2017 at 13:19, Sandro Bonazzola wrote:
> Hi,
> did you install/update to 4.1.0? Let us know your experience!
> We end up knowing only when things doesn't work well, let us know it
> works fine for you :-)

-- 


*Ralf Schenk*