
Hello,

I upgraded my cluster of 8 hosts with gluster storage and hosted-engine-ha. They were already on CentOS 7.3, using oVirt 4.0.6 and gluster 3.7.x packages from storage-sig testing.

My storage is missing from the Storage tab, but a bug has already been filed for this. Raising the cluster and storage compatibility levels, and also "reset emulated machine", worked well after upgrading one host after another, without needing to shut down VMs. (The VMs get a marker indicating that changes will apply after a reboot.)

Important: you also have to issue a yum update on the host to upgrade additional components, such as gluster to 3.8.x. I was frightened of this step, but it worked well apart from a configuration issue in gluster.vol that I was responsible for (I had "transport socket, rdma").

Bugs/Quirks so far:

1. After restarting a single VM that used an RNG device I got an error (it was in German) along the lines of "RNG Device not supported by cluster". I had to disable the RNG device and save the settings, then open the settings again and re-enable the RNG device. After that the machine boots. I think a migration step from /dev/random to /dev/urandom is missing for existing VMs.

2. I'm missing any gluster-specific management features, as my gluster is not manageable in any way from the GUI. I expected to see my gluster in the dashboard now and to be able to add volumes etc. What do I need to do to "import" my existing gluster (only one volume so far) so that it becomes manageable?

3. Three of my hosts have the hosted engine deployed for HA. At first all three were marked with a crown (the one running the engine was gold, the others were silver). After upgrading the third host with hosted engine deployed, its hosted-engine HA is not active anymore.

I can't get this host back with a working ovirt-ha-agent/broker. I already rebooted and manually restarted the services, but it isn't able to get the cluster state according to "hosted-engine --vm-status". The other hosts report this host's status as "unknown stale-data".

I already shut down all agents on all hosts and issued a "hosted-engine --reinitialize-lockspace", but that didn't help.

The agent stops working after a timeout error, according to the log:

MainThread::INFO::2017-02-02 19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02 19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02 19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02 19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02 19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02 19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
    self._initialize_domain_monitor()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 816, in _initialize_domain_monitor
    raise Exception(msg)
Exception: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
MainThread::INFO::2017-02-02 19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
MainThread::INFO::2017-02-02 19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Failed to stop monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage domain is member of pool: u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
MainThread::INFO::2017-02-02 19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down

The gluster volume of the engine is mounted correctly on the host and is accessible. Files are also readable etc. I have no clue what to do.

4. Last but not least: oVirt is still using FUSE to access VM disks on Gluster. I know it is scheduled for 4.1.1, but it was already there in 3.5.x and has been scheduled for every release since then. I had this feature with OpenNebula two years ago and performance is so much better. So please GET IT IN!

Bye

On 02.02.2017 at 13:19, Sandro Bonazzola wrote:
> Hi,
> did you install/update to 4.1.0? Let us know your experience!
> We end up knowing only when things don't work well; let us know if it works fine for you :-)
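
For anyone triaging an agent log like the excerpt in item 3, it can help to count how many times the domain monitor was polled as PENDING before the ERROR fired. What follows is only a minimal sketch: the "::"-separated field layout is inferred from the excerpt itself, not from any documented ovirt-hosted-engine-ha format, and LINE_RE and summarize are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Field layout inferred from the excerpt (an assumption, not a documented format):
# thread::LEVEL::timestamp::module::line::logger::(function) message
LINE_RE = re.compile(
    r"^(?P<thread>\w+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\) (?P<msg>.*)$"
)


def summarize(lines):
    """Count log levels and PENDING polls, and collect ERROR entries."""
    levels = Counter()
    pending_polls = 0
    errors = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # traceback lines and anything else that is not a log record
        levels[match["level"]] += 1
        if match["msg"] == "VDSM domain monitor status: PENDING":
            pending_polls += 1
        if match["level"] == "ERROR":
            errors.append((match["ts"], match["func"], match["msg"]))
    return {"levels": dict(levels), "pending_polls": pending_polls, "errors": errors}


if __name__ == "__main__":
    # Two records and one traceback line copied from the excerpt.
    sample = [
        "MainThread::INFO::2017-02-02 19:25:27,865::hosted_engine::841::"
        "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
        "(_get_domain_monitor_status) VDSM domain monitor status: PENDING",
        "MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::"
        "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
        "(_initialize_domain_monitor) Failed to start monitoring domain "
        "(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): "
        "timeout during domain acquisition",
        "Traceback (most recent call last):",
    ]
    print(summarize(sample))
```

Lines that do not match the pattern, such as the traceback, are skipped rather than treated as errors, so the counts only reflect real log records.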
-- 
Ralf Schenk
phone +49 (0) 24 05 / 40 83 70
fax +49 (0) 24 05 / 40 83 759
mail rs@databay.de

Databay AG
Jens-Otto-Krag-Straße 11
D-52146 Würselen
www.databay.de

Registered office/district court: Aachen, HRB:8437
VAT ID: DE 210844202
Executive board: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm. Philipp Hermanns
Chairman of the supervisory board: Wilhelm Dohmen