<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Mar 20, 2017 at 10:12 AM, Paolo Margara <span dir="ltr">&lt;<a href="mailto:paolo.margara@polito.it" target="_blank">paolo.margara@polito.it</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Yedidyah,<br>
<span class="gmail-"><br>
Il 19/03/2017 11:55, Yedidyah Bar David ha scritto:<br>
> > On Sat, Mar 18, 2017 at 12:25 PM, Paolo Margara <paolo.margara@polito.it> wrote:
> >> Hi list,
> >>
> >> I'm working on a system running oVirt 3.6, and the Engine repeatedly
> >> reports the warning "The Hosted Engine Storage Domain doesn't exist.
> >> It should be imported into the setup." in the Events tab of the Admin
> >> Portal.
> >>
> >> I've read on the list that the Hosted Engine Storage Domain should be
> >> imported automatically into the setup during the upgrade to 3.6 (the
> >> original setup was on 3.5), but this did not happen, although the
> >> HostedEngine VM is correctly visible in the VM tab after the upgrade.
> > Was the upgrade to 3.6 successful and clean?
> The upgrade from 3.5 to 3.6 was successful, as was every subsequent
> minor release upgrade. I rechecked the upgrade logs and haven't seen
> any relevant error.
> One additional piece of information: I'm currently running on CentOS 7,
> and the original setup was also on this release.
<div><div class="gmail-h5">&gt;<br>
&gt;&gt; The Hosted Engine Storage Domain is on a dedicated gluster volume but<br>
&gt;&gt; considering that, if I remember correctly, oVirt 3.5 at that time did<br>
&gt;&gt; not support gluster as a backend for the HostedEngine at that time I had<br>
&gt;&gt; installed the engine using gluster&#39;s NFS server using<br>
&gt;&gt; &#39;localhost:/hosted-engine&#39; as a mount point.<br>
&gt;&gt;<br>
&gt;&gt; Currently on every nodes I can read into the log of the<br>
&gt;&gt; ovirt-hosted-engine-ha agent the following lines:<br>
&gt;&gt;<br>
> >> MainThread::INFO::2017-03-17 14:04:17,773::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 3400)
> >> MainThread::INFO::2017-03-17 14:04:17,774::hosted_engine::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host virtnode-0-1 (id: 2, score: 3400)
> >> MainThread::INFO::2017-03-17 14:04:27,956::hosted_engine::613::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
> >> MainThread::INFO::2017-03-17 14:04:28,055::hosted_engine::658::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
> >> MainThread::INFO::2017-03-17 14:04:28,078::storage_server::218::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> >> MainThread::INFO::2017-03-17 14:04:28,278::storage_server::222::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> >> MainThread::INFO::2017-03-17 14:04:28,398::storage_server::230::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
> >> MainThread::INFO::2017-03-17 14:04:28,822::hosted_engine::685::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
> >> MainThread::INFO::2017-03-17 14:04:28,822::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
> >> MainThread::INFO::2017-03-17 14:04:29,308::hosted_engine::688::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
> >> MainThread::INFO::2017-03-17 14:04:29,309::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
> >> MainThread::WARNING::2017-03-17 14:04:29,567::ovf_store::104::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
> >> MainThread::ERROR::2017-03-17 14:04:29,691::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
> > This is normal at your current state.
> >
> >> ...and the following lines in the engine.log file inside the Hosted
> >> Engine:
> >>
> >> 2017-03-16 07:36:28,087 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
> >> 2017-03-16 07:36:28,115 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_NOT_EXIST
> > That's the thing to debug. Did you check the vdsm logs on the hosts,
> > near the time this happens?
> A few moments before that, I saw the following lines in the vdsm.log of
> the host that runs the hosted engine and is also the SPM, but I see the
> same lines on the other nodes as well:
>
> Thread-1746094::DEBUG::2017-03-16 07:36:00,412::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state init -> state preparing
> Thread-1746094::INFO::2017-03-16 07:36:00,413::logUtils::48::dispatcher::(wrapper) Run and protect: getImagesList(sdUUID='3b5db584-5d21-41dc-8f8d-712ce9423a27', options=None)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::199::Storage.ResourceManager.Request::(__init__) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27`ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '3313' at 'getImagesList'
> Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::545::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' for lock type 'shared'
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::604::Storage.ResourceManager::(registerResource) Resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' is free. Now locking as 'shared' (1 active user)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::239::Storage.ResourceManager.Request::(grant) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27`ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Granted request
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_resourcesAcquired: Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27 (shared)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting False
> Thread-1746094::ERROR::2017-03-16 07:36:00,415::task::866::Storage.TaskManager.Task::(_setError) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 3315, in getImagesList
>     images = dom.getAllImages()
>   File "/usr/share/vdsm/storage/fileSD.py", line 373, in getAllImages
>     self.getPools()[0],
> IndexError: list index out of range
> Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::885::Storage.TaskManager.Task::(_run) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._run: ae5af1a1-207c-432d-acfa-f3e03e014ee6 ('3b5db584-5d21-41dc-8f8d-712ce9423a27',) {} failed - stopping task
> Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::1246::Storage.TaskManager.Task::(stop) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::stopping in state preparing (force False)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting True
> Thread-1746094::INFO::2017-03-16 07:36:00,416::task::1171::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::aborting: Task is aborted: u'list index out of range' - code 100
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::1176::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Prepare: aborted: list index out of range
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 0 aborting True
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::928::Storage.TaskManager.Task::(_doAbort) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._doAbort: force False
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::resourceManager::980::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
> Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state preparing -> state aborting
> Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::550::Storage.TaskManager.Task::(__state_aborting) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_aborting: recover policy none
> Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state aborting -> state failed
>
> After that, I tried to execute a simple query on the storage domains
> using vdsClient and got the following information:
>
> # vdsClient -s 0 getStorageDomainsList
> 3b5db584-5d21-41dc-8f8d-712ce9423a27
> 0966f366-b5ae-49e8-b05e-bee1895c2d54
> 35223b83-e0bd-4c8d-91a9-8c6b85336e7d
> 2c3994e3-1f93-4f2a-8a0a-0b5d388a2be7
> # vdsClient -s 0 getStorageDomainInfo 3b5db584-5d21-41dc-8f8d-712ce9423a27
>     uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
>     version = 3
>     role = Regular
>     remotePath = localhost:/hosted-engine

Your issue is probably here: by design, all the hosts of a single
datacenter should be able to see all the storage domains, including the
hosted-engine one, but if you try to mount it as localhost:/hosted-engine
this will not be possible, since each host would resolve 'localhost' to
itself.
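
For instance, a quick way to spot this on each host is to check the
configured storage path. Here is a minimal Python sketch (it assumes the
usual key=value layout of /etc/ovirt-hosted-engine/hosted-engine.conf
and its 'storage' key; adjust the path if your setup differs):

#!/usr/bin/env python
# Sketch: warn when the hosted-engine storage path points at localhost.
CONF = '/etc/ovirt-hosted-engine/hosted-engine.conf'

def storage_path(conf=CONF):
    # Scan the key=value config file for the 'storage' entry.
    with open(conf) as f:
        for line in f:
            line = line.strip()
            if line.startswith('storage='):
                return line.split('=', 1)[1]
    return None

if __name__ == '__main__':
    path = storage_path()
    print('configured storage: %s' % path)
    if path and path.startswith('localhost:'):
        print('WARNING: every host resolves localhost to itself, so')
        print('this domain is not uniformly visible to the datacenter.')

Once the mount point is fixed, running this on every host should report
the same, cluster-wide resolvable path.
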
>     type = NFS
>     class = Data
>     pool = []
>     name = default
> # vdsClient -s 0 getImagesList 3b5db584-5d21-41dc-8f8d-712ce9423a27
> list index out of range
>
> All the other storage domains have the pool attribute defined; could
> this be the issue? How can I assign the Hosted Engine Storage Domain to
> a pool?

This will be the result of the auto-import process, once it is feasible.
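
The 'list index out of range' you get is consistent with that empty pool
attribute: getAllImages() in vdsm's fileSD.py takes the first element of
the domain's pool list (self.getPools()[0], as in the traceback above),
which raises IndexError as long as the domain is not attached to any
pool. A simplified, hypothetical illustration of that failure mode (not
the actual vdsm code):

class StorageDomain(object):
    def __init__(self, pools):
        # e.g. pools == [] for your hosted-engine domain
        self._pools = pools

    def getPools(self):
        return self._pools

    def getAllImages(self):
        # Equivalent of fileSD.py line 373: assumes at least one pool.
        first_pool = self.getPools()[0]  # IndexError when pool = []
        return 'images of pool %s' % first_pool

try:
    StorageDomain(pools=[]).getAllImages()
except IndexError as e:
    print('getImagesList fails: %s' % e)  # list index out of range
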
<div class="gmail-HOEnZb"><div class="gmail-h5">&gt;<br>
&gt;&gt; 2017-03-16 07:36:28,116 INFO<br>
&gt;&gt; [org.ovirt.engine.core.bll.<wbr>ImportHostedEngineStorageDomai<wbr>nCommand]<br>
&gt;&gt; (org.ovirt.thread.pool-8-<wbr>thread-38) [236d315c] Lock freed to object<br>
&gt;&gt; &#39;EngineLock:{exclusiveLocks=&#39;[<wbr>]&#39;, sharedLocks=&#39;null&#39;}&#39;<br>
&gt;&gt;<br>
&gt;&gt; How can I safely import the Hosted Engine Storage Domain into my setup?<br>
&gt;&gt; In this situation is safe to upgrade to oVirt 4.0?<br>
&gt; I&#39;d first try to solve this.<br>
&gt;<br>
&gt; What OS do you have on your hosts? Are they all upgraded to 3.6?<br>
&gt;<br>
&gt; See also:<br>
&gt;<br>
&gt; <a href="https://www.ovirt.org/documentation/how-to/hosted-engine-host-OS-upgrade/" rel="noreferrer" target="_blank">https://www.ovirt.org/<wbr>documentation/how-to/hosted-<wbr>engine-host-OS-upgrade/</a><br>
&gt;<br>
&gt; Best,<br>
&gt;<br>
&gt;&gt;<br>
&gt;&gt; Greetings,<br>
&gt;&gt;     Paolo<br>
&gt;&gt;<br>
&gt;&gt; ______________________________<wbr>_________________<br>
&gt;&gt; Users mailing list<br>
&gt;&gt; <a href="mailto:Users@ovirt.org">Users@ovirt.org</a><br>
&gt;&gt; <a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/users</a><br>
&gt;<br>
&gt;<br>
Greetings,<br>
    Paolo<br>
______________________________<wbr>_________________<br>
Users mailing list<br>
<a href="mailto:Users@ovirt.org">Users@ovirt.org</a><br>
<a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/users</a><br>
</div></div></blockquote></div><br></div></div>