<div dir="ltr">I am facing the same issue here as well. The engine comes up and the web UI is reachable. The initial login takes about 6 minutes to finally let me in, and once I am in, the Events tab shows events for "storage domain &lt;uid&gt; does not exist", yet the domains are all there. After this comes 'reconstructing master domain', and the engine cycles through my 2 storage domains (not including the ISO_UPLOAD and hosted_engine domains). Eventually it will either 1.) settle on one and actually bring it up as the master domain, or 2.) they all stay down and I have to manually activate one.<div><br></div><div>It's not really an issue, since on three tests now I have recovered fine. It required some manual intervention on at least one occasion; otherwise it just flaps about until it can settle on one domain and actually bring it up.</div><div><br></div><div>Clocking it today, it usually goes like this:</div><div><br></div><div>7 minutes for the HE to come up on node 1 and the web UI to become accessible</div><div>+6 minutes hanging while logging in to the web UI</div><div>+9 minutes for one of the two storage domains to get activated as master</div><div><br></div><div>Total: roughly 22 minutes before the entire cluster is usable.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Dec 3, 2016 at 11:14 AM, Yoann Laissus <span dir="ltr">&lt;<a href="mailto:yoann.laissus@gmail.com" target="_blank">yoann.laissus@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>
<br>
I'm running into some weird issues with vdsm and my storage domains<br>
after a reboot or a shutdown. I can't manage to figure out what's<br>
going on...<br>
<br>
Currently, my cluster (4.0.5 with hosted engine) is composed of one<br>
main node (plus another inactive one, unrelated to this issue).<br>
It has local storage exposed to oVirt via 3 NFS exports (one specific<br>
for the hosted engine vm) reachable from my local network.<br>
<br>
When I want to shut down or reboot my main host (and so the whole<br>
cluster), I use a custom script:<br>
1. Shut down all VMs<br>
2. Shut down the engine VM<br>
3. Stop the HA agent and broker<br>
4. Stop vdsmd<br>
5. Release the sanlock on the hosted engine SD<br>
6. Shut down / reboot<br>
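<br>
For readers following along, the six steps above could be sketched roughly as below.<br>
This is not the actual script: every command is an assumption about how each step<br>
might be done, and the helper path for step 1 is purely hypothetical. The sketch<br>
defaults to a dry run and does not wait for the engine VM to finish powering off.<br>

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    command: tuple


# All commands below are assumptions, not taken from the original script.
SHUTDOWN_STEPS = (
    # Hypothetical helper; the thread does not say how the VMs are stopped.
    Step("Shut down all VMs", ("/usr/local/bin/shutdown-all-vms",)),
    Step("Shut down engine VM", ("hosted-engine", "--vm-shutdown")),
    Step("Stop HA agent and broker",
         ("systemctl", "stop", "ovirt-ha-agent", "ovirt-ha-broker")),
    Step("Stop vdsmd", ("systemctl", "stop", "vdsmd")),
    Step("Release sanlock on hosted engine SD",
         ("sanlock", "client", "shutdown", "-f", "1")),
    Step("Shut down / reboot", ("systemctl", "reboot")),
)


def run_steps(steps, dry_run=True, runner=subprocess.run):
    """Print each step in order; execute it only when dry_run is False."""
    executed = []
    for number, step in enumerate(steps, start=1):
        print(f"{number}. {step.description}: {' '.join(step.command)}")
        if not dry_run:
            # check=True aborts the sequence if any step fails.
            runner(list(step.command), check=True)
        executed.append(step.description)
    return executed


if __name__ == "__main__":
    run_steps(SHUTDOWN_STEPS, dry_run=True)
```

Keeping the steps in one ordered table makes it easy to compare the sequence against what vdsm expects, and the dry run prints the plan without touching the host.<br>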
<br>
It works just fine, but at the next boot, VDSM takes at least 10-15<br>
minutes to find storage domains, except the hosted engine one. The<br>
engine loops trying to reconstruct the SPM.<br>
During this time, vdsClient getConnectedStoragePoolsList returns nothing.<br>
getStorageDomainsList returns only the hosted engine domain.<br>
NFS exports are mountable from another server.<br>
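<br>
To put a number on the delay, the check above could be polled after boot with something<br>
like the sketch below. It assumes vdsClient is invoked as "vdsClient -s 0 getStorageDomainsList"<br>
and prints one domain UUID per line; both are assumptions, and the function names are<br>
made up for illustration.<br>

```python
import subprocess
import time


def list_storage_domains(run=subprocess.run):
    """Return the domain IDs reported by vdsm (assumed: one UUID per line)."""
    result = run(
        ["vdsClient", "-s", "0", "getStorageDomainsList"],
        capture_output=True, text=True, check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def wait_for_domains(expected, timeout=1200.0, interval=15.0,
                     lister=list_storage_domains,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until at least `expected` domains are listed.

    Returns the elapsed seconds, or None if the timeout is reached first.
    """
    start = clock()
    while True:
        found = lister()
        elapsed = clock() - start
        if len(found) >= expected:
            return elapsed
        if elapsed >= timeout:
            return None
        sleep(interval)
```

With three NFS-backed domains as described, calling wait_for_domains(3) right after boot would return how many seconds vdsm took to report all of them, or None after 20 minutes.<br>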
<br>
But when I restart vdsm manually after the boot, it seems to detect<br>
the storage domains immediately.<br>
<br>
Is there some kind of stale storage data used by vdsm, with a timeout<br>
to invalidate it?<br>
Am I missing something on the vdsm side in my shutdown procedure?<br>
<br>
Thanks!<br>
<br>
Engine and vdsm logs are attached.<br>
<span class="HOEnZb"><font color="#888888"><br>
<br>
--<br>
Yoann Laissus<br>
</font></span><br>
<br></blockquote></div><br></div>