[ovirt-users] Storage domains not found by vdsm after a reboot

Nir Soffer nsoffer at redhat.com
Sat Dec 3 19:58:03 UTC 2016


On Sat, Dec 3, 2016 at 6:14 PM, Yoann Laissus <yoann.laissus at gmail.com> wrote:
> Hello,
>
> I'm running into some weird issues with vdsm and my storage domains
> after a reboot or a shutdown. I can't manage to figure out what's
> going on...
>
> Currently, my cluster (4.0.5 with hosted engine) is composed of one
> main node (and another inactive one, unrelated to this issue).
> It has local storage exposed to oVirt via 3 NFS exports (one specific
> for the hosted engine vm) reachable from my local network.
>
> When I want to shut down or reboot my main host (and so the whole
> cluster), I use a custom script:
> 1. Shut down all VMs
> 2. Shut down the engine VM
> 3. Stop HA agent and broker
> 4. Stop vdsmd

This leaves vdsm connected to all storage domains, and sanlock is
still maintaining the lockspace on all storage domains.

> 5. Release the sanlock on the hosted engine SD

You should not do that; use the local/global maintenance mode of the
hosted engine agent instead.
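
As a rough sketch of what that could look like: the hosted-engine CLI has a
--set-maintenance option that takes --mode=global, --mode=local or --mode=none,
and a --vm-shutdown option. The helper below only builds the command lines and
prints them unless dry_run is turned off. The function names and the ordering
(global maintenance first, then engine VM shutdown) are my assumptions, not a
documented procedure.

```python
import subprocess

VALID_MODES = ("global", "local", "none")


def maintenance_cmd(mode):
    """Return the hosted-engine command line for the given maintenance mode."""
    if mode not in VALID_MODES:
        raise ValueError("unknown maintenance mode: %r" % (mode,))
    return ["hosted-engine", "--set-maintenance", "--mode=%s" % mode]


def pre_shutdown_cmds():
    """Assumed order: enter global maintenance, then shut down the engine VM."""
    return [
        maintenance_cmd("global"),
        ["hosted-engine", "--vm-shutdown"],
    ]


def run_all(cmds, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    # Dry run only: shows what would be executed on the host.
    run_all(pre_shutdown_cmds())
```

After the host is back up, maintenance_cmd("none") gives the command that
leaves maintenance mode again.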

> 6. Shutdown / Reboot
>
> It works just fine, but at the next boot, VDSM takes at least 10-15
> minutes to find storage domains, except the hosted engine one. The
> engine loops trying to reconstruct the SPM.
> During this time, vdsClient getConnectedStoragePoolsList returns nothing.
> getStorageDomainsList returns only the hosted engine domain.
> NFS exports are mountable from another server.

The correct way to shut down a host is to move the host to maintenance.
This deactivates all storage domains on this host, releases the sanlock
leases, and disconnects from the storage server (e.g. logs out from iSCSI
connections, unmounts NFS mounts).

If you don't do this, sanlock will need more time to join the lockspace
on the next boot.
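
To see which lockspaces sanlock has joined after a boot, you can look at the
output of "sanlock client status". A small sketch for picking the lockspace
names out of that output, assuming lockspace lines start with "s " followed by
name:host_id:path:offset. The sample text is made up for illustration and is
not taken from the attached logs.

```python
def lockspaces(status_output):
    """Return lockspace names found in 'sanlock client status' output."""
    names = []
    for line in status_output.splitlines():
        if line.startswith("s "):
            # Assumed line format: "s <name>:<host_id>:<path>:<offset>"
            names.append(line[2:].split(":", 1)[0])
    return names


# Hypothetical sample output, for illustration only.
SAMPLE = """\
daemon 1234 host1
p -1 helper
s hosted-engine:1:/path/to/ids:0
s example-sd:1:/path/to/other/ids:0
"""

if __name__ == "__main__":
    print(lockspaces(SAMPLE))
```

If a storage domain is missing from the list right after boot, sanlock has
probably not finished joining its lockspace yet.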

I'm not sure what the correct procedure is when using hosted engine, since
hosted engine will not let you put a host into maintenance while the hosted
engine vm is running on this host. You can stop the hosted engine vm,
but then you cannot move the host into maintenance since you don't have
an engine :-)

There must be a documented way to perform this operation; I hope that
Simone will point us to the documentation.

Nir

>
> But when I restart vdsm manually after the boot, it seems to detect
> the storage domains immediately.
>
> Is there some kind of stale storage data used by vdsm and a timeout
> to invalidate it?
> Am I missing something on the vdsm side in my shutdown procedure ?
>
> Thanks !
>
> Engine and vdsm logs are attached.
>
>
> --
> Yoann Laissus
>