[ovirt-users] The Hosted Engine Storage Domain doesn't exist. It should be imported into the setup.

Paolo Margara paolo.margara at polito.it
Tue Mar 21 08:24:23 UTC 2017


Hi Simone,

I'll respond inline


On 20/03/2017 11:59, Simone Tiraboschi wrote:
>
>
> On Mon, Mar 20, 2017 at 11:15 AM, Simone Tiraboschi
> <stirabos at redhat.com> wrote:
>
>
>     On Mon, Mar 20, 2017 at 10:12 AM, Paolo Margara
>     <paolo.margara at polito.it> wrote:
>
>         Hi Yedidyah,
>
>         On 19/03/2017 11:55, Yedidyah Bar David wrote:
>         > On Sat, Mar 18, 2017 at 12:25 PM, Paolo Margara
>         <paolo.margara at polito.it> wrote:
>         >> Hi list,
>         >>
>         >> I'm working on a system running oVirt 3.6 and the Engine
>         is repeatedly reporting
>         >> the warning "The Hosted Engine Storage Domain doesn't
>         exist. It should
>         >> be imported into the setup." in the Events tab
>         of the Admin
>         >> Portal.
>         >>
>         >> I've read on the list that the Hosted Engine Storage Domain
>         should be
>         >> imported automatically into the setup during the upgrade to 3.6
>         >> (the original setup was on 3.5), but this did not happen, although the
>         >> HostedEngine VM is correctly visible in the VMs tab after the
>         upgrade.
>         > Was the upgrade to 3.6 successful and clean?
>         The upgrade from 3.5 to 3.6 was successful, as were all
>         subsequent minor
>         release upgrades. I rechecked the upgrade logs and haven't seen any
>         relevant error.
>         One additional piece of information: I'm currently running on
>         CentOS 7, and the
>         original setup was also on this release.
>         >
>         >> The Hosted Engine Storage Domain is on a dedicated gluster
>         volume but,
>         >> considering that, if I remember correctly, oVirt 3.5 did
>         not at that time
>         >> support gluster as a backend for the HostedEngine, I had
>         >> installed the engine using gluster's NFS server, with
>         >> 'localhost:/hosted-engine' as the mount point.
>         >>
>         >> Currently, on every node, I can see the following lines in the
>         >> log of the ovirt-hosted-engine-ha agent:
>         >>
>         >> MainThread::INFO::2017-03-17 14:04:17,773::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 3400)
>         >> MainThread::INFO::2017-03-17 14:04:17,774::hosted_engine::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host virtnode-0-1 (id: 2, score: 3400)
>         >> MainThread::INFO::2017-03-17 14:04:27,956::hosted_engine::613::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>         >> MainThread::INFO::2017-03-17 14:04:28,055::hosted_engine::658::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>         >> MainThread::INFO::2017-03-17 14:04:28,078::storage_server::218::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>         >> MainThread::INFO::2017-03-17 14:04:28,278::storage_server::222::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>         >> MainThread::INFO::2017-03-17 14:04:28,398::storage_server::230::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>         >> MainThread::INFO::2017-03-17 14:04:28,822::hosted_engine::685::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>         >> MainThread::INFO::2017-03-17 14:04:28,822::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>         >> MainThread::INFO::2017-03-17 14:04:29,308::hosted_engine::688::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
>         >> MainThread::INFO::2017-03-17 14:04:29,309::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
>         >> MainThread::WARNING::2017-03-17 14:04:29,567::ovf_store::104::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
>         >> MainThread::ERROR::2017-03-17 14:04:29,691::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
>         > This is normal in your current state.
>         >
>         >> ...and the following lines in engine.log
>         inside the Hosted
>         >> Engine:
>         >>
>         >> 2017-03-16 07:36:28,087 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
>         >> 2017-03-16 07:36:28,115 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_NOT_EXIST
>         > That's the thing to debug. Did you check vdsm logs on the
>         hosts, near
>         > the time this happens?
>         Some moments before, I saw the following lines in the
>         vdsm.log of the
>         host that runs the hosted engine and is also the SPM, but I
>         see the
>         same lines on the other nodes as well:
>
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,412::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state init -> state preparing
>         Thread-1746094::INFO::2017-03-16 07:36:00,413::logUtils::48::dispatcher::(wrapper) Run and protect: getImagesList(sdUUID='3b5db584-5d21-41dc-8f8d-712ce9423a27', options=None)
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::199::Storage.ResourceManager.Request::(__init__) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27`ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '3313' at 'getImagesList'
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::545::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' for lock type 'shared'
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::604::Storage.ResourceManager::(registerResource) Resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' is free. Now locking as 'shared' (1 active user)
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::239::Storage.ResourceManager.Request::(grant) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27`ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Granted request
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_resourcesAcquired: Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27 (shared)
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting False
>         Thread-1746094::ERROR::2017-03-16 07:36:00,415::task::866::Storage.TaskManager.Task::(_setError) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Unexpected error
>         Traceback (most recent call last):
>           File "/usr/share/vdsm/storage/task.py", line 873, in _run
>             return fn(*args, **kargs)
>           File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
>             res = f(*args, **kwargs)
>           File "/usr/share/vdsm/storage/hsm.py", line 3315, in getImagesList
>             images = dom.getAllImages()
>           File "/usr/share/vdsm/storage/fileSD.py", line 373, in getAllImages
>             self.getPools()[0],
>         IndexError: list index out of range
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::885::Storage.TaskManager.Task::(_run) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._run: ae5af1a1-207c-432d-acfa-f3e03e014ee6 ('3b5db584-5d21-41dc-8f8d-712ce9423a27',) {} failed - stopping task
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::1246::Storage.TaskManager.Task::(stop) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::stopping in state preparing (force False)
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting True
>         Thread-1746094::INFO::2017-03-16 07:36:00,416::task::1171::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::aborting: Task is aborted: u'list index out of range' - code 100
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::1176::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Prepare: aborted: list index out of range
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 0 aborting True
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::928::Storage.TaskManager.Task::(_doAbort) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._doAbort: force False
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,416::resourceManager::980::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state preparing -> state aborting
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::550::Storage.TaskManager.Task::(__state_aborting) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_aborting: recover policy none
>         Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state aborting -> state failed
>
>         After that I executed a simple query on the storage
>         domains using
>         vdsClient and got the following information:
>
>         # vdsClient -s 0 getStorageDomainsList
>         3b5db584-5d21-41dc-8f8d-712ce9423a27
>         0966f366-b5ae-49e8-b05e-bee1895c2d54
>         35223b83-e0bd-4c8d-91a9-8c6b85336e7d
>         2c3994e3-1f93-4f2a-8a0a-0b5d388a2be7
>         # vdsClient -s 0 getStorageDomainInfo 3b5db584-5d21-41dc-8f8d-712ce9423a27
>             uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
>             version = 3
>             role = Regular
>             remotePath = localhost:/hosted-engine
>
>
>     Your issue is probably here: by design all the hosts of a single
>     datacenter should be able to see all the storage domains, including
>     the hosted-engine one, but if you mount it as
>     localhost:/hosted-engine this will not be possible.
>      
>
>             type = NFS
>             class = Data
>             pool = []
>             name = default
>         # vdsClient -s 0 getImagesList 3b5db584-5d21-41dc-8f8d-712ce9423a27
>         list index out of range
>
>         All the other storage domains have the pool attribute defined;
>         could this be
>         the issue? How can I assign the Hosted Engine Storage Domain
>         to a pool?
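>
>         If I read the traceback above correctly, getAllImages() takes the
>         first element of the domain's pool list, so a domain that reports
>         'pool = []' would fail in exactly this way. In simplified form
>         (the variable names here are just illustrative, not the actual
>         vdsm code):
>
>         pools = []          # what this domain reports (pool = [])
>         spUUID = pools[0]   # IndexError: list index out of range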
>
>
>     This will be the result of the auto-import process, once it is feasible.
>      
>
>         >
>         >> 2017-03-16 07:36:28,116 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
>         >>
>         >> How can I safely import the Hosted Engine Storage Domain
>         into my setup?
>         >> In this situation, is it safe to upgrade to oVirt 4.0?
>
>
> This could be really tricky: upgrading to 4.1 a hosted-engine env that
> was deployed at 3.5 on a hyperconverged setup, but mounted over NFS on
> a localhost loopback mount, is something that is by far outside the
> paths we tested, so I think you can hit a few surprises there.
>
> In 4.1 the expected configuration under
> /etc/ovirt-hosted-engine/hosted-engine.conf includes:
> domainType=glusterfs
> storage=<FIRST_HOST_ADDR>:/path
> mnt_options=backup-volfile-servers=<SECOND_HOST_ADDR>:<THIRD_HOST_ADDR>
> But this requires more recent vdsm and ovirt-hosted-engine-ha versions.
> Then you also have to configure your engine to have both virt and
> gluster on the same cluster.
> Nothing is going to do this automatically for you on upgrade.
> I see two options here:
> 1. easy, but with substantial downtime: shut down your whole DC, start
> from scratch with gdeploy from 4.1 to configure a new gluster volume
> and a new engine over there; once you have the new engine, import your
> existing storage domains and restart your VMs.
> 2. a lot trickier: try to reach 4.1 status manually, editing
> /etc/ovirt-hosted-engine/hosted-engine.conf and so on, and upgrading
> everything to 4.1; this could be pretty risky because you would be on
> a path we never tested, since hyperconverged hosted-engine wasn't
> released at 3.5.
I understand, definitely bad news for me. But I'm currently running
oVirt 3.6.7 which, if I remember correctly, supports hyperconverged
setups; is it not possible to fix this issue on my current version? I've
installed vdsm 4.17.32-1.el7 with the vdsm-gluster package and
ovirt-hosted-engine-ha 1.3.5.7-1.el7.centos on CentOS 7.2.1511, and my
engine is already configured to have both virt and gluster on the same
cluster. Can't I put the cluster in maintenance, stop the
hosted-engine, stop ovirt-hosted-engine-ha, edit hosted-engine.conf to
change domainType, storage and mnt_options, and then restart
ovirt-hosted-engine-ha and the hosted-engine, along the lines of the
sketch below?
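
What I have in mind is roughly the following sequence; I'm reusing the
placeholders from your message for the gluster host addresses, and I'm
not claiming this exact procedure is safe:

# hosted-engine --set-maintenance --mode=global
# hosted-engine --vm-shutdown
# systemctl stop ovirt-ha-agent ovirt-ha-broker

(then, on every host, edit /etc/ovirt-hosted-engine/hosted-engine.conf:)
    domainType=glusterfs
    storage=<FIRST_HOST_ADDR>:/hosted-engine
    mnt_options=backup-volfile-servers=<SECOND_HOST_ADDR>:<THIRD_HOST_ADDR>

# systemctl start ovirt-ha-broker ovirt-ha-agent
# hosted-engine --vm-start
# hosted-engine --set-maintenance --mode=none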
>  
>
>         > I'd first try to solve this.
>         >
>         > What OS do you have on your hosts? Are they all upgraded to 3.6?
>         >
>         > See also:
>         > https://www.ovirt.org/documentation/how-to/hosted-engine-host-OS-upgrade/
>         >
>         > Best,
>
Greetings,
    Paolo