
On Mon, Mar 20, 2017 at 10:12 AM, Paolo Margara <paolo.margara@polito.it> wrote:
Hi Yedidyah,
On Sat, Mar 18, 2017 at 12:25 PM, Paolo Margara <paolo.margara@polito.it> wrote:
Hi list,
I'm working on a system running oVirt 3.6, and the Engine repeatedly reports the warning "The Hosted Engine Storage Domain doesn't exist. It should be imported into the setup." in the Events tab of the Admin Portal.
I've read on the list that the Hosted Engine Storage Domain should be imported automatically into the setup during the upgrade to 3.6 (the original setup was on 3.5), but this did not happen, although the HostedEngine VM is correctly visible in the VM tab after the upgrade.
Was the upgrade to 3.6 successful and clean?
The upgrade from 3.5 to 3.6 was successful, as was every subsequent minor release upgrade. I rechecked the upgrade logs and did not see any relevant error. One additional piece of information: I'm currently running CentOS 7, and the original setup was on this release as well.
The Hosted Engine Storage Domain is on a dedicated gluster volume. However, if I remember correctly, oVirt 3.5 did not support gluster as a backend for the HostedEngine at that time, so I installed the engine using gluster's NFS server, with 'localhost:/hosted-engine' as the mount point.
Currently, on every node, I can see the following lines in the log of the ovirt-hosted-engine-ha agent:
MainThread::INFO::2017-03-17 14:04:17,773::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 3400)
MainThread::INFO::2017-03-17 14:04:17,774::hosted_engine::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host virtnode-0-1 (id: 2, score: 3400)
MainThread::INFO::2017-03-17 14:04:27,956::hosted_engine::613::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2017-03-17 14:04:28,055::hosted_engine::658::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2017-03-17 14:04:28,078::storage_server::218::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2017-03-17 14:04:28,278::storage_server::222::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2017-03-17 14:04:28,398::storage_server::230::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::INFO::2017-03-17 14:04:28,822::hosted_engine::685::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
MainThread::INFO::2017-03-17 14:04:28,822::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
MainThread::INFO::2017-03-17 14:04:29,308::hosted_engine::688::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
MainThread::INFO::2017-03-17 14:04:29,309::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::WARNING::2017-03-17 14:04:29,567::ovf_store::104::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
MainThread::ERROR::2017-03-17 14:04:29,691::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
On 19/03/2017 11:55, Yedidyah Bar David wrote:
This is normal at your current state.
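For anyone reading along, here is a minimal sketch for picking the WARNING and ERROR entries out of an agent log like the one quoted above. The field layout is an assumption inferred only from these lines, and `LOG_LINE` and `problems` are hypothetical names, not part of ovirt-hosted-engine-ha.

```python
import re

# Assumed layout, inferred only from the lines quoted in this thread:
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
LOG_LINE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)"
    r"::(?P<timestamp>[^:]+:\d{2}:\d{2},\d{3})"
    r"::(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)"
    r"::\((?P<function>\w+)\) (?P<message>.*)$"
)


def problems(lines, levels=("WARNING", "ERROR")):
    """Return (timestamp, level, function, message) for entries at the given levels."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            found.append((
                match.group("timestamp"),
                match.group("level"),
                match.group("function"),
                match.group("message"),
            ))
    return found


# Three entries copied from the agent log quoted in this thread.
sample = [
    "MainThread::INFO::2017-03-17 14:04:29,309::config::206::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::"
    "(refresh_local_conf_file) Trying to get a fresher copy of vm "
    "configuration from the OVF_STORE",
    "MainThread::WARNING::2017-03-17 14:04:29,567::ovf_store::104::"
    "ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) "
    "Unable to find OVF_STORE",
    "MainThread::ERROR::2017-03-17 14:04:29,691::config::235::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::"
    "(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, "
    "falling back to initial vm.conf",
]

for entry in problems(sample):
    print(entry)
```

Lines that do not match the assumed layout are silently skipped, so a different agent version with a different log format would simply produce an empty list.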
The engine.log file inside the Hosted Engine contains the following lines:
2017-03-16 07:36:28,087 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
2017-03-16 07:36:28,115 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_NOT_EXIST
That's the thing to debug. Did you check vdsm logs on the hosts, near the time this happens?
A few moments earlier I saw the following lines in the vdsm.log of the host that runs the hosted engine and is also the SPM, and I see the same lines on the other nodes as well:
Thread-1746094::DEBUG::2017-03-16 07:36:00,412::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state init -> state preparing
Thread-1746094::INFO::2017-03-16 07:36:00,413::logUtils::48::dispatcher::(wrapper) Run and protect: getImagesList(sdUUID='3b5db584-5d21-41dc-8f8d-712ce9423a27', options=None)
Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::199::Storage.ResourceManager.Request::(__init__) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27` ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '3313' at 'getImagesList'
Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::545::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' for lock type 'shared'
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::604::Storage.ResourceManager::(registerResource) Resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' is free. Now locking as 'shared' (1 active user)
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::239::Storage.ResourceManager.Request::(grant) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27` ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Granted request
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_resourcesAcquired: Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27 (shared)
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting False
Thread-1746094::ERROR::2017-03-16 07:36:00,415::task::866::Storage.TaskManager.Task::(_setError) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3315, in getImagesList
    images = dom.getAllImages()
  File "/usr/share/vdsm/storage/fileSD.py", line 373, in getAllImages
    self.getPools()[0],
IndexError: list index out of range
Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::885::Storage.TaskManager.Task::(_run) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._run: ae5af1a1-207c-432d-acfa-f3e03e014ee6 ('3b5db584-5d21-41dc-8f8d-712ce9423a27',) {} failed - stopping task
Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::1246::Storage.TaskManager.Task::(stop) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::stopping in state preparing (force False)
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting True
Thread-1746094::INFO::2017-03-16 07:36:00,416::task::1171::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::aborting: Task is aborted: u'list index out of range' - code 100
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::1176::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Prepare: aborted: list index out of range
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 0 aborting True
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::928::Storage.TaskManager.Task::(_doAbort) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._doAbort: force False
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::resourceManager::980::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state preparing -> state aborting
Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::550::Storage.TaskManager.Task::(__state_aborting) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_aborting: recover policy none
Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state aborting -> state failed
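The traceback in the log above ends at `self.getPools()[0]`, which would raise exactly this IndexError if the domain reports no pools. The following is only an illustrative sketch of that failure mode. `FakeDomain` and `first_pool_or_none` are hypothetical names, and none of this is vdsm code.

```python
class FakeDomain:
    """Hypothetical stand-in for a storage domain object; not vdsm code."""

    def __init__(self, pools):
        self._pools = list(pools)

    def getPools(self):
        return self._pools

    def first_pool(self):
        # Same indexing pattern as the line shown in the traceback.
        return self.getPools()[0]


def first_pool_or_none(domain):
    """Defensive variant: return None when the domain has no pool."""
    pools = domain.getPools()
    return pools[0] if pools else None


attached = FakeDomain(["pool-uuid-placeholder"])  # placeholder, not a real UUID
unattached = FakeDomain([])                        # a domain with no pool

try:
    unattached.first_pool()
    error_text = None
except IndexError as exc:
    error_text = str(exc)

print(error_text)                      # the same message vdsm logged
print(first_pool_or_none(unattached))  # None instead of an exception
```

This only shows why an empty pool list and the logged error are consistent with each other; it does not say how the domain ended up without a pool.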
After that I tried to execute a simple query on storage domains using vdsClient and I got the following information:
# vdsClient -s 0 getStorageDomainsList
3b5db584-5d21-41dc-8f8d-712ce9423a27
0966f366-b5ae-49e8-b05e-bee1895c2d54
35223b83-e0bd-4c8d-91a9-8c6b85336e7d
2c3994e3-1f93-4f2a-8a0a-0b5d388a2be7
# vdsClient -s 0 getStorageDomainInfo 3b5db584-5d21-41dc-8f8d-712ce9423a27
uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
version = 3
role = Regular
remotePath = localhost:/hosted-engine
Your issue is probably here: by design, all the hosts of a single datacenter should be able to see all the storage domains, including the hosted-engine one, but if you try to mount it as localhost:/hosted-engine this will not be possible.
type = NFS
class = Data
pool = []
name = default
# vdsClient -s 0 getImagesList 3b5db584-5d21-41dc-8f8d-712ce9423a27
list index out of range
All the other storage domains have the pool attribute defined; could this be the issue? How can I assign the Hosted Engine Storage Domain to a pool?
This will be the result of the auto import process once feasible.
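The two symptoms discussed above (an empty pool list and a remotePath on localhost) can be checked mechanically. Below is a minimal sketch that parses getStorageDomainInfo output of the shape quoted in this thread. The "key = value" layout is assumed from that sample only, and `parse_domain_info` and `diagnose` are hypothetical helper names.

```python
def parse_domain_info(text):
    """Parse 'key = value' lines into a dict (layout assumed from this thread)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def diagnose(info):
    """Return human-readable findings for the two symptoms seen in this thread."""
    findings = []
    # Assumption: an unattached domain prints its pool list as '[]'.
    if info.get("pool") == "[]":
        findings.append("domain is not attached to any pool")
    host = info.get("remotePath", "").split(":", 1)[0]
    if host in ("localhost", "127.0.0.1"):
        findings.append("remotePath points at the local host")
    return findings


# Output copied from the getStorageDomainInfo call quoted in this thread.
sample = """\
uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
version = 3
role = Regular
remotePath = localhost:/hosted-engine
type = NFS
class = Data
pool = []
name = default
"""

info = parse_domain_info(sample)
for finding in diagnose(info):
    print(finding)
```

Feeding it the output for each UUID returned by getStorageDomainsList would show at a glance which domains differ from the others.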
2017-03-16 07:36:28,116 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
How can I safely import the Hosted Engine Storage Domain into my setup? In this situation, is it safe to upgrade to oVirt 4.0?
I'd first try to solve this.
What OS do you have on your hosts? Are they all upgraded to 3.6?
See also:
engine-host-OS-upgrade/
Best,
Greetings, Paolo
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users