
Hi Simone,

I'll respond inline.

On 20/03/2017 11:59, Simone Tiraboschi wrote:
On Mon, Mar 20, 2017 at 11:15 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:
On Mon, Mar 20, 2017 at 10:12 AM, Paolo Margara <paolo.margara@polito.it> wrote:
Hi Yedidyah,
On 19/03/2017 11:55, Yedidyah Bar David wrote:
> On Sat, Mar 18, 2017 at 12:25 PM, Paolo Margara <paolo.margara@polito.it> wrote:
>> Hi list,
>>
>> I'm working on a system running oVirt 3.6, and the Engine is repeatedly
>> reporting the warning "The Hosted Engine Storage Domain doesn't exist. It
>> should be imported into the setup." in the Events tab of the Admin Portal.
>>
>> I've read on the list that the Hosted Engine Storage Domain should be
>> imported automatically into the setup during the upgrade to 3.6 (the
>> original setup was on 3.5), but this did not happen, although the
>> HostedEngine is correctly visible in the VM tab after the upgrade.
> Was the upgrade to 3.6 successful and clean?
The upgrade from 3.5 to 3.6 was successful, as were all the subsequent
minor release upgrades. I rechecked the upgrade logs and haven't seen any
relevant error.
One additional piece of information: I'm currently running on CentOS 7,
and the original setup was also on this release.
>
>> The Hosted Engine Storage Domain is on a dedicated gluster volume, but
>> since, if I remember correctly, oVirt 3.5 did not support gluster as a
>> backend for the HostedEngine at that time, I had installed the engine
>> using gluster's NFS server, with 'localhost:/hosted-engine' as the mount
>> point.
>>
>> Currently, on every node I can read the following lines in the log of
>> the ovirt-hosted-engine-ha agent:
>>
>> MainThread::INFO::2017-03-17 14:04:17,773::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 3400)
>> MainThread::INFO::2017-03-17 14:04:17,774::hosted_engine::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host virtnode-0-1 (id: 2, score: 3400)
>> MainThread::INFO::2017-03-17 14:04:27,956::hosted_engine::613::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>> MainThread::INFO::2017-03-17 14:04:28,055::hosted_engine::658::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>> MainThread::INFO::2017-03-17 14:04:28,078::storage_server::218::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>> MainThread::INFO::2017-03-17 14:04:28,278::storage_server::222::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>> MainThread::INFO::2017-03-17 14:04:28,398::storage_server::230::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>> MainThread::INFO::2017-03-17 14:04:28,822::hosted_engine::685::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>> MainThread::INFO::2017-03-17 14:04:28,822::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>> MainThread::INFO::2017-03-17 14:04:29,308::hosted_engine::688::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
>> MainThread::INFO::2017-03-17 14:04:29,309::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
>> MainThread::WARNING::2017-03-17 14:04:29,567::ovf_store::104::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
>> MainThread::ERROR::2017-03-17 14:04:29,691::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
> This is normal at your current state.
>
>> ...and the following lines in the logfile engine.log inside the Hosted
>> Engine:
>>
>> 2017-03-16 07:36:28,087 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
>> 2017-03-16 07:36:28,115 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_NOT_EXIST
> That's the thing to debug. Did you check vdsm logs on the hosts, near
> the time this happens?
Some moments before, I saw the following lines in the vdsm.log of the host
that runs the hosted engine and is the SPM, but I see the same lines on
the other nodes as well:
Thread-1746094::DEBUG::2017-03-16 07:36:00,412::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state init -> state preparing
Thread-1746094::INFO::2017-03-16 07:36:00,413::logUtils::48::dispatcher::(wrapper) Run and protect: getImagesList(sdUUID='3b5db584-5d21-41dc-8f8d-712ce9423a27', options=None)
Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::199::Storage.ResourceManager.Request::(__init__) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27`ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '3313' at 'getImagesList'
Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::545::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' for lock type 'shared'
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::604::Storage.ResourceManager::(registerResource) Resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' is free. Now locking as 'shared' (1 active user)
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::239::Storage.ResourceManager.Request::(grant) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27`ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Granted request
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_resourcesAcquired: Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27 (shared)
Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting False
Thread-1746094::ERROR::2017-03-16 07:36:00,415::task::866::Storage.TaskManager.Task::(_setError) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3315, in getImagesList
    images = dom.getAllImages()
  File "/usr/share/vdsm/storage/fileSD.py", line 373, in getAllImages
    self.getPools()[0],
IndexError: list index out of range
Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::885::Storage.TaskManager.Task::(_run) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._run: ae5af1a1-207c-432d-acfa-f3e03e014ee6 ('3b5db584-5d21-41dc-8f8d-712ce9423a27',) {} failed - stopping task
Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::1246::Storage.TaskManager.Task::(stop) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::stopping in state preparing (force False)
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting True
Thread-1746094::INFO::2017-03-16 07:36:00,416::task::1171::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::aborting: Task is aborted: u'list index out of range' - code 100
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::1176::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Prepare: aborted: list index out of range
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 0 aborting True
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::928::Storage.TaskManager.Task::(_doAbort) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._doAbort: force False
Thread-1746094::DEBUG::2017-03-16 07:36:00,416::resourceManager::980::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state preparing -> state aborting
Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::550::Storage.TaskManager.Task::(__state_aborting) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_aborting: recover policy none
Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state aborting -> state failed
After that, I tried to execute a simple query on the storage domains using vdsClient and got the following information:
# vdsClient -s 0 getStorageDomainsList
3b5db584-5d21-41dc-8f8d-712ce9423a27
0966f366-b5ae-49e8-b05e-bee1895c2d54
35223b83-e0bd-4c8d-91a9-8c6b85336e7d
2c3994e3-1f93-4f2a-8a0a-0b5d388a2be7
# vdsClient -s 0 getStorageDomainInfo 3b5db584-5d21-41dc-8f8d-712ce9423a27
	uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
	version = 3
	role = Regular
	remotePath = localhost:/hosted-engine
Your issue is probably here: by design, all the hosts of a single datacenter should be able to see all the storage domains, including the hosted-engine one, but if you mount it as localhost:/hosted-engine this will not be possible, since each host resolves 'localhost' to itself.
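To illustrate (a minimal sketch; host-1 and host-2 are placeholder names and /mnt is just an example mount point, none of them taken from this thread): the same remotePath points at a different NFS server on every host:

[root@host-1 ~]# mount -t nfs localhost:/hosted-engine /mnt    # mounts host-1's own export
[root@host-2 ~]# mount -t nfs localhost:/hosted-engine /mnt    # mounts host-2's own export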
	type = NFS
	class = Data
	pool = []
	name = default
# vdsClient -s 0 getImagesList 3b5db584-5d21-41dc-8f8d-712ce9423a27
list index out of range
All the other storage domains have the pool attribute defined; could this be the issue? How can I assign the Hosted Engine Storage Domain to a pool?
This will be the result of the auto-import process, once it is feasible.
>
>> 2017-03-16 07:36:28,116 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
>>
>> How can I safely import the Hosted Engine Storage Domain into my setup?
>> In this situation, is it safe to upgrade to oVirt 4.0?
This could be really tricky: upgrading a hosted-engine environment that was deployed at 3.5 on a hyperconverged setup, but mounted over NFS on a localhost loopback mount, to 4.1 is far outside the paths we tested, so I think you could hit a few surprises there.
In 4.1 the expected configuration under /etc/ovirt-hosted-engine/hosted-engine.conf includes:

domainType=glusterfs
storage=<FIRST_HOST_ADDR>:/path
mnt_options=backup-volfile-servers=<SECOND_HOST_ADDR>:<THIRD_HOST_ADDR>

But this requires more recent vdsm and ovirt-hosted-engine-ha versions. Then you also have to configure your engine to have both virt and gluster on the same cluster. Nothing is going to do this automatically for you on upgrade.

I see two options here:
1. Easy, but with substantial downtime: shut down your whole DC, start from scratch with gdeploy from 4.1 to configure a new gluster volume and a new engine over there; once you have the new engine, import your existing storage domains and restart your VMs.
2. A lot trickier: try to reach 4.1 status by manually editing /etc/ovirt-hosted-engine/hosted-engine.conf and so on while upgrading everything to 4.1; this could be pretty risky, because you would be on a path we never tested, since hyperconverged hosted-engine wasn't released at 3.5.
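For concreteness, the 4.1-style entries above would look like this with three hypothetical hosts (gluster-1/2/3.example.com and a volume named 'engine' are placeholders, not values from this thread):

domainType=glusterfs
storage=gluster-1.example.com:/engine
mnt_options=backup-volfile-servers=gluster-2.example.com:gluster-3.example.com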
I understand; definitely bad news for me. But I'm currently running oVirt 3.6.7, which, if I remember correctly, supports hyperconverged setups. Is it not possible to fix this issue on my current version? I've installed vdsm 4.17.32-1.el7 with the vdsm-gluster package and ovirt-hosted-engine-ha 1.3.5.7-1.el7.centos on CentOS 7.2.1511, and my engine is already configured to have both virt and gluster on the same cluster. Can't I put the cluster into maintenance, stop the hosted engine, stop ovirt-hosted-engine-ha, edit hosted-engine.conf (changing domainType, storage and mnt_options), and then restart ovirt-hosted-engine-ha and the hosted engine?
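For reference, the manual path asked about above would look roughly like the sketch below, run with the engine VM down. This is an untested outline assembled only from the steps named in this thread (the service names are those used by ovirt-hosted-engine-ha 1.3, and <HOST_ADDR>/<SECOND_HOST_ADDR>/<THIRD_HOST_ADDR> are placeholders), and it is exactly the kind of untested path warned about above:

# hosted-engine --set-maintenance --mode=global
# hosted-engine --vm-shutdown
# systemctl stop ovirt-ha-agent ovirt-ha-broker

...then edit /etc/ovirt-hosted-engine/hosted-engine.conf on every host:

domainType=glusterfs
storage=<HOST_ADDR>:/hosted-engine
mnt_options=backup-volfile-servers=<SECOND_HOST_ADDR>:<THIRD_HOST_ADDR>

...and finally restart the HA services and exit maintenance:

# systemctl start ovirt-ha-broker ovirt-ha-agent
# hosted-engine --set-maintenance --mode=none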
> I'd first try to solve this.
>
> What OS do you have on your hosts? Are they all upgraded to 3.6?
>
> See also:
>
> https://www.ovirt.org/documentation/how-to/hosted-engine-host-OS-upgrade/
>
> Best,
>
>> Greetings,
>> Paolo
Greetings,
Paolo