On Mon, Oct 3, 2016 at 6:47 PM, Sam Cappello <samc@oracool.net> wrote:
Hi,
so i was running a 3.4 hosted engine two node setup on centos 6, had some disk issues so i tried to upgrade to centos 7 and follow the path 3.4 > 3.5 > 3.6 > 4.0. i screwed up big time somewhere between 3.6 and 4.0, so i wiped the drives, installed a fresh 4.0.3, then created the database and restored the 3.6 engine backup before running engine-setup as per the docs. things seemed to work, but i have the following issues / symptoms:
- ovirt-ha-agent running 100% CPU on both nodes
- messages in the UI that the Hosted Engine storage Domain isn't active and Failed to import the Hosted Engine Storage Domain
- hosted engine is not visible in the UI
and the following repeating in the agent.log:

MainThread::INFO::2016-10-03 12:38:27,718::hosted_engine::461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 3400)
MainThread::INFO::2016-10-03 12:38:27,720::hosted_engine::466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host vmhost1.oracool.net (id: 1, score: 3400)
MainThread::INFO::2016-10-03 12:38:37,979::states::421::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
MainThread::INFO::2016-10-03 12:38:37,985::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2016-10-03 12:38:45,645::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2016-10-03 12:38:45,647::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2016-10-03 12:39:00,543::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2016-10-03 12:39:00,562::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::INFO::2016-10-03 12:39:01,235::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
MainThread::INFO::2016-10-03 12:39:01,236::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
MainThread::INFO::2016-10-03 12:39:09,295::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
MainThread::INFO::2016-10-03 12:39:09,296::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::WARNING::2016-10-03 12:39:16,928::ovf_store::107::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE

The engine will automatically create it once the hosted-engine storage domain and the engine VM have been correctly imported.
 
MainThread::ERROR::2016-10-03 12:39:16,934::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
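
When the agent repeats the same loop over and over, it can help to filter the log down to just the WARNING and ERROR entries. Here is a minimal sketch, assuming every line follows the `::`-separated layout seen in the excerpt above (thread, level, timestamp, module, line number, logger, then the function name in parentheses and the message). The names `LINE_RE` and `problems` are only illustrative and are not part of ovirt-hosted-engine-ha.

```python
import re

# Layout assumed from the excerpt above:
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
LINE_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<func>[^)]+)\) (?P<msg>.*)$"
)


def problems(lines, levels=("WARNING", "ERROR")):
    """Yield (timestamp, level, function, message) for lines at the given levels.

    Lines that do not match the assumed layout are skipped silently.
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") in levels:
            yield m.group("ts"), m.group("level"), m.group("func"), m.group("msg")


# Three lines copied from the excerpt above.
SAMPLE = [
    "MainThread::INFO::2016-10-03 12:39:09,295::hosted_engine::669::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_initialize_storage_images) Reloading vm.conf from the shared storage domain",
    "MainThread::WARNING::2016-10-03 12:39:16,928::ovf_store::107::"
    "ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::"
    "(scan) Unable to find OVF_STORE",
    "MainThread::ERROR::2016-10-03 12:39:16,934::config::235::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::"
    "(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, "
    "falling back to initial vm.conf",
]

if __name__ == "__main__":
    for ts, level, func, msg in problems(SAMPLE):
        print(f"{ts} {level:<7} {func}: {msg}")
```

To run it against a real log, pass an open file instead of `SAMPLE`, e.g. `problems(open(path))`, where `path` points at the agent.log on the host.
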

I have searched a bit and not really found a solution, and have come to the conclusion that i have made a mess of things, and am wondering if the best solution is to export the VMs, and reinstall everything then import them back?
i am using remote  NFS storage.
if i try and add the hosted engine storage domain it says it is already registered.

The best option here is to manually remove it from the DB and let the engine import it again.
I'm working on a helper utility for this, but it's not fully tested yet:
https://gerrit.ovirt.org/#/c/64966/
 
i have also upgraded and am now running oVirt Engine Version: 4.0.4.4-1.el7.centos
hosts were installed using ovirt-node.  currently at  3.10.0-327.28.3.el7.x86_64
if a fresh install is best, any advice / pointer to doc that explains best way to do this?
i have not moved my most important server over to this cluster yet so i can take some downtime to reinstall.
thanks!
sam


