
Hello,

I am attempting to build a fairly basic proof-of-concept oVirt 3.6 machine with the engine, host, and guests all on a single physical CentOS 7.2 box with local storage (to start), but I am running into an issue where engine-setup produces the following error:

[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.

The engine web console seems to work fine after this - I can log in and click around - but, as expected, there are no host CPU/storage resources available. From the logs shown below, the VDSM installation process appears to have been unable to contact the engine during installation. Any ideas what is going wrong, and/or what I can do to debug things further so I can move beyond this issue?

Thanks,
-Matthew

Steps I took to install and diagnose the problem:

1. Installed the system with our configuration management system.

2. Deleted the users holding UID 36 (needed by the vdsm user) and UID 108 (needed by the ovirt user).

3. Ensured that /etc/sudoers contains the line "#includedir /etc/sudoers.d" so that /etc/sudoers.d/50_vdsm will take effect.

4. Made directories for ISOs and images:

# mkdir /state/partition1/images/; chmod 777 /state/partition1/images/
# mkdir /state/partition1/iso/; chmod 777 /state/partition1/iso/

5. Ensured SELinux was disabled and no firewall rules were installed.

6. Installed the RPMs:

# yum -y install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
# yum -y install ovirt-engine ovirt-engine-setup-plugin-allinone
# rpm -qa | grep ovirt-release36
ovirt-release36-002-2.noarch

7. Installed the engine with an all-in-one configuration (answered "Yes" to "Configure VDSM on this host?"):

# cat /root/ovirt-engine.ans
# action=setup
[environment:default]
OVESETUP_DIALOG/confirmSettings=bool:True
OVESETUP_CONFIG/applicationMode=str:virt
OVESETUP_CONFIG/remoteEngineSetupStyle=none:None
OVESETUP_CONFIG/sanWipeAfterDelete=bool:False
OVESETUP_CONFIG/storageIsLocal=bool:False
OVESETUP_CONFIG/firewallManager=none:None
OVESETUP_CONFIG/remoteEngineHostRootPassword=none:None
OVESETUP_CONFIG/firewallChangesReview=none:None
OVESETUP_CONFIG/updateFirewall=bool:False
OVESETUP_CONFIG/remoteEngineHostSshPort=none:None
OVESETUP_CONFIG/fqdn=<...host.fqdn...>
OVESETUP_CONFIG/storageType=none:None
OSETUP_RPMDISTRO/requireRollback=none:None
OSETUP_RPMDISTRO/enableUpgrade=none:None
OVESETUP_DB/secured=bool:False
OVESETUP_DB/host=str:localhost
OVESETUP_DB/user=str:engine
OVESETUP_DB/dumper=str:pg_custom
OVESETUP_DB/database=str:engine
OVESETUP_DB/fixDbViolations=none:None
OVESETUP_DB/port=int:5432
OVESETUP_DB/filter=none:None
OVESETUP_DB/restoreJobs=int:2
OVESETUP_DB/securedHostValidation=bool:False
OVESETUP_ENGINE_CORE/enable=bool:True
OVESETUP_CORE/engineStop=none:None
OVESETUP_SYSTEM/memCheckEnabled=bool:True
OVESETUP_SYSTEM/nfsConfigEnabled=bool:True
OVESETUP_PKI/organization=str:<...dn...>
OVESETUP_PKI/renew=none:None
OVESETUP_CONFIG/isoDomainName=str:ISO_DOMAIN
OVESETUP_CONFIG/engineHeapMax=str:7975M
OVESETUP_CONFIG/adminPassword=str:<...password...>
OVESETUP_CONFIG/isoDomainACL=str:*(rw)
OVESETUP_CONFIG/isoDomainMountPoint=str:/state/partition1/iso
OVESETUP_CONFIG/engineDbBackupDir=str:/var/lib/ovirt-engine/backups
OVESETUP_CONFIG/engineHeapMin=str:7975M
OVESETUP_AIO/configure=bool:True
OVESETUP_AIO/storageDomainName=str:local_storage
OVESETUP_AIO/storageDomainDir=str:/state/partition1/images/
OVESETUP_PROVISIONING/postgresProvisioningEnabled=bool:True
OVESETUP_APACHE/configureRootRedirection=bool:True
OVESETUP_APACHE/configureSsl=bool:True
OVESETUP_VMCONSOLE_PROXY_CONFIG/vmconsoleProxyConfig=bool:True
OVESETUP_ENGINE_CONFIG/fqdn=str:<...fqdn...>
OVESETUP_CONFIG/websocketProxyConfig=bool:True

# engine-setup --config-append=/root/ovirt-engine.ans
...
[ INFO  ] Starting engine service
[ INFO  ] Restarting httpd
[ INFO  ] Waiting for VDSM host to become operational. This may take several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
[WARNING] Local storage domain not added because the VDSM host was not up. Please add it manually.
[ INFO  ] Restarting ovirt-vmconsole proxy service
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20151228113003-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Execution of setup completed successfully

8. Examined /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log and found this error message, which seems to indicate that the VDSM installation process was unable to contact the engine:

2015-12-28 11:29:29 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'start', 'httpd.service') stderr:
2015-12-28 11:29:29 DEBUG otopi.context context._executeMethod:142 Stage closeup METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi.Plugin._closeup
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:211 Connecting to the Engine
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:103 Waiting Engine API response
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py", line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in __init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 115, in request
    persistent_auth=self.__persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
    persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError:
status: 503
reason: Service Unavailable
detail: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.</p>
</body></html>

2015-12-28 11:29:36 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py", line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in __init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 115, in request
    persistent_auth=self.__persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
    persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError:
status: 404
reason: Not Found
detail: ...
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<link id="id-link-favicon" rel="shortcut icon" href="/ovirt-engine/theme-resource/favicon" type="image/x-icon" />
<title>404 - Page not found</title>
...
</html>

2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:171 Database: 'None', Statement: 'select version, option_value from vdc_options where option_name = %(name)s', args: {'name': 'SupportedClusterLevels'}
2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:221 Result: [{'version': 'general', 'option_value': '3.0,3.1,3.2,3.3,3.4,3.5,3.6'}]
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:225 engine SupportedClusterLevels [3.0,3.1,3.2,3.3,3.4,3.5,3.6], PACKAGE_VERSION [3.6.1.3],
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._getSupportedClusterLevels:181 Attempting to load the dsaversion vdsm module
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:236 VDSM SupportedClusterLevels [['3.4', '3.5', '3.6']], VDSM VERSION [4.17.13-0.el7.centos],
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:259 Creating the local data center
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:269 Creating the local cluster into the local data center
2015-12-28 11:29:52 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:284 Adding the local host to the local cluster
2015-12-28 11:29:55 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:58 Waiting for VDSM host to become operational. This may take several minutes...
2015-12-28 11:29:55 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:56 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:58 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:59 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:00 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:01 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:02 ERROR otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:77 The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
2015-12-28 11:30:02 WARNING otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:306 Local storage domain not added because the VDSM host was not up. Please add it manually.
2015-12-28 11:30:02 DEBUG otopi.context context._executeMethod:142 Stage closeup METHOD otopi.plugins.ovirt_engine_setup.vmconsole_proxy_helper.system.Plugin._closeup

9. Looked at the VDSM logs and examined service status:

# ls -l /var/log/vdsm/
total 8
drwxr-xr-x 2 vdsm kvm     6 Dec  9 03:24 backup
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 connectivity.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 mom.log
-rw-r--r-- 1 root root 2958 Dec 28 11:30 supervdsm.log
-rw-r--r-- 1 root root 1811 Dec 28 11:30 upgrade.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 vdsm.log

# cat /var/log/vdsm/upgrade.log
MainThread::DEBUG::2015-12-28 11:30:02,803::upgrade::90::upgrade::(apply_upgrade) Running upgrade upgrade-unified-persistence
MainThread::DEBUG::2015-12-28 11:30:02,806::libvirtconnection::160::root::(get) trying to connect libvirt
MainThread::DEBUG::2015-12-28 11:30:02,813::utils::669::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table main (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,826::utils::687::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2015-12-28 11:30:02,826::unified_persistence::46::root::(run) upgrade-unified-persistence upgrade persisting networks {} and bondings {}
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::utils::669::root::(execCmd) /usr/share/vdsm/vdsm-store-net-config unified (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,836::utils::687::root::(execCmd) SUCCESS: <err> = 'cp: cannot stat \xe2\x80\x98/var/run/vdsm/netconf\xe2\x80\x99: No such file or directory\n'; <rc> = 0
MainThread::DEBUG::2015-12-28 11:30:02,837::upgrade::51::upgrade::(_upgrade_seal) Upgrade upgrade-unified-persistence successfully performed

# cat /var/log/vdsm/supervdsm.log
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::539::SuperVdsm.Server::(main) Making sure I'm root - SuperVdsm
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::548::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::551::SuperVdsm.Server::(main) Cleaning old socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::555::SuperVdsm.Server::(main) Setting up keep alive thread
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::561::SuperVdsm.Server::(main) Creating remote object manager
MainThread::DEBUG::2015-12-28 11:30:02,416::fileUtils::192::Storage.fileUtils::(chown) Changing owner for /var/run/vdsm/svdsm.sock, to (36:36)
MainThread::DEBUG::2015-12-28 11:30:02,416::supervdsmServer::572::SuperVdsm.Server::(main) Started serving super vdsm object
sourceRoute::DEBUG::2015-12-28 11:30:02,416::sourceroutethread::79::root::(_subscribeToInotifyLoop) sourceRouteThread.subscribeToInotifyLoop started
restore-net::DEBUG::2015-12-28 11:30:03,080::libvirtconnection::160::root::(get) trying to connect libvirt
restore-net::INFO::2015-12-28 11:30:03,188::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.1.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.0.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.3.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.2.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::385::root::(restore) starting network restoration.
restore-net::DEBUG::2015-12-28 11:30:03,189::vdsm-restore-net-config::183::root::(_remove_networks_in_running_config) Not cleaning running configuration since it is empty.
restore-net::INFO::2015-12-28 11:30:03,205::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28 11:30:03,206::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
restore-net::INFO::2015-12-28 11:30:03,206::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28 11:30:03,207::vdsm-restore-net-config::329::root::(_wait_for_for_all_devices_up) All devices are up.
restore-net::INFO::2015-12-28 11:30:03,214::netconfpersistence::71::root::(setBonding) Adding bond0({'nics': [], 'options': ''})
restore-net::INFO::2015-12-28 11:30:03,214::vdsm-restore-net-config::396::root::(restore) restoration completed successfully.
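Since vdsm.log and mom.log are both zero bytes, I also want to confirm whether vdsmd itself (not just supervdsmd) ever came up. This is the quick check I put together; the unit names (vdsmd, supervdsmd, mom-vdsm) and the engine-facing port 54321 are my assumptions from the vdsm packaging, so adjust if your install differs:

```shell
# Sketch: report the state of the vdsm service chain and its network port.
# Unit names and the port number are assumptions, not taken from my logs.
check_vdsm() {
    for unit in vdsmd supervdsmd mom-vdsm; do
        # systemctl prints active/inactive/failed; fall back to "unknown"
        # when systemd is not reachable
        state=$(systemctl is-active "$unit" 2>/dev/null)
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
    # vdsm normally listens on TCP 54321 for the engine
    if ss -tln 2>/dev/null | grep -q ':54321 '; then
        echo "vdsm port 54321: listening"
    else
        echo "vdsm port 54321: not listening"
    fi
}
check_vdsm
```

On this box supervdsmd is clearly up (see the status output below), so the interesting question is whether vdsmd and mom ever started.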
# systemctl status supervdsmd
● supervdsmd.service - Auxiliary vdsm service for running helper functions as root
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static; vendor preset: enabled)
   Active: active (running) since Mon 2015-12-28 11:30:02 EST; 1h 8min ago
 Main PID: 81535 (supervdsmServer)
   CGroup: /system.slice/supervdsmd.service
           └─81535 /usr/bin/python /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock

Dec 28 11:30:02 hostname systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
Dec 28 11:30:02 hostname systemd[1]: Starting Auxiliary vdsm service for running helper functions as root.
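Unless someone has a better idea, my next step is to tail the newest engine host-deploy log, which (if I am reading the docs right) is where the real bootstrap failure usually lands. A rough sketch; the /var/log/ovirt-engine/host-deploy path is my assumption from the oVirt documentation:

```shell
# Sketch: dump the tail of the newest file in a log directory.
# The host-deploy path below is an assumption, not verified on this box.
show_latest_log() {
    dir=$1
    # newest entry first; empty if the directory is missing or empty
    latest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    if [ -n "$latest" ]; then
        tail -n 40 "$dir/$latest"
    else
        echo "no logs found in $dir"
    fi
}
show_latest_log /var/log/ovirt-engine/host-deploy
```

If that directory turns out to be empty, that would itself suggest the engine never managed to start deploying the host at all.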