Hello,

I am attempting to build a fairly basic proof-of-concept oVirt 3.6 machine with the engine, host, and guests all on a single physical CentOS 7.2 box with local storage (to start), but I am running into an issue where engine-setup produces the following error:

[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.

The engine web console seems to be working fine after this - I can log in and click around - but there are no host CPU/storage resources available, as expected.  Judging from the logs shown below, the VDSM installation process appears to have been unable to contact the engine during installation.  Any ideas what is going wrong, and/or what I can do to debug this further so I can get past this issue?

Thanks,

-Matthew


Steps I took to install and diagnose the problem:

1. Installed system with our configuration management system.

2. Deleted pre-existing users occupying UID 36 (needed by the vdsm user) and UID 108 (needed by the ovirt user).
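
That UID check can be sketched as follows. The passwd snapshot here is a hypothetical sample for illustration; on the real host you would read /etc/passwd directly or use "getent passwd 36".

```shell
# Sketch: check whether the UIDs that the vdsm (36) and ovirt (108) users
# need are already taken. 'passwd_copy' is a hypothetical snapshot; on a
# real host, read /etc/passwd or run `getent passwd 36`.
passwd_copy=$(mktemp)
cat > "$passwd_copy" <<'EOF'
root:x:0:0:root:/root:/bin/bash
legacyuser:x:36:36::/home/legacyuser:/bin/bash
EOF
owner36=$(awk -F: '$3 == 36 { print $1 }' "$passwd_copy")
owner108=$(awk -F: '$3 == 108 { print $1 }' "$passwd_copy")
for pair in "36:$owner36" "108:$owner108"; do
  uid=${pair%%:*}; owner=${pair#*:}
  if [ -n "$owner" ]; then
    echo "UID $uid is taken by $owner"
  else
    echo "UID $uid is free"
  fi
done
rm -f "$passwd_copy"
```

With the sample snapshot above, this reports UID 36 as taken and UID 108 as free; any "taken" UID would have to be freed before installing the RPMs, as in step 2.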

3. Ensured that /etc/sudoers contained the line "#includedir /etc/sudoers.d" so that /etc/sudoers.d/50_vdsm will take effect.
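
That directive can be sanity-checked with a short sketch. It runs against a hypothetical copy of the file; on the real host you would point the grep at /etc/sudoers itself (and `visudo -c` additionally validates the syntax).

```shell
# Sketch: verify the includedir directive that sudo needs in order to pick
# up /etc/sudoers.d/50_vdsm. 'sudoers_copy' is a hypothetical copy; on a
# real host, check /etc/sudoers directly.
sudoers_copy=$(mktemp)
cat > "$sudoers_copy" <<'EOF'
root ALL=(ALL) ALL
#includedir /etc/sudoers.d
EOF
if grep -q '^#includedir /etc/sudoers.d' "$sudoers_copy"; then
  status="present"
else
  status="missing"
fi
echo "includedir directive: $status"
rm -f "$sudoers_copy"
```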

4. Made directories for isos and images:
# mkdir /state/partition1/images/; chmod 777 /state/partition1/images/
# mkdir /state/partition1/iso/; chmod 777 /state/partition1/iso/

5. Ensured SELinux was disabled and no firewall rules were installed.
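
A minimal sketch of the SELinux part of that check, run against a hypothetical copy of /etc/selinux/config (on the real host, `getenforce` and `iptables -L -n` confirm the runtime state as well):

```shell
# Sketch: read the configured SELinux mode from a hypothetical copy of
# /etc/selinux/config; on a real host, also run `getenforce` for the
# runtime state.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# This file controls the state of SELinux on the system.
SELINUX=disabled
SELINUXTYPE=targeted
EOF
mode=$(awk -F= '$1 == "SELINUX" { print $2 }' "$cfg")
echo "SELinux configured mode: $mode"
rm -f "$cfg"
```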

6. Installed RPMs:
# yum -y install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
# yum -y install ovirt-engine ovirt-engine-setup-plugin-allinone
# rpm -qa | grep ovirt-release36
ovirt-release36-002-2.noarch

7. Installed engine with an all-in-one configuration (Configure VDSM on this host? Yes):
# cat /root/ovirt-engine.ans 
# action=setup
[environment:default]
OVESETUP_DIALOG/confirmSettings=bool:True
OVESETUP_CONFIG/applicationMode=str:virt
OVESETUP_CONFIG/remoteEngineSetupStyle=none:None
OVESETUP_CONFIG/sanWipeAfterDelete=bool:False
OVESETUP_CONFIG/storageIsLocal=bool:False
OVESETUP_CONFIG/firewallManager=none:None
OVESETUP_CONFIG/remoteEngineHostRootPassword=none:None
OVESETUP_CONFIG/firewallChangesReview=none:None
OVESETUP_CONFIG/updateFirewall=bool:False
OVESETUP_CONFIG/remoteEngineHostSshPort=none:None
OVESETUP_CONFIG/fqdn=<...host.fqdn...>
OVESETUP_CONFIG/storageType=none:None
OSETUP_RPMDISTRO/requireRollback=none:None
OSETUP_RPMDISTRO/enableUpgrade=none:None
OVESETUP_DB/secured=bool:False
OVESETUP_DB/host=str:localhost
OVESETUP_DB/user=str:engine
OVESETUP_DB/dumper=str:pg_custom
OVESETUP_DB/database=str:engine
OVESETUP_DB/fixDbViolations=none:None
OVESETUP_DB/port=int:5432
OVESETUP_DB/filter=none:None
OVESETUP_DB/restoreJobs=int:2
OVESETUP_DB/securedHostValidation=bool:False
OVESETUP_ENGINE_CORE/enable=bool:True
OVESETUP_CORE/engineStop=none:None
OVESETUP_SYSTEM/memCheckEnabled=bool:True
OVESETUP_SYSTEM/nfsConfigEnabled=bool:True
OVESETUP_PKI/organization=str:<...dn...>
OVESETUP_PKI/renew=none:None
OVESETUP_CONFIG/isoDomainName=str:ISO_DOMAIN
OVESETUP_CONFIG/engineHeapMax=str:7975M
OVESETUP_CONFIG/adminPassword=str:<...password...>
OVESETUP_CONFIG/isoDomainACL=str:*(rw)
OVESETUP_CONFIG/isoDomainMountPoint=str:/state/partition1/iso
OVESETUP_CONFIG/engineDbBackupDir=str:/var/lib/ovirt-engine/backups
OVESETUP_CONFIG/engineHeapMin=str:7975M
OVESETUP_AIO/configure=bool:True
OVESETUP_AIO/storageDomainName=str:local_storage
OVESETUP_AIO/storageDomainDir=str:/state/partition1/images/
OVESETUP_PROVISIONING/postgresProvisioningEnabled=bool:True
OVESETUP_APACHE/configureRootRedirection=bool:True
OVESETUP_APACHE/configureSsl=bool:True
OVESETUP_VMCONSOLE_PROXY_CONFIG/vmconsoleProxyConfig=bool:True
OVESETUP_ENGINE_CONFIG/fqdn=str:<...fqdn...>
OVESETUP_CONFIG/websocketProxyConfig=bool:True

# engine-setup --config-append=/root/ovirt-engine.ans
...
[ INFO  ] Starting engine service
[ INFO  ] Restarting httpd
[ INFO  ] Waiting for VDSM host to become operational. This may take several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
[WARNING] Local storage domain not added because the VDSM host was not up. Please add it manually.
[ INFO  ] Restarting ovirt-vmconsole proxy service
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20151228113003-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Execution of setup completed successfully

8. Examined /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log and found this error message, which seems to indicate that the VDSM installation process was unable to contact the engine:
2015-12-28 11:29:29 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'start', 'httpd.service') stderr:


2015-12-28 11:29:29 DEBUG otopi.context context._executeMethod:142 Stage closeup METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi.Plugin._closeup
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:211 Connecting to the Engine
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:103 Waiting Engine API response
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py", line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in __init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 115, in request
    persistent_auth=self.__persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
    persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError: ^M
status: 503^M
reason: Service Unavailable^M
detail:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

2015-12-28 11:29:36 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py", line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in __init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 115, in request
    persistent_auth=self.__persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
    persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError: ^M
status: 404^M
reason: Not Found^M
detail:
...
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />


<link id="id-link-favicon" rel="shortcut icon" href="/ovirt-engine/theme-resource/favicon" type="image/x-icon" />

    <title>404 - Page not found</title>
...
</html>

2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:171 Database: 'None', Statement: '
                select version, option_value
                from vdc_options
                where option_name = %(name)s
            ', args: {'name': 'SupportedClusterLevels'}
2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:221 Result: [{'version': 'general', 'option_value': '3.0,3.1,3.2,3.3,3.4,3.5,3.6'}]
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:225 engine SupportedClusterLevels [3.0,3.1,3.2,3.3,3.4,3.5,3.6], PACKAGE_VERSION [3.6.1.3],
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._getSupportedClusterLevels:181 Attempting to load the dsaversion vdsm module
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:236 VDSM SupportedClusterLevels [['3.4', '3.5', '3.6']], VDSM VERSION [4.17.13-0.el7.centos],
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:259 Creating the local data center
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:269 Creating the local cluster into the local data center
2015-12-28 11:29:52 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:284 Adding the local host to the local cluster
2015-12-28 11:29:55 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:58 Waiting for VDSM host to become operational. This may take several minutes...
2015-12-28 11:29:55 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:56 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:58 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:59 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:00 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:01 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:02 ERROR otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:77 The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
2015-12-28 11:30:02 WARNING otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:306 Local storage domain not added because the VDSM host was not up. Please add it manually.
2015-12-28 11:30:02 DEBUG otopi.context context._executeMethod:142 Stage closeup METHOD otopi.plugins.ovirt_engine_setup.vmconsole_proxy_helper.system.Plugin._closeup
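
To pull just the failures out of a long setup log, something like the following works. The log excerpt below is a hypothetical sample; on the real host, point the grep at /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log.

```shell
# Sketch: extract ERROR/WARNING lines from an engine-setup log.
# 'log_sample' is a hypothetical excerpt; on the real host, use the actual
# log path under /var/log/ovirt-engine/setup/.
log_sample=$(mktemp)
cat > "$log_sample" <<'EOF'
2015-12-28 11:30:01 DEBUG ... VDSM host in installing state
2015-12-28 11:30:02 ERROR ... The VDSM host was found in a failed state.
2015-12-28 11:30:02 WARNING ... Local storage domain not added.
EOF
hits=$(grep -cE ' (ERROR|WARNING) ' "$log_sample")
grep -E ' (ERROR|WARNING) ' "$log_sample"
rm -f "$log_sample"
```

Against the sample above this finds the two failure lines, matching the ERROR and WARNING seen in step 8.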

9. Looked at the VDSM logs and examined the service status:
# ls -l /var/log/vdsm/
total 8
drwxr-xr-x 2 vdsm kvm     6 Dec  9 03:24 backup
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 connectivity.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 mom.log
-rw-r--r-- 1 root root 2958 Dec 28 11:30 supervdsm.log
-rw-r--r-- 1 root root 1811 Dec 28 11:30 upgrade.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 vdsm.log

# cat /var/log/vdsm/upgrade.log 
MainThread::DEBUG::2015-12-28 11:30:02,803::upgrade::90::upgrade::(apply_upgrade) Running upgrade upgrade-unified-persistence
MainThread::DEBUG::2015-12-28 11:30:02,806::libvirtconnection::160::root::(get) trying to connect libvirt
MainThread::DEBUG::2015-12-28 11:30:02,813::utils::669::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table main (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,826::utils::687::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2015-12-28 11:30:02,826::unified_persistence::46::root::(run) upgrade-unified-persistence upgrade persisting networks {} and bondings {}
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::utils::669::root::(execCmd) /usr/share/vdsm/vdsm-store-net-config unified (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,836::utils::687::root::(execCmd) SUCCESS: <err> = 'cp: cannot stat \xe2\x80\x98/var/run/vdsm/netconf\xe2\x80\x99: No such file or directory\n'; <rc> = 0
MainThread::DEBUG::2015-12-28 11:30:02,837::upgrade::51::upgrade::(_upgrade_seal) Upgrade upgrade-unified-persistence successfully performed

# cat /var/log/vdsm/supervdsm.log 
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::539::SuperVdsm.Server::(main) Making sure I'm root - SuperVdsm
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::548::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::551::SuperVdsm.Server::(main) Cleaning old socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::555::SuperVdsm.Server::(main) Setting up keep alive thread
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::561::SuperVdsm.Server::(main) Creating remote object manager
MainThread::DEBUG::2015-12-28 11:30:02,416::fileUtils::192::Storage.fileUtils::(chown) Changing owner for /var/run/vdsm/svdsm.sock, to (36:36)
MainThread::DEBUG::2015-12-28 11:30:02,416::supervdsmServer::572::SuperVdsm.Server::(main) Started serving super vdsm object
sourceRoute::DEBUG::2015-12-28 11:30:02,416::sourceroutethread::79::root::(_subscribeToInotifyLoop) sourceRouteThread.subscribeToInotifyLoop started
restore-net::DEBUG::2015-12-28 11:30:03,080::libvirtconnection::160::root::(get) trying to connect libvirt
restore-net::INFO::2015-12-28 11:30:03,188::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.1.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.0.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.3.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.2.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::385::root::(restore) starting network restoration.
restore-net::DEBUG::2015-12-28 11:30:03,189::vdsm-restore-net-config::183::root::(_remove_networks_in_running_config) Not cleaning running configuration since it is empty.
restore-net::INFO::2015-12-28 11:30:03,205::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28 11:30:03,206::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
restore-net::INFO::2015-12-28 11:30:03,206::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28 11:30:03,207::vdsm-restore-net-config::329::root::(_wait_for_for_all_devices_up) All devices are up.
restore-net::INFO::2015-12-28 11:30:03,214::netconfpersistence::71::root::(setBonding) Adding bond0({'nics': [], 'options': ''})
restore-net::INFO::2015-12-28 11:30:03,214::vdsm-restore-net-config::396::root::(restore) restoration completed successfully.

# systemctl status supervdsmd
● supervdsmd.service - Auxiliary vdsm service for running helper functions as root
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static; vendor preset: enabled)
   Active: active (running) since Mon 2015-12-28 11:30:02 EST; 1h 8min ago
 Main PID: 81535 (supervdsmServer)
   CGroup: /system.slice/supervdsmd.service
           └─81535 /usr/bin/python /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock

Dec 28 11:30:02 hostname systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
Dec 28 11:30:02 hostname systemd[1]: Starting Auxiliary vdsm service for running helper functions as root.