Progress: I managed to get a host set up, and then a VM set up on the host, but the VM won't start. I'm looking for some help getting the VM to start.

Details... I was able to work around the "VDSM host in a failed state" issue (I think) by doing a host reinstall from the engine web GUI, after specifying the root password for the host. That is, I changed to password authentication, instead of what was originally set to SSH PublicKey authentication. After the host reinstall completed, I was left with what appears to be an all-in-one setup, with the engine and a host running on the same server. Maybe this would have worked automatically, without the reinstall step, if I had oVirt SSH keys set up for root as expected?

Question: Any hints on how to debug this, in terms of host install SSH PublicKey auth not working? What user is attempting to SSH from the engine to the host, with what private key, to complete the setup? I see an ovirt-engine public key that was added to /root/.ssh/authorized_keys. Is this the public key that's expected to work with the automated all-in-one host setup? If so, where's the corresponding private key, so I might attempt to debug further?

After adding a virtual local disk to the new host and setting up a new VM through the web interface, I tried to start the VM, but got this permission denied error in engine.log:

2015-12-29 12:01:47,814 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-3) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM nycovirttest1 is down with error.
Exit message: internal error: process exited while connecting to monitor: 2015-12-29T17:01:45.424728Z qemu-kvm: -drive file=/rhev/data-center/62bbbc20-00c6-46b3-8ac9-2677a4da69e4/954ebee1-521b-454d-b54c-db9516f36cff/images/e87e1633-7afc-4835-931e-c444e1cf72f5/f6dc9f53-b373-4c96-9015-25b316dcbf0a,if=none,id=drive-virtio-disk0,format=raw,serial=e87e1633-7afc-4835-931e-c444e1cf72f5,cache=none,werror=stop,rerror=stop,aio=threads: Could not open '/rhev/data-center/62bbbc20-00c6-46b3-8ac9-2677a4da69e4/954ebee1-521b-454d-b54c-db9516f36cff/images/e87e1633-7afc-4835-931e-c444e1cf72f5/f6dc9f53-b373-4c96-9015-25b316dcbf0a': Permission denied

Question: How can I debug this VM startup permission denied error further? Is there a command line I can issue, similar to what the oVirt engine is attempting, that can give me more detail? Something I can strace?

The vdsm user can read the above-referenced image just fine, so I think it's something more than simple filesystem permissions:

$ id
uid=36(vdsm) gid=36(kvm) groups=36(kvm),107(qemu),179(sanlock)
$ file /rhev/data-center/62bbbc20-00c6-46b3-8ac9-2677a4da69e4/954ebee1-521b-454d-b54c-db9516f36cff/images/e87e1633-7afc-4835-931e-c444e1cf72f5/f6dc9f53-b373-4c96-9015-25b316dcbf0a
/rhev/data-center/62bbbc20-00c6-46b3-8ac9-2677a4da69e4/954ebee1-521b-454d-b54c-db9516f36cff/images/e87e1633-7afc-4835-931e-c444e1cf72f5/f6dc9f53-b373-4c96-9015-25b316dcbf0a: data

Thanks for your help,
-Matthew

On Mon, Dec 28, 2015 at 12:46 PM, Matthew Bohnsack <bohnsack@gmail.com> wrote:

Hello,

I am attempting to build a fairly basic proof-of-concept oVirt 3.6 machine, with the engine, host, and guests all on a single physical CentOS 7.2 box with local storage (to start), but am running into issues where engine-setup produces the following error:

[ ERROR ] The VDSM host was found in a failed state.
Please check engine and bootstrap installation logs.

The engine webconsole seems to be working fine after this - I can log in and click around - but there are no host CPU/storage resources available, as expected. From the logs shown below, there seems to be an issue with the vdsm installation process being unable to contact the engine host during installation. Any ideas what's going wrong, and/or what I can do to debug things further so I can move beyond this issue?
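For reference, here's the check I can run by hand to see whether the engine API is answering at all. The health servlet path is what I believe a stock install exposes, and the FQDN default below is just a placeholder - substitute whatever FQDN was given to engine-setup:

```shell
# Liveness probe for the engine itself, independent of VDSM.
# Assumption: stock oVirt health servlet path; replace FQDN with the
# engine FQDN used during engine-setup.
FQDN=${FQDN:-localhost}
if out=$(curl -sk --max-time 5 "https://${FQDN}/ovirt-engine/services/health"); then
    echo "engine answered: ${out}"
else
    echo "engine not reachable at https://${FQDN}/"
fi
```

If I read the setup logs right, a 503 here would mean httpd is up but the engine service hasn't finished starting, and a 404 would mean the ovirt-engine webapp isn't deployed yet - which looks like what the setup plugin ran into while polling.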
Thanks,
-Matthew

Steps I took to install and diagnose the problem:

1. Installed the system with our configuration management system.

2. Deleted users using UID=36 (needed by the vdsm user) and UID=108 (needed by the ovirt user).

3. Ensured that /etc/sudoers contained the line "#includedir /etc/sudoers.d" so that /etc/sudoers.d/50_vdsm will take effect.

4. Made directories for ISOs and images:

# mkdir /state/partition1/images/; chmod 777 /state/partition1/images/
# mkdir /state/partition1/iso/; chmod 777 /state/partition1/iso/

5. Ensured SELinux was disabled and no firewall rules were installed.

6. Installed RPMs:

# yum -y install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
# yum -y install ovirt-engine ovirt-engine-setup-plugin-allinone
# rpm -qa | grep ovirt-release36
ovirt-release36-002-2.noarch

7. Installed the engine with an all-in-one configuration (Configure VDSM on this host? Yes):

# cat /root/ovirt-engine.ans
# action=setup
[environment:default]
OVESETUP_DIALOG/confirmSettings=bool:True
OVESETUP_CONFIG/applicationMode=str:virt
OVESETUP_CONFIG/remoteEngineSetupStyle=none:None
OVESETUP_CONFIG/sanWipeAfterDelete=bool:False
OVESETUP_CONFIG/storageIsLocal=bool:False
OVESETUP_CONFIG/firewallManager=none:None
OVESETUP_CONFIG/remoteEngineHostRootPassword=none:None
OVESETUP_CONFIG/firewallChangesReview=none:None
OVESETUP_CONFIG/updateFirewall=bool:False
OVESETUP_CONFIG/remoteEngineHostSshPort=none:None
OVESETUP_CONFIG/fqdn=<...host.fqdn...>
OVESETUP_CONFIG/storageType=none:None
OSETUP_RPMDISTRO/requireRollback=none:None
OSETUP_RPMDISTRO/enableUpgrade=none:None
OVESETUP_DB/secured=bool:False
OVESETUP_DB/host=str:localhost
OVESETUP_DB/user=str:engine
OVESETUP_DB/dumper=str:pg_custom
OVESETUP_DB/database=str:engine
OVESETUP_DB/fixDbViolations=none:None
OVESETUP_DB/port=int:5432
OVESETUP_DB/filter=none:None
OVESETUP_DB/restoreJobs=int:2
OVESETUP_DB/securedHostValidation=bool:False
OVESETUP_ENGINE_CORE/enable=bool:True
OVESETUP_CORE/engineStop=none:None
OVESETUP_SYSTEM/memCheckEnabled=bool:True
OVESETUP_SYSTEM/nfsConfigEnabled=bool:True
OVESETUP_PKI/organization=str:<...dn...>
OVESETUP_PKI/renew=none:None
OVESETUP_CONFIG/isoDomainName=str:ISO_DOMAIN
OVESETUP_CONFIG/engineHeapMax=str:7975M
OVESETUP_CONFIG/adminPassword=str:<...password...>
OVESETUP_CONFIG/isoDomainACL=str:*(rw)
OVESETUP_CONFIG/isoDomainMountPoint=str:/state/partition1/iso
OVESETUP_CONFIG/engineDbBackupDir=str:/var/lib/ovirt-engine/backups
OVESETUP_CONFIG/engineHeapMin=str:7975M
OVESETUP_AIO/configure=bool:True
OVESETUP_AIO/storageDomainName=str:local_storage
OVESETUP_AIO/storageDomainDir=str:/state/partition1/images/
OVESETUP_PROVISIONING/postgresProvisioningEnabled=bool:True
OVESETUP_APACHE/configureRootRedirection=bool:True
OVESETUP_APACHE/configureSsl=bool:True
OVESETUP_VMCONSOLE_PROXY_CONFIG/vmconsoleProxyConfig=bool:True
OVESETUP_ENGINE_CONFIG/fqdn=str:<...fqdn...>
OVESETUP_CONFIG/websocketProxyConfig=bool:True

# engine-setup --config-append=/root/ovirt-engine.ans
...
[ INFO  ] Starting engine service
[ INFO  ] Restarting httpd
[ INFO  ] Waiting for VDSM host to become operational. This may take several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
[WARNING] Local storage domain not added because the VDSM host was not up. Please add it manually.
[ INFO  ] Restarting ovirt-vmconsole proxy service
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20151228113003-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Execution of setup completed successfully

8. Examined /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log and found this error message, which seems to indicate that the vdsm installation process was unable to contact the engine:

2015-12-28 11:29:29 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'start', 'httpd.service') stderr:
2015-12-28 11:29:29 DEBUG otopi.context context._executeMethod:142 Stage closeup METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi.Plugin._closeup
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:211 Connecting to the Engine
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:103 Waiting Engine API response
2015-12-28 11:29:29 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py", line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in __init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 115, in request
    persistent_auth=self.__persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
    persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError:
status: 503
reason: Service Unavailable
detail:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems.
Please try again later.</p>
</body></html>

2015-12-28 11:29:36 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py", line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in __init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 115, in request
    persistent_auth=self.__persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
    persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError:
status: 404
reason: Not Found
detail:
...
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<link id="id-link-favicon" rel="shortcut icon" href="/ovirt-engine/theme-resource/favicon" type="image/x-icon" />
<title>404 - Page not found</title>
...
</html>

2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:171 Database: 'None', Statement: 'select version, option_value from vdc_options where option_name = %(name)s', args: {'name': 'SupportedClusterLevels'}
2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:221 Result: [{'version': 'general', 'option_value': '3.0,3.1,3.2,3.3,3.4,3.5,3.6'}]
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:225 engine SupportedClusterLevels [3.0,3.1,3.2,3.3,3.4,3.5,3.6], PACKAGE_VERSION [3.6.1.3],
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._getSupportedClusterLevels:181 Attempting to load the dsaversion vdsm module
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:236 VDSM SupportedClusterLevels [['3.4', '3.5', '3.6']], VDSM VERSION [4.17.13-0.el7.centos],
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:259 Creating the local data center
2015-12-28 11:29:50 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:269 Creating the local cluster into the local data center
2015-12-28 11:29:52 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:284 Adding the local host to the local cluster
2015-12-28 11:29:55 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:58 Waiting for VDSM host to become operational. This may take several minutes...
2015-12-28 11:29:55 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:56 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:58 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:59 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:00 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:01 DEBUG otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:02 ERROR otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._waitVDSMHostUp:77 The VDSM host was found in a failed state.
Please check engine and bootstrap installation logs.
2015-12-28 11:30:02 WARNING otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi vdsmi._closeup:306 Local storage domain not added because the VDSM host was not up. Please add it manually.
2015-12-28 11:30:02 DEBUG otopi.context context._executeMethod:142 Stage closeup METHOD otopi.plugins.ovirt_engine_setup.vmconsole_proxy_helper.system.Plugin._closeup

9. Looked at the vdsm logs and examined service status:

# ls -l /var/log/vdsm/
total 8
drwxr-xr-x 2 vdsm kvm     6 Dec  9 03:24 backup
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 connectivity.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 mom.log
-rw-r--r-- 1 root root 2958 Dec 28 11:30 supervdsm.log
-rw-r--r-- 1 root root 1811 Dec 28 11:30 upgrade.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 vdsm.log

# cat /var/log/vdsm/upgrade.log
MainThread::DEBUG::2015-12-28 11:30:02,803::upgrade::90::upgrade::(apply_upgrade) Running upgrade upgrade-unified-persistence
MainThread::DEBUG::2015-12-28 11:30:02,806::libvirtconnection::160::root::(get) trying to connect libvirt
MainThread::DEBUG::2015-12-28 11:30:02,813::utils::669::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table main (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,826::utils::687::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2015-12-28 11:30:02,826::unified_persistence::46::root::(run) upgrade-unified-persistence upgrade persisting networks {} and bondings {}
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
MainThread::INFO::2015-12-28 11:30:02,827::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::utils::669::root::(execCmd) /usr/share/vdsm/vdsm-store-net-config unified (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,836::utils::687::root::(execCmd) SUCCESS: <err> = 'cp: cannot stat ‘/var/run/vdsm/netconf’: No such file or directory\n'; <rc> = 0
MainThread::DEBUG::2015-12-28 11:30:02,837::upgrade::51::upgrade::(_upgrade_seal) Upgrade upgrade-unified-persistence successfully performed

# cat /var/log/vdsm/supervdsm.log
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::539::SuperVdsm.Server::(main) Making sure I'm root - SuperVdsm
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::548::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::551::SuperVdsm.Server::(main) Cleaning old socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::555::SuperVdsm.Server::(main) Setting up keep alive thread
MainThread::DEBUG::2015-12-28 11:30:02,415::supervdsmServer::561::SuperVdsm.Server::(main) Creating remote object manager
MainThread::DEBUG::2015-12-28 11:30:02,416::fileUtils::192::Storage.fileUtils::(chown) Changing owner for /var/run/vdsm/svdsm.sock, to (36:36)
MainThread::DEBUG::2015-12-28 11:30:02,416::supervdsmServer::572::SuperVdsm.Server::(main) Started serving super vdsm object
sourceRoute::DEBUG::2015-12-28 11:30:02,416::sourceroutethread::79::root::(_subscribeToInotifyLoop) sourceRouteThread.subscribeToInotifyLoop started
restore-net::DEBUG::2015-12-28 11:30:03,080::libvirtconnection::160::root::(get) trying to connect libvirt
restore-net::INFO::2015-12-28 11:30:03,188::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.1.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.0.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.3.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:01:00.2.
restore-net::INFO::2015-12-28 11:30:03,189::vdsm-restore-net-config::385::root::(restore) starting network restoration.
restore-net::DEBUG::2015-12-28 11:30:03,189::vdsm-restore-net-config::183::root::(_remove_networks_in_running_config) Not cleaning running configuration since it is empty.
restore-net::INFO::2015-12-28 11:30:03,205::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28 11:30:03,206::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
restore-net::INFO::2015-12-28 11:30:03,206::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28 11:30:03,207::vdsm-restore-net-config::329::root::(_wait_for_for_all_devices_up) All devices are up.
restore-net::INFO::2015-12-28 11:30:03,214::netconfpersistence::71::root::(setBonding) Adding bond0({'nics': [], 'options': ''})
restore-net::INFO::2015-12-28 11:30:03,214::vdsm-restore-net-config::396::root::(restore) restoration completed successfully.

# systemctl status supervdsmd
● supervdsmd.service - Auxiliary vdsm service for running helper functions as root
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static; vendor preset: enabled)
   Active: active (running) since Mon 2015-12-28 11:30:02 EST; 1h 8min ago
 Main PID: 81535 (supervdsmServer)
   CGroup: /system.slice/supervdsmd.service
           └─81535 /usr/bin/python /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock

Dec 28 11:30:02 hostname systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
Dec 28 11:30:02 hostname systemd[1]: Starting Auxiliary vdsm service for running helper functions as root.
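A follow-up note on the SSH PublicKey question earlier in the thread: my understanding is that on a stock 3.6 engine, host-deploy connects to the host as root using the engine's own keypair from the engine PKI directory. The key path below is an assumption from a default install, so it should be verified before relying on it. A sketch of the manual test I have in mind:

```shell
# Assumed default location of the engine's host-deploy private key; the
# public half should match the ovirt-engine entry that appeared in
# /root/.ssh/authorized_keys on the host.
ENGINE_KEY=${ENGINE_KEY:-/etc/pki/ovirt-engine/keys/engine_id_rsa}
HOST=${HOST:-localhost}
if [ -r "$ENGINE_KEY" ]; then
    # -v prints which identity is offered and whether the host accepts it,
    # which is usually enough to see why PublicKey auth failed.
    ssh -v -o BatchMode=yes -i "$ENGINE_KEY" "root@${HOST}" true
else
    echo "no readable key at ${ENGINE_KEY}; check the engine's PKI directory"
fi
```

If the key exists but the connection still fails, sshd's log on the host (journalctl -u sshd) usually says why the offered key was rejected.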
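And on the VM-start "Permission denied": libvirt launches qemu-kvm as the qemu user, not as vdsm, so the id/file checks above as vdsm don't cover the open that actually fails; every directory on the path also needs execute permission for qemu's user/group. A rough sketch of checks I could run (the image path is copied verbatim from engine.log above; sudo and qemu-img availability are assumptions):

```shell
# Image path taken verbatim from the engine.log excerpt above.
IMG=/rhev/data-center/62bbbc20-00c6-46b3-8ac9-2677a4da69e4/954ebee1-521b-454d-b54c-db9516f36cff/images/e87e1633-7afc-4835-931e-c444e1cf72f5/f6dc9f53-b373-4c96-9015-25b316dcbf0a

# qemu, not vdsm, opens the disk -- try the same open as that user.
sudo -n -u qemu qemu-img info "$IMG" 2>&1 | head -5 || true

# Every path component needs execute permission for qemu to traverse.
d=$(dirname "$IMG")
while [ "$d" != "/" ]; do
    ls -ld "$d" || true
    d=$(dirname "$d")
done

# If plain permissions look fine, double-check the security mode anyway.
getenforce 2>/dev/null || true
```

If that reproduces the error, strace can show the exact failing syscall, e.g. strace -f -e trace=open,openat against the qemu-img run above; the qemu user/group that libvirt uses is set in /etc/libvirt/qemu.conf, which is worth comparing against the image's owner (vdsm:kvm) and mode.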