[ovirt-users] ovirt 3.6 centos7.2 all-in-one install: The VDSM host was found in a failed state.

Matthew Bohnsack bohnsack at gmail.com
Mon Dec 28 17:46:09 UTC 2015


Hello,

I am trying to build a fairly basic proof-of-concept oVirt 3.6 machine
with the engine, host and guests all on a single physical CentOS 7.2 box
with local storage (to start), but engine-setup reports the following
error:

[ ERROR ] The VDSM host was found in a failed state. Please check engine
and bootstrap installation logs.


The engine web console seems to work fine after this (I can log in and
click around), but, as expected given the error, no host CPU or storage
resources are available.  From the logs shown below, it looks as though the
vdsm installation process was unable to contact the engine during
installation.  Any ideas what's going wrong, or what I can do to debug
this further?

Thanks,

-Matthew


Steps I took to install and diagnose the problem:

1. Installed system with our configuration management system.

2. Deleted the existing users that had UID 36 (needed by the vdsm user) and
UID 108 (needed by the ovirt user).

3. Ensured that /etc/sudoers contained the line "#includedir
/etc/sudoers.d" so that /etc/sudoers.d/50_vdsm will take effect.
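
For what it's worth, step 3 can be double-checked with something like the
sketch below. The function name is made up for this message (it is not part
of sudo or oVirt), and it only looks for the classic "#includedir" spelling,
where the leading '#' is part of the directive rather than a comment marker.

```shell
# Hypothetical helper: succeeds if the given sudoers file (default
# /etc/sudoers) contains an active "#includedir /etc/sudoers.d" line.
# A line such as "# includedir ..." (with a space) is a plain comment
# and is deliberately not matched.
has_sudoers_includedir() {
    grep -Eq '^[[:space:]]*#includedir[[:space:]]+/etc/sudoers\.d/?[[:space:]]*$' \
        "${1:-/etc/sudoers}"
}
```

Calling has_sudoers_includedir with no argument checks /etc/sudoers itself.
In the same spirit, "getent passwd 36 108" shows which accounts, if any,
currently own the two UIDs from step 2.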

4. Made directories for ISOs and images:

# mkdir /state/partition1/images/; chmod 777 /state/partition1/images/
# mkdir /state/partition1/iso/; chmod 777 /state/partition1/iso/


5. Ensured SELinux was disabled and no firewall rules were installed.
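
A rough sketch of how the SELinux half of this step can be verified. The
helper name is hypothetical, and it assumes the usual /etc/selinux/config
layout with a single uncommented SELINUX= line.

```shell
# Hypothetical helper: print the persistent SELinux mode configured in a
# file using the /etc/selinux/config format. Commented lines and the
# separate SELINUXTYPE= setting are ignored; the last SELINUX= line wins.
selinux_config_mode() {
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([a-z]*\).*/\1/p' \
        "${1:-/etc/selinux/config}" | tail -n 1
}
```

This only covers the setting applied at boot. "getenforce" reports the
current runtime mode, and "iptables -S" lists the active rules (just the
three default -P policy lines when nothing is installed).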

6. Installed RPMs:

# yum -y install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm

# yum -y install ovirt-engine ovirt-engine-setup-plugin-allinone
# rpm -qa | grep ovirt-release36
ovirt-release36-002-2.noarch


7. Installed engine with an all-in-one configuration (Configure VDSM on
this host? Yes):

# cat /root/ovirt-engine.ans
# action=setup
[environment:default]
OVESETUP_DIALOG/confirmSettings=bool:True
OVESETUP_CONFIG/applicationMode=str:virt
OVESETUP_CONFIG/remoteEngineSetupStyle=none:None
OVESETUP_CONFIG/sanWipeAfterDelete=bool:False
OVESETUP_CONFIG/storageIsLocal=bool:False
OVESETUP_CONFIG/firewallManager=none:None
OVESETUP_CONFIG/remoteEngineHostRootPassword=none:None
OVESETUP_CONFIG/firewallChangesReview=none:None
OVESETUP_CONFIG/updateFirewall=bool:False
OVESETUP_CONFIG/remoteEngineHostSshPort=none:None
OVESETUP_CONFIG/fqdn=<...host.fqdn...>
OVESETUP_CONFIG/storageType=none:None
OSETUP_RPMDISTRO/requireRollback=none:None
OSETUP_RPMDISTRO/enableUpgrade=none:None
OVESETUP_DB/secured=bool:False
OVESETUP_DB/host=str:localhost
OVESETUP_DB/user=str:engine
OVESETUP_DB/dumper=str:pg_custom
OVESETUP_DB/database=str:engine
OVESETUP_DB/fixDbViolations=none:None
OVESETUP_DB/port=int:5432
OVESETUP_DB/filter=none:None
OVESETUP_DB/restoreJobs=int:2
OVESETUP_DB/securedHostValidation=bool:False
OVESETUP_ENGINE_CORE/enable=bool:True
OVESETUP_CORE/engineStop=none:None
OVESETUP_SYSTEM/memCheckEnabled=bool:True
OVESETUP_SYSTEM/nfsConfigEnabled=bool:True
OVESETUP_PKI/organization=str:<...dn...>
OVESETUP_PKI/renew=none:None
OVESETUP_CONFIG/isoDomainName=str:ISO_DOMAIN
OVESETUP_CONFIG/engineHeapMax=str:7975M
OVESETUP_CONFIG/adminPassword=str:<...password...>
OVESETUP_CONFIG/isoDomainACL=str:*(rw)
OVESETUP_CONFIG/isoDomainMountPoint=str:/state/partition1/iso
OVESETUP_CONFIG/engineDbBackupDir=str:/var/lib/ovirt-engine/backups
OVESETUP_CONFIG/engineHeapMin=str:7975M
OVESETUP_AIO/configure=bool:True
OVESETUP_AIO/storageDomainName=str:local_storage
OVESETUP_AIO/storageDomainDir=str:/state/partition1/images/
OVESETUP_PROVISIONING/postgresProvisioningEnabled=bool:True
OVESETUP_APACHE/configureRootRedirection=bool:True
OVESETUP_APACHE/configureSsl=bool:True
OVESETUP_VMCONSOLE_PROXY_CONFIG/vmconsoleProxyConfig=bool:True
OVESETUP_ENGINE_CONFIG/fqdn=str:<...fqdn...>
OVESETUP_CONFIG/websocketProxyConfig=bool:True

# engine-setup --config-append=/root/ovirt-engine.ans

...
[ INFO  ] Starting engine service
[ INFO  ] Restarting httpd
[ INFO  ] Waiting for VDSM host to become operational. This may take
several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please check engine
and bootstrap installation logs.
[WARNING] Local storage domain not added because the VDSM host was not up.
Please add it manually.
[ INFO  ] Restarting ovirt-vmconsole proxy service
[ INFO  ] Stage: Clean up
          Log file is located at
/var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log
[ INFO  ] Generating answer file
'/var/lib/ovirt-engine/setup/answers/20151228113003-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Execution of setup completed successfully

8. Examined /var/log/ovirt-engine/setup/ovirt-engine-setup-20151228112813-iksxhe.log
and found these errors, which seem to indicate that the vdsm installation
process was unable to contact the engine:

2015-12-28 11:29:29 DEBUG otopi.plugins.otopi.services.systemd
plugin.execute:941 execute-output: ('/bin/systemctl', 'start',
'httpd.service') stderr:


2015-12-28 11:29:29 DEBUG otopi.context context._executeMethod:142 Stage
closeup METHOD
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi.Plugin._closeup
2015-12-28 11:29:29 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:211 Connecting to the Engine
2015-12-28 11:29:29 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitEngineUp:103 Waiting Engine API response
2015-12-28 11:29:29 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File
"/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py",
line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in
__init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py",
line 115, in request
    persistent_auth=self.__persistent_auth
  File
"/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
line 79, in do_request
    persistent_auth)
  File
"/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError: ^M
status: 503^M
reason: Service Unavailable^M
detail:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

2015-12-28 11:29:36 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitEngineUp:133 Cannot connect to engine
Traceback (most recent call last):
  File
"/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/all-in-one/vdsmi.py",
line 127, in _waitEngineUp
    insecure=True,
  File "/usr/lib/python2.7/site-packages/ovirtsdk/api.py", line 191, in
__init__
    url=''
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py",
line 115, in request
    persistent_auth=self.__persistent_auth
  File
"/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
line 79, in do_request
    persistent_auth)
  File
"/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
line 155, in __do_request
    raise errors.RequestError(response_code, response_reason, response_body)
RequestError: ^M
status: 404^M
reason: Not Found^M
detail:
...
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />


<link id="id-link-favicon" rel="shortcut icon"
href="/ovirt-engine/theme-resource/favicon" type="image/x-icon" />

    <title>404 - Page not found</title>
...
</html>

2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database
database.execute:171 Database: 'None', Statement: '
                select version, option_value
                from vdc_options
                where option_name = %(name)s
            ', args: {'name': 'SupportedClusterLevels'}
2015-12-28 11:29:50 DEBUG otopi.ovirt_engine_setup.engine_common.database
database.execute:221 Result: [{'version': 'general', 'option_value':
'3.0,3.1,3.2,3.3,3.4,3.5,3.6'}]
2015-12-28 11:29:50 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:225 engine SupportedClusterLevels
[3.0,3.1,3.2,3.3,3.4,3.5,3.6], PACKAGE_VERSION [3.6.1.3],
2015-12-28 11:29:50 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._getSupportedClusterLevels:181 Attempting to load the dsaversion vdsm
module
2015-12-28 11:29:50 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:236 VDSM SupportedClusterLevels [['3.4', '3.5', '3.6']],
VDSM VERSION [4.17.13-0.el7.centos],
2015-12-28 11:29:50 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:259 Creating the local data center
2015-12-28 11:29:50 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:269 Creating the local cluster into the local data center
2015-12-28 11:29:52 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:284 Adding the local host to the local cluster
2015-12-28 11:29:55 INFO
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:58 Waiting for VDSM host to become operational. This
may take several minutes...
2015-12-28 11:29:55 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:56 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:58 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:29:59 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:00 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:01 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:87 VDSM host in installing state
2015-12-28 11:30:02 ERROR
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._waitVDSMHostUp:77 The VDSM host was found in a failed state. Please
check engine and bootstrap installation logs.
2015-12-28 11:30:02 WARNING
otopi.plugins.ovirt_engine_setup.ovirt_engine.all-in-one.vdsmi
vdsmi._closeup:306 Local storage domain not added because the VDSM host was
not up. Please add it manually.
2015-12-28 11:30:02 DEBUG otopi.context context._executeMethod:142 Stage
closeup METHOD
otopi.plugins.ovirt_engine_setup.vmconsole_proxy_helper.system.Plugin._closeup
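
To see at a glance which HTTP errors the setup log recorded while waiting
for the engine API, something like the following sketch may help. The
function name is made up, and it assumes each RequestError is logged with
"status:" and "reason:" lines as in the excerpt above (the ^M characters
there are trailing carriage returns).

```shell
# Hypothetical helper: print one "<status> <reason>" line per failed engine
# API request recorded in an engine-setup log, in the order they occurred.
engine_api_errors() {
    # Strip carriage returns first so the fields compare and print cleanly.
    tr -d '\r' < "$1" | awk '
        /^status: / { status = $2 }
        /^reason: / { sub(/^reason: /, ""); print status, $0 }
    '
}
```

For the excerpt above this would list the 503 followed by the 404. Note
that the same log then shows the plugin going on to create the local data
center and cluster at 11:29:50.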

9. Looked at the vdsm logs and examined service status:

# ls -l /var/log/vdsm/
total 8
drwxr-xr-x 2 vdsm kvm     6 Dec  9 03:24 backup
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 connectivity.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 mom.log
-rw-r--r-- 1 root root 2958 Dec 28 11:30 supervdsm.log
-rw-r--r-- 1 root root 1811 Dec 28 11:30 upgrade.log
-rw-r--r-- 1 vdsm kvm     0 Dec 28 11:13 vdsm.log


# cat /var/log/vdsm/upgrade.log
MainThread::DEBUG::2015-12-28
11:30:02,803::upgrade::90::upgrade::(apply_upgrade) Running upgrade
upgrade-unified-persistence
MainThread::DEBUG::2015-12-28
11:30:02,806::libvirtconnection::160::root::(get) trying to connect libvirt
MainThread::DEBUG::2015-12-28 11:30:02,813::utils::669::root::(execCmd)
/sbin/ip route show to 0.0.0.0/0 table main (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,826::utils::687::root::(execCmd)
SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2015-12-28
11:30:02,826::unified_persistence::46::root::(run)
upgrade-unified-persistence upgrade persisting networks {} and bondings {}
MainThread::INFO::2015-12-28
11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing
/var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28
11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent
config to clear.
MainThread::INFO::2015-12-28
11:30:02,827::netconfpersistence::179::root::(_clearDisk) Clearing
/var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28
11:30:02,827::netconfpersistence::187::root::(_clearDisk) No existent
config to clear.
MainThread::INFO::2015-12-28
11:30:02,827::netconfpersistence::129::root::(save) Saved new config
RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and
/var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2015-12-28 11:30:02,827::utils::669::root::(execCmd)
/usr/share/vdsm/vdsm-store-net-config unified (cwd None)
MainThread::DEBUG::2015-12-28 11:30:02,836::utils::687::root::(execCmd)
SUCCESS: <err> = 'cp: cannot stat
\xe2\x80\x98/var/run/vdsm/netconf\xe2\x80\x99: No such file or
directory\n'; <rc> = 0
MainThread::DEBUG::2015-12-28
11:30:02,837::upgrade::51::upgrade::(_upgrade_seal) Upgrade
upgrade-unified-persistence successfully performed


# cat /var/log/vdsm/supervdsm.log
MainThread::DEBUG::2015-12-28
11:30:02,415::supervdsmServer::539::SuperVdsm.Server::(main) Making sure
I'm root - SuperVdsm
MainThread::DEBUG::2015-12-28
11:30:02,415::supervdsmServer::548::SuperVdsm.Server::(main) Parsing cmd
args
MainThread::DEBUG::2015-12-28
11:30:02,415::supervdsmServer::551::SuperVdsm.Server::(main) Cleaning old
socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2015-12-28
11:30:02,415::supervdsmServer::555::SuperVdsm.Server::(main) Setting up
keep alive thread
MainThread::DEBUG::2015-12-28
11:30:02,415::supervdsmServer::561::SuperVdsm.Server::(main) Creating
remote object manager
MainThread::DEBUG::2015-12-28
11:30:02,416::fileUtils::192::Storage.fileUtils::(chown) Changing owner for
/var/run/vdsm/svdsm.sock, to (36:36)
MainThread::DEBUG::2015-12-28
11:30:02,416::supervdsmServer::572::SuperVdsm.Server::(main) Started
serving super vdsm object
sourceRoute::DEBUG::2015-12-28
11:30:02,416::sourceroutethread::79::root::(_subscribeToInotifyLoop)
sourceRouteThread.subscribeToInotifyLoop started
restore-net::DEBUG::2015-12-28
11:30:03,080::libvirtconnection::160::root::(get) trying to connect libvirt
restore-net::INFO::2015-12-28
11:30:03,188::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs)
SRIOV network device which is not persisted found at: 0000:01:00.1.
restore-net::INFO::2015-12-28
11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs)
SRIOV network device which is not persisted found at: 0000:01:00.0.
restore-net::INFO::2015-12-28
11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs)
SRIOV network device which is not persisted found at: 0000:01:00.3.
restore-net::INFO::2015-12-28
11:30:03,189::vdsm-restore-net-config::86::root::(_restore_sriov_numvfs)
SRIOV network device which is not persisted found at: 0000:01:00.2.
restore-net::INFO::2015-12-28
11:30:03,189::vdsm-restore-net-config::385::root::(restore) starting
network restoration.
restore-net::DEBUG::2015-12-28
11:30:03,189::vdsm-restore-net-config::183::root::(_remove_networks_in_running_config)
Not cleaning running configuration since it is empty.
restore-net::INFO::2015-12-28
11:30:03,205::netconfpersistence::179::root::(_clearDisk) Clearing
/var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28
11:30:03,206::netconfpersistence::187::root::(_clearDisk) No existent
config to clear.
restore-net::INFO::2015-12-28
11:30:03,206::netconfpersistence::129::root::(save) Saved new config
RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and
/var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-28
11:30:03,207::vdsm-restore-net-config::329::root::(_wait_for_for_all_devices_up)
All devices are up.
restore-net::INFO::2015-12-28
11:30:03,214::netconfpersistence::71::root::(setBonding) Adding
bond0({'nics': [], 'options': ''})
restore-net::INFO::2015-12-28
11:30:03,214::vdsm-restore-net-config::396::root::(restore) restoration
completed successfully.

# systemctl status supervdsmd
● supervdsmd.service - Auxiliary vdsm service for running helper functions
as root
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static;
vendor preset: enabled)
   Active: active (running) since Mon 2015-12-28 11:30:02 EST; 1h 8min ago
 Main PID: 81535 (supervdsmServer)
   CGroup: /system.slice/supervdsmd.service
           └─81535 /usr/bin/python /usr/share/vdsm/supervdsmServer
--sockfile /var/run/vdsm/svdsm.sock

Dec 28 11:30:02 hostname systemd[1]: Started Auxiliary vdsm service for
running helper functions as root.
Dec 28 11:30:02 hostname systemd[1]: Starting Auxiliary vdsm service for
running helper functions as root.
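
The error message also points at the engine and bootstrap installation
logs. A small sketch for locating the newest one follows. The helper name
is made up, and the two paths in the comments are only my assumption of
where a default oVirt 3.6 install keeps those logs.

```shell
# Hypothetical helper: print the name of the most recently modified entry
# in a directory (prints nothing if the directory is missing or empty).
newest_log() {
    ls -t "$1" 2>/dev/null | head -n 1
}

# Assumed default locations, not confirmed on this box:
#   newest_log /var/log/ovirt-engine/host-deploy
#   tail -n 100 /var/log/ovirt-engine/engine.log
```

Since vdsm.log in the listing above is still 0 bytes, "systemctl status
vdsmd" and "journalctl -u vdsmd" may also be worth a look alongside the
supervdsmd status.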