[Users] test day: using VM as host for vdsm

Ryan Harper ryanh at us.ibm.com
Wed Jan 18 17:02:54 UTC 2012


* Rami Vaknin <rvaknin at redhat.com> [2012-01-18 10:49]:
> On 01/18/2012 06:39 PM, Ryan Harper wrote:
> >* Ayal Baron<abaron at redhat.com>  [2012-01-18 10:35]:
> >>
> >>----- Original Message -----
> >>>* Haim Ateya<hateya at redhat.com>  [2012-01-18 10:15]:
> >>>>On Wed 18 Jan 2012 06:09:46 PM IST, Ryan Harper wrote:
> >>>>>* Haim Ateya<hateya at redhat.com>   [2012-01-18 08:02]:
> >>>>>>On Wed 18 Jan 2012 03:48:08 PM IST, Ryan Harper wrote:
> >>>>>>>* Haim Ateya<hateya at redhat.com>    [2012-01-18 07:13]:
> >>>>>>>>On Wed 18 Jan 2012 02:59:01 PM IST, Ryan Harper wrote:
> >>>>>>>>>I've created some f16 VMs that contain both ovirt-engine and a
> >>>>>>>>>few
> >>>>>>>>>to run vdsm as nodes.  When I add in the VM host into the
> >>>>>>>>>engine and it
> >>>>>>>>>attempts to install vdsm (even though I've already installed
> >>>>>>>>>vdsm) the
> >>>>>>>>>install fails because the vdsm install script is checking to
> >>>>>>>>>see if the
> >>>>>>>>>host has virt capabilities; since I'm not running nested KVM,
> >>>>>>>>>this
> >>>>>>>>>fails.  Is there a way to work around this and enable a VM to
> >>>>>>>>>be a host
> >>>>>>>>>in oVirt?  I had heard in the past there was a way to create
> >>>>>>>>>fake VMs
> >>>>>>>>>when attempting to do ovirt-engine stress testing, wondering
> >>>>>>>>>if that
> >>>>>>>>>might be of help here.
> >>>>>>>>>
> >>>>>>>>>Also, are there vdsm rpms built for RHEL6.x available?
> >>>>>>>>>
> >>>>>>>>>Thanks!
> >>>>>>>>>
> >>>>>>>>Hi Ryan,
> >>>>>>>>
> >>>>>>>>- login to your ovirt-engine machine
> >>>>>>>>- edit
> >>>>>>>>/usr/share/ovirt-engine/engine.ear/components.war/vds/vds_bootstrap.py
> >>>>>>>>- comment out the following:
> >>>>>>>>
> >>>>>>>>    836     if not oDeploy.virtExplorer(random_num):
> >>>>>>>>    837         logging.error('virtExplorer test failed')
> >>>>>>>>    838         return False
> >>>>>>>>- reinstall host
> >>>>>>>So I'm getting further, but now the bootstrap.log has more
> >>>>>>>errors below.
> >>>>>>>If I follow the test day instructions, it indicates to install
> >>>>>>>vdsm and
> >>>>>>>includes instructions, but it's clear that ovirt-engine is
> >>>>>>>configured by
> >>>>>>>default to push out vdsm and install it.  If I've already
> >>>>>>>configured and
> >>>>>>>installed vdsm on the node is there any way to not attempt to
> >>>>>>>bootstrap
> >>>>>>>vdsm
> >>>>>>>at all and just attempt to have it connect?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG<BSTRAP component='VDS PACKAGES'
> >>>>>>>status='OK' result='qemu-kvm-tools'
> >>>>>>>message='qemu-kvm-tools-0.15.1-3.fc16.x86_64 '/>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    Basic configuration found,
> >>>>>>>skipping
> >>>>>>>this step
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG<BSTRAP component='CreateConf'
> >>>>>>>status='OK'
> >>>>>>>message='Basic configuration found, skipping this step'/>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG<BSTRAP
> >>>>>>>component='CheckLocalHostname'
> >>>>>>>status='OK' message='Local hostname is correct.'/>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    Bridge ovirtmgmt not found,
> >>>>>>>need to
> >>>>>>>create it.
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    getAddress Entry.
> >>>>>>>url=http://ichigo-dom223.phx.austin.ibm.com:8080/Components/vds/
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    getAddress return.
> >>>>>>>address=ichigo-dom223.phx.austin.ibm.com port=8080
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    makeBridge begin.
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    _getMGTIface: read host name:
> >>>>>>>ichigo-dom223.phx.austin.ibm.com
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    _getMGTIface: using host name
> >>>>>>>ichigo-dom223.phx.austin.ibm.com strIP= 192.168.68.223
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    _getMGTIface
> >>>>>>>IP=192.168.68.223
> >>>>>>>strIface=engine
> >>>>>>>Wed, 18 Jan 2012 08:35:37 ERROR    makeBridge found existing
> >>>>>>>bridge
> >>>>>>>named:
> >>>>>>>engine
> >>>>>>>Wed, 18 Jan 2012 08:35:37 ERROR    makeBridge errored:  out=
> >>>>>>>err=None
> >>>>>>>ret=None
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    makeBridge return.
> >>>>>>>Wed, 18 Jan 2012 08:35:37 ERROR    addNetwork error trying to
> >>>>>>>add
> >>>>>>>management bridge
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG<BSTRAP component='SetNetworking'
> >>>>>>>status='FAIL' message='addNetwork error trying to add management
> >>>>>>>bridge'/>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    getAddress Entry.
> >>>>>>>url=http://ichigo-dom223.phx.austin.ibm.com:8080/Components/vds/
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    getAddress return.
> >>>>>>>address=ichigo-dom223.phx.austin.ibm.com port=8080
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    getRemoteFile start. IP =
> >>>>>>>ichigo-dom223.phx.austin.ibm.com port = 8080 fileName =
> >>>>>>>"/engine.ssh.key.txt"
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    /engine.ssh.key.txt failed in
> >>>>>>>HTTPS.
> >>>>>>>Retrying using HTTP.
> >>>>>>>Traceback (most recent call last):
> >>>>>>>   File "/tmp/deployUtil.py", line 1334, in getRemoteFile
> >>>>>>>     conn.sock = getSSLSocket(sock, certPath)
> >>>>>>>   File "/tmp/deployUtil.py", line 1178, in getSSLSocket
> >>>>>>>     cert_reqs=ssl.CERT_REQUIRED)
> >>>>>>>   File "/usr/lib64/python2.7/ssl.py", line 372, in wrap_socket
> >>>>>>>     ciphers=ciphers)
> >>>>>>>   File "/usr/lib64/python2.7/ssl.py", line 132, in __init__
> >>>>>>>     ciphers)
> >>>>>>>SSLError: [Errno 185090050] _ssl.c:340: error:0B084002:x509
> >>>>>>>certificate
> >>>>>>>routines:X509_load_cert_crl_file:system lib
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    getRemoteFile end.
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    handleSSHKey start
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    Failed to read
> >>>>>>>/root/.ssh/authorized_keys
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    handleSSHKey: failed to chmod
> >>>>>>>authorized_keys
> >>>>>>>Traceback (most recent call last):
> >>>>>>>   File "/tmp/deployUtil.py", line 608, in handleSSHKey
> >>>>>>>     silentRestoreCon(P_ROOT_AUTH_KEYS)
> >>>>>>>   File "/tmp/deployUtil.py", line 576, in silentRestoreCon
> >>>>>>>     import selinux
> >>>>>>>   File
> >>>>>>>   "/usr/lib64/python2.7/site-packages/selinux/__init__.py",
> >>>>>>>   line
> >>>>>>>   26,
> >>>>>>>   in<module>
> >>>>>>>     _selinux = swig_import_helper()
> >>>>>>>   File
> >>>>>>>   "/usr/lib64/python2.7/site-packages/selinux/__init__.py",
> >>>>>>>   line
> >>>>>>>   22,
> >>>>>>>   in swig_import_helper
> >>>>>>>     _mod = imp.load_module('_selinux', fp, pathname,
> >>>>>>>     description)
> >>>>>>>ImportError:
> >>>>>>>/usr/lib64/python2.7/site-packages/selinux/_selinux.so:
> >>>>>>>undefined symbol: selinux_check_access
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    handleSSHKey end
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG<BSTRAP component='SetSSHAccess'
> >>>>>>>status='FAIL' message='Failed to write server~s SSH key.'/>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 ERROR    setSSHAccess test failed
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG<BSTRAP component='RHEV_INSTALL'
> >>>>>>>status='FAIL'/>
> >>>>>>>Wed, 18 Jan 2012 08:35:37 DEBUG    **** End VDS Validation ****
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>- add fake_kvm_support = True to your vdsm.conf under
> >>>>>>>>/etc/vdsm/vdsm.conf
> >>>>>>>>- restart vdsmd service
> >>>>>>please make sure selinux is set to at least permissive mode:
> >>>>>>
> >>>>>>sed -i   's/SELINUX=disabled/SELINUX=permissive/g'
> >>>>>>/etc/sysconfig/selinux
> >>>>>>
> >>>>>>reboot and reinstall.
> >>>>>>
> >>>>>>anyhow, if this is the case, it's a known issue and a patch is
> >>>>>>pending
> >>>>>>upstream.
> >>>>>I did this, but I was also able to just re-run the installer and
> >>>>>bootstrap completed.  However, now I have another issue.
> >>>>>
> >>>>>The host is marked unresponsive in engine, engine.log shows a
> >>>>>connectivity issue, but both hosts can ping and share data.
> >>>>>
> >>>>>. Stage completed. (Stage: Running second installation script on
> >>>>>Host)
> >>>>>2012-01-18 09:58:08,550 INFO
> >>>>>[org.ovirt.engine.core.utils.hostinstall.MinaInstallWrapper]
> >>>>>(pool-5-thread-49) RunSSHCommand returns true
> >>>>>2012-01-18 09:58:08,550 INFO
> >>>>>[org.ovirt.engine.core.bll.VdsInstaller]
> >>>>>(pool-5-thread-49)  FinishCommand ended:true
> >>>>>2012-01-18 09:58:08,554 INFO
> >>>>>[org.ovirt.engine.core.bll.InstallVdsCommand] (pool-5-thread-49)
> >>>>>After
> >>>>>Installation pool-5-thread-49
> >>>>>2012-01-18 09:58:08,555 INFO
> >>>>>[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> >>>>>(pool-5-thread-49) START, SetVdsStatusVDSCommand(vdsId =
> >>>>>8c627fa8-41d8-11e1-8d2f-00fffe0000df, status=Reboot,
> >>>>>nonOperationalReason=NONE), log id: 703c3cbd
> >>>>>2012-01-18 09:58:08,560 INFO
> >>>>>[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> >>>>>(pool-5-thread-49) FINISH, SetVdsStatusVDSCommand, log id:
> >>>>>703c3cbd
> >>>>>2012-01-18 09:58:08,560 INFO
> >>>>>[org.ovirt.engine.core.bll.VdsCommand]
> >>>>>(pool-5-thread-50) Waiting 300 seconds, for server to finish
> >>>>>reboot
> >>>>>process.
> >>>>>2012-01-18 10:03:08,561 INFO
> >>>>>[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> >>>>>(pool-5-thread-50) START, SetVdsStatusVDSCommand(vdsId =
> >>>>>8c627fa8-41d8-11e1-8d2f-00fffe0000df, status=NonResponsive,
> >>>>>nonOperationalReason=NONE), log id: 3e57bdd2
> >>>>>2012-01-18 10:03:08,570 INFO
> >>>>>[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> >>>>>(pool-5-thread-50) FINISH, SetVdsStatusVDSCommand, log id:
> >>>>>3e57bdd2
> >>>>>2012-01-18 10:03:10,201 ERROR
> >>>>>[org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand]
> >>>>>(QuartzScheduler_Worker-35) XML RPC error in command
> >>>>>GetCapabilitiesVDS (
> >>>>>Vds: ichigo-dom224 ), the error was:
> >>>>>java.util.concurrent.ExecutionException:
> >>>>>java.lang.reflect.InvocationTargetException
> >>>>>
> >>>>>
> >>>>>I can query vdsm on the node:
> >>>>>
> >>>>>[root at f16-node1 ~]# vdsClient -s 0 getVdsCaps
> >>>>>     HBAInventory = {'iSCSI': [{'InitiatorName':
> >>>>>     'iqn.1994-05.com.redhat:2abcda43e16d'}], 'FC': []}
> >>>>>     ISCSIInitiatorName = iqn.1994-05.com.redhat:2abcda43e16d
> >>>>>     bondings = {'bond4': {'hwaddr': '00:00:00:00:00:00', 'cfg':
> >>>>>     {},
> >>>>>     'netmask': '', 'addr': '', 'slaves': []}, 'bond0': {'hwaddr':
> >>>>>     '00:00:00:00:00:00', 'cfg': {}, 'netmask': '', 'addr': '',
> >>>>>     'slaves':
> >>>>>     []}, 'bond1': {'hwaddr': '00:00:00:00:00:00', 'cfg': {},
> >>>>>     'netmask':
> >>>>>     '', 'addr': '', 'slaves': []}, 'bond2': {'hwaddr':
> >>>>>     '00:00:00:00:00:00', 'cfg': {}, 'netmask': '', 'addr': '',
> >>>>>     'slaves':
> >>>>>     []}, 'bond3': {'hwaddr': '00:00:00:00:00:00', 'cfg': {},
> >>>>>     'netmask':
> >>>>>     '', 'addr': '', 'slaves': []}}
> >>>>>     clusterLevels = ['3.0']
> >>>>>     cpuCores = 1
> >>>>>     cpuFlags =
> >>>>>     pge,clflush,sep,syscall,tsc,vmx,cmov,nx,constant_tsc,pat,sse4_1,lm,msr,fpu,fxsr,pae,nopl,mmx,cx8,mce,de,mca,pse,pni,popcnt,apic,sse,sse4_2,lahf_lm,sse2,hypervisor,up,ssse3,cx16,pse36,mtrr,x2apicmodel_486,model_pentium,model_pentium2,model_pentium3,model_pentiumpro,model_qemu32,model_coreduo,model_core2duo,model_n270,model_Conroe,model_Penryn,model_Nehalem,model_Opteron_G1
> >>>>>     cpuModel = Intel(Fake) CPU
> >>>>>     cpuSockets = 1
> >>>>>     cpuSpeed = 2800.482
> >>>>>     emulatedMachines = ['pc-0.14', 'pc', 'fedora-13', 'pc-0.13',
> >>>>>     'pc-0.12', 'pc-0.11', 'pc-0.10', 'isapc']
> >>>>>     guestOverhead = 65
> >>>>>     hooks = {}
> >>>>>     kvmEnabled = true
> >>>>>     management_ip =
> >>>>>     memSize = 7988
> >>>>>     networks = {'ovirtmgmt': {'addr': '192.168.68.224', 'cfg':
> >>>>>     {'DEVICE':
> >>>>>     'ovirtmgmt', 'DELAY': '0', 'BOOTPROTO': 'dhcp', 'TYPE':
> >>>>>     'Bridge',
> >>>>>     'ONBOOT': 'yes'}, 'ports': ['eth0'], 'netmask':
> >>>>>     '255.255.192.0',
> >>>>>     'stp': 'off', 'gateway': '192.168.68.1'}}
> >>>>>     nics = {'eth0': {'hwaddr': '00:FF:FE:00:00:E0', 'netmask':
> >>>>>     '',
> >>>>>     'speed': 0, 'addr': ''}}
> >>>>>     operatingSystem = {'release': '1', 'version': '16', 'name':
> >>>>>     'Fedora'}
> >>>>>     packages2 = {'kernel': {'release': '7.fc16.x86_64',
> >>>>>     'buildtime':
> >>>>>     1320196248.0, 'version': '3.1.0'}, 'spice-server':
> >>>>>     {'release':
> >>>>>     '1.fc16', 'buildtime': '1321276111', 'version': '0.10.0'},
> >>>>>     'vdsm':
> >>>>>     {'release': '0.fc16', 'buildtime': '1326734129', 'version':
> >>>>>     '4.9.3.1'}, 'qemu-kvm': {'release': '3.fc16', 'buildtime':
> >>>>>     '1321651456', 'version': '0.15.1'}, 'libvirt': {'release':
> >>>>>     '4.fc16',
> >>>>>     'buildtime': '1324326688', 'version': '0.9.6'}, 'qemu-img':
> >>>>>     {'release': '3.fc16', 'buildtime': '1321651456', 'version':
> >>>>>     '0.15.1'}}
> >>>>>     reservedMem = 321
> >>>>>     software_revision = 0
> >>>>>     software_version = 4.9
> >>>>>     supportedProtocols = ['2.2', '2.3']
> >>>>>     supportedRHEVMs = ['3.0']
> >>>>>     uuid = 922F4AE6-8EEA-4B11-44C4-EA1E1D665AC2_00:FF:FE:00:00:E0
> >>>>>     version_name = Snow Man
> >>>>>     vlans = {}
> >>>>>     vmTypes = ['kvm']
> >>>>>
> >>>>>
> >>>>can you check if problem is solved if you run iptables -F ?
> >>>It doesn't.
> >>can you also post the vdsm.log to see if the request made it and was 
> >>rejected for some reason?
> >yes, I think this is the issue:
> >
> >Thread-1016::ERROR::2012-01-18
> >11:36:50,986::SecureXMLRPCServer::73::root::(handle_error) client 
> >('192.168.68.223', 58819)
> >Traceback (most recent call last):
> >      File "/usr/lib64/python2.7/SocketServer.py", line 582, in 
> >      process_request_thread
> >          self.finish_request(request, client_address)
> >      File "/usr/share/vdsm/SecureXMLRPCServer.py", line 66, in 
> >      finish_request
> >          request.do_handshake()
> >      File "/usr/lib64/python2.7/ssl.py", line 296, in do_handshake
> >          self._sslobj.do_handshake()
> >
> >I have vdsm.conf with ssl=true, however, if I set ssl=false, then I
> >cannot query vdsm from the localhost client:
> >
> >[root at f16-node1 vdsm]# vdsClient -s 0 getVdsCaps
> >
> >with ssl=false, that returns connection refused.
> >
> >
> Indeed. If you want to work without ssl, you need to also change the 
> "UseSecureConnectionWithServers" option_name to "false" in vdc_options 
> table in ovirt-engine database and restart jboss-as service , so it will 
> query vdsm without SSL.
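
For reference, the vdsm side of that change is a one-line edit to
vdsm.conf. A rough sketch of making it with Python 3's configparser
follows. It assumes the option lives in a [vars] section (check your own
file), it works on an in-memory sample rather than /etc/vdsm/vdsm.conf,
and the set_ssl helper is made up for illustration. The engine-side
vdc_options change is not covered here.

```python
import configparser
import io

# Sample contents only; the [vars] section name is an assumption about
# how vdsm.conf is laid out, not something taken from this thread.
SAMPLE_CONF = """\
[vars]
ssl = true
fake_kvm_support = True
"""


def set_ssl(conf_text, enabled):
    """Return conf_text with the ssl option in [vars] set to true/false.

    Hypothetical helper for illustration; it is not part of vdsm.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("vars"):
        parser.add_section("vars")
    parser.set("vars", "ssl", "true" if enabled else "false")

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = set_ssl(SAMPLE_CONF, enabled=False)
print(updated)
```

configparser drops comments when it rewrites a file, so for a real
vdsm.conf a hand edit may be the safer route. vdsmd needs a restart
afterwards either way.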

OK, now with ssl=false set, the database updated, and jboss-as restarted,
the engine sees the host as up, but the local capabilities query now
fails:

[root at f16-node1 vdsm]# vdsClient -s 0 getVdsCaps            
Traceback (most recent call last):
  File "/usr/share/vdsm/vdsClient.py", line 1972, in <module>
    code, message = commands[command][0](commandArgs)
  File "/usr/share/vdsm/vdsClient.py", line 346, in do_getCap
    return self.ExecAndExit(self.s.getVdsCapabilities())
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1575, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "/usr/lib64/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/usr/lib64/python2.7/httplib.py", line 811, in _send_output
    self.send(msg)
  File "/usr/lib64/python2.7/httplib.py", line 773, in send
    self.connect()
  File "/usr/share/vdsm/SecureXMLRPCServer.py", line 98, in connect
    cert_reqs=self.cert_reqs)
  File "/usr/lib64/python2.7/ssl.py", line 372, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib64/python2.7/ssl.py", line 134, in __init__
    self.do_handshake()
  File "/usr/lib64/python2.7/ssl.py", line 296, in do_handshake
    self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:503: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
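
The "unknown protocol" error above is what an SSL client typically
reports when the peer answers in plain text. That would fit this setup:
vdsClient -s asks for a secure connection, while vdsm is now running with
ssl=false. The sketch below only illustrates that kind of mismatch with
the Python 3 standard library. It is not vdsm code, and the function
names and the loopback server are made up for the example.

```python
import socket
import ssl
import threading


def plaintext_server(listener):
    """Accept one connection and reply in plain HTTP, with no TLS at all."""
    conn, _ = listener.accept()
    try:
        conn.recv(4096)  # what arrives here is the client's TLS hello
        conn.sendall(b"HTTP/1.0 400 Bad Request\r\n\r\n")
    except OSError:
        pass
    finally:
        conn.close()


def tls_client_fails_against_plaintext():
    """Return True if a TLS handshake with a plaintext server raises SSLError."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=plaintext_server, args=(listener,))
    worker.daemon = True
    worker.start()

    # Certificate checks are switched off so that the only thing that can
    # go wrong is the protocol mismatch itself.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    raw = socket.create_connection(("127.0.0.1", port), timeout=5)
    try:
        ctx.wrap_socket(raw)  # the handshake happens here
    except ssl.SSLError:
        return True
    finally:
        raw.close()
        worker.join(timeout=5)
        listener.close()
    return False


if __name__ == "__main__":
    print(tls_client_fails_against_plaintext())
```

If that is the cause, the same query without the -s flag would be the
obvious thing to try next.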


> 
> 
> -- 
> 
> Thanks,
> 
> Rami Vaknin, QE @ Red Hat, TLV, IL.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh at us.ibm.com



