[ovirt-users] Re: Problem installing oVirt 3.6.3-rc with GlusterFS over InfiniBand
Giuseppe Berellini
Giuseppe.Berellini at ptvgroup.com
Wed Feb 10 09:28:10 UTC 2016
Hi Elad,
thank you for the information!
I met a couple of people from the oVirt team at FOSDEM and they told me they're looking for feedback from users:
I could not find any information in the quick start guide about NTP (or about the need for an accurate clock), so I was not paying much attention to it.
Given how important it is, I would expect the installation script to set up NTP on the hosted engine by itself (or something similar… maybe a question like “Is this time correct?”). An incorrect date/time is a common problem for virtual machines… Also, “Uncaught exception” is not helpful for solving an issue. :-)
I started again with a fresh deployment of oVirt Engine, set up NTP on the VM and, as soon as the engine was up, added the new storage domain and… GREAT! :-D
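For reference, this is more or less what I ran on the engine VM (a rough sketch, assuming a CentOS 7 engine VM and chrony; ntpd would work just as well, and the NTP servers are whatever your site uses):

# Install and enable an NTP client on the engine VM (sketch: CentOS 7 + chrony)
yum install -y chrony
systemctl enable chronyd
systemctl start chronyd
# Verify that the clock is synchronized and the date/time is correct
chronyc tracking
timedatectl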
Now my datacenter is up and the data storage domain (master) is active, but:
- The storage domain for the oVirt Engine is not listed in the Storage tab
- The hosted engine does not appear in the Virtual Machines tab
Is it OK? What should I do now?
Thanks for your support!
Giuseppe
--
Giuseppe Berellini
PTV SISTeMA
Phone +39 06 993 444 15
Mobile +39 349 3241969
Fax +39 06 993 348 72
Via Ruggero Bonghi, 11/B – 00184 Roma
giuseppe.berellini at ptvgroup.com
www.sistemaits.com
facebook.com/sistemaits
linkedin.com/SISTeMA
From: Elad Ben Aharon [mailto:ebenahar at redhat.com]
Sent: Tuesday, 9 February 2016 17:05
To: Giuseppe Berellini <Giuseppe.Berellini at ptvgroup.com>
Cc: users at ovirt.org
Subject: Re: [ovirt-users] Problem installing oVirt 3.6.3-rc with GlusterFS over InfiniBand
Hi,
A new behaviour has been introduced for hosted-engine in 3.6.3: the hosted-engine storage domain is now imported automatically. This auto-import takes place along with the first DC initialization (the first storage pool creation).
That means that once the first data domain is created in the setup, the hosted-engine storage domain will be imported into this DC and the hosted-engine VM will be registered and appear in the VMs tab.
Regarding the “Uncaught exception” you're getting when trying to create the domain: I can't know for sure why it happens, but my guess is that the engine's date and time are not set correctly.
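A quick way to verify is to compare the clocks (just a sketch; run this on both the engine VM and the host, and the outputs should agree):

# Compare on the engine VM and on the host; a large skew indicates the problem
date -u
timedatectl status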
Thanks
On Tue, Feb 9, 2016 at 4:41 PM, Giuseppe Berellini <Giuseppe.Berellini at ptvgroup.com> wrote:
Hi,
I’m trying to set up oVirt 3.6.3 with a self-hosted engine on 4 servers (vmhost-03, vmhost-04, vmhost-05 for compute; stor-01 for storage). The storage server runs GlusterFS 3.7.6; all the servers are on the same network and are also connected through InfiniBand DDR.
The network is OK, RDMA is working, IPoIB has been configured, and it is possible to manually mount the GlusterFS volumes on each vmhost. firewalld and SELinux are disabled. The ovirtmgmt network is on Ethernet.
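For example, a manual mount test like the following succeeds from every vmhost (a sketch; /mnt/glustertest is an arbitrary test directory and ovirtengine is our engine volume):

# Manual GlusterFS mount test from a vmhost
mkdir -p /mnt/glustertest
mount -t glusterfs srv-stor-01:/ovirtengine /mnt/glustertest
touch /mnt/glustertest/testfile && ls -l /mnt/glustertest  # the file also appears on the gluster server
umount /mnt/glustertest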
The problem is that, after installing the hosted engine, I can connect to the oVirt admin panel, but:
- The datacenter is marked as down
- The only host is NOT recognized as the SPM
- In the Storage tab there is no storage domain for the hosted engine (I only see a detached ISO domain and the oVirt repo)
- When I try to create a storage domain, an error shows up (an “Uncaught exception”)
- When I try to import a storage domain, an error shows up (about the datacenter being down and the SPM not being available)
- Also, the Virtual Machines tab shows no VMs (not even the hosted engine, which is obviously up and reported as up by “hosted-engine --vm-status”)
So basically it is not possible to do anything.
After setting the host in maintenance mode and rebooting, I cannot start the engine VM anymore:
[root@SRV-VMHOST-05 ~]# hosted-engine --vm-start
VM exists and is down, destroying it
Machine destroyed
429eec6e-2126-4740-9911-9c5ad482e09f
Status = WaitForLaunch
nicModel = rtl8139,pv
statusTime = 4300834920
emulatedMachine = pc
pid = 0
vmName = HostedEngine
devices = [{'index': '2', 'iface': 'ide', 'specParams': {}, 'readonly': 'true', 'deviceId': '1c2205da-17c6-4ffe-9408-602a998d90dc', 'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0', 'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type': 'disk'}, {'index': '0', 'iface': 'virtio', 'format': 'raw', 'bootOrder': '1', 'poolID': '00000000-0000-0000-0000-000000000000', 'volumeID': 'fe82ba21-942d-48cc-9bdb-f41c0f172dde', 'imageID': '131460bc-4599-4326-a026-e9e224e4bb5f', 'specParams': {}, 'readonly': 'false', 'domainID': '162fc2e5-1897-46fb-b382-195c11ab8546', 'optional': 'false', 'deviceId': '131460bc-4599-4326-a026-e9e224e4bb5f', 'address': {'slot': '0x06', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'disk', 'shared': 'exclusive', 'propagateErrors': 'off', 'type': 'disk'}, {'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller'}, {'nicModel': 'pv', 'macAddr': '00:16:3e:30:a9:6e', 'linkActive': 'true', 'network': 'ovirtmgmt', 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'deviceId': '3d3259a3-19a8-42c3-a50c-6724b475c1ab', 'address': {'slot': '0x03', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'interface'}, {'device': 'console', 'specParams': {}, 'type': 'console', 'deviceId': '885cca16-2b59-42e4-a57c-0a89a0e823e8', 'alias': 'console0'}]
guestDiskMapping = {}
vmType = kvm
clientIp =
displaySecurePort = -1
memSize = 8192
displayPort = -1
cpuType = Nehalem
spiceSecureChannels = smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
smp = 4
displayIp = 0
display = vnc
but the status remains {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
For the engine volume we tried both rdma and tcp transports; nothing changed.
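(We switched the transport on the gluster server roughly like this; a sketch from memory, using the config.transport volume option, which requires the volume to be stopped first:)

# Sketch: change the transport of the engine volume
gluster volume stop ovirtengine
gluster volume set ovirtengine config.transport tcp    # or: rdma, or: tcp,rdma
gluster volume start ovirtengine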
In /var/log/ovirt-hosted-engine-ha/agent.log, these are the only errors we found:
MainThread::WARNING::2016-02-08 18:17:23,160::ovf_store::105::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
MainThread::ERROR::2016-02-08 18:17:23,161::config::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
In vdsm.log I see:
Thread-16399::INFO::2016-02-09 14:54:39,478::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:39823 started
Thread-16399::DEBUG::2016-02-09 14:54:39,478::bindingxmlrpc::1257::vds::(wrapper) client [127.0.0.1]::call vmGetStats with ('429eec6e-2126-4740-9911-9c5ad482e09f',) {}
Thread-16399::DEBUG::2016-02-09 14:54:39,479::bindingxmlrpc::1264::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'exitMessage': 'Failed to acquire lock: No space left on device', 'statusTime': '4302636100', 'vmId': '429eec6e-2126-4740-9911-9c5ad482e09f', 'exitReason': 1, 'exitCode': 1}]}
When executing hosted-engine --vm-start, this appears in vdsm.log:
Thread-16977::ERROR::2016-02-09 14:59:12,146::vm::759::virt.vm::(_startUnderlyingVm) vmId=`429eec6e-2126-4740-9911-9c5ad482e09f`::The vm start process failed
Traceback (most recent call last):
File "/usr/share/vdsm/virt/vm.py", line 703, in _startUnderlyingVm
self._run()
File "/usr/share/vdsm/virt/vm.py", line 1941, in _run
self._connection.createXML(domxml, flags),
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3611, in createXML
if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Failed to acquire lock: No space left on device
But there is plenty of free space:
[root@SRV-VMHOST-05 vdsm]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos_srv--vmhost--05-root 50G 2.8G 48G 6% /
devtmpfs 16G 0 16G 0% /dev
tmpfs 16G 0 16G 0% /dev/shm
tmpfs 16G 105M 16G 1% /run
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/centos_srv--vmhost--05-home 84G 33M 84G 1% /home
/dev/sda1 497M 178M 319M 36% /boot
srv-stor-01:/ovirtengine 3.7T 3.0G 3.7T 1% /rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine
tmpfs 3.2G 0 3.2G 0% /run/user/0
I also verified that Gluster storage was correctly mounted:
[root@SRV-VMHOST-05 ~]# mount | grep gluster
srv-stor-01:/ovirtengine on /rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
(if I create a file in that folder, it appears on the gluster server).
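As far as I understand, “Failed to acquire lock: No space left on device” comes from sanlock rather than from a full filesystem, so I also looked at the lockspaces (a sketch; the domain UUID is the one from the VM definition above):

# Check sanlock and the storage domain's lockspace file ("ids")
sanlock client status
ls -l /rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine/162fc2e5-1897-46fb-b382-195c11ab8546/dom_md/ids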
On the engine VM, in /var/log/ovirt-engine/engine.log, I found the following:
2016-02-09 11:55:41,165 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (DefaultQuartzScheduler_Worker-93) [] START, FullListVDSCommand(HostName = , FullListVDSCommandParameters:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0', vds='Host[,13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0]', vmIds='[429eec6e-2126-4740-9911-9c5ad482e09f]'}), log id: 61eda464
2016-02-09 11:55:42,169 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (DefaultQuartzScheduler_Worker-93) [] FINISH, FullListVDSCommand, return: [{status=Up, nicModel=rtl8139,pv, emulatedMachine=pc, guestDiskMapping={}, vmId=429eec6e-2126-4740-9911-9c5ad482e09f, pid=11133, devices=[Ljava.lang.Object;@2099d011, smp=4, vmType=kvm, displayIp=0, display=vnc, displaySecurePort=-1, memSize=8192, displayPort=5900, cpuType=Nehalem, spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir, statusTime=4364469020, vmName=HostedEngine, clientIp=, pauseCode=NOERR}], log id: 61eda464
2016-02-09 11:55:42,173 INFO [org.ovirt.engine.core.bll.storage.GetExistingStorageDomainListQuery] (org.ovirt.thread.pool-8-thread-35) [] START, GetExistingStorageDomainListQuery(GetExistingStorageDomainListParameters:{refresh='true', filtered='false'}), log id: 5611a666
2016-02-09 11:55:42,173 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] START, HSMGetStorageDomainsListVDSCommand(HostName = srv-vmhost-05, HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null', storageDomainType='Data', path='null'}), log id: 63695be3
2016-02-09 11:55:43,298 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] FINISH, HSMGetStorageDomainsListVDSCommand, return: [162fc2e5-1897-46fb-b382-195c11ab8546], log id: 63695be3
2016-02-09 11:55:43,365 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] START, HSMGetStorageDomainInfoVDSCommand(HostName = srv-vmhost-05, HSMGetStorageDomainInfoVDSCommandParameters:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0', storageDomainId='162fc2e5-1897-46fb-b382-195c11ab8546'}), log id: 7e520f35
2016-02-09 11:55:44,377 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] FINISH, HSMGetStorageDomainInfoVDSCommand, return: <StorageDomainStatic:{name='EngineStorage', id='162fc2e5-1897-46fb-b382-195c11ab8546'}, null>, log id: 7e520f35
2016-02-09 11:55:44,377 INFO [org.ovirt.engine.core.bll.storage.GetExistingStorageDomainListQuery] (org.ovirt.thread.pool-8-thread-35) [] FINISH, GetExistingStorageDomainListQuery, log id: 5611a666
2016-02-09 11:55:44,378 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-35) [23427de7] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
2016-02-09 11:55:44,379 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-35) [23427de7] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_MASTER_STORAGE_DOMAIN_NOT_ACTIVE
2016-02-09 11:55:44,379 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-35) [23427de7] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
2016-02-09 11:55:46,625 INFO [org.ovirt.engine.core.bll.UpdateVdsGroupCommand] (default task-26) [5118b768] Running command: UpdateVdsGroupCommand internal: false. Entities affected : ID: 00000002-0002-0002-0002-0000000000d9 Type: VdsGroupsAction group EDIT_CLUSTER_CONFIGURATION with role type ADMIN
2016-02-09 11:55:46,765 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-26) [5118b768] Correlation ID: 5118b768, Call Stack: null, Custom Event ID: -1, Message: Host cluster Default was updated by admin@internal
2016-02-09 11:55:46,932 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-6) [] START, GlusterServersListVDSCommand(HostName = srv-vmhost-05, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0'}), log id: 559ab127
2016-02-09 11:55:47,503 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-13) [] START, GlusterServersListVDSCommand(HostName = srv-vmhost-05, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0'}), log id: 62d703e5
2016-02-09 11:55:47,510 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-6) [] FINISH, GlusterServersListVDSCommand, log id: 559ab127
2016-02-09 11:55:47,511 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-6) [] Query 'GetAddedGlusterServersQuery' failed: null
2016-02-09 11:55:47,511 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-6) [] Exception: java.lang.NullPointerException
at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.getAddedGlusterServers(GetAddedGlusterServersQuery.java:54) [bll.jar:]
at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.executeQueryCommand(GetAddedGlusterServersQuery.java:45) [bll.jar:]
at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:82) [bll.jar:]
at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:537) [bll.jar:]
at org.ovirt.engine.core.bll.Backend.runQuery(Backend.java:511) [bll.jar:]
at sun.reflect.GeneratedMethodAccessor98.invoke(Unknown Source) [:1.8.0_71]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]
at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_71]
at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53)
at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:407)
at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-8.2.1.Final.jar:8.2.1.Final]
at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:80) [wildfly-weld-8.2.1.Final.jar:8.2.1.Final]
at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:93) [wildfly-weld-8.2.1.Final.jar:8.2.1.Final]
at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:407)
at org.ovirt.engine.core.bll.interceptors.CorrelationIdTrackerInterceptor.aroundInvoke(CorrelationIdTrackerInterceptor.java:13) [bll.jar:]
at sun.reflect.GeneratedMethodAccessor74.invoke(Unknown Source) [:1.8.0_71]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]
....
2016-02-09 11:55:47,985 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-14) [] START, GlusterServersListVDSCommand(HostName = srv-vmhost-05, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0'}), log id: 61100c4d
2016-02-09 11:55:47,986 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-13) [] FINISH, GlusterServersListVDSCommand, log id: 62d703e5
2016-02-09 11:55:47,986 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-13) [] Query 'GetAddedGlusterServersQuery' failed: null
2016-02-09 11:55:47,987 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-13) [] Exception: java.lang.NullPointerException
at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.getAddedGlusterServers(GetAddedGlusterServersQuery.java:54) [bll.jar:]
at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.executeQueryCommand(GetAddedGlusterServersQuery.java:45) [bll.jar:]
at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:82) [bll.jar:]
at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:537) [bll.jar:]
at org.ovirt.engine.core.bll.Backend.runQuery(Backend.java:511) [bll.jar:]
at sun.reflect.GeneratedMethodAccessor98.invoke(Unknown Source) [:1.8.0_71]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]
at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_71]
at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53)
....
Do you have any ideas about what I should do?
Thanks,
Giuseppe
--
Giuseppe Berellini
PTV SISTeMA
Phone +39 06 993 444 15
Mobile +39 349 3241969
Fax +39 06 993 348 72
Via Ruggero Bonghi, 11/B – 00184 Roma
giuseppe.berellini at ptvgroup.com
www.sistemaits.com
facebook.com/sistemaits
linkedin.com/SISTeMA