Hi,

A new behaviour was introduced for hosted-engine in 3.6.3: the hosted-engine storage domain is now imported automatically when the first DC is initialized (i.e. when the first storage pool is created).
That means that once a first data domain has been created in the setup, the hosted-engine storage domain will be imported into that DC and the hosted-engine VM will be registered and appear in the VMs tab.

Regarding the “Uncaught exception” you're getting when trying to create the domain: I can't say for sure why it happens, but my guess is that the engine's date and time are not configured correctly.
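A quick way to check that on the engine VM might be something like the following (a hedged sketch: `chronyd` is an assumption — use `ntpd`/`ntpdate` if that is what your appliance actually runs — and `srv-vmhost-05` is simply the host name from your report):

```shell
# On the engine VM: show clock, timezone and NTP sync state
timedatectl status

# Compare engine and host clocks in UTC; a large skew can break
# certificate validation and cause odd UI errors
date -u
ssh root@srv-vmhost-05 date -u

# If the engine clock is off, step it back into sync (chrony assumed)
systemctl restart chronyd
chronyc makestep
```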

Thanks

On Tue, Feb 9, 2016 at 4:41 PM, Giuseppe Berellini <Giuseppe.Berellini@ptvgroup.com> wrote:

Hi,

 

I’m trying to set up oVirt 3.6.3 with a self-hosted engine on 4 servers (vmhost-03, vmhost-04, vmhost-05 for compute; stor-01 for storage). The storage server runs GlusterFS 3.7.6; all the servers are on the same network and are also connected through InfiniBand DDR.

 

Network is OK, RDMA is working, IPoIB has been configured, and it is possible to manually mount GlusterFS volumes on each vmhost. firewalld and SELinux are disabled. The ovirtmgmt network is on Ethernet.

 

The problem is that, after installing the hosted engine, I can connect to oVirt admin panel but:

- Datacenter is marked as down

- The only host is NOT recognized as an SPM

- In the storage tab there is no storage domain for the hosted engine (I only see a detached ISO domain and oVirt repo)

- when I try to create a storage domain, an error shows up (it’s an “Uncaught exception”)

- when I try to import a storage domain, an error shows up (it’s about datacenter down and SPM not available)

- also, in the Virtual Machines tab there are no VMs (not even the hosted engine, which is clearly up and is reported as up by the command “hosted-engine --vm-status”)

 

So basically it is not possible to do anything.

After setting the host in maintenance mode and rebooting, I cannot start the engine VM anymore:

 

[root@SRV-VMHOST-05 ~]# hosted-engine --vm-start

VM exists and is down, destroying it

Machine destroyed

 

429eec6e-2126-4740-9911-9c5ad482e09f

        Status = WaitForLaunch

        nicModel = rtl8139,pv

        statusTime = 4300834920

        emulatedMachine = pc

        pid = 0

        vmName = HostedEngine

        devices = [{'index': '2', 'iface': 'ide', 'specParams': {}, 'readonly': 'true', 'deviceId': '1c2205da-17c6-4ffe-9408-602a998d90dc', 'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0', 'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type': 'disk'}, {'index': '0', 'iface': 'virtio', 'format': 'raw', 'bootOrder': '1', 'poolID': '00000000-0000-0000-0000-000000000000', 'volumeID': 'fe82ba21-942d-48cc-9bdb-f41c0f172dde', 'imageID': '131460bc-4599-4326-a026-e9e224e4bb5f', 'specParams': {}, 'readonly': 'false', 'domainID': '162fc2e5-1897-46fb-b382-195c11ab8546', 'optional': 'false', 'deviceId': '131460bc-4599-4326-a026-e9e224e4bb5f', 'address': {'slot': '0x06', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'disk', 'shared': 'exclusive', 'propagateErrors': 'off', 'type': 'disk'}, {'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller'}, {'nicModel': 'pv', 'macAddr': '00:16:3e:30:a9:6e', 'linkActive': 'true', 'network': 'ovirtmgmt', 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'deviceId': '3d3259a3-19a8-42c3-a50c-6724b475c1ab', 'address': {'slot': '0x03', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'interface'}, {'device': 'console', 'specParams': {}, 'type': 'console', 'deviceId': '885cca16-2b59-42e4-a57c-0a89a0e823e8', 'alias': 'console0'}]

        guestDiskMapping = {}

        vmType = kvm

        clientIp =

        displaySecurePort = -1

        memSize = 8192

        displayPort = -1

        cpuType = Nehalem

        spiceSecureChannels = smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir

        smp = 4

        displayIp = 0

        display = vnc

but the status remains {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}

We tried both rdma and tcp transports for the engine volume; nothing changed.

 

In /var/log/ovirt-hosted-engine-ha/agent.log, these are the only errors we found:

 

MainThread::WARNING::2016-02-08 18:17:23,160::ovf_store::105::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE

MainThread::ERROR::2016-02-08 18:17:23,161::config::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf

 

In vdsm.log I see:

Thread-16399::INFO::2016-02-09 14:54:39,478::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:39823 started

Thread-16399::DEBUG::2016-02-09 14:54:39,478::bindingxmlrpc::1257::vds::(wrapper) client [127.0.0.1]::call vmGetStats with ('429eec6e-2126-4740-9911-9c5ad482e09f',) {}

Thread-16399::DEBUG::2016-02-09 14:54:39,479::bindingxmlrpc::1264::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'exitMessage': 'Failed to acquire lock: No space left on device', 'statusTime': '4302636100', 'vmId': '429eec6e-2126-4740-9911-9c5ad482e09f', 'exitReason': 1, 'exitCode': 1}]}

 

When executing hosted-engine --vm-start, this appears in vdsm.log:

Thread-16977::ERROR::2016-02-09 14:59:12,146::vm::759::virt.vm::(_startUnderlyingVm) vmId=`429eec6e-2126-4740-9911-9c5ad482e09f`::The vm start process failed

Traceback (most recent call last):

  File "/usr/share/vdsm/virt/vm.py", line 703, in _startUnderlyingVm

    self._run()

  File "/usr/share/vdsm/virt/vm.py", line 1941, in _run

    self._connection.createXML(domxml, flags),

  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper

    ret = f(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in wrapper

    return func(inst, *args, **kwargs)

  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3611, in createXML

    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)

libvirtError: Failed to acquire lock: No space left on device

 

But disk space does not appear to be the issue:

[root@SRV-VMHOST-05 vdsm]# df -h

Filesystem                               Size  Used Avail Use% Mounted on

/dev/mapper/centos_srv--vmhost--05-root   50G  2.8G   48G   6% /

devtmpfs                                  16G     0   16G   0% /dev

tmpfs                                     16G     0   16G   0% /dev/shm

tmpfs                                     16G  105M   16G   1% /run

tmpfs                                     16G     0   16G   0% /sys/fs/cgroup

/dev/mapper/centos_srv--vmhost--05-home   84G   33M   84G   1% /home

/dev/sda1                                497M  178M  319M  36% /boot

srv-stor-01:/ovirtengine                 3.7T  3.0G  3.7T   1% /rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine

tmpfs                                    3.2G     0  3.2G   0% /run/user/0
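For what it's worth, sanlock's “No space left on device” usually has nothing to do with disk space: it means sanlock could not acquire a slot in its lockspace on the hosted-engine storage domain, so healthy-looking `df` output like the above is expected. A few hedged diagnostics (the ha_agent path is illustrative, assembled from the gluster mount point and the storage-domain UUID seen in the logs; exact command availability depends on your 3.6 packages):

```shell
# Show sanlock's lockspaces and held resources on this host
sanlock client status

# The hosted-engine lockspace lives inside the HE storage domain; verify
# the ha_agent files are present and readable (path is illustrative,
# built from the mount point and domain UUID above)
ls -l "/rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine/162fc2e5-1897-46fb-b382-195c11ab8546/ha_agent/"

# Restarting the HA daemons forces the lockspace to be re-acquired;
# some hosted-engine versions also offer a --reinitialize-lockspace option
systemctl restart ovirt-ha-broker ovirt-ha-agent
```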

 

 

I also verified that Gluster storage was correctly mounted:

[root@SRV-VMHOST-05 ~]# mount | grep gluster

srv-stor-01:/ovirtengine on /rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

 

(if I create a file in that folder, it appears on the gluster server).
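That write test can be made into a tiny script (the probe filename is arbitrary):

```shell
# Probe write access on the hosted-engine gluster mount
MNT="/rhev/data-center/mnt/glusterSD/srv-stor-01:_ovirtengine"
PROBE="$MNT/write-test-$$"
echo probe > "$PROBE" && echo "write OK"   # the file should also appear on srv-stor-01
rm -f "$PROBE"
```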

 

 

 

On the engine VM in /var/log/ovirt-engine/engine.log I found the following:

2016-02-09 11:55:41,165 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (DefaultQuartzScheduler_Worker-93) [] START, FullListVDSCommand(HostName = , FullListVDSCommandParameters:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0', vds='Host[,13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0]', vmIds='[429eec6e-2126-4740-9911-9c5ad482e09f]'}), log id: 61eda464

2016-02-09 11:55:42,169 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (DefaultQuartzScheduler_Worker-93) [] FINISH, FullListVDSCommand, return: [{status=Up, nicModel=rtl8139,pv, emulatedMachine=pc, guestDiskMapping={}, vmId=429eec6e-2126-4740-9911-9c5ad482e09f, pid=11133, devices=[Ljava.lang.Object;@2099d011, smp=4, vmType=kvm, displayIp=0, display=vnc, displaySecurePort=-1, memSize=8192, displayPort=5900, cpuType=Nehalem, spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir, statusTime=4364469020, vmName=HostedEngine, clientIp=, pauseCode=NOERR}], log id: 61eda464

2016-02-09 11:55:42,173 INFO  [org.ovirt.engine.core.bll.storage.GetExistingStorageDomainListQuery] (org.ovirt.thread.pool-8-thread-35) [] START, GetExistingStorageDomainListQuery(GetExistingStorageDomainListParameters:{refresh='true', filtered='false'}), log id: 5611a666

2016-02-09 11:55:42,173 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] START, HSMGetStorageDomainsListVDSCommand(HostName = srv-vmhost-05, HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null', storageDomainType='Data', path='null'}), log id: 63695be3

2016-02-09 11:55:43,298 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] FINISH, HSMGetStorageDomainsListVDSCommand, return: [162fc2e5-1897-46fb-b382-195c11ab8546], log id: 63695be3

2016-02-09 11:55:43,365 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] START, HSMGetStorageDomainInfoVDSCommand(HostName = srv-vmhost-05, HSMGetStorageDomainInfoVDSCommandParameters:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0', storageDomainId='162fc2e5-1897-46fb-b382-195c11ab8546'}), log id: 7e520f35

2016-02-09 11:55:44,377 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-8-thread-35) [] FINISH, HSMGetStorageDomainInfoVDSCommand, return: <StorageDomainStatic:{name='EngineStorage', id='162fc2e5-1897-46fb-b382-195c11ab8546'}, null>, log id: 7e520f35

2016-02-09 11:55:44,377 INFO  [org.ovirt.engine.core.bll.storage.GetExistingStorageDomainListQuery] (org.ovirt.thread.pool-8-thread-35) [] FINISH, GetExistingStorageDomainListQuery, log id: 5611a666

2016-02-09 11:55:44,378 INFO  [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-35) [23427de7] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'

2016-02-09 11:55:44,379 WARN  [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-35) [23427de7] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_MASTER_STORAGE_DOMAIN_NOT_ACTIVE

2016-02-09 11:55:44,379 INFO  [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-35) [23427de7] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'

2016-02-09 11:55:46,625 INFO  [org.ovirt.engine.core.bll.UpdateVdsGroupCommand] (default task-26) [5118b768] Running command: UpdateVdsGroupCommand internal: false. Entities affected :  ID: 00000002-0002-0002-0002-0000000000d9 Type: VdsGroupsAction group EDIT_CLUSTER_CONFIGURATION with role type ADMIN

2016-02-09 11:55:46,765 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-26) [5118b768] Correlation ID: 5118b768, Call Stack: null, Custom Event ID: -1, Message: Host cluster Default was updated by admin@internal

2016-02-09 11:55:46,932 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-6) [] START, GlusterServersListVDSCommand(HostName = srv-vmhost-05, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0'}), log id: 559ab127

2016-02-09 11:55:47,503 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-13) [] START, GlusterServersListVDSCommand(HostName = srv-vmhost-05, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0'}), log id: 62d703e5

2016-02-09 11:55:47,510 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-6) [] FINISH, GlusterServersListVDSCommand, log id: 559ab127

2016-02-09 11:55:47,511 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-6) [] Query 'GetAddedGlusterServersQuery' failed: null

2016-02-09 11:55:47,511 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-6) [] Exception: java.lang.NullPointerException

        at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.getAddedGlusterServers(GetAddedGlusterServersQuery.java:54) [bll.jar:]

        at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.executeQueryCommand(GetAddedGlusterServersQuery.java:45) [bll.jar:]

        at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:82) [bll.jar:]

        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]

        at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:537) [bll.jar:]

        at org.ovirt.engine.core.bll.Backend.runQuery(Backend.java:511) [bll.jar:]

        at sun.reflect.GeneratedMethodAccessor98.invoke(Unknown Source) [:1.8.0_71]

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]

        at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_71]

        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)

        at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53)

        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)

        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:407)

        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-8.2.1.Final.jar:8.2.1.Final]

        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:80) [wildfly-weld-8.2.1.Final.jar:8.2.1.Final]

        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:93) [wildfly-weld-8.2.1.Final.jar:8.2.1.Final]

        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)

        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:407)

        at org.ovirt.engine.core.bll.interceptors.CorrelationIdTrackerInterceptor.aroundInvoke(CorrelationIdTrackerInterceptor.java:13) [bll.jar:]

        at sun.reflect.GeneratedMethodAccessor74.invoke(Unknown Source) [:1.8.0_71]

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]

                               ....

2016-02-09 11:55:47,985 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-14) [] START, GlusterServersListVDSCommand(HostName = srv-vmhost-05, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='13ce38e6-f4b6-42fa-bb8c-5ec84ad00ce0'}), log id: 61100c4d

2016-02-09 11:55:47,986 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (default task-13) [] FINISH, GlusterServersListVDSCommand, log id: 62d703e5

2016-02-09 11:55:47,986 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-13) [] Query 'GetAddedGlusterServersQuery' failed: null

2016-02-09 11:55:47,987 ERROR [org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery] (default task-13) [] Exception: java.lang.NullPointerException

        at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.getAddedGlusterServers(GetAddedGlusterServersQuery.java:54) [bll.jar:]

        at org.ovirt.engine.core.bll.gluster.GetAddedGlusterServersQuery.executeQueryCommand(GetAddedGlusterServersQuery.java:45) [bll.jar:]

        at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:82) [bll.jar:]

        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]

        at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:537) [bll.jar:]

        at org.ovirt.engine.core.bll.Backend.runQuery(Backend.java:511) [bll.jar:]

        at sun.reflect.GeneratedMethodAccessor98.invoke(Unknown Source) [:1.8.0_71]

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]

        at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_71]

        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)

        at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53)

                               ....

 

 

 

Do you have any ideas about what I should do?

 

Thanks,

        Giuseppe

 

 

--
Giuseppe Berellini

PTV SISTeMA

Phone +39 06 993 444 15
Mobile +39 349 3241969
Fax +39 06 993 348 72
Via Ruggero Bonghi, 11/B – 00184 Roma

giuseppe.berellini@ptvgroup.com
www.sistemaits.com

facebook.com/sistemaits
linkedin.com/SISTeMA
 


_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users