[ovirt-users] hosted-storage import fails on hyperconverged glusterFS

Liebe, André-Sebastian andre.liebe at gematik.de
Tue Feb 7 14:31:34 UTC 2017


Hello Sahina,

First of all, sorry for the late reply, but I got distracted by other things and then went on vacation.
The problem vanished after a complete shutdown of the whole datacenter (due to hardware maintenance).


Sincerely
André-Sebastian Liebe

From: Sahina Bose [mailto:sabose at redhat.com]
Sent: Monday, 23 January 2017 08:15
To: Liebe, André-Sebastian
Cc: users at ovirt.org
Subject: Re: [ovirt-users] hosted-storage import fails on hyperconverged glusterFS



On Fri, Jan 20, 2017 at 3:01 PM, Liebe, André-Sebastian <andre.liebe at gematik.de> wrote:
Hello List,

I ran into trouble after moving our hosted engine from NFS to hyperconverged glusterFS via the backup/restore procedure[1]. The engine logs that it can't import and activate the hosted-storage domain, although I can see the storage.
Any hints on how to fix this?

- I created the ha-replica-3 gluster volume prior to hosted-engine-setup, using the hosts' short names.
- Then I ran hosted-engine-setup to install a new hosted engine (by installing CentOS 7 and ovirt-engine manually).
- Inside the new hosted engine I restored the last successful backup (which was taken in running state).
- Then I connected to the engine database and removed the old hosted engine by hand (as part of this patch would do: https://gerrit.ovirt.org/#/c/64966/), as well as all known hosts (after marking all their VMs as down; I later got ETL error messages for this).

Did you also clean up the old HE storage domain? The error further down indicates that engine has a reference to the HE storage domain.
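One way to check whether the engine still references the old HE storage domain is to query the engine database directly. This is a hedged sketch: the table and column names below are taken from the oVirt 4.x engine schema, and `engine` is the default database name; adjust to your installation.

```shell
# On the engine VM: list every storage domain the engine still knows about.
# A leftover hosted_storage entry from the old NFS setup would show up here
# even if the domain no longer exists on disk.
sudo -u postgres psql engine -c \
  "SELECT id, storage_name, storage_domain_type FROM storage_domain_static;"
```

If an obsolete hosted_storage row appears, that would explain the ACTION_TYPE_FAILED_STORAGE_DOMAIN_ALREADY_EXIST validation failure below.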
- Then I finished the engine installation by running engine-setup inside the hosted engine.
- And finally I completed the hosted-engine-setup.
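For reference, the sequence above roughly corresponds to the following commands. This is a sketch, not a verified procedure: the volume name `engine`, the brick path `/data/gluster/0/brick`, and the backup file name are placeholders based on this setup.

```shell
# 1. Create the replica-3 volume before hosted-engine-setup.
#    Note: short host names here are what later triggers the
#    "could not associate brick" warnings in the engine log.
gluster volume create engine replica 3 \
  lvh2:/data/gluster/0/brick \
  lvh3:/data/gluster/0/brick \
  lvh4:/data/gluster/0/brick
gluster volume start engine

# 2. Deploy a fresh hosted engine on the first host.
hosted-engine --deploy

# 3. Inside the new engine VM: restore the last successful backup,
#    then complete the configuration.
engine-backup --mode=restore --file=engine-backup.tar.gz \
  --log=restore.log --provision-db --restore-permissions
engine-setup
```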


The new hosted engine came up successfully with all previously known storage. After enabling glusterFS on the cluster this HA host is part of, I could see the volume in the Volumes and Storage tabs. After adding the remaining two hosts, the volume was marked as active.

But here's the error message I've gotten repeatedly since then:
> 2017-01-19 08:49:36,652 WARN  [org.ovirt.engine.core.bll.storage.domain.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-6-thread-10) [3b955ecd] Validation of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_ALREADY_EXIST


There are also some repeating messages about this ha-replica-3 volume, because I used the hosts' short names at volume creation, which AFAIK I can't change without a complete cluster shutdown.
> 2017-01-19 08:48:03,134 INFO  [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand] (DefaultQuartzScheduler3) [7471d7de] Running command: AddUnmanagedVmsCommand internal: true.
> 2017-01-19 08:48:03,134 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (DefaultQuartzScheduler3) [7471d7de] START, FullListVDSCommand(HostName = , FullListVDSCommandParameters:{runAsync='true', hostId='f62c7d04-9c95-453f-92d5-6dabf9da874a', vds='Host[,f62c7d04-9c95-453f-92d5-6dabf9da874a]', vmIds='[dfea96e8-e94a-407e-af46-3019fd3f2991]'}), log id: 2d0941f9
> 2017-01-19 08:48:03,163 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (DefaultQuartzScheduler3) [7471d7de] FINISH, FullListVDSCommand, return: [{guestFQDN=, emulatedMachine=pc, pid=0, guestDiskMapping={}, devices=[Ljava.lang.Object;@4181d938, cpuType=Haswell-noTSX, smp=2, vmType=kvm, memSize=8192, vmName=HostedEngine, username=, exitMessage=XML error: maximum vcpus count must be an integer, vmId=dfea96e8-e94a-407e-af46-3019fd3f2991, displayIp=0, displayPort=-1, guestIPs=, spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir, exitCode=1, nicModel=rtl8139,pv, exitReason=1, status=Down, maxVCpus=None, clientIp=, statusTime=6675071780, display=vnc, displaySecurePort=-1}], log id: 2d0941f9
> 2017-01-19 08:48:03,163 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (DefaultQuartzScheduler3) [7471d7de] null architecture type, replacing with x86_64, %s
> 2017-01-19 08:48:17,779 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler3) [7471d7de] START, GlusterServersListVDSCommand(HostName = lvh2, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='23297fc2-db12-4778-a5ff-b74d6fc9554b'}), log id: 57d029dc
> 2017-01-19 08:48:18,177 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler3) [7471d7de] FINISH, GlusterServersListVDSCommand, return: [172.31.1.22/24:CONNECTED, lvh3.lab.gematik.de:CONNECTED, lvh4.lab.gematik.de:CONNECTED], log id: 57d029dc
> 2017-01-19 08:48:18,180 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler3) [7471d7de] START, GlusterVolumesListVDSCommand(HostName = lvh2, GlusterVolumesListVDSParameters:{runAsync='true', hostId='23297fc2-db12-4778-a5ff-b74d6fc9554b'}), log id: 5cd11a39
> 2017-01-19 08:48:18,282 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler3) [7471d7de] Could not associate brick 'lvh2:/data/gluster/0/brick' of volume '7dc6410d-8f2a-406c-812a-8235fa6f721c' with correct network as no gluster network found in cluster '57ff41c2-0297-039d-039c-000000000362'
> 2017-01-19 08:48:18,284 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler3) [7471d7de] Could not associate brick 'lvh3:/data/gluster/0/brick' of volume '7dc6410d-8f2a-406c-812a-8235fa6f721c' with correct network as no gluster network found in cluster '57ff41c2-0297-039d-039c-000000000362'
> 2017-01-19 08:48:18,285 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler3) [7471d7de] Could not associate brick 'lvh4:/data/gluster/0/brick' of volume '7dc6410d-8f2a-406c-812a-8235fa6f721c' with correct network as no gluster network found in cluster '57ff41c2-0297-039d-039c-000000000362'

To get rid of these errors, the VM running the engine needs to be able to resolve the short names to the correct interface (the one associated with the gluster network).
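A low-tech way to make the short brick names resolvable from the engine VM is one /etc/hosts entry per host. The addresses below are examples extrapolated from the 172.31.1.22 peer in the log above; substitute the real gluster-network addresses.

```
# /etc/hosts on the engine VM: map the short names used at volume
# creation to the gluster-network addresses (addresses are examples)
172.31.1.22   lvh2
172.31.1.23   lvh3
172.31.1.24   lvh4
```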

> 2017-01-19 08:48:18,286 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler3) [7471d7de] FINISH, GlusterVolumesListVDSCommand, return: {7dc6410d-8f2a-406c-812a-8235fa6f721c=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at 2459e231, 000391cc-9946-47b3-82c9-af17da69d182=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at 42990505}, log id: 5cd11a39


[1] http://www.ovirt.org/documentation/admin-guide/hosted-engine-backup-and-restore appears to have moved to http://www.ovirt.org/documentation/self-hosted/chap-Backing_up_and_Restoring_an_EL-Based_Self-Hosted_Environment/#creating-a-new-self-hosted-engine-environment-to-be-used-as-the-restored-environment

sincerely,
André-Sebastian Liebe

Technik / Innovation
Telefon: +49 30 40041-197
Telefax: +49 30 40041-111

gematik- Gesellschaft für Telematikanwendungen der Gesundheitskarte mbH
Friedrichstraße 136, 10117 Berlin
Amtsgericht Berlin-Charlottenburg HRB 96351 B
Geschäftsführer: Alexander Beyer

_______________________________________________
Users mailing list
Users at ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
