[ovirt-users] HostedEngine VM not visible, but running

cmc iucounu at gmail.com
Thu Jun 22 10:19:31 UTC 2017


Hi Martin,

>
> just as a random comment, do you still have the database backup from
> the bare metal -> VM attempt? It might be possible to just try again
> using it. Or in the worst case.. update the offending value there
> before restoring it to the new engine instance.

I still have the backup. I'd rather do the latter, as re-running the
HE deployment is quite lengthy and involved (I have to re-initialise
the FC storage each time). Do you know what the offending value(s)
would be? Would it be in the Postgres DB or in a config file
somewhere?

Cheers,

Cam

> Regards
>
> Martin Sivak
>
> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iucounu at gmail.com> wrote:
>> Hi Yanir,
>>
>> Thanks for the reply.
>>
>>> First of all, maybe this warning set off a chain reaction:
>>> WARN  [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>>> failed for user SYSTEM. Reasons:
>>> VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>> It may have prevented the hosted engine VM from being set up correctly, and
>>> further actions were then taken while the hosted engine VM wasn't in a
>>> stable state.
>>>
>>> For now, are you trying to revert to a previous/initial state?
>>
>> I'm not trying to revert it to a previous state for now. This was a
>> migration from a bare metal engine, and it didn't report any error
>> during the migration. I'd had some problems on my first attempts at
>> this migration, whereby it never completed (due to a proxy issue) but
>> I managed to resolve this. Do you know of a way to get the Hosted
>> Engine VM into a stable state, without rebuilding the entire cluster
>> from scratch (since I have a lot of VMs on it)?
>>
>> Thanks for any help.
>>
>> Regards,
>>
>> Cam
>>
>>> Regards,
>>> Yanir
>>>
>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iucounu at gmail.com> wrote:
>>>>
>>>> Hi Jenny/Martin,
>>>>
>>>> Any idea what I can do here? The hosted engine VM has no log in
>>>> /var/log/libvirt/qemu on any host. I fear that if I need to put the host
>>>> I deployed it on (which I think is hosting it) into maintenance, e.g. to
>>>> upgrade it, or if that host fails for any reason, the VM won't get
>>>> migrated to another host and I will not be able to manage the
>>>> cluster. It seems to be a very dangerous position to be in.
>>>>
>>>> Thanks,
>>>>
>>>> Cam
>>>>
>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iucounu at gmail.com> wrote:
>>>> > Thanks Martin. The hosts are all part of the same cluster.
>>>> >
>>>> > I get these errors in the engine.log on the engine:
>>>> >
>>>> > 2017-06-19 03:28:05,030Z WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm' failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>> > 2017-06-19 03:28:05,030Z INFO [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>, HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]', sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM, ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>>>> > 2017-06-19 03:28:05,030Z ERROR [org.ovirt.engine.core.bll.HostedEngineImporter] (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted Engine VM
>>>> >
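For anyone scanning a large engine.log for the same failure, here is a minimal Python sketch that pulls the action, user and failure reasons out of a "Validation of action ... failed" line. The sample line is copied from the excerpt above; the function name and the treatment of VAR__-prefixed tokens as message variables rather than failures are assumptions made for illustration, not something confirmed in this thread.

```python
import re

# Sample line copied from the engine.log excerpt quoted in this thread.
LINE = (
    "2017-06-19 03:28:05,030Z WARN "
    "[org.ovirt.engine.core.bll.exportimport.ImportVmCommand] "
    "(org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm' "
    "failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,"
    "ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS"
)

PATTERN = re.compile(
    r"Validation of action '(?P<action>[^']+)' "
    r"failed for user (?P<user>\w+)\. Reasons: (?P<reasons>.+)$"
)


def validation_failure(line):
    """Return action, user and failure reasons, or None if the line does not match.

    Hypothetical helper: tokens starting with VAR__ are assumed to be
    message variables and are reported separately from the failures.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    tokens = [t.strip() for t in match.group("reasons").split(",") if t.strip()]
    return {
        "action": match.group("action"),
        "user": match.group("user"),
        "variables": [t for t in tokens if t.startswith("VAR__")],
        "failures": [t for t in tokens if not t.startswith("VAR__")],
    }


if __name__ == "__main__":
    print(validation_failure(LINE))
```

Running it over each line of the log and keeping the non-None results gives a quick list of which validations failed and why.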
>>>> > The sanlock.log reports conflicts on that same host, and a different
>>>> > error on the other hosts, not sure if they are related.
>>>> >
>>>> > And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host
>>>> > which I deployed the hosted engine VM on:
>>>> >
>>>> > MainThread::ERROR::2017-06-19 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Unable to extract HEVM OVF
>>>> > MainThread::ERROR::2017-06-19 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
>>>> >
>>>> > I've seen some of these issues reported in bugzilla, but they were for
>>>> > older versions of oVirt (and appear to be resolved).
>>>> >
>>>> > I will install that package on the other two hosts. I will put them
>>>> > into maintenance first, as vdsm gets upgraded as part of the install.
>>>> > I guess restarting vdsm is a good idea after that?
>>>> >
>>>> > Thanks,
>>>> >
>>>> > Campbell
>>>> >
>>>> > On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <msivak at redhat.com> wrote:
>>>> >> Hi,
>>>> >>
>>>> >> you do not have to install it on all hosts. But you should have more
>>>> >> than one and ideally all hosted engine enabled nodes should belong to
>>>> >> the same engine cluster.
>>>> >>
>>>> >> Best regards
>>>> >>
>>>> >> Martin Sivak
>>>> >>
>>>> >> On Wed, Jun 21, 2017 at 11:29 AM, cmc <iucounu at gmail.com> wrote:
>>>> >>> Hi Jenny,
>>>> >>>
>>>> >>> Does ovirt-hosted-engine-ha need to be installed across all hosts?
>>>> >>> Could that be the reason it is failing to see it properly?
>>>> >>>
>>>> >>> Thanks,
>>>> >>>
>>>> >>> Cam
>>>> >>>
>>>> >>> On Mon, Jun 19, 2017 at 1:27 PM, cmc <iucounu at gmail.com> wrote:
>>>> >>>> Hi Jenny,
>>>> >>>>
>>>> >>>> Logs are attached. I can see errors in there, but am unsure how they arose.
>>>> >>>>
>>>> >>>> Thanks,
>>>> >>>>
>>>> >>>> Campbell
>>>> >>>>
>>>> >>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <etokar at redhat.com> wrote:
>>>> >>>>> From the output it looks like the agent is down. Try starting it by
>>>> >>>>> running:
>>>> >>>>> systemctl start ovirt-ha-agent
>>>> >>>>>
>>>> >>>>> The engine is supposed to see the hosted engine storage domain and
>>>> >>>>> import it into the system; then it should import the hosted engine VM.
>>>> >>>>>
>>>> >>>>> Can you attach the agent log from the host
>>>> >>>>> (/var/log/ovirt-hosted-engine-ha/agent.log) and the engine log from
>>>> >>>>> the engine VM (/var/log/ovirt-engine/engine.log)?
>>>> >>>>>
>>>> >>>>> Thanks,
>>>> >>>>> Jenny
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <iucounu at gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>>  Hi Jenny,
>>>> >>>>>>
>>>> >>>>>> > What version are you running?
>>>> >>>>>>
>>>> >>>>>> 4.1.2.2-1.el7.centos
>>>> >>>>>>
>>>> >>>>>> > For the hosted engine vm to be imported and displayed in the engine,
>>>> >>>>>> > you must first create a master storage domain.
>>>> >>>>>>
>>>> >>>>>> To provide a bit more detail: this was a migration of a bare-metal
>>>> >>>>>> engine in an existing cluster to a hosted engine VM for that cluster.
>>>> >>>>>> As part of this migration, I built an entirely new host and ran
>>>> >>>>>> 'hosted-engine --deploy', following these instructions:
>>>> >>>>>>
>>>> >>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/
>>>> >>>>>>
>>>> >>>>>> I restored the backup from the engine and it completed without any
>>>> >>>>>> errors. I didn't see any instructions regarding a master storage
>>>> >>>>>> domain in the page above. The cluster has two existing master storage
>>>> >>>>>> domains: one is fibre channel, which is up, and one is an ISO domain,
>>>> >>>>>> which is currently offline.
>>>> >>>>>>
>>>> >>>>>> > What do you mean the hosted engine commands are failing? What happens
>>>> >>>>>> > when you run hosted-engine --vm-status now?
>>>> >>>>>>
>>>> >>>>>> Interestingly, whereas when I ran it before it exited with no output
>>>> >>>>>> and a return code of '1', it now reports:
>>>> >>>>>>
>>>> >>>>>> --== Host 1 status ==--
>>>> >>>>>>
>>>> >>>>>> conf_on_shared_storage             : True
>>>> >>>>>> Status up-to-date                  : False
>>>> >>>>>> Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
>>>> >>>>>> Host ID                            : 1
>>>> >>>>>> Engine status                      : unknown stale-data
>>>> >>>>>> Score                              : 0
>>>> >>>>>> stopped                            : True
>>>> >>>>>> Local maintenance                  : False
>>>> >>>>>> crc32                              : 0217f07b
>>>> >>>>>> local_conf_timestamp               : 2911
>>>> >>>>>> Host timestamp                     : 2897
>>>> >>>>>> Extra metadata (valid at timestamp):
>>>> >>>>>>         metadata_parse_version=1
>>>> >>>>>>         metadata_feature_version=1
>>>> >>>>>>         timestamp=2897 (Thu Jun 15 16:22:54 2017)
>>>> >>>>>>         host-id=1
>>>> >>>>>>         score=0
>>>> >>>>>>         vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>>>> >>>>>>         conf_on_shared_storage=True
>>>> >>>>>>         maintenance=False
>>>> >>>>>>         state=AgentStopped
>>>> >>>>>>         stopped=True
>>>> >>>>>>
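As a rough aid for checking this kind of output across several hosts, the sketch below parses the "key : value" lines of 'hosted-engine --vm-status' into a dictionary per host and flags stale data. The sample text is an abridged copy of the output above; the helper names (parse_status, is_stale) and the staleness rule are illustrative assumptions, not part of the hosted-engine tooling.

```python
import re

# Abridged copy of the 'hosted-engine --vm-status' output quoted in this thread.
SAMPLE = """\
--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Local maintenance                  : False
"""

HEADER = re.compile(r"--== Host (\S+) status ==--")


def parse_status(text):
    """Map each host number to a dict of its 'key : value' status fields."""
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = hosts.setdefault(header.group(1), {})
            continue
        # Skip blank lines and the 'Extra metadata' block, which has no ' : '.
        if current is None or " : " not in line:
            continue
        key, _, value = line.partition(" : ")
        current[key.strip()] = value.strip()
    return hosts


def is_stale(fields):
    """Assumed rule: stale if not up to date or the engine status says so."""
    return (
        fields.get("Status up-to-date") == "False"
        or "stale-data" in fields.get("Engine status", "")
    )


if __name__ == "__main__":
    for host, fields in parse_status(SAMPLE).items():
        print(host, fields["Hostname"], "stale" if is_stale(fields) else "ok")
```

A host reported as stale here matches the situation described in this message, where the agent is stopped and the score is 0.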
>>>> >>>>>> Yet I can log in to the web GUI fine. I guess it is not HA because it
>>>> >>>>>> is currently in an unknown state? Does the hosted-engine-ha rpm need
>>>> >>>>>> to be installed across all nodes in the cluster, btw?
>>>> >>>>>>
>>>> >>>>>> Thanks for the help,
>>>> >>>>>>
>>>> >>>>>> Cam
>>>> >>>>>>
>>>> >>>>>> >
>>>> >>>>>> > Jenny Tokar
>>>> >>>>>> >
>>>> >>>>>> >
>>>> >>>>>> > On Thu, Jun 15, 2017 at 6:32 PM, cmc <iucounu at gmail.com> wrote:
>>>> >>>>>> >>
>>>> >>>>>> >> Hi,
>>>> >>>>>> >>
>>>> >>>>>> >> I've migrated from a bare-metal engine to a hosted engine. There were
>>>> >>>>>> >> no errors during the install; however, the hosted engine did not get
>>>> >>>>>> >> started. I tried running:
>>>> >>>>>> >>
>>>> >>>>>> >> hosted-engine --status
>>>> >>>>>> >>
>>>> >>>>>> >> on the host I deployed it on, and it returned nothing (the exit code
>>>> >>>>>> >> was 1, however). I could not ping it either. So I tried starting it
>>>> >>>>>> >> via 'hosted-engine --vm-start' and it returned:
>>>> >>>>>> >>
>>>> >>>>>> >> Virtual machine does not exist
>>>> >>>>>> >>
>>>> >>>>>> >> But it then became available. I logged into it successfully. It is
>>>> >>>>>> >> not in the list of VMs, however.
>>>> >>>>>> >>
>>>> >>>>>> >> Any ideas why the hosted-engine commands fail, and why it is not in
>>>> >>>>>> >> the list of virtual machines?
>>>> >>>>>> >>
>>>> >>>>>> >> Thanks for any help,
>>>> >>>>>> >>
>>>> >>>>>> >> Cam
>>>> >>>>>> >> _______________________________________________
>>>> >>>>>> >> Users mailing list
>>>> >>>>>> >> Users at ovirt.org
>>>> >>>>>> >> http://lists.ovirt.org/mailman/listinfo/users