Re: [ovirt-users] HostedEngine VM not visible, but running

27 Jun 2017

Hi Martin,

Thanks for the reply. I have done this, and the deployment completed
without error. However, it still will not allow the Hosted Engine to
migrate to another host. The
/etc/ovirt-hosted-engine/hosted-engine.conf got created ok on the host
I re-installed, but the ovirt-ha-broker.service, though it starts,
reports:

--------------------8<-------------------

Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker ERROR Failed to read metadata from /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 129, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata'

--------------------8<-------------------

I checked the path, and it exists. I can run 'less -f' on it fine. The
perms are slightly different on the host that is running the VM vs the
one that is reporting errors (600 vs 660), ownership is vdsm:qemu. Is
this a san locking issue?
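
One way to narrow down the ENOENT above is to check whether
hosted-engine.metadata is a symbolic link whose target is missing on
this host, and to repeat the open with the same flags as in the
traceback. The sketch below is a hypothetical diagnostic, not part of
ovirt-hosted-engine-ha: the function names are made up for
illustration, and it assumes it is run as a user that is allowed to
read the path (for example vdsm).

```python
import errno
import os
import sys


def classify_path(path):
    """Return a short label describing the state of `path`."""
    if not os.path.lexists(path):
        return "missing"            # no directory entry at all
    if os.path.islink(path) and not os.path.exists(path):
        return "dangling-symlink"   # the link exists, its target does not
    return "present"


def try_direct_read_open(path):
    """Open `path` with the flags from the traceback.

    Returns None on success, otherwise the errno name (e.g. "ENOENT").
    """
    # O_DIRECT is platform-specific, so fall back to 0 where it is absent.
    direct_flag = getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None


if __name__ == "__main__":
    # Usage: python check_metadata.py /path/to/hosted-engine.metadata
    for candidate in sys.argv[1:]:
        print(candidate, classify_path(candidate), try_direct_read_open(candidate))
```

If the first label is "dangling-symlink", the file name is there but
whatever it points to is not available on this host, which would
explain an ENOENT from the broker even though the name shows up in a
directory listing.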

Thanks for any help,

Cam

On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak <msivak@redhat.com> wrote:
...
...
> Should it be? It was not in the instructions for the migration from
> bare-metal to Hosted VM
The hosted engine will only migrate to hosts that have the services
running. Please put one other host to maintenance and select Hosted
engine action: DEPLOY in the reinstall dialog.
Best regards
Martin Sivak
On Tue, Jun 27, 2017 at 1:23 PM, cmc <iucounu@gmail.com> wrote:
...
I changed the 'os.other.devices.display.protocols.value.3.6 =
spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display protocols
as 4 and the hosted engine now appears in the list of VMs. I am
guessing the compatibility version was causing it to use the 3.6
version. However, I am still unable to migrate the engine VM to
another host. When I try putting the host it is currently on into
maintenance, it reports:
Error while executing action: Cannot switch the Host(s) to Maintenance mode.
There are no available hosts capable of running the engine VM.
Running 'hosted-engine --vm-status' still shows 'Engine status:
unknown stale-data'.
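
For checking several hosts at once, the text output of
'hosted-engine --vm-status' can be scanned for stale entries. The
following is a minimal sketch under the assumption that the output
keeps the "--== Host N status ==--" headers and "name : value" lines
seen elsewhere in this thread; the function names and the trimmed
sample are illustrative only.

```python
import re

HOST_HEADER = re.compile(r"^--== Host (\d+) status ==--$")
# Top-level fields look like "Score      : 0". Indented "extra metadata"
# lines and lines without whitespace before the colon are skipped.
FIELD = re.compile(r"^(\S[^:]*?)\s+:\s*(.*)$")


def parse_vm_status(text):
    """Map host number -> {field name: value} for the top-level fields."""
    hosts = {}
    current = None
    for line in text.splitlines():
        header = HOST_HEADER.match(line.strip())
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2).strip()
    return hosts


def stale_hosts(hosts):
    """Return the host numbers whose status is not up to date."""
    return sorted(
        number
        for number, fields in hosts.items()
        if fields.get("Status up-to-date") == "False"
        or "stale-data" in fields.get("Engine status", "")
    )


# Trimmed sample based on the output quoted in this thread.
SAMPLE = """\
--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Extra metadata (valid at timestamp):
        timestamp=2897 (Thu Jun 15 16:22:54 2017)
        state=AgentStopped
"""

if __name__ == "__main__":
    print(stale_hosts(parse_vm_status(SAMPLE)))
```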
The ovirt-ha-broker service is only running on one host. It was set to
'disabled' in systemd. It won't start as there is no
/etc/ovirt-hosted-engine/hosted-engine.conf on the other two hosts.
Should it be? It was not in the instructions for the migration from
bare-metal to Hosted VM
Thanks,
Cam
On Thu, Jun 22, 2017 at 1:07 PM, cmc <iucounu@gmail.com> wrote:
...
Hi Tomas,
So in my /usr/share/ovirt-engine/conf/osinfo-defaults.properties on my
engine VM, I have:
os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
That seems to match - I assume since this is 4.1, the 3.6 should not apply
Is there somewhere else I should be looking?
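
To illustrate why the 3.6 line can still matter on a 4.1 engine, here
is a simplified model of the lookup. It assumes that a key with a
version suffix overrides the unsuffixed default for VMs or clusters at
that compatibility version; the engine's real resolution logic may
differ, and the function names below are hypothetical.

```python
def parse_properties(text):
    """Parse simple "key = value" lines into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def display_protocols(props, os_name, compat_version):
    """Return the display/video pairs for an OS at a compatibility version.

    Assumption: a version-specific key wins over the unsuffixed default.
    """
    base = "os.%s.devices.display.protocols.value" % os_name
    raw = props.get("%s.%s" % (base, compat_version), props.get(base, ""))
    return [pair.strip() for pair in raw.split(",") if pair.strip()]


# The two lines quoted above from osinfo-defaults.properties.
OSINFO = """\
os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
"""

if __name__ == "__main__":
    props = parse_properties(OSINFO)
    for version in ("3.6", "4.1"):
        allowed = display_protocols(props, "other", version)
        print(version, "vnc/vga allowed:", "vnc/vga" in allowed)
```

Under this model a VM using vnc/vga passes the check at 4.1 but not at
3.6, so the compatibility version in effect for the imported VM is
worth checking as well as the engine version.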
Thanks,
Cam
On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek <tjelinek@redhat.com> wrote:
...
On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek
<michal.skrivanek@redhat.com> wrote:
...
...
On 22 Jun 2017, at 12:31, Martin Sivak <msivak@redhat.com> wrote:
Tomas, what fields are needed in a VM to pass the check that causes
the following error?
>>>> WARN  [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>> 'ImportVm'
>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>>>
>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
to match the OS and VM Display type;-)
Configuration is in osinfo, e.g. if that is an import from older releases on
Linux, this is typically caused by the change of cirrus to vga for non-SPICE
VMs
yep, the default supported combinations for 4.0+ are these:
os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
...
...
Thanks.
On Thu, Jun 22, 2017 at 12:19 PM, cmc <iucounu@gmail.com> wrote:
> Hi Martin,
>
>>
>> just as a random comment, do you still have the database backup from
>> the bare metal -> VM attempt? It might be possible to just try again
>> using it. Or in the worst case.. update the offending value there
>> before restoring it to the new engine instance.
>
> I still have the backup. I'd rather do the latter, as re-running the
> HE deployment is quite lengthy and involved (I have to re-initialise
> the FC storage each time). Do you know what the offending value(s)
> would be? Would it be in the Postgres DB or in a config file
> somewhere?
>
> Cheers,
>
> Cam
>
>> Regards
>>
>> Martin Sivak
>>
>> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iucounu@gmail.com> wrote:
>>> Hi Yanir,
>>>
>>> Thanks for the reply.
>>>
>>>> First of all, maybe a chain reaction of :
>>>> WARN  [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>> 'ImportVm'
>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>>>
>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>> is causing the hosted engine vm not to be set up correctly, and further
>>>> actions were made when the hosted engine vm wasn't in a stable state.
>>>>
>>>> As for now, are you trying to revert back to a previous/initial
>>>> state ?
>>>
>>> I'm not trying to revert it to a previous state for now. This was a
>>> migration from a bare metal engine, and it didn't report any error
>>> during the migration. I'd had some problems on my first attempts at
>>> this migration, whereby it never completed (due to a proxy issue) but
>>> I managed to resolve this. Do you know of a way to get the Hosted
>>> Engine VM into a stable state, without rebuilding the entire cluster
>>> from scratch (since I have a lot of VMs on it)?
>>>
>>> Thanks for any help.
>>>
>>> Regards,
>>>
>>> Cam
>>>
>>>> Regards,
>>>> Yanir
>>>>
>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iucounu@gmail.com> wrote:
>>>>>
>>>>> Hi Jenny/Martin,
>>>>>
>>>>> Any idea what I can do here? The hosted engine VM has no log on any
>>>>> host in /var/log/libvirt/qemu. I fear that if I need to put the host
>>>>> I created it on (which I think is hosting it) into maintenance, e.g.
>>>>> to upgrade it, or if that host fails for any reason, the VM won't be
>>>>> migrated to another host and I will not be able to manage the
>>>>> cluster. It seems to be a very dangerous position to be in.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Cam
>>>>>
>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iucounu@gmail.com> wrote:
>>>>>> Thanks Martin. The hosts are all part of the same cluster.
>>>>>>
>>>>>> I get these errors in the engine.log on the engine:
>>>>>>
>>>>>> 2017-06-19 03:28:05,030Z WARN
>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>> 'ImportVm' failed for user SYSTEM. Reasons:
>>>>>> VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>> 2017-06-19 03:28:05,030Z INFO
>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object
>>>>>> 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM,
>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>,
>>>>>> HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]',
>>>>>> sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM,
>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>>>>>> 2017-06-19 03:28:05,030Z ERROR
>>>>>> [org.ovirt.engine.core.bll.HostedEngineImporter]
>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted
>>>>>> Engine VM
>>>>>>
>>>>>> The sanlock.log reports conflicts on that same host, and a
>>>>>> different
>>>>>> error on the other hosts, not sure if they are related.
>>>>>>
>>>>>> And this in the /var/log/ovirt-hosted-engine-ha/agent log on the
>>>>>> host
>>>>>> which I deployed the hosted engine VM on:
>>>>>>
>>>>>> MainThread::ERROR::2017-06-19
>>>>>> 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> Unable to extract HEVM OVF
>>>>>> MainThread::ERROR::2017-06-19
>>>>>> 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Failed extracting VM OVF from the OVF_STORE volume, falling back to
>>>>>> initial vm.conf
>>>>>>
>>>>>> I've seen some of these issues reported in bugzilla, but they were
>>>>>> for
>>>>>> older versions of oVirt (and appear to be resolved).
>>>>>>
>>>>>> I will install that package on the other two hosts. I will put
>>>>>> them into maintenance first, as vdsm is installed as an upgrade.
>>>>>> I guess restarting vdsm is a good idea after that?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Campbell
>>>>>>
>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <msivak@redhat.com>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> you do not have to install it on all hosts. But you should have
>>>>>>> more
>>>>>>> than one and ideally all hosted engine enabled nodes should
>>>>>>> belong to
>>>>>>> the same engine cluster.
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>> Martin Sivak
>>>>>>>
>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc <iucounu@gmail.com> wrote:
>>>>>>>> Hi Jenny,
>>>>>>>>
>>>>>>>> Does ovirt-hosted-engine-ha need to be installed across all
>>>>>>>> hosts?
>>>>>>>> Could that be the reason it is failing to see it properly?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Cam
>>>>>>>>
>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc <iucounu@gmail.com> wrote:
>>>>>>>>> Hi Jenny,
>>>>>>>>>
>>>>>>>>> Logs are attached. I can see errors in there, but am unsure how
>>>>>>>>> they
>>>>>>>>> arose.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Campbell
>>>>>>>>>
>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar
>>>>>>>>> <etokar@redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>> From the output it looks like the agent is down, try starting
>>>>>>>>>> it by
>>>>>>>>>> running:
>>>>>>>>>> systemctl start ovirt-ha-agent.
>>>>>>>>>>
>>>>>>>>>> The engine is supposed to see the hosted engine storage domain
>>>>>>>>>> and
>>>>>>>>>> import it
>>>>>>>>>> to the system, then it should import the hosted engine vm.
>>>>>>>>>>
>>>>>>>>>> Can you attach the agent log from the host
>>>>>>>>>> (/var/log/ovirt-hosted-engine-ha/agent.log)
>>>>>>>>>> and the engine log from the engine vm
>>>>>>>>>> (/var/log/ovirt-engine/engine.log)?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jenny
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <iucounu@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>
>>>>>>>>>>>> What version are you running?
>>>>>>>>>>>
>>>>>>>>>>> 4.1.2.2-1.el7.centos
>>>>>>>>>>>
>>>>>>>>>>>> For the hosted engine vm to be imported and displayed in the
>>>>>>>>>>>> engine, you
>>>>>>>>>>>> must first create a master storage domain.
>>>>>>>>>>>
>>>>>>>>>>> To provide a bit more detail: this was a migration of a
>>>>>>>>>>> bare-metal
>>>>>>>>>>> engine in an existing cluster to a hosted engine VM for that
>>>>>>>>>>> cluster.
>>>>>>>>>>> As part of this migration, I built an entirely new host and
>>>>>>>>>>> ran
>>>>>>>>>>> 'hosted-engine --deploy' (followed these instructions:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Meta...).
>>>>>>>>>>> I restored the backup from the engine and it completed
>>>>>>>>>>> without any
>>>>>>>>>>> errors. I didn't see any instructions regarding a master
>>>>>>>>>>> storage
>>>>>>>>>>> domain in the page above. The cluster has two existing master
>>>>>>>>>>> storage
>>>>>>>>>>> domains, one is fibre channel, which is up, and one ISO
>>>>>>>>>>> domain,
>>>>>>>>>>> which
>>>>>>>>>>> is currently offline.
>>>>>>>>>>>
>>>>>>>>>>>> What do you mean the hosted engine commands are failing?
>>>>>>>>>>>> What
>>>>>>>>>>>> happens
>>>>>>>>>>>> when
>>>>>>>>>>>> you run hosted-engine --vm-status now?
>>>>>>>>>>>
>>>>>>>>>>> Interestingly, whereas when I ran it before, it exited with
>>>>>>>>>>> no
>>>>>>>>>>> output
>>>>>>>>>>> and a return code of '1', it now reports:
>>>>>>>>>>>
>>>>>>>>>>> --== Host 1 status ==--
>>>>>>>>>>>
>>>>>>>>>>> conf_on_shared_storage             : True
>>>>>>>>>>> Status up-to-date                  : False
>>>>>>>>>>> Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
>>>>>>>>>>> Host ID                            : 1
>>>>>>>>>>> Engine status                      : unknown stale-data
>>>>>>>>>>> Score                              : 0
>>>>>>>>>>> stopped                            : True
>>>>>>>>>>> Local maintenance                  : False
>>>>>>>>>>> crc32                              : 0217f07b
>>>>>>>>>>> local_conf_timestamp               : 2911
>>>>>>>>>>> Host timestamp                     : 2897
>>>>>>>>>>> Extra metadata (valid at timestamp):
>>>>>>>>>>>        metadata_parse_version=1
>>>>>>>>>>>        metadata_feature_version=1
>>>>>>>>>>>        timestamp=2897 (Thu Jun 15 16:22:54 2017)
>>>>>>>>>>>        host-id=1
>>>>>>>>>>>        score=0
>>>>>>>>>>>        vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>>>>>>>>>>>        conf_on_shared_storage=True
>>>>>>>>>>>        maintenance=False
>>>>>>>>>>>        state=AgentStopped
>>>>>>>>>>>        stopped=True
>>>>>>>>>>>
>>>>>>>>>>> Yet I can login to the web GUI fine. I guess it is not HA due
>>>>>>>>>>> to
>>>>>>>>>>> being
>>>>>>>>>>> in an unknown state currently? Does the hosted-engine-ha rpm
>>>>>>>>>>> need
>>>>>>>>>>> to
>>>>>>>>>>> be installed across all nodes in the cluster, btw?
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the help,
>>>>>>>>>>>
>>>>>>>>>>> Cam
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Jenny Tokar
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc <iucounu@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've migrated from a bare-metal engine to a hosted engine.
>>>>>>>>>>>>> There
>>>>>>>>>>>>> were
>>>>>>>>>>>>> no errors during the install, however, the hosted engine
>>>>>>>>>>>>> did not
>>>>>>>>>>>>> get
>>>>>>>>>>>>> started. I tried running:
>>>>>>>>>>>>>
>>>>>>>>>>>>> hosted-engine --status
>>>>>>>>>>>>>
>>>>>>>>>>>>> on the host I deployed it on, and it returns nothing (exit
>>>>>>>>>>>>> code
>>>>>>>>>>>>> is 1
>>>>>>>>>>>>> however). I could not ping it either. So I tried starting
>>>>>>>>>>>>> it via
>>>>>>>>>>>>> 'hosted-engine --vm-start' and it returned:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Virtual machine does not exist
>>>>>>>>>>>>>
>>>>>>>>>>>>> But it then became available. I logged into it
>>>>>>>>>>>>> successfully. It
>>>>>>>>>>>>> is not
>>>>>>>>>>>>> in the list of VMs however.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas why the hosted-engine commands fail, and why it
>>>>>>>>>>>>> is not
>>>>>>>>>>>>> in
>>>>>>>>>>>>> the list of virtual machines?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cam
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Users mailing list
>>>>>>>>>>>>> Users@ovirt.org
>>>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users