Both services are up on all three hosts. The broker logs just report:
Thread-6549::INFO::2017-06-29 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
Thread-6549::INFO::2017-06-29 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
Thanks,
Cam
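For scanning a broker or agent log for anything above INFO, a minimal Python sketch might look like the following. It assumes the '::'-separated layout seen in the broker lines quoted in this message (thread, level, timestamp, module, line number, logger, then "(function) message"). The names BrokerLogEntry, parse_entry and problems are made up for illustration and are not part of ovirt-hosted-engine-ha.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class BrokerLogEntry(NamedTuple):
    """One parsed log line (hypothetical helper type, not an oVirt API)."""
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    function: str
    message: str


def parse_entry(line: str) -> Optional[BrokerLogEntry]:
    """Parse one '::'-separated log line; return None if it does not match."""
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, timestamp, module, lineno, logger, rest = parts
    # The last field looks like "(setup) Connection established".
    match = re.match(r"\((\w+)\)\s*(.*)", rest)
    if match is None or not lineno.isdigit():
        return None
    return BrokerLogEntry(thread, level, timestamp, module, int(lineno),
                          logger, match.group(1), match.group(2))


def problems(lines: Iterable[str]) -> Iterator[BrokerLogEntry]:
    """Yield only the entries logged at WARNING level or above."""
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry.level in ("WARNING", "ERROR", "CRITICAL"):
            yield entry
```

To use it, open the log file and iterate over problems(f). The broker log is assumed here to live next to agent.log under /var/log/ovirt-hosted-engine-ha/, which is a guess. Traceback continuation lines do not match the layout and are skipped.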
On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak <msivak(a)redhat.com> wrote:
Hi,
please make sure that both ovirt-ha-agent and ovirt-ha-broker services
are restarted and up. The error says the agent can't talk to the
broker. Is there anything in the broker.log?
Best regards
Martin Sivak
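To confirm on each host that both services are really up, a small Python sketch along these lines could be used. It shells out to `systemctl is-active`, which prints the unit state on stdout. The helper names unit_state and not_running are hypothetical, and the script only reports state; it does not restart anything.

```python
import subprocess

# The two services named in this thread.
HA_SERVICES = ("ovirt-ha-agent", "ovirt-ha-broker")


def unit_state(unit):
    """Return the state printed by `systemctl is-active` for one unit."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        # systemctl is not available on this machine.
        return "unknown"
    return result.stdout.strip() or "unknown"


def not_running(states):
    """Given {unit: state}, list the units whose state is not 'active'."""
    return sorted(unit for unit, state in states.items() if state != "active")


def main():
    states = {unit: unit_state(unit) for unit in HA_SERVICES}
    for unit, state in sorted(states.items()):
        print("%s: %s" % (unit, state))
    down = not_running(states)
    if down:
        print("not active: " + ", ".join(down))


if __name__ == "__main__":
    main()
```

Run it on every hosted-engine host. Any unit listed as not active is a candidate for `systemctl restart`.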
On Thu, Jun 29, 2017 at 4:42 PM, cmc <iucounu(a)gmail.com> wrote:
> I've restarted those two services across all hosts, have taken the
> Hosted Engine host out of maintenance, and when I try to migrate the
> Hosted Engine over to another host, it reports that all three hosts
> 'did not satisfy internal filter HA because it is not a Hosted Engine
> host'.
>
> On the host that the Hosted Engine is currently on it reports in the agent.log:
>
> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
> Connection closed: Connection closed
> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception
> getting service path: Connection closed
> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent
> call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>     return action(he)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
>     return he.start_monitoring()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 411, in start_monitoring
>     self._initialize_sanlock()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 691, in _initialize_sanlock
>     constants.SERVICE_TYPE + constants.LOCKSPACE_EXTENSION)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 162, in get_service_path
>     .format(str(e)))
> RequestError: Failed to get service path: Connection closed
> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>
> On Thu, Jun 29, 2017 at 1:25 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>> Hi,
>>
>> yep, you have to restart the ovirt-ha-agent and ovirt-ha-broker services.
>>
>> The scheduling message just means that the host has score 0 or is not
>> reporting score at all.
>>
>> Martin
>>
>> On Thu, Jun 29, 2017 at 1:33 PM, cmc <iucounu(a)gmail.com> wrote:
>>> Thanks Martin, do I have to restart anything? When I try to use the
>>> 'migrate' operation, it complains that the other two hosts 'did not
>>> satisfy internal filter HA because it is not a Hosted Engine host'
>>> (even though I reinstalled both these hosts with the 'deploy hosted
>>> engine' option), which suggests that something needs restarting. Should
>>> I worry about the sanlock errors, or will that be resolved by the
>>> change in host_id?
>>>
>>> Kind regards,
>>>
>>> Cam
>>>
>>> On Thu, Jun 29, 2017 at 12:22 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>> Change the ids so they are distinct. I need to check if there is a way
>>>> to read the SPM ids from the engine as using the same numbers would be
>>>> the best.
>>>>
>>>> Martin
>>>>
>>>>
>>>>
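A quick way to confirm that the ids are distinct is to collect the host_id value from each host's /etc/ovirt-hosted-engine/hosted-engine.conf and look for repeats. The Python sketch below assumes the file uses plain `host_id=N` lines, which is an assumption about the format. The helper names and the host names in the usage note are hypothetical.

```python
import re
from collections import defaultdict

# Assumed format: a line such as "host_id=2" somewhere in the file.
HOST_ID_RE = re.compile(r"^\s*host_id\s*=\s*(\d+)\s*$", re.MULTILINE)


def read_host_id(conf_text):
    """Return the host_id found in the config text, or None if absent."""
    match = HOST_ID_RE.search(conf_text)
    return int(match.group(1)) if match else None


def duplicate_host_ids(host_ids):
    """Given {hostname: host_id}, return {host_id: [hostnames]} for every
    id that is used by more than one host."""
    by_id = defaultdict(list)
    for host, host_id in host_ids.items():
        if host_id is not None:
            by_id[host_id].append(host)
    return {hid: sorted(hosts) for hid, hosts in by_id.items() if len(hosts) > 1}
```

For example, duplicate_host_ids({"host-a": 2, "host-b": 1, "host-c": 1}) reports that id 1 is shared by host-b and host-c. An empty result means every host has its own id.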
>>>> On Thu, Jun 29, 2017 at 12:46 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>> Is there any way of recovering from this situation? I'd prefer to fix
>>>>> the issue rather than re-deploy, but if there is no recovery path, I
>>>>> could perhaps try re-deploying the hosted engine. In which case, would
>>>>> the best option be to take a backup of the Hosted Engine, and then
>>>>> shut it down, re-initialise the SAN partition (or use another
>>>>> partition) and retry the deployment? Would it be better to use the
>>>>> older backup from the bare metal engine that I originally used, or use
>>>>> a backup from the Hosted Engine? I'm not sure if any VMs have been
>>>>> added since switching to Hosted Engine.
>>>>>
>>>>> Unfortunately I have very little time left to get this working before
>>>>> I have to hand it over for eval (by end of Friday).
>>>>>
>>>>> Here are some current log snippets from the cluster:
>>>>>
>>>>> In /var/log/vdsm/vdsm.log on the host that has the Hosted Engine:
>>>>>
>>>>> 2017-06-29 10:50:15,071+0100 INFO (monitor/207221b) [storage.SANLock] Acquiring host id for domain 207221b2-959b-426b-b945-18e1adfed62f (id: 3) (clusterlock:282)
>>>>> 2017-06-29 10:50:15,072+0100 ERROR (monitor/207221b) [storage.Monitor] Error acquiring host id 3 for domain 207221b2-959b-426b-b945-18e1adfed62f (monitor:558)
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/share/vdsm/storage/monitor.py", line 555, in _acquireHostId
>>>>>     self.domain.acquireHostId(self.hostId, async=True)
>>>>>   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>>>     self._manifest.acquireHostId(hostId, async)
>>>>>   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>>>     self._domainLock.acquireHostId(hostId, async)
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>>>     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>>> AcquireHostIdFailure: Cannot acquire host id: ('207221b2-959b-426b-b945-18e1adfed62f', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>>>
>>>>> From /var/log/ovirt-hosted-engine-ha/agent.log on the same host:
>>>>>
>>>>> MainThread::ERROR::2017-06-19 13:30:50,592::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>> MainThread::WARNING::2017-06-19 13:30:50,593::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>> MainThread::WARNING::2017-06-19 13:30:50,593::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>>>>>     self._initialize_domain_monitor()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>>>>     raise Exception(msg)
>>>>> Exception: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>> MainThread::ERROR::2017-06-19 13:30:50,593::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
>>>>>
>>>>> From sanlock.log:
>>>>>
>>>>> 2017-06-29 11:17:06+0100 1194149 [2530]: add_lockspace 207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 conflicts with name of list1 s5 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>
>>>>> From the two other hosts:
>>>>>
>>>>> host 2:
>>>>>
>>>>> vdsm.log
>>>>>
>>>>> 2017-06-29 10:53:47,755+0100 ERROR (jsonrpc/4) [jsonrpc.JsonRpcServer] Internal server error (__init__:570)
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 565, in _handle_request
>>>>>     res = method(**params)
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod
>>>>>     result = fn(*methodArgs)
>>>>>   File "/usr/share/vdsm/API.py", line 1454, in getAllVmIoTunePolicies
>>>>>     io_tune_policies_dict = self._cif.getAllVmIoTunePolicies()
>>>>>   File "/usr/share/vdsm/clientIF.py", line 448, in getAllVmIoTunePolicies
>>>>>     'current_values': v.getIoTune()}
>>>>>   File "/usr/share/vdsm/virt/vm.py", line 2803, in getIoTune
>>>>>     result = self.getIoTuneResponse()
>>>>>   File "/usr/share/vdsm/virt/vm.py", line 2816, in getIoTuneResponse
>>>>>     res = self._dom.blockIoTune(
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in __getattr__
>>>>>     % self.vmid)
>>>>> NotConnectedError: VM u'a79e6b0e-fff4-4cba-a02c-4c00be151300' was not started yet or was shut down
>>>>>
>>>>> /var/log/ovirt-hosted-engine-ha/agent.log
>>>>>
>>>>> MainThread::INFO::2017-06-29 10:56:33,636::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:222610db-7880-4f4f-8559-a3635fd73555, volUUID:c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>> MainThread::INFO::2017-06-29 10:56:33,926::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>>>>> MainThread::INFO::2017-06-29 10:56:33,938::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/images/222610db-7880-4f4f-8559-a3635fd73555/c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>> MainThread::INFO::2017-06-29 10:56:33,967::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
>>>>> MainThread::INFO::2017-06-29 10:56:33,971::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
>>>>> MainThread::INFO::2017-06-29 10:56:36,736::states::678::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Thu Jun 29 10:53:59 2017
>>>>> MainThread::INFO::2017-06-29 10:56:36,736::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>>>>> MainThread::INFO::2017-06-29 10:56:46,772::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
>>>>>
>>>>> /var/log/messages:
>>>>>
>>>>> Jun 29 10:53:46 kvm-ldn-02 kernel: dd: sending ioctl 80306d02 to a partition!
>>>>>
>>>>>
>>>>> host 1:
>>>>>
>>>>> /var/log/messages (also in sanlock.log):
>>>>>
>>>>> Jun 29 11:01:02 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:02+0100
>>>>> 678325 [9132]: s4531 delta_acquire host_id 1 busy1 1 2 1193177
>>>>> 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>> Jun 29 11:01:03 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:03+0100
>>>>> 678326 [24159]: s4531 add_lockspace fail result -262
>>>>>
>>>>> /var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>
>>>>> MainThread::ERROR::2017-06-27 15:21:01,143::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>> MainThread::WARNING::2017-06-27 15:21:01,144::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>> MainThread::WARNING::2017-06-27 15:21:01,144::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>>>>>     self._initialize_domain_monitor()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>>>>     raise Exception(msg)
>>>>> Exception: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>> MainThread::ERROR::2017-06-27 15:21:01,144::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
>>>>> MainThread::INFO::2017-06-27 15:21:06,717::hosted_engine::848::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>>>> MainThread::INFO::2017-06-27 15:21:09,335::hosted_engine::776::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Failed to stop monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f): Storage domain is member of pool: u'domain=207221b2-959b-426b-b945-18e1adfed62f'
>>>>> MainThread::INFO::2017-06-27 15:21:09,339::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>>>>>
>>>>>
>>>>> Thanks for any help,
>>>>>
>>>>>
>>>>> Cam
>>>>>
>>>>>
>>>>> On Wed, Jun 28, 2017 at 11:25 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>> Hi Martin,
>>>>>>
>>>>>> yes, on two of the machines they have the same host_id. The other has
>>>>>> a different host_id.
>>>>>>
>>>>>> To update since yesterday: I reinstalled and deployed Hosted Engine on
>>>>>> the other host (so all three hosts in the cluster now have it
>>>>>> installed). The second one I deployed said it was able to host the
>>>>>> engine (unlike the first I reinstalled), so I tried putting the host
>>>>>> with the Hosted Engine on it into maintenance to see if it would
>>>>>> migrate over. It managed to move all VMs but the Hosted Engine. And
>>>>>> now the host that said it was able to host the engine says
>>>>>> 'unavailable due to HA score'. The host that it was trying to move
>>>>>> from has now been in 'preparing for maintenance' for the last 12 hours.
>>>>>>
>>>>>> The summary is:
>>>>>>
>>>>>> kvm-ldn-01 - one of the original, pre-Hosted Engine hosts, reinstalled
>>>>>> with 'Deploy Hosted Engine'. No icon saying it can host the Hosted
>>>>>> Engine, host_id of '2' in /etc/ovirt-hosted-engine/hosted-engine.conf.
>>>>>> 'add_lockspace' fails in sanlock.log
>>>>>>
>>>>>> kvm-ldn-02 - the other host that was pre-existing before Hosted Engine
>>>>>> was created. Reinstalled with 'Deploy Hosted Engine'. Had an icon
>>>>>> saying that it was able to host the Hosted Engine, but after migration
>>>>>> was attempted when putting kvm-ldn-03 into maintenance, it reports:
>>>>>> 'unavailable due to HA score'. It has a host_id of '1' in
>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. No errors in sanlock.log
>>>>>>
>>>>>> kvm-ldn-03 - this was the host I deployed Hosted Engine on, which was
>>>>>> not part of the original cluster. I restored the bare-metal engine
>>>>>> backup in the Hosted Engine on this host when deploying it, without
>>>>>> error. It currently has the Hosted Engine on it (as the only VM after
>>>>>> I put that host into maintenance to test the HA of Hosted Engine).
>>>>>> Sanlock log shows conflicts
>>>>>>
>>>>>> I will look through all the logs for any other errors. Please let me
>>>>>> know if you need any logs or other clarification/information.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Campbell
>>>>>>
>>>>>> On Wed, Jun 28, 2017 at 9:25 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> can you please check the contents of
>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf or
>>>>>>> /etc/ovirt-hosted-engine-ha/agent.conf (I am not sure which one it is
>>>>>>> right now) and search for host-id?
>>>>>>>
>>>>>>> Make sure the IDs are different. If they are not, then there is a bug
>>>>>>> somewhere.
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>> On Tue, Jun 27, 2017 at 6:26 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>> I see this on the host it is trying to migrate in /var/log/sanlock:
>>>>>>>>
>>>>>>>> 2017-06-27 17:10:40+0100 527703 [2407]: s3528 lockspace 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>> 2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire host_id 1 busy1 1 2 1042692 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>> 2017-06-27 17:13:01+0100 527844 [2407]: s3528 add_lockspace fail result -262
>>>>>>>>
>>>>>>>> The sanlock service is running. Why would this occur?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> C
>>>>>>>>
>>>>>>>> On Tue, Jun 27, 2017 at 5:21 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>> Hi Martin,
>>>>>>>>>
>>>>>>>>> Thanks for the reply. I have done this, and the deployment completed
>>>>>>>>> without error. However, it still will not allow the Hosted Engine
>>>>>>>>> to migrate to another host. The
>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf got created ok on the host
>>>>>>>>> I re-installed, but the ovirt-ha-broker.service, though it starts,
>>>>>>>>> reports:
>>>>>>>>>
>>>>>>>>> --------------------8<-------------------
>>>>>>>>>
>>>>>>>>> Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
>>>>>>>>> Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker ERROR Failed to read metadata from /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata
>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 129, in get_raw_stats_for_service_type
>>>>>>>>>     f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
>>>>>>>>> OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata'
>>>>>>>>>
>>>>>>>>> --------------------8<-------------------
>>>>>>>>>
>>>>>>>>> I checked the path, and it exists. I can run 'less -f' on it fine. The
>>>>>>>>> perms are slightly different on the host that is running the VM vs the
>>>>>>>>> one that is reporting errors (600 vs 660), ownership is vdsm:qemu. Is
>>>>>>>>> this a san locking issue?
>>>>>>>>>
>>>>>>>>> Thanks for any help,
>>>>>>>>>
>>>>>>>>> Cam
>>>>>>>>>
>>>>>>>>> On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>> Should it be? It was not in the instructions for the migration from
>>>>>>>>>>> bare-metal to Hosted VM
>>>>>>>>>>
>>>>>>>>>> The hosted engine will only migrate to hosts that have the services
>>>>>>>>>> running. Please put one other host to maintenance and select Hosted
>>>>>>>>>> engine action: DEPLOY in the reinstall dialog.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>> Martin Sivak
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 27, 2017 at 1:23 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>> I changed the 'os.other.devices.display.protocols.value.3.6 =
>>>>>>>>>>> spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display protocols
>>>>>>>>>>> as 4 and the hosted engine now appears in the list of VMs. I am
>>>>>>>>>>> guessing the compatibility version was causing it to use the 3.6
>>>>>>>>>>> version. However, I am still unable to migrate the engine VM to
>>>>>>>>>>> another host. When I try putting the host it is currently on into
>>>>>>>>>>> maintenance, it reports:
>>>>>>>>>>>
>>>>>>>>>>> Error while executing action: Cannot switch the Host(s) to Maintenance mode.
>>>>>>>>>>> There are no available hosts capable of running the engine VM.
>>>>>>>>>>>
>>>>>>>>>>> Running 'hosted-engine --vm-status' still shows 'Engine status:
>>>>>>>>>>> unknown stale-data'.
>>>>>>>>>>>
>>>>>>>>>>> The ovirt-ha-broker service is only running on one host. It was set to
>>>>>>>>>>> 'disabled' in systemd. It won't start as there is no
>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf on the other two hosts.
>>>>>>>>>>> Should it be? It was not in the instructions for the migration from
>>>>>>>>>>> bare-metal to Hosted VM
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Cam
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 22, 2017 at 1:07 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>> Hi Tomas,
>>>>>>>>>>>>
>>>>>>>>>>>> So in my /usr/share/ovirt-engine/conf/osinfo-defaults.properties on my
>>>>>>>>>>>> engine VM, I have:
>>>>>>>>>>>>
>>>>>>>>>>>> os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>> os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
>>>>>>>>>>>>
>>>>>>>>>>>> That seems to match - I assume since this is 4.1, the 3.6 should not apply
>>>>>>>>>>>>
>>>>>>>>>>>> Is there somewhere else I should be looking?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Cam
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek <tjelinek(a)redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek <michal.skrivanek(a)redhat.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > On 22 Jun 2017, at 12:31, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Tomas, what fields are needed in a VM to pass the check that causes
>>>>>>>>>>>>>> > the following error?
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> to match the OS and VM Display type ;-)
>>>>>>>>>>>>>> Configuration is in osinfo… e.g. if that is an import from older releases on
>>>>>>>>>>>>>> Linux this is typically caused by the change of cirrus to vga for non-SPICE
>>>>>>>>>>>>>> VMs
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> yep, the default supported combinations for 4.0+ are:
>>>>>>>>>>>>> os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Thanks.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Thu, Jun 22, 2017 at 12:19 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> >> Hi Martin,
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> just as a random comment, do you still have the database backup from
>>>>>>>>>>>>>> >>> the bare metal -> VM attempt? It might be possible to just try again
>>>>>>>>>>>>>> >>> using it. Or in the worst case.. update the offending value there
>>>>>>>>>>>>>> >>> before restoring it to the new engine instance.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> I still have the backup. I'd rather do the latter, as re-running the
>>>>>>>>>>>>>> >> HE deployment is quite lengthy and involved (I have to re-initialise
>>>>>>>>>>>>>> >> the FC storage each time). Do you know what the offending value(s)
>>>>>>>>>>>>>> >> would be? Would it be in the Postgres DB or in a config file
>>>>>>>>>>>>>> >> somewhere?
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Cheers,
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Cam
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>> Regards
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> Martin Sivak
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> >>>> Hi Yanir,
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> Thanks for the reply.
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>>> First of all, maybe a chain reaction of :
>>>>>>>>>>>>>> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>> >>>>> is causing the hosted engine vm not to be set up correctly and further
>>>>>>>>>>>>>> >>>>> actions were made when the hosted engine vm wasn't in a stable state.
>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>> >>>>> As for now, are you trying to revert back to a previous/initial
>>>>>>>>>>>>>> >>>>> state ?
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> I'm not trying to revert it to a previous state for now. This was a
>>>>>>>>>>>>>> >>>> migration from a bare metal engine, and it didn't report any error
>>>>>>>>>>>>>> >>>> during the migration. I'd had some problems on my first attempts at
>>>>>>>>>>>>>> >>>> this migration, whereby it never completed (due to a proxy issue) but
>>>>>>>>>>>>>> >>>> I managed to resolve this. Do you know of a way to get the Hosted
>>>>>>>>>>>>>> >>>> Engine VM into a stable state, without rebuilding the entire cluster
>>>>>>>>>>>>>> >>>> from scratch (since I have a lot of VMs on it)?
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> Thanks for any help.
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> Regards,
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> Cam
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>>> Regards,
>>>>>>>>>>>>>> >>>>> Yanir
>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>> >>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> Hi Jenny/Martin,
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> Any idea what I can do here? The hosted engine VM has no log on any
>>>>>>>>>>>>>> >>>>>> host in /var/log/libvirt/qemu, and I fear that if I need to put the
>>>>>>>>>>>>>> >>>>>> host that I created it on (which I think is hosting it) into
>>>>>>>>>>>>>> >>>>>> maintenance, e.g., to upgrade it, or if it fails for any reason, it
>>>>>>>>>>>>>> >>>>>> won't get migrated to another host, and I will not be able to manage
>>>>>>>>>>>>>> >>>>>> the cluster. It seems to be a very dangerous position to be in.
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> Thanks,
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> Cam
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> >>>>>>> Thanks Martin. The hosts are all part of the same cluster.
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> I get these errors in the engine.log on the engine:
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm' failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z INFO [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>, HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]', sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM, ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z ERROR [org.ovirt.engine.core.bll.HostedEngineImporter] (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted Engine VM
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> The sanlock.log reports conflicts on that same host, and a different
>>>>>>>>>>>>>> >>>>>>> error on the other hosts, not sure if they are related.
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host
>>>>>>>>>>>>>> >>>>>>> which I deployed the hosted engine VM on:
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Unable to extract HEVM OVF
>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> I've seen some of these issues reported in bugzilla, but they were for
>>>>>>>>>>>>>> >>>>>>> older versions of oVirt (and appear to be resolved).
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> I will install that package on the other two hosts, for which I will
>>>>>>>>>>>>>> >>>>>>> put them in maintenance as vdsm is installed as an upgrade. I guess
>>>>>>>>>>>>>> >>>>>>> restarting vdsm is a good idea after that?
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> Thanks,
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> Campbell
>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>> >>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>> >>>>>>>> Hi,
>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>> >>>>>>>> you do not have to install it on all hosts. But you should have more
>>>>>>>>>>>>>> >>>>>>>> than one and ideally all hosted engine enabled nodes should belong to
>>>>>>>>>>>>>> >>>>>>>> the same engine cluster.
>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>> >>>>>>>> Best regards
>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>> >>>>>>>> Martin Sivak
>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>> >>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>> Does ovirt-hosted-engine-ha need to be installed
across all
>>>>>>>>>>>>>>
>>>>>>>>> hosts?
>>>>>>>>>>>>>>
>>>>>>>>> Could that be the reason it is failing to see it
properly?
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>> Cam
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc
<iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>> Logs are attached. I can see errors in there, but
am unsure how
>>>>>>>>>>>>>>
>>>>>>>>>> they
>>>>>>>>>>>>>>
>>>>>>>>>> arose.
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>> Campbell
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <etokar(a)redhat.com> wrote:
>>>>>>>>>>>>>> >>>>>>>>>>> From the output it looks like the agent is down, try starting it by
>>>>>>>>>>>>>> >>>>>>>>>>> running:
>>>>>>>>>>>>>> >>>>>>>>>>> systemctl start ovirt-ha-agent.
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>> The engine is supposed to see the hosted engine storage domain and
>>>>>>>>>>>>>> >>>>>>>>>>> import it to the system, then it should import the hosted engine vm.
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>> Can you attach the agent log from the host
>>>>>>>>>>>>>> >>>>>>>>>>> (/var/log/ovirt-hosted-engine-ha/agent.log)
>>>>>>>>>>>>>> >>>>>>>>>>> and the engine log from the engine vm
>>>>>>>>>>>>>> >>>>>>>>>>> (/var/log/ovirt-engine/engine.log)?
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> >>>>>>>>>>> Jenny
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>> What version are you running?
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>> 4.1.2.2-1.el7.centos
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>> For the hosted engine vm to be imported and displayed in the engine, you
>>>>>>>>>>>>>> >>>>>>>>>>>>> must first create a master storage domain.
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>> To provide a bit more detail: this was a migration of a bare-metal
>>>>>>>>>>>>>> >>>>>>>>>>>> engine in an existing cluster to a hosted engine VM for that cluster.
>>>>>>>>>>>>>> >>>>>>>>>>>> As part of this migration, I built an entirely new host and ran
>>>>>>>>>>>>>> >>>>>>>>>>>> 'hosted-engine --deploy' (followed these instructions:
>>>>>>>>>>>>>> >>>>>>>>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_M...).
>>>>>>>>>>>>>> >>>>>>>>>>>> I restored the backup from the engine and it completed without any
>>>>>>>>>>>>>> >>>>>>>>>>>> errors. I didn't see any instructions regarding a master storage
>>>>>>>>>>>>>> >>>>>>>>>>>> domain in the page above. The cluster has two existing master storage
>>>>>>>>>>>>>> >>>>>>>>>>>> domains, one is fibre channel, which is up, and one ISO domain, which
>>>>>>>>>>>>>> >>>>>>>>>>>> is currently offline.
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>> What do you mean the hosted engine commands are failing? What happens
>>>>>>>>>>>>>> >>>>>>>>>>>>> when you run hosted-engine --vm-status now?
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>> Interestingly, whereas when I ran it before, it exited with no output
>>>>>>>>>>>>>> >>>>>>>>>>>> and a return code of '1', it now reports:
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>> --== Host 1 status ==--
>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>> conf_on_shared_storage : True
>>>>>>>>>>>>>> >>>>>>>>>>>> Status up-to-date      : False
>>>>>>>>>>>>>> >>>>>>>>>>>> Hostname               : kvm-ldn-03.ldn.fscfc.co.uk
>>>>>>>>>>>>>> >>>>>>>>>>>> Host ID                : 1
>>>>>>>>>>>>>> >>>>>>>>>>>> Engine status          : unknown stale-data
>>>>>>>>>>>>>> >>>>>>>>>>>> Score                  : 0
>>>>>>>>>>>>>> >>>>>>>>>>>> stopped                : True
>>>>>>>>>>>>>> >>>>>>>>>>>> Local maintenance      : False
>>>>>>>>>>>>>> >>>>>>>>>>>> crc32                  : 0217f07b
>>>>>>>>>>>>>> >>>>>>>>>>>> local_conf_timestamp   : 2911
>>>>>>>>>>>>>> >>>>>>>>>>>> Host timestamp         : 2897
>>>>>>>>>>>>>> >>>>>>>>>>>> Extra metadata (valid at timestamp):
>>>>>>>>>>>>>> >>>>>>>>>>>>     metadata_parse_version=1
>>>>>>>>>>>>>> >>>>>>>>>>>>     metadata_feature_version=1
>>>>>>>>>>>>>> >>>>>>>>>>>>     timestamp=2897 (Thu Jun 15 16:22:54 2017)
>>>>>>>>>>>>>> >>>>>>>>>>>>     host-id=1
>>>>>>>>>>>>>> >>>>>>>>>>>>     score=0
>>>>>>>>>>>>>> >>>>>>>>>>>>     vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>>>>>>>>>>>>>> >>>>>>>>>>>>     conf_on_shared_storage=True
>>>>>>>>>>>>>> >>>>>>>>>>>>     maintenance=False
>>>>>>>>>>>>>> >>>>>>>>>>>>     state=AgentStopped
>>>>>>>>>>>>>> >>>>>>>>>>>>     stopped=True
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>> Yet I can login to the web GUI fine. I
guess it is not HA due
>>>>>>>>>>>>>>
>>>>>>>>>>>> to
>>>>>>>>>>>>>>
>>>>>>>>>>>> being
>>>>>>>>>>>>>>
>>>>>>>>>>>> in an unknown state currently? Does the
hosted-engine-ha rpm
>>>>>>>>>>>>>>
>>>>>>>>>>>> need
>>>>>>>>>>>>>>
>>>>>>>>>>>> to
>>>>>>>>>>>>>>
>>>>>>>>>>>> be installed across all nodes in the
cluster, btw?
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the help,
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Jenny Tokar
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> I've migrated from a bare-metal engine to a hosted engine. There were
>>>>>>>>>>>>>> >>>>>>>>>>>>>> no errors during the install, however, the hosted engine did not get
>>>>>>>>>>>>>> >>>>>>>>>>>>>> started. I tried running:
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> hosted-engine --status
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> on the host I deployed it on, and it returns nothing (exit code is 1
>>>>>>>>>>>>>> >>>>>>>>>>>>>> however). I could not ping it either. So I tried starting it via
>>>>>>>>>>>>>> >>>>>>>>>>>>>> 'hosted-engine --vm-start' and it returned:
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Virtual machine does not exist
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> But it then became available. I logged into it successfully. It is not
>>>>>>>>>>>>>> >>>>>>>>>>>>>> in the list of VMs however.
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Any ideas why the hosted-engine commands fail, and why it is not in
>>>>>>>>>>>>>> >>>>>>>>>>>>>> the list of virtual machines?
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>> >>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Users mailing list
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Users(a)ovirt.org
>>>>>>>>>>>>>> >>>>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>>>>>
>>>>>>>>>>>>>