I had no choice but to power up the old bare-metal engine in order to
start the VMs. This is probably a really bad idea, but I had to get the
VMs running.
My guess now is that if a host is shut down rather than simply rebooted,
the VMs will not restart when the host powers back up. This would not
have been such a problem if the Hosted Engine had started.
So I'm not sure where to go from here...
I guess it's back to starting from scratch?
On Fri, Jun 30, 2017 at 3:19 PM, cmc <iucounu(a)gmail.com> wrote:
Help! I put the cluster into global maintenance and then powered all of
the nodes off and back on. I have since taken it out of global
maintenance. No VM has started, including the hosted engine. This is
very bad. I am going to look through the logs to see why nothing has
started. Help greatly appreciated.
Thanks,
Cam
On Fri, Jun 30, 2017 at 1:00 PM, cmc <iucounu(a)gmail.com> wrote:
> So I can run from any node: hosted-engine --set-maintenance
> --mode=global. By 'agents', you mean the ovirt-ha-agent, right? This
> shouldn't affect the running of any VMs, correct? Sorry for the
> questions, just want to do it correctly and not make assumptions :)
>
> Cheers,
>
> C
>
> On Fri, Jun 30, 2017 at 12:12 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>> Hi,
>>
>>> Just to clarify: you mean the host_id in
>>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>>> correct?
>>
>> Exactly.
>>
>> Put the cluster into global maintenance first, or kill all the agents
>> (which has the same effect).
>>
>> Martin
>>
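A minimal sketch of both options, for anyone who lands on this thread later.
This assumes the standard oVirt 4.1 hosted-engine command and systemd unit
names; verify against your own install. Run on a hosted-engine host:

    # option 1: put the whole HA cluster into global maintenance
    hosted-engine --set-maintenance --mode=global

    # option 2: stop the HA agents on every hosted-engine host instead
    systemctl stop ovirt-ha-agent ovirt-ha-broker

    # later, to leave global maintenance again
    hosted-engine --set-maintenance --mode=none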
>> On Fri, Jun 30, 2017 at 12:47 PM, cmc <iucounu(a)gmail.com> wrote:
>>> Just to clarify: you mean the host_id in
>>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>>> correct?
>>>
>>> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>> Hi,
>>>>
>>>> cleaning metadata won't help in this case. Try transferring the
>>>> spm_ids you got from the engine to the proper hosted engine hosts so
>>>> the hosted engine ids match the spm_ids. Then restart all hosted
>>>> engine services. I would actually recommend restarting all hosts after
>>>> this change, but I have no idea how many VMs you have running.
>>>>
>>>> Martin
>>>>
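A sketch of what that could look like in practice. It assumes the engine
database is the local PostgreSQL 'engine' DB and that the SPM ids live in the
vds_spm_id_map table; please verify both before relying on it:

    # on the engine VM: list host name / SPM id pairs (assumed schema)
    sudo -u postgres psql engine -c \
      "select vds_name, vds_spm_id from vds_spm_id_map join vds_static using (vds_id);"

    # on each hosted-engine host: make host_id match that host's SPM id
    grep ^host_id= /etc/ovirt-hosted-engine/hosted-engine.conf
    vi /etc/ovirt-hosted-engine/hosted-engine.conf   # set host_id=<this host's SPM id>

    # then restart the HA services
    systemctl restart ovirt-ha-broker ovirt-ha-agent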
>>>> On Thu, Jun 29, 2017 at 8:27 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>> Tried running 'hosted-engine --clean-metadata' as per
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>>>>> ovirt-ha-agent was not running anyway, but it fails with the
>>>>> following error:
>>>>>
>>>>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>>>>> to start monitoring domain
>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>> during domain acquisition
>>>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>>>>>     return action(he)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 67, in action_clean
>>>>>     return he.clean(options.force_cleanup)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 345, in clean
>>>>>     self._initialize_domain_monitor()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>>>>     raise Exception(msg)
>>>>> Exception: Failed to start monitoring domain
>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>> during domain acquisition
>>>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>>>>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt '0'
>>>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>>>>> occurred, giving up. Please review the log and consider filing a bug.
>>>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>>>>
>>>>> On Thu, Jun 29, 2017 at 6:10 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>> Actually, it looks like sanlock problems:
>>>>>>
>>>>>> "SanlockInitializationError: Failed to initialize
sanlock, the
>>>>>> number of errors has exceeded the limit"
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>> Sorry, I am mistaken: the agent failed on two hosts with the
>>>>>>> following error:
>>>>>>>
>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>>>>>> ERROR Failed to start monitoring domain
>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>> during domain acquisition
>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>>>>>> ERROR Shutting down the agent because of 3 failures in a row!
>>>>>>>
>>>>>>> What could cause these timeouts? Some other service not running?
>>>>>>>
>>>>>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>> Both services are up on all three hosts. The broker logs just report:
>>>>>>>>
>>>>>>>> Thread-6549::INFO::2017-06-29 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
>>>>>>>> Connection established
>>>>>>>> Thread-6549::INFO::2017-06-29 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
>>>>>>>> Connection closed
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Cam
>>>>>>>>
>>>>>>>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services
>>>>>>>>> are restarted and up. The error says the agent can't talk to the
>>>>>>>>> broker. Is there anything in the broker.log?
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>> Martin Sivak
>>>>>>>>>
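In case it helps anyone else, a quick way to do that check; this assumes the
standard unit names and that the broker log sits next to agent.log:

    systemctl restart ovirt-ha-broker ovirt-ha-agent
    systemctl status ovirt-ha-broker ovirt-ha-agent

    # watch the broker while the agent reconnects
    tail -f /var/log/ovirt-hosted-engine-ha/broker.log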
>>>>>>>>> On Thu, Jun 29, 2017 at 4:42 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>> I've restarted those two services across all hosts, have taken the
>>>>>>>>>> Hosted Engine host out of maintenance, and when I try to migrate the
>>>>>>>>>> Hosted Engine over to another host, it reports that all three hosts
>>>>>>>>>> 'did not satisfy internal filter HA because it is not a Hosted Engine
>>>>>>>>>> host'.
>>>>>>>>>>
>>>>>>>>>> On the host that the Hosted Engine is currently on it reports in the
>>>>>>>>>> agent.log:
>>>>>>>>>>
>>>>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
>>>>>>>>>> Connection closed: Connection closed
>>>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>>>>>>>>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception
>>>>>>>>>> getting service path: Connection closed
>>>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>>>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent
>>>>>>>>>> call last):
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>>>>>>>>>>     return action(he)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
>>>>>>>>>>     return he.start_monitoring()
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 411, in start_monitoring
>>>>>>>>>>     self._initialize_sanlock()
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 691, in _initialize_sanlock
>>>>>>>>>>     constants.SERVICE_TYPE + constants.LOCKSPACE_EXTENSION)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 162, in get_service_path
>>>>>>>>>>     .format(str(e)))
>>>>>>>>>> RequestError: Failed to get service path: Connection closed
>>>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>>>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 29, 2017 at 1:25 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> yep, you have to restart the ovirt-ha-agent and ovirt-ha-broker services.
>>>>>>>>>>>
>>>>>>>>>>> The scheduling message just means that the host has score 0 or is not
>>>>>>>>>>> reporting score at all.
>>>>>>>>>>>
>>>>>>>>>>> Martin
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 29, 2017 at 1:33 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>> Thanks Martin, do I have to restart anything? When I try to use the
>>>>>>>>>>>> 'migrate' operation, it complains that the other two hosts 'did not
>>>>>>>>>>>> satisfy internal filter HA because it is not a Hosted Engine host..'
>>>>>>>>>>>> (even though I reinstalled both of these hosts with the 'deploy hosted
>>>>>>>>>>>> engine' option), which suggests that something needs restarting. Should
>>>>>>>>>>>> I worry about the sanlock errors, or will they be resolved by the
>>>>>>>>>>>> change in host_id?
>>>>>>>>>>>>
>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Cam
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 29, 2017 at 12:22 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>> Change the ids so they are distinct. I need to check if there is a way
>>>>>>>>>>>>> to read the SPM ids from the engine, as using the same numbers would be
>>>>>>>>>>>>> best.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 29, 2017 at 12:46 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> Is there any way of recovering from this situation? I'd prefer to fix
>>>>>>>>>>>>>> the issue rather than re-deploy, but if there is no recovery path, I
>>>>>>>>>>>>>> could perhaps try re-deploying the hosted engine. In which case, would
>>>>>>>>>>>>>> the best option be to take a backup of the Hosted Engine, and then
>>>>>>>>>>>>>> shut it down, re-initialise the SAN partition (or use another
>>>>>>>>>>>>>> partition) and retry the deployment? Would it be better to use the
>>>>>>>>>>>>>> older backup from the bare metal engine that I originally used, or use
>>>>>>>>>>>>>> a backup from the Hosted Engine? I'm not sure if any VMs have been
>>>>>>>>>>>>>> added since switching to Hosted Engine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unfortunately I have very little time left to get this working before
>>>>>>>>>>>>>> I have to hand it over for eval (by end of Friday).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here are some current log snippets from the cluster:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In /var/log/vdsm/vdsm.log on the host that has the Hosted Engine:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-06-29 10:50:15,071+0100 INFO
(monitor/207221b) [storage.SANLock]
>>>>>>>>>>>>>> Acquiring host id for domain
207221b2-959b-426b-b945-18e1adfed62f (id:
>>>>>>>>>>>>>> 3) (clusterlock:282)
>>>>>>>>>>>>>> 2017-06-29 10:50:15,072+0100
ERROR (monitor/207221b) [storage.Monitor]
>>>>>>>>>>>>>> Error acquiring host id 3 for
domain
>>>>>>>>>>>>>>
207221b2-959b-426b-b945-18e1adfed62f (monitor:558)
>>>>>>>>>>>>>> Traceback (most recent call
last):
>>>>>>>>>>>>>> File
"/usr/share/vdsm/storage/monitor.py", line 555, in _acquireHostId
>>>>>>>>>>>>>>
self.domain.acquireHostId(self.hostId, async=True)
>>>>>>>>>>>>>> File
"/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>>>>>>>>>>>>
self._manifest.acquireHostId(hostId, async)
>>>>>>>>>>>>>> File
"/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>>>>>>>>>>>>
self._domainLock.acquireHostId(hostId, async)
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py",
>>>>>>>>>>>>>> line 297, in acquireHostId
>>>>>>>>>>>>>> raise
se.AcquireHostIdFailure(self._sdUUID, e)
>>>>>>>>>>>>>> AcquireHostIdFailure: Cannot
acquire host id:
>>>>>>>>>>>>>>
('207221b2-959b-426b-b945-18e1adfed62f', SanlockException(22, 'Sanlock
>>>>>>>>>>>>>> lockspace add failure',
'Invalid argument'))
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> From
/var/log/ovirt-hosted-engine-ha/agent.log on the same host:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>>>
13:30:50,592::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
>>>>>>>>>>>>>> Failed to start monitoring
domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>>> MainThread::WARNING::2017-06-19
>>>>>>>>>>>>>>
13:30:50,593::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Error while monitoring engine:
Failed to start monitoring domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>>> MainThread::WARNING::2017-06-19
>>>>>>>>>>>>>>
13:30:50,593::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Unexpected error
>>>>>>>>>>>>>> Traceback (most recent call
last):
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>>> line 443, in start_monitoring
>>>>>>>>>>>>>>
self._initialize_domain_monitor()
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>>> line 823, in
_initialize_domain_monitor
>>>>>>>>>>>>>> raise Exception(msg)
>>>>>>>>>>>>>> Exception: Failed to start
monitoring domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>>>
13:30:50,593::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Shutting down the agent because
of 3 failures in a row!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> From sanlock.log:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-06-29 11:17:06+0100 1194149
[2530]: add_lockspace
>>>>>>>>>>>>>>
207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>>> conflicts with name of list1 s5
>>>>>>>>>>>>>>
207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> From the two other hosts:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> host 2:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> vdsm.log
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-06-29 10:53:47,755+0100
ERROR (jsonrpc/4) [jsonrpc.JsonRpcServer]
>>>>>>>>>>>>>> Internal server error
(__init__:570)
>>>>>>>>>>>>>> Traceback (most recent call
last):
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
>>>>>>>>>>>>>> 565, in _handle_request
>>>>>>>>>>>>>> res = method(**params)
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line
>>>>>>>>>>>>>> 202, in _dynamicMethod
>>>>>>>>>>>>>> result = fn(*methodArgs)
>>>>>>>>>>>>>> File
"/usr/share/vdsm/API.py", line 1454, in getAllVmIoTunePolicies
>>>>>>>>>>>>>> io_tune_policies_dict =
self._cif.getAllVmIoTunePolicies()
>>>>>>>>>>>>>> File
"/usr/share/vdsm/clientIF.py", line 448, in getAllVmIoTunePolicies
>>>>>>>>>>>>>> 'current_values':
v.getIoTune()}
>>>>>>>>>>>>>> File
"/usr/share/vdsm/virt/vm.py", line 2803, in getIoTune
>>>>>>>>>>>>>> result =
self.getIoTuneResponse()
>>>>>>>>>>>>>> File
"/usr/share/vdsm/virt/vm.py", line 2816, in getIoTuneResponse
>>>>>>>>>>>>>> res = self._dom.blockIoTune(
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line
>>>>>>>>>>>>>> 47, in __getattr__
>>>>>>>>>>>>>> % self.vmid)
>>>>>>>>>>>>>> NotConnectedError: VM
u'a79e6b0e-fff4-4cba-a02c-4c00be151300' was not
>>>>>>>>>>>>>> started yet or was shut down
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
/var/log/ovirt-hosted-engine-ha/agent.log
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:33,636::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan)
>>>>>>>>>>>>>> Found OVF_STORE:
imgUUID:222610db-7880-4f4f-8559-a3635fd73555,
>>>>>>>>>>>>>>
volUUID:c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:33,926::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>>>>>>>>>> Extracting Engine VM OVF from the
OVF_STORE
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:33,938::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>>>>>>>>>> OVF_STORE volume path:
>>>>>>>>>>>>>>
/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/images/222610db-7880-4f4f-8559-a3635fd73555/c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:33,967::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>>>>>>>>>> Found an OVF for HE VM, trying to
convert
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:33,971::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>>>>>>>>>> Got vm.conf from OVF_STORE
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:36,736::states::678::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>>>>>>>>>>>>>> Score is 0 due to unexpected vm
shutdown at Thu Jun 29 10:53:59 2017
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:36,736::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Current state
EngineUnexpectedlyDown (score: 0)
>>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>>>
10:56:46,772::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
>>>>>>>>>>>>>> Reloading vm.conf from the shared
storage domain
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /var/log/messages:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jun 29 10:53:46 kvm-ldn-02
kernel: dd: sending ioctl 80306d02 to a partition!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> host 1:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /var/log/messages also in
sanlock.log
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jun 29 11:01:02 kvm-ldn-01
sanlock[2400]: 2017-06-29 11:01:02+0100
>>>>>>>>>>>>>> 678325 [9132]: s4531
delta_acquire host_id 1 busy1 1 2 1193177
>>>>>>>>>>>>>>
3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>>>>>>>> Jun 29 11:01:03 kvm-ldn-01
sanlock[2400]: 2017-06-29 11:01:03+0100
>>>>>>>>>>>>>> 678326 [24159]: s4531
add_lockspace fail result -262
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
/var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> MainThread::ERROR::2017-06-27
>>>>>>>>>>>>>>
15:21:01,143::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
>>>>>>>>>>>>>> Failed to start monitoring
domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>>> MainThread::WARNING::2017-06-27
>>>>>>>>>>>>>>
15:21:01,144::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Error while monitoring engine:
Failed to start monitoring domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>>> MainThread::WARNING::2017-06-27
>>>>>>>>>>>>>>
15:21:01,144::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Unexpected error
>>>>>>>>>>>>>> Traceback (most recent call
last):
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>>> line 443, in start_monitoring
>>>>>>>>>>>>>>
self._initialize_domain_monitor()
>>>>>>>>>>>>>> File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>>> line 823, in
_initialize_domain_monitor
>>>>>>>>>>>>>> raise Exception(msg)
>>>>>>>>>>>>>> Exception: Failed to start
monitoring domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>>> MainThread::ERROR::2017-06-27
>>>>>>>>>>>>>>
15:21:01,144::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>>> Shutting down the agent because
of 3 failures in a row!
>>>>>>>>>>>>>> MainThread::INFO::2017-06-27
>>>>>>>>>>>>>>
15:21:06,717::hosted_engine::848::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>>>>>>>>>>>>> VDSM domain monitor status:
PENDING
>>>>>>>>>>>>>> MainThread::INFO::2017-06-27
>>>>>>>>>>>>>>
15:21:09,335::hosted_engine::776::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor)
>>>>>>>>>>>>>> Failed to stop monitoring domain
>>>>>>>>>>>>>>
(sd_uuid=207221b2-959b-426b-b945-18e1adfed62f): Storage domain is
>>>>>>>>>>>>>> member of pool:
u'domain=207221b2-959b-426b-b945-18e1adfed62f'
>>>>>>>>>>>>>> MainThread::INFO::2017-06-27
>>>>>>>>>>>>>>
15:21:09,339::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
>>>>>>>>>>>>>> Agent shutting down
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jun 28, 2017 at 11:25 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> yes, on two of the machines they have the same host_id. The other has
>>>>>>>>>>>>>>> a different host_id.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To update since yesterday: I reinstalled and deployed Hosted Engine on
>>>>>>>>>>>>>>> the other host (so all three hosts in the cluster now have it
>>>>>>>>>>>>>>> installed). The second one I deployed said it was able to host the
>>>>>>>>>>>>>>> engine (unlike the first I reinstalled), so I tried putting the host
>>>>>>>>>>>>>>> with the Hosted Engine on it into maintenance to see if it would
>>>>>>>>>>>>>>> migrate over. It managed to migrate all the VMs except the Hosted
>>>>>>>>>>>>>>> Engine. And now the host that said it was able to host the engine says
>>>>>>>>>>>>>>> 'unavailable due to HA score'. The host that it was trying to move
>>>>>>>>>>>>>>> from has now been in 'preparing for maintenance' for the last 12 hours.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The summary is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> kvm-ldn-01 - one of the original, pre-Hosted Engine hosts, reinstalled
>>>>>>>>>>>>>>> with 'Deploy Hosted Engine'. No icon saying it can host the Hosted
>>>>>>>>>>>>>>> Engine; host_id of '2' in /etc/ovirt-hosted-engine/hosted-engine.conf.
>>>>>>>>>>>>>>> 'add_lockspace' fails in sanlock.log.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> kvm-ldn-02 - the other host that was pre-existing before Hosted Engine
>>>>>>>>>>>>>>> was created. Reinstalled with 'Deploy Hosted Engine'. Had an icon
>>>>>>>>>>>>>>> saying that it was able to host the Hosted Engine, but after migration
>>>>>>>>>>>>>>> was attempted when putting kvm-ldn-03 into maintenance, it reports:
>>>>>>>>>>>>>>> 'unavailable due to HA score'. It has a host_id of '1' in
>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. No errors in sanlock.log.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> kvm-ldn-03 - this was the host I deployed Hosted Engine on, which was
>>>>>>>>>>>>>>> not part of the original cluster. I restored the bare-metal engine
>>>>>>>>>>>>>>> backup in the Hosted Engine on this host when deploying it, without
>>>>>>>>>>>>>>> error. It currently has the Hosted Engine on it (as the only VM, after
>>>>>>>>>>>>>>> I put that host into maintenance to test the HA of Hosted Engine).
>>>>>>>>>>>>>>> The sanlock log shows conflicts.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will look through all the logs for any other errors. Please let me
>>>>>>>>>>>>>>> know if you need any logs or other clarification/information.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Campbell
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jun 28, 2017 at 9:25
AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> can you please check the
contents of
>>>>>>>>>>>>>>>>
/etc/ovirt-hosted-engine/hosted-engine.conf or
>>>>>>>>>>>>>>>>
/etc/ovirt-hosted-engine-ha/agent.conf (I am not sure which one it is
>>>>>>>>>>>>>>>> right now) and search for
host-id?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Make sure the IDs are
different. If they are not, then there is a bug somewhere.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>>>
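A quick way to compare those ids across the cluster; a sketch that assumes
root SSH between the hosts and the hosted-engine.conf path already mentioned
in this thread:

    for h in kvm-ldn-01 kvm-ldn-02 kvm-ldn-03; do
        echo -n "$h: "
        ssh "$h" grep ^host_id= /etc/ovirt-hosted-engine/hosted-engine.conf
    done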
>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at
6:26 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>> I see this on the
host it is trying to migrate in /var/log/sanlock:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-06-27
17:10:40+0100 527703 [2407]: s3528 lockspace
>>>>>>>>>>>>>>>>>
207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>>>>>> 2017-06-27
17:13:00+0100 527843 [27446]: s3528 delta_acquire host_id 1
>>>>>>>>>>>>>>>>> busy1 1 2 1042692
3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>>>>>>>>>>> 2017-06-27
17:13:01+0100 527844 [2407]: s3528 add_lockspace fail result -262
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The sanlock service
is running. Why would this occur?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> C
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017
at 5:21 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for the
reply. I have done this, and the deployment completed
>>>>>>>>>>>>>>>>>> without error.
However, it still will not allow the Hosted Engine
>>>>>>>>>>>>>>>>>> migrate to
another host. The
>>>>>>>>>>>>>>>>>>
/etc/ovirt-hosted-engine/hosted-engine.conf got created ok on the host
>>>>>>>>>>>>>>>>>> I re-installed,
but the ovirt-ha-broker.service, though it starts,
>>>>>>>>>>>>>>>>>> reports:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
--------------------8<-------------------
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jun 27 14:58:26
kvm-ldn-01 systemd[1]: Starting oVirt Hosted Engine
>>>>>>>>>>>>>>>>>> High Availability
Communications Broker...
>>>>>>>>>>>>>>>>>> Jun 27 14:58:27
kvm-ldn-01 ovirt-ha-broker[6101]: ovirt-ha-broker
>>>>>>>>>>>>>>>>>>
ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker ERROR
>>>>>>>>>>>>>>>>>> Failed to read
metadata from
>>>>>>>>>>>>>>>>>>
/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata
>>>>>>>>>>>>>>>>>>
Traceback (most
>>>>>>>>>>>>>>>>>> recent call
last):
>>>>>>>>>>>>>>>>>>
File
>>>>>>>>>>>>>>>>>>
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
>>>>>>>>>>>>>>>>>> line 129, in
get_raw_stats_for_service_type
>>>>>>>>>>>>>>>>>>
f =
>>>>>>>>>>>>>>>>>> os.open(path,
direct_flag | os.O_RDONLY | os.O_SYNC)
>>>>>>>>>>>>>>>>>>
OSError: [Errno 2]
>>>>>>>>>>>>>>>>>> No such file or
directory:
>>>>>>>>>>>>>>>>>>
'/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata'
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
--------------------8<-------------------
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I checked the
path, and it exists. I can run 'less -f' on it fine. The
>>>>>>>>>>>>>>>>>> perms are
slightly different on the host that is running the VM vs the
>>>>>>>>>>>>>>>>>> one that is
reporting errors (600 vs 660), ownership is vdsm:qemu. Is
>>>>>>>>>>>>>>>>>> this a san
locking issue?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for any
help,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Jun 27,
2017 at 1:41 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>> Should it
be? It was not in the instructions for the migration from
>>>>>>>>>>>>>>>>>>>>
bare-metal to Hosted VM
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The hosted
engine will only migrate to hosts that have the services
>>>>>>>>>>>>>>>>>>> running.
Please put one other host to maintenance and select Hosted
>>>>>>>>>>>>>>>>>>> engine
action: DEPLOY in the reinstall dialog.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Martin Sivak
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Jun
27, 2017 at 1:23 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> I changed
the 'os.other.devices.display.protocols.value.3.6 =
>>>>>>>>>>>>>>>>>>>>
spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display protocols
>>>>>>>>>>>>>>>>>>>> as 4 and
the hosted engine now appears in the list of VMs. I am
>>>>>>>>>>>>>>>>>>>> guessing
the compatibility version was causing it to use the 3.6
>>>>>>>>>>>>>>>>>>>> version.
However, I am still unable to migrate the engine VM to
>>>>>>>>>>>>>>>>>>>> another
host. When I try putting the host it is currently on into
>>>>>>>>>>>>>>>>>>>>
maintenance, it reports:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Error
while executing action: Cannot switch the Host(s) to Maintenance mode.
>>>>>>>>>>>>>>>>>>>> There are
no available hosts capable of running the engine VM.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Running
'hosted-engine --vm-status' still shows 'Engine status:
>>>>>>>>>>>>>>>>>>>> unknown
stale-data'.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The
ovirt-ha-broker service is only running on one host. It was set to
>>>>>>>>>>>>>>>>>>>>
'disabled' in systemd. It won't start as there is no
>>>>>>>>>>>>>>>>>>>>
/etc/ovirt-hosted-engine/hosted-engine.conf on the other two hosts.
>>>>>>>>>>>>>>>>>>>> Should it
be? It was not in the instructions for the migration from
>>>>>>>>>>>>>>>>>>>>
bare-metal to Hosted VM
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu,
Jun 22, 2017 at 1:07 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>> Hi
Tomas,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> So in
my /usr/share/ovirt-engine/conf/osinfo-defaults.properties on my
>>>>>>>>>>>>>>>>>>>>>
engine VM, I have:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>>>>>>>>>>>
os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> That
seems to match - I assume since this is 4.1, the 3.6 should not apply
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Is
there somewhere else I should be looking?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On
Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek <tjelinek(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek
>>>>>>>>>>>>>>>>>>>>>>
<michal.skrivanek(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
> On 22 Jun 2017, at 12:31, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>>>>>>>>>
> Tomas, what fields are needed in a VM to pass the check that causes
>>>>>>>>>>>>>>>>>>>>>>>
> the following error?
>>>>>>>>>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> 'ImportVm'
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>
,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
to match the OS and VM Display type;-)
>>>>>>>>>>>>>>>>>>>>>>>
Configuration is in osinfo….e.g. if that is import from older releases on
>>>>>>>>>>>>>>>>>>>>>>>
Linux this is typically caused by the change of cirrus to vga for non-SPICE
>>>>>>>>>>>>>>>>>>>>>>>
VMs
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
yep, the default supported combinations for 4.0+ is this:
>>>>>>>>>>>>>>>>>>>>>>
os.other.devices.display.protocols.value =
>>>>>>>>>>>>>>>>>>>>>>
spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
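If that value ever needs overriding, my understanding (an assumption worth
checking against the osinfo documentation for your version) is that it goes
in a drop-in file on the engine rather than in osinfo-defaults.properties
itself, for example:

    # /etc/ovirt-engine/osinfo.conf.d/99-display-protocols.properties
    os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus

    # pick up the change
    systemctl restart ovirt-engine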
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>>>>>>>>>
> Thanks.
>>>>>>>>>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>>>>>>>>>
> On Thu, Jun 22, 2017 at 12:19 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>> Hi Martin,
>>>>>>>>>>>>>>>>>>>>>>>
>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>> just as a random comment, do you still have the database backup from
>>>>>>>>>>>>>>>>>>>>>>>
>>> the bare metal -> VM attempt? It might be possible to just try again
>>>>>>>>>>>>>>>>>>>>>>>
>>> using it. Or in the worst case.. update the offending value there
>>>>>>>>>>>>>>>>>>>>>>>
>>> before restoring it to the new engine instance.
>>>>>>>>>>>>>>>>>>>>>>>
>>
>>>>>>>>>>>>>>>>>>>>>>>
>> I still have the backup. I'd rather do the latter, as re-running the
>>>>>>>>>>>>>>>>>>>>>>>
>> HE deployment is quite lengthy and involved (I have to re-initialise
>>>>>>>>>>>>>>>>>>>>>>>
>> the FC storage each time). Do you know what the offending value(s)
>>>>>>>>>>>>>>>>>>>>>>>
>> would be? Would it be in the Postgres DB or in a config file
>>>>>>>>>>>>>>>>>>>>>>>
>> somewhere?
>>>>>>>>>>>>>>>>>>>>>>>
>>
>>>>>>>>>>>>>>>>>>>>>>>
>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>
>>
>>>>>>>>>>>>>>>>>>>>>>>
>> Cam
>>>>>>>>>>>>>>>>>>>>>>>
>>
>>>>>>>>>>>>>>>>>>>>>>>
>>> Regards
>>>>>>>>>>>>>>>>>>>>>>>
>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>> Martin Sivak
>>>>>>>>>>>>>>>>>>>>>>>
>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>> Hi Yanir,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>> Thanks for the reply.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> First of all, maybe a chain reaction of :
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> 'ImportVm'
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>
,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> is causing the hosted engine vm not to be set up correctly and
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> further
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> actions were made when the hosted engine vm wasn't in a stable state.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> As for now, are you trying to revert back to a previous/initial
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> state ?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>> I'm not trying to revert it to a previous state for now. This was a
>>>>>>>>>>>>>>>>>>>>>>>
>>>> migration from a bare metal engine, and it didn't report any error
>>>>>>>>>>>>>>>>>>>>>>>
>>>> during the migration. I'd had some problems on my first attempts at
>>>>>>>>>>>>>>>>>>>>>>>
>>>> this migration, whereby it never completed (due to a proxy issue) but
>>>>>>>>>>>>>>>>>>>>>>>
>>>> I managed to resolve this. Do you know of a way to get the Hosted
>>>>>>>>>>>>>>>>>>>>>>>
>>>> Engine VM into a stable state, without rebuilding the entire cluster
>>>>>>>>>>>>>>>>>>>>>>>
>>>> from scratch (since I have a lot of VMs on it)?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>> Thanks for any help.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>> Cam
>>>>>>>>>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> Yanir
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iucounu(a)gmail.com>
wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> Hi Jenny/Martin,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> Any idea what I can do here? The hosted engine VM has no log on
any
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> host in /var/log/libvirt/qemu, and I fear that if I need to put
the
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> host into maintenance, e.g., to upgrade it that I created it on
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> (which
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> I think is hosting it), or if it fails for any reason, it
won't get
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> migrated to another host, and I will not be able to manage the
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> cluster. It seems to be a very dangerous position to be in.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iucounu(a)gmail.com>
wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> Thanks Martin. The hosts are all part of the same cluster.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> I get these errors in the engine.log on the engine:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> 2017-06-19 03:28:05,030Z WARN
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> 'ImportVm'
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> failed for user SYST
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> EM. Reasons:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> 2017-06-19 03:28:05,030Z INFO
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> 'EngineLock:{exclusiveLocks='[a
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> 79e6b0e-fff4-4cba-a02c-4c00be151300=<VM,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName
HostedEngine>,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> HostedEngine=<VM_NAME,
ACTION_TYPE_FAILED_NAME_ALREADY_USED>]',
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> sharedLocks=
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> '[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName
HostedEngine>]'}'
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> 2017-06-19 03:28:05,030Z ERROR
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> [org.ovirt.engine.core.bll.HostedEngineImporter]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Failed importing the
Hosted
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> Engine VM
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> The sanlock.log reports conflicts on that same host, and a
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> error on the other hosts, not sure if they are related.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> And this in the /var/log/ovirt-hosted-engine-ha/agent log on
the
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> host
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> which I deployed the hosted engine VM on:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> Unable to extract HEVM OVF
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> Failed extracting VM OVF from the OVF_STORE volume, falling
back
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> initial vm.conf
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> I've seen some of these issues reported in bugzilla, but
they were
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> older versions of oVirt (and appear to be resolved).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> I will install that package on the other two hosts, for which
I
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> put them in maintenance as vdsm is installed as an upgrade.
I
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> guess
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> restarting vdsm is a good idea after that?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> Campbell
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak
<msivak(a)redhat.com>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> you do not have to install it on all hosts. But you
should have
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> more
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> than one and ideally all hosted engine enabled nodes
should
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> belong to
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> the same engine cluster.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> Martin Sivak
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc
<iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> Does ovirt-hosted-engine-ha need to be installed
across all
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> hosts?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> Could that be the reason it is failing to see it
properly?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc
<iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> Logs are attached. I can see errors in there, but
am unsure how
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> they
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> arose.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> Campbell
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> <etokar(a)redhat.com>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> From the output it looks like the agent is
down, try starting
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> it by
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> running:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> systemctl start ovirt-ha-agent.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> The engine is supposed to see the hosted
engine storage domain
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> import it
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> to the system, then it should import the
hosted engine vm.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> Can you attach the agent log from the host
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> (/var/log/ovirt-hosted-engine-ha/agent.log)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> and the engine log from the engine vm
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> (/var/log/ovirt-engine/engine.log)?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> Jenny
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc
<iucounu(a)gmail.com>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> What version are you running?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> 4.1.2.2-1.el7.centos
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> For the hosted engine vm to be
imported and displayed in the
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> engine, you
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> must first create a master storage
domain.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> To provide a bit more detail: this was a
migration of a
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> bare-metal
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> engine in an existing cluster to a hosted
engine VM for that
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> cluster.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> As part of this migration, I built an
entirely new host and
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> ran
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> 'hosted-engine --deploy'
(followed these instructions:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_M...).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> I restored the backup from the engine and
it completed
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> without any
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> errors. I didn't see any instructions
regarding a master
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> storage
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> domain in the page above. The cluster has
two existing master
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> storage
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> domains, one is fibre channel, which is
up, and one ISO
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> domain,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> is currently offline.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you mean the hosted engine
commands are failing?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> What
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> happens
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> you run hosted-engine --vm-status
now?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Interestingly, whereas when I ran it
before, it exited with
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> no
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> output
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> and a return code of '1', it now
reports:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> --== Host 1 status ==--
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> conf_on_shared_storage :
True
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Status up-to-date :
False
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Hostname :
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> kvm-ldn-03.ldn.fscfc.co.uk
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Host ID : 1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Engine status :
unknown stale-data
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Score : 0
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> stopped :
True
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Local maintenance :
False
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> crc32 :
0217f07b
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> local_conf_timestamp :
2911
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Host timestamp :
2897
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Extra metadata (valid at timestamp):
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> metadata_parse_version=1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> metadata_feature_version=1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> timestamp=2897 (Thu Jun 15
16:22:54 2017)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> host-id=1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> score=0
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> vm_conf_refresh_time=2911 (Thu Jun
15 16:23:08 2017)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> conf_on_shared_storage=True
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> maintenance=False
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> state=AgentStopped
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> stopped=True
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Yet I can login to the web GUI fine. I
guess it is not HA due
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> being
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> in an unknown state currently? Does the
hosted-engine-ha rpm
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> be installed across all nodes in the
cluster, btw?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the help,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> Jenny Tokar
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc
<iucounu(a)gmail.com>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've migrated from a
bare-metal engine to a hosted engine.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> were
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> no errors during the install,
however, the hosted engine
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> did not
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> get
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> started. I tried running:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> hosted-engine --status
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> on the host I deployed it on, and
it returns nothing (exit
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> is 1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> however). I could not ping it
either. So I tried starting
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it via
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 'hosted-engine
--vm-start' and it returned:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Virtual machine does not exist
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But it then became available. I
logged into it
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> successfully. It
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> is not
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> in the list of VMs however.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any ideas why the hosted-engine
commands fail, and why it
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> is not
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the list of virtual machines?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cam