Help! I put the cluster into global maintenance, then powered off and
powered on all of the nodes. I have taken it out of global
maintenance. No VM has started, including the hosted engine. This is
very bad. I am going to look through the logs to see why nothing has
started. Help greatly appreciated.
Thanks,
Cam
On Fri, Jun 30, 2017 at 1:00 PM, cmc <iucounu(a)gmail.com> wrote:
So I can run from any node: hosted-engine --set-maintenance
--mode=global. By 'agents', you mean the ovirt-ha-agent, right? This
shouldn't affect the running of any VMs, correct? Sorry for the
questions, just want to do it correctly and not make assumptions :)
Cheers,
C
On Fri, Jun 30, 2017 at 12:12 PM, Martin Sivak <msivak(a)redhat.com> wrote:
> Hi,
>
>> Just to clarify: you mean the host_id in
>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>> correct?
>
> Exactly.
>
> Put the cluster to global maintenance first. Or kill all agents (has
> the same effect).
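>
> I.e. something like this (a sketch; the first command on any hosted
> engine host, the second on every host if you go the kill-the-agents
> route):
>
>   hosted-engine --set-maintenance --mode=global
>   # or, per host:
>   systemctl stop ovirt-ha-agent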
>
> Martin
>
> On Fri, Jun 30, 2017 at 12:47 PM, cmc <iucounu(a)gmail.com> wrote:
>> Just to clarify: you mean the host_id in
>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>> correct?
>>
>> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>> Hi,
>>>
>>> cleaning metadata won't help in this case. Try transferring the
>>> spm_ids you got from the engine to the proper hosted engine hosts so
>>> the hosted engine ids match the spm_ids. Then restart all hosted
>>> engine services. I would actually recommend restarting all hosts after
>>> this change, but I have no idea how many VMs you have running.
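>>>
>>> Roughly (a sketch - the vds_spm_id column name is from the engine DB
>>> schema as I remember it, so double-check it against your version):
>>>
>>>   # on the engine VM, list the SPM ids:
>>>   sudo -u postgres psql engine -c 'select vds_name, vds_spm_id from vds;'
>>>
>>>   # on each host, make host_id match that host's spm_id:
>>>   vi /etc/ovirt-hosted-engine/hosted-engine.conf   # host_id=<spm_id>
>>>   systemctl restart ovirt-ha-broker ovirt-ha-agent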
>>>
>>> Martin
>>>
>>> On Thu, Jun 29, 2017 at 8:27 PM, cmc <iucounu(a)gmail.com> wrote:
>>>> Tried running 'hosted-engine --clean-metadata' as per
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>>>> ovirt-ha-agent was not running anyway, but it fails with the following
>>>> error:
>>>>
>>>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>>>> to start monitoring domain
>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>> during domain acquisition
>>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
>>>> call last):
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>>> line 191, in _run_agent
>>>>     return action(he)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>>> line 67, in action_clean
>>>>     return he.clean(options.force_cleanup)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>> line 345, in clean
>>>>     self._initialize_domain_monitor()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>> line 823, in _initialize_domain_monitor
>>>>     raise Exception(msg)
>>>> Exception: Failed to start monitoring domain
>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>> during domain acquisition
>>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>>>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent,
>>>> attempt '0'
>>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>>>> occurred, giving up. Please review the log and consider filing a bug.
>>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>>>
>>>> On Thu, Jun 29, 2017 at 6:10 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>> Actually, it looks like sanlock problems:
>>>>>
>>>>> "SanlockInitializationError: Failed to initialize sanlock,
the
>>>>> number of errors has exceeded the limit"
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>> Sorry, I am mistaken, two hosts failed for the agent with the
>>>>>> following error:
>>>>>>
>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>>>>> ERROR Failed to start monitoring domain
>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>> during domain acquisition
>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>>>>> ERROR Shutting down the agent because of 3 failures in a row!
>>>>>>
>>>>>> What could cause these timeouts? Some other service not running?
>>>>>>
>>>>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>> Both services are up on all three hosts. The broker logs just report:
>>>>>>>
>>>>>>> Thread-6549::INFO::2017-06-29
>>>>>>> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
>>>>>>> Connection established
>>>>>>> Thread-6549::INFO::2017-06-29
>>>>>>> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
>>>>>>> Connection closed
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Cam
>>>>>>>
>>>>>>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services
>>>>>>>> are restarted and up. The error says the agent can't talk to the
>>>>>>>> broker. Is there anything in the broker.log?
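>>>>>>>>
>>>>>>>> For example, on each host (standard systemd commands):
>>>>>>>>
>>>>>>>>   systemctl restart ovirt-ha-broker ovirt-ha-agent
>>>>>>>>   systemctl status ovirt-ha-broker ovirt-ha-agent
>>>>>>>>   tail -f /var/log/ovirt-hosted-engine-ha/broker.log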
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> Martin Sivak
>>>>>>>>
>>>>>>>> On Thu, Jun 29, 2017 at 4:42 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>> I've restarted those two services across all hosts, have taken the
>>>>>>>>> Hosted Engine host out of maintenance, and when I try to migrate the
>>>>>>>>> Hosted Engine over to another host, it reports that all three hosts
>>>>>>>>> 'did not satisfy internal filter HA because it is not a Hosted Engine
>>>>>>>>> host'.
>>>>>>>>>
>>>>>>>>> On the host that the Hosted Engine is currently on, it reports in the
>>>>>>>>> agent.log:
>>>>>>>>>
>>>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
>>>>>>>>> Connection closed: Connection closed
>>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>>>>>>>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception
>>>>>>>>> getting service path: Connection closed
>>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent
>>>>>>>>> call last):
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>>>>>>>> line 191, in _run_agent
>>>>>>>>>     return action(he)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>>>>>>>> line 64, in action_proper
>>>>>>>>>     return he.start_monitoring()
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>> line 411, in start_monitoring
>>>>>>>>>     self._initialize_sanlock()
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>> line 691, in _initialize_sanlock
>>>>>>>>>     constants.SERVICE_TYPE + constants.LOCKSPACE_EXTENSION)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>>>>>>>>> line 162, in get_service_path
>>>>>>>>>     .format(str(e)))
>>>>>>>>> RequestError: Failed to get service path: Connection closed
>>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>>>>>
>>>>>>>>> On Thu, Jun 29, 2017 at 1:25 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> yep, you have to restart the ovirt-ha-agent and ovirt-ha-broker
>>>>>>>>>> services.
>>>>>>>>>>
>>>>>>>>>> The scheduling message just means that the host has score 0 or is not
>>>>>>>>>> reporting score at all.
>>>>>>>>>>
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 29, 2017 at 1:33 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>> Thanks Martin, do I have to restart anything? When I try to use the
>>>>>>>>>>> 'migrate' operation, it complains that the other two hosts 'did not
>>>>>>>>>>> satisfy internal filter HA because it is not a Hosted Engine host..'
>>>>>>>>>>> (even though I reinstalled both these hosts with the 'deploy hosted
>>>>>>>>>>> engine' option), which suggests that something needs restarting.
>>>>>>>>>>> Should I worry about the sanlock errors, or will that be resolved by
>>>>>>>>>>> the change in host_id?
>>>>>>>>>>>
>>>>>>>>>>> Kind regards,
>>>>>>>>>>>
>>>>>>>>>>> Cam
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 29, 2017 at 12:22 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>> Change the ids so they are distinct. I need to check if there is a
>>>>>>>>>>>> way to read the SPM ids from the engine, as using the same numbers
>>>>>>>>>>>> would be the best.
>>>>>>>>>>>>
>>>>>>>>>>>> Martin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 29, 2017 at 12:46 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>> Is there any way of recovering from this situation? I'd prefer to
>>>>>>>>>>>>> fix the issue rather than re-deploy, but if there is no recovery
>>>>>>>>>>>>> path, I could perhaps try re-deploying the hosted engine. In which
>>>>>>>>>>>>> case, would the best option be to take a backup of the Hosted
>>>>>>>>>>>>> Engine, and then shut it down, re-initialise the SAN partition (or
>>>>>>>>>>>>> use another partition) and retry the deployment? Would it be better
>>>>>>>>>>>>> to use the older backup from the bare metal engine that I
>>>>>>>>>>>>> originally used, or use a backup from the Hosted Engine? I'm not
>>>>>>>>>>>>> sure if any VMs have been added since switching to Hosted Engine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately I have very little time left to get this working
>>>>>>>>>>>>> before I have to hand it over for eval (by end of Friday).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here are some current log snippets from the cluster:
>>>>>>>>>>>>>
>>>>>>>>>>>>> In /var/log/vdsm/vdsm.log on the host that has the Hosted Engine:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-06-29 10:50:15,071+0100 INFO (monitor/207221b) [storage.SANLock]
>>>>>>>>>>>>> Acquiring host id for domain 207221b2-959b-426b-b945-18e1adfed62f (id:
>>>>>>>>>>>>> 3) (clusterlock:282)
>>>>>>>>>>>>> 2017-06-29 10:50:15,072+0100 ERROR (monitor/207221b) [storage.Monitor]
>>>>>>>>>>>>> Error acquiring host id 3 for domain
>>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f (monitor:558)
>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>   File "/usr/share/vdsm/storage/monitor.py", line 555, in _acquireHostId
>>>>>>>>>>>>>     self.domain.acquireHostId(self.hostId, async=True)
>>>>>>>>>>>>>   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>>>>>>>>>>>     self._manifest.acquireHostId(hostId, async)
>>>>>>>>>>>>>   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>>>>>>>>>>>     self._domainLock.acquireHostId(hostId, async)
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py",
>>>>>>>>>>>>> line 297, in acquireHostId
>>>>>>>>>>>>>     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>>>>>>>>>>> AcquireHostIdFailure: Cannot acquire host id:
>>>>>>>>>>>>> ('207221b2-959b-426b-b945-18e1adfed62f', SanlockException(22, 'Sanlock
>>>>>>>>>>>>> lockspace add failure', 'Invalid argument'))
>>>>>>>>>>>>>
>>>>>>>>>>>>> From /var/log/ovirt-hosted-engine-ha/agent.log on the same host:
>>>>>>>>>>>>>
>>>>>>>>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>> 13:30:50,592::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
>>>>>>>>>>>>> Failed to start monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>> MainThread::WARNING::2017-06-19
>>>>>>>>>>>>> 13:30:50,593::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Error while monitoring engine: Failed to start monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>> MainThread::WARNING::2017-06-19
>>>>>>>>>>>>> 13:30:50,593::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Unexpected error
>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>> line 443, in start_monitoring
>>>>>>>>>>>>>     self._initialize_domain_monitor()
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>> line 823, in _initialize_domain_monitor
>>>>>>>>>>>>>     raise Exception(msg)
>>>>>>>>>>>>> Exception: Failed to start monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>> 13:30:50,593::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Shutting down the agent because of 3 failures in a row!
>>>>>>>>>>>>>
>>>>>>>>>>>>> From sanlock.log:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-06-29 11:17:06+0100 1194149 [2530]: add_lockspace
>>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>> conflicts with name of list1 s5
>>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>>
>>>>>>>>>>>>> From the two other hosts:
>>>>>>>>>>>>>
>>>>>>>>>>>>> host 2:
>>>>>>>>>>>>>
>>>>>>>>>>>>> vdsm.log
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-06-29 10:53:47,755+0100 ERROR (jsonrpc/4) [jsonrpc.JsonRpcServer]
>>>>>>>>>>>>> Internal server error (__init__:570)
>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
>>>>>>>>>>>>> 565, in _handle_request
>>>>>>>>>>>>>     res = method(**params)
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line
>>>>>>>>>>>>> 202, in _dynamicMethod
>>>>>>>>>>>>>     result = fn(*methodArgs)
>>>>>>>>>>>>>   File "/usr/share/vdsm/API.py", line 1454, in getAllVmIoTunePolicies
>>>>>>>>>>>>>     io_tune_policies_dict = self._cif.getAllVmIoTunePolicies()
>>>>>>>>>>>>>   File "/usr/share/vdsm/clientIF.py", line 448, in getAllVmIoTunePolicies
>>>>>>>>>>>>>     'current_values': v.getIoTune()}
>>>>>>>>>>>>>   File "/usr/share/vdsm/virt/vm.py", line 2803, in getIoTune
>>>>>>>>>>>>>     result = self.getIoTuneResponse()
>>>>>>>>>>>>>   File "/usr/share/vdsm/virt/vm.py", line 2816, in getIoTuneResponse
>>>>>>>>>>>>>     res = self._dom.blockIoTune(
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line
>>>>>>>>>>>>> 47, in __getattr__
>>>>>>>>>>>>>     % self.vmid)
>>>>>>>>>>>>> NotConnectedError: VM u'a79e6b0e-fff4-4cba-a02c-4c00be151300' was not
>>>>>>>>>>>>> started yet or was shut down
>>>>>>>>>>>>>
>>>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>>>>>>>>>
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:33,636::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan)
>>>>>>>>>>>>> Found OVF_STORE: imgUUID:222610db-7880-4f4f-8559-a3635fd73555,
>>>>>>>>>>>>> volUUID:c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:33,926::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>>>>>>>>> Extracting Engine VM OVF from the OVF_STORE
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:33,938::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>>>>>>>>> OVF_STORE volume path:
>>>>>>>>>>>>> /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/images/222610db-7880-4f4f-8559-a3635fd73555/c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:33,967::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>>>>>>>>> Found an OVF for HE VM, trying to convert
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:33,971::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>>>>>>>>> Got vm.conf from OVF_STORE
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:36,736::states::678::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>>>>>>>>>>>>> Score is 0 due to unexpected vm shutdown at Thu Jun 29 10:53:59 2017
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:36,736::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Current state EngineUnexpectedlyDown (score: 0)
>>>>>>>>>>>>> MainThread::INFO::2017-06-29
>>>>>>>>>>>>> 10:56:46,772::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
>>>>>>>>>>>>> Reloading vm.conf from the shared storage domain
>>>>>>>>>>>>>
>>>>>>>>>>>>> /var/log/messages:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jun 29 10:53:46 kvm-ldn-02 kernel: dd: sending ioctl 80306d02 to a partition!
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> host 1:
>>>>>>>>>>>>>
>>>>>>>>>>>>> /var/log/messages, also in sanlock.log:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jun 29 11:01:02 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:02+0100
>>>>>>>>>>>>> 678325 [9132]: s4531 delta_acquire host_id 1 busy1 1 2 1193177
>>>>>>>>>>>>> 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>>>>>>> Jun 29 11:01:03 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:03+0100
>>>>>>>>>>>>> 678326 [24159]: s4531 add_lockspace fail result -262
>>>>>>>>>>>>>
>>>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>>>>>>>>>
>>>>>>>>>>>>> MainThread::ERROR::2017-06-27
>>>>>>>>>>>>> 15:21:01,143::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
>>>>>>>>>>>>> Failed to start monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>> MainThread::WARNING::2017-06-27
>>>>>>>>>>>>> 15:21:01,144::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Error while monitoring engine: Failed to start monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>> MainThread::WARNING::2017-06-27
>>>>>>>>>>>>> 15:21:01,144::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Unexpected error
>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>> line 443, in start_monitoring
>>>>>>>>>>>>>     self._initialize_domain_monitor()
>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>>>>>>>>> line 823, in _initialize_domain_monitor
>>>>>>>>>>>>>     raise Exception(msg)
>>>>>>>>>>>>> Exception: Failed to start monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>>>>>>>>>> during domain acquisition
>>>>>>>>>>>>> MainThread::ERROR::2017-06-27
>>>>>>>>>>>>> 15:21:01,144::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>>>>>> Shutting down the agent because of 3 failures in a row!
>>>>>>>>>>>>> MainThread::INFO::2017-06-27
>>>>>>>>>>>>> 15:21:06,717::hosted_engine::848::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>>>>>>>>>>>> VDSM domain monitor status: PENDING
>>>>>>>>>>>>> MainThread::INFO::2017-06-27
>>>>>>>>>>>>> 15:21:09,335::hosted_engine::776::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor)
>>>>>>>>>>>>> Failed to stop monitoring domain
>>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f): Storage domain is
>>>>>>>>>>>>> member of pool: u'domain=207221b2-959b-426b-b945-18e1adfed62f'
>>>>>>>>>>>>> MainThread::INFO::2017-06-27
>>>>>>>>>>>>> 15:21:09,339::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
>>>>>>>>>>>>> Agent shutting down
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jun 28, 2017 at 11:25 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> yes, on two of the machines they have the same host_id. The other
>>>>>>>>>>>>>> has a different host_id.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To update since yesterday: I reinstalled and deployed Hosted Engine
>>>>>>>>>>>>>> on the other host (so all three hosts in the cluster now have it
>>>>>>>>>>>>>> installed). The second one I deployed said it was able to host the
>>>>>>>>>>>>>> engine (unlike the first I reinstalled), so I tried putting the
>>>>>>>>>>>>>> host with the Hosted Engine on it into maintenance to see if it
>>>>>>>>>>>>>> would migrate over. It managed to move all VMs but the Hosted
>>>>>>>>>>>>>> Engine. And now the host that said it was able to host the engine
>>>>>>>>>>>>>> says 'unavailable due to HA score'. The host that it was trying to
>>>>>>>>>>>>>> move from has now been in 'preparing for maintenance' for the last
>>>>>>>>>>>>>> 12 hours.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The summary is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> kvm-ldn-01 - one of the original, pre-Hosted Engine hosts,
>>>>>>>>>>>>>> reinstalled with 'Deploy Hosted Engine'. No icon saying it can host
>>>>>>>>>>>>>> the Hosted Engine, host_id of '2' in
>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. 'add_lockspace' fails
>>>>>>>>>>>>>> in sanlock.log
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> kvm-ldn-02 - the other host that was pre-existing before Hosted
>>>>>>>>>>>>>> Engine was created. Reinstalled with 'Deploy Hosted Engine'. Had an
>>>>>>>>>>>>>> icon saying that it was able to host the Hosted Engine, but after
>>>>>>>>>>>>>> migration was attempted when putting kvm-ldn-03 into maintenance,
>>>>>>>>>>>>>> it reports: 'unavailable due to HA score'. It has a host_id of '1'
>>>>>>>>>>>>>> in /etc/ovirt-hosted-engine/hosted-engine.conf. No errors in
>>>>>>>>>>>>>> sanlock.log
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> kvm-ldn-03 - this was the host I deployed Hosted Engine on, which
>>>>>>>>>>>>>> was not part of the original cluster. I restored the bare-metal
>>>>>>>>>>>>>> engine backup in the Hosted Engine on this host when deploying it,
>>>>>>>>>>>>>> without error. It currently has the Hosted Engine on it (as the
>>>>>>>>>>>>>> only VM, after I put that host into maintenance to test the HA of
>>>>>>>>>>>>>> Hosted Engine). Sanlock log shows conflicts
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will look through all the logs for any other errors. Please let
>>>>>>>>>>>>>> me know if you need any logs or other clarification/information.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Campbell
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jun 28, 2017 at 9:25 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> can you please check the contents of
>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf or
>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine-ha/agent.conf (I am not sure which one it
>>>>>>>>>>>>>>> is right now) and search for host-id?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Make sure the IDs are different. If they are not, then there is a
>>>>>>>>>>>>>>> bug somewhere.
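>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A quick way to compare them across the three hosts (hostnames as
>>>>>>>>>>>>>>> in this thread):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   for h in kvm-ldn-01 kvm-ldn-02 kvm-ldn-03; do
>>>>>>>>>>>>>>>     ssh $h grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf
>>>>>>>>>>>>>>>   done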
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 6:26 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>> I see this on the host it is trying to migrate to, in /var/log/sanlock:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-06-27 17:10:40+0100 527703 [2407]: s3528 lockspace
>>>>>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>>>>> 2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire host_id 1
>>>>>>>>>>>>>>>> busy1 1 2 1042692 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>>>>>>>>>> 2017-06-27 17:13:01+0100 527844 [2407]: s3528 add_lockspace fail result -262
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The sanlock service is running. Why would this occur?
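>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (For reference: 'sanlock client status' run on a host lists the
>>>>>>>>>>>>>>>> lockspaces it has joined, which can help confirm which host_id
>>>>>>>>>>>>>>>> each host is using.)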
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> C
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 5:21 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for the reply. I have done this, and the deployment
>>>>>>>>>>>>>>>>> completed without error. However, it still will not allow the
>>>>>>>>>>>>>>>>> Hosted Engine to migrate to another host. The
>>>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf got created ok on the
>>>>>>>>>>>>>>>>> host I re-installed, but the ovirt-ha-broker.service, though it
>>>>>>>>>>>>>>>>> starts, reports:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --------------------8<-------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted Engine
>>>>>>>>>>>>>>>>> High Availability Communications Broker...
>>>>>>>>>>>>>>>>> Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: ovirt-ha-broker
>>>>>>>>>>>>>>>>> ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker ERROR
>>>>>>>>>>>>>>>>> Failed to read metadata from
>>>>>>>>>>>>>>>>> /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata
>>>>>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
>>>>>>>>>>>>>>>>> line 129, in get_raw_stats_for_service_type
>>>>>>>>>>>>>>>>>     f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
>>>>>>>>>>>>>>>>> OSError: [Errno 2] No such file or directory:
>>>>>>>>>>>>>>>>> '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata'
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --------------------8<-------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I checked the path, and it exists. I can run 'less -f' on it
>>>>>>>>>>>>>>>>> fine. The perms are slightly different on the host that is
>>>>>>>>>>>>>>>>> running the VM vs the one that is reporting errors (600 vs 660);
>>>>>>>>>>>>>>>>> ownership is vdsm:qemu. Is this a sanlock issue?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>> Should it be? It was not in the instructions for the migration
>>>>>>>>>>>>>>>>>>> from bare-metal to Hosted VM
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The hosted engine will only migrate to hosts that have the
>>>>>>>>>>>>>>>>>> services running. Please put one other host to maintenance and
>>>>>>>>>>>>>>>>>> select Hosted engine action: DEPLOY in the reinstall dialog.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Martin Sivak
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:23 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> I changed the 'os.other.devices.display.protocols.value.3.6 =
>>>>>>>>>>>>>>>>>>> spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display
>>>>>>>>>>>>>>>>>>> protocols as 4 and the hosted engine now appears in the list of
>>>>>>>>>>>>>>>>>>> VMs. I am guessing the compatibility version was causing it to
>>>>>>>>>>>>>>>>>>> use the 3.6 version. However, I am still unable to migrate the
>>>>>>>>>>>>>>>>>>> engine VM to another host. When I try putting the host it is
>>>>>>>>>>>>>>>>>>> currently on into maintenance, it reports:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Error while executing action: Cannot switch the Host(s) to Maintenance mode.
>>>>>>>>>>>>>>>>>>> There are no available hosts capable of running the engine VM.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Running 'hosted-engine --vm-status' still shows 'Engine status:
>>>>>>>>>>>>>>>>>>> unknown stale-data'.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The ovirt-ha-broker service is only running on one host. It was
>>>>>>>>>>>>>>>>>>> set to 'disabled' in systemd. It won't start as there is no
>>>>>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf on the other two
>>>>>>>>>>>>>>>>>>> hosts. Should it be? It was not in the instructions for the
>>>>>>>>>>>>>>>>>>> migration from bare-metal to Hosted VM.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 1:07 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> Hi Tomas,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So in my /usr/share/ovirt-engine/conf/osinfo-defaults.properties
>>>>>>>>>>>>>>>>>>>> on my engine VM, I have:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> That seems to match - I assume since this is 4.1, the 3.6 should
>>>>>>>>>>>>>>>>>>>> not apply.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is there somewhere else I should be looking?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek <tjelinek(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek
>>>>>>>>>>>>>>>>>>>>> <michal.skrivanek(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> > On 22 Jun 2017, at 12:31, Martin Sivak <msivak(a)redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Tomas, what fields are needed in a VM to pass the check that causes
>>>>>>>>>>>>>>>>>>>>>> > the following error?
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>>>>>>>>>>>>>>>>>> >>>>> 'ImportVm'
>>>>>>>>>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>>>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> to match the OS and VM Display type ;-)
>>>>>>>>>>>>>>>>>>>>>> Configuration is in osinfo... e.g. if that is an import from older
>>>>>>>>>>>>>>>>>>>>>> releases on Linux, this is typically caused by the change of cirrus
>>>>>>>>>>>>>>>>>>>>>> to vga for non-SPICE VMs
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> yep, the default supported combinations for 4.0+ is this:
>>>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value =
>>>>>>>>>>>>>>>>>>>>> spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
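>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> (If you need to change it, the usual approach - to the best of my
>>>>>>>>>>>>>>>>>>>>> knowledge - is an override file rather than editing the shipped
>>>>>>>>>>>>>>>>>>>>> defaults, e.g.:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>   # /etc/ovirt-engine/osinfo.conf.d/10-display.properties
>>>>>>>>>>>>>>>>>>>>>   os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> followed by a restart of ovirt-engine. The path is from memory, so
>>>>>>>>>>>>>>>>>>>>> please double-check it.)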
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Thanks.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jun 22, 2017 at 12:19 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> Hi Martin,
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>>>>> >>> just as a random comment, do you still have the database backup from
>>>>>>>>>>>>>>>>>>>>>> >>> the bare metal -> VM attempt? It might be possible to just try again
>>>>>>>>>>>>>>>>>>>>>> >>> using it. Or in the worst case.. update the offending value there
>>>>>>>>>>>>>>>>>>>>>> >>> before restoring it to the new engine instance.
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> I still have the backup. I'd rather do the latter, as re-running the
>>>>>>>>>>>>>>>>>>>>>> >> HE deployment is quite lengthy and involved (I have to re-initialise
>>>>>>>>>>>>>>>>>>>>>> >> the FC storage each time). Do you know what the offending value(s)
>>>>>>>>>>>>>>>>>>>>>> >> would be? Would it be in the Postgres DB or in a config file
>>>>>>>>>>>>>>>>>>>>>> >> somewhere?
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> Cheers,
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> Cam
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >>> Regards
>>>>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>>>>> >>> Martin Sivak
>>>>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>>>>> >>> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>> Hi Yanir,
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>> Thanks for the reply.
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>> First of all, maybe a chain reaction of:
>>>>>>>>>>>>>>>>>>>>>> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>>>>>>>>>>>>>>>>>> >>>>> 'ImportVm'
>>>>>>>>>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>>>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>>> >>>>> is causing the hosted engine vm not to be set up correctly, and
>>>>>>>>>>>>>>>>>>>>>> >>>>> further actions were made when the hosted engine vm wasn't in a
>>>>>>>>>>>>>>>>>>>>>> >>>>> stable state.
>>>>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>> As for now, are you trying to revert back to a previous/initial
>>>>>>>>>>>>>>>>>>>>>> >>>>> state?
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>> I'm not trying to revert it to a previous state for now. This was a
>>>>>>>>>>>>>>>>>>>>>> >>>> migration from a bare metal engine, and it didn't report any error
>>>>>>>>>>>>>>>>>>>>>> >>>> during the migration. I'd had some problems on my first attempts at
>>>>>>>>>>>>>>>>>>>>>> >>>> this migration, whereby it never completed (due to a proxy issue) but
>>>>>>>>>>>>>>>>>>>>>> >>>> I managed to resolve this. Do you know of a way to get the Hosted
>>>>>>>>>>>>>>>>>>>>>> >>>> Engine VM into a stable state, without rebuilding the entire cluster
>>>>>>>>>>>>>>>>>>>>>> >>>> from scratch (since I have a lot of VMs on it)?
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>> Thanks for any help.
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>> Regards,
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>> Cam
>>>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>> >>>>> Yanir
>>>>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>> Hi Jenny/Martin,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>> Any idea what I can do here? The hosted engine VM has no log on any
>>>>>>>>>>>>>>>>>>>>>> >>>>>> host in /var/log/libvirt/qemu, and I fear that if I need to put the
>>>>>>>>>>>>>>>>>>>>>> >>>>>> host I created it on (which I think is hosting it) into maintenance,
>>>>>>>>>>>>>>>>>>>>>> >>>>>> e.g. to upgrade it, or if it fails for any reason, it won't get
>>>>>>>>>>>>>>>>>>>>>> >>>>>> migrated to another host, and I will not be able to manage the
>>>>>>>>>>>>>>>>>>>>>> >>>>>> cluster. It seems to be a very dangerous position to be in.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks Martin. The hosts are all part of the same cluster.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> I get these errors in the engine.log on the engine:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z WARN
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 'ImportVm'
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> failed for user SYSTEM. Reasons:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z INFO
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]',
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> sharedLocks=
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> '[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z ERROR
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> [org.ovirt.engine.core.bll.HostedEngineImporter]
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> Engine VM
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> The sanlock.log reports conflicts on that same host, and a different
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> error on the other hosts, not sure if they are related.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> which I deployed the hosted engine VM on:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> Unable to extract HEVM OVF
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> Failed extracting VM OVF from the OVF_STORE volume, falling back to
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> initial vm.conf
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> I've seen some of these issues reported in bugzilla, but they were for
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> older versions of oVirt (and appear to be resolved).
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> I will install that package on the other two hosts, for which I will
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> put them in maintenance as vdsm is installed as an upgrade. I guess
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> restarting vdsm is a good idea after that?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> Campbell
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <msivak(a)redhat.com>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> you do not have to install it on all hosts. But you should have more
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> than one and ideally all hosted engine enabled nodes should belong to
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> the same engine cluster.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> Martin Sivak
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Does ovirt-hosted-engine-ha need to be installed across all hosts?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Could that be the reason it is failing to see it properly?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc <iucounu(a)gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Logs are attached. I can see errors in there, but am unsure how they
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> arose.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Campbell
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <etokar(a)redhat.com>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> From the output it looks like the agent is down, try starting it
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> by running:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> systemctl start ovirt-ha-agent.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> The engine is supposed to see the hosted engine storage domain and
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> import it to the system, then it should import the hosted engine vm.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Can you attach the agent log from the host
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> (/var/log/ovirt-hosted-engine-ha/agent.log)
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> and the engine log from the engine vm
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> (/var/log/ovirt-engine/engine.log)?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Jenny
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <iucounu(a)gmail.com>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What version are you running?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 4.1.2.2-1.el7.centos
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> For the hosted engine vm to be imported and displayed in the
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> engine, you
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> must first create a master storage domain.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> To provide a bit more detail: this was a migration of a bare-metal
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> engine in an existing cluster to a hosted engine VM for that cluster.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> As part of this migration, I built an entirely new host and ran
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 'hosted-engine --deploy' (followed these instructions:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_M...).
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> I restored the backup from the engine and it completed without any
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> errors. I didn't see any instructions regarding a master storage
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> domain in the page above. The cluster has two existing master storage
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> domains, one is fibre channel, which is up, and one ISO domain, which
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> is currently offline.
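>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> (For reference, the backup was the documented engine-backup flow,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> along the lines of:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>   engine-backup --mode=backup --file=engine.backup --log=backup.log
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> restored during 'hosted-engine --deploy'; the filenames here are
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> just examples.)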
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What do you mean the hosted engine commands are failing? What
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> happens when you run hosted-engine --vm-status now?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Interestingly, whereas when I ran it before, it exited with no output
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> and a return code of '1', it now reports:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> --== Host 1 status ==--
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> conf_on_shared_storage             : True
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Status up-to-date                  : False
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Host ID                            : 1
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Engine status                      : unknown stale-data
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Score                              : 0
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> stopped                            : True
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Local maintenance                  : False
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> crc32                              : 0217f07b
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> local_conf_timestamp               : 2911
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Host timestamp                     : 2897
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Extra metadata (valid at timestamp):
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     metadata_parse_version=1
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     metadata_feature_version=1
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     timestamp=2897 (Thu Jun 15 16:22:54 2017)
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     host-id=1
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     score=0
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     conf_on_shared_storage=True
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     maintenance=False
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     state=AgentStopped
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     stopped=True
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Yet I can login to the web GUI fine. I guess it is not HA due to being
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> in an unknown state currently? Does the hosted-engine-ha rpm need to
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> be installed across all nodes in the cluster, btw?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Thanks for the help,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> Jenny Tokar
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc <iucounu(a)gmail.com>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> I've migrated from a bare-metal engine to a hosted engine. There were
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> no errors during the install, however, the hosted engine did not get
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> started. I tried running:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> hosted-engine --status
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> on the host I deployed it on, and it returns nothing (exit code is 1
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> however). I could not ping it either. So I tried starting it via
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> 'hosted-engine --vm-start' and it returned:
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Virtual machine does not exist
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> But it then became available. I logged into it successfully. It is not
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> in the list of VMs however.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Any ideas why the hosted-engine commands fail, and why it is not in
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> the list of virtual machines?
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Cam