On Sun, Nov 18, 2018 at 11:30 AM Alex K <rightkicktech(a)gmail.com> wrote:
>
>
> On Sun, Nov 18, 2018 at 8:53 AM Alex K <rightkicktech(a)gmail.com> wrote:
>
>>
>>
>> On Sat, Nov 17, 2018, 19:32 Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sat, Nov 17, 2018, 14:07 Alex K <rightkicktech(a)gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I had a setup with oVirt 4.2.0 in which at some point the engine stopped
>>>> responding, due to some split-brain issues.
>>>>
>>>> Since I was not able to resolve the split brain, I proceeded to redeploy
>>>> the engine.
>>>>
>>>> The steps I followed:
>>>> 1. upgrade servers (yum update)
>>>> 2. ran ovirt-hosted-engine-cleanup
>>>> 3. deployed engine (now 4.2.7)
>>>>
>>>> The deploy was successful and I was able to add a new data domain.
>>>> The issue is that at this point I would expect the engine storage
>>>> domain and VM to be imported automatically, but they are not. In the HA
>>>> agent logs on the server I see:
>>>>
>>>> MainThread::INFO::2018-11-17 12:55:51,856::states::444::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
>>>> MainThread::WARNING::2018-11-17 12:55:52,145::ovf_store::140::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
>>>> MainThread::ERROR::2018-11-17 12:55:52,146::config_ovf::84::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm::(_get_vm_conf_content_from_ovf_store) Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
>>>> MainThread::INFO::2018-11-17 12:55:52,246::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineUp (score: 3400)
>>>>
>>>> Meanwhile, in engine.log on the engine VM I see:
>>>>
>>>> 2018-11-17 12:47:14,748Z INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [] VM '88dacb07-45f1-4bc1-80a0-9434d530eaaa' was discovered as 'Up' on VDS '6eff2018-516d-4af1-807d-ecc31d024f4d'(v0.maya)
>>>> 2018-11-17 12:47:14,773Z INFO [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] Running command: AddUnmanagedVmsCommand internal: true.
>>>> 2018-11-17 12:47:14,775Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] START, DumpXmlsVDSCommand(HostName = v0.maya, Params:{hostId='6eff2018-516d-4af1-807d-ecc31d024f4d', vmIds='[88dacb07-45f1-4bc1-80a0-9434d530eaaa]'}), log id: 44bb4e0a
>>>> 2018-11-17 12:47:14,779Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] FINISH, DumpXmlsVDSCommand, return: {88dacb07-45f1-4bc1-80a0-9434d530eaaa=<domain type='kvm' id='7'>
>>>> ...
>>>> <some kind of XML>
>>>> ...
>>>> 2018-11-17 12:47:14,793Z WARN [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] null architecture type, replacing with x86_64, VM [HostedEngine]
>>>>
>>>> Something is preventing the engine from being imported.
>>>> I tried running hosted-engine --reinitialize-lockspace, since I was
>>>> getting some lockspace errors, but it made no difference.
>>>>
>>>> Any idea what could be causing this?
>>>> I am short on time because the site is in production. Any idea
>>>> is appreciated.
>>>>
>>>> Thanx,
>>>> Alex
>>>>
>>>
>>>
>>> In step 3, how did you deploy the engine?
>>> I had the same problem some days ago, and it was due to a bug when
>>> deploying from the command line without ansible (option --no-ansible).
>>> I solved it by redeploying with the default method, which uses ansible.
>>>
>> I deployed with the --no-ansible flag since the ansible way was giving me an
>> error (something to do with localhost). I can try ansible again to check
>> what the error was.
>>
> The error I am getting when trying to deploy with ansible is the
> following:
>
> 2018-11-17 09:03:50,378+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 hostname_resolution_output: {'stderr_lines': [], u'changed': True, u'end': u'2018-11-17 09:03:48.572863', u'stdout': u'', u'cmd': u'getent ahostsv4 v0.maya | grep v0.maya', u'failed': True, u'delta': u'0:00:00.005712', u'stderr': u'', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], u'start': u'2018-11-17 09:03:48.567151'}
>
> 2018-11-17 09:03:51,280+0000 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:100 TASK [Check address resolution]
>
> 2018-11-17 09:03:52,082+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 {u'msg': u'Unable to resolve address\n', u'changed': False, u'_ansible_no_log': False}
>
> 2018-11-17 09:03:52,182+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Unable to resolve address\n"}
>
> 2018-11-17 09:03:52,784+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 PLAY RECAP [localhost] : ok: 16 changed: 3 unreachable: 0 skipped: 4 failed: 1
>
> 2018-11-17 09:03:52,884+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:180 ansible-playbook rc: 2
>
> 2018-11-17 09:03:52,884+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:187 ansible-playbook stdout:
>
> --
>
> File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/ansible_utils.py", line 194, in run
>
> raise RuntimeError(_('Failed executing ansible-playbook'))
>
> RuntimeError: Failed executing ansible-playbook
>
> 2018-11-17 09:03:52,886+0000 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Closing up': Failed executing ansible-playbook
>
> 2018-11-17 09:03:52,887+0000 DEBUG otopi.context context.dumpEnvironment:859 ENVIRONMENT DUMP - BEGIN
>
> 2018-11-17 09:03:52,887+0000 DEBUG otopi.context context.dumpEnvironment:869 ENV BASE/error=bool:'True'
>
> 2018-11-17 09:03:52,887+0000 DEBUG otopi.context context.dumpEnvironment:869 ENV BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, RuntimeError('Failed executing ansible-playbook',), <traceback object at 0x7fefb0248f38>)]'
>
> How can I overcome this? I recall seeing this in past attempts as well,
> and I was only able to proceed with the traditional Python (--no-ansible)
> way.
>
I was able to overcome this by correcting the /etc/hosts file on the server.
It had some erroneous entries.
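
For anyone hitting the same "Unable to resolve address" failure, here is a rough Python sketch of the kind of check that might catch bad /etc/hosts entries before running the deploy. It is only an illustration, not what the installer does: the function name scan_hosts and the sample addresses are made up, and only the host name v0.maya comes from the logs above.

```python
import ipaddress


def scan_hosts(text, name):
    """Scan hosts-file text.

    Returns a tuple (addresses, malformed):
      addresses - IPs whose line lists `name` as a hostname or alias
      malformed - 1-based numbers of lines that do not look like
                  "<ip-address> <name> [alias...]"
    """
    addresses, malformed = [], []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            malformed.append(lineno)  # first field is not a valid IP
            continue
        if len(fields) < 2:
            malformed.append(lineno)  # an address with no hostname
            continue
        if name in fields[1:]:
            addresses.append(str(addr))
    return addresses, malformed


# Hypothetical file content; the addresses are invented for the example.
SAMPLE = """\
127.0.0.1   localhost
10.0.0.10   v0.maya v0
10.0.0.300  v1.maya
v2.maya     10.0.0.12
"""

addresses, malformed = scan_hosts(SAMPLE, "v0.maya")
print(addresses, malformed)
```

To check a real host, pass the contents of /etc/hosts instead of SAMPLE. An empty address list, or any malformed line numbers, would be worth fixing before retrying. Note that the match on the name is an exact string comparison, which is stricter than the resolver.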
The deployment was then able to proceed and the engine is up, although right
at the end it gave the following error:
[ INFO ] TASK [Wait for the local bootstrap VM to be down at engine eyes]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts":
{"ovirt_vms":
[{"affinity_labels": [], "applications": [], "bios":
{"boot_menu":
{"enabled": false}}, "cdroms": [], "cluster":
{"href":
"/ovirt-engine/api/clusters/a407b02c-eb1a-11e8-a5a5-00163e445490",
"id":
"a407b02c-eb1a-11e8-a5a5-00163e445490"}, "comment": "",
"cpu":
{"architecture": "x86_64", "topology": {"cores":
1, "sockets": 4,
"threads": 1}}, "cpu_profile": {"href":
"/ovirt-engine/api/cpuprofiles/58ca604e-01a7-003f-01de-000000000250",
"id":
"58ca604e-01a7-003f-01de-000000000250"}, "cpu_shares": 0,
"creation_time":
"2018-11-18 10:17:45.351000+00:00", "delete_protected": false,
"description": "", "disk_attachments": [],
"display": {"address":
"127.0.0.1", "allow_override": false, "copy_paste_enabled":
true,
"disconnect_action": "LOCK_SCREEN",
"file_transfer_enabled": true,
"monitors": 1, "port": 5900, "single_qxl_pci": false,
"smartcard_enabled":
false, "type": "vnc"}, "fqdn": "engine.maya",
"graphics_consoles": [],
"guest_operating_system": {"architecture": "x86_64",
"codename": "",
"distribution": "CentOS Linux", "family":
"Linux", "kernel": {"version":
{"build": 0, "full_version": "3.10.0-862.14.4.el7.x86_64",
"major": 3,
"minor": 10, "revision": 862}}, "version":
{"full_version": "7", "major":
7}}, "guest_time_zone": {"name": "UTC",
"utc_offset": "+00:00"},
"high_availability": {"enabled": false, "priority": 0},
"host": {"href":
"/ovirt-engine/api/hosts/4250ef49-969c-4cd5-8a4f-30b7755a7d36",
"id":
"4250ef49-969c-4cd5-8a4f-30b7755a7d36"}, "host_devices": [],
"href":
"/ovirt-engine/api/vms/a7872048-030e-4991-be23-43283794d650", "id":
"a7872048-030e-4991-be23-43283794d650", "io": {"threads":
1},
"katello_errata": [], "large_icon": {"href":
"/ovirt-engine/api/icons/c444caf0-5750-9602-f4b4-62db210b133b",
"id":
"c444caf0-5750-9602-f4b4-62db210b133b"}, "memory": 10737418240,
"memory_policy": {"guaranteed": 10737418240, "max":
10737418240},
"migration": {"auto_converge": "inherit",
"compressed": "inherit"},
"migration_downtime": -1, "multi_queues_enabled": true,
"name":
"external-HostedEngineLocal", "next_run_configuration_exists":
false,
"nics": [], "numa_nodes": [], "numa_tune_mode":
"interleave", "origin":
"external", "original_template": {"href":
"/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000",
"id":
"00000000-0000-0000-0000-000000000000"}, "os": {"boot":
{"devices":
["hd"]}, "type": "other"}, "permissions": [],
"placement_policy":
{"affinity": "migratable"}, "quota": {"id":
"b4232eb4-eb1a-11e8-9bc5-00163e445490"}, "reported_devices": [],
"run_once": false, "sessions": [], "small_icon":
{"href":
"/ovirt-engine/api/icons/4a0580c6-11ba-bc2e-6c82-211666f323e9",
"id":
"4a0580c6-11ba-bc2e-6c82-211666f323e9"}, "snapshots": [],
"sso":
{"methods": [{"id": "guest_agent"}]},
"start_paused": false, "stateless":
false, "statistics": [], "status": "unknown",
"storage_error_resume_behaviour": "auto_resume", "tags":
[], "template":
{"href":
"/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000",
"id":
"00000000-0000-0000-0000-000000000000"}, "time_zone":
{"name": "Etc/GMT"},
"type": "desktop", "usb": {"enabled": false},
"watchdogs": []}]},
"attempts": 24, "changed": false, "deprecations":
[{"msg": "The
'ovirt_vms_facts' module is being renamed 'ovirt_vm_facts'",
"version":
2.8}]}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing
ansible-playbook
The engine is up and I can SSH into it. However, when I try to log in through
the GUI I get "The redirection URI for client is not registered", even
though I have set its IP address in SSO_ALTERNATE_ENGINE_FQDNS through a
config file. What could be the issue now? Thanks.

OK, I managed to resolve this one as well by adding 127.0.0.1 to
SSO_ALTERNATE_ENGINE_FQDNS, since I am using port forwarding through SSH to
access the engine GUI. I am now in the process of importing the data SD and
hopefully will complete my restoration... :) Fingers crossed.
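
In case it helps someone else, here is a small Python sketch for building the SSO_ALTERNATE_ENGINE_FQDNS line when several names or addresses are needed. It rests on assumptions: that the value is one double-quoted, space-separated list, and that it lives in a drop-in file under /etc/ovirt-engine/engine.conf.d/ which is read when the ovirt-engine service restarts. Please verify both against the documentation for your version. The helper name and 192.0.2.10 are placeholders, and only 127.0.0.1 comes from this thread.

```python
def render_sso_alternates(existing, extra):
    """Merge two lists of names/addresses, keeping order and dropping
    duplicates, and return the resulting config line.

    Assumption: the engine expects a double-quoted, space-separated value.
    """
    merged = []
    for name in list(existing) + list(extra):
        if name and name not in merged:
            merged.append(name)
    return 'SSO_ALTERNATE_ENGINE_FQDNS="{}"'.format(" ".join(merged))


# 192.0.2.10 stands in for the engine IP that was already configured;
# 127.0.0.1 is needed when the GUI is reached through an SSH port forward.
line = render_sso_alternates(["192.0.2.10"], ["127.0.0.1"])
print(line)
```

The printed line would then go into the config file by hand. The sketch deliberately does not write any file or restart any service.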