Alex,
I haven't run into any issues with ovirt-ha-agent. I'm adding Simone who
may have a better idea of what could be causing the problem. Could you
provide any logs you have available from that deployment? Also, could you
please run "journalctl -u ovirt-ha-agent" on that host and provide the
output?
Thanks!
-Phillip Bailey
On Tue, May 15, 2018 at 9:22 AM, Alex K <rightkicktech(a)gmail.com> wrote:
Hi Phillip,
In the end I was not able to complete it.
The ovirt-ha-agent on the host would not start for some reason.
It could be because I ran ovirt-hosted-engine-cleanup earlier.
So I need to repeat the deployment from scratch to be able to reproduce/verify.
Alex
On Tue, May 15, 2018 at 2:48 PM, Phillip Bailey <phbailey(a)redhat.com>
wrote:
> Alex,
>
> I'm glad to hear you were able to get everything running! Please let us
> know if you have any issues going forward.
>
> Best regards,
>
> -Phillip Bailey
>
> On Tue, May 15, 2018 at 4:59 AM, Alex K <rightkicktech(a)gmail.com> wrote:
>
>> I overcame this with:
>>
>> run at host:
>>
>> /usr/sbin/ovirt-hosted-engine-cleanup
>>
>> Then redeployed the engine:
>> engine-setup
>>
>> This time it went OK.
>>
>> Thanx,
>> Alex
>>
>> On Tue, May 15, 2018 at 10:51 AM, Alex K <rightkicktech(a)gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Thanx for the feedback.
>>>
>>> *getent ahostsv4 v0.mydomain*
>>>
>>> gives:
>>>
>>> 172.16.30.10 STREAM v0
>>> 172.16.30.10 DGRAM
>>> 172.16.30.10 RAW
>>>
>>> which means that
>>>
>>> *getent ahostsv4 v0.mydomain | grep v0.mydomain*
>>>
>>> returns nothing.
>>>
>>> I overcame this by using the flag *--noansible* to fall back to the
>>> Python-based deployment flow, and it did succeed.
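
To spell out why the piped check returns nothing: the deploy task greps the getent output for the full FQDN, but the output quoted above lists only the short name v0 as the canonical name. Here is a minimal Python sketch of that check, run against the quoted output. The function names are made up for illustration and are not part of hosted-engine, and a plain substring match stands in for grep's regex match.

```python
# Output of `getent ahostsv4 v0.mydomain`, copied from the message above.
GETENT_OUTPUT = """\
172.16.30.10    STREAM v0
172.16.30.10    DGRAM
172.16.30.10    RAW
"""


def fqdn_in_getent_output(output: str, fqdn: str) -> bool:
    """Mimic `getent ahostsv4 <fqdn> | grep <fqdn>` (hypothetical helper).

    Returns True if any output line contains the FQDN as a literal
    substring. Real grep treats the pattern as a regex; a literal match
    is used here to keep the sketch simple.
    """
    return any(fqdn in line for line in output.splitlines())


def canonical_names(output: str) -> set:
    """Collect the names getent printed after the socket type column."""
    names = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            names.update(fields[2:])
    return names


if __name__ == "__main__":
    # The FQDN never appears, so the grep-style check fails.
    print(fqdn_in_getent_output(GETENT_OUTPUT, "v0.mydomain"))
    # Only the short name is reported.
    print(sorted(canonical_names(GETENT_OUTPUT)))
```

With the quoted output this prints False and then ['v0']. One possible cause, which is an assumption about this setup rather than something the logs confirm, is that the host's /etc/hosts entry lists the short name before the FQDN, so the short name is returned as the canonical name.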
>>>
>>> Now I am stuck at the CA creation step of engine-setup. It never
>>> finishes, and I see several errors in the setup log (grep -iE
>>> 'error|fail' ):
>>>
>>> 2018-05-15 03:40:03,749-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/error=bool:'False'
>>> 2018-05-15 03:40:03,751-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV CORE/failOnPrioOverride=bool:'True'
>>> 2018-05-15 03:40:04,338-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/error=bool:'False'
>>> 2018-05-15 03:40:04,339-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV CORE/failOnPrioOverride=bool:'True'
>>> 2018-05-15 03:40:04,532-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV
>>> OVESETUP_CORE/failOnDulicatedConstant=bool:'False'
>>> 2018-05-15 03:40:04,809-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV
>>> OVESETUP_PROVISIONING/postgresExtraConfigItems=tuple:'({'ok':
>>> <function <lambda> at 0x7ff1630b9578>, 'check_on_use': True,
>>> 'needed_on_create': True, 'key': 'autovacuum_vacuum_scale_factor',
>>> 'expected': 0.01, 'error_msg': '{key} required to be at most
>>> {expected}'}, {'ok': <function <lambda> at 0x7ff1630b9a28>,
>>> 'check_on_use': True, 'needed_on_create': True, 'key':
>>> 'autovacuum_analyze_scale_factor', 'expected': 0.075, 'error_msg':
>>> '{key} required to be at most {expected}'}, {'ok': <function <lambda>
>>> at 0x7ff163099410>, 'check_on_use': True, 'needed_on_create': True,
>>> 'key': 'autovacuum_max_workers', 'expected': 6, 'error_msg': '{key}
>>> required to be at least {expected}'}, {'ok': <function <lambda> at
>>> 0x7ff163099488>, 'check_on_use': True, 'needeOperationalError: FATAL:
>>> *password authentication failed for user "engine"*
>>> FATAL: password authentication failed for user "engine"
>>> 2018-05-15 03:40:11,408-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/error=bool:'False'
>>> 2018-05-15 03:40:11,417-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV CORE/failOnPrioOverride=bool:'True'
>>> 2018-05-15 03:40:11,441-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV
>>> OVESETUP_CORE/failOnDulicatedConstant=bool:'False'
>>> 2018-05-15 03:40:11,457-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV
>>> OVESETUP_PROVISIONING/postgresExtraConfigItems=tuple:'({'ok':
>>> <function <lambda> at 0x7ff1630b9578>, 'check_on_use': True,
>>> 'needed_on_create': True, 'key': 'autovacuum_vacuum_scale_factor',
>>> 'expected': 0.01, 'error_msg': '{key} required to be at most
>>> {expected}'}, {'ok': <function <lambda> at 0x7ff1630b9a28>,
>>> 'check_on_use': True, 'needed_on_create': True, 'key':
>>> 'autovacuum_analyze_scale_factor', 'expected': 0.075, 'error_msg':
>>> '{key} required to be at most {expected}'}, {'ok': <function <lambda>
>>> at 0x7ff163099410>, 'check_on_use': True, 'needed_on_create': True,
>>> 'key': 'autovacuum_max_workers', 'expected': 6, 'error_msg': '{key}
>>> required to be at least {expected}'}, {'ok': <function <lambda> at
>>> 0x7ff163099488>, 'check_on_use': True, 'needed_on_create': True,
>>> 'key': 'maintenance_work_mem', 'expected': 65536, 'error_msg': '{key}
>>> required to be at least {expected}', 'useQueryForValue': True},
>>> {'ok': <function <lambda> at 0x7ff163099500>, 'check_on_use': True,
>>> 'needed_on_create': True, 'key': 'work_mem', 'expected': 8192,
>>> 'error_msg': '{key} required to be at least {expected}',
>>> 'useQueryForValue': True})'
>>> raise RuntimeError("SIG%s" % signum)
>>> RuntimeError: SIG2
>>> raise RuntimeError("SIG%s" % signum)
>>> RuntimeError: SIG2
>>> 2018-05-15 03:41:19,888-0400 ERROR otopi.context
>>> context._executeMethod:152 *Failed to execute stage 'Misc
>>> configuration': SIG2*
>>> 2018-05-15 03:41:19,993-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/error=bool:'True'
>>> 2018-05-15 03:41:19,993-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/exceptionInfo=list:'[(<type
>>> 'exceptions.RuntimeError'>, RuntimeError('SIG2',), <traceback object
>>> at 0x7ff161de9560>)]'
>>> 2018-05-15 03:41:20,033-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/error=bool:'True'
>>> 2018-05-15 03:41:20,033-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV BASE/exceptionInfo=list:'[(<type
>>> 'exceptions.RuntimeError'>, RuntimeError('SIG2',), <traceback object
>>> at 0x7ff161de9560>)]'
>>> 2018-05-15 03:41:20,038-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV CORE/failOnPrioOverride=bool:'True'
>>> 2018-05-15 03:41:20,056-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV
>>> OVESETUP_CORE/failOnDulicatedConstant=bool:'False'
>>> 2018-05-15 03:41:20,069-0400 DEBUG otopi.context
>>> context.dumpEnvironment:869 ENV
>>> OVESETUP_PROVISIONING/postgresExtraConfigItems=tuple:'({'ok':
>>> <function <lambda> at 0x7ff1630b9578>, 'check_on_use': True,
>>> 'needed_on_create': True, 'key': 'autovacuum_vacuum_scale_factor',
>>> 'expected': 0.01, 'error_msg': '{key} required to be at most
>>> {expected}'}, {'ok': <function <lambda> at 0x7ff1630b9a28>,
>>> 'check_on_use': True, 'needed_on_create': True, 'key':
>>> 'autovacuum_analyze_scale_factor', 'expected': 0.075, 'error_msg':
>>> '{key} required to be at most {expected}'}, {'ok': <function <lambda>
>>> at 0x7ff163099410>, 'check_on_use': True, 'needed_on_create': True,
>>> 'key': 'autovacuum_max_workers', 'expected': 6, 'error_msg': '{key}
>>> required to be at least {expected}'}, {'ok': <function <lambda> at
>>> 0x7ff163099488>, 'check_on_use': True, 'needed_on_create': True,
>>> 'key': 'maintenance_work_mem', 'expected': 65536, 'error_msg': '{key}
>>> required to be at least {expected}', 'useQueryForValue': True},
>>> {'ok': <function <lambda> at 0x7ff163099500>, 'check_on_use': True,
>>> 'needed_on_create': True, 'key': 'work_mem', 'expected': 8192,
>>> 'error_msg': '{key} required to be at least {expected}',
>>> 'useQueryForValue': True})'
>>> 2018-05-15 03:41:20,084-0400 ERROR
>>> otopi.plugins.ovirt_engine_common.base.core.misc
>>> misc._terminate:162 Execution of setup failed
>>>
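
Most of the lines matched by grep -iE 'error|fail' above are DEBUG environment dumps whose key names happen to contain "error" or "fail". Below is a rough Python sketch for narrowing the output to records logged at ERROR level, plus bare FATAL lines. It assumes each record in the log file sits on one line and starts with the "date time LEVEL logger" layout seen in this excerpt. The function name and the regex are illustrative and are not part of otopi.

```python
import re

# Assumed record layout, inferred from the excerpt above:
#   2018-05-15 03:41:19,888-0400 ERROR otopi.context ...
RECORD_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4} (?P<level>[A-Z]+) "
)


def real_errors(lines):
    """Yield lines logged at ERROR level and bare FATAL lines."""
    for line in lines:
        match = RECORD_RE.match(line)
        if match:
            # A proper log record: keep it only if its level is ERROR.
            if match.group("level") == "ERROR":
                yield line
        elif line.startswith("FATAL:"):
            # Unprefixed database error text, as in the excerpt above.
            yield line


# Three lines taken from the excerpt above, each joined onto one line.
SAMPLE = [
    "2018-05-15 03:40:03,749-0400 DEBUG otopi.context "
    "context.dumpEnvironment:869 ENV BASE/error=bool:'False'",
    'FATAL: password authentication failed for user "engine"',
    "2018-05-15 03:41:19,888-0400 ERROR otopi.context "
    "context._executeMethod:152 Failed to execute stage "
    "'Misc configuration': SIG2",
]

if __name__ == "__main__":
    for hit in real_errors(SAMPLE):
        print(hit)
```

On the sample this keeps the FATAL line and the ERROR record and drops the DEBUG dump. To run it on a real log, pass an open file object for the setup log instead of SAMPLE.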
>>>
>>> I chose to autoconfigure the DB, but it seems that an authentication
>>> failure is being logged for the engine DB account.
>>>
>>> Any ideas on this?
>>> I can share more logs if needed.
>>>
>>> Thanx,
>>> Alex
>>>
>>> On Mon, May 14, 2018 at 11:59 PM, Phillip Bailey <phbailey(a)redhat.com>
>>> wrote:
>>>
>>>> Hi Alex,
>>>>
>>>> I believe the lines below from the deploy log point to the issue. I
>>>> bolded the important parts. It looks like it was unable to resolve the
>>>> FQDN for the host. What output and return code do you get when you run
>>>> "getent ahostsv4 v0.mydomain | grep v0.mydomain" on that machine?
>>>>
>>>> 2018-05-14 13:24:59,631-0400 DEBUG
>>>> otopi.ovirt_hosted_engine_setup.ansible_utils
>>>> ansible_utils._process_output:94 hostname_resolution_output:
>>>> {'stderr_lines': [], u'changed': True, u'end': u'2018-05-14
>>>> 13:24:58.914393', u'stdout': u'', u'cmd': u'*getent ahostsv4
>>>> v0.mydomain | grep v0.mydomain*', u'failed': True, u'delta':
>>>> u'0:00:00.005743', u'stderr': u'', u'rc': 1, u'msg': u'*non-zero
>>>> return code*', 'stdout_lines': [], u'start': u'2018-05-14
>>>> 13:24:58.908650'}
>>>> 2018-05-14 13:24:59,832-0400 INFO
>>>> otopi.ovirt_hosted_engine_setup.ansible_utils
>>>> ansible_utils._process_output:100 TASK [Check address resolution]
>>>> 2018-05-14 13:25:00,133-0400 DEBUG
>>>> otopi.ovirt_hosted_engine_setup.ansible_utils
>>>> ansible_utils._process_output:94 {u'msg': u'Unable to resolve
>>>> address\n', u'changed': False, u'_ansible_no_log': False}
>>>> 2018-05-14 13:25:00,234-0400 ERROR
>>>> otopi.ovirt_hosted_engine_setup.ansible_utils
>>>> ansible_utils._process_output:98 fatal: [localhost]: FAILED! =>
>>>> {"changed": false, "msg": "*Unable to resolve address*\n"}
>>>>
>>>> Additionally, work is underway to make the logs easier to read and
>>>> more useful so that troubleshooting issues like this won't be as
>>>> difficult in the future. I'm sorry for any frustration it's caused and
>>>> appreciate you reaching out to work through the issue.
>>>>
>>>> -Phillip Bailey
>>>>
>>>> On Mon, May 14, 2018 at 1:32 PM, Alex K <rightkicktech(a)gmail.com>
>>>> wrote:
>>>>
>>>>> I am attaching the deploy log in case it helps.
>>>>>
>>>>> Thanx,
>>>>> Alex
>>>>>
>>>>> On Mon, May 14, 2018 at 8:28 PM, Alex K <rightkicktech(a)gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am trying to set up oVirt 4.2 self-hosted with 3 nodes.
>>>>>>
>>>>>> I have done several 4.1 installations without issues. Now with 4.2
>>>>>> I get:
>>>>>>
>>>>>> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
>>>>>> "Unable to resolve address\n"}
>>>>>> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
>>>>>> ansible-playbook
>>>>>>
>>>>>> I am running:
>>>>>>
>>>>>> hosted-engine --deploy --config-append=/root/ovirt/storage.conf
>>>>>>
>>>>>> The log doesn't point clearly to the issue. It seems to be related
>>>>>> to DNS, but I can confirm that the host can resolve the engine FQDN
>>>>>> from /etc/hosts or from the DNS server.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Thanx,
>>>>>> Alex
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list -- users(a)ovirt.org
>>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>>>
>>>>>
>>>>
>>>
>>
>