[ovirt-users] Unable to get HE up after update
Susinthiran Sithamparanathan
chesusin at gmail.com
Fri Oct 21 09:10:13 UTC 2016
Hi,
I ran that command from the engine (whose hostname is now changed to
susin.myftp.org -> 192.168.0.101) and got:
[root@susin ~]# cat < /dev/tcp/susin/54321
-bash: connect: Connection refused
-bash: /dev/tcp/susin/54321: Connection refused
[root@susin ~]# cat < /dev/tcp/susin.myftp.org/54321
-bash: connect: Connection refused
-bash: /dev/tcp/susin.myftp.org/54321: Connection refused
[root@susin ~]# cat < /dev/tcp/192.168.0.101/54321
-bash: connect: Connection refused
-bash: /dev/tcp/192.168.0.101/54321: Connection refused
[root@susin ~]#
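The port check above can be wrapped in a small function with a timeout, so that a filtered port does not hang the shell. This is only a sketch: it assumes a bash built with /dev/tcp support and the coreutils timeout command, and check_port is a hypothetical helper name, not part of oVirt.

```shell
#!/usr/bin/env bash
# check_port HOST PORT: prints "open" if a TCP connection succeeds within
# 5 seconds, otherwise "closed". Hypothetical helper, not an oVirt tool.
check_port() {
    local host=$1 port=$2
    # The child bash opens fd 3 on /dev/tcp/HOST/PORT and exits at once;
    # timeout(1) kills it if the connection attempt takes too long.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo open
    else
        echo closed
        return 1
    fi
}
```

For example, `check_port 192.168.0.100 54321` would test the address that the engine log in this message shows the engine connecting to (ovirt01/192.168.0.100), whereas the commands above target 192.168.0.101.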
Both the host and the engine are behind a NAT, and I've configured /etc/hosts
so that they can ping each other by name. The engine's hostname
is susin.myftp.org, so dig or host resolves it to
my public IP, while ping resolves it correctly through /etc/hosts.
But now I came across
http://www.ovirt.org/documentation/how-to/networking/changing-engine-hostname/
since I actually changed the engine's hostname to be able to log in
through the web UI.
Especially the following: "The bigger concern is with the engine's
certificate. Currently, to the best of our knowledge, there is no component
that actually checks this trust. But it's possible, that in some future
version of one of the relevant tools - vdsm, libvirt, etc. - such a check
will actually be made, and even prevent connections. If this happens, the
engine might not be able to connect to the hosts, and the worst case is
that they will have to be reinstalled, thus loosing all the configuration
and data accumulated by then."
tail -f /var/log/ovirt-engine/engine.log
2016-10-21 11:05:16,888 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
(DefaultQuartzScheduler1) [] Command 'GetAllVmStatsVDSCommand(HostName =
hosted_engine_1, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='826a8da5-74c1-4002-ab7b-e6e32be94fe6',
vds='Host[hosted_engine_1,826a8da5-74c1-4002-ab7b-e6e32be94fe6]'})'
execution failed: VDSGenericException: VDSNetworkException: Vds timeout
occured
2016-10-21 11:05:16,888 INFO
[org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
(DefaultQuartzScheduler1) [] Failed to fetch vms info for host
'hosted_engine_1' - skipping VMs monitoring.
2016-10-21 11:05:16,918 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler4) [] Correlation ID: null, Call Stack: null, Custom
Event ID: -1, Message: VDSM hosted_engine_1 command failed: Message timeout
which can be caused by communication issues
2016-10-21 11:05:16,918 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler4) [] Command
'org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand'
return value
'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSInfoReturnForXmlRpc@1f2e4065'
2016-10-21 11:05:16,918 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler4) [] HostName = hosted_engine_1
2016-10-21 11:05:16,919 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler4) [] Command 'GetCapabilitiesVDSCommand(HostName =
hosted_engine_1, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='826a8da5-74c1-4002-ab7b-e6e32be94fe6',
vds='Host[hosted_engine_1,826a8da5-74c1-4002-ab7b-e6e32be94fe6]'})'
execution failed: VDSGenericException: VDSNetworkException: Message timeout
which can be caused by communication issues
2016-10-21 11:05:16,919 ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler4) [] Failure to refresh Vds runtime info:
VDSGenericException: VDSNetworkException: Message timeout which can be
caused by communication issues
2016-10-21 11:05:16,919 ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler4) [] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Message timeout which can be
caused by communication issues
at
org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand.executeVdsBrokerCommand(GetCapabilitiesVDSCommand.java:16)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73)
[vdsbroker.jar:]
at
org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
[dal.jar:]
at
org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:451)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:653)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:121)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refresh(HostMonitoring.java:85)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:238)
[vdsbroker.jar:]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[rt.jar:1.8.0_102]
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[rt.jar:1.8.0_102]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[rt.jar:1.8.0_102]
at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_102]
at
org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77)
[scheduler.jar:]
at
org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51)
[scheduler.jar:]
at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[rt.jar:1.8.0_102]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[rt.jar:1.8.0_102]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[rt.jar:1.8.0_102]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[rt.jar:1.8.0_102]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_102]
2016-10-21 11:05:16,921 WARN [org.ovirt.engine.core.vdsbroker.VdsManager]
(DefaultQuartzScheduler4) [] Failed to refresh VDS, network error,
continuing, vds='hosted_engine_1'(826a8da5-74c1-4002-ab7b-e6e32be94fe6):
VDSGenericException: VDSNetworkException: Message timeout which can be
caused by communication issues
2016-10-21 11:05:16,921 WARN [org.ovirt.engine.core.vdsbroker.VdsManager]
(org.ovirt.thread.pool-8-thread-1) [] Host 'hosted_engine_1' is not
responding.
2016-10-21 11:05:16,993 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-1) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: Host hosted_engine_1 is not responding.
Host cannot be fenced automatically because power management for the host
is disabled.
2016-10-21 11:05:19,943 INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt01/192.168.0.100
2016-10-21 11:05:19,960 ERROR
[org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) []
Unable to process messages: General SSLEngine problem
..
2016-10-21 11:08:40,871 WARN
[org.ovirt.engine.core.bll.pm.VdsNotRespondingTreatmentCommand]
(org.ovirt.thread.pool-8-thread-6) [6d681722] Validation of action
'VdsNotRespondingTreatment' failed for user SYSTEM. Reasons:
VAR__ACTION__RESTART,POWER_MANAGEMENT_ACTION_ON_ENTITY_ALREADY_IN_PROGRESS
2016-10-21 11:08:41,023 INFO
[org.ovirt.engine.core.bll.pm.VdsNotRespondingTreatmentCommand]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] Running command:
VdsNotRespondingTreatmentCommand internal: true. Entities affected : ID:
826a8da5-74c1-4002-ab7b-e6e32be94fe6 Type: VDS
2016-10-21 11:08:41,088 INFO
[org.ovirt.engine.core.bll.pm.SshSoftFencingCommand]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] Running command:
SshSoftFencingCommand internal: true. Entities affected : ID:
826a8da5-74c1-4002-ab7b-e6e32be94fe6 Type: VDS
2016-10-21 11:08:41,116 INFO
[org.ovirt.engine.core.bll.pm.SshSoftFencingCommand]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] Opening SSH Soft Fencing
session on host 'ovirt01'
2016-10-21 11:08:41,470 ERROR
[org.ovirt.engine.core.bll.pm.SshSoftFencingCommand]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] SSH Soft Fencing command
failed on host 'ovirt01': SSH authentication to 'root@ovirt01' failed.
Please verify provided credentials. Make sure key is authorized at host
Stdout:
Stderr:
2016-10-21 11:08:41,483 INFO
[org.ovirt.engine.core.bll.pm.SshSoftFencingCommand]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] Lock freed to object
'EngineLock:{exclusiveLocks='[826a8da5-74c1-4002-ab7b-e6e32be94fe6=<VDS_FENCE,
POWER_MANAGEMENT_ACTION_ON_ENTITY_ALREADY_IN_PROGRESS>]',
sharedLocks='null'}'
2016-10-21 11:08:41,545 WARN
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] Trying to release exclusive
lock which does not exist, lock key:
'826a8da5-74c1-4002-ab7b-e6e32be94fe6VDS_FENCE'
2016-10-21 11:08:41,547 INFO
[org.ovirt.engine.core.bll.pm.VdsNotRespondingTreatmentCommand]
(org.ovirt.thread.pool-8-thread-5) [193e2e22] Lock freed to object
'EngineLock:{exclusiveLocks='[826a8da5-74c1-4002-ab7b-e6e32be94fe6=<VDS_FENCE,
POWER_MANAGEMENT_ACTION_ON_ENTITY_ALREADY_IN_PROGRESS>]',
sharedLocks='null'}'
2016-10-21 11:08:43,503 INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt01/192.168.0.100
2016-10-21 11:08:43,517 ERROR
[org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) []
Unable to process messages: General SSLEngine problem
2016-10-21 11:08:55,446 INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt01/192.168.0.100
2016-10-21 11:08:55,461 ERROR
[org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) []
Unable to process messages: General SSLEngine problem
Since the engine is complaining about an SSL communication error, I suspect
the problem is there.
Is there still any way to save my VMs, or do I have to reinstall?
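To see how often the handshake fails and which address the engine is trying to reach, the log can be summarised roughly as below. This is a minimal sketch that only greps for the two message strings seen in the log above; count_ssl_errors and ssl_targets are hypothetical helper names, and the default path is the one used with tail above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for summarising engine.log; not part of oVirt.
ENGINE_LOG=/var/log/ovirt-engine/engine.log

# Print the number of log lines reporting the SSL failure.
count_ssl_errors() {
    grep -c 'General SSLEngine problem' "${1:-$ENGINE_LOG}"
}

# Print each distinct "name/address" target the reactor tried to connect to.
ssl_targets() {
    grep -o 'Connecting to [^ ]*' "${1:-$ENGINE_LOG}" \
        | sed 's/^Connecting to //' \
        | sort -u
}
```

Both functions take an alternative log file as an optional first argument, so they can also be run against a saved copy of the log.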
On Thu, Oct 20, 2016 at 6:27 PM, Simone Tiraboschi <stirabos at redhat.com>
wrote:
>
>
> On Thu, Oct 20, 2016 at 6:16 PM, Susinthiran Sithamparanathan <
> chesusin at gmail.com> wrote:
>
>> Hi,
>> I am still unable to get my system up with my VMs. Inside the web UI I can see
>> a warning at the bottom: Host hosted_engine_1 is non responsive.
>> When I try to activate master data domain NFS01, I get: Error while
>> executing action: Cannot activate Storage. There is no active Host in the
>> Data Center.
>> The Data Center tab shows "VDSM hosted_engine_1 command failed: Message
>> timeout which can be caused by communication issues" at the bottom.
>> I can't see why the host isn't active in the engine-vm.
>> Any help appreciated. Thanks.
>>
>>
> Can you please try
> cat < /dev/tcp/<yourhostaddress>/54321
> from the engine VM?
> If it's not able to connect, please check name resolution, addressing and
> so on.
>
>
>>
>>
>>
>> On Mon, Oct 17, 2016 at 5:00 PM, Susinthiran Sithamparanathan <
>> chesusin at gmail.com> wrote:
>>
>>> Now, after a long time, I got the login prompt.
>>> Things are still down and I am unable to activate
>>> anything. I see
>>> [image: Inline image 1]
>>> This host is in non responding state. Try to Activate it; If the problem
>>> persists, switch Host to Maintenance mode and try to reinstall it.
>>>
>>>
>>> On Mon, Oct 17, 2016 at 4:51 PM, Susinthiran Sithamparanathan <
>>> chesusin at gmail.com> wrote:
>>>
>>>> Thanks. What saved me was
>>>> https://www.mail-archive.com/users@ovirt.org/msg33874.html.
>>>> When I logged into the web UI, I couldn't bring up storage, datacenter,
>>>> cluster; everything was down.
>>>> I restarted the host and now when I enter the admin portal, it spins
>>>> forever. There seem to be some SSL communication issues:
>>>> https://paste.fedoraproject.org/453944/76715737/
>>>> Any hints are appreciated!
>>>>
>>>>
>>>>
>>>> On Mon, Oct 17, 2016 at 4:15 PM, Simone Tiraboschi <stirabos at redhat.com
>>>> > wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 17, 2016 at 3:48 PM, Susinthiran Sithamparanathan <
>>>>> chesusin at gmail.com> wrote:
>>>>>
>>>>>> Got the engine up finally :)
>>>>>> But now I am met with "The client is not authorized to request an
>>>>>> authorization. It's required to access the system using FQDN." It worked
>>>>>> fine prior to the upgrade!
>>>>>>
>>>>>
>>>>> This is a new feature of 4.0; you cannot login anymore with the IP
>>>>> address if the cert has been signed for an fqdn.
>>>>>
>>>>>
>>>>>> Found https://bugzilla.redhat.com/show_bug.cgi?id=1351217, but I am not
>>>>>> sure if I have FQDN case issues.
>>>>>> Any idea how to fix this?
>>>>>>
>>>>>> On Mon, Oct 17, 2016 at 3:33 PM, Susinthiran Sithamparanathan <
>>>>>> chesusin at gmail.com> wrote:
>>>>>>
>>>>>>> I analyzed the log and found that the problem was in the creation of
>>>>>>> the certs with openssl (missing distinguished name in config), and
>>>>>>> /etc/pki/ovirt-engine/{cacert,openssl}.conf were empty!
>>>>>>> That led me to:
>>>>>>> yum provides /etc/pki/ovirt-engine/{cacert,openssl}.conf
>>>>>>> yum remove ovirt-engine-backend
>>>>>>> yum install ovirt-engine-backend ovirt-engine
>>>>>>> ovirt-engine-dashboard ovirt-engine-setup ovirt-engine-tools
>>>>>>> ovirt-engine-userportal ovirt-engine-webadmin-portal ovirt-engine-restapi
>>>>>>> ovirt-engine-dashboard
>>>>>>>
>>>>>>> Now I was able to successfully run engine-setup and exit maintenance
>>>>>>> mode on the host. Let's see how things unfold within 30 minutes. I will keep
>>>>>>> you updated!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 17, 2016 at 3:14 PM, Susinthiran Sithamparanathan <
>>>>>>> chesusin at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I tried that and ended up with: https://paste.fedoraproject.org/453892/71003314/ :(
>>>>>>>> Log ovirt-engine-setup-20161017150800-drmayj.log uploaded to
>>>>>>>> https://my.owndrive.com/index.php/s/3Dcyho9bqo7oZs8?path=%2Fengine-vm
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Oct 17, 2016 at 11:54 AM, Simone Tiraboschi <
>>>>>>>> stirabos at redhat.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 17, 2016 at 9:55 AM, Susinthiran Sithamparanathan <
>>>>>>>>> chesusin at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi guys,
>>>>>>>>>> let me know if there is anything else you need for further
>>>>>>>>>> debugging.
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Can you please try reinstalling all the oVirt rpms on the engine
>>>>>>>>> VM and re-executing engine-setup there?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Oct 13, 2016 at 7:53 PM, Susinthiran Sithamparanathan <
>>>>>>>>>> chesusin at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Thu, Oct 13, 2016 at 3:23 PM, Yedidyah Bar David <
>>>>>>>>>>> didi at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> OK. Can you please attach the output of:
>>>>>>>>>>>>
>>>>>>>>>>>> grep MANUAL /etc/ovirt-engine/engine.conf.d/*.conf
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [root@ovirt01 ~]# grep MANUAL /etc/ovirt-engine/engine.conf.d/*.conf
>>>>>>>>>>> [root@ovirt01 ~]# ssh 192.168.0.101
>>>>>>>>>>> root@192.168.0.101's password:
>>>>>>>>>>> Last login: Thu Oct 13 19:50:02 2016 from ovirt01
>>>>>>>>>>> [root@engine ~]# grep MANUAL /etc/ovirt-engine/engine.conf.d/*.conf
>>>>>>>>>>> [root@engine ~]#
>>>>>>>>>>>
>>>>>>>>>>> I.e., grep found nothing for that search on either the host or the
>>>>>>>>>>> engine-vm.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Susinthiran Sithamparanathan
>>>>>>>>>>>
--
Susinthiran Sithamparanathan