Good day to all.
I have some issues with oVirt 4.2.6. Here is the main one:
I have two CentOS 7 nodes with the same configuration, running the latest
oVirt 4.2.6, with a HostedEngine whose disk is on NFS storage.
Several virtual machines are also running fine.
When the HostedEngine runs on one node (srv02.local), everything is fine.
After migrating it to the other node (srv00.local), I see that the agent
cannot check the liveliness of the HostedEngine. After a few minutes the
HostedEngine reboots, and after some time I see the same situation. After
migrating it back to the other node (srv02.local), everything looks OK.
Output of the hosted-engine --vm-status command while the HostedEngine is on the srv00 node:
--== Host 1 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : srv02.local
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"}
Score : 0
stopped : False
Local maintenance : False
crc32 : ecc7ad2d
local_conf_timestamp : 78328
Host timestamp : 78328
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=78328 (Tue Sep 18 12:44:18 2018)
host-id=1
score=0
vm_conf_refresh_time=78328 (Tue Sep 18 12:44:18 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineUnexpectedlyDown
stopped=False
timeout=Fri Jan 2 03:49:58 1970
--== Host 2 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : srv00.local
Host ID : 2
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Here "vm": "up" is the VM status at the virt level, obtained by polling the
local vdsm, whereas "health": "bad" comes from a live check of the engine
portal over HTTP. Bad name resolution or network routing issues can cause
this, so I'd suggest checking that everything is fine on the network side.
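One way to narrow this down is to run the same kind of check by hand from srv00 and from srv02 and compare. The following is only a minimal sketch, not the agent's actual code: the function names are made up for illustration, engine.example.local is a placeholder for the real engine FQDN, and the health page path /ovirt-engine/services/health is an assumption that should be verified on your setup.

```python
import json
import socket
import urllib.request

# Assumption: the engine serves its health page at this path.
HEALTH_PATH = "/ovirt-engine/services/health"


def interpret_engine_status(raw):
    """Explain an 'Engine status' JSON blob from hosted-engine --vm-status."""
    status = json.loads(raw)
    vm, health = status.get("vm"), status.get("health")
    if vm == "up" and health == "good":
        return "engine VM is up and answering the health check"
    if vm == "up":
        return "VM is up at the virt level but the engine health check fails"
    return "engine VM is not running on this host"


def probe_engine(fqdn, timeout=5.0):
    """Resolve the engine FQDN from this host, then fetch the health page."""
    try:
        addr = socket.gethostbyname(fqdn)
    except socket.gaierror as exc:
        return "name resolution failed: %s" % exc
    url = "http://%s%s" % (fqdn, HEALTH_PATH)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "resolved to %s, HTTP %d" % (addr, resp.status)
    except OSError as exc:  # covers URLError, HTTPError and timeouts
        return "resolved to %s, but HTTP check failed: %s" % (addr, exc)
```

Running probe_engine("engine.example.local") on both hosts (with the real FQDN substituted) should show whether srv00 resolves the name to a different address or cannot reach the portal at all, while srv02 can.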
Score : 3400
stopped : False
Local maintenance : False
crc32 : 1d62b106
local_conf_timestamp : 326288
Host timestamp : 326288
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=326288 (Tue Sep 18 12:44:21 2018)
host-id=2
score=3400
vm_conf_refresh_time=326288 (Tue Sep 18 12:44:21 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineStarting
stopped=False
Log agent.log from srv00.local:
MainThread::INFO::2018-09-18 12:40:51,749::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:40:52,052::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:01,066::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:01,374::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host srv02.local.pioner.kz (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=78128 (Tue Sep 18 12:40:58 2018)\nhost-id=1\nscore=0\nvm_conf_refresh_time=78128 (Tue Sep 18 12:40:58 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUnexpectedlyDown\nstopped=False\ntimeout=Fri Jan 2 03:49:58 1970\n', 'hostname': 'srv02.local.pioner.kz', 'alive': True, 'host-id': 1, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down_unexpected', 'detail': 'unknown'}, 'score': 0, 'stopped': False, 'maintenance': False, 'crc32': 'e18e3f22', 'local_conf_timestamp': 78128, 'host-ts': 78128}
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'failed liveliness check', 'health': 'bad', 'vm': 'up', 'detail': 'Up'}, 'bridge': True, 'mem-free': 12763.0, 'maintenance': False, 'cpu-load': 0.0364, 'gateway': 1.0, 'storage-domain': True}
MainThread::INFO::2018-09-18 12:41:11,393::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:11,703::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:21,716::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:22,020::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:31,033::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:31,344::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
As we can see, the agent thinks that the HostedEngine is still powering up.
I cannot do anything about it. I have already reinstalled the srv00 node
many times without success.
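To put a number on how long the agent sits in EngineStarting, the "_monitoring_loop" entries of agent.log can be summarised. This is a rough sketch under the assumption that each log entry is on a single line in the file, as it normally is; the function name state_durations is made up for illustration.

```python
import re
from datetime import datetime

# Matches the "_monitoring_loop" entries of agent.log, one entry per line.
STATE_RE = re.compile(
    r"::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::.*"
    r"Current state (\w+) \(score: (\d+)\)"
)


def state_durations(lines):
    """Return [(state, first_seen, last_seen)] for consecutive runs of a state."""
    runs = []
    for line in lines:
        match = STATE_RE.search(line)
        if not match:
            continue  # skip "(consume)", "(refresh)" and other entries
        seen = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        state = match.group(2)
        if runs and runs[-1][0] == state:
            # Same state as the previous entry: extend the current run.
            runs[-1] = (state, runs[-1][1], seen)
        else:
            runs.append((state, seen, seen))
    return runs
```

Feeding it the lines of the agent log (usually /var/log/ovirt-hosted-engine-ha/agent.log) gives one tuple per stretch of identical state, so a long EngineStarting run followed by a restart is easy to spot.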
At one point I even had to uninstall the ovirt* and vdsm* packages. One
interesting point here: after only running "yum install
http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm" on this node,
I tried to install the node from the engine web interface with the "Deploy"
action. The installation was unsuccessful until I installed
ovirt-hosted-engine-ha on the node. I do not see in the documentation that
this is needed before installing new hosts, so I mention it for information
and checking. After installing ovirt-hosted-engine-ha, the node was
installed with HostedEngine support, but the main issue did not change.
Thanks in advance for help.
BR,
Alexandr