Hi, I have an oVirt instance (4.2.1.6-1.el7.centos) with a cluster backed by Gluster. The
hosts periodically become non-responsive and the VMs stop responding. This usually happens
after I get the message "command GetGlusterVolumeHealInfoVDS failed: Message timeout which
can be caused by communication issues".
Would increasing the timeout for getting the heal status solve the problem? If so, how do I do it?
I have attached part of the log below:
https://paste.fedoraproject.org/paste/8TTzwjMbYk32d7wd7Ix0Pw/raw
2018-10-15 14:44:22,582+03 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler6) [70cfd553] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM
ovirt3.example.org command GetGlusterVolumeHealInfoVDS failed: Message timeout which can
be caused by communication issues
2018-10-15 14:44:22,584+03 ERROR
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeHealInfoVDSCommand]
(DefaultQuartzScheduler6) [70cfd553] Command
'GetGlusterVolumeHealInfoVDSCommand(HostName = ovirt3.example.org,
GlusterVolumeVDSParameters:{hostId='39215015-2537-4329-921f-c11256f99e04',
volumeName='domain1'})' execution failed: VDSGenericException:
VDSNetworkException: Message timeout which can be caused by communication issues
2018-10-15 14:44:22,584+03 WARN [org.ovirt.engine.core.vdsbroker.VdsManager]
(EE-ManagedThreadFactory-engine-Thread-7) [70cfd553] Host 'ovirt3.example.org' is
not responding. It will stay in Connecting state for a grace period of 77 seconds and
after that an attempt to fence the host will be issued.
2018-10-15 14:44:22,591+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-7) [70cfd553] EVENT_ID:
VDS_HOST_NOT_RESPONDING_CONNECTING(9,008), Host ovirt3.example.org is not responding. It
will stay in Connecting state for a grace period of 77 seconds and after that an attempt
to fence the host will be issued.
2018-10-15 14:44:54,620+03 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-13) [] EVENT_ID: VDS_STORAGE_VDS_STATS_FAILED(189),
Host ovirt3.example.org reports about one of the Active Storage Domains as Problematic.
2018-10-15 14:44:54,827+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-46) [6d9504d1] EVENT_ID:
VDS_SET_NONOPERATIONAL_DOMAIN(522), Host ovirt3.example.org cannot access the Storage
Domain(s) DOMAIN1 attached to the Data Center Default. Setting Host state to
Non-Operational.
2018-10-15 14:44:54,840+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-46) [6d9504d1] EVENT_ID:
CONNECT_STORAGE_POOL_FAILED(995), Failed to connect Host ovirt3.example.org to Storage
Pool Default
2018-10-15 14:45:28,698+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-87) [] EVENT_ID: VM_NOT_RESPONDING(126),
VM HostedEngine is not responding.
2018-10-15 14:45:30,296+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] EVENT_ID: VM_NOT_RESPONDING(126),
VM vm2 is not responding.
2018-10-15 14:45:30,362+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] EVENT_ID: VM_NOT_RESPONDING(126),
VM vm3 is not responding.
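
To make the ordering of these events easier to see, here is a rough Python sketch that pulls the EVENT_ID entries out of an engine.log excerpt and prints how many seconds after the first event each one occurred. It is only an illustration: the names parse_events and SAMPLE are made up for this example, the sample is a shortened copy of two records from the log above, and the regular expressions assume the record layout seen in this excerpt (timestamp at the start of each record, wrapped continuation lines without one).

```python
import re
from datetime import datetime

# Timestamp that starts each record, e.g. "2018-10-15 14:44:22,582+03".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})[+-]\d{2}")
# "EVENT_ID: NAME(10,802)"; the numeric code may contain a comma.
EVENT_RE = re.compile(r"EVENT_ID:\s*([A-Z_]+)\(([\d,]+)\)")

# Shortened copy of two records from the excerpt above (assumption: same layout).
SAMPLE = """\
2018-10-15 14:44:22,582+03 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler6) [70cfd553] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM
ovirt3.example.org command GetGlusterVolumeHealInfoVDS failed: Message timeout which can
be caused by communication issues
2018-10-15 14:44:54,827+03 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-46) [6d9504d1] EVENT_ID:
VDS_SET_NONOPERATIONAL_DOMAIN(522), Host
ovirt3.example.org cannot access the Storage
Domain(s) DOMAIN1 attached to the Data Center Default. Setting Host state to
Non-Operational.
"""


def parse_events(text):
    """Return a list of (timestamp, event_name) for records that carry an EVENT_ID."""
    # Join wrapped lines: a new record begins at every line that starts with a timestamp.
    records = []
    current = None
    for line in text.splitlines():
        if TS_RE.match(line):
            if current is not None:
                records.append(current)
            current = line.strip()
        elif current is not None:
            current += " " + line.strip()
    if current is not None:
        records.append(current)

    events = []
    for record in records:
        event = EVENT_RE.search(record)
        if event is None:
            continue  # record has no audit-log event, skip it
        ts_match = TS_RE.match(record)
        ts = datetime.strptime(ts_match.group(1), "%Y-%m-%d %H:%M:%S")
        ts = ts.replace(microsecond=int(ts_match.group(2)) * 1000)
        events.append((ts, event.group(1)))
    return events


if __name__ == "__main__":
    events = parse_events(SAMPLE)
    start = events[0][0]
    for ts, name in events:
        offset = (ts - start).total_seconds()
        print(f"+{offset:7.3f}s  {name}")
```

To run it against a real log, read the file contents into a string and pass that to parse_events instead of SAMPLE. The timezone offset is ignored here, which is fine as long as all records come from the same engine host.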