On Tue, Oct 16, 2018 at 11:39 PM Spickiy Nikita <n.spickiy@outlook.com> wrote:
Hi, I have an oVirt instance (4.2.1.6-1.el7.centos) with a cluster backed by Gluster. Hosts periodically become non-responsive and VMs stop responding. This usually happens after the message "command GetGlusterVolumeHealInfoVDS failed: Message timeout which can be caused by communication issues".

Would increasing the timeout for getting the heal status solve the problem? If so, how do I do it?

Part of the engine log is attached below:

2018-10-15 14:44:22,582+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [70cfd553] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ovirt3.example.org command GetGlusterVolumeHealInfoVDS failed: Message timeout which can be caused by communication issues
2018-10-15 14:44:22,584+03 ERROR [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeHealInfoVDSCommand] (DefaultQuartzScheduler6) [70cfd553] Command 'GetGlusterVolumeHealInfoVDSCommand(HostName = ovirt3.example.org, GlusterVolumeVDSParameters:{hostId='39215015-2537-4329-921f-c11256f99e04', volumeName='domain1'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-10-15 14:44:22,584+03 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engine-Thread-7) [70cfd553] Host 'ovirt3.example.org' is not responding. It will stay in Connecting state for a grace period of 77 seconds and after that an attempt to fence the host will be issued.
2018-10-15 14:44:22,591+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-7) [70cfd553] EVENT_ID: VDS_HOST_NOT_RESPONDING_CONNECTING(9,008), Host ovirt3.example.org is not responding. It will stay in Connecting state for a grace period of 77 seconds and after that an attempt to fence the host will be issued.
2018-10-15 14:44:54,620+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-13) [] EVENT_ID: VDS_STORAGE_VDS_STATS_FAILED(189), Host ovirt3.example.org reports about one of the Active Storage Domains as Problematic.
2018-10-15 14:44:54,827+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-46) [6d9504d1] EVENT_ID: VDS_SET_NONOPERATIONAL_DOMAIN(522), Host ovirt3.example.org cannot access the Storage Domain(s) DOMAIN1 attached to the Data Center Default. Setting Host state to Non-Operational.
2018-10-15 14:44:54,840+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-46) [6d9504d1] EVENT_ID: CONNECT_STORAGE_POOL_FAILED(995), Failed to connect Host ovirt3.example.org to Storage Pool Default
2018-10-15 14:45:28,698+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-87) [] EVENT_ID: VM_NOT_RESPONDING(126), VM HostedEngine is not responding.
2018-10-15 14:45:30,296+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-72) [] EVENT_ID: VM_NOT_RESPONDING(126), VM vm2 is not responding.
2018-10-15 14:45:30,362+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-72) [] EVENT_ID: VM_NOT_RESPONDING(126), VM vm3 is not responding.



Can you check the vdsm log to see whether you're running into this bug? https://bugzilla.redhat.com/show_bug.cgi?id=1614430
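
To know which time windows to look at in the vdsm log, it may help to first list what the engine reported right after each GetGlusterVolumeHealInfoVDS timeout. The following is only a minimal sketch, assuming the engine log lines have the same shape as the excerpt quoted above; the function names and the 120-second window are made up for illustration and are not part of oVirt.

```python
import re
from datetime import datetime

# Assumption: audit-log lines look like the excerpt in this thread, e.g.
# "2018-10-15 14:44:22,582+03 ERROR [logger] (thread) [id] EVENT_ID: NAME(10,802), message"
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\S*\s+"
    r"(?P<level>[A-Z]+)\s+.*?EVENT_ID: (?P<event>[A-Z_]+)\((?P<code>[\d,]+)\), "
    r"(?P<msg>.*)$"
)


def parse_events(lines):
    """Return (timestamp, level, event_name, message) for each EVENT_ID line."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, m.group("level"), m.group("event"), m.group("msg")))
    return events


def events_after_heal_timeout(events, window_seconds=120):
    """Pair each heal-info timeout with the events logged shortly after it.

    window_seconds is an arbitrary illustrative value, not an oVirt setting.
    """
    result = []
    for i, (ts, level, event, msg) in enumerate(events):
        is_heal_timeout = (
            event == "VDS_BROKER_COMMAND_FAILURE"
            and "GetGlusterVolumeHealInfoVDS" in msg
        )
        if is_heal_timeout:
            followers = [
                e for e in events[i + 1:]
                if 0 <= (e[0] - ts).total_seconds() <= window_seconds
            ]
            result.append(((ts, level, event, msg), followers))
    return result
```

Feeding it the lines of the engine log (for example via `open(path)`, where `path` points to your engine log file) gives, for each timeout, the timestamps to search for in the vdsm log on the affected host.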

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XK7YX6FINFOKA7WGK2ST7KGTCICS6M25/