Here are the VDSM logs ...



On Tue, Jan 3, 2017 at 09:58, Rogério Ceni Coelho <rogeriocenicoelho@gmail.com> wrote:
Hi Everyone,

I found a lot of "Heartbeat exceeded" errors like the ones below. I have attached some logs.

I am thinking of rolling back to 4.0.4 and CentOS 7.2 ...

2016-12-28 03:49:14,962 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [] Internal server error: null
2016-12-28 03:49:14,986 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm15-poa.rbs.com.br/10.151.252.235
2016-12-28 03:49:14,993 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm06-poa.rbs.com.br/10.151.252.226
2016-12-28 03:49:14,997 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm08-poa.rbs.com.br/10.151.252.228
2016-12-28 03:49:15,001 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm03-poa.rbs.com.br/10.151.252.30
2016-12-28 03:49:15,006 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm17-poa.rbs.com.br/10.151.252.237
2016-12-28 03:49:15,012 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm02-poa.rbs.com.br/10.151.252.223
2016-12-28 03:49:15,018 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler11) [36d1c163] Command 'GetAllVmStatsVDSCommand(HostName = prd-rbs-ovirt-kvm07-poa.rbs.com.br, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='1f128680-6152-4273-bb1d-8be545b43461', vds='Host[prd-rbs-ovirt-kvm07-poa.rbs.com.br,1f128680-6152-4273-bb1d-8be545b43461]'})' execution failed: VDSGenericException: VDSNetworkException: Heartbeat exceeded
2016-12-28 03:49:15,018 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (DefaultQuartzScheduler11) [36d1c163] Failed to fetch vms info for host 'prd-rbs-ovirt-kvm07-poa.rbs.com.br' - skipping VMs monitoring.
2016-12-28 03:49:15,018 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (org.ovirt.thread.pool-6-thread-8) [36d1c163] Host 'prd-rbs-ovirt-kvm07-poa.rbs.com.br' is not responding. It will stay in Connecting state for a grace period of 62 seconds and after that an attempt to fence the host will be issued.
2016-12-28 03:49:15,025 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler9) [13bbd638] Command 'GetAllVmStatsVDSCommand(HostName = prd-rbs-ovirt-kvm10-poa.rbs.com.br, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='8b4bc7dc-af8c-4415-a2ef-bccc11ddf23a', vds='Host[prd-rbs-ovirt-kvm10-poa.rbs.com.br,8b4bc7dc-af8c-4415-a2ef-bccc11ddf23a]'})' execution failed: VDSGenericException: VDSNetworkException: Heartbeat exceeded
2016-12-28 03:49:15,025 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (DefaultQuartzScheduler9) [13bbd638] Failed to fetch vms info for host 'prd-rbs-ovirt-kvm10-poa.rbs.com.br' - skipping VMs monitoring.
2016-12-28 03:49:15,026 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm05-poa.rbs.com.br/10.151.252.225
2016-12-28 03:49:15,027 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler32) [5b86105] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM prd-rbs-ovirt-kvm03-poa.rbs.com.br command failed: Heartbeat exceeded
2016-12-28 03:49:15,027 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler32) [5b86105] Command 'SpmStatusVDSCommand(HostName = prd-rbs-ovirt-kvm03-poa.rbs.com.br, SpmStatusVDSCommandParameters:{runAsync='true', hostId='f7842244-646c-400a-9736-f8d4aa9b1cef', storagePoolId='98867d75-9c43-46b4-891a-ff3a5eb0f06e'})' execution failed: VDSGenericException: VDSNetworkException: Heartbeat exceeded
2016-12-28 03:49:15,027 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (org.ovirt.thread.pool-6-thread-10) [5b86105] Host 'prd-rbs-ovirt-kvm03-poa.rbs.com.br' is not responding. It will stay in Connecting state for a grace period of 81 seconds and after that an attempt to fence the host will be issued.
2016-12-28 03:49:15,029 INFO  [org.ovirt.engine.core.bll.storage.pool.SetStoragePoolStatusCommand] (DefaultQuartzScheduler32) [613ae8a1] Running command: SetStoragePoolStatusCommand internal: true. Entities affected :  ID: 98867d75-9c43-46b4-891a-ff3a5eb0f06e Type: StoragePool
2016-12-28 03:49:15,030 INFO  [org.ovirt.engine.core.vdsbroker.storage.StoragePoolDomainHelper] (DefaultQuartzScheduler32) [613ae8a1] Storage Pool '98867d75-9c43-46b4-891a-ff3a5eb0f06e' - Updating Storage Domain '7b8c9293-f103-401a-93ac-550981837224' status from 'Active' to 'Unknown', reason: null
2016-12-28 03:49:15,030 INFO  [org.ovirt.engine.core.vdsbroker.storage.StoragePoolDomainHelper] (DefaultQuartzScheduler32) [613ae8a1] Storage Pool '98867d75-9c43-46b4-891a-ff3a5eb0f06e' - Updating Storage Domain 'cfdbbda4-bd72-4c58-af73-8aa89d62ff01' status from 'Active' to 'Unknown', reason: null
2016-12-28 03:49:15,030 INFO  [org.ovirt.engine.core.vdsbroker.storage.StoragePoolDomainHelper] (DefaultQuartzScheduler32) [613ae8a1] Storage Pool '98867d75-9c43-46b4-891a-ff3a5eb0f06e' - Updating Storage Domain '0b5015d2-8f05-44c5-9e5a-d732b0b0e419' status from 'Active' to 'Unknown', reason: null
2016-12-28 03:49:15,031 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to prd-rbs-ovirt-kvm13-poa.rbs.com.br/10.151.252.233
2016-12-28 03:49:15,033 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler21) [b33cc28] Command 'GetAllVmStatsVDSCommand(HostName = prd-rbs-ovirt-kvm19-poa.rbs.com.br, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='887b6e35-1fd1-4cd6-9e78-05bcab12a417', vds='Host[prd-rbs-ovirt-kvm19-poa.rbs.com.br,887b6e35-1fd1-4cd6-9e78-05bcab12a417]'})' execution failed: VDSGenericException: VDSNetworkException: Heartbeat exceeded
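
To see which hosts are affected and how often, the "Heartbeat exceeded" lines in an excerpt like the one above can be tallied per host. The following is only a rough sketch, not an oVirt tool: the function names are made up for illustration, and the two regular expressions assume the exact line shapes shown in this excerpt (the `HostName = ...` form and the `VDSM ... command failed` form). The log path is passed in by the caller; `/var/log/ovirt-engine/engine.log` is assumed here as the usual location.

```python
import re
from collections import Counter

# Host name extraction for the two line shapes seen in the excerpt above:
#   Command '...(HostName = <host>, ...)' execution failed: ... Heartbeat exceeded
#   Message: VDSM <host> command failed: Heartbeat exceeded
# These patterns are assumptions based on that excerpt only.
HOST_PATTERNS = [
    re.compile(r"HostName = ([^,\s]+),"),
    re.compile(r"VDSM (\S+) command failed"),
]


def count_heartbeat_errors(lines):
    """Return a Counter mapping host name -> number of 'Heartbeat exceeded' lines."""
    counts = Counter()
    for line in lines:
        if "Heartbeat exceeded" not in line:
            continue
        for pattern in HOST_PATTERNS:
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
                break  # count each log line at most once
    return counts


def count_heartbeat_errors_in_file(path):
    """Convenience wrapper (hypothetical helper): tally errors in a log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_heartbeat_errors(handle)
```

For example, `count_heartbeat_errors_in_file("/var/log/ovirt-engine/engine.log").most_common()` would list hosts ordered by number of heartbeat errors, which may help tell a single flaky host apart from an engine-wide problem.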



On Mon, Jan 2, 2017 at 17:18, Yaniv Kaul <ykaul@redhat.com> wrote:
On Mon, Jan 2, 2017 at 8:51 PM, Rogério Ceni Coelho <rogeriocenicoelho@gmail.com> wrote:

Hi oVirt Gurus,


Happy new year to everyone !!!

 

I updated oVirt Engine from 4.0.4 to 4.0.5 and CentOS from 7.2 to 7.3 last week, and since then I have had instability four times. Each time, the oVirt engine seems to lose communication with one or more node servers, as in the image below. Each time, I rebooted the oVirt engine server and everything came back to normal.


It'd be great if you could share logs - engine.log from the Engine and vdsm.log from the host(s).
Y.
 

 

Has anyone else had this kind of problem?

 

[inline image: pasted1]


After the reboot:

 

[inline image: pasted2]

 


