On Mon, Oct 9, 2017 at 6:00 PM, Maton, Brett <matonb@ltresources.co.uk> wrote:
Hope this is enough of the log; no idea why it's saying invalid fence agents:

2017-10-09 16:01:01,535+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-184) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ov_host.example.com command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=47d12516-a41c-41a7-9da4-320f08fda147, msdUUID=a28ebb26-98be-4850-8e88-0ac2c3d6037a'
2017-10-09 16:01:01,535+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-184) [] HostName = ov_host.example.com
2017-10-09 16:01:01,535+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-184) [] Command 'ConnectStoragePoolVDSCommand(HostName = ov_host.example.com, ConnectStoragePoolVDSCommandParameters:{hostId='9390df92-5c79-4a25-9e94-1ca2d4aedc74', vdsId='9390df92-5c79-4a25-9e94-1ca2d4aedc74', storagePoolId='47d12516-a41c-41a7-9da4-320f08fda147', masterVersion='2'})' execution failed: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain: u'spUUID=47d12516-a41c-41a7-9da4-320f08fda147, msdUUID=a28ebb26-98be-4850-8e88-0ac2c3d6037a'
2017-10-09 16:01:01,535+01 INFO  [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-184) [] Could not connect host 'ov_host.example.com' to pool 'GHDC', as the master domain is in inactive/unknown status - not failing the operation

This looks like a severe storage issue: if your master storage domain is inactive/unknown, your VMs cannot run. You would need to check the VDSM logs to find the real cause. Are you sure there are no storage or network issues in your setup?

 
2017-10-09 16:01:01,565+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SetMOMPolicyParametersVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [594458d4] START, SetMOMPolicyParametersVDSCommand(HostName = ov_host.example.com, MomPolicyVDSParameters:{hostId='9390df92-5c79-4a25-9e94-1ca2d4aedc74'}), log id: 43532a7a
2017-10-09 16:01:01,595+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-210) [6eeda756] START, SpmStopVDSCommand(HostName = ov_host.example.com, SpmStopVDSCommandParameters:{hostId='9390df92-5c79-4a25-9e94-1ca2d4aedc74', storagePoolId='47d12516-a41c-41a7-9da4-320f08fda147'}), log id: 3861bb24
2017-10-09 16:01:01,602+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-210) [6eeda756] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ov_host.example.com command HSMGetAllTasksStatusesVDS failed: Not SPM: ()
2017-10-09 16:01:01,602+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-210) [6eeda756] HostName = ov_host.example.com
2017-10-09 16:01:01,602+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-210) [6eeda756] Command 'HSMGetAllTasksStatusesVDSCommand(HostName = ov_host.example.com, VdsIdVDSCommandParametersBase:{hostId='9390df92-5c79-4a25-9e94-1ca2d4aedc74'})' execution failed: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM: ()
2017-10-09 16:01:01,653+01 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [2e03c9cf] EVENT_ID: VDS_DETECTED(13), Status of host ov_host.example.com was set to Up.
2017-10-09 16:01:01,656+01 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [2e03c9cf] EVENT_ID: VDS_ALERT_FENCE_TEST_FAILED(9,001), Power Management test failed for Host ov_host.example.com.Invalid fence agents defined for host '192.168.1.12'.
2017-10-09 16:01:01,659+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [2e03c9cf] EVENT_ID: VDS_FENCE_STATUS_FAILED(497), Failed to verify Host ov_host.example.com power management.

We try to get the power management status of the host when activating it. In this case, an error occurred while getting that status; more details should be in the VDSM logs of the host that acted as fence proxy.
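
To narrow the engine log down to the audit events quoted above, a minimal Python sketch follows. It assumes the line layout seen in these excerpts (timestamp, level, logger in brackets, then `EVENT_ID: NAME(code), message`); the name `parse_events` and the shortened sample lines are illustrative and not part of oVirt.

```python
import re

# Assumed layout, inferred only from the engine.log excerpts in this thread:
# "<date> <time>,<ms>+<tz> <LEVEL>  [<logger>] (<thread>) [<id>] EVENT_ID: NAME(code), message"
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\S*\s+"
    r"(?P<level>[A-Z]+)\s+.*?"
    r"EVENT_ID: (?P<event>[A-Z0-9_]+)\((?P<code>[\d,]+)\), (?P<msg>.*)$"
)


def parse_events(lines, levels=("ERROR", "WARN")):
    """Return (timestamp, level, event, message) tuples for matching audit events."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        # Skip lines that are not audit events or are below the wanted levels.
        if m and m.group("level") in levels:
            events.append(
                (m.group("ts"), m.group("level"), m.group("event"), m.group("msg"))
            )
    return events


# Hypothetical, shortened sample lines modelled on the excerpt above.
sample = [
    "2017-10-09 16:01:01,653+01 INFO  [a.AuditLogDirector] (t-1) [2e03c9cf] "
    "EVENT_ID: VDS_DETECTED(13), Status of host ov_host.example.com was set to Up.",
    "2017-10-09 16:01:01,659+01 ERROR [a.AuditLogDirector] (t-1) [2e03c9cf] "
    "EVENT_ID: VDS_FENCE_STATUS_FAILED(497), Failed to verify Host ov_host.example.com power management.",
]

for event in parse_events(sample):
    print(event)
```

Passing `levels=("ERROR", "WARN", "INFO")` would also keep informational events such as `VDS_DETECTED`.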
 




On 9 October 2017 at 14:41, Maton, Brett <matonb@ltresources.co.uk> wrote:
Hi Martin,

  Sure. Is there a way to trigger the check (I cleared the alerts), or do I just need to have a poke around the engine log next time it happens?

Yes, please take a look at the engine-config option PMHealthCheckIntervalInSec:

  engine-config -l | grep PMHealthCheckIntervalInSec

After updating this value, you need to restart the ovirt-engine service.

Regards,
Brett

On 9 October 2017 at 14:26, Martin Perina <mperina@redhat.com> wrote:


On Mon, Oct 9, 2017 at 2:40 PM, Maton, Brett <matonb@ltresources.co.uk> wrote:
Hi,

  I recently upgraded my testlab to oVirt 4.2-pre and have been having some issues.

  One of which is fencing with Dell Drac 8 remote access cards.
  The configuration I'm using works fine on my other (4.1.6) cluster...

  Error:
Health check on Host host.example.com indicates that future attempts to Start this host using Power-Management are expected to fail.

Hi,

Could you please share the engine.log around the above message? We execute a get power management status call as the health check, so if that call fails, the above error is raised.
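
One way to cut out just the part of the log around that message is sketched below in Python. The helper name `lines_around` and the default of 20 context lines are assumptions for illustration; the path in the usage comment is the usual default engine log location and may differ on your installation.

```python
def lines_around(lines, needle, context=20):
    """Return lines within `context` lines of any line containing `needle`.

    Lines are returned in their original order, without duplicates, even
    when several matches have overlapping context windows.
    """
    keep = set()
    for i, line in enumerate(lines):
        if needle in line:
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    return [lines[i] for i in sorted(keep)]


# Usage sketch (path is an assumption, adjust to your installation):
#
# with open("/var/log/ovirt-engine/engine.log", errors="replace") as fh:
#     excerpt = lines_around(fh.read().splitlines(), "Health check on Host")
# print("\n".join(excerpt))
```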

Thanks

Martin
 


  Fence agent setup for the host appears to be working though:

 


  Which logs would be helpful in debugging this issue?

Regards,
Brett

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users