After further investigation, the root of the issue seems to be vdsm receiving SIGTERM on the only host that is in up state [1]:
[vds] Received signal 15, shutting down (vdsmd:70)
while the other host is still in status Installing, so it cannot be used for fencing (hence the fence action failure).
vdsm then comes back up within a few moments, but the engine, which expects the host to be up the whole time, meanwhile fails an operation that requires the host to be up.

[1] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/15829/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
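To check other vdsm.log files for the same shutdown, something like the following could be used. It is only a minimal sketch: the function name find_shutdowns is made up for illustration, and the pattern matches just the message text quoted above, without assuming anything about the rest of the log line layout.

```python
import re

# Matches only the message text quoted in this mail; timestamp, level and
# thread fields of the real log line are deliberately not assumed.
SHUTDOWN_RE = re.compile(r"Received signal (\d+), shutting down")


def find_shutdowns(lines):
    """Return (line_number, signal_number) for every shutdown message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SHUTDOWN_RE.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits


# Sample input: the first entry is the message from the failing job.
sample = [
    "[vds] Received signal 15, shutting down (vdsmd:70)",
    "some unrelated line",
]
print(find_shutdowns(sample))
```

Against a downloaded log, passing the open file object (for example `find_shutdowns(open("vdsm.log"))`) should work the same way, since the function only iterates over lines.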

On Fri, Sep 13, 2019 at 5:18 PM Dusan Fodor <dfodor@redhat.com> wrote:
For brave investigators, a similar issue in a later stage of the same test can be found here [1]. It shows the same symptom of a failed fence action, but this time it makes adding the storage itself fail:
2019-09-12 09:53:32,571-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Cannot attach Storage. There is no active Host in the Data Center.]

On Fri, Sep 13, 2019 at 5:09 PM Dusan Fodor <dfodor@redhat.com> wrote:
Hello all,
lately I have witnessed multiple failures of the add_master_storage_domain test that were related neither to the changes themselves nor to any infra issue. One example can be found here [1].
After investigation, with huge help from Milan, the issue is that the host suddenly falls from up state to whatever-but-not-up.

  1. add_storage_domain picks a random host that is in up state
  2. meanwhile the engine starts a fence action for it, so something has probably gone wrong with the host; the fence action fails with: [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [6692895f] Can not run fence action on host 'lago-basic-suite-master-host-0', no suitable proxy host was found.
  3. the test fails because it cannot attach the domain to a non-up host: [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Cannot add storage server connection when Host status is not up]
For better orientation in the failed job's engine log [1], the fence action for the host fails at
:46:12,842-04
and the engine learns it cannot connect storage to the host at
:46:16,105-04

The add_master_storage_domain test itself starts at ~ :46:13,753 (according to the lago log).
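To make the ordering of these three timestamps easier to see, here is a small sketch that sorts them and prints the offset from the first event. The helper names to_ms and timeline are made up for illustration. The hour is not given above, so only minute:second,millisecond values are compared, and the sketch assumes the engine log and the lago log use the same clock and timezone.

```python
# Timestamps copied from this mail; the hour is elided in the original,
# so only offsets within the hour are compared. Assumption: engine and
# lago logs share the same clock and timezone.
EVENTS = {
    "fence action fails (engine log)": "46:12,842",
    "add_master_storage_domain starts (lago log, approximate)": "46:13,753",
    "engine cannot connect storage to host (engine log)": "46:16,105",
}


def to_ms(stamp):
    """Convert 'MM:SS,mmm' into milliseconds since the start of the hour."""
    minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return (int(minutes) * 60 + int(seconds)) * 1000 + int(millis)


def timeline(events):
    """Return (event_name, ms_since_first_event) sorted by time."""
    ordered = sorted(events.items(), key=lambda item: to_ms(item[1]))
    first = to_ms(ordered[0][1])
    return [(name, to_ms(stamp) - first) for name, stamp in ordered]


for name, delta in timeline(EVENTS):
    print(f"+{delta:>5} ms  {name}")
```

Under those assumptions, the fence action fails roughly a second before the test starts, and the storage connection error follows a few seconds later.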

Could you please check this?
Thanks