In 'verify_add_hosts' we specifically wait for a single host to be up, with a timeout:

    up_hosts = hosts_service.list(search='datacenter={} AND status=up'.format(DC_NAME))
    if len(up_hosts):
        return True
The log files say that it took ~50 seconds for one of the hosts to come up (which seems reasonable), and no timeout was reported.
Just after running 'verify_add_hosts', we run 'add_master_storage_domain', which calls the '_hosts_in_dc' function.
That function does the exact same check, but it fails:
    hosts = hosts_service.list(search='datacenter={} AND status=up'.format(dc_name))
    if hosts:
        if random_host:
            return random.choice(hosts)
        else:
            return sorted(hosts, key=lambda host: host.name)
    raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
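
Since the two checks run the same query at different times, their results can differ if a host changes state in between. A minimal sketch of how the second check could poll instead of failing on the first empty result is shown here. The helper name 'wait_for_up_hosts' and its parameters are hypothetical (they are not part of the test suite), and 'list_up_hosts' stands in for the 'hosts_service.list(...)' call quoted above. This would only mask a transient state change; it would not help if vdsm never starts on the host.

```python
import time


def wait_for_up_hosts(list_up_hosts, timeout=60.0, interval=3.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll list_up_hosts() until it returns a non-empty list or the timeout expires.

    list_up_hosts is a hypothetical zero-argument callable standing in for
    hosts_service.list(search='datacenter={} AND status=up'.format(dc_name)).
    clock and sleep are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        hosts = list_up_hosts()
        if hosts:
            return hosts
        # Give up only after at least one poll has been made.
        if clock() >= deadline:
            raise RuntimeError(
                'Could not find hosts that are up within %s seconds' % timeout)
        sleep(interval)
```

With something like this, '_hosts_in_dc' could call the helper rather than raising immediately, so a host that briefly leaves the 'up' state would not fail the test on the first query.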

I'm also not able to reproduce this issue locally on my server. The investigation continues...

Sure, I'm on it. It's weird though: I did run the 4.3 basic suite for this patch manually and everything was OK.

We are failing branch 4.3 for test: 002_bootstrap.add_master_storage_domain

It seems that on one of the hosts, vdsm is not starting;
there is nothing in vdsm.log or in supervdsm.log.

CQ identified this patch as the suspected root cause:

https://gerrit.ovirt.org/#/c/98748/ - vdsm: client: Add support for flow id

Milan, Marcin, can you please have a look?

full logs:

The only error I can see is about the host not being up (which makes sense, as vdsm is not running).


