On Thu, Dec 22, 2011 at 01:05:22PM +0200, Roy Golan wrote:
Hi all,
The VDSM network provisioning API exposes a validity check to confirm that
the newly applied changes work. The actual
check compares the modification time of /var/run/vdsm/client.log to the
time when the check began, and repeats
that check after sleeping 1 second, up to X times (where X is
connectivityTimeOut):
    def clientSeen(timeout):
        start = time.time()
        while timeout >= 0:
            if os.stat(constants.P_VDSM_CLIENT_LOG).st_mtime > start:
                return True
            time.sleep(1)
            timeout -= 1
        return False
The main issues I see are:
1. When the host is in maintenance, the caller of the API must
generate traffic concurrently with the running API call,
and must then join and sync threads to know when everything is done. See
http://gerrit.ovirt.org/#change,584
2. Calling vdsClient locally also modifies client.log, so we
can't rely on no one doing that during the call.
TODO: when verifying connectivity, vdsm should ack only if the caller of
setupNetworks is logged; communication from other IPs should be ignored
for that purpose.
Until then, customers should remember that vdsClient is unsupported, and
can cause much worse things than fooling Vdsm into believing it has
ovirt-engine connectivity.
3. Failure writing to the client log will fail network provisioning!
If vdsm cannot touch a zero-length file, it has more serious problems.
4. We verify connectivity only of the management network. It could be
much nicer if we had an independent verification for other networks.
avahi protocol could give this + auto discovery.
All of the above makes the check not very reliable, and hard for a
client to call without introducing races.
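To make the TODO under item 2 concrete, here is a minimal sketch of a check that acks only when the caller of setupNetworks itself has been seen. It is not existing vdsm code: callerSeen, the last_seen mapping (peer IP to the time of its last request) and the injectable clock/sleep arguments are all hypothetical names used for illustration, and it assumes the request handler would keep last_seen up to date.

```python
import time


def callerSeen(last_seen, caller_ip, timeout, clock=time.time, sleep=time.sleep):
    """Return True once caller_ip has made a request after the check began.

    last_seen is a hypothetical {peer_ip: timestamp} mapping that the
    request handler would update on every incoming call. Requests from
    any other address (including a local vdsClient) are ignored.
    """
    start = clock()
    while timeout >= 0:
        # Only traffic from the original caller counts as connectivity.
        if last_seen.get(caller_ip, 0) > start:
            return True
        sleep(1)
        timeout -= 1
    return False
```

With something like this, a local vdsClient call (arriving from 127.0.0.1) or a request from an unrelated host would no longer satisfy the check, since only the entry for caller_ip is consulted.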
Possible alternative solutions:
1. We can try to reach the API caller's socket in return; maybe use
HTTP status code 100?
2. Pass a URL in the API call, which VDSM will then call. It could be a
health-check servlet or something similar.
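As a rough illustration of option 2, the sketch below polls a caller-supplied URL instead of watching client.log. Nothing here is an existing vdsm API: httpOk, callbackReachable and the fetch/sleep parameters are hypothetical, and it assumes the caller exposes a plain HTTP endpoint that answers 2xx when healthy.

```python
import time
import urllib.request


def httpOk(url, timeout=2):
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses, so refused
        # connections, timeouts and non-2xx replies all land here.
        return False


def callbackReachable(url, timeout, fetch=httpOk, sleep=time.sleep):
    """Try the caller-supplied URL about once a second, like clientSeen."""
    while timeout >= 0:
        if fetch(url):
            return True
        sleep(1)
        timeout -= 1
    return False
```

The retry loop deliberately mirrors clientSeen, so connectivityTimeOut would keep its meaning; only the evidence of connectivity changes, from "someone touched the log" to "the caller's own endpoint answered".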
Any thoughts?