On Tue, Nov 8, 2016 at 6:52 PM, Martin Sivak <msivak(a)redhat.com> wrote:
Hi,
mom-vdsm.service contains:
Requires=vdsmd.service
After=vdsmd.service
After= does not mean much here, since vdsm is not integrated with systemd:
systemd does not wait until vdsm is ready to receive connections.
We can use systemd.daemon to notify systemd that vdsm is ready, making
dependencies more reliable.
See
https://github.com/oVirt/ovirt-imageio/blob/master/daemon/ovirt_imageio_d...
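To illustrate the readiness notification idea, here is a minimal sketch of
the sd_notify protocol using only the Python standard library. It is not
vdsm's or ovirt-imageio's actual code; the function name notify_ready is
hypothetical, and a real daemon would more likely call the systemd.daemon
bindings mentioned above. The daemon would call it only after its listening
socket is accepting connections.

```python
import os
import socket


def notify_ready(state=b"READY=1"):
    """Tell systemd the service is ready (hypothetical helper).

    Returns False when NOTIFY_SOCKET is not set, i.e. the process was
    not started by systemd with notification enabled.
    """
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        # A leading "@" denotes a Linux abstract namespace socket.
        addr = "\0" + addr[1:]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # systemd reads newline-separated KEY=VALUE datagrams.
        sock.sendto(state, addr)
    finally:
        sock.close()
    return True
```

For this to affect ordering, the vdsmd unit would also need Type=notify, so
that units ordered with After=vdsmd.service start only once READY=1 has
been sent.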
So when Shira restarted vdsm, mom was also restarted.
[journalctl --unit vdsmd]
Nov 08 18:25:27 RHEL7.2Server systemd[1]: Stopping Virtual Desktop
Server Manager...
Nov 08 18:25:27 RHEL7.2Server vdsmd_init_common.sh[3053]: vdsm:
Running run_final_hooks
Nov 08 18:25:27 RHEL7.2Server systemd[1]: Starting Virtual Desktop
Server Manager...
[journalctl --unit mom-vdsm]
Nov 08 18:17:23 RHEL7.2Server systemd[1]: Starting MOM instance
configured for VDSM purposes...
Nov 08 18:25:16 RHEL7.2Server systemd[1]: Stopping MOM instance
configured for VDSM purposes...
Nov 08 18:25:29 RHEL7.2Server systemd[1]: Started MOM instance
configured for VDSM purposes.
But mom then immediately failed with:
2016-11-08 18:25:08,008 - mom.RPCServer - INFO - ping()
2016-11-08 18:25:08,010 - mom.RPCServer - INFO - getStatistics()
2016-11-08 18:25:17,028 - mom.RPCServer - INFO - RPC Server ending
2016-11-08 18:25:24,705 - mom.GuestManager - INFO - Guest Manager ending
2016-11-08 18:25:26,575 - mom.HostMonitor - INFO - Host Monitor ending
2016-11-08 18:25:29,869 - mom - INFO - MOM starting
2016-11-08 18:25:29,905 - mom.HostMonitor - INFO - Host Monitor starting
2016-11-08 18:25:29,905 - mom - INFO - hypervisor interface vdsmjsonrpcbulk
2016-11-08 18:25:30,029 - mom.vdsmInterface - ERROR - Cannot connect
to VDSM! [Errno 111] Connection refused
2016-11-08 18:25:30,030 - mom - ERROR - Failed to initialize MOM threads
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in run
hypervisor_iface = self.get_hypervisor_interface()
File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217,
in get_hypervisor_interface
return module.instance(self.config)
File
"/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcbulkInterface.py",
line 47, in instance
return JsonRpcVdsmBulkInterface()
File
"/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcbulkInterface.py",
line 29, in __init__
super(JsonRpcVdsmBulkInterface, self).__init__()
File
"/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcInterface.py",
line 43, in __init__
.orRaise(RuntimeError, 'No connection to VDSM.')
File "/usr/lib/python2.7/site-packages/mom/optional.py", line 28, in orRaise
raise exception(*args, **kwargs)
RuntimeError: No connection to VDSM.
The question here is, how much time does VDSM need to allow jsonrpc to
connect and request a ping and list of VMs?
Even if we fix vdsm, mom (and any other client) should be more robust
and not create so much noise when a connection is refused. Clients should
retry the connection and log less dramatic warnings.
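A rough sketch of what such a retry could look like on the client side,
assuming the client has some callable that opens the connection and raises
an OSError (such as "Connection refused") on failure. The name
connect_with_retry and its parameters are hypothetical and not part of mom;
the retry count and delays are arbitrary.

```python
import logging
import time

log = logging.getLogger("client")


def connect_with_retry(connect, attempts=5, delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds or attempts are exhausted.

    Failed attempts are logged as warnings without a traceback; only
    the final failure is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except (IOError, OSError) as e:
            if attempt == attempts:
                raise
            log.warning("Cannot connect (attempt %d/%d): %s, "
                        "retrying in %.1f seconds",
                        attempt, attempts, e, delay)
            sleep(delay)
            # Double the wait each time, up to max_delay.
            delay = min(delay * 2, max_delay)
```

With this shape, a vdsm restart would produce a few short warnings in the
client log instead of an ERROR with a full traceback, and the client would
recover once vdsm starts accepting connections.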
Adding devel@ovirt; the vdsm-devel mailing list was abandoned more than 2 years ago.
Nir