On Mon, 2018-05-28 at 14:30 +0200, Mariusz Kozakowski wrote:
On Mon, 2018-05-28 at 12:57 +0200, Simone Tiraboschi wrote:
The issue is on network configuration:
you have to check /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log to understand why it failed.

From same time frame, vdsm.log. Can this be related?

2018-05-28 11:07:34,481+0200 INFO  (jsonrpc/1) [api.host] START getAllVmStats() from=::1,45816 (api:46)
2018-05-28 11:07:34,482+0200 INFO  (jsonrpc/1) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,45816 (api:52)
2018-05-28 11:07:34,483+0200 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:573)
2018-05-28 11:07:34,489+0200 INFO  (jsonrpc/2) [api.host] START getAllVmIoTunePolicies() from=::1,45816 (api:46)
2018-05-28 11:07:34,489+0200 INFO  (jsonrpc/2) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'405f8ec0-03f9-43cb-a7e1-343a4c30453f': {'policy': [], 'current_values': []}}} from=::1,45816 (api:52)
2018-05-28 11:07:34,490+0200 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:573)
2018-05-28 11:07:35,555+0200 INFO  (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=6f517c47-a9f3-4913-bf9d-661355262c38 (api:46)
2018-05-28 11:07:35,555+0200 INFO  (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=6f517c47-a9f3-4913-bf9d-661355262c38 (api:52)
2018-05-28 11:07:35,555+0200 INFO  (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:707)
2018-05-28 11:07:38,982+0200 WARN  (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/7 running <Task <JsonRpcTask {'params': {}, 'jsonrpc': '2.0', 'method': u'Host.getCapabilities', 'id': u'b5990e16-65c9-4137-aeed-271c415f9df5'} at 0x7eff50088910> timeout=60, duration=180 at 0x2fc3650> task#=1 at 0x3590710>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in _dynamicMethod
  result = fn(*methodArgs)
File: "<string>", line 2, in getCapabilities
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1339, in getCapabilities
  c = caps.get()
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 201, in get
  liveSnapSupported = _getLiveSnapshotSupport(cpuarch.effective())
File: "/usr/lib/python2.7/site-packages/vdsm/common/cache.py", line 41, in __call__
  value = self.func(*args)
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 92, in _getLiveSnapshotSupport
  capabilities = _getCapsXMLStr()
File: "/usr/lib/python2.7/site-packages/vdsm/common/cache.py", line 41, in __call__
  value = self.func(*args)
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 60, in _getCapsXMLStr
  return _getFreshCapsXMLStr()
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 55, in _getFreshCapsXMLStr
  return libvirtconnection.get().getCapabilities()
File: "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
  ret = f(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
  return func(inst, *args, **kwargs)
File: "/usr/lib64/python2.7/site-packages/libvirt.py", line 3669, in getCapabilities
  ret = libvirtmod.virConnectGetCapabilities(self._o) (executor:363)
2018-05-28 11:07:40,561+0200 INFO  (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=2ef7c0a3-4cf9-436e-ad6b-ee04e2a0cf3a (api:46)
2018-05-28 11:07:40,561+0200 INFO  (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=2ef7c0a3-4cf9-436e-ad6b-ee04e2a0cf3a (api:52)
2018-05-28 11:07:40,561+0200 INFO  (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:707)

supervdsm hase nothing at 11:07:39.

Found another one, maybe related:

2018-05-28 11:02:44,371+0200 ERROR (vm/405f8ec0) [virt.vm] (vmId='405f8ec0-03f9-43cb-a7e1-343a4c30453f') Failed to connect to guest agent channel (vm:2500)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2498, in _vmDependentInit
    self.guestAgent.start()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 249, in start
    self._prepare_socket()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 291, in _prepare_socket
    supervdsm.getProxy().prepareVmChannel(self._socketName)
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
    **kwargs)
  File "<string>", line 2, in prepareVmChannel
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
OSError: [Errno 2] No such file or directory: '/var/lib/libvirt/qemu/channels/405f8ec0-03f9-43cb-a7e1-343a4c30453f.com.redhat.rhevm.vdsm'

Interesting part: there is no file with ending: com.redhat.rhevm.vdsm, but there is `/var/lib/libvirt/qemu/channels/405f8ec0-03f9-43cb-a7e1-343a4c30453f.org.qemu.guest_agent.0`