Hi all,

Setup: oVirt 4.1 hosted engine on a 2-node cluster with FC LUN storage.

I'm trying to clear some tasks that have been pending for months using vdsClient, but nothing I try works. Below are the steps (run on node 1, the SPM):

1. Show all tasks:

# vdsClient -s 0 getAllTasksInfo
fd319af4-d160-48ce-b682-5a908333a5e1 :
         verb = createVolume
         id = fd319af4-d160-48ce-b682-5a908333a5e1
9bbc2bc4-3c73-4814-a785-6ea737904528 :
         verb = prepareMerge
         id = 9bbc2bc4-3c73-4814-a785-6ea737904528
e70feb21-964d-49d9-9b5a-8e3f70a92db1 :
         verb = prepareMerge
         id = e70feb21-964d-49d9-9b5a-8e3f70a92db1
cf064461-f0ab-4e44-a68f-b2d58fa83a21 :
         verb = prepareMerge
         id = cf064461-f0ab-4e44-a68f-b2d58fa83a21
85b7cf4e-d658-4785-94f0-391fe9616b41 :
         verb = prepareMerge
         id = 85b7cf4e-d658-4785-94f0-391fe9616b41
7416627a-fe50-4353-b129-e01bba066a66 :
         verb = prepareMerge
         id = 7416627a-fe50-4353-b129-e01bba066a66
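
In case it helps anyone reproduce this, here is a minimal sketch of how the listing above can be reduced to task IDs and verbs. The helper name parse_tasks_info and the shortened SAMPLE text are my own for illustration and are not part of vdsm; it only assumes the "UUID :" / "verb = ..." layout shown above.

```python
import re

# Sample copied from the getAllTasksInfo listing above (shortened to two tasks).
SAMPLE = """\
fd319af4-d160-48ce-b682-5a908333a5e1 :
         verb = createVolume
         id = fd319af4-d160-48ce-b682-5a908333a5e1
7416627a-fe50-4353-b129-e01bba066a66 :
         verb = prepareMerge
         id = 7416627a-fe50-4353-b129-e01bba066a66
"""

# A task header is a UUID followed by " :" on its own line.
UUID_HEADER = re.compile(
    r"^([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s*:\s*$"
)
# The verb line is indented and looks like "verb = prepareMerge".
VERB_LINE = re.compile(r"^\s+verb\s*=\s*(\S+)\s*$")


def parse_tasks_info(text):
    """Return {task_id: verb} from getAllTasksInfo-style text (hypothetical helper)."""
    tasks = {}
    current = None
    for line in text.splitlines():
        header = UUID_HEADER.match(line)
        if header:
            current = header.group(1)
            tasks[current] = None
            continue
        verb = VERB_LINE.match(line)
        if verb and current is not None:
            tasks[current] = verb.group(1)
    return tasks


if __name__ == "__main__":
    for task_id, verb in parse_tasks_info(SAMPLE).items():
        print(task_id, verb)
```

On the full listing this gives six tasks: one createVolume and five prepareMerge.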


2. Stop all tasks (repeated for every task):

# vdsClient -s 0 stopTask 7416627a-fe50-4353-b129-e01bba066a66 
Task is aborted: u'7416627a-fe50-4353-b129-e01bba066a66' - code 411

3. Try to clear the tasks:

# vdsClient -s 0 clearTask 7416627a-fe50-4353-b129-e01bba066a66
Operation is not allowed in this task state: ("can't clean in state running",)
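
I ran steps 2 and 3 by hand for each task. A rough sketch of how the same sequence could be scripted is below. It uses only the two vdsClient verbs shown above (stopTask and clearTask); the function names build_commands and run_all are hypothetical, and by default it only prints the commands instead of running them.

```python
from __future__ import print_function

import subprocess

# Same invocation as in the steps above: vdsClient -s 0 <verb> <taskID>
VDSCLIENT = ["vdsClient", "-s", "0"]


def build_commands(task_ids, verbs=("stopTask", "clearTask")):
    """Return one argv list per (verb, task) pair.

    All stopTask calls come first, followed by all clearTask calls.
    """
    return [VDSCLIENT + [verb, task_id] for verb in verbs for task_id in task_ids]


def run_all(task_ids, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns the list of exit codes (empty in dry-run mode).
    """
    results = []
    for argv in build_commands(task_ids):
        print(" ".join(argv))
        if not dry_run:
            results.append(subprocess.call(argv))
    return results


if __name__ == "__main__":
    # Dry run with one of the task IDs from the listing above.
    run_all(["7416627a-fe50-4353-b129-e01bba066a66"])
```

Scripting it would not change the result, of course: clearTask still fails with the same "can't clean in state running" error for every task.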



On Node 01 (the SPM) I have multiple errors in /var/log/vdsm/vdsm.log like this:

2017-10-11 15:09:53,719+0200 INFO  (jsonrpc/3) [storage.TaskManager.Task] (Task='9519d4db-2960-4b88-82f2-e4c1094eac54') aborting: Task is aborted: u'Operation is not allowed in this task state: ("can\'t clean in state running",)' - code 100 (task:1175)
2017-10-11 15:09:53,719+0200 ERROR (jsonrpc/3) [storage.Dispatcher] FINISH clearTask error=Operation is not allowed in this task state: ("can't clean in state running",) (dispatcher:78)
2017-10-11 15:09:53,720+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Task.clear failed (error 410) in 0.01 seconds (__init__:539)
2017-10-11 15:09:53,743+0200 INFO  (jsonrpc/6) [vdsm.api] START clearTask(taskID=u'7416627a-fe50-4353-b129-e01bba066a66', spUUID=None, options=None) from=::ffff:192.168.0.226,36724, flow_id=7cd340ec (api:46)
2017-10-11 15:09:53,743+0200 INFO  (jsonrpc/6) [vdsm.api] FINISH clearTask error=Operation is not allowed in this task state: ("can't clean in state running",) from=::ffff:192.168.0.226,36724, flow_id=7cd340ec (api:50)
2017-10-11 15:09:53,743+0200 ERROR (jsonrpc/6) [storage.TaskManager.Task] (Task='0e12e052-2aca-480d-b50f-5de01ddebe35') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in clearTask
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2258, in clearTask
    return self.taskMng.clearTask(taskID=taskID)
  File "/usr/share/vdsm/storage/taskManager.py", line 175, in clearTask
    t.clean()
  File "/usr/share/vdsm/storage/task.py", line 1047, in clean
    raise se.TaskStateError("can't clean in state %s" % self.state)
TaskStateError: Operation is not allowed in this task state: ("can't clean in state running",)


On Node 02 (it is a 2-node cluster) I see other errors (I don't know whether they are related):

2017-10-11 15:11:57,083+0200 INFO  (jsonrpc/7) [storage.LVM] Refreshing lvs: vg=b50c1f5c-aa2c-4a53-9f89-83517fa70d3b lvs=['leases'] (lvm:1291)
2017-10-11 15:11:57,084+0200 INFO  (jsonrpc/7) [storage.LVM] Refreshing LVs (vg=b50c1f5c-aa2c-4a53-9f89-83517fa70d3b, lvs=['leases']) (lvm:1319)
2017-10-11 15:11:57,124+0200 INFO  (jsonrpc/7) [storage.VolumeManifest] b50c1f5c-aa2c-4a53-9f89-83517fa70d3b/d42f671e-1745-46c1-9e1c-2833245675fc/c86afaa5-6ca8-4fcb-a27e-ffbe0133fe23 info is {'status': 'OK', 'domain': 'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b', 'voltype': 'LEAF', 'description': 'hosted-engine.metadata', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0, 'image': 'd42f671e-1745-46c1-9e1c-2833245675fc', 'ctime': '1499437345', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '134217728', 'children': [], 'pool': '', 'capacity': '134217728', 'uuid': u'c86afaa5-6ca8-4fcb-a27e-ffbe0133fe23', 'truesize': '134217728', 'type': 'PREALLOCATED', 'lease': {'owners': [], 'version': None}} (volume:272)
2017-10-11 15:11:57,125+0200 INFO  (jsonrpc/7) [vdsm.api] FINISH getVolumeInfo return={'info': {'status': 'OK', 'domain': 'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b', 'voltype': 'LEAF', 'description': 'hosted-engine.metadata', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0, 'image': 'd42f671e-1745-46c1-9e1c-2833245675fc', 'ctime': '1499437345', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '134217728', 'children': [], 'pool': '', 'capacity': '134217728', 'uuid': u'c86afaa5-6ca8-4fcb-a27e-ffbe0133fe23', 'truesize': '134217728', 'type': 'PREALLOCATED', 'lease': {'owners': [], 'version': None}}} from=::1,56906 (api:52)
2017-10-11 15:11:57,126+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Volume.getInfo succeeded in 0.05 seconds (__init__:539)
2017-10-11 15:11:57,758+0200 INFO  (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:56908 (protocoldetector:72)
2017-10-11 15:11:57,764+0200 INFO  (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:56908 (protocoldetector:127)
2017-10-11 15:11:57,765+0200 INFO  (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompreactor:103)
2017-10-11 15:11:57,765+0200 INFO  (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompreactor:130)
2017-10-11 15:11:57,930+0200 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.getHardwareInfo succeeded in 0.01 seconds (__init__:539)
2017-10-11 15:11:57,933+0200 INFO  (jsonrpc/1) [vdsm.api] START repoStats(options=None) from=::1,56908 (api:46)
2017-10-11 15:11:57,933+0200 INFO  (jsonrpc/1) [vdsm.api] FINISH repoStats return={u'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000138003', 'lastCheck': '4.9', 'valid': True}, u'b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8': {'code': 200, 'actual': True, 'version': -1, 'acquired': False, 'delay': '0', 'lastCheck': '9.7', 'valid': False}, u'c7d32f1b-f32c-4a21-995b-2e3b415aae4e': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000618471', 'lastCheck': '1.4', 'valid': True}, u'05ab1dd9-24bc-409b-80b8-6c5b00c52aa9': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.00027591', 'lastCheck': '5.2', 'valid': True}} from=::1,56908 (api:52)
2017-10-11 15:11:57,998+0200 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getStats succeeded in 0.06 seconds (__init__:539)
2017-10-11 15:11:58,253+0200 ERROR (monitor/b6730d6) [storage.Monitor] Setting up monitor for b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8 failed (monitor:329)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 326, in _setupLoop
    self._setupMonitor()
  File "/usr/share/vdsm/storage/monitor.py", line 349, in _setupMonitor
    self._produceDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 401, in wrapper
    value = meth(self, *a, **kw)
  File "/usr/share/vdsm/storage/monitor.py", line 367, in _produceDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 112, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 136, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 153, in _findDomain
    return findMethod(sdUUID)
  File "/usr/share/vdsm/storage/nfsSD.py", line 126, in findDomain
    return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 359, in __init__
    manifest = self.manifestClass(domainPath)
  File "/usr/share/vdsm/storage/fileSD.py", line 171, in __init__
    sd.StorageDomainManifest.__init__(self, sdUUID, domaindir, metadata)
  File "/usr/share/vdsm/storage/sd.py", line 332, in __init__
    self._domainLock = self._makeDomainLock()
  File "/usr/share/vdsm/storage/sd.py", line 526, in _makeDomainLock
    domVersion = self.getVersion()
  File "/usr/share/vdsm/storage/sd.py", line 403, in getVersion
    return self.getMetaParam(DMDK_VERSION)
  File "/usr/share/vdsm/storage/sd.py", line 400, in getMetaParam
    return self._metadata[key]
  File "/usr/lib/python2.7/site-packages/vdsm/storage/persistent.py", line 91, in __getitem__
    return dec(self._dict[key])
  File "/usr/lib/python2.7/site-packages/vdsm/storage/persistent.py", line 203, in __getitem__
    raise KeyError(key)
KeyError: 'VERSION'
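
One thing I noticed: the failing monitor thread (monitor/b6730d6) seems to belong to the same domain that repoStats reports with code 200 and valid False. Below is a small sketch that pulls the invalid domains out of the repoStats return value; the dict text is copied from the log line above, and invalid_domains is a hypothetical helper of mine, not a vdsm function.

```python
import ast

# Dict literal copied from the "FINISH repoStats return=" log line above.
REPO_STATS = """{
 u'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b': {'code': 0, 'actual': True,
   'version': 4, 'acquired': True, 'delay': '0.000138003',
   'lastCheck': '4.9', 'valid': True},
 u'b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8': {'code': 200, 'actual': True,
   'version': -1, 'acquired': False, 'delay': '0',
   'lastCheck': '9.7', 'valid': False},
 u'c7d32f1b-f32c-4a21-995b-2e3b415aae4e': {'code': 0, 'actual': True,
   'version': 0, 'acquired': True, 'delay': '0.000618471',
   'lastCheck': '1.4', 'valid': True},
 u'05ab1dd9-24bc-409b-80b8-6c5b00c52aa9': {'code': 0, 'actual': True,
   'version': 4, 'acquired': True, 'delay': '0.00027591',
   'lastCheck': '5.2', 'valid': True},
}"""


def invalid_domains(stats):
    """Return the sorted UUIDs of storage domains whose 'valid' flag is False."""
    return sorted(sd for sd, info in stats.items() if not info["valid"])


if __name__ == "__main__":
    # literal_eval safely parses the Python dict literal taken from the log.
    stats = ast.literal_eval(REPO_STATS)
    for sd_uuid in invalid_domains(stats):
        print(sd_uuid, "code =", stats[sd_uuid]["code"])
```

Only b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8 is flagged; the other three domains report code 0 and valid True.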


Can you help me? 

Restarting the hosted engine does not solve the problem.

Thank you


P.S. Related question: are the tasks above the same as (or related to) the ones reported by the engine in this screenshot? https://snag.gy/XDmoUt.jpg ... How can I also clear these tasks from the engine?