Never mind, the problem was that the dom_md/metadata file was bad (the
pool UUID of the datacenter from which I could not detach was missing
from it).
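In case anyone hits the same thing: on a file-based domain the metadata
lives at <mount>/<sdUUID>/dom_md/metadata, and as far as I can tell vdsm
keeps the attached pools as a comma-separated list under the POOL_UUID
key. A minimal sketch to check whether a given pool is listed (the mount
path below is a placeholder for your NFS export, and the comma-separated
format is my reading of vdsm, so double-check against your version):

    # Check whether a pool UUID is listed in a file-based storage
    # domain's dom_md/metadata (sketch; mount path is a placeholder).
    SD_UUID = "2722942e-0887-439c-81f0-70b60d860060"
    POOL_UUID = "6901a281-c414-42c6-8b41-3cf6e6bcd788"
    META = "/rhev/data-center/mnt/<server:_export>/%s/dom_md/metadata" % SD_UUID

    with open(META) as f:
        for line in f:
            if line.startswith("POOL_UUID="):
                pools = line.strip().split("=", 1)[1].split(",")
                print("pool listed:", POOL_UUID in pools)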
On 17/4/19 13:39, Eduardo Mayoral wrote:
Hi.
When trying to detach an ISO datastore I get the following error. It
also fails if I try to switch the datastore from maintenance to active.
2019-04-17 10:07:23,100+0000 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer]
RPC call StoragePool.connectStorageServer succeeded in 0.01 seconds
(__init__:312)
2019-04-17 10:07:23,104+0000 INFO (jsonrpc/4) [vdsm.api] START
detachStorageDomain(sdUUID=u'2722942e-0887-439c-81f0-70b60d860060',
spUUID=u'6901a281-c414-42c6-8b41-3cf6e6bcd788',
msdUUID=u'00000000-0000-0000-0000-000000000000', masterVersion=14,
options=None) from=::ffff:10.5.72.63,58618,
flow_id=0420c8d0-fb37-4b22-9fea-9ee8bac7d41d,
task_id=9301405a-eaf6-4f1c-a1b6-6537740e2daf (api:48)
2019-04-17 10:07:23,105+0000 INFO (jsonrpc/4) [storage.StoragePool]
sdUUID=2722942e-0887-439c-81f0-70b60d860060
spUUID=6901a281-c414-42c6-8b41-3cf6e6bcd788 (sp:1048)
2019-04-17 10:07:23,316+0000 INFO (itmap/0) [IOProcessClient]
(/pre1-svm-templates.por-ngcs.lan:_ds__pre1__templates__iso01_ovirt3)
Starting client (__init__:308)
2019-04-17 10:07:23,324+0000 INFO (itmap/1) [IOProcessClient]
(/pre1-svm-templates.por-ngcs.lan:_ds__pre1__templates__iso01_ovirt)
Starting client (__init__:308)
2019-04-17 10:07:23,327+0000 INFO (ioprocess/74639) [IOProcess]
(/pre1-svm-templates.por-ngcs.lan:_ds__pre1__templates__iso01_ovirt3)
Starting ioprocess (__init__:434)
2019-04-17 10:07:23,333+0000 INFO (ioprocess/74645) [IOProcess]
(/pre1-svm-templates.por-ngcs.lan:_ds__pre1__templates__iso01_ovirt)
Starting ioprocess (__init__:434)
2019-04-17 10:07:23,335+0000 INFO (jsonrpc/4) [IOProcessClient]
(2722942e-0887-439c-81f0-70b60d860060) Starting client (__init__:308)
2019-04-17 10:07:23,344+0000 INFO (ioprocess/74653) [IOProcess]
(2722942e-0887-439c-81f0-70b60d860060) Starting ioprocess (__init__:434)
2019-04-17 10:07:23,350+0000 INFO (jsonrpc/4) [storage.StorageDomain]
Removing remnants of deleted images [] (fileSD:740)
2019-04-17 10:07:23,352+0000 INFO (jsonrpc/4) [vdsm.api] FINISH
detachStorageDomain error=Storage domain not in pool:
u'domain=2722942e-0887-439c-81f0-70b60d860060,
pool=6901a281-c414-42c6-8b41-3cf6e6bcd788' from=::ffff:10.5.72.63,58618,
flow_id=0420c8d0-fb37-4b22-9fea-9ee8bac7d41d,
task_id=9301405a-eaf6-4f1c-a1b6-6537740e2daf (api:52)
2019-04-17 10:07:23,352+0000 ERROR (jsonrpc/4)
[storage.TaskManager.Task] (Task='9301405a-eaf6-4f1c-a1b6-6537740e2daf')
Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in detachStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 891, in detachStorageDomain
    pool.detachSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1053, in detachSD
    self.validateAttachedDomain(dom)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 551, in validateAttachedDomain
    raise se.StorageDomainNotInPool(self.spUUID, dom.sdUUID)
StorageDomainNotInPool: Storage domain not in pool: u'domain=2722942e-0887-439c-81f0-70b60d860060, pool=6901a281-c414-42c6-8b41-3cf6e6bcd788'
2019-04-17 10:07:23,352+0000 INFO (jsonrpc/4)
[storage.TaskManager.Task] (Task='9301405a-eaf6-4f1c-a1b6-6537740e2daf')
aborting: Task is aborted: "Storage domain not in pool:
u'domain=2722942e-0887-439c-81f0-70b60d860060,
pool=6901a281-c414-42c6-8b41-3cf6e6bcd788'" - code 353 (task:1181)
2019-04-17 10:07:23,353+0000 ERROR (jsonrpc/4) [storage.Dispatcher]
FINISH detachStorageDomain error=Storage domain not in pool:
u'domain=2722942e-0887-439c-81f0-70b60d860060,
pool=6901a281-c414-42c6-8b41-3cf6e6bcd788' (dispatcher:83)
2019-04-17 10:07:23,353+0000 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer]
RPC call StorageDomain.detach failed (error 353) in 0.25 seconds
(__init__:312)
2019-04-17 10:07:26,275+0000 INFO (jsonrpc/2) [api.host] START
getAllVmStats() from=::ffff:10.5.72.63,58604 (api:48)
2019-04-17 10:07:26,275+0000 INFO (jsonrpc/2) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done', 'code': 0},
'statsList': (suppressed)} from=::ffff:10.5.72.63,58604 (api:54)
2019-04-17 10:07:26,275+0000 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer]
RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312)
2019-04-17 10:07:29,675+0000 INFO (jsonrpc/7) [api.host] START
getStats() from=::ffff:10.5.72.63,58604 (api:48)
2019-04-17 10:07:29,739+0000 INFO (jsonrpc/7) [vdsm.api] START
repoStats(domains=()) from=::ffff:10.5.72.63,58604,
task_id=3ec94f9f-7c98-4c49-8710-ffbfd9910ff2 (api:48)
2019-04-17 10:07:29,739+0000 INFO (jsonrpc/7) [vdsm.api] FINISH
repoStats return={u'a97bbda5-33e4-4e78-890c-f7d2a572179e': {'code': 0,
'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000505524',
'lastCheck': '3.6', 'valid': True},
u'4121f681-45c3-413a-9c6e-69d154710ace': {'code': 0, 'actual': True,
'version': 4, 'acquired': True, 'delay': '0.000790301',
'lastCheck': '2.7', 'valid': True}}
from=::ffff:10.5.72.63,58604,
task_id=3ec94f9f-7c98-4c49-8710-ffbfd9910ff2 (api:54)
2019-04-17 10:07:29,740+0000 INFO (jsonrpc/7) [vdsm.api] START
multipath_health() from=::ffff:10.5.72.63,58604,
task_id=f567e03a-8cf4-49dd-84eb-628acb10e1d3 (api:48)
2019-04-17 10:07:29,740+0000 INFO (jsonrpc/7) [vdsm.api] FINISH
multipath_health return={} from=::ffff:10.5.72.63,58604,
task_id=f567e03a-8cf4-49dd-84eb-628acb10e1d3 (api:54)
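The traceback ends in validateAttachedDomain (sp.py line 551).
Paraphrasing the vdsm 4.3 source from my reading of it (simplified, not
verbatim), the failing check is essentially:

    # Paraphrase of vdsm's StoragePool.validateAttachedDomain; simplified,
    # not the verbatim source. dom.getPools() returns the pool UUIDs
    # recorded in the domain's own dom_md metadata (POOL_UUID key), so
    # the error can fire even when the engine database looks consistent.
    def validateAttachedDomain(self, dom):
        if self.spUUID not in dom.getPools():
            raise se.StorageDomainNotInPool(self.spUUID, dom.sdUUID)

In other words, the check runs against the domain's own metadata on the
storage, not against the engine database.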
Interestingly, I have 2 datacenters: I can attach, detach, set to
maintenance, or activate this same datastore on the second datacenter,
but not on this one.
So far I have tried updating the OVF, and setting the master datastore
and the hosts to maintenance and re-activating them.
This started happening on 4.3.2; updating to 4.3.3 makes no difference.
This is what I see in the database. It looks OK to me:
engine=# select * from storage_pool;
-[ RECORD 1 ]------------+--------------------------------------
id                       | 039aef77-77c8-4f04-8d17-22f4b655df28
name                     | dc-pre1-vc03-cl02
description              |
storage_pool_type        |
storage_pool_format_type | 4
status                   | 1
master_domain_version    | 1
spm_vds_id               | 42c969f3-a2a6-4784-8506-e6fbc731da9d
compatibility_version    | 4.3
_create_date             | 2019-03-14 12:13:24.411468+00
_update_date             | 2019-04-17 08:38:05.538042+00
quota_enforcement_type   | 0
free_text_comment        |
is_local                 | f
-[ RECORD 2 ]------------+--------------------------------------
id                       | 6901a281-c414-42c6-8b41-3cf6e6bcd788
name                     | dc-pre1-vc03-cl01
description              |
storage_pool_type        |
storage_pool_format_type | 4
status                   | 1
master_domain_version    | 15
spm_vds_id               | 089cdde4-ecbf-4054-82f9-c106b046d73a
compatibility_version    | 4.3
_create_date             | 2019-03-14 12:13:24.365529+00
_update_date             | 2019-04-17 11:10:46.422305+00
quota_enforcement_type   | 0
free_text_comment        |
is_local                 | f
(2 rows)
engine=# select * from storage_pool_iso_map;
              storage_id              |           storage_pool_id            | status
--------------------------------------+--------------------------------------+--------
 2722942e-0887-439c-81f0-70b60d860060 | 039aef77-77c8-4f04-8d17-22f4b655df28 |      6
 2722942e-0887-439c-81f0-70b60d860060 | 6901a281-c414-42c6-8b41-3cf6e6bcd788 |      6
 5d390949-24ec-45e2-bb6d-43b3f30bd286 | 039aef77-77c8-4f04-8d17-22f4b655df28 |      3
 94cb1a4f-a267-479c-88e7-ea9c32e3fd24 | 039aef77-77c8-4f04-8d17-22f4b655df28 |      3
 4121f681-45c3-413a-9c6e-69d154710ace | 6901a281-c414-42c6-8b41-3cf6e6bcd788 |      3
 a97bbda5-33e4-4e78-890c-f7d2a572179e | 6901a281-c414-42c6-8b41-3cf6e6bcd788 |      3
(6 rows)
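To read that status column: if I remember the engine's
StorageDomainStatus enum correctly (double-check against your engine
version), the values are:

    # ovirt-engine StorageDomainStatus ordinals, as I recall them;
    # verify against your engine's source before relying on this.
    STORAGE_DOMAIN_STATUS = {
        0: "Unknown", 1: "Uninitialized", 2: "Unattached", 3: "Active",
        4: "Inactive", 5: "Locked", 6: "Maintenance",
        7: "PreparingForMaintenance", 8: "Detaching", 9: "Activating",
    }

So the ISO domain (2722942e-...) shows as Maintenance (6) in both pools
and the data domains as Active (3), which is consistent with the detach
attempt.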
Apart from the root cause, I would be interested in recovering from the
error and detaching the ISO storage, even if it has to be done manually.
This datacenter is not in production yet, so I have options on the
things I can try.
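If manual recovery means hand-editing the domain's dom_md/metadata on
the storage, one caveat I am aware of: vdsm protects that file with a
trailing _SHA_CKSUM line, so any edit has to be followed by recomputing
the checksum. A sketch, assuming (from my reading of vdsm's
storage/persistent.py; verify on a healthy domain first by checking that
the recomputed value matches the existing one) that the checksum is a
SHA-1 over the non-checksum lines with newlines stripped:

    # Recompute the _SHA_CKSUM line of a dom_md/metadata file after a
    # manual edit. ASSUMPTION: SHA-1 over every non-checksum line,
    # newline stripped (my reading of vdsm's storage/persistent.py).
    # Verify on an untouched domain before writing anything back.
    import hashlib

    def metadata_checksum(path):
        csum = hashlib.sha1()
        with open(path) as f:
            for line in f:
                if line.startswith("_SHA_CKSUM"):
                    continue
                csum.update(line.rstrip("\n").encode("utf-8"))
        return csum.hexdigest()

    print("_SHA_CKSUM=%s" % metadata_checksum(
        "/rhev/data-center/mnt/<server:_export>/"
        "2722942e-0887-439c-81f0-70b60d860060/dom_md/metadata"))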
--
Eduardo Mayoral Jimeno
Systems engineer, platform department. Arsys Internet.
emayoral@arsys.es - +34 941 620 105 - ext 2153