I think it is an issue with the newer kernel: it looks like devices are not
available after `iscsiadm -l`. I've seen a similar reproducible issue in my
environment (in a different flow, though), so I submitted a bug [1].
[1]
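To tell a timing problem apart from devices that never show up, one option is to poll for the expected block devices right after the login. The following is only a sketch under assumptions: `wait_for_paths` is a hypothetical helper, not something VDSM or the test suite provides, and the timeout and interval values are arbitrary.

```python
import os
import time


def wait_for_paths(paths, timeout=10.0, interval=0.2, exists=os.path.exists):
    """Poll until every path exists or the timeout expires.

    Returns the list of paths that are still missing (empty on success).
    This is a hypothetical helper for debugging, not part of VDSM.
    """
    deadline = time.monotonic() + timeout
    missing = [p for p in paths if not exists(p)]
    while missing and time.monotonic() < deadline:
        time.sleep(interval)
        # Re-check only the paths that were missing on the previous pass.
        missing = [p for p in missing if not exists(p)]
    return missing
```

Calling `wait_for_paths(["/dev/sdi", "/dev/sdj"])` after `iscsiadm -l` would return an empty list once both devices are visible, or the names of the ones that never appeared within the timeout. The device names here are taken from the log below and would differ on another host.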
On Tue, Feb 25, 2020 at 10:55 AM Sandro Bonazzola <sbonazzo(a)redhat.com>
wrote:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/
Engine fails at:
2020-02-25 02:53:53,144-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (default task-1) [56f93c90-3868-43f8-920f-bc1fccc72a27] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to HSMGetStorageDomainInfoVDS, error = Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',), code = 358
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/arti...
The corresponding failure in VDSM is:
2020-02-25 02:53:53,128-0500 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='752b540e-fbfc-4602-8eeb-3c357fb7f5a2') Unexpected error (task:874)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 881, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-129>", line 2, in getStorageDomainInfo
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2752, in getStorageDomainInfo
    dom = self.validateSdUUID(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 309, in validateSdUUID
    sdDom = sdCache.produce(sdUUID=sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 115, in produce
    domain.getRealDomain()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 139, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 156, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 186, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
vdsm.storage.exception.StorageDomainDoesNotExist: Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',)
2020-02-25 02:53:53,128-0500 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='752b540e-fbfc-4602-8eeb-3c357fb7f5a2') aborting: Task is aborted: "value=Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',) abortedcode=358" (task:1184)
2020-02-25 02:53:53,128-0500 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: ('8c9f3762-3bf1-48fa-9237-b6587d0268ab',) (dispatcher:83)
2020-02-25 02:53:53,129-0500 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.28 seconds (__init__:312)
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/arti...
The corresponding entries in /var/log/messages:
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: Attached scsi generic sg8 type 0
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: scsi 3:0:0:1: Direct-Access LIO-ORG lun1_bdev 4.0 PQ: 0 ANSI: 5
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: scsi 3:0:0:1: alua: supports implicit and explicit TPGS
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: scsi 3:0:0:1: alua: device naa.600140541e20cc9dba94e4fa46a57322 port group 0 rel port 1
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] Write Protect is off
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: Attached scsi generic sg9 type 0
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:0: [sdf] Attached SCSI disk
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] Write Protect is off
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:4: [sdg] Attached SCSI disk
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:3: [sdh] Attached SCSI disk
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:2: [sdi] Attached SCSI disk
Feb 25 02:53:52 lago-basic-suite-master-host-0 kernel: sd 3:0:0:1: [sdj] Attached SCSI disk
Feb 25 02:53:52 lago-basic-suite-master-host-0 systemd[1]: Started Session c67 of user root.
Feb 25 02:53:52 lago-basic-suite-master-host-0 systemd[1]: Starting LVM event activation on device 8:128...
Feb 25 02:53:52 lago-basic-suite-master-host-0 lvm[30164]: pvscan[30164] PV /dev/sdi online, VG 8c9f3762-3bf1-48fa-9237-b6587d0268ab is complete.
Feb 25 02:53:52 lago-basic-suite-master-host-0 lvm[30164]: pvscan[30164] VG 8c9f3762-3bf1-48fa-9237-b6587d0268ab skip autoactivation.
Feb 25 02:53:52 lago-basic-suite-master-host-0 systemd[1]: Started LVM event activation on device 8:128.
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/20589/arti...
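Note that the VG reported by pvscan at 02:53:52 carries the same UUID as the storage domain VDSM fails to find at 02:53:53. A rough way to cross-check this across the two logs is sketched below. The regular expressions are assumptions based only on the lines quoted in this mail, and the function names are made up for illustration.

```python
import re

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Patterns modelled on the log lines quoted above (assumed formats).
PVSCAN_RE = re.compile(r"PV (\S+) online, VG (" + UUID + r") is complete")
MISSING_RE = re.compile(r"Storage domain does not exist: \('(" + UUID + r")',\)")


def vgs_seen_by_pvscan(messages):
    """Map VG UUID -> PV device for every 'is complete' pvscan line."""
    return {m.group(2): m.group(1) for m in PVSCAN_RE.finditer(messages)}


def missing_domains(vdsm_log):
    """Return the set of storage domain UUIDs VDSM reported as missing."""
    return set(MISSING_RE.findall(vdsm_log))


def seen_but_missing(messages, vdsm_log):
    """UUIDs that LVM saw as complete VGs but VDSM could not find."""
    seen = vgs_seen_by_pvscan(messages)
    return {uuid: seen[uuid] for uuid in missing_domains(vdsm_log) if uuid in seen}


SAMPLE_MESSAGES = (
    "lvm[30164]: pvscan[30164] PV /dev/sdi online, "
    "VG 8c9f3762-3bf1-48fa-9237-b6587d0268ab is complete."
)
SAMPLE_VDSM = (
    "FINISH getStorageDomainInfo error=Storage domain does not exist: "
    "('8c9f3762-3bf1-48fa-9237-b6587d0268ab',) (dispatcher:83)"
)
```

Feeding the full messages and vdsm.log files to `seen_but_missing` would list any domain whose PV was already online when the lookup failed, which is the case for the sample lines above.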
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV