Hi, I'm trying to run Vdsm with Python 3, which exposes various Python 3
incompatibilities in Vdsm code. I'm currently semi-stuck on Vdsm
crashing very shortly after startup with:
Panic: Error initializing IRS
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/vdsmd.py", line 95, in
serve_clients
irs = Dispatcher(HSM())
File "/usr/lib/python3.6/site-packages/vdsm/storage/dispatcher.py", line 46,
in __init__
self._exposeFunctions(obj)
File "/usr/lib/python3.6/site-packages/vdsm/storage/dispatcher.py", line 56,
in _exposeFunctions
if hasattr(funcObj, _EXPORTED_ATTRIBUTE) and callable(funcObj):
File "/usr/lib/python3.6/site-packages/vdsm/storage/sp.py", line 85, in
__getattr__
raise se.StoragePoolNotConnected
vdsm.storage.exception.StoragePoolNotConnected: Storage pool not connected: ()
That wouldn't be a problem by itself, errors are expected in Python 3
experiments. However the problem is that this is often the *only* error
I receive.
I've found out that HSM() call spawns threads, which apparently don't
have chance to report their errors before the panic above kills Vdsm.
For instance, there was a string-byte mismatch in running iscsiadm
scanning commands. Only after I added time.sleep to the end of
HSM.__init__, I got an error + traceback from that and could see what's
wrong. That worries me a bit, because it might mean that if some
initialization external commands are not fast enough, Vdsm may not be
able to start.
After fixing some problems, I get the panic again, even with the sleep
at the end of HSM.__init__, and again without any other error. It's
difficult to proceed in such a situation -- any tips what to look at or
to look for are welcome.
FWIW, my vdsm.log now ends with
INFO (MainThread) [storage.check] Starting check service (check:91)
DEBUG (check/loop) [root] START thread <Thread(check/loop, started daemon
140180434425600)> (func=<bound method EventLoop.run_forever of <EventLoop
running=True closed=False at 0x140181096264704>>, args=(), kwargs={})
(concurrent:193)
INFO (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False
at 0x140181096264704> (asyncevent:125)
DEBUG (hsm/init) [storage.SamplingMethod] Returning last result (misc:386)
DEBUG (hsm/init) [storage.SamplingMethod] Trying to enter sampling method
(vdsm.storage.hba.<function rescan at 0x7f7e7a597e18>) (misc:376)
DEBUG (hsm/init) [storage.SamplingMethod] Got in to sampling method (misc:379)
DEBUG (hsm/init) [storage.HBA] Starting scan (hba:58)
DEBUG (hsm/init) [storage.HBA] Scan finished (hba:64)
DEBUG (hsm/init) [storage.SamplingMethod] Returning last result (misc:386)
DEBUG (hsm/init) [root] /usr/bin/taskset --cpu-list 0-0 /sbin/udevadm settle --timeout=5
(cwd None) (commands:200)
DEBUG (hsm/init) [root] SUCCESS: <err> = b''; <rc> = 0
(commands:221)
DEBUG (hsm/init) [storage.SamplingMethod] Returning last result (misc:386)
DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n
/sbin/lvm pvs --config 'devices { preferred_names=["^/dev/mapper/"]
ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1
wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }'
--noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o
uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size,mda_used_count
(cwd None) (commands:200)
DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> =
0 (commands:221)
DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n
/sbin/lvm vgs --config 'devices { preferred_names=["^/dev/mapper/"]
ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1
wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }'
--noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
(cwd None) (commands:200)
DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> =
0 (commands:221)
DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n
/sbin/lvm lvs --config 'devices { preferred_names=["^/dev/mapper/"]
ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1
wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }'
--noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o
uuid,name,vg_name,attr,size,seg_start_pe,devices,tags (cwd None) (commands:200)
DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> =
0 (commands:221)
DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n
/sbin/lvm vgs --config 'devices { preferred_names=["^/dev/mapper/"]
ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1
wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }'
--noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
(cwd None) (commands:200)
DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> =
0 (commands:221)
INFO (hsm/init) [storage.HSM] FINISH HSM init succeeded in 0.43 seconds (hsm:410)
DEBUG (hsm/init) [storage.HSM] FINISH thread <Thread(hsm/init, stopped daemon
140180442818304)> (concurrent:196)
and is completely silent after that.
Thanks,
Milan