
Hi, I'm trying to run Vdsm with Python 3, which exposes various Python 3 incompatibilities in Vdsm code. I'm currently semi-stuck on Vdsm crashing very shortly after startup with: Panic: Error initializing IRS Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/vdsm/vdsmd.py", line 95, in serve_clients irs = Dispatcher(HSM()) File "/usr/lib/python3.6/site-packages/vdsm/storage/dispatcher.py", line 46, in __init__ self._exposeFunctions(obj) File "/usr/lib/python3.6/site-packages/vdsm/storage/dispatcher.py", line 56, in _exposeFunctions if hasattr(funcObj, _EXPORTED_ATTRIBUTE) and callable(funcObj): File "/usr/lib/python3.6/site-packages/vdsm/storage/sp.py", line 85, in __getattr__ raise se.StoragePoolNotConnected vdsm.storage.exception.StoragePoolNotConnected: Storage pool not connected: () That wouldn't be a problem by itself, errors are expected in Python 3 experiments. However the problem is that this is often the *only* error I receive. I've found out that HSM() call spawns threads, which apparently don't have chance to report their errors before the panic above kills Vdsm. For instance, there was a string-byte mismatch in running iscsiadm scanning commands. Only after I added time.sleep to the end of HSM.__init__, I got an error + traceback from that and could see what's wrong. That worries me a bit, because it might mean that if some initialization external commands are not fast enough, Vdsm may not be able to start. After fixing some problems, I get the panic again, even with the sleep at the end of HSM.__init__, and again without any other error. It's difficult to proceed in such a situation -- any tips what to look at or to look for are welcome. FWIW, my vdsm.log now ends with INFO (MainThread) [storage.check] Starting check service (check:91) DEBUG (check/loop) [root] START thread <Thread(check/loop, started daemon 140180434425600)> (func=<bound method EventLoop.run_forever of <EventLoop running=True closed=False at 0x140181096264704>>, args=(), kwargs={}) (concurrent:193) INFO (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x140181096264704> (asyncevent:125) DEBUG (hsm/init) [storage.SamplingMethod] Returning last result (misc:386) DEBUG (hsm/init) [storage.SamplingMethod] Trying to enter sampling method (vdsm.storage.hba.<function rescan at 0x7f7e7a597e18>) (misc:376) DEBUG (hsm/init) [storage.SamplingMethod] Got in to sampling method (misc:379) DEBUG (hsm/init) [storage.HBA] Starting scan (hba:58) DEBUG (hsm/init) [storage.HBA] Scan finished (hba:64) DEBUG (hsm/init) [storage.SamplingMethod] Returning last result (misc:386) DEBUG (hsm/init) [root] /usr/bin/taskset --cpu-list 0-0 /sbin/udevadm settle --timeout=5 (cwd None) (commands:200) DEBUG (hsm/init) [root] SUCCESS: <err> = b''; <rc> = 0 (commands:221) DEBUG (hsm/init) [storage.SamplingMethod] Returning last result (misc:386) DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n /sbin/lvm pvs --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size,mda_used_count (cwd None) (commands:200) DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> = 0 (commands:221) DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n /sbin/lvm vgs --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name (cwd None) (commands:200) DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> = 0 (commands:221) DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n /sbin/lvm lvs --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,vg_name,attr,size,seg_start_pe,devices,tags (cwd None) (commands:200) DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> = 0 (commands:221) DEBUG (hsm/init) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-0 /usr/bin/sudo -n /sbin/lvm vgs --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name (cwd None) (commands:200) DEBUG (hsm/init) [storage.Misc.excCmd] SUCCESS: <err> = b''; <rc> = 0 (commands:221) INFO (hsm/init) [storage.HSM] FINISH HSM init succeeded in 0.43 seconds (hsm:410) DEBUG (hsm/init) [storage.HSM] FINISH thread <Thread(hsm/init, stopped daemon 140180442818304)> (concurrent:196) and is completely silent after that. Thanks, Milan