Dear all,
We have updated our hypervisors with yum. This included an update of vdsm as well. We are now running these versions:
vdsm-4.16.7-1.gitdb83943.el6.x86_64
vdsm-python-4.16.7-1.gitdb83943.el6.noarch
vdsm-python-zombiereaper-4.16.7-1.gitdb83943.el6.noarch
vdsm-xmlrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-yajsonrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-jsonrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-cli-4.16.7-1.gitdb83943.el6.noarch
Ever since these updates we have been experiencing BIG trouble with our fibre connections. I've already updated the Brocade cards to the latest version. This seemed to help: the hypervisors came back up and saw the storage domains (before the Brocade update, they didn't even see their storage domains). But after a day or so, one of the hypervisors began to freak out again, coming up and going back down... Below you can find the errors:
Thread-821::ERROR::2014-12-08 07:10:33,190::task::866::Storage.TaskManager.Task::(_setError) Task=`27cb9779-a8e9-4080-988d-9772c922710b`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-821::ERROR::2014-12-08 07:10:33,194::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-822::ERROR::2014-12-08 07:11:03,878::task::866::Storage.TaskManager.Task::(_setError) Task=`30177931-68c0-420f-950f-da5b770fe35c`::Unexpected error
Thread-822::ERROR::2014-12-08 07:11:03,882::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}
Thread-813::ERROR::2014-12-08 07:11:07,634::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-813::ERROR::2014-12-08 07:11:07,634::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-813::DEBUG::2014-12-08 07:11:07,638::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-813::DEBUG::2014-12-08 07:11:07,835::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-813::ERROR::2014-12-08 07:11:07,896::spbackends::271::Storage.StoragePoolDiskBackend::(validateMasterDomainVersion) Requested master domain 78d84adf-7274-4efe-a711-fbec31196ece does not have expected version 42 it is version 17
Thread-813::ERROR::2014-12-08 07:11:07,903::task::866::Storage.TaskManager.Task::(_setError) Task=`c434f325-5193-4236-a04d-2fee9ac095bc`::Unexpected error
Thread-813::ERROR::2014-12-08 07:11:07,946::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Wrong Master domain or its version: 'SD=78d84adf-7274-4efe-a711-fbec31196ece, pool=1d03dc05-008b-4d14-97ce-b17bd714183d'", 'code': 324}}
Thread-823::ERROR::2014-12-08 07:11:43,993::task::866::Storage.TaskManager.Task::(_setError) Task=`9abbccd9-88a7-4632-b350-f9af1f65bebd`::Unexpected error
Thread-823::ERROR::2014-12-08 07:11:43,998::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}
Thread-823::ERROR::2014-12-08 07:11:44,003::task::866::Storage.TaskManager.Task::(_setError) Task=`7ef1ac39-e7c2-4538-b30b-ab2fcefac01d`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:11:44,007::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-823::ERROR::2014-12-08 07:11:44,133::task::866::Storage.TaskManager.Task::(_setError) Task=`cc1ae82c-f3c4-4efa-9cd2-c62a27801e76`::Unexpected error
Thread-823::ERROR::2014-12-08 07:11:44,137::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}
Thread-823::ERROR::2014-12-08 07:12:24,580::task::866::Storage.TaskManager.Task::(_setError) Task=`9bcbb87d-3093-4894-879b-3fe2b09ef351`::Unexpected error
Thread-823::ERROR::2014-12-08 07:12:24,585::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}
Thread-823::ERROR::2014-12-08 07:13:04,926::task::866::Storage.TaskManager.Task::(_setError) Task=`8bdd0c1f-e681-4a8e-ad55-296c021389ed`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:13:04,931::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-823::ERROR::2014-12-08 07:13:45,342::task::866::Storage.TaskManager.Task::(_setError) Task=`160ea2a7-b6cb-4102-9df4-71ba87fd863e`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:13:45,346::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-823::ERROR::2014-12-08 07:14:25,879::task::866::Storage.TaskManager.Task::(_setError) Task=`985628db-8f48-44b5-8f61-631a922f7f71`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:14:25,883::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-823::ERROR::2014-12-08 07:15:06,175::task::866::Storage.TaskManager.Task::(_setError) Task=`ddca1c88-0565-41e8-bf0c-22eadcc75918`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:15:06,179::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-823::ERROR::2014-12-08 07:15:46,585::task::866::Storage.TaskManager.Task::(_setError) Task=`12bbded5-59ce-46d8-9e67-f48862a03606`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:15:46,589::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-814::ERROR::2014-12-08 07:16:08,619::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-814::ERROR::2014-12-08 07:16:08,619::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-814::DEBUG::2014-12-08 07:16:08,624::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-814::DEBUG::2014-12-08 07:16:08,740::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-814::ERROR::2014-12-08 07:16:08,812::spbackends::271::Storage.StoragePoolDiskBackend::(validateMasterDomainVersion) Requested master domain 78d84adf-7274-4efe-a711-fbec31196ece does not have expected version 42 it is version 17
Thread-814::ERROR::2014-12-08 07:16:08,820::task::866::Storage.TaskManager.Task::(_setError) Task=`5cdce5cd-6e6d-421e-bc2a-f999d8cbb056`::Unexpected error
Thread-814::ERROR::2014-12-08 07:16:08,865::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Wrong Master domain or its version: 'SD=78d84adf-7274-4efe-a711-fbec31196ece, pool=1d03dc05-008b-4d14-97ce-b17bd714183d'", 'code': 324}}
Thread-815::ERROR::2014-12-08 07:16:09,471::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-815::ERROR::2014-12-08 07:16:09,472::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-815::DEBUG::2014-12-08 07:16:09,476::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-815::DEBUG::2014-12-08 07:16:09,564::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-815::ERROR::2014-12-08 07:16:09,627::spbackends::271::Storage.StoragePoolDiskBackend::(validateMasterDomainVersion) Requested master domain 78d84adf-7274-4efe-a711-fbec31196ece does not have expected version 42 it is version 17
Thread-815::ERROR::2014-12-08 07:16:09,635::task::866::Storage.TaskManager.Task::(_setError) Task=`abfa0fd0-04b3-4c65-b3d0-be18b085a65d`::Unexpected error
Thread-815::ERROR::2014-12-08 07:16:09,681::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Wrong Master domain or its version: 'SD=78d84adf-7274-4efe-a711-fbec31196ece, pool=1d03dc05-008b-4d14-97ce-b17bd714183d'", 'code': 324}}
Thread-816::ERROR::2014-12-08 07:16:10,182::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-816::ERROR::2014-12-08 07:16:10,183::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 78d84adf-7274-4efe-a711-fbec31196ece
Thread-816::DEBUG::2014-12-08 07:16:10,187::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489000000000000062|/dev/mapper/36005076802810d48e0000000000000ae|/dev/mapper/36005076802810d48e0000000000000de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)
Thread-823::ERROR::2014-12-08 07:16:27,163::task::866::Storage.TaskManager.Task::(_setError) Task=`9b0fd676-7941-40a7-a71e-0f1dee48a107`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-823::ERROR::2014-12-08 07:16:27,168::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
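To get a quick overview of which errors dominate an excerpt like the one above, the Storage.Dispatcher lines can be tallied by status code. This is only a rough sketch that assumes the log format shown in this post; the function name `tally_error_codes` and the regular expression are made up for illustration and are not part of vdsm.

```python
import re
from collections import Counter

# Hypothetical pattern: matches the status dict that Storage.Dispatcher logs,
# e.g. {'status': {'message': 'Not SPM: ()', 'code': 654}}
DISPATCHER_RE = re.compile(r"Storage\.Dispatcher::\(wrapper\).*'code': (\d+)")


def tally_error_codes(lines):
    """Count how often each dispatcher status code appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = DISPATCHER_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Four dispatcher lines copied from the excerpt above.
SAMPLE = r"""
Thread-821::ERROR::2014-12-08 07:10:33,194::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-822::ERROR::2014-12-08 07:11:03,882::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}
Thread-813::ERROR::2014-12-08 07:11:07,946::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Wrong Master domain or its version: 'SD=78d84adf-7274-4efe-a711-fbec31196ece, pool=1d03dc05-008b-4d14-97ce-b17bd714183d'", 'code': 324}}
Thread-823::ERROR::2014-12-08 07:11:44,007::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
""".strip().splitlines()

if __name__ == "__main__":
    for code, count in tally_error_codes(SAMPLE).most_common():
        print(code, count)
```

On a host, the same function could be fed the lines of the vdsm log file instead of `SAMPLE` (usually `/var/log/vdsm/vdsm.log`, though that path is an assumption about the installation).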