Hi,

Can you provide the vdsm and engine logs?

Thanks,
Fred

On Wed, Apr 26, 2017 at 5:30 PM, Jens Oechsler <joe@avaleo.net> wrote:
Greetings,

Is there any way to get the oVirt Data Center described below active again?

On Tue, Apr 25, 2017 at 4:11 PM, Jens Oechsler <joe@avaleo.net> wrote:
> Hi,
>
> The LUN is not in the pvs output, but I found it in the lsblk output,
> apparently without any partitions on it.
>
> $ sudo pvs
>   PV                                            VG
>               Fmt  Attr PSize   PFree
>   /dev/mapper/360050768018182b6c000000000000990 data
>               lvm2 a--  200.00g 180.00g
>   /dev/mapper/360050768018182b6c000000000000998
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc lvm2 a--  499.62g 484.50g
>   /dev/sda2                                     system
>               lvm2 a--  278.78g 208.41g
>
> $ sudo lvs
>   LV                                   VG
>      Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync
> Convert
>   34a9328f-87fe-4190-96e9-a3580b0734fc
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----    1.00g
>   506ff043-1058-448c-bbab-5c864adb2bfc
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----   10.00g
>   65449c88-bc28-4275-bbbb-5fc75b692cbc
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----  128.00m
>   e2ee95ce-8105-4a20-8e1f-9f6dfa16bf59
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao----  128.00m
>   ids
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao----  128.00m
>   inbox
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----  128.00m
>   leases
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----    2.00g
>   master
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----    1.00g
>   metadata
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----  512.00m
>   outbox
> 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-----  128.00m
>   data                                 data
>      -wi-ao----   20.00g
>   home                                 system
>      -wi-ao---- 1000.00m
>   prod                                 system
>      -wi-ao----    4.88g
>   root                                 system
>      -wi-ao----    7.81g
>   swap                                 system
>      -wi-ao----    4.00g
>   swap7                                system
>      -wi-ao----   20.00g
>   tmp                                  system
>      -wi-ao----    4.88g
>   var                                  system
>      -wi-ao----   27.81g
>
> $ sudo lsblk
> <output trimmed>
> sdq
>                 65:0    0   500G  0 disk
> └─360050768018182b6c0000000000009d7
>                253:33   0   500G  0 mpath
>
> The data domain was created with one 500 GB LUN and later extended with
> another 500 GB LUN.
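
A quick way to cross-check the two listings above is to compare the multipath maps with the devices LVM knows as PVs. The sketch below is only an illustration. It assumes the names were collected with `pvs --noheadings -o pv_name` and `lsblk -rn -o NAME,TYPE`; the function name `unused_mpath_devices` and the sample strings are made up for this example, with the WWIDs copied from the output quoted above.

```python
from typing import List


def unused_mpath_devices(pvs_names: str, lsblk_raw: str) -> List[str]:
    """Return multipath map names that no LVM physical volume sits on."""
    # PV names such as /dev/mapper/<wwid>; keep only the map name.
    pv_maps = {
        line.strip().rsplit("/", 1)[-1]
        for line in pvs_names.splitlines()
        if line.strip().startswith("/dev/mapper/")
    }
    # Raw lsblk lines are "<name> <type>"; keep the multipath maps.
    mpath_maps = set()
    for line in lsblk_raw.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "mpath":
            mpath_maps.add(fields[0])
    return sorted(mpath_maps - pv_maps)


# Sample input modelled on the quoted output (names only, assumed layout).
PVS_NAMES = """\
  /dev/mapper/360050768018182b6c000000000000990
  /dev/mapper/360050768018182b6c000000000000998
  /dev/sda2
"""
LSBLK_RAW = """\
sda disk
sda2 part
sdq disk
360050768018182b6c0000000000009d7 mpath
360050768018182b6c000000000000990 mpath
360050768018182b6c000000000000998 mpath
"""

print(unused_mpath_devices(PVS_NAMES, LSBLK_RAW))
```

With this sample, the only map left over is the 500G device seen under sdq, which matches the observation that the LUN is visible to the host but is not listed as a PV.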
>
> On Tue, Apr 25, 2017 at 2:17 PM, Fred Rolland <frolland@redhat.com> wrote:
>> Hi,
>>
>> Do you see the LUN on the host?
>> Can you share the pvs and lvs output?
>>
>> Thanks,
>>
>> Fred
>>
>> On Mon, Apr 24, 2017 at 1:05 PM, Jens Oechsler <joe@avaleo.net> wrote:
>>>
>>> Hello
>>> I have a problem with oVirt Hosted Engine Setup version:
>>> 4.0.5.5-1.el7.centos.
>>> Setup is using FCP SAN for data and engine.
>>> The cluster has worked fine for a while. It has two hosts with VMs running.
>>> I recently extended the storage with an additional LUN. This LUN seems to
>>> be gone from the data domain, and one VM, which I assume has data on that
>>> device, is paused.
>>>
>>> Got these errors in events:
>>>
>>> Apr 24, 2017 10:26:05 AM
>>> Failed to activate Storage Domain SD (Data Center DC) by
>>> admin@internal-authz
>>> Apr 10, 2017 3:38:08 PM
>>> Status of host cl01 was set to Up.
>>> Apr 10, 2017 3:38:03 PM
>>> Host cl01 does not enforce SELinux. Current status: DISABLED
>>> Apr 10, 2017 3:37:58 PM
>>> Host cl01 is initializing. Message: Recovering from crash or Initializing
>>> Apr 10, 2017 3:37:58 PM
>>> VDSM cl01 command failed: Recovering from crash or Initializing
>>> Apr 10, 2017 3:37:46 PM
>>> Failed to Reconstruct Master Domain for Data Center DC.
>>> Apr 10, 2017 3:37:46 PM
>>> Host cl01 is not responding. Host cannot be fenced automatically
>>> because power management for the host is disabled.
>>> Apr 10, 2017 3:37:46 PM
>>> VDSM cl01 command failed: Broken pipe
>>> Apr 10, 2017 3:37:46 PM
>>> VDSM cl01 command failed: Broken pipe
>>> Apr 10, 2017 3:32:45 PM
>>> Invalid status on Data Center DC. Setting Data Center status to Non
>>> Responsive (On host cl01, Error: General Exception).
>>> Apr 10, 2017 3:32:45 PM
>>> VDSM cl01 command failed: [Errno 19] Could not find dm device named
>>> `[unknown]`
>>> Apr 7, 2017 1:28:04 PM
>>> VM HostedEngine is down with error. Exit message: resource busy:
>>> Failed to acquire lock: error -243.
>>> Apr 7, 2017 1:28:02 PM
>>> Storage Pool Manager runs on Host cl01 (Address: cl01).
>>> Apr 7, 2017 1:27:59 PM
>>> Invalid status on Data Center DC. Setting status to Non Responsive.
>>> Apr 7, 2017 1:27:53 PM
>>> Host cl02 does not enforce SELinux. Current status: DISABLED
>>> Apr 7, 2017 1:27:52 PM
>>> Host cl01 does not enforce SELinux. Current status: DISABLED
>>> Apr 7, 2017 1:27:49 PM
>>> Affinity Rules Enforcement Manager started.
>>> Apr 7, 2017 1:27:34 PM
>>> ETL Service Started
>>> Apr 7, 2017 1:26:01 PM
>>> ETL Service Stopped
>>> Apr 3, 2017 1:22:54 PM
>>> Shutdown of VM HostedEngine failed.
>>> Apr 3, 2017 1:22:52 PM
>>> Storage Pool Manager runs on Host cl01 (Address: cl01).
>>> Apr 3, 2017 1:22:49 PM
>>> Invalid status on Data Center DC. Setting status to Non Responsive.
>>>
>>>
>>> Master data domain is inactive.
>>>
>>>
>>> vdsm.log:
>>>
>>> jsonrpc.Executor/5::INFO::2017-04-20
>>> 07:01:26,796::lvm::1226::Storage.LVM::(activateLVs) Refreshing lvs:
>>> vg=bd616961-6da7-4eb0-939e-330b0a3fea6e lvs=['ids']
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:26,796::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset
>>> --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config '
>>> devices { preferred_names = ["^/dev/mapper/"]
>>> ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [
>>> '\''a|/dev/mapper/360050768018182b6c00000000000099e|[unknown]|'\'',
>>> '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1
>>> wait_for_locks=1  use_lvmetad=0 }
>>> backup {  retain_min = 50  retain_days = 0 } ' --refresh
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/ids (cwd None)
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:26,880::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = "
>>> WARNING: Not using lvmetad because config setting use_lvmetad=0.\n
>>> WARNING: To avoid corruption, rescan devices to make changes
>>>  visible (pvscan --cache).\n  Couldn't find device with uuid
>>> jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
>>> jsonrpc.Executor/5::INFO::2017-04-20
>>> 07:01:26,881::lvm::1226::Storage.LVM::(activateLVs) Refreshing lvs:
>>> vg=bd616961-6da7-4eb0-939e-330b0a3fea6e lvs=['leases']
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:26,881::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset
>>> --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config '
>>> devices { preferred_names = ["^/dev/mapper/"]
>>> ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [
>>> '\''a|/dev/mapper/360050768018182b6c00000000000099e|[unknown]|'\'',
>>> '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1
>>> wait_for_locks=1  use_lvmetad=0 }
>>> backup {  retain_min = 50  retain_days = 0 } ' --refresh
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/leases (cwd None)
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:26,973::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = "
>>> WARNING: Not using lvmetad because config setting use_lvmetad=0.\n
>>> WARNING: To avoid corruption, rescan devices to make changes
>>>  visible (pvscan --cache).\n  Couldn't find device with uuid
>>> jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
>>> jsonrpc.Executor/5::INFO::2017-04-20
>>> 07:01:26,973::lvm::1226::Storage.LVM::(activateLVs) Refreshing lvs:
>>> vg=bd616961-6da7-4eb0-939e-330b0a3fea6e lvs=['metadata', 'leases',
>>> 'ids', 'inbox', 'outbox', 'master']
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:26,974::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset
>>> --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config '
>>> devices { preferred_names = ["^/dev/mapper/"]
>>> ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [
>>> '\''a|/dev/mapper/360050768018182b6c00000000000099e|[unknown]|'\'',
>>> '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1
>>> wait_for_locks=1  use_lvmetad=0 }
>>> backup {  retain_min = 50  retain_days = 0 } ' --refresh
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/metadata
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/leases
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/ids
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/inbox
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/outbox
>>> bd616961-6da7-4eb0-939e-330b0a3fea6e/master (cwd None)
>>> Reactor thread::INFO::2017-04-20
>>>
>>> 07:01:27,069::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept)
>>> Accepting connection from ::1:44692
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:27,070::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = "
>>> WARNING: Not using lvmetad because config setting use_lvmetad=0.\n
>>> WARNING: To avoid corruption, rescan devices to make changes
>>>  visible (pvscan --cache).\n  Couldn't find device with uuid
>>> jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>> 07:01:27,070::sp::662::Storage.StoragePool::(_stopWatchingDomainsState)
>>> Stop watching domains state
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,070::resourceManager::628::Storage.ResourceManager::(releaseResource)
>>> Trying to release resource
>>> 'Storage.58493e81-01dc-01d8-0390-000000000032'
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::647::Storage.ResourceManager::(releaseResource)
>>> Released resource 'Storage.58493e81-01dc-01d8-0390-000000000032' (0
>>> active users)
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::653::Storage.ResourceManager::(releaseResource)
>>> Resource 'Storage.58493e81-01dc-01d8-0390-000000000032' is free,
>>> finding out if anyone is waiting for it.
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::661::Storage.ResourceManager::(releaseResource)
>>> No one is waiting for resource
>>> 'Storage.58493e81-01dc-01d8-0390-000000000032', Clearing records.
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::628::Storage.ResourceManager::(releaseResource)
>>> Trying to release resource 'Storage.HsmDomainMonitorLock'
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::647::Storage.ResourceManager::(releaseResource)
>>> Released resource 'Storage.HsmDomainMonitorLock' (0 active users)
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::653::Storage.ResourceManager::(releaseResource)
>>> Resource 'Storage.HsmDomainMonitorLock' is free, finding out if anyone
>>> is waiting for it.
>>> jsonrpc.Executor/5::DEBUG::2017-04-20
>>>
>>> 07:01:27,071::resourceManager::661::Storage.ResourceManager::(releaseResource)
>>> No one is waiting for resource 'Storage.HsmDomainMonitorLock',
>>> Clearing records.
>>> jsonrpc.Executor/5::ERROR::2017-04-20
>>> 07:01:27,072::task::868::Storage.TaskManager.Task::(_setError)
>>> Task=`15122a21-4fb7-45bf-9a9a-4b97f27bc1e1`::Unexpected error
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/storage/task.py", line 875, in _run
>>>     return fn(*args, **kargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in
>>> wrapper
>>>     res = f(*args, **kwargs)
>>>   File "/usr/share/vdsm/storage/hsm.py", line 988, in connectStoragePool
>>>     spUUID, hostID, msdUUID, masterVersion, domainsMap)
>>>   File "/usr/share/vdsm/storage/hsm.py", line 1053, in _connectStoragePool
>>>     res = pool.connect(hostID, msdUUID, masterVersion)
>>>   File "/usr/share/vdsm/storage/sp.py", line 646, in connect
>>>     self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
>>>   File "/usr/share/vdsm/storage/sp.py", line 1219, in __rebuild
>>>     self.setMasterDomain(msdUUID, masterVersion)
>>>   File "/usr/share/vdsm/storage/sp.py", line 1427, in setMasterDomain
>>>     domain = sdCache.produce(msdUUID)
>>>   File "/usr/share/vdsm/storage/sdc.py", line 101, in produce
>>>     domain.getRealDomain()
>>>   File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
>>>     return self._cache._realProduce(self._sdUUID)
>>>   File "/usr/share/vdsm/storage/sdc.py", line 125, in _realProduce
>>>     domain = self._findDomain(sdUUID)
>>>   File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
>>>     dom = findMethod(sdUUID)
>>>   File "/usr/share/vdsm/storage/blockSD.py", line 1441, in findDomain
>>>     return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
>>>   File "/usr/share/vdsm/storage/blockSD.py", line 814, in __init__
>>>     lvm.checkVGBlockSizes(sdUUID, (self.logBlkSize, self.phyBlkSize))
>>>   File "/usr/share/vdsm/storage/lvm.py", line 1056, in checkVGBlockSizes
>>>     _checkpvsblksize(pvs, vgBlkSize)
>>>   File "/usr/share/vdsm/storage/lvm.py", line 1033, in _checkpvsblksize
>>>     pvBlkSize = _getpvblksize(pv)
>>>   File "/usr/share/vdsm/storage/lvm.py", line 1027, in _getpvblksize
>>>     dev = devicemapper.getDmId(os.path.basename(pv))
>>>   File "/usr/share/vdsm/storage/devicemapper.py", line 40, in getDmId
>>>     deviceMultipathName)
>>> OSError: [Errno 19] Could not find dm device named `[unknown]`
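
My reading of the traceback, which nothing in this thread confirms yet: one PV of the VG is missing, LVM reports a placeholder name for it instead of a device path (the `[unknown]` in the lvchange filter quoted earlier), and that placeholder is then passed through `os.path.basename()` into the device-mapper lookup. The sketch below only illustrates that step. `split_pv_names` is a hypothetical helper, not VDSM code.

```python
import os


def split_pv_names(pv_names):
    """Separate PV names that are device paths from placeholders.

    Hypothetical helper for illustration only; not part of VDSM.
    """
    devices, placeholders = [], []
    for pv in pv_names:
        if pv.startswith("/dev/"):
            # A real path: the basename is the multipath map name.
            devices.append(os.path.basename(pv))
        else:
            # No path at all, e.g. the placeholder reported for a missing PV.
            placeholders.append(pv)
    return devices, placeholders


# The two names that appear in the lvchange filter in the log above.
PV_NAMES = [
    "/dev/mapper/360050768018182b6c00000000000099e",
    "[unknown]",
]

devices, placeholders = split_pv_names(PV_NAMES)
print("resolvable:", devices)
print("placeholders:", placeholders)

# basename() leaves a name without slashes untouched, so the placeholder
# would reach a dm lookup unchanged.
print(os.path.basename("[unknown]"))
```

If that reading is right, the error is a symptom of the missing PV rather than a separate problem.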
>>>
>>>
>>> Any input on how to diagnose or troubleshoot this would be appreciated.
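
One possible starting point is to pull the UUIDs of the missing PVs out of vdsm.log. This is a minimal sketch. It assumes the log contains LVM's "Couldn't find device with uuid" warning as in the excerpt above and that the UUIDs have the same 6-4-4-4-4-4-6 shape. The name `missing_pv_uuids` and the sample text are illustrative.

```python
import re
from collections import Counter

# Tolerate arbitrary whitespace between words, since pasted logs get re-wrapped.
MISSING_PV = re.compile(
    r"Couldn't\s+find\s+device\s+with\s+uuid\s+"
    r"([A-Za-z0-9]{6}(?:-[A-Za-z0-9]{4}){5}-[A-Za-z0-9]{6})"
)


def missing_pv_uuids(log_text):
    """Count how often each missing-PV UUID is reported in the log text."""
    return Counter(MISSING_PV.findall(log_text))


# Sample text shaped like the vdsm.log excerpt in this thread.
SAMPLE_LOG = r"""
07:01:26,880::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = "
WARNING: Not using lvmetad because config setting use_lvmetad=0.\n
 visible (pvscan --cache).\n  Couldn't find device with uuid
jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
07:01:26,973::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = "
 visible (pvscan --cache).\n  Couldn't find device with uuid
jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
"""

counts = missing_pv_uuids(SAMPLE_LOG)
for uuid, n in counts.items():
    print(uuid, n)
```

The UUID can then be compared with the output of `pvs -o pv_name,pv_uuid` on each host to see which device, if any, still carries it.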
>>>
>>> --
>>> Best Regards
>>>
>>> Jens Oechsler
>>> _______________________________________________
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
>
>
> --
> Med Venlig Hilsen / Best Regards
>
> Jens Oechsler
> System administrator
> KMD Nexus
> +45 51 82 62 13


