On Wed, Dec 7, 2016 at 2:00 PM, Anantha Raghava
<raghav(a)exzatechconsulting.com> wrote:
Hello,
No luck with this? Awaiting an urgent response. I have also attached the
vdsm and supervdsm logs from one of the hosts.
Please provide guidance to resolve this issue.
--
Thanks & Regards,
Anantha Raghava eXza Technology Consulting & Services Ph: +91-9538849179,
E-mail: raghav(a)exzatechconsulting.com
Do not print this e-mail unless required. Save Paper & trees.
On Monday 05 December 2016 11:16 AM, Anantha Raghava wrote:
Hi,
We have a single cluster with 6 nodes in a single DC, with 4 FC storage
domains attached. Everything was working fine all along: migrations,
creation of new VMs, and so on. Now, all of a sudden, we see the error
message "vdsm is unable to communicate with Master domain ......." and all
storage domains, including the DC, are down. All hosts are up and all VMs
are running without any issues, but migrations have stopped, we cannot
create new VMs, and we cannot start a VM that has been shut down.
Can someone help us troubleshoot the issue?
According to your log, vdsm cannot access the master domain:
Thread-35::ERROR::2016-12-07
17:18:10,354::sdc::146::Storage.StorageDomainCache::(_findDomain)
domain 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3 not found
Traceback (most recent call last):
File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
dom = findMethod(sdUUID)
File "/usr/share/vdsm/storage/blockSD.py", line 1441, in findDomain
return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
File "/usr/share/vdsm/storage/blockSD.py", line 1404, in findDomainPath
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'6d25efc2-b056-4c43-9a82-82f0c8a5ebc3',)
Thread-35::ERROR::2016-12-07
17:18:10,354::monitor::425::Storage.Monitor::(_checkDomainStatus)
Error checking domain 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3
Traceback (most recent call last):
File "/usr/share/vdsm/storage/monitor.py", line 406, in _checkDomainStatus
self.domain.selftest()
File "/usr/share/vdsm/storage/sdc.py", line 50, in __getattr__
return getattr(self.getRealDomain(), attrName)
File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 125, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
dom = findMethod(sdUUID)
File "/usr/share/vdsm/storage/blockSD.py", line 1441, in findDomain
return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
File "/usr/share/vdsm/storage/blockSD.py", line 1404, in findDomainPath
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'6d25efc2-b056-4c43-9a82-82f0c8a5ebc3',)
Thread-35::DEBUG::2016-12-07 17:18:10,279::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset --cpu-list 0-31 /usr/bin/sudo -n /usr/sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/36005076300808e51e80000000000002c|/dev/mapper/36005076300808e51e80000000000002d|/dev/mapper/36005076300808e51e80000000000002e|/dev/mapper/36005076300808e51e80000000000002f|/dev/mapper/36005076300808e51e800000000000030|/dev/mapper/36005076300808e51e800000000000031|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3 (cwd None)

Thread-35::DEBUG::2016-12-07 17:18:10,351::lvm::288::Storage.Misc.excCmd::(cmd) FAILED: <err> = '  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\n  Volume group "6d25efc2-b056-4c43-9a82-82f0c8a5ebc3" not found\n  Cannot process volume group 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3\n'; <rc> = 5

Thread-35::WARNING::2016-12-07 17:18:10,354::lvm::376::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!', '  Volume group "6d25efc2-b056-4c43-9a82-82f0c8a5ebc3" not found', '  Cannot process volume group 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3']
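Note the lvmetad warning above: it means lvmetad is running while lvm was
invoked with use_lvmetad=0, as vdsm does. As a sketch (assuming a
RHEL/CentOS 7 host, where the service is named lvm2-lvmetad), you can
check its state with:

# Is the lvmetad daemon or its socket active?
systemctl status lvm2-lvmetad.service lvm2-lvmetad.socket

# What does the host's lvm.conf say?
grep -n use_lvmetad /etc/lvm/lvm.conf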
But the monitoring system is accessing this domain just fine:
Thread-12::DEBUG::2016-12-07 17:18:13,048::check::296::storage.check::(_start_process) START check '/dev/6d25efc2-b056-4c43-9a82-82f0c8a5ebc3/metadata' cmd=['/usr/bin/taskset', '--cpu-list', '0-31', '/usr/bin/dd', 'if=/dev/6d25efc2-b056-4c43-9a82-82f0c8a5ebc3/metadata', 'of=/dev/null', 'bs=4096', 'count=1', 'iflag=direct'] delay=0.00

Thread-12::DEBUG::2016-12-07 17:18:13,069::check::327::storage.check::(_check_completed) FINISH check '/dev/6d25efc2-b056-4c43-9a82-82f0c8a5ebc3/metadata' rc=0 err=bytearray(b'1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000367523 s, 11.1 MB/s\n') elapsed=0.02
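You can reproduce this check by hand on the host; this is the same dd
command the monitor runs, taken verbatim from the log above:

/usr/bin/dd if=/dev/6d25efc2-b056-4c43-9a82-82f0c8a5ebc3/metadata of=/dev/null bs=4096 count=1 iflag=direct

If this returns rc=0 while lvm vgs fails, the LUNs are readable and the
problem is most likely in the lvm layer rather than in the FC storage
itself.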
I suggest filing a bug about this.
I would try restarting vdsm; maybe there is some issue with the vdsm lvm cache.
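For example, on a systemd-based host (vdsm's lvm cache lives in the vdsm
process, so a restart forces a fresh lvm scan):

systemctl restart vdsmd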
It can also be useful to see the output of:
pvscan --cache
vgs -vvvv 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3
vgs -o name,pv_name -vvvv 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3
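Given the lvmetad warnings, it may also be worth repeating vgs with
lvmetad bypassed, closer to how vdsm invokes lvm (a sketch based on the
--config string in your log):

vgs --config 'global { use_lvmetad=0 }' 6d25efc2-b056-4c43-9a82-82f0c8a5ebc3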
By the way, we are running oVirt Version 4.0.1.
Running 4.0.1 is not a good idea; you should upgrade to the latest version.
Cheers,
Nir
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users