[ovirt-users] Fwd: Re: ovirt - can't attach master domain II

Ravishankar N ravishankar at redhat.com
Wed Feb 24 11:34:55 UTC 2016


On 02/24/2016 04:48 PM, paf1 at email.cz wrote:
>
>
> prereq: 2KVM12-P2 = master domain
> ---------------------------------------------------------
> YES - I'm using the gluster FUSE mount (fuse.glusterfs):
> localhost:/2KVM12-P2 on 
> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2 type 
> fuse.glusterfs 
> (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
> ---------------------------------------------------------
> Healing
> ======
> # gluster volume heal 2KVM12-P2 info
> Brick 16.0.0.164:/STORAGES/g1r5p2/GFS
> Number of entries: 0
>
> Brick 16.0.0.163:/STORAGES/g1r5p2/GFS
> Number of entries: 0
>
> # while true; do for vol in `gluster volume list`; do gluster volume 
> heal $vol info | sort | grep "Number of entries" | awk -F: '{tot+=$2} 
> END { printf("Heal entries for '"$vol"': %d\n", $tot);}'; done; sleep 
> 120; echo -e "\n==================\n"; done
> Heal entries for 1KVM12-BCK: 1
> Heal entries for 1KVM12-P1: 1
> Heal entries for 1KVM12-P2: 0
> Heal entries for 1KVM12-P3: 0
> Heal entries for 1KVM12-P4: 0
> Heal entries for 1KVM12-P5: 0
> Heal entries for 2KVM12-P1: 1
> Heal entries for 2KVM12-P2: 0
> Heal entries for 2KVM12-P3: 0
> Heal entries for 2KVM12-P5: 0
> Heal entries for 2KVM12_P4: 1
>
> # gluster volume heal 1KVM12-BCK info split-brain
>     Brick 16.0.0.161:/STORAGES/g2r5p1/GFS
>     /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>     Number of entries in split-brain: 1
>
>     Brick 16.0.0.162:/STORAGES/g2r5p1/GFS
>     /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>     Number of entries in split-brain: 1
>
> # gluster volume heal 1KVM12-P1 info split-brain
>     Brick 16.0.0.161:/STORAGES/g1r5p1/GFS
>     /__DIRECT_IO_TEST__
>     Number of entries in split-brain: 1
>
>     Brick 16.0.0.162:/STORAGES/g1r5p1/GFS
>     /__DIRECT_IO_TEST__
>     Number of entries in split-brain: 1
>
> etc......
>
>
> YES - some files are in split-brain, but NOT on the master domain (I 
> will deal with those later, after the master - if possible)

I'm not sure if it is related, but you could try to resolve the 
split-brain first and see if it helps. Also, I see that you are using 
replica-2. It is recommended to use replica-3 or arbiter volumes to 
avoid split-brains.
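
If you decide to clear the existing entries from the CLI, gluster's 
policy-based split-brain resolution can be used. A sketch only: which brick 
holds the good copy is for you to judge (below I simply took the first brick 
reported for each volume in your heal output), and the exact policies offered 
depend on your gluster version:

# gluster volume heal 1KVM12-BCK split-brain source-brick 16.0.0.161:/STORAGES/g2r5p1/GFS /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
# gluster volume heal 1KVM12-P1 split-brain source-brick 16.0.0.161:/STORAGES/g1r5p1/GFS /__DIRECT_IO_TEST__

Newer releases also accept bigger-file or latest-mtime policies instead of 
source-brick. Longer term, if your gluster version supports it, a replica-2 
volume can be converted with "gluster volume add-brick <volname> replica 3 
arbiter 1 <host>:<brick>" (host and brick are placeholders here) so that 
quorum prevents new split-brains.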

-Ravi

>
> ---------------------------------------------------------------
> vdsm.log
> =========
>
> Thread-461::DEBUG::2016-02-24 
> 11:12:45,328::fileSD::262::Storage.Misc.excCmd::(getReadDelay) 
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n333 bytes (333 B) 
> copied, 0.000724379 s, 460 kB/s\n'; <rc> = 0
> Thread-461::INFO::2016-02-24 
> 11:12:45,331::clusterlock::219::Storage.SANLock::(acquireHostId) 
> Acquiring host id for domain 88adbd49-62d6-45b1-9992-b04464a04112 (id: 3)
> Thread-461::DEBUG::2016-02-24 
> 11:12:45,331::clusterlock::237::Storage.SANLock::(acquireHostId) Host 
> id for domain 88adbd49-62d6-45b1-9992-b04464a04112 successfully 
> acquired (id: 3)
> Thread-33186::DEBUG::2016-02-24 
> 11:12:46,067::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) 
> Calling 'GlusterVolume.list' in bridge with {}
> Thread-33186::DEBUG::2016-02-24 
> 11:12:46,204::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) 
> Return 'GlusterVolume.list' in bridge with {'volumes': {'2KVM12-P5': 
> {'transportType': ['TCP'], 'uuid': 
> '4a6d775d-4a51-4f6c-9bfa-f7ef57f3ca1d', 'bricks': 
> ['16.0.0.164:/STORAGES/g1r5p5/GFS', 
> '16.0.0.163:/STORAGES/g1r5p5/GFS'], 'volumeName': '2KVM12-P5', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.164:/STORAGES/g1r5p5/GFS', 'hostUuid': 
> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name': 
> '16.0.0.163:/STORAGES/g1r5p5/GFS', 'hostUuid': 
> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 
> 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 
> 'cluster.quorum-count': '1', 'performance.io-cache': 'off', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}, '2KVM12_P4': {'transportType': ['TCP'], 
> 'uuid': '18310aeb-639f-4b6d-9ef4-9ef560d6175c', 'bricks': 
> ['16.0.0.163:/STORAGES/g1r5p4/GFS', 
> '16.0.0.164:/STORAGES/g1r5p4/GFS'], 'volumeName': '2KVM12_P4', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p4/GFS', 'hostUuid': 
> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name': 
> '16.0.0.164:/STORAGES/g1r5p4/GFS', 'hostUuid': 
> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 
> 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 
> 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}, '2KVM12-P1': {'transportType': ['TCP'], 
> 'uuid': 'cbf142f8-a40b-4cf4-ad29-2243c81d30c1', 'bricks': 
> ['16.0.0.163:/STORAGES/g1r5p1/GFS', 
> '16.0.0.164:/STORAGES/g1r5p1/GFS'], 'volumeName': '2KVM12-P1', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p1/GFS', 'hostUuid': 
> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name': 
> '16.0.0.164:/STORAGES/g1r5p1/GFS', 'hostUuid': 
> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 
> 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 
> 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}, '2KVM12-P3': {'transportType': ['TCP'], 
> 'uuid': '25a5ec22-660e-42a0-aa00-45211d341738', 'bricks': 
> ['16.0.0.163:/STORAGES/g1r5p3/GFS', 
> '16.0.0.164:/STORAGES/g1r5p3/GFS'], 'volumeName': '2KVM12-P3', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p3/GFS', 'hostUuid': 
> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name': 
> '16.0.0.164:/STORAGES/g1r5p3/GFS', 'hostUuid': 
> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 
> 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 
> 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}, '2KVM12-P2': {'transportType': ['TCP'], 
> 'uuid': '9745551f-4696-4a6c-820a-619e359a61fd', 'bricks': 
> ['16.0.0.164:/STORAGES/g1r5p2/GFS', 
> '16.0.0.163:/STORAGES/g1r5p2/GFS'], 'volumeName': '2KVM12-P2', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.164:/STORAGES/g1r5p2/GFS', 'hostUuid': 
> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name': 
> '16.0.0.163:/STORAGES/g1r5p2/GFS', 'hostUuid': 
> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 
> 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 
> 'cluster.quorum-count': '1', 'performance.io-cache': 'off', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}, '1KVM12-P4': {'transportType': ['TCP'], 
> 'uuid': 'b4356604-4404-428a-9da6-f1636115e2fd', 'bricks': 
> ['16.0.0.161:/STORAGES/g1r5p4/GFS', 
> '16.0.0.162:/STORAGES/g1r5p4/GFS'], 'volumeName': '1KVM12-P4', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p4/GFS', 'hostUuid': 
> '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': 
> '16.0.0.162:/STORAGES/g1r5p4/GFS', 'hostUuid': 
> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch': 
> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 
> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 
> 'performance.io-cache': 'off', 'storage.owner-uid': '36', 
> 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 
> 'diagnostics.latency-measurement': 'on'}}, '1KVM12-BCK': 
> {'transportType': ['TCP'], 'uuid': 
> '62c89345-fd61-4b67-b8b4-69296eb7d217', 'bricks': 
> ['16.0.0.161:/STORAGES/g2r5p1/GFS', 
> '16.0.0.162:/STORAGES/g2r5p1/GFS'], 'volumeName': '1KVM12-BCK', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g2r5p1/GFS', 'hostUuid': 
> '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': 
> '16.0.0.162:/STORAGES/g2r5p1/GFS', 'hostUuid': 
> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 
> 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 
> 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}, '1KVM12-P2': {'transportType': ['TCP'], 
> 'uuid': 'aa2d607d-3c6c-4f13-8205-aae09dcc9d35', 'bricks': 
> ['16.0.0.161:/STORAGES/g1r5p2/GFS', 
> '16.0.0.162:/STORAGES/g1r5p2/GFS'], 'volumeName': '1KVM12-P2', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p2/GFS', 'hostUuid': 
> '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': 
> '16.0.0.162:/STORAGES/g1r5p2/GFS', 'hostUuid': 
> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'diagnostics.count-fop-hits': 'on', 
> 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 
> 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 
> 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 
> 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 
> 'diagnostics.latency-measurement': 'on'}}, '1KVM12-P3': 
> {'transportType': ['TCP'], 'uuid': 
> '6060ff77-d552-4d94-97bf-5a32982e7d8a', 'bricks': 
> ['16.0.0.161:/STORAGES/g1r5p3/GFS', 
> '16.0.0.162:/STORAGES/g1r5p3/GFS'], 'volumeName': '1KVM12-P3', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p3/GFS', 'hostUuid': 
> '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': 
> '16.0.0.162:/STORAGES/g1r5p3/GFS', 'hostUuid': 
> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'diagnostics.count-fop-hits': 'on', 
> 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 
> 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 
> 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 
> 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 
> 'diagnostics.latency-measurement': 'on'}}, '1KVM12-P1': 
> {'transportType': ['TCP'], 'uuid': 
> 'f410c6a9-9a51-42b3-89bb-c20ac72a0461', 'bricks': 
> ['16.0.0.161:/STORAGES/g1r5p1/GFS', 
> '16.0.0.162:/STORAGES/g1r5p1/GFS'], 'volumeName': '1KVM12-P1', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p1/GFS', 'hostUuid': 
> '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': 
> '16.0.0.162:/STORAGES/g1r5p1/GFS', 'hostUuid': 
> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'diagnostics.count-fop-hits': 'on', 
> 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 
> 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 
> 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 
> 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 
> 'diagnostics.latency-measurement': 'on'}}, '1KVM12-P5': 
> {'transportType': ['TCP'], 'uuid': 
> '420fa218-60bc-47e4-89a8-ce39b7da885e', 'bricks': 
> ['16.0.0.161:/STORAGES/g1r5p5/GFS', 
> '16.0.0.162:/STORAGES/g1r5p5/GFS'], 'volumeName': '1KVM12-P5', 
> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 
> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 
> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p5/GFS', 'hostUuid': 
> '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': 
> '16.0.0.162:/STORAGES/g1r5p5/GFS', 'hostUuid': 
> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': 
> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 
> 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 
> 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 
> 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 
> 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 
> 'storage.owner-gid': '36'}}}}
> Thread-431::DEBUG::2016-02-24 
> 11:12:47,729::fileSD::262::Storage.Misc.excCmd::(getReadDelay) 
> /usr/bin/dd 
> if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata 
> iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
> Thread-431::DEBUG::2016-02-24 
> 11:12:47,743::fileSD::262::Storage.Misc.excCmd::(getReadDelay) 
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n997 bytes (997 B) 
> copied, 0.000569374 s, 1.8 MB/s\n'; <rc> = 0
> Thread-431::INFO::2016-02-24 
> 11:12:47,751::clusterlock::219::Storage.SANLock::(acquireHostId) 
> Acquiring host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)
> Thread-431::DEBUG::2016-02-24 
> 11:12:47,751::clusterlock::237::Storage.SANLock::(acquireHostId) Host 
> id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 successfully 
> acquired (id: 3)
> --------------------------------
>
> Thread-349::ERROR::2016-02-24 
> 11:18:20,040::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) 
> Error while collecting domain ef010d08-aed1-41c4-ba9a-e6d9bdecb4b4 
> monitoring information
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/domainMonitor.py", line 221, in 
> _monitorDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>     domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
>     return 
> GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
>   File "/usr/share/vdsm/storage/fileSD.py", line 160, in __init__
>     validateFileSystemFeatures(sdUUID, self.mountpoint)
>   File "/usr/share/vdsm/storage/fileSD.py", line 89, in 
> validateFileSystemFeatures
>     oop.getProcessPool(sdUUID).directTouch(testFilePath)
>   File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in directTouch
>     ioproc.touch(path, flags, mode)
>   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 
> 543, in touch
>     self.timeout)
>   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 
> 427, in _sendCommand
>     raise OSError(errcode, errstr)
> OSError: [Errno 5] Input/output error
> Thread-453::ERROR::2016-02-24 
> 11:18:20,043::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) 
> Error while collecting domain 300e9ac8-3c2f-4703-9bb1-1df2130c7c97 
> monitoring information
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/domainMonitor.py", line 221, in 
> _monitorDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>     domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
>     return 
> GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
>   File "/usr/share/vdsm/storage/fileSD.py", line 160, in __init__
>     validateFileSystemFeatures(sdUUID, self.mountpoint)
>   File "/usr/share/vdsm/storage/fileSD.py", line 89, in 
> validateFileSystemFeatures
>     oop.getProcessPool(sdUUID).directTouch(testFilePath)
>   File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in directTouch
>     ioproc.touch(path, flags, mode)
>   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 
> 543, in touch
>     self.timeout)
>   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 
> 427, in _sendCommand
>     raise OSError(errcode, errstr)
> OSError: [Errno 5] Input/output error
> Thread-431::DEBUG::2016-02-24 
> 11:18:20,109::fileSD::262::Storage.Misc.excCmd::(getReadDelay) 
> /usr/bin/dd 
> if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata 
> iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
> Thread-431::DEBUG::2016-02-24 
> 11:18:20,122::fileSD::262::Storage.Misc.excCmd::(getReadDelay) 
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n997 bytes (997 B) 
> copied, 0.000444081 s, 2.2 MB/s\n'; <rc> = 0
> Thread-431::INFO::2016-02-24 
> 11:18:20,128::clusterlock::219::Storage.SANLock::(acquireHostId) 
> Acquiring host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)
> Thread-431::DEBUG::2016-02-24 
> 11:18:20,129::clusterlock::237::Storage.SANLock::(acquireHostId) Host 
> id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 successfully 
> acquired (id: 3)
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,690::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) 
> Calling 'Task.getStatus' in bridge with {u'taskID': 
> u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,692::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state init -> 
> state preparing
> Thread-35631::INFO::2016-02-24 
> 11:18:20,692::logUtils::44::dispatcher::(wrapper) Run and protect: 
> getTaskStatus(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660', 
> spUUID=None, options=None)
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,693::taskManager::103::Storage.TaskManager::(getTaskStatus) 
> Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,693::taskManager::106::Storage.TaskManager::(getTaskStatus) 
> Return. Response: {'code': 661, 'message': 'Cannot acquire host id', 
> 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 
> 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35631::INFO::2016-02-24 
> 11:18:20,693::logUtils::47::dispatcher::(wrapper) Run and protect: 
> getTaskStatus, Return response: {'taskStatus': {'code': 661, 
> 'message': 'Cannot acquire host id', 'taskState': 'finished', 
> 'taskResult': 'cleanSuccess', 'taskID': 
> 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,693::task::1191::Storage.TaskManager.Task::(prepare) 
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::finished: {'taskStatus': 
> {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 
> 'finished', 'taskResult': 'cleanSuccess', 'taskID': 
> 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,693::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state 
> preparing -> state finished
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,694::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) 
> Owner.releaseAll requests {} resources {}
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,694::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
> Owner.cancelAll requests {}
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,694::task::993::Storage.TaskManager.Task::(_decref) 
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::ref 0 aborting False
> Thread-35631::DEBUG::2016-02-24 
> 11:18:20,694::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) 
> Return 'Task.getStatus' in bridge with {'code': 661, 'message': 
> 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 
> 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,699::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) 
> Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID': 
> u'00000002-0002-0002-0002-00000000021e'}
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,700::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state init -> 
> state preparing
> Thread-35632::INFO::2016-02-24 
> 11:18:20,700::logUtils::44::dispatcher::(wrapper) Run and protect: 
> getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e', options=None)
> Thread-35632::INFO::2016-02-24 
> 11:18:20,707::logUtils::47::dispatcher::(wrapper) Run and protect: 
> getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 
> 'Free', 'spmLver': -1}}
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,707::task::1191::Storage.TaskManager.Task::(prepare) 
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::finished: {'spm_st': 
> {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,707::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state 
> preparing -> state finished
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,708::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) 
> Owner.releaseAll requests {} resources {}
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,708::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
> Owner.cancelAll requests {}
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,708::task::993::Storage.TaskManager.Task::(_decref) 
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::ref 0 aborting False
> Thread-35632::DEBUG::2016-02-24 
> 11:18:20,708::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) 
> Return 'StoragePool.getSpmStatus' in bridge with {'spmId': -1, 
> 'spmStatus': 'Free', 'spmLver': -1}
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,737::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) 
> Calling 'Task.clear' in bridge with {u'taskID': 
> u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,738::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state init -> 
> state preparing
> Thread-35633::INFO::2016-02-24 
> 11:18:20,738::logUtils::44::dispatcher::(wrapper) Run and protect: 
> clearTask(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660', spUUID=None, 
> options=None)
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,739::taskManager::171::Storage.TaskManager::(clearTask) 
> Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,739::taskManager::176::Storage.TaskManager::(clearTask) Return.
> Thread-35633::INFO::2016-02-24 
> 11:18:20,739::logUtils::47::dispatcher::(wrapper) Run and protect: 
> clearTask, Return response: None
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,739::task::1191::Storage.TaskManager.Task::(prepare) 
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::finished: None
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,740::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state 
> preparing -> state finished
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,740::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) 
> Owner.releaseAll requests {} resources {}
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,740::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
> Owner.cancelAll requests {}
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,740::task::993::Storage.TaskManager.Task::(_decref) 
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::ref 0 aborting False
> Thread-35633::DEBUG::2016-02-24 
> 11:18:20,740::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) 
> Return 'Task.clear' in bridge with True
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,859::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) 
> Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID': 
> u'00000002-0002-0002-0002-00000000021e'}
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,860::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state init -> 
> state preparing
> Thread-35634::INFO::2016-02-24 
> 11:18:20,860::logUtils::44::dispatcher::(wrapper) Run and protect: 
> getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e', options=None)
> Thread-35634::INFO::2016-02-24 
> 11:18:20,867::logUtils::47::dispatcher::(wrapper) Run and protect: 
> getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 
> 'Free', 'spmLver': -1}}
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,867::task::1191::Storage.TaskManager.Task::(prepare) 
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::finished: {'spm_st': 
> {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,867::task::595::Storage.TaskManager.Task::(_updateState) 
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state 
> preparing -> state finished
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,867::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) 
> Owner.releaseAll requests {} resources {}
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,867::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
> Owner.cancelAll requests {}
> Thread-35634::DEBUG::2016-02-24 
> 11:18:20,867::task::993::Storage.TaskManager.Task::(_decref) 
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::ref 0 aborting False
> ---------------------------------------
>
> These blocks are generated repeatedly, in a cycle, for each domain.
>
> Any ideas?
> regs.
> Pa.
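
One observation on the vdsm traceback above: it fails in 
validateFileSystemFeatures -> directTouch, which (if I read the vdsm code 
correctly) touches the __DIRECT_IO_TEST__ file in the domain mount, and that 
is one of the files your heal info reports in split-brain, so the EIO is 
expected until the split-brain is cleared. You can double-check the state 
directly on the bricks; a sketch, to be run on both servers, with the brick 
paths taken from your heal output:

# getfattr -d -m . -e hex /STORAGES/g1r5p1/GFS/__DIRECT_IO_TEST__
# getfattr -d -m . -e hex /STORAGES/g2r5p1/GFS/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids

If the trusted.afr.* changelog counters on the two bricks are non-zero and 
blame each other, the file is indeed in split-brain.
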
>
>
>
> On 24.2.2016 10:54, Ravishankar N wrote:
>> Hi,
>>
>> On 02/24/2016 06:43 AM, paf1 at email.cz wrote:
>>> Hi,
>>> I found what may be the main problem: an IO error (-5) when accessing 
>>> the "ids" file.
>>> This file is not accessible via NFS, but it is accessible locally.
>> How is NFS coming into the picture? Are you not using gluster fuse 
>> mount?
>>> How can I fix it?
>> Can you run `gluster volume heal volname info` and `gluster volume 
>> heal volname info split-brain` to see if the "ids" file is in 
>> split-brain? A file in split-brain returns EIO when accessed from the 
>> mount.
>> Regards,
>> Ravi
>>
>>
>>> regs.
>>> Pavel
>>>
>>> # sanlock client log_dump
>>> ....
>>> 0 flags 1 timeout 0
>>> 2016-02-24 02:01:10+0100 3828 [12111]: s1316 lockspace 
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>> 2016-02-24 02:01:10+0100 3828 [12111]: cmd_add_lockspace 4,15 async 
>>> done 0
>>> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire begin 
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1
>>> 2016-02-24 02:01:10+0100 3828 [19556]: 88adbd49 aio collect 0 
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:10+0100 3828 [19556]: read_sectors delta_leader 
>>> offset 0 rv -5 
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
>>> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire 
>>> leader_read1 error -5
>>> 2016-02-24 02:01:11+0100 3829 [12111]: s1316 add_lockspace fail 
>>> result -5
>>> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15 
>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0 
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:12+0100 3831 [12116]: s1317 lockspace 
>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>>> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15 async 
>>> done 0
>>> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire begin 
>>> 7f52b697-c199-4f58-89aa-102d44327124:1
>>> 2016-02-24 02:01:12+0100 3831 [19562]: 7f52b697 aio collect 0 
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:12+0100 3831 [19562]: read_sectors delta_leader 
>>> offset 0 rv -5 
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>>> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire 
>>> leader_read1 error -5
>>> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15 
>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0 
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:13+0100 3831 [1321]: s1318 lockspace 
>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0
>>> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15 async 
>>> done 0
>>> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire begin 
>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1
>>> 2016-02-24 02:01:13+0100 3831 [19564]: 0fcad888 aio collect 0 
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458201000 result -5:0 match res
>>> 2016-02-24 02:01:13+0100 3831 [19564]: read_sectors delta_leader 
>>> offset 0 rv -5 
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>>> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire 
>>> leader_read1 error -5
>>> 2016-02-24 02:01:13+0100 3832 [12116]: s1317 add_lockspace fail 
>>> result -5
>>> 2016-02-24 02:01:14+0100 3832 [1321]: s1318 add_lockspace fail result -5
>>> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15 
>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0 
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:19+0100 3838 [12106]: s1319 lockspace 
>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0
>>> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15 async 
>>> done 0
>>> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire begin 
>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1
>>> 2016-02-24 02:01:19+0100 3838 [19638]: 3da46e07 aio collect 0 
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:19+0100 3838 [19638]: read_sectors delta_leader 
>>> offset 0 rv -5 
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids
>>> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire 
>>> leader_read1 error -5
>>> 2016-02-24 02:01:20+0100 3839 [12106]: s1319 add_lockspace fail 
>>> result -5
>>> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15 
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0 
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:20+0100 3839 [1320]: s1320 lockspace 
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15 async 
>>> done 0
>>> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire begin 
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1
>>> 2016-02-24 02:01:20+0100 3839 [19658]: 88adbd49 aio collect 0 
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:20+0100 3839 [19658]: read_sectors delta_leader 
>>> offset 0 rv -5 
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
>>> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire 
>>> leader_read1 error -5
>>> 2016-02-24 02:01:21+0100 3840 [1320]: s1320 add_lockspace fail result -5
>>>
>>>
>>>

