used replica 2 with the following volume options:
Volume Name: 2KVM12-P2
Type: Replicate
Volume ID: 9745551f-4696-4a6c-820a-619e359a61fd
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 16.0.0.164:/STORAGES/g1r5p2/GFS
Brick2: 16.0.0.163:/STORAGES/g1r5p2/GFS
Options Reconfigured:
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-cache: off
performance.read-ahead: off
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.quick-read: off
cluster.quorum-count: 1
cluster.server-quorum-type: none
cluster.quorum-type: fixed
was running for over a year with no problems (reboots, etc.)
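
For reference, a volume with this profile would be created and tuned roughly like this (a sketch only, using the brick paths above; the exact command history is not preserved). Note that cluster.quorum-type=fixed with cluster.quorum-count=1 lets clients keep writing with only one brick reachable, which is exactly what makes split-brain possible on replica 2:

# gluster volume create 2KVM12-P2 replica 2 16.0.0.164:/STORAGES/g1r5p2/GFS 16.0.0.163:/STORAGES/g1r5p2/GFS
# gluster volume set 2KVM12-P2 cluster.quorum-type fixed
# gluster volume set 2KVM12-P2 cluster.quorum-count 1
# gluster volume set 2KVM12-P2 network.remote-dio enable
# gluster volume set 2KVM12-P2 storage.owner-uid 36
# gluster volume set 2KVM12-P2 storage.owner-gid 36
# gluster volume start 2KVM12-P2
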
On 24.2.2016 12:34, Ravishankar N wrote:
On 02/24/2016 04:48 PM, paf1(a)email.cz wrote:
>
>
> prereq: 2KVM12-P2 = master domain
> ---------------------------------------------------------
> YES - I'm using the gluster FUSE mount:
> localhost:/2KVM12-P2 on /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
> ---------------------------------------------------------
> Healing
> ======
> # gluster volume heal 2KVM12-P2 info
> Brick 16.0.0.164:/STORAGES/g1r5p2/GFS
> Number of entries: 0
>
> Brick 16.0.0.163:/STORAGES/g1r5p2/GFS
> Number of entries: 0
>
> # while true; do for vol in `gluster volume list`; do gluster volume heal $vol info | sort | grep "Number of entries" | awk -F: '{tot+=$2} END { printf("Heal entries for '"$vol"': %d\n", $tot);}'; done; sleep 120; echo -e "\n==================\n"; done
> Heal entries for 1KVM12-BCK: 1
> Heal entries for 1KVM12-P1: 1
> Heal entries for 1KVM12-P2: 0
> Heal entries for 1KVM12-P3: 0
> Heal entries for 1KVM12-P4: 0
> Heal entries for 1KVM12-P5: 0
> Heal entries for 2KVM12-P1: 1
> Heal entries for 2KVM12-P2: 0
> Heal entries for 2KVM12-P3: 0
> Heal entries for 2KVM12-P5: 0
> Heal entries for 2KVM12_P4: 1
>
> # gluster volume heal 1KVM12-BCK info split-brain
> Brick 16.0.0.161:/STORAGES/g2r5p1/GFS
> /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
> Number of entries in split-brain: 1
>
> Brick 16.0.0.162:/STORAGES/g2r5p1/GFS
> /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
> Number of entries in split-brain: 1
>
> # gluster volume heal 1KVM12-P1 info split-brain
> Brick 16.0.0.161:/STORAGES/g1r5p1/GFS
> /__DIRECT_IO_TEST__
> Number of entries in split-brain: 1
>
> Brick 16.0.0.162:/STORAGES/g1r5p1/GFS
> /__DIRECT_IO_TEST__
> Number of entries in split-brain: 1
>
> etc......
>
>
> YES - in split-brain, but NOT the master domain (will solve later, after the master - if possible)
I'm not sure if it is related, but you could try to resolve the
split-brain first and see if it helps. Also, I see that you are using
replica-2. It is recommended to use replica-3 or arbiter volumes to
avoid split-brains.
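If the two copies really diverge, the per-file split-brain resolution commands in the CLI can pick a winner (available from glusterfs 3.7 onwards; the examples below are a sketch using the entries from your heal info output, so check which copy you trust first):

# gluster volume heal 1KVM12-BCK split-brain latest-mtime /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
# gluster volume heal 1KVM12-P1 split-brain source-brick 16.0.0.161:/STORAGES/g1r5p1/GFS /__DIRECT_IO_TEST__

(There is also a bigger-file policy.) Once things are healthy, recent releases can convert a replica 2 volume to arbiter in place; the third brick below is just a placeholder you would provision:

# gluster volume add-brick 1KVM12-BCK replica 3 arbiter 1 <arbiter-host>:/STORAGES/arbiter/1KVM12-BCK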
-Ravi
>
> ---------------------------------------------------------------
> vdsm.log
> =========
>
> Thread-461::DEBUG::2016-02-24
> 11:12:45,328::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n333 bytes (333 B)
> copied, 0.000724379 s, 460 kB/s\n'; <rc> = 0
> Thread-461::INFO::2016-02-24
> 11:12:45,331::clusterlock::219::Storage.SANLock::(acquireHostId)
> Acquiring host id for domain 88adbd49-62d6-45b1-9992-b04464a04112 (id: 3)
> Thread-461::DEBUG::2016-02-24
> 11:12:45,331::clusterlock::237::Storage.SANLock::(acquireHostId) Host
> id for domain 88adbd49-62d6-45b1-9992-b04464a04112 successfully
> acquired (id: 3)
> Thread-33186::DEBUG::2016-02-24
> 11:12:46,067::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
> Calling 'GlusterVolume.list' in bridge with {}
> Thread-33186::DEBUG::2016-02-24
> 11:12:46,204::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
> Return 'GlusterVolume.list' in bridge with {'volumes':
> {'2KVM12-P5': {'transportType': ['TCP'], 'uuid': '4a6d775d-4a51-4f6c-9bfa-f7ef57f3ca1d', 'bricks': ['16.0.0.164:/STORAGES/g1r5p5/GFS', '16.0.0.163:/STORAGES/g1r5p5/GFS'], 'volumeName': '2KVM12-P5', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.164:/STORAGES/g1r5p5/GFS', 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name': '16.0.0.163:/STORAGES/g1r5p5/GFS', 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'performance.io-cache': 'off', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}},
> '2KVM12_P4': {'transportType': ['TCP'], 'uuid': '18310aeb-639f-4b6d-9ef4-9ef560d6175c', 'bricks': ['16.0.0.163:/STORAGES/g1r5p4/GFS', '16.0.0.164:/STORAGES/g1r5p4/GFS'], 'volumeName': '2KVM12_P4', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p4/GFS', 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name': '16.0.0.164:/STORAGES/g1r5p4/GFS', 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}},
> '2KVM12-P1': {'transportType': ['TCP'], 'uuid': 'cbf142f8-a40b-4cf4-ad29-2243c81d30c1', 'bricks': ['16.0.0.163:/STORAGES/g1r5p1/GFS', '16.0.0.164:/STORAGES/g1r5p1/GFS'], 'volumeName': '2KVM12-P1', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p1/GFS', 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name': '16.0.0.164:/STORAGES/g1r5p1/GFS', 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}},
> '2KVM12-P3': {'transportType': ['TCP'], 'uuid': '25a5ec22-660e-42a0-aa00-45211d341738', 'bricks': ['16.0.0.163:/STORAGES/g1r5p3/GFS', '16.0.0.164:/STORAGES/g1r5p3/GFS'], 'volumeName': '2KVM12-P3', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p3/GFS', 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name': '16.0.0.164:/STORAGES/g1r5p3/GFS', 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}},
> '2KVM12-P2': {'transportType': ['TCP'], 'uuid': '9745551f-4696-4a6c-820a-619e359a61fd', 'bricks': ['16.0.0.164:/STORAGES/g1r5p2/GFS', '16.0.0.163:/STORAGES/g1r5p2/GFS'], 'volumeName': '2KVM12-P2', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.164:/STORAGES/g1r5p2/GFS', 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name': '16.0.0.163:/STORAGES/g1r5p2/GFS', 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'performance.io-cache': 'off', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}},
> '1KVM12-P4': {'transportType': ['TCP'], 'uuid': 'b4356604-4404-428a-9da6-f1636115e2fd', 'bricks': ['16.0.0.161:/STORAGES/g1r5p4/GFS', '16.0.0.162:/STORAGES/g1r5p4/GFS'], 'volumeName': '1KVM12-P4', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p4/GFS', 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': '16.0.0.162:/STORAGES/g1r5p4/GFS', 'hostUuid': 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'performance.io-cache': 'off', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
> '1KVM12-BCK': {'transportType': ['TCP'], 'uuid': '62c89345-fd61-4b67-b8b4-69296eb7d217', 'bricks': ['16.0.0.161:/STORAGES/g2r5p1/GFS', '16.0.0.162:/STORAGES/g2r5p1/GFS'], 'volumeName': '1KVM12-BCK', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g2r5p1/GFS', 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': '16.0.0.162:/STORAGES/g2r5p1/GFS', 'hostUuid': 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}},
> '1KVM12-P2': {'transportType': ['TCP'], 'uuid': 'aa2d607d-3c6c-4f13-8205-aae09dcc9d35', 'bricks': ['16.0.0.161:/STORAGES/g1r5p2/GFS', '16.0.0.162:/STORAGES/g1r5p2/GFS'], 'volumeName': '1KVM12-P2', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p2/GFS', 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': '16.0.0.162:/STORAGES/g1r5p2/GFS', 'hostUuid': 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
> '1KVM12-P3': {'transportType': ['TCP'], 'uuid': '6060ff77-d552-4d94-97bf-5a32982e7d8a', 'bricks': ['16.0.0.161:/STORAGES/g1r5p3/GFS', '16.0.0.162:/STORAGES/g1r5p3/GFS'], 'volumeName': '1KVM12-P3', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p3/GFS', 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': '16.0.0.162:/STORAGES/g1r5p3/GFS', 'hostUuid': 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
> '1KVM12-P1': {'transportType': ['TCP'], 'uuid': 'f410c6a9-9a51-42b3-89bb-c20ac72a0461', 'bricks': ['16.0.0.161:/STORAGES/g1r5p1/GFS', '16.0.0.162:/STORAGES/g1r5p1/GFS'], 'volumeName': '1KVM12-P1', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p1/GFS', 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': '16.0.0.162:/STORAGES/g1r5p1/GFS', 'hostUuid': 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
> '1KVM12-P5': {'transportType': ['TCP'], 'uuid': '420fa218-60bc-47e4-89a8-ce39b7da885e', 'bricks': ['16.0.0.161:/STORAGES/g1r5p5/GFS', '16.0.0.162:/STORAGES/g1r5p5/GFS'], 'volumeName': '1KVM12-P5', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p5/GFS', 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name': '16.0.0.162:/STORAGES/g1r5p5/GFS', 'hostUuid': 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options': {'cluster.server-quorum-type': 'none', 'cluster.eager-lock': 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read': 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1', 'storage.owner-uid': '36', 'performance.read-ahead': 'off', 'storage.owner-gid': '36'}}}}
> Thread-431::DEBUG::2016-02-24
> 11:12:47,729::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
> /usr/bin/dd if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
> Thread-431::DEBUG::2016-02-24
> 11:12:47,743::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n997 bytes (997 B)
> copied, 0.000569374 s, 1.8 MB/s\n'; <rc> = 0
> Thread-431::INFO::2016-02-24
> 11:12:47,751::clusterlock::219::Storage.SANLock::(acquireHostId)
> Acquiring host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)
> Thread-431::DEBUG::2016-02-24
> 11:12:47,751::clusterlock::237::Storage.SANLock::(acquireHostId) Host
> id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 successfully
> acquired (id: 3)
> --------------------------------
>
> Thread-349::ERROR::2016-02-24
> 11:18:20,040::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain)
> Error while collecting domain ef010d08-aed1-41c4-ba9a-e6d9bdecb4b4
> monitoring information
> Traceback (most recent call last):
> File "/usr/share/vdsm/storage/domainMonitor.py", line 221, in
> _monitorDomain
> self.domain = sdCache.produce(self.sdUUID)
> File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
> domain.getRealDomain()
> File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> return self._cache._realProduce(self._sdUUID)
> File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
> domain = self._findDomain(sdUUID)
> File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> dom = findMethod(sdUUID)
> File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
> return
> GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
> File "/usr/share/vdsm/storage/fileSD.py", line 160, in __init__
> validateFileSystemFeatures(sdUUID, self.mountpoint)
> File "/usr/share/vdsm/storage/fileSD.py", line 89, in
> validateFileSystemFeatures
> oop.getProcessPool(sdUUID).directTouch(testFilePath)
> File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in
> directTouch
> ioproc.touch(path, flags, mode)
> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
> 543, in touch
> self.timeout)
> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
> 427, in _sendCommand
> raise OSError(errcode, errstr)
> OSError: [Errno 5] Input/output error
> Thread-453::ERROR::2016-02-24
> 11:18:20,043::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain)
> Error while collecting domain 300e9ac8-3c2f-4703-9bb1-1df2130c7c97
> monitoring information
> Traceback (most recent call last):
> File "/usr/share/vdsm/storage/domainMonitor.py", line 221, in
> _monitorDomain
> self.domain = sdCache.produce(self.sdUUID)
> File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
> domain.getRealDomain()
> File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> return self._cache._realProduce(self._sdUUID)
> File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
> domain = self._findDomain(sdUUID)
> File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> dom = findMethod(sdUUID)
> File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
> return
> GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
> File "/usr/share/vdsm/storage/fileSD.py", line 160, in __init__
> validateFileSystemFeatures(sdUUID, self.mountpoint)
> File "/usr/share/vdsm/storage/fileSD.py", line 89, in
> validateFileSystemFeatures
> oop.getProcessPool(sdUUID).directTouch(testFilePath)
> File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in
> directTouch
> ioproc.touch(path, flags, mode)
> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
> 543, in touch
> self.timeout)
> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
> 427, in _sendCommand
> raise OSError(errcode, errstr)
> OSError: [Errno 5] Input/output error
> Thread-431::DEBUG::2016-02-24
> 11:18:20,109::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
> /usr/bin/dd if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
> Thread-431::DEBUG::2016-02-24
> 11:18:20,122::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n997 bytes (997 B)
> copied, 0.000444081 s, 2.2 MB/s\n'; <rc> = 0
> Thread-431::INFO::2016-02-24
> 11:18:20,128::clusterlock::219::Storage.SANLock::(acquireHostId)
> Acquiring host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)
> Thread-431::DEBUG::2016-02-24
> 11:18:20,129::clusterlock::237::Storage.SANLock::(acquireHostId) Host
> id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 successfully
> acquired (id: 3)
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,690::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
> Calling 'Task.getStatus' in bridge with {u'taskID':
> u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,692::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state init
> -> state preparing
> Thread-35631::INFO::2016-02-24
> 11:18:20,692::logUtils::44::dispatcher::(wrapper) Run and protect:
> getTaskStatus(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660',
> spUUID=None, options=None)
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,693::taskManager::103::Storage.TaskManager::(getTaskStatus)
> Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,693::taskManager::106::Storage.TaskManager::(getTaskStatus)
> Return. Response: {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35631::INFO::2016-02-24
> 11:18:20,693::logUtils::47::dispatcher::(wrapper) Run and protect:
> getTaskStatus, Return response: {'taskStatus': {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,693::task::1191::Storage.TaskManager.Task::(prepare)
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::finished: {'taskStatus': {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,693::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state
> preparing -> state finished
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,694::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,694::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,694::task::993::Storage.TaskManager.Task::(_decref)
> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::ref 0 aborting False
> Thread-35631::DEBUG::2016-02-24
> 11:18:20,694::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
> Return 'Task.getStatus' in bridge with {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,699::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
> Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID':
> u'00000002-0002-0002-0002-00000000021e'}
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,700::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state init
> -> state preparing
> Thread-35632::INFO::2016-02-24
> 11:18:20,700::logUtils::44::dispatcher::(wrapper) Run and protect:
> getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e',
> options=None)
> Thread-35632::INFO::2016-02-24
> 11:18:20,707::logUtils::47::dispatcher::(wrapper) Run and protect:
> getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,707::task::1191::Storage.TaskManager.Task::(prepare)
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::finished: {'spm_st':
> {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,707::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state
> preparing -> state finished
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,708::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,708::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,708::task::993::Storage.TaskManager.Task::(_decref)
> Task=`3a303728-a737-4ec4-8468-c5f142964689`::ref 0 aborting False
> Thread-35632::DEBUG::2016-02-24
> 11:18:20,708::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
> Return 'StoragePool.getSpmStatus' in bridge with {'spmId': -1,
> 'spmStatus': 'Free', 'spmLver': -1}
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,737::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
> Calling 'Task.clear' in bridge with {u'taskID':
> u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,738::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state init
> -> state preparing
> Thread-35633::INFO::2016-02-24
> 11:18:20,738::logUtils::44::dispatcher::(wrapper) Run and protect:
> clearTask(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660',
> spUUID=None, options=None)
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,739::taskManager::171::Storage.TaskManager::(clearTask)
> Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,739::taskManager::176::Storage.TaskManager::(clearTask) Return.
> Thread-35633::INFO::2016-02-24
> 11:18:20,739::logUtils::47::dispatcher::(wrapper) Run and protect:
> clearTask, Return response: None
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,739::task::1191::Storage.TaskManager.Task::(prepare)
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::finished: None
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,740::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state
> preparing -> state finished
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,740::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,740::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,740::task::993::Storage.TaskManager.Task::(_decref)
> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::ref 0 aborting False
> Thread-35633::DEBUG::2016-02-24
> 11:18:20,740::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
> Return 'Task.clear' in bridge with True
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,859::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
> Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID':
> u'00000002-0002-0002-0002-00000000021e'}
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,860::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state init
> -> state preparing
> Thread-35634::INFO::2016-02-24
> 11:18:20,860::logUtils::44::dispatcher::(wrapper) Run and protect:
> getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e',
> options=None)
> Thread-35634::INFO::2016-02-24
> 11:18:20,867::logUtils::47::dispatcher::(wrapper) Run and protect:
> getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,867::task::1191::Storage.TaskManager.Task::(prepare)
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::finished: {'spm_st':
> {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,867::task::595::Storage.TaskManager.Task::(_updateState)
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state
> preparing -> state finished
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,867::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,867::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-35634::DEBUG::2016-02-24
> 11:18:20,867::task::993::Storage.TaskManager.Task::(_decref)
> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::ref 0 aborting False
> ---------------------------------------
>
> these blocks are generated in a cycle for each domain
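> The failing call in the traceback is vdsm's direct-I/O touch of the domain test file. It can be approximated by hand on the mount, e.g. (illustrative only; the test file sits at the root of the domain mount, and dd's oflag=direct stands in for the O_DIRECT open that ioprocess does):
>
> # dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/__DIRECT_IO_TEST__ oflag=direct bs=4096 count=1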
>
> Any IDEA ??
> regs.
> Pa.
>
>
>
> On 24.2.2016 10:54, Ravishankar N wrote:
>> Hi,
>>
>> On 02/24/2016 06:43 AM, paf1(a)email.cz wrote:
>>> Hi,
>>> I found the main (maybe) problem: an I/O error (-5) on "ids" file
>>> access.
>>> This file is not accessible via NFS, but it is locally
>> How is NFS coming into the picture? Are you not using gluster fuse
>> mount?
>>> How can I fix it?
>> Can you run `gluster volume heal volname info` and `gluster volume
>> heal volname info split-brain` to see if the "ids" file is in
>> split-brain? A file in split-brain returns EIO when accessed from
>> the mount.
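>> For instance, a direct read of the ids file through the mount should reproduce the EIO if it is in split-brain (a sketch, using the path from your sanlock log):
>> # dd if=/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids iflag=direct bs=4096 count=1 of=/dev/null
>> You can also dump the AFR changelog xattrs of the two brick copies (run on each brick host) to see which side blames which:
>> # getfattr -d -m . -e hex /STORAGES/g2r5p1/GFS/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids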
>> Regards,
>> Ravi
>>
>>
>>> regs.
>>> Pavel
>>>
>>> # sanlock client log_dump
>>> ....
>>> 0 flags 1 timeout 0
>>> 2016-02-24 02:01:10+0100 3828 [12111]: s1316 lockspace
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>> 2016-02-24 02:01:10+0100 3828 [12111]: cmd_add_lockspace 4,15 async
>>> done 0
>>> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire begin
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1
>>> 2016-02-24 02:01:10+0100 3828 [19556]: 88adbd49 aio collect 0
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:10+0100 3828 [19556]: read_sectors delta_leader
>>> offset 0 rv -5
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
>>> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire
>>> leader_read1 error -5
>>> 2016-02-24 02:01:11+0100 3829 [12111]: s1316 add_lockspace fail
>>> result -5
>>> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15
>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:12+0100 3831 [12116]: s1317 lockspace
>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>>> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15 async
>>> done 0
>>> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire begin
>>> 7f52b697-c199-4f58-89aa-102d44327124:1
>>> 2016-02-24 02:01:12+0100 3831 [19562]: 7f52b697 aio collect 0
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:12+0100 3831 [19562]: read_sectors delta_leader
>>> offset 0 rv -5
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>>> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire
>>> leader_read1 error -5
>>> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15
>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:13+0100 3831 [1321]: s1318 lockspace
>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0
>>> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15 async
>>> done 0
>>> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire begin
>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1
>>> 2016-02-24 02:01:13+0100 3831 [19564]: 0fcad888 aio collect 0
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458201000 result -5:0 match res
>>> 2016-02-24 02:01:13+0100 3831 [19564]: read_sectors delta_leader
>>> offset 0 rv -5
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>>> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire
>>> leader_read1 error -5
>>> 2016-02-24 02:01:13+0100 3832 [12116]: s1317 add_lockspace fail
>>> result -5
>>> 2016-02-24 02:01:14+0100 3832 [1321]: s1318 add_lockspace fail
>>> result -5
>>> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15
>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:19+0100 3838 [12106]: s1319 lockspace
>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0
>>> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15 async
>>> done 0
>>> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire begin
>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1
>>> 2016-02-24 02:01:19+0100 3838 [19638]: 3da46e07 aio collect 0
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:19+0100 3838 [19638]: read_sectors delta_leader
>>> offset 0 rv -5
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids
>>> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire
>>> leader_read1 error -5
>>> 2016-02-24 02:01:20+0100 3839 [12106]: s1319 add_lockspace fail
>>> result -5
>>> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>> flags 1 timeout 0
>>> 2016-02-24 02:01:20+0100 3839 [1320]: s1320 lockspace
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15 async
>>> done 0
>>> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire begin
>>> 88adbd49-62d6-45b1-9992-b04464a04112:1
>>> 2016-02-24 02:01:20+0100 3839 [19658]: 88adbd49 aio collect 0
>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>> 2016-02-24 02:01:20+0100 3839 [19658]: read_sectors delta_leader
>>> offset 0 rv -5
>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
>>> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire
>>> leader_read1 error -5
>>> 2016-02-24 02:01:21+0100 3840 [1320]: s1320 add_lockspace fail
>>> result -5
>>>
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users(a)ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
>
>
>
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
--------------050202070801030107080007
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: 8bit
<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000066" bgcolor="#FFFFFF">
used replica2 with volume option<br>
<br>
Volume Name: 2KVM12-P2<br>
Type: Replicate<br>
Volume ID: 9745551f-4696-4a6c-820a-619e359a61fd<br>
Status: Started<br>
Number of Bricks: 1 x 2 = 2<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: 16.0.0.164:/STORAGES/g1r5p2/GFS<br>
Brick2: 16.0.0.163:/STORAGES/g1r5p2/GFS<br>
Options Reconfigured:<br>
storage.owner-uid: 36<br>
storage.owner-gid: 36<br>
performance.io-cache: off<br>
performance.read-ahead: off<br>
network.remote-dio: enable<br>
cluster.eager-lock: enable<br>
performance.stat-prefetch: off<br>
performance.quick-read: off<br>
cluster.quorum-count: 1<br>
cluster.server-quorum-type: none<br>
cluster.quorum-type: fixed<br>
<br>
was runnig over year with no problems ( reboots, ..etc... )<br>
<br>
<br>
<div class="moz-cite-prefix">On 24.2.2016 12:34, Ravishankar N
wrote:<br>
</div>
<blockquote cite="mid:56CD955F.9020508@redhat.com"
type="cite">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
<div class="moz-cite-prefix">On 02/24/2016 04:48 PM, <a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:paf1@email.cz">paf1@email.cz</a>
wrote:<br>
</div>
<blockquote cite="mid:56CD9174.7080801@email.cz"
type="cite">
<meta http-equiv="content-type" content="text/html;
charset=windows-1252">
<br>
<div class="moz-forward-container"><br>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
prereq: 2KVM12-P2 = master domain<br>
---------------------------------------------------------<br>
YES - I'm using gluster.fuse NFS<br>
localhost:/2KVM12-P2 on
/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2 type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)<br>
---------------------------------------------------------<br>
Healing <br>
======<br>
# gluster volume heal 2KVM12-P2 info<br>
Brick 16.0.0.164:/STORAGES/g1r5p2/GFS<br>
Number of entries: 0<br>
<br>
Brick 16.0.0.163:/STORAGES/g1r5p2/GFS<br>
Number of entries: 0<br>
<br>
# while true; do for vol in `gluster volume list`; do gluster
volume heal $vol info | sort | grep "Number of entries" | awk
-F: '{tot+=$2} END { printf("Heal entries for
'"$vol"': %d\n",
$tot);}'; done; sleep 120; echo -e "\n==================\n";
done<br>
Heal entries for 1KVM12-BCK: 1<br>
Heal entries for 1KVM12-P1: 1<br>
Heal entries for 1KVM12-P2: 0<br>
Heal entries for 1KVM12-P3: 0<br>
Heal entries for 1KVM12-P4: 0<br>
Heal entries for 1KVM12-P5: 0<br>
Heal entries for 2KVM12-P1: 1<br>
Heal entries for 2KVM12-P2: 0<br>
Heal entries for 2KVM12-P3: 0<br>
Heal entries for 2KVM12-P5: 0<br>
Heal entries for 2KVM12_P4: 1<br>
<br>
# gluster volume heal 1KVM12-BCK info split-brain<br>
Brick 16.0.0.161:/STORAGES/g2r5p1/GFS<br>
/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids<br>
Number of entries in split-brain: 1<br>
<br>
Brick 16.0.0.162:/STORAGES/g2r5p1/GFS<br>
/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids<br>
Number of entries in split-brain: 1<br>
<br>
# gluster volume heal 1KVM12-P1 info split-brain<br>
Brick 16.0.0.161:/STORAGES/g1r5p1/GFS<br>
/__DIRECT_IO_TEST__<br>
Number of entries in split-brain: 1<br>
<br>
Brick 16.0.0.162:/STORAGES/g1r5p1/GFS<br>
/__DIRECT_IO_TEST__<br>
Number of entries in split-brain: 1<br>
<br>
etc......<br>
<br>
<br>
YES - in split brain , but NOT master domain ( will solve
later, after master - if possible )<br>
</div>
</blockquote>
<br>
I'm not sure if it is related, but you could try to resolve the
split-brain first and see if it helps. Also, I see that you are
using replica-2. It is recommended to use replica-3 or arbiter
volumes to avoid split-brains.<br>
<br>
-Ravi<br>
<br>
<blockquote cite="mid:56CD9174.7080801@email.cz"
type="cite">
<div class="moz-forward-container"> <br>
---------------------------------------------------------------<br>
vdsm.log<br>
=========<br>
<br>
Thread-461::DEBUG::2016-02-24
11:12:45,328::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
SUCCESS: <err> = '0+1 records in\n0+1 records out\n333
bytes (333 B) copied, 0.000724379 s, 460 kB/s\n'; <rc> =
0<br>
Thread-461::<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="INFO::2016-02-24">INFO::2016-02-24</a>
11:12:45,331::clusterlock::219::Storage.SANLock::(acquireHostId)
Acquiring host id for domain
88adbd49-62d6-45b1-9992-b04464a04112 (id: 3)<br>
Thread-461::DEBUG::2016-02-24
11:12:45,331::clusterlock::237::Storage.SANLock::(acquireHostId)
Host id for domain 88adbd49-62d6-45b1-9992-b04464a04112
successfully acquired (id: 3)<br>
Thread-33186::DEBUG::2016-02-24
11:12:46,067::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
Calling 'GlusterVolume.list' in bridge with {}<br>
Thread-33186::DEBUG::2016-02-24
11:12:46,204::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
Return 'GlusterVolume.list' in bridge with {'volumes':
{'2KVM12-P5': {'transportType': ['TCP'],
'uuid':
'4a6d775d-4a51-4f6c-9bfa-f7ef57f3ca1d', 'bricks':
['16.0.0.164:/STORAGES/g1r5p5/GFS',
'16.0.0.163:/STORAGES/g1r5p5/GFS'], 'volumeName':
'2KVM12-P5',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.164:/STORAGES/g1r5p5/GFS', 'hostUuid':
'06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name':
'16.0.0.163:/STORAGES/g1r5p5/GFS', 'hostUuid':
'6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.stat-prefetch': 'off',
'cluster.quorum-type': 'fixed',
'performance.quick-read':
'off', 'network.remote-dio': 'enable',
'cluster.quorum-count':
'1', 'performance.io-cache': 'off',
'storage.owner-uid': '36',
'performance.read-ahead': 'off', 'storage.owner-gid':
'36'}},
'2KVM12_P4': {'transportType': ['TCP'], 'uuid':
'18310aeb-639f-4b6d-9ef4-9ef560d6175c', 'bricks':
['16.0.0.163:/STORAGES/g1r5p4/GFS',
'16.0.0.164:/STORAGES/g1r5p4/GFS'], 'volumeName':
'2KVM12_P4',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.163:/STORAGES/g1r5p4/GFS', 'hostUuid':
'6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name':
'16.0.0.164:/STORAGES/g1r5p4/GFS', 'hostUuid':
'06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36'}}, '2KVM12-P1':
{'transportType':
['TCP'], 'uuid':
'cbf142f8-a40b-4cf4-ad29-2243c81d30c1',
'bricks': ['16.0.0.163:/STORAGES/g1r5p1/GFS',
'16.0.0.164:/STORAGES/g1r5p1/GFS'], 'volumeName':
'2KVM12-P1',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.163:/STORAGES/g1r5p1/GFS', 'hostUuid':
'6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name':
'16.0.0.164:/STORAGES/g1r5p1/GFS', 'hostUuid':
'06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36'}}, '2KVM12-P3':
{'transportType':
['TCP'], 'uuid':
'25a5ec22-660e-42a0-aa00-45211d341738',
'bricks': ['16.0.0.163:/STORAGES/g1r5p3/GFS',
'16.0.0.164:/STORAGES/g1r5p3/GFS'], 'volumeName':
'2KVM12-P3',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.163:/STORAGES/g1r5p3/GFS', 'hostUuid':
'6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name':
'16.0.0.164:/STORAGES/g1r5p3/GFS', 'hostUuid':
'06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36'}}, '2KVM12-P2':
{'transportType':
['TCP'], 'uuid':
'9745551f-4696-4a6c-820a-619e359a61fd',
'bricks': ['16.0.0.164:/STORAGES/g1r5p2/GFS',
'16.0.0.163:/STORAGES/g1r5p2/GFS'], 'volumeName':
'2KVM12-P2',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.164:/STORAGES/g1r5p2/GFS', 'hostUuid':
'06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name':
'16.0.0.163:/STORAGES/g1r5p2/GFS', 'hostUuid':
'6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.stat-prefetch': 'off',
'cluster.quorum-type': 'fixed',
'performance.quick-read':
'off', 'network.remote-dio': 'enable',
'cluster.quorum-count':
'1', 'performance.io-cache': 'off',
'storage.owner-uid': '36',
'performance.read-ahead': 'off', 'storage.owner-gid':
'36'}},
'1KVM12-P4': {'transportType': ['TCP'], 'uuid':
'b4356604-4404-428a-9da6-f1636115e2fd', 'bricks':
['16.0.0.161:/STORAGES/g1r5p4/GFS',
'16.0.0.162:/STORAGES/g1r5p4/GFS'], 'volumeName':
'1KVM12-P4',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.161:/STORAGES/g1r5p4/GFS', 'hostUuid':
'194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
'16.0.0.162:/STORAGES/g1r5p4/GFS', 'hostUuid':
'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'diagnostics.count-fop-hits': 'on',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'performance.io-cache': 'off', 'storage.owner-uid':
'36',
'performance.read-ahead': 'off', 'storage.owner-gid':
'36',
'diagnostics.latency-measurement': 'on'}},
'1KVM12-BCK':
{'transportType': ['TCP'], 'uuid':
'62c89345-fd61-4b67-b8b4-69296eb7d217', 'bricks':
['16.0.0.161:/STORAGES/g2r5p1/GFS',
'16.0.0.162:/STORAGES/g2r5p1/GFS'], 'volumeName':
'1KVM12-BCK', 'volumeType': 'REPLICATE',
'replicaCount': '2',
'brickCount': '2', 'distCount': '2',
'volumeStatus': 'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.161:/STORAGES/g2r5p1/GFS', 'hostUuid':
'194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
'16.0.0.162:/STORAGES/g2r5p1/GFS', 'hostUuid':
'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36'}}, '1KVM12-P2':
{'transportType':
['TCP'], 'uuid':
'aa2d607d-3c6c-4f13-8205-aae09dcc9d35',
'bricks': ['16.0.0.161:/STORAGES/g1r5p2/GFS',
'16.0.0.162:/STORAGES/g1r5p2/GFS'], 'volumeName':
'1KVM12-P2',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.161:/STORAGES/g1r5p2/GFS', 'hostUuid':
'194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
'16.0.0.162:/STORAGES/g1r5p2/GFS', 'hostUuid':
'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'diagnostics.count-fop-hits': 'on',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36',
'diagnostics.latency-measurement':
'on'}}, '1KVM12-P3': {'transportType': ['TCP'],
'uuid':
'6060ff77-d552-4d94-97bf-5a32982e7d8a', 'bricks':
['16.0.0.161:/STORAGES/g1r5p3/GFS',
'16.0.0.162:/STORAGES/g1r5p3/GFS'], 'volumeName':
'1KVM12-P3',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.161:/STORAGES/g1r5p3/GFS', 'hostUuid':
'194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
'16.0.0.162:/STORAGES/g1r5p3/GFS', 'hostUuid':
'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'diagnostics.count-fop-hits': 'on',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36',
'diagnostics.latency-measurement':
'on'}}, '1KVM12-P1': {'transportType': ['TCP'],
'uuid':
'f410c6a9-9a51-42b3-89bb-c20ac72a0461', 'bricks':
['16.0.0.161:/STORAGES/g1r5p1/GFS',
'16.0.0.162:/STORAGES/g1r5p1/GFS'], 'volumeName':
'1KVM12-P1',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.161:/STORAGES/g1r5p1/GFS', 'hostUuid':
'194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
'16.0.0.162:/STORAGES/g1r5p1/GFS', 'hostUuid':
'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'diagnostics.count-fop-hits': 'on',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36',
'diagnostics.latency-measurement':
'on'}}, '1KVM12-P5': {'transportType': ['TCP'],
'uuid':
'420fa218-60bc-47e4-89a8-ce39b7da885e', 'bricks':
['16.0.0.161:/STORAGES/g1r5p5/GFS',
'16.0.0.162:/STORAGES/g1r5p5/GFS'], 'volumeName':
'1KVM12-P5',
'volumeType': 'REPLICATE', 'replicaCount': '2',
'brickCount':
'2', 'distCount': '2', 'volumeStatus':
'ONLINE',
'stripeCount': '1', 'bricksInfo': [{'name':
'16.0.0.161:/STORAGES/g1r5p5/GFS', 'hostUuid':
'194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
'16.0.0.162:/STORAGES/g1r5p5/GFS', 'hostUuid':
'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
{'cluster.server-quorum-type': 'none',
'cluster.eager-lock':
'enable', 'performance.io-cache': 'off',
'performance.stat-prefetch': 'off',
'cluster.quorum-type':
'fixed', 'performance.quick-read': 'off',
'network.remote-dio': 'enable', 'cluster.quorum-count':
'1',
'storage.owner-uid': '36', 'performance.read-ahead':
'off',
'storage.owner-gid': '36'}}}}<br>
Thread-431::DEBUG::2016-02-24
11:12:47,729::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
/usr/bin/dd
if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata
iflag=direct of=/dev/null bs=4096 count=1 (cwd None)<br>
Thread-431::DEBUG::2016-02-24
11:12:47,743::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
SUCCESS: <err> = '0+1 records in\n0+1 records out\n997
bytes (997 B) copied, 0.000569374 s, 1.8 MB/s\n'; <rc> =
0<br>
Thread-431::<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="INFO::2016-02-24">INFO::2016-02-24</a>
11:12:47,751::clusterlock::219::Storage.SANLock::(acquireHostId)
Acquiring host id for domain
ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)<br>
Thread-431::DEBUG::2016-02-24
11:12:47,751::clusterlock::237::Storage.SANLock::(acquireHostId)
Host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0
successfully acquired (id: 3)<br>
--------------------------------<br>
<br>
Thread-349::ERROR::2016-02-24
11:18:20,040::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain
ef010d08-aed1-41c4-ba9a-e6d9bdecb4b4 monitoring information<br>
Traceback (most recent call last):<br>
File "/usr/share/vdsm/storage/domainMonitor.py", line 221,
in _monitorDomain<br>
self.domain = sdCache.produce(self.sdUUID)<br>
File "/usr/share/vdsm/storage/sdc.py", line 98, in
produce<br>
domain.getRealDomain()<br>
File "/usr/share/vdsm/storage/sdc.py", line 52, in
getRealDomain<br>
return self._cache._realProduce(self._sdUUID)<br>
File "/usr/share/vdsm/storage/sdc.py", line 122, in
_realProduce<br>
domain = self._findDomain(sdUUID)<br>
File "/usr/share/vdsm/storage/sdc.py", line 141, in
_findDomain<br>
dom = findMethod(sdUUID)<br>
File "/usr/share/vdsm/storage/glusterSD.py", line 32, in
findDomain<br>
return
GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))<br>
File "/usr/share/vdsm/storage/fileSD.py", line 160, in
__init__<br>
validateFileSystemFeatures(sdUUID, self.mountpoint)<br>
File "/usr/share/vdsm/storage/fileSD.py", line 89, in
validateFileSystemFeatures<br>
oop.getProcessPool(sdUUID).directTouch(testFilePath)<br>
File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in
directTouch<br>
ioproc.touch(path, flags, mode)<br>
File
"/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
543, in touch<br>
self.timeout)<br>
File
"/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
427, in _sendCommand<br>
raise OSError(errcode, errstr)<br>
OSError: [Errno 5] Input/output error<br>
Thread-453::ERROR::2016-02-24
11:18:20,043::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain
300e9ac8-3c2f-4703-9bb1-1df2130c7c97 monitoring information<br>
Traceback (most recent call last):<br>
File "/usr/share/vdsm/storage/domainMonitor.py", line 221,
in _monitorDomain<br>
self.domain = sdCache.produce(self.sdUUID)<br>
File "/usr/share/vdsm/storage/sdc.py", line 98, in
produce<br>
domain.getRealDomain()<br>
File "/usr/share/vdsm/storage/sdc.py", line 52, in
getRealDomain<br>
return self._cache._realProduce(self._sdUUID)<br>
File "/usr/share/vdsm/storage/sdc.py", line 122, in
_realProduce<br>
domain = self._findDomain(sdUUID)<br>
File "/usr/share/vdsm/storage/sdc.py", line 141, in
_findDomain<br>
dom = findMethod(sdUUID)<br>
File "/usr/share/vdsm/storage/glusterSD.py", line 32, in
findDomain<br>
return
GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))<br>
File "/usr/share/vdsm/storage/fileSD.py", line 160, in
__init__<br>
validateFileSystemFeatures(sdUUID, self.mountpoint)<br>
File "/usr/share/vdsm/storage/fileSD.py", line 89, in
validateFileSystemFeatures<br>
oop.getProcessPool(sdUUID).directTouch(testFilePath)<br>
File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in
directTouch<br>
ioproc.touch(path, flags, mode)<br>
File
"/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
543, in touch<br>
self.timeout)<br>
File
"/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
427, in _sendCommand<br>
raise OSError(errcode, errstr)<br>
OSError: [Errno 5] Input/output error<br>
Thread-431::DEBUG::2016-02-24
11:18:20,109::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
/usr/bin/dd
if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata
iflag=direct of=/dev/null bs=4096 count=1 (cwd None)<br>
Thread-431::DEBUG::2016-02-24
11:18:20,122::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
SUCCESS: <err> = '0+1 records in\n0+1 records out\n997
bytes (997 B) copied, 0.000444081 s, 2.2 MB/s\n'; <rc> =
0<br>
Thread-431::<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="INFO::2016-02-24">INFO::2016-02-24</a>
11:18:20,128::clusterlock::219::Storage.SANLock::(acquireHostId)
Acquiring host id for domain
ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)<br>
Thread-431::DEBUG::2016-02-24
11:18:20,129::clusterlock::237::Storage.SANLock::(acquireHostId)
Host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0
successfully acquired (id: 3)<br>
Thread-35631::DEBUG::2016-02-24 11:18:20,690::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Task.getStatus' in bridge with {u'taskID': u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
Thread-35631::DEBUG::2016-02-24 11:18:20,692::task::595::Storage.TaskManager.Task::(_updateState) Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state init -> state preparing
Thread-35631::INFO::2016-02-24 11:18:20,692::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660', spUUID=None, options=None)
Thread-35631::DEBUG::2016-02-24 11:18:20,693::taskManager::103::Storage.TaskManager::(getTaskStatus) Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
Thread-35631::DEBUG::2016-02-24 11:18:20,693::taskManager::106::Storage.TaskManager::(getTaskStatus) Return. Response: {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
Thread-35631::INFO::2016-02-24 11:18:20,693::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
Thread-35631::DEBUG::2016-02-24 11:18:20,693::task::1191::Storage.TaskManager.Task::(prepare) Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::finished: {'taskStatus': {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
Thread-35631::DEBUG::2016-02-24 11:18:20,693::task::595::Storage.TaskManager.Task::(_updateState) Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state preparing -> state finished
Thread-35631::DEBUG::2016-02-24 11:18:20,694::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-35631::DEBUG::2016-02-24 11:18:20,694::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-35631::DEBUG::2016-02-24 11:18:20,694::task::993::Storage.TaskManager.Task::(_decref) Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::ref 0 aborting False
Thread-35631::DEBUG::2016-02-24 11:18:20,694::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Task.getStatus' in bridge with {'code': 661, 'message': 'Cannot acquire host id', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
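
The code 661 / 'Cannot acquire host id' response here appears to be vdsm relaying the earlier sanlock add_lockspace failure (the EIO above) back over jsonrpc, not a separate problem. The lockspace state can be inspected directly on the host with sanlock's own client; a sketch:

# sanlock client status

This lists the daemon's current lockspaces and resources; the domains whose ids files fail with -5 should be absent while the add attempts keep cycling.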
Thread-35632::DEBUG::2016-02-24 11:18:20,699::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID': u'00000002-0002-0002-0002-00000000021e'}
Thread-35632::DEBUG::2016-02-24 11:18:20,700::task::595::Storage.TaskManager.Task::(_updateState) Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state init -> state preparing
Thread-35632::INFO::2016-02-24 11:18:20,700::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e', options=None)
Thread-35632::INFO::2016-02-24 11:18:20,707::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-35632::DEBUG::2016-02-24 11:18:20,707::task::1191::Storage.TaskManager.Task::(prepare) Task=`3a303728-a737-4ec4-8468-c5f142964689`::finished: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-35632::DEBUG::2016-02-24 11:18:20,707::task::595::Storage.TaskManager.Task::(_updateState) Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state preparing -> state finished
Thread-35632::DEBUG::2016-02-24 11:18:20,708::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-35632::DEBUG::2016-02-24 11:18:20,708::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-35632::DEBUG::2016-02-24 11:18:20,708::task::993::Storage.TaskManager.Task::(_decref) Task=`3a303728-a737-4ec4-8468-c5f142964689`::ref 0 aborting False
Thread-35632::DEBUG::2016-02-24 11:18:20,708::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) Return 'StoragePool.getSpmStatus' in bridge with {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}
Thread-35633::DEBUG::2016-02-24 11:18:20,737::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Task.clear' in bridge with {u'taskID': u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
Thread-35633::DEBUG::2016-02-24 11:18:20,738::task::595::Storage.TaskManager.Task::(_updateState) Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state init -> state preparing
Thread-35633::INFO::2016-02-24 11:18:20,738::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660', spUUID=None, options=None)
Thread-35633::DEBUG::2016-02-24 11:18:20,739::taskManager::171::Storage.TaskManager::(clearTask) Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
Thread-35633::DEBUG::2016-02-24 11:18:20,739::taskManager::176::Storage.TaskManager::(clearTask) Return.
Thread-35633::INFO::2016-02-24 11:18:20,739::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None
Thread-35633::DEBUG::2016-02-24 11:18:20,739::task::1191::Storage.TaskManager.Task::(prepare) Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::finished: None
Thread-35633::DEBUG::2016-02-24 11:18:20,740::task::595::Storage.TaskManager.Task::(_updateState) Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state preparing -> state finished
Thread-35633::DEBUG::2016-02-24 11:18:20,740::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-35633::DEBUG::2016-02-24 11:18:20,740::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-35633::DEBUG::2016-02-24 11:18:20,740::task::993::Storage.TaskManager.Task::(_decref) Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::ref 0 aborting False
Thread-35633::DEBUG::2016-02-24 11:18:20,740::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Task.clear' in bridge with True
Thread-35634::DEBUG::2016-02-24 11:18:20,859::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID': u'00000002-0002-0002-0002-00000000021e'}
Thread-35634::DEBUG::2016-02-24 11:18:20,860::task::595::Storage.TaskManager.Task::(_updateState) Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state init -> state preparing
Thread-35634::INFO::2016-02-24 11:18:20,860::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e', options=None)
Thread-35634::INFO::2016-02-24 11:18:20,867::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-35634::DEBUG::2016-02-24 11:18:20,867::task::1191::Storage.TaskManager.Task::(prepare) Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::finished: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-35634::DEBUG::2016-02-24 11:18:20,867::task::595::Storage.TaskManager.Task::(_updateState) Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state preparing -> state finished
Thread-35634::DEBUG::2016-02-24 11:18:20,867::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-35634::DEBUG::2016-02-24 11:18:20,867::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-35634::DEBUG::2016-02-24 11:18:20,867::task::993::Storage.TaskManager.Task::(_decref) Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::ref 0 aborting False
---------------------------------------

These blocks are generated in a cycle for each domain.

Any IDEA ??
regs.
Pa.
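
Since the same block repeats for every domain, a quick loop over the fuse-mounted domains shows which "ids" files return EIO; a sketch, assuming the standard /rhev/data-center/mnt/glusterSD layout seen in these logs:

# for f in /rhev/data-center/mnt/glusterSD/localhost:_*/*/dom_md/ids; do echo "== $f"; dd if="$f" iflag=direct of=/dev/null bs=4096 count=1 2>&1 | tail -n1; done

Every path that prints "Input/output error" instead of a transfer rate is a candidate split-brain file.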
<div class="moz-cite-prefix">On 24.2.2016 10:54, Ravishankar N
wrote:<br>
</div>
<blockquote cite="mid:56CD7DE3.3050905@redhat.com"
type="cite">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
<div class="moz-cite-prefix">Hi,<br>
<br>
On 02/24/2016 06:43 AM, <a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:paf1@email.cz">paf1@email.cz</a>
wrote:<br>
</div>
<blockquote cite="mid:56CD03D7.2070308@email.cz"
type="cite">
<meta http-equiv="content-type" content="text/html;
charset=windows-1252">
> Hi,
> I found the main (maybe) problem with the I/O error (-5) on "ids" file access.
> This file is not accessible via NFS, but it is locally.
How is NFS coming into the picture? Are you not using the gluster fuse mount?
> How can I fix it??
Can you run `gluster volume heal volname info` and `gluster volume heal volname info split-brain` to see if the "ids" file is in split-brain? A file in split-brain returns EIO when accessed from the mount.
Regards,
Ravi
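
If heal info does report the "ids" file in split-brain, the disagreement can also be confirmed on the servers by dumping the file's AFR changelog xattrs on each brick; a sketch with placeholder brick and domain paths:

# getfattr -d -m . -e hex /STORAGES/<brick>/GFS/<domain-uuid>/dom_md/ids

Non-zero trusted.afr.<volname>-client-* pending counters on both bricks, each blaming the other copy, are the usual data split-brain signature.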
<blockquote cite="mid:56CD03D7.2070308@email.cz"
type="cite">
regs.<br>
Pavel<br>
<br>
# sanlock client log_dump<br>
....<br>
0 flags 1 timeout 0<br>
> 2016-02-24 02:01:10+0100 3828 [12111]: s1316 lockspace 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
> 2016-02-24 02:01:10+0100 3828 [12111]: cmd_add_lockspace 4,15 async done 0
> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire begin 88adbd49-62d6-45b1-9992-b04464a04112:1
> 2016-02-24 02:01:10+0100 3828 [19556]: 88adbd49 aio collect 0 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
> 2016-02-24 02:01:10+0100 3828 [19556]: read_sectors delta_leader offset 0 rv -5 /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire leader_read1 error -5
> 2016-02-24 02:01:11+0100 3829 [12111]: s1316 add_lockspace fail result -5
> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0 flags 1 timeout 0
> 2016-02-24 02:01:12+0100 3831 [12116]: s1317 lockspace 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15 async done 0
> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire begin 7f52b697-c199-4f58-89aa-102d44327124:1
> 2016-02-24 02:01:12+0100 3831 [19562]: 7f52b697 aio collect 0 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
> 2016-02-24 02:01:12+0100 3831 [19562]: read_sectors delta_leader offset 0 rv -5 /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire leader_read1 error -5
> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0 flags 1 timeout 0
> 2016-02-24 02:01:13+0100 3831 [1321]: s1318 lockspace 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0
> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15 async done 0
> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire begin 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1
> 2016-02-24 02:01:13+0100 3831 [19564]: 0fcad888 aio collect 0 0x7fe4580008c0:0x7fe4580008d0:0x7fe458201000 result -5:0 match res
> 2016-02-24 02:01:13+0100 3831 [19564]: read_sectors delta_leader offset 0 rv -5 /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire leader_read1 error -5
> 2016-02-24 02:01:13+0100 3832 [12116]: s1317 add_lockspace fail result -5
> 2016-02-24 02:01:14+0100 3832 [1321]: s1318 add_lockspace fail result -5
> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0 flags 1 timeout 0
> 2016-02-24 02:01:19+0100 3838 [12106]: s1319 lockspace 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0
> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15 async done 0
> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire begin 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1
> 2016-02-24 02:01:19+0100 3838 [19638]: 3da46e07 aio collect 0 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
> 2016-02-24 02:01:19+0100 3838 [19638]: read_sectors delta_leader offset 0 rv -5 /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids
> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire leader_read1 error -5
> 2016-02-24 02:01:20+0100 3839 [12106]: s1319 add_lockspace fail result -5
> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0 flags 1 timeout 0
> 2016-02-24 02:01:20+0100 3839 [1320]: s1320 lockspace 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15 async done 0
> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire begin 88adbd49-62d6-45b1-9992-b04464a04112:1
> 2016-02-24 02:01:20+0100 3839 [19658]: 88adbd49 aio collect 0 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
> 2016-02-24 02:01:20+0100 3839 [19658]: read_sectors delta_leader offset 0 rv -5 /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire leader_read1 error -5
> 2016-02-24 02:01:21+0100 3840 [1320]: s1320 add_lockspace fail result -5
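
With every add_lockspace failing on -5, resolving the reported split-brains first looks like the right order. On gluster 3.7+ a split-brain file can be resolved from the CLI by choosing a source copy; a sketch for the 1KVM12-BCK "ids" file from this log, with the trusted brick left as a placeholder to be picked after comparing both copies:

# gluster volume heal 1KVM12-BCK split-brain source-brick <host>:<brick-path> /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids

On older releases the usual manual fix is to remove the bad copy (and its .glusterfs gfid hard link) from one brick and let self-heal repair it from the other.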
<pre wrap="">_______________________________________________
Users mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:Users@ovirt.org">Users@ovirt.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://lists.ovirt.org/mailman/listinfo/users">http://...
</pre>
--------------050202070801030107080007--