[ovirt-users] Fwd: Re: ovirt - can't attach master domain II
paf1 at email.cz
Wed Feb 24 11:42:49 UTC 2016
I used replica 2 with the following volume options:
Volume Name: 2KVM12-P2
Type: Replicate
Volume ID: 9745551f-4696-4a6c-820a-619e359a61fd
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 16.0.0.164:/STORAGES/g1r5p2/GFS
Brick2: 16.0.0.163:/STORAGES/g1r5p2/GFS
Options Reconfigured:
storage.owner-uid: 36
storage.owner-gid: 36
performance.io-cache: off
performance.read-ahead: off
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.quick-read: off
cluster.quorum-count: 1
cluster.server-quorum-type: none
cluster.quorum-type: fixed
It was running for over a year with no problems (reboots, etc.).
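A note on the options listed above: with `cluster.quorum-type: fixed` and `cluster.quorum-count: 1`, writes are allowed as long as one brick is up, so on a two-brick replica each side can keep accepting writes while its peer is unreachable. That is a common route to split-brain. The following is only a minimal sketch for checking a pasted `gluster volume info` listing; `parse_volume_info` and `single_brick_writes_allowed` are hypothetical helper names, not part of gluster or vdsm.

```python
# Sketch only: inspects text copied from `gluster volume info`, does not call gluster.
SAMPLE_INFO = """\
Volume Name: 2KVM12-P2
Type: Replicate
Number of Bricks: 1 x 2 = 2
Bricks:
Brick1: 16.0.0.164:/STORAGES/g1r5p2/GFS
Brick2: 16.0.0.163:/STORAGES/g1r5p2/GFS
Options Reconfigured:
cluster.quorum-count: 1
cluster.server-quorum-type: none
cluster.quorum-type: fixed
"""


def parse_volume_info(text):
    """Collect 'key: value' lines into a dict; header-only lines are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def single_brick_writes_allowed(info):
    """True when fixed quorum is set so low that one brick alone may take writes."""
    return (
        info.get("cluster.quorum-type") == "fixed"
        and int(info.get("cluster.quorum-count", "0")) == 1
    )


info = parse_volume_info(SAMPLE_INFO)
print(info["Volume Name"], single_brick_writes_allowed(info))
```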
On 24.2.2016 12:34, Ravishankar N wrote:
> On 02/24/2016 04:48 PM, paf1 at email.cz wrote:
>>
>>
>> prereq: 2KVM12-P2 = master domain
>> ---------------------------------------------------------
>> YES - I'm using gluster.fuse NFS
>> localhost:/2KVM12-P2 on
>> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2 type
>> fuse.glusterfs
>> (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>> ---------------------------------------------------------
>> Healing
>> ======
>> # gluster volume heal 2KVM12-P2 info
>> Brick 16.0.0.164:/STORAGES/g1r5p2/GFS
>> Number of entries: 0
>>
>> Brick 16.0.0.163:/STORAGES/g1r5p2/GFS
>> Number of entries: 0
>>
>> # while true; do for vol in `gluster volume list`; do gluster volume
>> heal $vol info | sort | grep "Number of entries" | awk -F: '{tot+=$2}
>> END { printf("Heal entries for '"$vol"': %d\n", $tot);}'; done; sleep
>> 120; echo -e "\n==================\n"; done
>> Heal entries for 1KVM12-BCK: 1
>> Heal entries for 1KVM12-P1: 1
>> Heal entries for 1KVM12-P2: 0
>> Heal entries for 1KVM12-P3: 0
>> Heal entries for 1KVM12-P4: 0
>> Heal entries for 1KVM12-P5: 0
>> Heal entries for 2KVM12-P1: 1
>> Heal entries for 2KVM12-P2: 0
>> Heal entries for 2KVM12-P3: 0
>> Heal entries for 2KVM12-P5: 0
>> Heal entries for 2KVM12_P4: 1
>>
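One caveat about the loop above: in awk, `$tot` is a field reference rather than the variable `tot`, so the printed numbers may be a field of the last input line instead of the running sum. As a hedged alternative, the per-brick counts can be summed from the text of `gluster volume heal VOLNAME info`. This is a minimal sketch; `heal_entry_total` is a hypothetical helper and the second sample is made up for illustration.

```python
import re

# Output copied from the thread for volume 2KVM12-P2.
SAMPLE_HEAL_INFO = """\
Brick 16.0.0.164:/STORAGES/g1r5p2/GFS
Number of entries: 0

Brick 16.0.0.163:/STORAGES/g1r5p2/GFS
Number of entries: 0
"""

# Hypothetical output, only to show a non-zero total.
HYPOTHETICAL_HEAL_INFO = """\
Brick hostA:/bricks/b1
Number of entries: 1

Brick hostB:/bricks/b1
Number of entries: 1
"""


def heal_entry_total(heal_info_text):
    """Sum every 'Number of entries: N' line across all bricks."""
    counts = re.findall(
        r"^Number of entries:\s*(\d+)\s*$", heal_info_text, flags=re.MULTILINE
    )
    return sum(int(count) for count in counts)


print(heal_entry_total(SAMPLE_HEAL_INFO))
print(heal_entry_total(HYPOTHETICAL_HEAL_INFO))
```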
>> # gluster volume heal 1KVM12-BCK info split-brain
>> Brick 16.0.0.161:/STORAGES/g2r5p1/GFS
>> /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>> Number of entries in split-brain: 1
>>
>> Brick 16.0.0.162:/STORAGES/g2r5p1/GFS
>> /0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>> Number of entries in split-brain: 1
>>
>> # gluster volume heal 1KVM12-P1 info split-brain
>> Brick 16.0.0.161:/STORAGES/g1r5p1/GFS
>> /__DIRECT_IO_TEST__
>> Number of entries in split-brain: 1
>>
>> Brick 16.0.0.162:/STORAGES/g1r5p1/GFS
>> /__DIRECT_IO_TEST__
>> Number of entries in split-brain: 1
>>
>> etc......
>>
>>
>> YES - there are files in split-brain, but NOT on the master domain (I will
>> solve those later, after the master domain, if possible)
>
> I'm not sure if it is related, but you could try to resolve the
> split-brain first and see if it helps. Also, I see that you are using
> replica-2. It is recommended to use replica-3 or arbiter volumes to
> avoid split-brains.
>
> -Ravi
>
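Before resolving split-brain it can help to see which files are affected on which brick. The sketch below parses the text of `gluster volume heal VOLNAME info split-brain` as quoted in this thread into a brick-to-files map. It is an illustration under the assumption that the output has exactly this layout; `parse_split_brain` is a hypothetical helper name.

```python
# Output copied from the thread for volume 1KVM12-BCK.
SAMPLE_SPLIT_BRAIN = """\
Brick 16.0.0.161:/STORAGES/g2r5p1/GFS
/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
Number of entries in split-brain: 1

Brick 16.0.0.162:/STORAGES/g2r5p1/GFS
/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
Number of entries in split-brain: 1
"""


def parse_split_brain(text):
    """Return {brick: [paths]} from 'heal info split-brain' output."""
    result = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            result[brick] = []
        elif not line or line.startswith("Number of entries"):
            continue  # blank separators and per-brick summary lines
        elif brick is not None:
            result[brick].append(line)
    return result


for brick, files in parse_split_brain(SAMPLE_SPLIT_BRAIN).items():
    print(brick, files)
```

Here the affected file is the `dom_md/ids` file of domain 0fcad888, which is the same domain UUID that shows `result -5` in the sanlock log quoted further down.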
>>
>> ---------------------------------------------------------------
>> vdsm.log
>> =========
>>
>> Thread-461::DEBUG::2016-02-24
>> 11:12:45,328::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
>> SUCCESS: <err> = '0+1 records in\n0+1 records out\n333 bytes (333 B)
>> copied, 0.000724379 s, 460 kB/s\n'; <rc> = 0
>> Thread-461::INFO::2016-02-24
>> 11:12:45,331::clusterlock::219::Storage.SANLock::(acquireHostId)
>> Acquiring host id for domain 88adbd49-62d6-45b1-9992-b04464a04112 (id: 3)
>> Thread-461::DEBUG::2016-02-24
>> 11:12:45,331::clusterlock::237::Storage.SANLock::(acquireHostId) Host
>> id for domain 88adbd49-62d6-45b1-9992-b04464a04112 successfully
>> acquired (id: 3)
>> Thread-33186::DEBUG::2016-02-24
>> 11:12:46,067::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
>> Calling 'GlusterVolume.list' in bridge with {}
>> Thread-33186::DEBUG::2016-02-24
>> 11:12:46,204::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
>> Return 'GlusterVolume.list' in bridge with {'volumes': {'2KVM12-P5':
>> {'transportType': ['TCP'], 'uuid':
>> '4a6d775d-4a51-4f6c-9bfa-f7ef57f3ca1d', 'bricks':
>> ['16.0.0.164:/STORAGES/g1r5p5/GFS',
>> '16.0.0.163:/STORAGES/g1r5p5/GFS'], 'volumeName': '2KVM12-P5',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.164:/STORAGES/g1r5p5/GFS',
>> 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name':
>> '16.0.0.163:/STORAGES/g1r5p5/GFS', 'hostUuid':
>> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.stat-prefetch': 'off', 'cluster.quorum-type':
>> 'fixed', 'performance.quick-read': 'off', 'network.remote-dio':
>> 'enable', 'cluster.quorum-count': '1', 'performance.io-cache': 'off',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}, '2KVM12_P4': {'transportType': ['TCP'],
>> 'uuid': '18310aeb-639f-4b6d-9ef4-9ef560d6175c', 'bricks':
>> ['16.0.0.163:/STORAGES/g1r5p4/GFS',
>> '16.0.0.164:/STORAGES/g1r5p4/GFS'], 'volumeName': '2KVM12_P4',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p4/GFS',
>> 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name':
>> '16.0.0.164:/STORAGES/g1r5p4/GFS', 'hostUuid':
>> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}, '2KVM12-P1': {'transportType': ['TCP'],
>> 'uuid': 'cbf142f8-a40b-4cf4-ad29-2243c81d30c1', 'bricks':
>> ['16.0.0.163:/STORAGES/g1r5p1/GFS',
>> '16.0.0.164:/STORAGES/g1r5p1/GFS'], 'volumeName': '2KVM12-P1',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p1/GFS',
>> 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name':
>> '16.0.0.164:/STORAGES/g1r5p1/GFS', 'hostUuid':
>> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}, '2KVM12-P3': {'transportType': ['TCP'],
>> 'uuid': '25a5ec22-660e-42a0-aa00-45211d341738', 'bricks':
>> ['16.0.0.163:/STORAGES/g1r5p3/GFS',
>> '16.0.0.164:/STORAGES/g1r5p3/GFS'], 'volumeName': '2KVM12-P3',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.163:/STORAGES/g1r5p3/GFS',
>> 'hostUuid': '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}, {'name':
>> '16.0.0.164:/STORAGES/g1r5p3/GFS', 'hostUuid':
>> '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}, '2KVM12-P2': {'transportType': ['TCP'],
>> 'uuid': '9745551f-4696-4a6c-820a-619e359a61fd', 'bricks':
>> ['16.0.0.164:/STORAGES/g1r5p2/GFS',
>> '16.0.0.163:/STORAGES/g1r5p2/GFS'], 'volumeName': '2KVM12-P2',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.164:/STORAGES/g1r5p2/GFS',
>> 'hostUuid': '06854ac0-2ef1-4c12-bb8d-56cf9bf95ec9'}, {'name':
>> '16.0.0.163:/STORAGES/g1r5p2/GFS', 'hostUuid':
>> '6482ae32-25ac-41b5-b41d-b7ddf49bac2c'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.stat-prefetch': 'off', 'cluster.quorum-type':
>> 'fixed', 'performance.quick-read': 'off', 'network.remote-dio':
>> 'enable', 'cluster.quorum-count': '1', 'performance.io-cache': 'off',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}, '1KVM12-P4': {'transportType': ['TCP'],
>> 'uuid': 'b4356604-4404-428a-9da6-f1636115e2fd', 'bricks':
>> ['16.0.0.161:/STORAGES/g1r5p4/GFS',
>> '16.0.0.162:/STORAGES/g1r5p4/GFS'], 'volumeName': '1KVM12-P4',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p4/GFS',
>> 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
>> '16.0.0.162:/STORAGES/g1r5p4/GFS', 'hostUuid':
>> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'diagnostics.count-fop-hits': 'on',
>> 'performance.stat-prefetch': 'off', 'cluster.quorum-type': 'fixed',
>> 'performance.quick-read': 'off', 'network.remote-dio': 'enable',
>> 'cluster.quorum-count': '1', 'performance.io-cache': 'off',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
>> '1KVM12-BCK': {'transportType': ['TCP'], 'uuid':
>> '62c89345-fd61-4b67-b8b4-69296eb7d217', 'bricks':
>> ['16.0.0.161:/STORAGES/g2r5p1/GFS',
>> '16.0.0.162:/STORAGES/g2r5p1/GFS'], 'volumeName': '1KVM12-BCK',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g2r5p1/GFS',
>> 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
>> '16.0.0.162:/STORAGES/g2r5p1/GFS', 'hostUuid':
>> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}, '1KVM12-P2': {'transportType': ['TCP'],
>> 'uuid': 'aa2d607d-3c6c-4f13-8205-aae09dcc9d35', 'bricks':
>> ['16.0.0.161:/STORAGES/g1r5p2/GFS',
>> '16.0.0.162:/STORAGES/g1r5p2/GFS'], 'volumeName': '1KVM12-P2',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p2/GFS',
>> 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
>> '16.0.0.162:/STORAGES/g1r5p2/GFS', 'hostUuid':
>> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off',
>> 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
>> '1KVM12-P3': {'transportType': ['TCP'], 'uuid':
>> '6060ff77-d552-4d94-97bf-5a32982e7d8a', 'bricks':
>> ['16.0.0.161:/STORAGES/g1r5p3/GFS',
>> '16.0.0.162:/STORAGES/g1r5p3/GFS'], 'volumeName': '1KVM12-P3',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p3/GFS',
>> 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
>> '16.0.0.162:/STORAGES/g1r5p3/GFS', 'hostUuid':
>> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off',
>> 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
>> '1KVM12-P1': {'transportType': ['TCP'], 'uuid':
>> 'f410c6a9-9a51-42b3-89bb-c20ac72a0461', 'bricks':
>> ['16.0.0.161:/STORAGES/g1r5p1/GFS',
>> '16.0.0.162:/STORAGES/g1r5p1/GFS'], 'volumeName': '1KVM12-P1',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p1/GFS',
>> 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
>> '16.0.0.162:/STORAGES/g1r5p1/GFS', 'hostUuid':
>> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off',
>> 'diagnostics.count-fop-hits': 'on', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36', 'diagnostics.latency-measurement': 'on'}},
>> '1KVM12-P5': {'transportType': ['TCP'], 'uuid':
>> '420fa218-60bc-47e4-89a8-ce39b7da885e', 'bricks':
>> ['16.0.0.161:/STORAGES/g1r5p5/GFS',
>> '16.0.0.162:/STORAGES/g1r5p5/GFS'], 'volumeName': '1KVM12-P5',
>> 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2',
>> 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1',
>> 'bricksInfo': [{'name': '16.0.0.161:/STORAGES/g1r5p5/GFS',
>> 'hostUuid': '194a16db-4924-4324-b1d4-97eeacb4f100'}, {'name':
>> '16.0.0.162:/STORAGES/g1r5p5/GFS', 'hostUuid':
>> 'bfb9cc69-0c57-4905-8043-66953368e713'}], 'options':
>> {'cluster.server-quorum-type': 'none', 'cluster.eager-lock':
>> 'enable', 'performance.io-cache': 'off', 'performance.stat-prefetch':
>> 'off', 'cluster.quorum-type': 'fixed', 'performance.quick-read':
>> 'off', 'network.remote-dio': 'enable', 'cluster.quorum-count': '1',
>> 'storage.owner-uid': '36', 'performance.read-ahead': 'off',
>> 'storage.owner-gid': '36'}}}}
>> Thread-431::DEBUG::2016-02-24
>> 11:12:47,729::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
>> /usr/bin/dd
>> if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata
>> iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
>> Thread-431::DEBUG::2016-02-24
>> 11:12:47,743::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
>> SUCCESS: <err> = '0+1 records in\n0+1 records out\n997 bytes (997 B)
>> copied, 0.000569374 s, 1.8 MB/s\n'; <rc> = 0
>> Thread-431::INFO::2016-02-24
>> 11:12:47,751::clusterlock::219::Storage.SANLock::(acquireHostId)
>> Acquiring host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)
>> Thread-431::DEBUG::2016-02-24
>> 11:12:47,751::clusterlock::237::Storage.SANLock::(acquireHostId) Host
>> id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 successfully
>> acquired (id: 3)
>> --------------------------------
>>
>> Thread-349::ERROR::2016-02-24
>> 11:18:20,040::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain)
>> Error while collecting domain ef010d08-aed1-41c4-ba9a-e6d9bdecb4b4
>> monitoring information
>> Traceback (most recent call last):
>> File "/usr/share/vdsm/storage/domainMonitor.py", line 221, in
>> _monitorDomain
>> self.domain = sdCache.produce(self.sdUUID)
>> File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>> domain.getRealDomain()
>> File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>> return self._cache._realProduce(self._sdUUID)
>> File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>> domain = self._findDomain(sdUUID)
>> File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>> dom = findMethod(sdUUID)
>> File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
>> return
>> GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
>> File "/usr/share/vdsm/storage/fileSD.py", line 160, in __init__
>> validateFileSystemFeatures(sdUUID, self.mountpoint)
>> File "/usr/share/vdsm/storage/fileSD.py", line 89, in
>> validateFileSystemFeatures
>> oop.getProcessPool(sdUUID).directTouch(testFilePath)
>> File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in
>> directTouch
>> ioproc.touch(path, flags, mode)
>> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
>> 543, in touch
>> self.timeout)
>> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
>> 427, in _sendCommand
>> raise OSError(errcode, errstr)
>> OSError: [Errno 5] Input/output error
>> Thread-453::ERROR::2016-02-24
>> 11:18:20,043::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain)
>> Error while collecting domain 300e9ac8-3c2f-4703-9bb1-1df2130c7c97
>> monitoring information
>> Traceback (most recent call last):
>> File "/usr/share/vdsm/storage/domainMonitor.py", line 221, in
>> _monitorDomain
>> self.domain = sdCache.produce(self.sdUUID)
>> File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>> domain.getRealDomain()
>> File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>> return self._cache._realProduce(self._sdUUID)
>> File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>> domain = self._findDomain(sdUUID)
>> File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>> dom = findMethod(sdUUID)
>> File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
>> return
>> GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
>> File "/usr/share/vdsm/storage/fileSD.py", line 160, in __init__
>> validateFileSystemFeatures(sdUUID, self.mountpoint)
>> File "/usr/share/vdsm/storage/fileSD.py", line 89, in
>> validateFileSystemFeatures
>> oop.getProcessPool(sdUUID).directTouch(testFilePath)
>> File "/usr/share/vdsm/storage/outOfProcess.py", line 351, in
>> directTouch
>> ioproc.touch(path, flags, mode)
>> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
>> 543, in touch
>> self.timeout)
>> File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line
>> 427, in _sendCommand
>> raise OSError(errcode, errstr)
>> OSError: [Errno 5] Input/output error
>> Thread-431::DEBUG::2016-02-24
>> 11:18:20,109::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
>> /usr/bin/dd
>> if=/rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md/metadata
>> iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
>> Thread-431::DEBUG::2016-02-24
>> 11:18:20,122::fileSD::262::Storage.Misc.excCmd::(getReadDelay)
>> SUCCESS: <err> = '0+1 records in\n0+1 records out\n997 bytes (997 B)
>> copied, 0.000444081 s, 2.2 MB/s\n'; <rc> = 0
>> Thread-431::INFO::2016-02-24
>> 11:18:20,128::clusterlock::219::Storage.SANLock::(acquireHostId)
>> Acquiring host id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 (id: 3)
>> Thread-431::DEBUG::2016-02-24
>> 11:18:20,129::clusterlock::237::Storage.SANLock::(acquireHostId) Host
>> id for domain ff71b47b-0f72-4528-9bfe-c3da888e47f0 successfully
>> acquired (id: 3)
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,690::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
>> Calling 'Task.getStatus' in bridge with {u'taskID':
>> u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,692::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state init
>> -> state preparing
>> Thread-35631::INFO::2016-02-24
>> 11:18:20,692::logUtils::44::dispatcher::(wrapper) Run and protect:
>> getTaskStatus(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660',
>> spUUID=None, options=None)
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,693::taskManager::103::Storage.TaskManager::(getTaskStatus)
>> Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,693::taskManager::106::Storage.TaskManager::(getTaskStatus)
>> Return. Response: {'code': 661, 'message': 'Cannot acquire host id',
>> 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID':
>> 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
>> Thread-35631::INFO::2016-02-24
>> 11:18:20,693::logUtils::47::dispatcher::(wrapper) Run and protect:
>> getTaskStatus, Return response: {'taskStatus': {'code': 661,
>> 'message': 'Cannot acquire host id', 'taskState': 'finished',
>> 'taskResult': 'cleanSuccess', 'taskID':
>> 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,693::task::1191::Storage.TaskManager.Task::(prepare)
>> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::finished: {'taskStatus':
>> {'code': 661, 'message': 'Cannot acquire host id', 'taskState':
>> 'finished', 'taskResult': 'cleanSuccess', 'taskID':
>> 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}}
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,693::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::moving from state
>> preparing -> state finished
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,694::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
>> Owner.releaseAll requests {} resources {}
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,694::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,694::task::993::Storage.TaskManager.Task::(_decref)
>> Task=`8e893d78-e9dd-4af3-9ddc-5eb5c1006cfc`::ref 0 aborting False
>> Thread-35631::DEBUG::2016-02-24
>> 11:18:20,694::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
>> Return 'Task.getStatus' in bridge with {'code': 661, 'message':
>> 'Cannot acquire host id', 'taskState': 'finished', 'taskResult':
>> 'cleanSuccess', 'taskID': 'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,699::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
>> Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID':
>> u'00000002-0002-0002-0002-00000000021e'}
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,700::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state init
>> -> state preparing
>> Thread-35632::INFO::2016-02-24
>> 11:18:20,700::logUtils::44::dispatcher::(wrapper) Run and protect:
>> getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e',
>> options=None)
>> Thread-35632::INFO::2016-02-24
>> 11:18:20,707::logUtils::47::dispatcher::(wrapper) Run and protect:
>> getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus':
>> 'Free', 'spmLver': -1}}
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,707::task::1191::Storage.TaskManager.Task::(prepare)
>> Task=`3a303728-a737-4ec4-8468-c5f142964689`::finished: {'spm_st':
>> {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,707::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`3a303728-a737-4ec4-8468-c5f142964689`::moving from state
>> preparing -> state finished
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,708::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
>> Owner.releaseAll requests {} resources {}
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,708::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,708::task::993::Storage.TaskManager.Task::(_decref)
>> Task=`3a303728-a737-4ec4-8468-c5f142964689`::ref 0 aborting False
>> Thread-35632::DEBUG::2016-02-24
>> 11:18:20,708::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
>> Return 'StoragePool.getSpmStatus' in bridge with {'spmId': -1,
>> 'spmStatus': 'Free', 'spmLver': -1}
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,737::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
>> Calling 'Task.clear' in bridge with {u'taskID':
>> u'ea8c684e-fedb-44f2-84c5-4f1407f24660'}
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,738::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state init
>> -> state preparing
>> Thread-35633::INFO::2016-02-24
>> 11:18:20,738::logUtils::44::dispatcher::(wrapper) Run and protect:
>> clearTask(taskID=u'ea8c684e-fedb-44f2-84c5-4f1407f24660',
>> spUUID=None, options=None)
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,739::taskManager::171::Storage.TaskManager::(clearTask)
>> Entry. taskID: ea8c684e-fedb-44f2-84c5-4f1407f24660
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,739::taskManager::176::Storage.TaskManager::(clearTask) Return.
>> Thread-35633::INFO::2016-02-24
>> 11:18:20,739::logUtils::47::dispatcher::(wrapper) Run and protect:
>> clearTask, Return response: None
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,739::task::1191::Storage.TaskManager.Task::(prepare)
>> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::finished: None
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,740::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::moving from state
>> preparing -> state finished
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,740::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
>> Owner.releaseAll requests {} resources {}
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,740::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,740::task::993::Storage.TaskManager.Task::(_decref)
>> Task=`6164a1ff-67b3-4e35-bce0-f0dc9657bf94`::ref 0 aborting False
>> Thread-35633::DEBUG::2016-02-24
>> 11:18:20,740::__init__::514::jsonrpc.JsonRpcServer::(_serveRequest)
>> Return 'Task.clear' in bridge with True
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,859::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest)
>> Calling 'StoragePool.getSpmStatus' in bridge with {u'storagepoolID':
>> u'00000002-0002-0002-0002-00000000021e'}
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,860::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state init
>> -> state preparing
>> Thread-35634::INFO::2016-02-24
>> 11:18:20,860::logUtils::44::dispatcher::(wrapper) Run and protect:
>> getSpmStatus(spUUID=u'00000002-0002-0002-0002-00000000021e',
>> options=None)
>> Thread-35634::INFO::2016-02-24
>> 11:18:20,867::logUtils::47::dispatcher::(wrapper) Run and protect:
>> getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus':
>> 'Free', 'spmLver': -1}}
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,867::task::1191::Storage.TaskManager.Task::(prepare)
>> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::finished: {'spm_st':
>> {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,867::task::595::Storage.TaskManager.Task::(_updateState)
>> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::moving from state
>> preparing -> state finished
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,867::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
>> Owner.releaseAll requests {} resources {}
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,867::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-35634::DEBUG::2016-02-24
>> 11:18:20,867::task::993::Storage.TaskManager.Task::(_decref)
>> Task=`e887fd8b-6961-40f1-b3a0-917ffbea25c0`::ref 0 aborting False
>> ---------------------------------------
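Both tracebacks in the vdsm.log excerpt end in `directTouch`, called from `validateFileSystemFeatures`, with `OSError: [Errno 5]`. The split-brain listing for 1KVM12-P1 names `/__DIRECT_IO_TEST__`, so an EIO at this step would be consistent with that test file being in split-brain. The sketch below is a rough way to try a similar direct-I/O touch by hand. It is not vdsm's implementation: `direct_touch` is a hypothetical helper, and the idea that the test file sits at the mount root is inferred from the heal output, not from the vdsm source.

```python
import errno
import os


def direct_touch(path, mode=0o644):
    """Open (creating if needed) `path` with O_DIRECT.

    Returns None on success or the errno value on failure.
    O_DIRECT is Linux-specific; where it is missing the flag is simply omitted.
    """
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, flags, mode)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return None


# A path in a directory that does not exist, to show the error reporting.
result = direct_touch("/nonexistent-dir-for-sketch/__DIRECT_IO_TEST__")
print(errno.errorcode.get(result, result))
```

On an affected mount, a return value of `errno.EIO` would match the `[Errno 5] Input/output error` seen in the log.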
>>
>> These blocks are generated in a cycle for each domain.
>>
>> Any ideas?
>> Regards,
>> Pa.
>>
>>
>>
>> On 24.2.2016 10:54, Ravishankar N wrote:
>>> Hi,
>>>
>>> On 02/24/2016 06:43 AM, paf1 at email.cz wrote:
>>>> Hi,
>>>> I found what may be the main problem: an I/O error (-5) on access to the
>>>> "ids" file.
>>>> This file is not accessible via NFS, but it is locally
>>> How is NFS coming into the picture? Are you not using gluster fuse
>>> mount?
>>>> .
>>>> How can I fix it?
>>> Can you run `gluster volume heal volname info` and `gluster volume
>>> heal volname info split-brain` to see if the "ids" file is in
>>> split-brain? A file in split-brain returns EIO when accessed from
>>> the mount.
>>> Regards,
>>> Ravi
>>>
>>>
>>>> Regards,
>>>> Pavel
>>>>
>>>> # sanlock client log_dump
>>>> ....
>>>> 0 flags 1 timeout 0
>>>> 2016-02-24 02:01:10+0100 3828 [12111]: s1316 lockspace
>>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>>> 2016-02-24 02:01:10+0100 3828 [12111]: cmd_add_lockspace 4,15 async
>>>> done 0
>>>> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire begin
>>>> 88adbd49-62d6-45b1-9992-b04464a04112:1
>>>> 2016-02-24 02:01:10+0100 3828 [19556]: 88adbd49 aio collect 0
>>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>>> 2016-02-24 02:01:10+0100 3828 [19556]: read_sectors delta_leader
>>>> offset 0 rv -5
>>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
>>>> 2016-02-24 02:01:10+0100 3828 [19556]: s1316 delta_acquire
>>>> leader_read1 error -5
>>>> 2016-02-24 02:01:11+0100 3829 [12111]: s1316 add_lockspace fail
>>>> result -5
>>>> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15
>>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>>>> flags 1 timeout 0
>>>> 2016-02-24 02:01:12+0100 3831 [12116]: s1317 lockspace
>>>> 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>>>> 2016-02-24 02:01:12+0100 3831 [12116]: cmd_add_lockspace 4,15 async
>>>> done 0
>>>> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire begin
>>>> 7f52b697-c199-4f58-89aa-102d44327124:1
>>>> 2016-02-24 02:01:12+0100 3831 [19562]: 7f52b697 aio collect 0
>>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>>> 2016-02-24 02:01:12+0100 3831 [19562]: read_sectors delta_leader
>>>> offset 0 rv -5
>>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>>>> 2016-02-24 02:01:12+0100 3831 [19562]: s1317 delta_acquire
>>>> leader_read1 error -5
>>>> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15
>>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0
>>>> flags 1 timeout 0
>>>> 2016-02-24 02:01:13+0100 3831 [1321]: s1318 lockspace
>>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids:0
>>>> 2016-02-24 02:01:13+0100 3831 [1321]: cmd_add_lockspace 4,15 async
>>>> done 0
>>>> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire begin
>>>> 0fcad888-d573-47be-bef3-0bc0b7a99fb7:1
>>>> 2016-02-24 02:01:13+0100 3831 [19564]: 0fcad888 aio collect 0
>>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458201000 result -5:0 match res
>>>> 2016-02-24 02:01:13+0100 3831 [19564]: read_sectors delta_leader
>>>> offset 0 rv -5
>>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md/ids
>>>> 2016-02-24 02:01:13+0100 3831 [19564]: s1318 delta_acquire
>>>> leader_read1 error -5
>>>> 2016-02-24 02:01:13+0100 3832 [12116]: s1317 add_lockspace fail
>>>> result -5
>>>> 2016-02-24 02:01:14+0100 3832 [1321]: s1318 add_lockspace fail
>>>> result -5
>>>> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15
>>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0
>>>> flags 1 timeout 0
>>>> 2016-02-24 02:01:19+0100 3838 [12106]: s1319 lockspace
>>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids:0
>>>> 2016-02-24 02:01:19+0100 3838 [12106]: cmd_add_lockspace 4,15 async
>>>> done 0
>>>> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire begin
>>>> 3da46e07-d1ea-4f10-9250-6cbbb7b94d80:1
>>>> 2016-02-24 02:01:19+0100 3838 [19638]: 3da46e07 aio collect 0
>>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>>> 2016-02-24 02:01:19+0100 3838 [19638]: read_sectors delta_leader
>>>> offset 0 rv -5
>>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P5/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids
>>>> 2016-02-24 02:01:19+0100 3838 [19638]: s1319 delta_acquire
>>>> leader_read1 error -5
>>>> 2016-02-24 02:01:20+0100 3839 [12106]: s1319 add_lockspace fail
>>>> result -5
>>>> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15
>>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>>> flags 1 timeout 0
>>>> 2016-02-24 02:01:20+0100 3839 [1320]: s1320 lockspace
>>>> 88adbd49-62d6-45b1-9992-b04464a04112:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids:0
>>>> 2016-02-24 02:01:20+0100 3839 [1320]: cmd_add_lockspace 4,15 async
>>>> done 0
>>>> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire begin
>>>> 88adbd49-62d6-45b1-9992-b04464a04112:1
>>>> 2016-02-24 02:01:20+0100 3839 [19658]: 88adbd49 aio collect 0
>>>> 0x7fe4580008c0:0x7fe4580008d0:0x7fe458101000 result -5:0 match res
>>>> 2016-02-24 02:01:20+0100 3839 [19658]: read_sectors delta_leader
>>>> offset 0 rv -5
>>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
>>>> 2016-02-24 02:01:20+0100 3839 [19658]: s1320 delta_acquire
>>>> leader_read1 error -5
>>>> 2016-02-24 02:01:21+0100 3840 [1320]: s1320 add_lockspace fail
>>>> result -5
>>>>
>>>>
>>>>
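In the sanlock dump above, `rv -5` and `result -5` are negated errno values, and errno 5 is EIO. That fits the explanation that a file in split-brain returns EIO when read from the mount. The sketch below pulls the failing `ids` paths out of such a dump, including text that has been wrapped and quoted by a mail client. It is only an illustration; `failed_reads` is a hypothetical helper and the pattern assumes the `read_sectors delta_leader` line layout shown in this thread.

```python
import errno
import re

# Three lines copied from the thread, including the mail quote markers.
SAMPLE_LOG = """\
>>>> 2016-02-24 02:01:10+0100 3828 [19556]: read_sectors delta_leader
>>>> offset 0 rv -5
>>>> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md/ids
"""

READ_FAIL = re.compile(
    r"read_sectors\s+delta_leader\s+offset\s+(\d+)\s+rv\s+(-?\d+)\s+(\S+)"
)


def failed_reads(log_text):
    """Return [(path, errno_name)] for every read_sectors line with rv < 0."""
    # Drop leading quote markers and re-join lines wrapped by the mail client.
    flat = " ".join(re.sub(r"^[>\s]+", "", line) for line in log_text.splitlines())
    failures = []
    for _offset, rv, path in READ_FAIL.findall(flat):
        code = int(rv)
        if code < 0:
            failures.append((path, errno.errorcode.get(-code, "unknown")))
    return failures


for path, name in failed_reads(SAMPLE_LOG):
    print(name, path)
```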
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
>
>