And in fact, after solving the split brain, the gluster domain was
automatically activated.
From rhev-data-center-mnt-glusterSD-f18ovn01.mydomain:gvdata.log under
/var/log/glusterfs I found that the ids file was the one not in sync.
As the VM only started on f18ovn03 and I was not able to migrate it to
f18ovn01, I decided to delete the file from f18ovn01.
BTW: what does dom_md/ids contain?
[2013-10-03 22:06:33.543730] E
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
0-gvdata-replicate-0: Unable to self-heal contents of
'/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids' (possible
split-brain). Please delete the file from all but the preferred
subvolume.- Pending matrix: [ [ 0 2 ] [ 2 0 ] ]
[2013-10-03 22:06:33.544013] E
[afr-self-heal-common.c:2212:afr_self_heal_completion_cbk]
0-gvdata-replicate-0: background data self-heal failed on
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
[2013-10-03 22:06:33.544522] W [afr-open.c:213:afr_open]
0-gvdata-replicate-0: failed to open as split brain seen, returning
EIO
[2013-10-03 22:06:33.544603] W [page.c:991:__ioc_page_error]
0-gvdata-io-cache: page error for page = 0x7f4b80004910 & waitq =
0x7f4b8001da60
[2013-10-03 22:06:33.544635] W [fuse-bridge.c:2049:fuse_readv_cbk]
0-glusterfs-fuse: 132995: READ => -1 (Input/output error)
[2013-10-03 22:06:33.545070] W
[client-lk.c:367:delete_granted_locks_owner] 0-gvdata-client-0: fdctx
not valid
[2013-10-03 22:06:33.545118] W
[client-lk.c:367:delete_granted_locks_owner] 0-gvdata-client-1: fdctx
not valid
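As a side note, gluster itself should be able to list the entries it
considers in split brain; assuming the same volume name gvdata, something
like this (from memory, exact output may vary between gluster versions)
should show the same file:

gluster volume heal gvdata info split-brain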
I found that gluster creates hard links (under the .glusterfs directory
of the brick), so you have to delete all copies of the conflicting file
from the brick directory of the node you choose to delete from.
Thanks very much to this link:
http://inuits.eu/blog/fixing-glusterfs-split-brain
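For reference, the copy under .glusterfs is named after the file's GFID,
which should be readable from the trusted.gfid extended attribute on the
brick (assuming the attr package with getfattr is installed):

getfattr -n trusted.gfid -e hex /gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids

The hex value should correspond to the ae27eb8d-... name that find shows
below.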
So these were my steps:
locate the file and its hard-linked copy:
[root@f18ovn01 d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291]# find
/gluster/DATA_GLUSTER/brick1/ -samefile
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
-print
/gluster/DATA_GLUSTER/brick1/.glusterfs/ae/27/ae27eb8d-c653-4cc0-a054-ea376ce8097d
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
and then delete both of them:
[root@f18ovn01 d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291]# find
/gluster/DATA_GLUSTER/brick1/ -samefile
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
-print -delete
/gluster/DATA_GLUSTER/brick1/.glusterfs/ae/27/ae27eb8d-c653-4cc0-a054-ea376ce8097d
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
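At this point one should also be able to trigger the self-heal explicitly
instead of waiting for a client access (commands from memory, for the
gvdata volume):

gluster volume heal gvdata
gluster volume heal gvdata info

where the second one should eventually report zero entries on both bricks.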
And after this step there were no more " E " lines in the gluster log,
and the gluster domain was automatically activated by the engine.
Gianluca