On Fri, Jul 7, 2017 at 2:23 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com> wrote:
On Thu, Jul 6, 2017 at 3:22 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com> wrote:
> On Thu, Jul 6, 2017 at 2:16 PM, Atin Mukherjee <amukherj(a)redhat.com>
> wrote:
>
>>
>>
>> On Thu, Jul 6, 2017 at 5:26 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com> wrote:
>>
>>> On Thu, Jul 6, 2017 at 8:38 AM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com> wrote:
>>>
>>>>
>>>> Eventually I can destroy and recreate this "export" volume again with
>>>> the old names (ovirt0N.localdomain.local) if you give me the sequence of
>>>> commands, then enable debug and retry the reset-brick command
>>>>
>>>> Gianluca
>>>>
>>>
>>>
>>> So it seems I was able to destroy and re-create.
>>> Now I see that the volume creation uses the new IP by default, so I
>>> reversed the hostname roles in the commands after putting glusterd in
>>> debug mode on the host where I execute the reset-brick command (do I have
>>> to set debug on the other nodes too?)
>>>
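[Editor's note: for the record, a minimal sketch of a destroy/recreate sequence like the one described above. The replica count, the gl0N hostnames, and the brick paths are assumptions pieced together from the commands elsewhere in this thread; adjust to the actual cluster.]

```shell
# Stop and delete the old volume (confirm the interactive prompts)
gluster volume stop export
gluster volume delete export

# On EVERY node, wipe the old brick directory first, otherwise the stale
# trusted.glusterfs.volume-id xattr blocks re-creation, e.g.:
#   ssh gl0N 'rm -rf /gluster/brick3/export && mkdir -p /gluster/brick3/export'

# Recreate with the desired hostnames (replica 3 is an assumption here)
gluster volume create export replica 3 \
    gl01.localdomain.local:/gluster/brick3/export \
    gl02.localdomain.local:/gluster/brick3/export \
    gl03.localdomain.local:/gluster/brick3/export
gluster volume start export
```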
>>
>> You have to set the log level to debug for glusterd instance where the
>> commit fails and share the glusterd log of that particular node.
>>
>>
>
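[Editor's note: for reference, raising glusterd's own log level (as opposed to the brick log level controlled by the diagnostics.brick-log-level volume option) means restarting the daemon with its --log-level option. A sketch, assuming a systemd-managed glusterd; the exact service setup may differ per distro.]

```shell
# One-off: stop the managed daemon and restart it at DEBUG verbosity
systemctl stop glusterd
glusterd --log-level DEBUG

# Watch the management log while re-running the reset-brick commands
tail -f /var/log/glusterfs/glusterd.log
```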
> Ok, done.
>
> Command executed on ovirt01 with timestamp "2017-07-06 13:04:12" in
> glusterd log files
>
> [root@ovirt01 export]# gluster volume reset-brick export gl01.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
>
> [root@ovirt01 export]# gluster volume reset-brick export gl01.localdomain.local:/gluster/brick3/export ovirt01.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local. Please check log file for details.
> Commit failed on ovirt03.localdomain.local. Please check log file for details.
> [root@ovirt01 export]#
>
> See glusterd log files for the 3 nodes in debug mode here:
> ovirt01: https://drive.google.com/file/d/0BwoPbcrMv8mvY1RTTGp3RUhScm8/view?usp=sharing
> ovirt02: https://drive.google.com/file/d/0BwoPbcrMv8mvSVpJUHNhMzhMSU0/view?usp=sharing
> ovirt03: https://drive.google.com/file/d/0BwoPbcrMv8mvT2xiWEdQVmJNb0U/view?usp=sharing
>
> Hope it helps with the debugging
> Gianluca
>
>
Hi Atin,
did you have time to look at the logs?
Comparing the debug-enabled messages with the previous ones, I see these added lines on the nodes where the commit failed, after running the commands:

gluster volume reset-brick export gl01.localdomain.local:/gluster/brick3/export start
gluster volume reset-brick export gl01.localdomain.local:/gluster/brick3/export ovirt01.localdomain.local:/gluster/brick3/export commit force
[2017-07-06 13:04:30.221872] D [MSGID: 0] [glusterd-peer-utils.c:674:gd_peerinfo_find_from_hostname] 0-management: Friend ovirt01.localdomain.local found.. state: 3
[2017-07-06 13:04:30.221882] D [MSGID: 0] [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid] 0-management: returning 0
[2017-07-06 13:04:30.221888] D [MSGID: 0] [glusterd-utils.c:1039:glusterd_resolve_brick] 0-management: Returning 0
[2017-07-06 13:04:30.221908] D [MSGID: 0] [glusterd-utils.c:998:glusterd_brickinfo_new] 0-management: Returning 0
[2017-07-06 13:04:30.221915] D [MSGID: 0] [glusterd-utils.c:1195:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2017-07-06 13:04:30.222187] D [MSGID: 0] [glusterd-peer-utils.c:167:glusterd_hostname_to_uuid] 0-management: returning 0
[2017-07-06 13:04:30.222201] D [MSGID: 0] [glusterd-utils.c:1486:glusterd_volume_brickinfo_get] 0-management: Returning -1
The above log entry is the reason for the failure: GlusterD is unable to
find the old brick (src_brick) in its volinfo structure. FWIW, would you
be able to share the output of 'gluster get-state' and 'gluster volume
info' after running reset-brick start? I'd need to check why glusterd
is unable to find the old brick's details in its volinfo post
reset-brick start.
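[Editor's note: the requested output could be captured right after the `reset-brick ... start` step with something like the following; the output paths are arbitrary choices.]

```shell
# Dump glusterd's view of the cluster (peers, volumes, bricks) to a file;
# "odir" and "file" choose where the state dump lands (/tmp/export-state-after-start)
gluster get-state glusterd odir /tmp file export-state-after-start

# Volume layout as glusterd currently records it
gluster volume info export > /tmp/export-volinfo.txt
```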
[2017-07-06 13:04:30.222207] D [MSGID: 0] [store.c:459:gf_store_handle_destroy] 0-: Returning 0
[2017-07-06 13:04:30.222242] D [MSGID: 0] [glusterd-utils.c:1512:glusterd_volume_brickinfo_get_by_brick] 0-glusterd: Returning -1
[2017-07-06 13:04:30.222250] D [MSGID: 0] [glusterd-replace-brick.c:416:glusterd_op_perform_replace_brick] 0-glusterd: Returning -1
[2017-07-06 13:04:30.222257] C [MSGID: 106074] [glusterd-reset-brick.c:372:glusterd_op_reset_brick] 0-management: Unable to add dst-brick: ovirt01.localdomain.local:/gluster/brick3/export to volume: export
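[Editor's note: in a large DEBUG log this failure chain can be found quickly by grepping for the negative return codes together with the final CRITICAL message. A self-contained sketch, using two sample entries from the log above; in practice point the grep at /var/log/glusterfs/glusterd.log instead.]

```shell
# Write two sample entries (copied from the log above) to a scratch file
log=/tmp/glusterd-sample.log
cat > "$log" <<'EOF'
[2017-07-06 13:04:30.222201] D [MSGID: 0] [glusterd-utils.c:1486:glusterd_volume_brickinfo_get] 0-management: Returning -1
[2017-07-06 13:04:30.222257] C [MSGID: 106074] [glusterd-reset-brick.c:372:glusterd_op_reset_brick] 0-management: Unable to add dst-brick: ovirt01.localdomain.local:/gluster/brick3/export to volume: export
EOF

# Count the lines marking the failure: the -1 return plus the CRITICAL message
grep -cE 'Returning -1|MSGID: 106074' "$log"   # → 2
```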
Does it shed more light?
Thanks,
Gianluca