On Thu, Jul 6, 2017 at 3:47 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Wed, Jul 5, 2017 at 6:39 PM, Atin Mukherjee <amukherj@redhat.com> wrote:
OK, so the log just hints to the following:

[2017-07-05 15:04:07.178204] E [MSGID: 106123] [glusterd-mgmt.c:1532:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Reset Brick on local node
[2017-07-05 15:04:07.178214] E [MSGID: 106123] [glusterd-replace-brick.c:649:glusterd_mgmt_v3_initiate_replace_brick_cmd_phases] 0-management: Commit Op Failed

Going through the code, glusterd_op_reset_brick () failed, which resulted in these logs. However, I don't see any error logs generated from glusterd_op_reset_brick () itself, which makes me think we failed at a place where the failure is only logged at debug level. Would you be able to restart the glusterd service in debug log mode, rerun this test, and share the log?


Do you mean to run the reset-brick command for another volume or for the same? Can I run it against this "now broken" volume?

Or could I modify /usr/lib/systemd/system/glusterd.service and, in the [Service] section, change

from
Environment="LOG_LEVEL=INFO"

to
Environment="LOG_LEVEL=DEBUG"

and then
systemctl daemon-reload
systemctl restart glusterd

Yes, that's how you can run glusterd in debug log mode.
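
For reference, the same change can be sketched as a systemd drop-in, which leaves the packaged unit file untouched and is easy to revert. This is only a sketch. It assumes that glusterd.service reads LOG_LEVEL as in the lines quoted above. The helper name write_glusterd_loglevel and the file name loglevel.conf are made up for this example. If the unit also loads an environment file that sets LOG_LEVEL, that value may take precedence.

```shell
#!/bin/sh
# Sketch: set glusterd's log level through a systemd drop-in.
# write_glusterd_loglevel and loglevel.conf are hypothetical names.

# Usage: write_glusterd_loglevel LEVEL [DROPIN_DIR]
# DROPIN_DIR defaults to the usual drop-in directory for glusterd.service.
write_glusterd_loglevel() {
    level=$1
    dir=${2:-/etc/systemd/system/glusterd.service.d}
    mkdir -p "$dir" || return 1
    # A later Environment= assignment overrides the one in the main unit file.
    printf '[Service]\nEnvironment="LOG_LEVEL=%s"\n' "$level" > "$dir/loglevel.conf"
}

# Typical use as root (not executed by this sketch):
#   write_glusterd_loglevel DEBUG && systemctl daemon-reload && systemctl restart glusterd
#   ... reproduce the reset-brick failure ...
#   write_glusterd_loglevel INFO && systemctl daemon-reload && systemctl restart glusterd
```

Switching back to INFO is then the same call with a different level, followed by the same daemon-reload and restart.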

I think it would be better to keep gluster in debug mode for as short a time as possible, as there are other volumes active right now and I want to avoid filling up the file system that holds the log files.
Ideally only some components would be put in debug mode, if that is possible, as in the example commands above.

You can switch back to info mode as soon as the failure has been reproduced once with debug logging enabled. What I need here is the glusterd log (in debug mode) to figure out the exact cause of the failure.
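
To keep the shared log small, the part written after debug mode was enabled can be cut out by timestamp. The following is a rough sketch. The helper name logs_since is made up, and the timestamp layout is assumed to match the glusterd log lines quoted earlier in this thread. The log file location is also an assumption, so check /var/log/glusterfs/ for the actual glusterd log name on your release.

```shell
#!/bin/sh
# Sketch: print glusterd log lines at or after a given timestamp.
# logs_since is a hypothetical helper name.

# Usage: logs_since LOGFILE "YYYY-MM-DD HH:MM:SS"
logs_since() {
    awk -v since="$2" '
        # Lines that start with "[YYYY-" carry a timestamp in columns 2-20;
        # compare it as a string, which sorts chronologically in this layout.
        /^\[[0-9][0-9][0-9][0-9]-/ { keep = (substr($0, 2, 19) >= since) }
        # Lines without a timestamp follow the decision made for the last stamped line.
        keep
    ' "$1"
}

# Example (the path is an assumption):
#   logs_since /var/log/glusterfs/glusterd.log "2017-07-05 15:04:00" > reset-brick-debug.log
```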
 

Let me know,
thanks