[ovirt-users] I’m having trouble deleting a test gluster volume

Precht, Andrew Andrew.Precht at sjlibrary.org
Wed Apr 12 21:05:18 UTC 2017


Hi all,
In the end, I ran this on each host node, and it is what worked:
systemctl stop glusterd && rm -rf /var/lib/glusterd/vols/* && rm -rf /var/lib/glusterd/peers/*

Thanks so much for your help.

P.S. I work as a sys admin for the San Jose library. Part of my job satisfaction comes from knowing that the work I do here goes directly back into this community. We’re fortunate that you, your coworkers, and Red Hat do so much to give back. I have to imagine you too feel this sense of satisfaction. Thanks again…

P.P.S. I never did hear back from the users at ovirt.org mailing list. I did fill out the fields on this page: https://lists.ovirt.org/mailman/listinfo/users. Yet every time I send them an email I get: Your message to Users awaits moderator approval. Is there a secret handshake I’m not aware of?

Regards,
Andrew


________________________________
From: knarra <knarra at redhat.com>
Sent: Wednesday, April 12, 2017 10:01:33 AM
To: Precht, Andrew; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik; Nir Soffer
Cc: users
Subject: Re: [ovirt-users] I’m having trouble deleting a test gluster volume

On 04/12/2017 08:45 PM, Precht, Andrew wrote:

Hi all,

You asked: Any errors in ovirt-engine.log file ?

Yes, in the engine.log this error is repeated about every 3 minutes:


2017-04-12 07:16:12,554-07 ERROR [org.ovirt.engine.core.bll.gluster.GlusterTasksSyncJob] (DefaultQuartzScheduler3) [ccc8ed0d-8b91-4397-b6b9-ab0f77c5f7b8] Error updating tasks from CLI: org.ovirt.engine.core.common.errors.EngineException: EngineException: Command execution failed error: Error : Request timed out return code: 1 (Failed with error GlusterVolumeStatusAllFailedException and code 4161) error: Error : Request timed out

I am not sure why this says "Request timed out".

1) gluster volume list ->  Still shows the deleted volume (test1)

2) gluster peer status -> Shows one of the peers twice with different uuid’s:

Hostname: 192.168.10.109
Uuid: 42fbb7de-8e6f-4159-a601-3f858fa65f6c
State: Peer in Cluster (Connected)

Hostname: 192.168.10.109
Uuid: e058babe-7f9d-49fe-a3ea-ccdc98d7e5b5
State: Peer in Cluster (Connected)

How did this happen? Is the hostname the same for two hosts?
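
A quick way to spot this kind of duplicate is to count hostnames in the peer status output. The following is a minimal sketch, assuming the usual one-field-per-line layout of `gluster peer status`; the function name find_duplicate_peers is made up for illustration.

```shell
#!/bin/sh
# Print each hostname that appears more than once in `gluster peer status`
# output, followed by how many times it appears.
# Reads the peer status text on stdin, so it also works on saved output.
find_duplicate_peers() {
    awk '$1 == "Hostname:" { count[$2]++ }
         END { for (h in count) if (count[h] > 1) print h, count[h] }'
}

# Usage on a node (requires the gluster CLI):
#   gluster peer status | find_duplicate_peers
```

No output means every peer hostname is listed once. A line such as "192.168.10.109 2" points at a stale peer entry for that host.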

I tried a gluster volume stop test1, with this result: volume stop: test1: failed: Another transaction is in progress for test1. Please try again after sometime.

Can you restart glusterd and then try to stop and delete the volume?

The etc-glusterfs-glusterd.vol.log shows no activity triggered by trying to remove the test1 volume from the UI.


The ovirt-engine.log shows this repeating many times, when trying to remove the test1 volume from the UI:


2017-04-12 07:57:38,049-07 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler9) [ccc8ed0d-8b91-4397-b6b9-ab0f77c5f7b8] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[b0e1b909-9a6a-49dc-8e20-3a027218f7e1=<GLUSTER, ACTION_TYPE_FAILED_GLUSTER_OPERATION_INPROGRESS>]', sharedLocks='null'}'

Can you restart the ovirt-engine service? I see "Failed to acquire lock" in the log. Once ovirt-engine is restarted, whatever is holding the lock should release it, and things should work fine.

Last but not least, if none of the above works:

Log in to all the nodes in the cluster and run:
rm -rf /var/lib/glusterd/vols/*
rm -rf /var/lib/glusterd/peers/*
Then run systemctl restart glusterd on all the nodes.

Log in to the UI and check whether any volumes or hosts are present. If so, remove them.

This should clear things up for you, and you can start from scratch.
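
The last-resort cleanup above could be wrapped in a small shell function. This is only a sketch: the function name, the optional directory argument, and the backup copy are assumptions added for safety, and stopping glusterd before deleting (rather than only restarting it afterwards) is also an assumption. It discards all volume and peer definitions on the node, so it belongs on test clusters only.

```shell
#!/bin/sh
# Sketch: wipe glusterd's volume and peer definitions on ONE node.
# The backup copy and the directory argument are assumptions; the steps
# in the thread use plain rm -rf on /var/lib/glusterd.
reset_glusterd_state() {
    dir="${1:-/var/lib/glusterd}"
    backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Keep a copy so the state can be restored if this was a mistake.
    cp -a "$dir" "$backup" || return 1
    rm -rf "$dir"/vols/* "$dir"/peers/*
    # Print where the backup went.
    echo "$backup"
}

# Usage on each node, as root:
#   systemctl stop glusterd
#   reset_glusterd_state
#   systemctl start glusterd
```

Other files in the directory, such as the node's own glusterd.info, are left in place.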



Thanks much,

Andrew

________________________________
From: knarra <knarra at redhat.com>
Sent: Tuesday, April 11, 2017 11:10:04 PM
To: Precht, Andrew; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik; Nir Soffer
Cc: users
Subject: Re: [ovirt-users] I’m having trouble deleting a test gluster volume

On 04/12/2017 03:35 AM, Precht, Andrew wrote:

I just noticed this in the Alerts tab: Detected deletion of volume test1 on cluster 8000-1, and deleted it from engine DB.

Yet it still shows in the web UI?

Are there any errors in the ovirt-engine.log file? If the volume is deleted from the DB, ideally it should be removed from the UI too. Can you go to the gluster nodes and check the following:

1) gluster volume list -> should not return anything since you have deleted the volumes.

2) gluster peer status -> on all the nodes should show that all the peers are in connected state.

Can you run tail -f on /var/log/ovirt-engine/ovirt-engine.log and on the gluster log, and capture the error messages when you try deleting the volume from the UI?
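
To cut the noise while reproducing the failure, the tailed output can be filtered down to error-level lines. This is a minimal sketch; the function name only_errors is hypothetical, and the patterns assume the two line formats quoted in this thread (an "ERROR" level after the timestamp in the engine log, a single-letter "E" level in gluster logs).

```shell
#!/bin/sh
# Keep only error-level lines from the two log formats seen in this thread:
#   engine:  "2017-04-12 07:16:12,554-07 ERROR [...]"
#   gluster: "[2017-04-11 19:48:40.756360] E [MSGID: ...]"
# Reads log lines on stdin and writes the matching ones to stdout.
only_errors() {
    grep -E '^[0-9-]+ [0-9:,+-]+ ERROR |^\[[^]]+\] E '
}

# Usage while clicking Remove in the UI:
#   tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | only_errors
```

Info-level lines such as "Received cli list req" are dropped, so anything that does appear is worth pasting into a reply.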

The log you pasted in the previous mail contains only info-level messages, and I could not get any details from it on why the volume delete is failing.

________________________________
From: Precht, Andrew
Sent: Tuesday, April 11, 2017 2:39:31 PM
To: knarra; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik; Nir Soffer
Cc: users
Subject: Re: [ovirt-users] I’m having trouble deleting a test gluster volume

The plot thickens…
I put all hosts in the cluster into maintenance mode, with the Stop Gluster service checkbox checked. I then deleted the /var/lib/glusterd/vols/test1 directory on all hosts. I then took the host that the test1 volume was on out of maintenance mode. Then I tried to remove the test1 volume from within the web UI. With no luck, I got the message: Could not delete Gluster Volume test1 on cluster 8000-1.

I went back and checked all hosts for the test1 directory; it is not on any host. Yet I still can’t remove it…

Any suggestions?

________________________________
From: Precht, Andrew
Sent: Tuesday, April 11, 2017 1:15:22 PM
To: knarra; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik; Nir Soffer
Cc: users
Subject: Re: [ovirt-users] I’m having trouble deleting a test gluster volume

Here is an update…

I checked the /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on the node that had the trouble volume (test1). I didn’t see any errors. So, I ran a tail -f on the log as I tried to remove the volume using the web UI. Here is what was appended:

[2017-04-11 19:48:40.756360] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-04-11 19:48:42.238840] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
The message "I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 6 times between [2017-04-11 19:48:40.756360] and [2017-04-11 19:49:32.596536]
The message "I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 20 times between [2017-04-11 19:48:42.238840] and [2017-04-11 19:49:34.082179]
[2017-04-11 19:51:41.556077] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req

I’m seeing that the timestamps on these log entries do not match the time on the node.
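
One likely explanation, offered as an assumption rather than something confirmed in this thread: GlusterFS normally writes its log timestamps in UTC, while the node's clock displays local time. The sketch below converts a UTC log timestamp to a local zone. It needs GNU date, the function name utc_to_local is hypothetical, and the default zone (US Pacific, written as a POSIX TZ string) is a guess that should be replaced with the node's own.

```shell
#!/bin/sh
# Convert a UTC timestamp copied from a gluster log into another time zone.
# Requires GNU date (for -d). The default TZ string describes US Pacific
# time and is an assumption; pass the node's zone as the second argument.
utc_to_local() {
    TZ="${2:-PST8PDT,M3.2.0,M11.1.0}" date -d "$1 UTC" '+%Y-%m-%d %H:%M:%S %Z'
}

# Example:
#   utc_to_local '2017-04-11 19:48:40'
#   prints: 2017-04-11 12:48:40 PDT
```

If the converted time matches the node's clock at the moment of the test, the logs and the clock agree and the offset is only a time zone difference.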

The next steps
I stopped the glusterd service on the node with volume test1
I deleted it with:  rm -rf /var/lib/glusterd/vols/test1
I started the glusterd service.

After starting the gluster service back up, the directory /var/lib/glusterd/vols/test1 reappears.
I’m guessing syncing with the other nodes?
Is this because I have the Volume Option: auth allow * set?
Do I need to remove the directory /var/lib/glusterd/vols/test1 on all nodes in the cluster individually?

thanks

________________________________
From: knarra <knarra at redhat.com>
Sent: Tuesday, April 11, 2017 11:51:18 AM
To: Precht, Andrew; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik; Nir Soffer
Cc: users
Subject: Re: [ovirt-users] I’m having trouble deleting a test gluster volume

On 04/11/2017 11:28 PM, Precht, Andrew wrote:
Hi all,
The node is oVirt Node 4.1.1 with glusterfs-3.8.10-1.el7.
On the node I cannot find /var/log/glusterfs/glusterd.log. However, there is a /var/log/glusterfs/glustershd.log
Can you check whether /var/log/glusterfs/etc-glusterfs-glusterd.vol.log exists? If yes, can you check whether there are any errors in that file?

What happens if I follow the four steps outlined here to remove the volume from the node, but I do have another volume present in the cluster? It too is a test volume. Neither one has any data on it, so data loss is not an issue.
Running those four steps will remove the volume from your cluster. If the volumes you have are test volumes, you could just follow the steps outlined to delete them (since you are not able to delete them from the UI) and bring the cluster back to a normal state.

________________________________
From: knarra <knarra at redhat.com>
Sent: Tuesday, April 11, 2017 10:32:27 AM
To: Sandro Bonazzola; Precht, Andrew; Sahina Bose; Tal Nisan; Allon Mureinik; Nir Soffer
Cc: users
Subject: Re: [ovirt-users] I’m having trouble deleting a test gluster volume

On 04/11/2017 10:44 PM, Sandro Bonazzola wrote:
Adding some people

On 11 Apr 2017 19:06, "Precht, Andrew" <Andrew.Precht at sjlibrary.org> wrote:
Hi oVirt users,
I’m a newbie to oVirt and I’m having trouble deleting a test gluster volume. The nodes are 4.1.1 and the engine is 4.1.0

When I try to remove the test volume, I click Remove, the dialog box prompting to confirm the deletion pops up and after I click OK, the dialog box changes to show a little spinning wheel and then it disappears. In the end the volume is still there.
With the latest versions of glusterfs and oVirt we do not see any issue with deleting a volume. Can you please check the /var/log/glusterfs/glusterd.log file for errors?


The test volume was distributed with two host members. One of the hosts I was able to remove from the volume by removing the host from the cluster. When I try to remove the remaining host in the volume, even with the “Force Remove” box ticked, I get this response: Cannot remove Host. Server having Gluster volume.

What to try next?
Since you have already removed the volume from one host in the cluster and you still see it on another host, you can do the following to remove the volume from the other host.

1) Log in to the host where the volume is present.
2) cd to /var/lib/glusterd/vols
3) rm -rf <vol_name>
4) Restart glusterd on that host.

Before doing the above, make sure that you do not have any other volume present in the cluster.

The above steps should not be run on a production system, as you might lose the volume and its data.

Now removing the host from the UI should succeed.
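
The four steps and the two cautions above might be sketched as a shell function like the one below. The function name, the extra arguments, and moving the directory aside instead of deleting it are all assumptions made so the step can be undone. The "no other volume" check only looks at the local node, not the whole cluster.

```shell
#!/bin/sh
# Sketch: take one volume's definition out of glusterd's vols directory.
# Moving it to a backup directory (instead of rm -rf, as in the steps above)
# is an assumption added so the change can be reverted.
remove_volume_definition() {
    vol="$1"
    vols_dir="${2:-/var/lib/glusterd/vols}"
    backup_dir="${3:-/var/tmp}"
    [ -n "$vol" ] || { echo "usage: remove_volume_definition VOLNAME [VOLS_DIR] [BACKUP_DIR]" >&2; return 2; }
    # Refuse to continue if any other volume is defined on this node.
    others=$(ls -A "$vols_dir" | grep -vxF -- "$vol")
    if [ -n "$others" ]; then
        echo "refusing: other volumes are defined here: $others" >&2
        return 1
    fi
    [ -d "$vols_dir/$vol" ] || { echo "no such volume directory: $vol" >&2; return 1; }
    mv "$vols_dir/$vol" "$backup_dir/$vol.removed.$(date +%s)"
}

# Usage on the affected host, as root (test systems only):
#   remove_volume_definition test1 && systemctl restart glusterd
```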


P.S. I’ve tried to join this user group several times in the past, with no response.
Is it possible for me to join this group?

Regards,
Andrew




_______________________________________________
Users mailing list
Users at ovirt.org
http://lists.ovirt.org/mailman/listinfo/users




