sbonazzo created OVIRT-1323:
-------------------------------
Summary: Re: [ovirt-users] I’m having trouble deleting a test gluster volume
Key: OVIRT-1323
URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1323
Project: oVirt - virtualization made easy
Issue Type: By-EMAIL
Reporter: sbonazzo
Assignee: infra
On Wed, Apr 12, 2017 at 11:05 PM, Precht, Andrew <Andrew.Precht(a)sjlibrary.org> wrote:
Hi all,
In the end, I ran this on each host node, and it is what worked:
systemctl stop glusterd && rm -rf /var/lib/glusterd/vols/* && rm -rf /var/lib/glusterd/peers/*
Thanks so much for your help.
P.S. I work as a sys admin for the San Jose library. Part of my job
satisfaction comes from knowing that the work I do here goes directly back
into this community. We’re fortunate that you, your coworkers, and Red Hat
do so much to give back. I have to imagine you too feel this sense of
satisfaction. Thanks again…
P.P.S. I never did hear back from the users(a)ovirt.org mailing list. I did
fill out the fields on this page:
https://lists.ovirt.org/mailman/listinfo/users
Yet, every time I send them an email I get: Your
message to Users awaits moderator approval. Is there a secret handshake
I’m not aware of?
Opening a ticket on infra to check your account on users mailing list.
Regards,
Andrew
------------------------------
*From:* knarra <knarra(a)redhat.com>
*Sent:* Wednesday, April 12, 2017 10:01:33 AM
*To:* Precht, Andrew; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon
Mureinik; Nir Soffer
*Cc:* users
*Subject:* Re: [ovirt-users] I’m having trouble deleting a test gluster
volume
On 04/12/2017 08:45 PM, Precht, Andrew wrote:
Hi all,
You asked: Any errors in ovirt-engine.log file ?
Yes, In the engine.log this error is repeated about every 3 minutes:
2017-04-12 07:16:12,554-07 ERROR [org.ovirt.engine.core.bll.gluster.GlusterTasksSyncJob]
(DefaultQuartzScheduler3) [ccc8ed0d-8b91-4397-b6b9-ab0f77c5f7b8] Error
updating tasks from CLI: org.ovirt.engine.core.common.errors.EngineException:
EngineException: Command execution failed error: Error : Request timed out return
code: 1 (Failed with error GlusterVolumeStatusAllFailedException and code
4161) error: Error : Request timed out
I am not sure why this says "Request timed out".
1) gluster volume list -> Still shows the deleted volume (test1)
2) gluster peer status -> Shows one of the peers twice with different
uuid’s:
Hostname: 192.168.10.109
Uuid: 42fbb7de-8e6f-4159-a601-3f858fa65f6c
State: Peer in Cluster (Connected)
Hostname: 192.168.10.109
Uuid: e058babe-7f9d-49fe-a3ea-ccdc98d7e5b5
State: Peer in Cluster (Connected)
How did this happen? Is the hostname the same for two hosts?
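A quick way to spot a peer listed twice, as in the output above, is to count hostnames in the peer status text. This is only a sketch: find_duplicate_peers is a name made up for this example, and it assumes the usual layout where each peer's hostname is on its own "Hostname:" line.

```shell
# Print every hostname that appears more than once in `gluster peer status`
# output. Reads the status text on stdin, so it also works on saved output.
# find_duplicate_peers is a hypothetical helper name, not a gluster command.
find_duplicate_peers() {
    awk '/^Hostname:/ {print $2}' | sort | uniq -d
}

# Typical use on a gluster node (requires the gluster CLI):
#   gluster peer status | find_duplicate_peers
```

An empty result means no hostname is repeated. It does not explain why a duplicate exists; it only flags it.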
I tried a gluster volume stop test1, with this result: volume stop:
test1: failed: Another transaction is in progress for test1. Please try
again after sometime.
Can you restart glusterd and try to stop and delete the volume?
The etc-glusterfs-glusterd.vol.log shows no activity triggered by trying
to remove the test1 volume from the UI.
The ovirt-engine.log shows this repeating many times, when trying to
remove the test1 volume from the UI:
2017-04-12 07:57:38,049-07 INFO [org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler9) [ccc8ed0d-8b91-4397-b6b9-ab0f77c5f7b8] Failed
to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[
b0e1b909-9a6a-49dc-8e20-3a027218f7e1=<GLUSTER,
ACTION_TYPE_FAILED_GLUSTER_OPERATION_INPROGRESS>]',
sharedLocks='null'}'
Can you restart the ovirt-engine service? I see "failed to acquire
lock" in the log. Once ovirt-engine is restarted, whoever is holding the lock
should release it and things should work fine.
Last but not least, if none of the above works:
Log in to all the nodes in the cluster.
rm -rf /var/lib/glusterd/vols/*
rm -rf /var/lib/glusterd/peers/*
Run systemctl restart glusterd on all the nodes.
Log in to the UI and see if any volumes / hosts are present. If yes, remove
them.
This should clear things up for you, and you can start from scratch.
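The last-resort reset above could be scripted roughly as follows, to be run on every node. This is a sketch under assumptions, not a tested procedure: DRY_RUN, GLUSTERD_DIR, run and reset_glusterd_state are names invented for this example, and it stops glusterd before deleting rather than restarting afterwards. It wipes all volume and peer definitions, so it only suits a test cluster.

```shell
# DESTRUCTIVE sketch: clears glusterd's volume and peer definitions on one node.
# DRY_RUN and GLUSTERD_DIR are hypothetical knobs for this example only.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
GLUSTERD_DIR="${GLUSTERD_DIR:-/var/lib/glusterd}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_glusterd_state() {
    run systemctl stop glusterd
    # Delete the contents of vols/ and peers/ but keep the directories.
    run find "$GLUSTERD_DIR/vols" "$GLUSTERD_DIR/peers" -mindepth 1 -delete
    run systemctl start glusterd
}

# Preview:  reset_glusterd_state
# Apply:    DRY_RUN=0; reset_glusterd_state   (as root, on each node)
```

Previewing first makes it easy to confirm the paths before anything is removed.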
Thanks much,
Andrew
------------------------------
*From:* knarra <knarra(a)redhat.com>
*Sent:* Tuesday, April 11, 2017 11:10:04 PM
*To:* Precht, Andrew; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon
Mureinik; Nir Soffer
*Cc:* users
*Subject:* Re: [ovirt-users] I’m having trouble deleting a test gluster
volume
On 04/12/2017 03:35 AM, Precht, Andrew wrote:
I just noticed this in the Alerts tab: Detected deletion of volume test1
on cluster 8000-1, and deleted it from engine DB.
Yet it still shows in the web UI?
Any errors in the ovirt-engine.log file? If the volume is deleted from the DB,
ideally it should be deleted from the UI too. Can you go to the gluster nodes and
check the following:
1) gluster volume list -> should not return anything, since you have
deleted the volumes.
2) gluster peer status -> on all the nodes, should show that all the peers
are in the connected state.
Can you tail -f /var/log/ovirt-engine/ovirt-engine.log and the gluster log,
and capture the error messages when you try deleting the volume from the UI?
The log you pasted in the previous mail only gives info messages, and I could
not get any details from it on why the volume delete is failing.
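To capture only the error messages while retrying the delete, the engine log can be filtered as it is tailed. This is a minimal sketch: gluster_errors is a name made up for this example, and it assumes the log level is the third whitespace-separated field, as in the engine log lines quoted in this thread.

```shell
# Keep only ERROR-level engine log lines that mention gluster.
# gluster_errors is a hypothetical helper; it reads log text on stdin.
# Assumes the layout "<date> <time> <LEVEL> [class] ..." seen in this thread.
gluster_errors() {
    awk '$3 == "ERROR" && /[Gg]luster/'
}

# Typical use on the engine host while retrying the delete from the UI.
# The path is the one given in this thread; adjust it if your log file
# has a different name:
#   tail -f /var/log/ovirt-engine/ovirt-engine.log | gluster_errors
```

INFO lines such as the lock manager messages are dropped, so whatever remains is worth pasting into a reply.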
------------------------------
*From:* Precht, Andrew
*Sent:* Tuesday, April 11, 2017 2:39:31 PM
*To:* knarra; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik;
Nir Soffer
*Cc:* users
*Subject:* Re: [ovirt-users] I’m having trouble deleting a test gluster
volume
The plot thickens…
I put all hosts in the cluster into maintenance mode, with the Stop
Gluster service checkbox checked. I then deleted the
/var/lib/glusterd/vols/test1 directory on all hosts. I then took the host
that the test1 volume was on out of maintenance mode. Then I tried to
remove the test1 volume from within the web UI. With no luck, I got the
message: Could not delete Gluster Volume test1 on cluster 8000-1.
I went back and checked all hosts for the test1 directory; it is not on any
host. Yet I still can’t remove it…
Any suggestions?
------------------------------
*From:* Precht, Andrew
*Sent:* Tuesday, April 11, 2017 1:15:22 PM
*To:* knarra; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon Mureinik;
Nir Soffer
*Cc:* users
*Subject:* Re: [ovirt-users] I’m having trouble deleting a test gluster
volume
Here is an update…
I checked the /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on the
node that had the trouble volume (test1). I didn’t see any errors. So, I
ran a tail -f on the log as I tried to remove the volume using the web UI.
here is what was appended:
[2017-04-11 19:48:40.756360] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-04-11 19:48:42.238840] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
The message "I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 6 times between [2017-04-11 19:48:40.756360] and [2017-04-11 19:49:32.596536]
The message "I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 20 times between [2017-04-11 19:48:42.238840] and [2017-04-11 19:49:34.082179]
[2017-04-11 19:51:41.556077] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
I’m seeing that the timestamps on these log entries do not match the time
on the node.
The next steps
I stopped the glusterd service on the node with volume test1
I deleted it with: rm -rf /var/lib/glusterd/vols/test1
I started the glusterd service.
After starting the gluster service back up, the directory
/var/lib/glusterd/vols/test1 reappears.
I’m guessing syncing with the other nodes?
Is this because I have the Volume Option: auth allow * ?
Do I need to remove the directory /var/lib/glusterd/vols/test1 on all
nodes in the cluster individually?
thanks
------------------------------
*From:* knarra <knarra(a)redhat.com>
*Sent:* Tuesday, April 11, 2017 11:51:18 AM
*To:* Precht, Andrew; Sandro Bonazzola; Sahina Bose; Tal Nisan; Allon
Mureinik; Nir Soffer
*Cc:* users
*Subject:* Re: [ovirt-users] I’m having trouble deleting a test gluster
volume
On 04/11/2017 11:28 PM, Precht, Andrew wrote:
Hi all,
The node is oVirt Node 4.1.1 with glusterfs-3.8.10-1.el7.
On the node I cannot find /var/log/glusterfs/glusterd.log. However, there
is a /var/log/glusterfs/glustershd.log
Can you check if /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
exists? If yes, can you check if there is any error present in that file?
What happens if I follow the four steps outlined here to remove the volume
from the node *BUT*, I do have another volume present in the cluster. It
too is a test volume. Neither one has any data on them. So, data loss is
not an issue.
Running those four steps will remove the volume from your cluster. If the
volumes you have are test volumes, you could just follow the steps
outlined to delete them (since you are not able to delete from the UI) and
bring the cluster back to a normal state.
------------------------------
*From:* knarra <knarra(a)redhat.com>
*Sent:* Tuesday, April 11, 2017 10:32:27 AM
*To:* Sandro Bonazzola; Precht, Andrew; Sahina Bose; Tal Nisan; Allon
Mureinik; Nir Soffer
*Cc:* users
*Subject:* Re: [ovirt-users] I’m having trouble deleting a test gluster
volume
On 04/11/2017 10:44 PM, Sandro Bonazzola wrote:
Adding some people
On 11/Apr/2017 19:06, "Precht, Andrew" <Andrew.Precht(a)sjlibrary.org> wrote:
> Hi Ovirt users,
> I’m a newbie to oVirt and I’m having trouble deleting a test gluster
> volume. The nodes are 4.1.1 and the engine is 4.1.0
>
> When I try to remove the test volume, I click Remove, the dialog box
> prompting to confirm the deletion pops up and after I click OK, the dialog
> box changes to show a little spinning wheel and then it disappears. In the
> end the volume is still there.
>
With the latest version of glusterfs & ovirt we do not see any issue with
deleting a volume. Can you please check the /var/log/glusterfs/glusterd.log
file to see if there is any error present?
> The test volume was distributed with two host members. One of the hosts I
> was able to remove from the volume by removing the host from the cluster.
> When I try to remove the remaining host in the volume, even with the “Force
> Remove” box ticked, I get this response: Cannot remove Host. Server having
> Gluster volume.
>
> What to try next?
>
Since you have already removed the volume from one host in the cluster and
you still see it on another host, you can do the following to remove the
volume from the other host.
1) Log in to the host where the volume is present.
2) cd to /var/lib/glusterd/vols
3) rm -rf <vol_name>
4) Restart glusterd on that host.
Before doing the above, make sure that you do not have any other volume
present in the cluster.
The above steps should not be run on a production system, as you might lose
the volume and its data.
Now removing the host from the UI should succeed.
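Since step 3 is an rm -rf on a name typed by hand, a small guard can help avoid deleting the wrong path. This is a sketch only: vol_dir and GLUSTERD_VOLS are names invented for this example, and the default path is the one used in the steps above.

```shell
# Print the directory holding glusterd's definition of one volume, refusing
# names that could make a later `rm -rf` hit the wrong path.
# vol_dir and GLUSTERD_VOLS are hypothetical names for this sketch.
vol_dir() {
    case "$1" in
        ''|.|..|*/*)
            echo "refusing volume name: '$1'" >&2
            return 1
            ;;
    esac
    echo "${GLUSTERD_VOLS:-/var/lib/glusterd/vols}/$1"
}

# Usage on the affected host (test systems only, as root):
#   dir=$(vol_dir test1) && systemctl stop glusterd \
#       && rm -rf "$dir" && systemctl start glusterd
```

An empty or path-like name stops the chain before rm runs, because the assignment takes the exit status of vol_dir.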
> P.S. I’ve tried to join this user group several times in the past, with
> no response.
> Is it possible for me to join this group?
>
> Regards,
> Andrew
>
>
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>