Re: [ovirt-users] Gluster setup disappears any chance to recover?

1 Mar 2017


OK, I finally have a single brick up and running, so I can access the data.
Now the question is: do we need to start the glusterd daemon at boot
ourselves, or is it managed by vdsmd?


On Wed, Mar 1, 2017 at 2:36 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:
...
All folders under /var/lib/glusterd/vols/ are empty.
In the shell history of one of the servers I found the command that was
used to create the volume:
gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA 10.10.10.44:/zclei22/01/glu 10.10.10.42:/zclei21/01/glu 10.10.10.41:/zclei26/01/glu
But when I run this command again, it fails with:
volume create: GluReplica: failed: /zclei22/01/glu is already part of a volume
Any chance to force it?
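For context on the "already part of a volume" error: GlusterFS marks a brick directory with extended attributes such as trusted.glusterfs.volume-id, and a directory that still carries that attribute is refused as a new brick. The attributes can be listed as root with `getfattr -d -m . -e hex <brick path>`. The snippet below is only a sketch that extracts the volume id from such output; the function name and the sample output (including the hex values) are made up for illustration and are not taken from this system.

```python
import re
import uuid


def volume_id_from_getfattr(output):
    """Return the volume UUID found in `getfattr -d -m . -e hex` output, or None."""
    match = re.search(
        r"^trusted\.glusterfs\.volume-id=0x([0-9a-fA-F]{32})$",
        output,
        re.MULTILINE,
    )
    return uuid.UUID(hex=match.group(1)) if match else None


# Hypothetical getfattr output; the hex values are placeholders.
sample = """# file: zclei22/01/glu
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x0123456789abcdef0123456789abcdef
"""

found = volume_id_from_getfattr(sample)
missing = volume_id_from_getfattr("# file: some/empty/dir\n")
print(found, missing)
```

If the attribute is present, the brick still belongs to a previously defined volume as far as GlusterFS is concerned, which matches the error text quoted above.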
On Wed, Mar 1, 2017 at 12:13 PM, Ramesh Nachimuthu <rnachimu@redhat.com>
wrote:
...
...
From: "Arman Khalatyan" <arm2arm@gmail.com>
To: "users" <users@ovirt.org>
Sent: Wednesday, March 1, 2017 3:10:38 PM
Subject: Re: [ovirt-users] Gluster setup disappears any chance to
recover?
The engine throws the following errors:
2017-03-01 10:39:59,608+01 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [d7f7d83] EVENT_ID: GLUSTER_VOLUME_DELETED_FROM_CLI(4,027), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Detected deletion of volume GluReplica on cluster HaGLU, and deleted it from engine DB.
2017-03-01 10:39:59,610+01 ERROR [org.ovirt.engine.core.bll.gluster.GlusterSyncJob] (DefaultQuartzScheduler6) [d7f7d83] Error while removing volumes from database!: org.springframework.dao.DataIntegrityViolationException: CallableStatementCallback; SQL [{call deleteglustervolumesbyguids(?)}]; ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
Where: SQL statement "DELETE
FROM gluster_volumes
WHERE id IN (
SELECT *
FROM fnSplitterUuid(v_volume_ids)
)"
PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
Where: SQL statement "DELETE
FROM gluster_volumes
WHERE id IN (
SELECT *
FROM fnSplitterUuid(v_volume_ids)
)"
PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:243) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1094) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:405) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:365) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.2.4.RELEASE]
at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:130) [dal.jar:]
at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:76) [dal.jar:]
at org.ovirt.engine.core.dao.gluster.GlusterVolumeDaoImpl.removeAll(GlusterVolumeDaoImpl.java:233) [dal.jar:]
at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.removeDeletedVolumes(GlusterSyncJob.java:521) [bll.jar:]
at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshVolumeData(GlusterSyncJob.java:465) [bll.jar:]
at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshClusterData(GlusterSyncJob.java:133) [bll.jar:]
at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshLightWeightData(GlusterSyncJob.java:111) [bll.jar:]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_121]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_121]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]
Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
Where: SQL statement "DELETE
FROM gluster_volumes
WHERE id IN (
SELECT *
FROM fnSplitterUuid(v_volume_ids)
)"
PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:410)
at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:303)
at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:442)
at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1133) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1078) [spring-jdbc.jar:4.2.4.RELEASE]
... 24 more
This is a side effect of the volume deletion on the Gluster side. It looks
like you have storage domains that were created using those volumes.
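The foreign key error in the log can be reproduced in miniature. The sketch below uses SQLite instead of the engine's PostgreSQL database, and a heavily simplified two-table schema: only the table names come from the log, while the column names and row ids are assumptions made for illustration. It shows that a volume row cannot be deleted while a storage connection row still points at it.

```python
import sqlite3


def delete_volume(conn, volume_id):
    """Try to delete a volume row; return False if a foreign key blocks it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM gluster_volumes WHERE id = ?", (volume_id,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE gluster_volumes (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE storage_server_connections ("
    "id TEXT PRIMARY KEY, "
    "gluster_volume_id TEXT REFERENCES gluster_volumes(id))"  # assumed column name
)
conn.execute("INSERT INTO gluster_volumes VALUES ('vol-1')")
conn.execute("INSERT INTO storage_server_connections VALUES ('conn-1', 'vol-1')")
conn.commit()

# Blocked: a storage connection still references the volume.
blocked = delete_volume(conn, "vol-1")

# Once the referencing row is gone, the same delete goes through.
with conn:
    conn.execute("DELETE FROM storage_server_connections WHERE id = 'conn-1'")
allowed = delete_volume(conn, "vol-1")
print(blocked, allowed)
```

This only illustrates the constraint. It is not a suggestion to edit the engine database by hand.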
...
On Wed, Mar 1, 2017 at 9:49 AM, Arman Khalatyan < arm2arm@gmail.com >
wrote:
Hi,
I just tested a power cut on the test system:
a cluster of 3 hosts, each with a 4TB local disk with ZFS on it and the
/zhost/01/glu folder as a brick.
GlusterFS was replicated across the 3 disks with an arbiter. So far so good.
A VM was up and running with a 50GB OS disk; dd was showing 70-100MB/s on
the VM disk.
I then simulated a disaster power cut by power-cycling all 3 hosts at the
same time via IPMI.
The result: all hosts are green, up and running, but the bricks are down.
In the process list I can see:
ps aux | grep gluster
root 16156 0.8 0.0 475360 16964 ? Ssl 08:47 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
What happened to my volume setup?
Is it possible to recover it?
[root@clei21 ~]# gluster peer status
Number of Peers: 2
Hostname: clei22.cls
Uuid: 96b52c7e-3526-44fd-af80-14a3073ebac2
State: Peer in Cluster (Connected)
Other names:
192.168.101.40
10.10.10.44
Hostname: clei26.cls
Uuid: c9fab907-5053-41a8-a1fa-d069f34e42dc
State: Peer in Cluster (Connected)
Other names:
10.10.10.41
[root@clei21 ~]# gluster volume info
No volumes present
[root@clei21 ~]#
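When comparing several hosts, the peer listing above is easier to check as structured data. The following is a rough sketch of a parser for `gluster peer status` text; the function name and dictionary keys are invented for illustration, and the sample is the output quoted above.

```python
def parse_peer_status(text):
    """Parse `gluster peer status` output into a list of peer dictionaries."""
    peers = []
    current = None
    in_other_names = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            current = {"hostname": line.split(":", 1)[1].strip(), "other_names": []}
            peers.append(current)
            in_other_names = False
        elif current is None or not line:
            continue  # header line ("Number of Peers") or blank separator
        elif line.startswith("Uuid:"):
            current["uuid"] = line.split(":", 1)[1].strip()
        elif line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif line.startswith("Other names:"):
            in_other_names = True
        elif in_other_names:
            current["other_names"].append(line)
    return peers


sample = """Number of Peers: 2
Hostname: clei22.cls
Uuid: 96b52c7e-3526-44fd-af80-14a3073ebac2
State: Peer in Cluster (Connected)
Other names:
192.168.101.40
10.10.10.44
Hostname: clei26.cls
Uuid: c9fab907-5053-41a8-a1fa-d069f34e42dc
State: Peer in Cluster (Connected)
Other names:
10.10.10.41
"""

peers = parse_peer_status(sample)
for peer in peers:
    print(peer["hostname"], peer["state"], peer["other_names"])
```

Here both peers report "Peer in Cluster (Connected)", so the peer membership survived the power cut even though the volume definitions did not.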
I am not sure why all the volumes are getting deleted after a reboot. Do you
see any vol files under the directory /var/lib/glusterd/vols/? Also,
/var/log/glusterfs/cmd_history.log should have all the gluster commands
that were executed.
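The two checks suggested above can be scripted. The sketch below filters command-history lines for volume-level operations by plain substring match, so it does not depend on the exact log layout. The helper name, the keyword list, and the sample lines (including their timestamp format) are assumptions, not excerpts from this system.

```python
def volume_commands(lines, keywords=("volume create", "volume delete", "volume stop")):
    """Return the history lines that mention one of the given gluster commands."""
    return [line.rstrip("\n") for line in lines if any(k in line for k in keywords)]


# Hypothetical history lines; the real file is /var/log/glusterfs/cmd_history.log.
sample_history = [
    "[2017-02-20 09:00:00.000000]  : volume create GluReplica replica 3 arbiter 1 : SUCCESS\n",
    "[2017-02-20 09:01:00.000000]  : volume start GluReplica : SUCCESS\n",
    "[2017-03-01 07:50:00.000000]  : peer status : SUCCESS\n",
]

hits = volume_commands(sample_history)
for hit in hits:
    print(hit)
```

On a real host the same function could be fed the open log file, for example `volume_commands(open("/var/log/glusterfs/cmd_history.log"))`, and `os.listdir("/var/lib/glusterd/vols")` would show whether any volume definitions remain on disk.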
Regards,
Ramesh
...
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users