[ovirt-users] Gluster setup disappears any chance to recover?

Ramesh Nachimuthu rnachimu at redhat.com
Thu Mar 2 04:34:46 UTC 2017





----- Original Message -----
> From: "Arman Khalatyan" <arm2arm at gmail.com>
> To: "Ramesh Nachimuthu" <rnachimu at redhat.com>
> Cc: "users" <users at ovirt.org>, "Sahina Bose" <sabose at redhat.com>
> Sent: Wednesday, March 1, 2017 11:22:32 PM
> Subject: Re: [ovirt-users] Gluster setup disappears any chance to recover?
> 
> ok, I will answer myself:
> yes, the gluster daemon is managed by vdsmd :)
> and to recover the lost config one simply has to add the "force" keyword:
> 
> gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA \
>     10.10.10.44:/zclei22/01/glu \
>     10.10.10.42:/zclei21/01/glu \
>     10.10.10.41:/zclei26/01/glu \
>     force
> 
> Now everything is up and running!
> One annoying thing is the epel dependency of the zfs packages, which conflicts
> with ovirt: every time one needs to enable and then disable epel.
> 
> 
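> A way around the constant toggling would be to enable epel for a single yum
> transaction only, e.g. (assuming the repo id is "epel" and <package> stands
> for whatever pulls in the dependency):
> 
>     yum --enablerepo=epel install <package>
> 
> With enabled=0 in the epel .repo file, this uses epel just for that one
> install and leaves the repo disabled otherwise.
> 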

The glusterd service will be started when you add or activate the host in oVirt, and it will be configured to start on every reboot.
Volumes disappearing seems to be a serious issue; we have never seen it with the XFS file system. Are you able to reproduce this issue consistently?
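
To verify that glusterd is set up this way on a host, a quick check (assuming systemd) would be:

    systemctl is-enabled glusterd
    systemctl status glusterd

Both should report the service as enabled and active once the host has been activated in oVirt.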

Regards,
Ramesh

> 
> On Wed, Mar 1, 2017 at 5:33 PM, Arman Khalatyan <arm2arm at gmail.com> wrote:
> 
> > ok, finally I got a single brick up and running, so I can access the data.
> > Now the question is: do we need to run the glusterd daemon on startup, or is
> > it managed by vdsmd?
> >
> >
> > On Wed, Mar 1, 2017 at 2:36 PM, Arman Khalatyan <arm2arm at gmail.com> wrote:
> >
> >> All folders under /var/lib/glusterd/vols/ are empty.
> >> In the shell history on one of the servers I found the command that
> >> created the volume:
> >>
> >> gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA \
> >>     10.10.10.44:/zclei22/01/glu \
> >>     10.10.10.42:/zclei21/01/glu \
> >>     10.10.10.41:/zclei26/01/glu
> >>
> >> But executing this command fails with:
> >>
> >> volume create: GluReplica: failed: /zclei22/01/glu is already part of a volume
> >>
> >> Any chance to force it?
> >>
> >>
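> >> For reference, besides "force", the usual way to make a brick path that was
> >> part of an earlier volume reusable is to clear the gluster metadata on it,
> >> e.g. (a destructive sketch, assuming the brick path /zclei22/01/glu; repeat
> >> for each brick, and only after the data has been backed up):
> >>
> >>     setfattr -x trusted.glusterfs.volume-id /zclei22/01/glu
> >>     setfattr -x trusted.gfid /zclei22/01/glu
> >>     rm -rf /zclei22/01/glu/.glusterfs
> >>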
> >>
> >> On Wed, Mar 1, 2017 at 12:13 PM, Ramesh Nachimuthu <rnachimu at redhat.com>
> >> wrote:
> >>
> >>>
> >>>
> >>>
> >>>
> >>> ----- Original Message -----
> >>> > From: "Arman Khalatyan" <arm2arm at gmail.com>
> >>> > To: "users" <users at ovirt.org>
> >>> > Sent: Wednesday, March 1, 2017 3:10:38 PM
> >>> > Subject: Re: [ovirt-users] Gluster setup disappears any chance to
> >>> recover?
> >>> >
> >>> > The engine throws the following errors:
> >>> > 2017-03-01 10:39:59,608+01 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [d7f7d83] EVENT_ID: GLUSTER_VOLUME_DELETED_FROM_CLI(4,027), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Detected deletion of volume GluReplica on cluster HaGLU, and deleted it from engine DB.
> >>> > 2017-03-01 10:39:59,610+01 ERROR [org.ovirt.engine.core.bll.gluster.GlusterSyncJob] (DefaultQuartzScheduler6) [d7f7d83] Error while removing volumes from database!:
> >>> > org.springframework.dao.DataIntegrityViolationException: CallableStatementCallback; SQL [{call deleteglustervolumesbyguids(?)}];
> >>> > ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
> >>> >   Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
> >>> >   Where: SQL statement "DELETE
> >>> >   FROM gluster_volumes
> >>> >   WHERE id IN (
> >>> >       SELECT *
> >>> >       FROM fnSplitterUuid(v_volume_ids)
> >>> >   )"
> >>> >   PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException:
> >>> > ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
> >>> >   Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
> >>> >   Where: SQL statement "DELETE
> >>> >   FROM gluster_volumes
> >>> >   WHERE id IN (
> >>> >       SELECT *
> >>> >       FROM fnSplitterUuid(v_volume_ids)
> >>> >   )"
> >>> >   PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
> >>> >     at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:243) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1094) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:405) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:365) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
> >>> >     at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:130) [dal.jar:]
> >>> >     at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:76) [dal.jar:]
> >>> >     at org.ovirt.engine.core.dao.gluster.GlusterVolumeDaoImpl.removeAll(GlusterVolumeDaoImpl.java:233) [dal.jar:]
> >>> >     at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.removeDeletedVolumes(GlusterSyncJob.java:521) [bll.jar:]
> >>> >     at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshVolumeData(GlusterSyncJob.java:465) [bll.jar:]
> >>> >     at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshClusterData(GlusterSyncJob.java:133) [bll.jar:]
> >>> >     at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshLightWeightData(GlusterSyncJob.java:111) [bll.jar:]
> >>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_121]
> >>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_121]
> >>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
> >>> >     at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
> >>> >     at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
> >>> >     at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
> >>> >     at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
> >>> >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121]
> >>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
> >>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
> >>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
> >>> >     at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]
> >>> > Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
> >>> >   Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
> >>> >   Where: SQL statement "DELETE
> >>> >   FROM gluster_volumes
> >>> >   WHERE id IN (
> >>> >       SELECT *
> >>> >       FROM fnSplitterUuid(v_volume_ids)
> >>> >   )"
> >>> >   PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
> >>> >     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
> >>> >     at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
> >>> >     at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
> >>> >     at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
> >>> >     at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
> >>> >     at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:410)
> >>> >     at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:303)
> >>> >     at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:442)
> >>> >     at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1133) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1078) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> >     ... 24 more
> >>> >
> >>> >
> >>> >
> >>>
> >>> This is a side effect of the volume deletion on the gluster side: it looks
> >>> like you have storage domains created on top of those volumes, so the
> >>> engine cannot remove the volume from its database while rows in
> >>> storage_server_connections still reference it.
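> >>>
> >>> For illustration, the dangling reference could be inspected in the engine
> >>> database before any cleanup, e.g. (a sketch, run as the postgres user; the
> >>> referencing column is assumed here to be gluster_volume_id, which may
> >>> differ between oVirt versions):
> >>>
> >>>     psql engine -c "SELECT id, connection FROM storage_server_connections WHERE gluster_volume_id = '3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d';"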
> >>>
> >>> > On Wed, Mar 1, 2017 at 9:49 AM, Arman Khalatyan < arm2arm at gmail.com >
> >>> wrote:
> >>> >
> >>> >
> >>> >
> >>> > Hi,
> >>> > I just tested a power cut on the test system:
> >>> >
> >>> > A cluster with 3 hosts; each host has a 4 TB local disk with ZFS on it
> >>> > and the /zhost/01/glu folder as a brick.
> >>> >
> >>> > GlusterFS was replicated across 3 bricks with an arbiter. So far so good.
> >>> > A VM was up and running with a 50 GB OS disk; dd was showing 70-100 MB/s
> >>> > performance on the VM disk.
> >>> > Then I simulated a disaster power cut: an IPMI power-cycle of all 3 hosts
> >>> > at the same time.
> >>> > The result: all hosts are green, up and running, but the bricks are down.
> >>> > In the process list I can see:
> >>> >
> >>> > ps aux | grep gluster
> >>> > root 16156 0.8 0.0 475360 16964 ? Ssl 08:47 0:00 /usr/sbin/glusterd -p
> >>> > /var/run/glusterd.pid --log-level INFO
> >>> >
> >>> > What happened to my volume setup??
> >>> > Is it possible to recover it??
> >>> > [root at clei21 ~]# gluster peer status
> >>> > Number of Peers: 2
> >>> >
> >>> > Hostname: clei22.cls
> >>> > Uuid: 96b52c7e-3526-44fd-af80-14a3073ebac2
> >>> > State: Peer in Cluster (Connected)
> >>> > Other names:
> >>> > 192.168.101.40
> >>> > 10.10.10.44
> >>> >
> >>> > Hostname: clei26.cls
> >>> > Uuid: c9fab907-5053-41a8-a1fa-d069f34e42dc
> >>> > State: Peer in Cluster (Connected)
> >>> > Other names:
> >>> > 10.10.10.41
> >>> > [root at clei21 ~]# gluster volume info
> >>> > No volumes present
> >>> > [root at clei21 ~]#
> >>>
> >>> I am not sure why all volumes are getting deleted after a reboot. Do you
> >>> see any vol files under the directory /var/lib/glusterd/vols/? Also,
> >>> /var/log/glusterfs/cmd_history.log should contain all the gluster commands
> >>> that were executed.
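> >>>
> >>> For example, a quick check (assuming default paths):
> >>>
> >>>     ls -lR /var/lib/glusterd/vols/
> >>>     grep -i 'volume' /var/log/glusterfs/cmd_history.log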
> >>>
> >>> Regards,
> >>> Ramesh
> >>>
> >>> >
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > Users mailing list
> >>> > Users at ovirt.org
> >>> > http://lists.ovirt.org/mailman/listinfo/users
> >>> >
> >>>
> >>
> >>
> >
> 

