[ovirt-users] Gluster setup disappears any chance to recover?

Arman Khalatyan arm2arm at gmail.com
Wed Mar 1 17:52:32 UTC 2017


ok, I will answer myself:
yes, the gluster daemon is managed by vdsm :)
and to recover the lost config one simply has to add the "force" keyword:
gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA \
    10.10.10.44:/zclei22/01/glu \
    10.10.10.42:/zclei21/01/glu \
    10.10.10.41:/zclei26/01/glu \
    force
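
for the archive: after re-creating the volume with "force" on the original
bricks it may still need to be started before the data is reachable again,
something like:

    gluster volume start GluReplica
    gluster volume info GluReplica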

now everything is up and running!
one annoying thing is the epel dependency of zfs, which conflicts with ovirt...
every time one needs to enable and then disable epel.
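
a possible workaround (just a sketch, assuming zfs only needs epel for its
dependencies such as dkms) is to keep epel disabled in
/etc/yum.repos.d/epel.repo (enabled=0) and enable it per transaction only:

    yum --enablerepo=epel install zfs
    yum --enablerepo=epel update zfs

that way the rest of the system never sees epel and nothing else can
conflict with the ovirt repos.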



On Wed, Mar 1, 2017 at 5:33 PM, Arman Khalatyan <arm2arm at gmail.com> wrote:

> ok, finally I got a single brick up and running, so I can access the data.
> Now the question is: do we need to run the glusterd daemon on startup, or is
> it managed by vdsmd?
>
>
> On Wed, Mar 1, 2017 at 2:36 PM, Arman Khalatyan <arm2arm at gmail.com> wrote:
>
>> all folders under /var/lib/glusterd/vols/ are empty
>> In the history of one of the servers I found the command with which it was
>> created:
>>
>> gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA \
>>     10.10.10.44:/zclei22/01/glu \
>>     10.10.10.42:/zclei21/01/glu \
>>     10.10.10.41:/zclei26/01/glu
>>
>> But when executing this command, it claims that:
>> volume create: GluReplica: failed: /zclei22/01/glu is already part of a
>> volume
>>
>> Any chance to force it?
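>>
>> (a hedged side note for the archive: the "already part of a volume" check
>> can also be cleared by removing the gluster volume-id xattr from the old
>> brick path on each node, e.g.
>>
>>     setfattr -x trusted.glusterfs.volume-id /zclei22/01/glu
>>
>> but since that touches the brick metadata, re-creating the volume with
>> "force" on the original bricks is the less intrusive route when the data
>> should be kept.)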
>>
>>
>>
>> On Wed, Mar 1, 2017 at 12:13 PM, Ramesh Nachimuthu <rnachimu at redhat.com>
>> wrote:
>>
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>> > From: "Arman Khalatyan" <arm2arm at gmail.com>
>>> > To: "users" <users at ovirt.org>
>>> > Sent: Wednesday, March 1, 2017 3:10:38 PM
>>> > Subject: Re: [ovirt-users] Gluster setup disappears any chance to recover?
>>> >
>>> > engine throws the following errors:
>>> > 2017-03-01 10:39:59,608+01 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [d7f7d83] EVENT_ID: GLUSTER_VOLUME_DELETED_FROM_CLI(4,027), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Detected deletion of volume GluReplica on cluster HaGLU, and deleted it from engine DB.
>>> > 2017-03-01 10:39:59,610+01 ERROR [org.ovirt.engine.core.bll.gluster.GlusterSyncJob] (DefaultQuartzScheduler6) [d7f7d83] Error while removing volumes from database!: org.springframework.dao.DataIntegrityViolationException: CallableStatementCallback; SQL [{call deleteglustervolumesbyguids(?)}]; ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
>>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
>>> > Where: SQL statement "DELETE
>>> > FROM gluster_volumes
>>> > WHERE id IN (
>>> > SELECT *
>>> > FROM fnSplitterUuid(v_volume_ids)
>>> > )"
>>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
>>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
>>> > Where: SQL statement "DELETE
>>> > FROM gluster_volumes
>>> > WHERE id IN (
>>> > SELECT *
>>> > FROM fnSplitterUuid(v_volume_ids)
>>> > )"
>>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
>>> > at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:243) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1094) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:405) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:365) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
>>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:130) [dal.jar:]
>>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:76) [dal.jar:]
>>> > at org.ovirt.engine.core.dao.gluster.GlusterVolumeDaoImpl.removeAll(GlusterVolumeDaoImpl.java:233) [dal.jar:]
>>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.removeDeletedVolumes(GlusterSyncJob.java:521) [bll.jar:]
>>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshVolumeData(GlusterSyncJob.java:465) [bll.jar:]
>>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshClusterData(GlusterSyncJob.java:133) [bll.jar:]
>>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshLightWeightData(GlusterSyncJob.java:111) [bll.jar:]
>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_121]
>>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_121]
>>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
>>> > at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
>>> > at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
>>> > at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
>>> > at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
>>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121]
>>> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
>>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
>>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
>>> > at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]
>>> > Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
>>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
>>> > Where: SQL statement "DELETE
>>> > FROM gluster_volumes
>>> > WHERE id IN (
>>> > SELECT *
>>> > FROM fnSplitterUuid(v_volume_ids)
>>> > )"
>>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
>>> > at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
>>> > at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
>>> > at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
>>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
>>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:410)
>>> > at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:303)
>>> > at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:442)
>>> > at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1133) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1078) [spring-jdbc.jar:4.2.4.RELEASE]
>>> > ... 24 more
>>> >
>>> >
>>> >
>>>
>>> This is a side effect of the volume deletion on the gluster side. It looks
>>> like you have storage domains created using those volumes.
>>>
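>>> For example (an illustrative query against the engine DB; the exact
>>> referencing column name in storage_server_connections is an assumption
>>> here, taken from the foreign key in the error message):
>>>
>>>     select id, connection from storage_server_connections
>>>       where gluster_volume_id = '3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d';
>>>
>>> Removing or detaching the storage domain that uses this connection should
>>> clear the reference.
>>>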
>>> > On Wed, Mar 1, 2017 at 9:49 AM, Arman Khalatyan < arm2arm at gmail.com >
>>> wrote:
>>> >
>>> >
>>> >
>>> > Hi,
>>> > I just tested a power cut on the test system:
>>> >
>>> > A cluster with 3 hosts; each host has a 4TB local disk with zfs on it and
>>> > the /zhost/01/glu folder as a brick.
>>> >
>>> > Glusterfs was replicated to 3 bricks with an arbiter. So far so good. The
>>> > VM was up and running with a 50GB OS disk: dd was showing 70-100MB/s
>>> > performance on the VM disk.
>>> > I then simulated a disaster power cut: an ipmi power-cycle of all 3 hosts
>>> > at the same time.
>>> > The result is that all hosts are green, up and running, but the bricks are down.
>>> > In the process list I can see:
>>> > ps aux | grep gluster
>>> > root 16156 0.8 0.0 475360 16964 ? Ssl 08:47 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
>>> >
>>> > What happened to my volume setup??
>>> > Is it possible to recover it??
>>> > [root at clei21 ~]# gluster peer status
>>> > Number of Peers: 2
>>> >
>>> > Hostname: clei22.cls
>>> > Uuid: 96b52c7e-3526-44fd-af80-14a3073ebac2
>>> > State: Peer in Cluster (Connected)
>>> > Other names:
>>> > 192.168.101.40
>>> > 10.10.10.44
>>> >
>>> > Hostname: clei26.cls
>>> > Uuid: c9fab907-5053-41a8-a1fa-d069f34e42dc
>>> > State: Peer in Cluster (Connected)
>>> > Other names:
>>> > 10.10.10.41
>>> > [root at clei21 ~]# gluster volume info
>>> > No volumes present
>>> > [root at clei21 ~]#
>>>
>>> I am not sure why all the volumes are getting deleted after the reboot. Do
>>> you see any vol files under the directory /var/lib/glusterd/vols/? Also,
>>> /var/log/glusterfs/cmd_history.log should have all the gluster commands
>>> that were executed.
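>>> For example (just a quick sketch):
>>>
>>>     ls -l /var/lib/glusterd/vols/
>>>     grep volume /var/log/glusterfs/cmd_history.log | tail -20
>>>
>>> If the vol files are gone on all three nodes, glusterd has nothing to
>>> restore the volume from at startup.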
>>>
>>> Regards,
>>> Ramesh
>>>
>>> >
>>> >
>>> >
>>> > _______________________________________________
>>> > Users mailing list
>>> > Users at ovirt.org
>>> > http://lists.ovirt.org/mailman/listinfo/users
>>> >
>>>
>>
>>
>