Forgot to mention number 4): my fault was with glusterfs on zfs: the setup was
done with xattr=on, but one should use xattr=sa.
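For reference, a minimal sketch of that change (the dataset name below is only
guessed from the brick paths in this thread, and xattr=sa only applies to newly
written xattrs, not to data already on disk):
  zfs set xattr=sa zclei22/01
  zfs get xattr zclei22/01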
On Thu, Mar 2, 2017 at 10:08 AM, Arman Khalatyan <arm2arm(a)gmail.com> wrote:
I just discovered several problems in the logs:
1) the RDMA support was not installed by glusterfs (even though the RDMA check
box was selected)
2) somehow, every second during the resync the connection was going down and up...
3) due to 2) the hosts were restarting the glusterd daemon several times, sometimes
with the correct parameters and sometimes with none; the instances were conflicting
and overriding each other.
Maybe the fault was due to the glusterfs service being enabled on boot.
I can try to destroy the whole cluster and reinstall from scratch to see if we
can figure out why the volume config files disappear.
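To double-check point 1) on a host, something like this should show whether the
RDMA transport is actually in place (assuming CentOS/RHEL package naming):
  rpm -q glusterfs-rdma
  gluster volume info GluReplica | grep -i transport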
On Thu, Mar 2, 2017 at 5:34 AM, Ramesh Nachimuthu <rnachimu(a)redhat.com>
wrote:
>
>
>
>
> ----- Original Message -----
> > From: "Arman Khalatyan" <arm2arm(a)gmail.com>
> > To: "Ramesh Nachimuthu" <rnachimu(a)redhat.com>
> > Cc: "users" <users(a)ovirt.org>, "Sahina Bose" <sabose(a)redhat.com>
> > Sent: Wednesday, March 1, 2017 11:22:32 PM
> > Subject: Re: [ovirt-users] Gluster setup disappears any chance to recover?
> >
> > OK, I will answer myself:
> > yes, the gluster daemon is managed by vdsmd :)
> > and to recover the lost config one simply has to add the "force" keyword:
> > gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA
> > 10.10.10.44:/zclei22/01/glu 10.10.10.42:/zclei21/01/glu
> > 10.10.10.41:/zclei26/01/glu
> > force
> >
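> > (after the forced create the volume still needs to be started and checked -
> > a quick sanity pass, assuming it comes back in the Created state:
> > gluster volume start GluReplica
> > gluster volume info GluReplica
> > gluster volume status GluReplica )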
> > now everything is up and running!
> > one annoying thing is the EPEL dependency of ZFS conflicting with oVirt...
> > every time one needs to enable and then disable EPEL.
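> > (a possible workaround, untested here: keep EPEL disabled in its .repo file
> > and enable it only for the single transaction that needs it, e.g.
> > yum --enablerepo=epel install <package> )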
> >
> >
>
> Glusterd service will be started when you add/activate the host in oVirt.
> It will be configured to start after every reboot.
> Volumes disappearing seems to be a serious issue. We have never seen such
> an issue with the XFS file system. Are you able to reproduce this issue
> consistently?
>
> Regards,
> Ramesh
>
> >
> > On Wed, Mar 1, 2017 at 5:33 PM, Arman Khalatyan <arm2arm(a)gmail.com> wrote:
> >
> > > OK, finally I got a single brick up and running, so I can access the data.
> > > Now the question is: do we need to run the glusterd daemon on startup, or is it
> > > managed by vdsmd?
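> > > (what the host is currently set to can be checked locally with:
> > > systemctl is-enabled glusterd
> > > systemctl status glusterd )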
> > >
> > >
> > > On Wed, Mar 1, 2017 at 2:36 PM, Arman Khalatyan <arm2arm(a)gmail.com> wrote:
> > >
> > >> all folders under /var/lib/glusterd/vols/ are empty.
> > >> In the history of one of the servers I found the command with which it was
> > >> created:
> > >>
> > >> gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA
> > >> 10.10.10.44:/zclei22/01/glu 10.10.10.42:/zclei21/01/glu 10.10.10.41:/zclei26/01/glu
> > >>
> > >> But when executing this command, it claims:
> > >> volume create: GluReplica: failed: /zclei22/01/glu is already part of a volume
> > >>
> > >> Any chance to force it?
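> > >> (as far as I know this error means the brick directory still carries the
> > >> trusted.glusterfs.volume-id xattr of the old volume; a commonly described
> > >> alternative to "force" is to clear it, e.g.
> > >> setfattr -x trusted.glusterfs.volume-id /zclei22/01/glu
> > >> but that is untested here.)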
> > >>
> > >>
> > >>
> > >> On Wed, Mar 1, 2017 at 12:13 PM, Ramesh Nachimuthu <rnachimu(a)redhat.com> wrote:
> > >>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> ----- Original Message -----
> > >>> > From: "Arman Khalatyan" <arm2arm(a)gmail.com>
> > >>> > To: "users" <users(a)ovirt.org>
> > >>> > Sent: Wednesday, March 1, 2017 3:10:38 PM
> > >>> > Subject: Re: [ovirt-users] Gluster setup disappears any chance to recover?
> > >>> >
> > >>> > the engine throws the following errors:
> > >>> > 2017-03-01 10:39:59,608+01 WARN
> > >>> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > >>> > (DefaultQuartzScheduler6) [d7f7d83] EVENT_ID:
> > >>> > GLUSTER_VOLUME_DELETED_FROM_CLI(4,027), Correlation ID: null, Call Stack:
> > >>> > null, Custom Event ID: -1, Message: Detected deletion of volume GluReplica
> > >>> > on cluster HaGLU, and deleted it from engine DB.
> > >>> > 2017-03-01 10:39:59,610+01 ERROR
> > >>> > [org.ovirt.engine.core.bll.gluster.GlusterSyncJob] (DefaultQuartzScheduler6)
> > >>> > [d7f7d83] Error while removing volumes from database!:
> > >>> > org.springframework.dao.DataIntegrityViolationException:
> > >>> > CallableStatementCallback; SQL [{call deleteglustervolumesbyguids(?)}];
> > >>> > ERROR: update or delete on table "gluster_volumes" violates foreign key
> > >>> > constraint "fk_storage_connection_to_glustervolume" on table
> > >>> > "storage_server_connections"
> > >>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced
> > >>> > from table "storage_server_connections".
> > >>> > Where: SQL statement "DELETE
> > >>> > FROM gluster_volumes
> > >>> > WHERE id IN (
> > >>> > SELECT *
> > >>> > FROM fnSplitterUuid(v_volume_ids)
> > >>> > )"
> > >>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at
> > >>> > SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR:
> > >>> > update or delete on table "gluster_volumes" violates foreign key constraint
> > >>> > "fk_storage_connection_to_glustervolume" on table
> > >>> > "storage_server_connections"
> > >>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced
> > >>> > from table "storage_server_connections".
> > >>> > Where: SQL statement "DELETE
> > >>> > FROM gluster_volumes
> > >>> > WHERE id IN (
> > >>> > SELECT *
> > >>> > FROM fnSplitterUuid(v_volume_ids)
> > >>> > )"
> > >>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at
> > >>> > SQL statement
> > >>> > at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:243) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1094) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:405) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:365) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
> > >>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:130) [dal.jar:]
> > >>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:76) [dal.jar:]
> > >>> > at org.ovirt.engine.core.dao.gluster.GlusterVolumeDaoImpl.removeAll(GlusterVolumeDaoImpl.java:233) [dal.jar:]
> > >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.removeDeletedVolumes(GlusterSyncJob.java:521) [bll.jar:]
> > >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshVolumeData(GlusterSyncJob.java:465) [bll.jar:]
> > >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshClusterData(GlusterSyncJob.java:133) [bll.jar:]
> > >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshLightWeightData(GlusterSyncJob.java:111) [bll.jar:]
> > >>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_121]
> > >>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_121]
> > >>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
> > >>> > at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
> > >>> > at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
> > >>> > at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
> > >>> > at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
> > >>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121]
> > >>> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
> > >>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
> > >>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
> > >>> > at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]
> > >>> > Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on
> > >>> > table "gluster_volumes" violates foreign key constraint
> > >>> > "fk_storage_connection_to_glustervolume" on table
> > >>> > "storage_server_connections"
> > >>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced
> > >>> > from table "storage_server_connections".
> > >>> > Where: SQL statement "DELETE
> > >>> > FROM gluster_volumes
> > >>> > WHERE id IN (
> > >>> > SELECT *
> > >>> > FROM fnSplitterUuid(v_volume_ids)
> > >>> > )"
> > >>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at
> > >>> > SQL statement
> > >>> > at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
> > >>> > at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
> > >>> > at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
> > >>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
> > >>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
> > >>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:410)
> > >>> > at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:303)
> > >>> > at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:442)
> > >>> > at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1133) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1078) [spring-jdbc.jar:4.2.4.RELEASE]
> > >>> > ... 24 more
> > >>> >
> > >>> >
> > >>> >
> > >>>
> > >>> This is a side effect of volume deletion on the gluster side. It looks like
> > >>> you have storage domains created using those volumes.
> > >>>
> > >>> > On Wed, Mar 1, 2017 at 9:49 AM, Arman Khalatyan <arm2arm(a)gmail.com> wrote:
> > >>> >
> > >>> >
> > >>> >
> > >>> > Hi,
> > >>> > I just tested a power cut on the test system:
> > >>> >
> > >>> > Cluster with 3 hosts; each host has a 4TB local disk with ZFS on it, with the
> > >>> > /zhost/01/glu folder as a brick.
> > >>> >
> > >>> > GlusterFS was replicated to 3 disks with an arbiter. So far so good. The VM was
> > >>> > up and running with a 50GB OS disk; dd was showing 70-100MB/s performance with
> > >>> > the VM disk.
> > >>> > I just simulated a disaster power cut: an IPMI power-cycle of all 3 hosts at the
> > >>> > same time.
> > >>> > The result is that all hosts are green, up and running, but the bricks are down.
> > >>> > In the processes I can see:
> > >>> > ps aux | grep gluster
> > >>> > root 16156 0.8 0.0 475360 16964 ? Ssl 08:47 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
> > >>> >
> > >>> > What happened to my volume setup??
> > >>> > Is it possible to recover it??
> > >>> > [root@clei21 ~]# gluster peer status
> > >>> > Number of Peers: 2
> > >>> >
> > >>> > Hostname: clei22.cls
> > >>> > Uuid: 96b52c7e-3526-44fd-af80-14a3073ebac2
> > >>> > State: Peer in Cluster (Connected)
> > >>> > Other names:
> > >>> > 192.168.101.40
> > >>> > 10.10.10.44
> > >>> >
> > >>> > Hostname: clei26.cls
> > >>> > Uuid: c9fab907-5053-41a8-a1fa-d069f34e42dc
> > >>> > State: Peer in Cluster (Connected)
> > >>> > Other names:
> > >>> > 10.10.10.41
> > >>> > [root@clei21 ~]# gluster volume info
> > >>> > No volumes present
> > >>> > [root@clei21 ~]#
> > >>>
> > >>> I am not sure why all volumes are getting deleted after reboot. Do you see
> > >>> any vol files under the directory /var/lib/glusterd/vols/? Also,
> > >>> /var/log/glusterfs/cmd_history.log should have all the gluster commands
> > >>> executed.
> > >>>
> > >>> Regards,
> > >>> Ramesh
> > >>>
> > >>> >
> > >>> >
> > >>> >
> > >>> > _______________________________________________
> > >>> > Users mailing list
> > >>> > Users(a)ovirt.org
> > >>> > http://lists.ovirt.org/mailman/listinfo/users
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>