<div dir="ltr">forgot to mention number 4) my fault was with glustefs on zfs: setup was with the xattr=on one should put xattr=sa <br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 2, 2017 at 10:08 AM, Arman Khalatyan <span dir="ltr"><<a href="mailto:arm2arm@gmail.com" target="_blank">arm2arm@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div>I just discovered in the logs several troubles:<br></div>1) the rdma support was not installed from glusterfs (but the RDMA check box was selected)<br></div>2) somehow every second during the resync the connection was going down and up...<br></div>3)Due to 2) the hosts are restarging daemon glusterfs several times, with correct parameters and with no parameters.. they where giving conflict and one other other was overtaking.<br></div>Maybe the fault was due to the onboot enabled glusterfs service.<br><br></div>I can try to destroy whole cluster and reinstall from scratch to see if we can figure-out why the vol config files are disappears.<br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 2, 2017 at 5:34 AM, Ramesh Nachimuthu <span dir="ltr"><<a href="mailto:rnachimu@redhat.com" target="_blank">rnachimu@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><br>
On Thu, Mar 2, 2017 at 5:34 AM, Ramesh Nachimuthu <rnachimu@redhat.com> wrote:

----- Original Message -----
> From: "Arman Khalatyan" <arm2arm@gmail.com>
> To: "Ramesh Nachimuthu" <rnachimu@redhat.com>
> Cc: "users" <users@ovirt.org>, "Sahina Bose" <sabose@redhat.com>
> Sent: Wednesday, March 1, 2017 11:22:32 PM
> Subject: Re: [ovirt-users] Gluster setup disappears any chance to recover?
>
> ok, I will answer by myself:
> yes, the gluster daemon is managed by vdsm :)
> and to recover the lost config one simply has to add the "force" keyword:
>
> gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA \
>   10.10.10.44:/zclei22/01/glu 10.10.10.42:/zclei21/01/glu \
>   10.10.10.41:/zclei26/01/glu \
>   force
>
> now everything is up and running!
> one annoying thing is the EPEL dependency of zfs conflicting with the ovirt repos...
> every time one needs to enable and then disable EPEL.
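> One way to avoid toggling the repo file by hand (a sketch, assuming EPEL is
> configured but left disabled in /etc/yum.repos.d/):
>
>   # enable EPEL only for a single transaction
>   yum --enablerepo=epel install <package>
>   # or keep it switched off permanently with yum-utils
>   yum-config-manager --disable epel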
>
>

The glusterd service will be started when you add/activate the host in oVirt. It will be configured to start after every reboot.
Volumes disappearing seems to be a serious issue. We have never seen such an issue with the XFS file system. Are you able to reproduce this issue consistently?
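A quick way to confirm the service state on a host (a minimal check, not oVirt-specific):

  systemctl is-enabled glusterd
  systemctl status glusterd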

Regards,
Ramesh

>
> On Wed, Mar 1, 2017 at 5:33 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:
>
> > ok, finally my single brick is up and running, so I can access the data.
> > Now the question is: do we need to run the glusterd daemon on startup, or is
> > it managed by vdsmd?
> >
> >
> > On Wed, Mar 1, 2017 at 2:36 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:
> >
> >> All folders under /var/lib/glusterd/vols/ are empty.
> >> In the shell history on one of the servers I found the command it was
> >> created with:
> >>
> >> gluster volume create GluReplica replica 3 arbiter 1 transport TCP,RDMA \
> >>   10.10.10.44:/zclei22/01/glu 10.10.10.42:/zclei21/01/glu \
> >>   10.10.10.41:/zclei26/01/glu
> >>
> >> But executing this command fails with:
> >> volume create: GluReplica: failed: /zclei22/01/glu is already part of a
> >> volume
> >>
> >> Any chance to force it?
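> >> (A commonly cited way around that check, if the bricks are to be reused for
> >> the same volume, is to clear Gluster's brick metadata xattrs first. This is
> >> only a sketch of the usual recipe; double-check against the Gluster docs
> >> before touching bricks that still hold data.)
> >>
> >>   # show the stale volume id recorded on the brick
> >>   getfattr -d -m . -e hex /zclei22/01/glu
> >>   # remove the markers that make gluster refuse the brick
> >>   setfattr -x trusted.glusterfs.volume-id /zclei22/01/glu
> >>   setfattr -x trusted.gfid /zclei22/01/glu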
> >>
> >>
> >>
> >> On Wed, Mar 1, 2017 at 12:13 PM, Ramesh Nachimuthu <rnachimu@redhat.com> wrote:
> >>
> >>>
> >>>
> >>>
> >>>
> >>> ----- Original Message -----
> >>> > From: "Arman Khalatyan" <arm2arm@gmail.com>
> >>> > To: "users" <users@ovirt.org>
> >>> > Sent: Wednesday, March 1, 2017 3:10:38 PM
> >>> > Subject: Re: [ovirt-users] Gluster setup disappears any chance to recover?
> >>> >
> >>> > The engine throws the following errors:
> >>> > 2017-03-01 10:39:59,608+01 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [d7f7d83] EVENT_ID: GLUSTER_VOLUME_DELETED_FROM_CLI(4,027), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Detected deletion of volume GluReplica on cluster HaGLU, and deleted it from engine DB.
> >>> > 2017-03-01 10:39:59,610+01 ERROR [org.ovirt.engine.core.bll.gluster.GlusterSyncJob] (DefaultQuartzScheduler6) [d7f7d83] Error while removing volumes from database!:
> >>> > org.springframework.dao.DataIntegrityViolationException: CallableStatementCallback; SQL [{call deleteglustervolumesbyguids(?)}];
> >>> > ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
> >>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
> >>> > Where: SQL statement "DELETE
> >>> > FROM gluster_volumes
> >>> > WHERE id IN (
> >>> > SELECT *
> >>> > FROM fnSplitterUuid(v_volume_ids)
> >>> > )"
> >>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
> >>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
> >>> > Where: SQL statement "DELETE
> >>> > FROM gluster_volumes
> >>> > WHERE id IN (
> >>> > SELECT *
> >>> > FROM fnSplitterUuid(v_volume_ids)
> >>> > )"
> >>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
> >>> > at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:243) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1094) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:405) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:365) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
> >>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:130) [dal.jar:]
> >>> > at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:76) [dal.jar:]
> >>> > at org.ovirt.engine.core.dao.gluster.GlusterVolumeDaoImpl.removeAll(GlusterVolumeDaoImpl.java:233) [dal.jar:]
> >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.removeDeletedVolumes(GlusterSyncJob.java:521) [bll.jar:]
> >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshVolumeData(GlusterSyncJob.java:465) [bll.jar:]
> >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshClusterData(GlusterSyncJob.java:133) [bll.jar:]
> >>> > at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshLightWeightData(GlusterSyncJob.java:111) [bll.jar:]
> >>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_121]
> >>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_121]
> >>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
> >>> > at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
> >>> > at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
> >>> > at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
> >>> > at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
> >>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121]
> >>> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
> >>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
> >>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
> >>> > at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]
> >>> > Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "gluster_volumes" violates foreign key constraint "fk_storage_connection_to_glustervolume" on table "storage_server_connections"
> >>> > Detail: Key (id)=(3d8bfa9d-1c83-46ac-b4e9-bd317623ed2d) is still referenced from table "storage_server_connections".
> >>> > Where: SQL statement "DELETE
> >>> > FROM gluster_volumes
> >>> > WHERE id IN (
> >>> > SELECT *
> >>> > FROM fnSplitterUuid(v_volume_ids)
> >>> > )"
> >>> > PL/pgSQL function deleteglustervolumesbyguids(character varying) line 3 at SQL statement
> >>> > at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
> >>> > at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
> >>> > at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
> >>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
> >>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
> >>> > at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:410)
> >>> > at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:303)
> >>> > at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:442)
> >>> > at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1133) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1130) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1078) [spring-jdbc.jar:4.2.4.RELEASE]
> >>> > ... 24 more
> >>> >
> >>> >
> >>> >
> >>>
> >>> This is a side effect of the volume deletion on the gluster side. It looks like
> >>> you have storage domains created on those volumes.
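> >>> To see which storage connection still references the deleted volume, something
> >>> like this on the engine host may help (a sketch only, assuming the default
> >>> database name "engine"):
> >>>
> >>>   su - postgres -c 'psql engine -c "select * from storage_server_connections;"'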
> >>>
> >>> > On Wed, Mar 1, 2017 at 9:49 AM, Arman Khalatyan <arm2arm@gmail.com> wrote:
> >>> >
> >>> >
> >>> >
> >>> > Hi,
> >>> > I just tested a power cut on the test system:
> >>> >
> >>> > A cluster with 3 hosts; each host has a 4 TB local disk with zfs on it and
> >>> > the /zhost/01/glu folder as a brick.
> >>> >
> >>> > Glusterfs was replicated across 3 bricks with an arbiter. So far so good. A VM
> >>> > was up and running with a 50 GB OS disk: dd was showing 70-100 MB/s
> >>> > performance on the VM disk.
> >>> > I simulated a disaster power cut: an ipmi power-cycle of all 3 hosts at the
> >>> > same time.
> >>> > The result is that all hosts are green, up and running, but the bricks are down.
> >>> > In the processes I can see:
> >>> > ps aux | grep gluster
> >>> > root 16156 0.8 0.0 475360 16964 ? Ssl 08:47 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
> >>> >
> >>> > What happened to my volume setup??
> >>> > Is it possible to recover it??
> >>> > [root@clei21 ~]# gluster peer status
> >>> > Number of Peers: 2
> >>> >
> >>> > Hostname: clei22.cls
> >>> > Uuid: 96b52c7e-3526-44fd-af80-14a3073ebac2
> >>> > State: Peer in Cluster (Connected)
> >>> > Other names:
> >>> > 192.168.101.40
> >>> > 10.10.10.44
> >>> >
> >>> > Hostname: clei26.cls
> >>> > Uuid: c9fab907-5053-41a8-a1fa-d069f34e42dc
> >>> > State: Peer in Cluster (Connected)
> >>> > Other names:
> >>> > 10.10.10.41
> >>> > [root@clei21 ~]# gluster volume info
> >>> > No volumes present
> >>> > [root@clei21 ~]#
> >>>
> >>> I am not sure why all the volumes are getting deleted after a reboot. Do you see
> >>> any vol files under the directory /var/lib/glusterd/vols/? Also,
> >>> /var/log/glusterfs/cmd_history.log should have all the gluster commands
> >>> executed.
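> >>> For example (a quick check, using the paths above):
> >>>
> >>>   ls -l /var/lib/glusterd/vols/
> >>>   # look for volume create/delete/stop commands around the reboot
> >>>   grep -iE 'volume (create|delete|stop)' /var/log/glusterfs/cmd_history.log | tail -20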
> >>>
> >>> Regards,
> >>> Ramesh
> >>>
> >>> >
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > Users mailing list
> >>> > Users@ovirt.org
> >>> > http://lists.ovirt.org/mailman/listinfo/users
> >>> >
> >>>
> >>
> >>
> >
>