
On February 6, 2020 9:42:13 AM GMT+02:00, "Goorkate, B.J." <b.j.goorkate@umcutrecht.nl> wrote:
Hi Strahil,
The 0-xlator message still occurs, but not frequently. Yesterday it appeared a couple of times, but on the 4th there were no entries at all.
What I did find out was that the unsynced entries belonged to VM images which were on specific hosts.
Yesterday I migrated them all to other hosts and the unsynced entries were gone except for 3. After a 'stat' of those files/directories, they were gone too.
I think I can migrate the remaining hosts now. An option would be to move the bricks of the not-yet-upgraded hosts to upgraded hosts. I have spare disks. What do you think?
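If you do go the brick-move route, the usual tool is 'gluster volume replace-brick'. A minimal sketch follows; the volume name, host names and brick paths are hypothetical, and the function only assembles and prints the command so it can be reviewed before anything touches the cluster:

```shell
# Sketch only: volume name, hosts and brick paths are hypothetical.
# build_replace_brick_cmd assembles the replace-brick command for review;
# nothing here is executed against a cluster.
build_replace_brick_cmd() {
    vol=$1; src_brick=$2; dst_brick=$3
    echo "gluster volume replace-brick $vol $src_brick $dst_brick commit force"
}

cmd=$(build_replace_brick_cmd vmstore1 \
    oldnode:/gluster/brick1/vmstore1 \
    newnode:/gluster/brick1/vmstore1)
echo "$cmd"
# After running the printed command for real, watch
# 'gluster volume heal vmstore1 info' until the new brick has healed.
```

Run against one volume at a time, and only after heal info is clean.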
Regards,
Bertjan

On Tue, Feb 04, 2020 at 06:40:21PM +0000, Strahil Nikolov wrote:
Did you manage to fix the issue with the 0-xlator? If yes, most probably the issue will be OK. Yet 'probably' doesn't mean that they 'will' be OK. If it was my lab, I would go ahead only if the 0-xlator issue is over. Yet a lab is a different thing than prod - so it is your sole decision. Did you test the upgrade prior to moving to prod?

About the hooks - I had such an issue before and I had to reinstall the gluster rpms to solve it.

Best Regards,
Strahil Nikolov

On Tuesday, February 4, 2020, 16:35:27 GMT+2, Goorkate, B.J. <b.j.goorkate@umcutrecht.nl> wrote:

Hi Strahil,
Thanks for your time so far!
The packages seem fine on all of the 3 nodes. Only /var/lib/glusterd/glusterd.info is modified, and on the not-yet-upgraded nodes these files are missing:

missing /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
missing /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh
missing /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh
missing /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh
missing /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
But that doesn't seem relevant...
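For what it's worth, the check can be scripted per node; the file list is taken from the output above, and the prefix argument exists only so the sketch can be tried outside a gluster host (use an empty prefix on a real node):

```shell
# Sketch: report which gluster hook scripts are absent under a given root.
check_hooks() {
    prefix=$1
    for f in \
        /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh \
        /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh \
        /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh \
        /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh \
        /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
    do
        [ -e "$prefix$f" ] || echo "missing $prefix$f"
    done
}

check_hooks ""   # empty prefix = check the real paths on this node
```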
I stopped a couple of virtual machines whose image files had unsynced entries. When they were turned off, I couldn't find them anymore in the unsynced entries list. When I turned them on again, some re-appeared, some didn't.
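One way to make that observation precise is to snapshot the heal-info entry list before and after stopping a VM and diff the two. A sketch; the snapshot command in the comment is only an illustration of how 'before.txt'/'after.txt' (hypothetical names) could be produced:

```shell
# Sketch: compare two sorted heal-info snapshots.
# A snapshot could be made roughly like this (filter is an assumption):
#   gluster volume heal vmstore1 info \
#     | grep -v -e '^Brick' -e '^Status' -e '^Number' -e '^$' \
#     | sort -u > before.txt
heal_diff() {
    comm -23 "$1" "$2" | sed 's/^/gone: /'   # entries that disappeared
    comm -13 "$1" "$2" | sed 's/^/new:  /'   # entries that appeared
}
```

Snapshot once before power-off, once after, then run `heal_diff before.txt after.txt`.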
I really don't know where to look next.
The big question is: will the problems be resolved when I upgrade the remaining nodes, or will it get worse?
Regards,
Bertjan

On Sun, Feb 02, 2020 at 08:06:58PM +0000, Strahil Nikolov wrote:
It seems that something went "off". That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome. I think that you might have a bad package, and based on the info it could be glusterfs-libs, which should provide /usr/lib64/libgfrpc.so.0. I'm currently on gluster v7.0, so I can't check it on my installation.

Run a for loop to check the rpms:

for i in $(rpm -qa | grep gluster) ; do echo "$i :" ; rpm -V $i ; echo ; echo ; done

Most probably you can safely redo (on the upgraded node) the last yum transaction:

yum history
yum history info <ID>   -> verify the gluster packages were installed here
yum history redo <ID>

If kernel, glibc, systemd were not in this transaction, you can stop gluster and start it again:

Node in maintenance in oVirt
systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd
systemctl stop sanlock
systemctl stop glusterd
/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh

And then power up again:

systemctl start glusterd
gluster volume status   -> check all connected
systemctl start sanlock supervdsmd vdsmd
systemctl start ovirt-ha-broker ovirt-ha-agent

Check the situation. Yet, you need to make gluster stop complaining before you can take care of the heal. Usually 'rsync' is my best friend - but that is when gluster is working normally - and your case is far from normal.

If redo doesn't work for you -> try "yum history rollback" to recover to the last good state. I think that 'BOOM Boot Manager' is best in such cases.

Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment.

Best Regards,
Strahil Nikolov

On Sunday, February 2, 2020, 08:56:01 GMT+2, Mark Lamers <mark.r.lamers@gmail.com> wrote:
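As an aside on the rpm -V loop above: 'rpm -V' marks config files with a 'c' in the attribute column, and changes there (like glusterd.info) are expected; the interesting lines are modified or missing non-config files. A small filter sketch over rpm -V-style output:

```shell
# Sketch: drop rpm -V lines for config files (attribute column "c"),
# keeping only modified or missing non-config payload.
filter_rpm_v() {
    awk '!(NF >= 3 && $2 == "c")'
}

# Intended use on a node (not run here):
#   for i in $(rpm -qa | grep gluster); do
#       echo "$i :"; rpm -V "$i" | filter_rpm_v; echo
#   done
```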
Hi Strahil,
Bertjan is not in the office today, so I will reply if that's okay with you.
First I'd like to describe the status of our network.
There are three bricks:
and 5 nodes:
Every host has a management and a migrate VLAN iface, each on a different bond iface. The last octet of the IP address is the same on each.
The output from 'gluster volume heal <volname> info' gives a long list of shards from all three nodes; the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in the list is the node updated to gluster 6 / oVirt 4.3. That is curious, I think.
Furthermore, I find the following errors ('E') in the glusterd.log of the upgraded host:
[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management)
[2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1
[2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52)
[2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53)
[2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management)
[2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details.
[2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details.
[2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
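To see which of the errors above dominates over time, the log can be tallied per MSGID. A sketch that works on any glusterd-style log (the path in the comment is the usual location, but verify it on your install):

```shell
# Sketch: count error lines per MSGID in glusterd-style logs.
# Typical input on a node: /var/log/glusterfs/glusterd.log
count_msgids() {
    awk 'match($0, /\[MSGID: [0-9]+\]/) {
             id = substr($0, RSTART + 8, RLENGTH - 9)
             n[id]++
         }
         END { for (id in n) print n[id], "MSGID:" id }' "$@" | sort -rn
}
```

Called with no argument it reads stdin, so it can also be fed a pasted excerpt.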
I'm curious about your opinion of the data so far.
Love to hear from you (or other members of the list).

Regards,
Mark Lamers
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VUBWMDUN3GMKGO...
------------------------------------------------------------------------------
This message may contain confidential information and is intended exclusively for the addressee. If you receive this message unintentionally, please do not use the contents but notify the sender immediately by return e-mail. University Medical Center Utrecht is a legal person by public law and is registered at the Chamber of Commerce for Midden-Nederland under no. 30244197.
Please consider the environment before printing this e-mail.
Hi Bertjan,

If you healed everything - I think there is no need to migrate bricks. Just a precaution - keep a test VM for testing power off and power on after each node is upgraded. What version are you upgrading to?

Best Regards,
Strahil Nikolov