
On February 6, 2020 9:42:13 AM GMT+02:00, "Goorkate, B.J." <b.j.goorkate@umcutrecht.nl> wrote:
Hi Strahil,
The 0-xlator message still occurs, but not frequently. Yesterday it appeared a couple of times, but on the 4th there were no entries at all.
What I did find out was that the unsynced entries belonged to VM images which were on specific hosts.
Yesterday I migrated them all to other hosts and the unsynced entries were gone except for 3. After a 'stat' of those files/directories, they were gone too.
I think I can migrate the remaining hosts now. An option would be to move the bricks of the not-yet-upgraded hosts to upgraded hosts. I have spare disks. What do you think?
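If you do go the brick-move route, the usual tool is 'gluster volume replace-brick'. A minimal sketch follows; the volume name, host names and brick paths are hypothetical, and the function only assembles and prints the command so it can be reviewed before anything touches the cluster:

```shell
# Sketch only: volume name, hosts and brick paths are hypothetical.
# build_replace_brick_cmd assembles the replace-brick command for review;
# nothing here is executed against a cluster.
build_replace_brick_cmd() {
    vol=$1; src_brick=$2; dst_brick=$3
    echo "gluster volume replace-brick $vol $src_brick $dst_brick commit force"
}

cmd=$(build_replace_brick_cmd vmstore1 \
    oldnode:/gluster/brick1/vmstore1 \
    newnode:/gluster/brick1/vmstore1)
echo "$cmd"
# After running the printed command for real, watch
# 'gluster volume heal vmstore1 info' until the new brick has healed.
```

Run against one volume at a time, and only after heal info is clean.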
Regards,
Bertjan

On Tue, Feb 04, 2020 at 06:40:21PM +0000, Strahil Nikolov wrote:
Did you manage to fix the issue with the 0-xlator? If yes, most probably the issue will be OK. Yet 'probably' doesn't mean that they 'will' be OK. If it was my lab, I would go ahead only if the 0-xlator issue is over. Yet a lab is a different thing than prod - so it is your sole decision. Did you test the upgrade prior to moving to prod?

About the hooks - I had such an issue before and I had to reinstall the gluster rpms to solve it.

Best Regards,
Strahil Nikolov

On Tuesday, February 4, 2020, 16:35:27 GMT+2, Goorkate, B.J. <b.j.goorkate@umcutrecht.nl> wrote:

Hi Strahil,
Thanks for your time so far!
The packages seem fine on all of the 3 nodes. Only /var/lib/glusterd/glusterd.info is modified, and on the not-yet-upgraded nodes these files are missing:

missing /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
missing /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh
missing /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh
missing /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh
missing /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
But that doesn't seem relevant...
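For what it's worth, the check can be scripted per node; the file list is taken from the output above, and the prefix argument exists only so the sketch can be tried outside a gluster host (use an empty prefix on a real node):

```shell
# Sketch: report which gluster hook scripts are absent under a given root.
check_hooks() {
    prefix=$1
    for f in \
        /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh \
        /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh \
        /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh \
        /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh \
        /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
    do
        [ -e "$prefix$f" ] || echo "missing $prefix$f"
    done
}

check_hooks ""   # empty prefix = check the real paths on this node
```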
I stopped a couple of virtual machines whose image files had unsynced entries. When they were turned off, I couldn't find them anymore in the unsynced entries list. When I turned them on again, some re-appeared, some didn't.
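One way to make that observation precise is to snapshot the heal-info entry list before and after stopping a VM and diff the two. A sketch; the snapshot command in the comment is only an illustration of how 'before.txt'/'after.txt' (hypothetical names) could be produced:

```shell
# Sketch: compare two sorted heal-info snapshots.
# A snapshot could be made roughly like this (filter is an assumption):
#   gluster volume heal vmstore1 info \
#     | grep -v -e '^Brick' -e '^Status' -e '^Number' -e '^$' \
#     | sort -u > before.txt
heal_diff() {
    comm -23 "$1" "$2" | sed 's/^/gone: /'   # entries that disappeared
    comm -13 "$1" "$2" | sed 's/^/new:  /'   # entries that appeared
}
```

Snapshot once before power-off, once after, then run `heal_diff before.txt after.txt`.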
I really don't know where to look next.
The big question is: will the problems be resolved when I upgrade the remaining nodes, or will it get worse?
Regards,
Bertjan

On Sun, Feb 02, 2020 at 08:06:58PM +0000, Strahil Nikolov wrote:
It seems that something went "off". That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome. I think that you might have a bad package, and based on the info it could be glusterfs-libs, which should provide /usr/lib64/libgfrpc.so.0. I'm currently on gluster v7.0, so I can't check it on my installation.

Run a for loop to check the rpms:

for i in $(rpm -qa | grep gluster) ; do echo "$i :" ; rpm -V $i ; echo ; echo ; done

Most probably you can safely redo (on the upgraded node) the last yum transaction:

yum history
yum history info <ID>   -> verify the gluster packages were installed here
yum history redo <ID>

If kernel, glibc, systemd were not in this transaction, you can stop gluster and start it again:

Node in maintenance in oVirt
systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd
systemctl stop sanlock
systemctl stop glusterd
/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh

And then power up again:

systemctl start glusterd
gluster volume status   -> check all connected
systemctl start sanlock supervdsmd vdsmd
systemctl start ovirt-ha-broker ovirt-ha-agent

Check the situation. Yet, you need to make gluster stop complaining before you can take care of the heal. Usually 'rsync' is my best friend - but that is when gluster is working normally - and your case is far from normal.

If redo doesn't work for you -> try "yum history rollback" to recover to the last good state. I think that 'BOOM Boot Manager' is best in such cases.

Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment.

Best Regards,
Strahil Nikolov

On Sunday, February 2, 2020, 08:56:01 GMT+2, Mark Lamers <mark.r.lamers@gmail.com> wrote:
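As an aside on the rpm -V loop above: 'rpm -V' marks config files with a 'c' in the attribute column, and changes there (like glusterd.info) are expected; the interesting lines are modified or missing non-config files. A small filter sketch over rpm -V-style output:

```shell
# Sketch: drop rpm -V lines for config files (attribute column "c"),
# keeping only modified or missing non-config payload.
filter_rpm_v() {
    awk '!(NF >= 3 && $2 == "c")'
}

# Intended use on a node (not run here):
#   for i in $(rpm -qa | grep gluster); do
#       echo "$i :"; rpm -V "$i" | filter_rpm_v; echo
#   done
```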
Hi Strahil,
Bertjan is not in the office today, so I will reply if that's okay with you.
First I'd like to describe the status of our network.
There are three bricks:
and 5 nodes:
Every host has a management and a migrate VLAN iface, each on a different bond iface. The last octet of the IP address is the same on each.
The output from 'gluster volume heal <volname> info' gives a long list of shards from all three nodes; the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in the list is the node updated to gluster 6 / oVirt 4.3. That is curious, I think.
Furthermore, I find the following errors ('E') in the glusterd.log of the upgraded host:
[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management)
[2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1
[2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52)
[2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53)
[2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management)
[2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details.
[2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details.
[2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
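To see which of the errors above dominates over time, the log can be tallied per MSGID. A sketch that works on any glusterd-style log (the path in the comment is the usual location, but verify it on your install):

```shell
# Sketch: count error lines per MSGID in glusterd-style logs.
# Typical input on a node: /var/log/glusterfs/glusterd.log
count_msgids() {
    awk 'match($0, /\[MSGID: [0-9]+\]/) {
             id = substr($0, RSTART + 8, RLENGTH - 9)
             n[id]++
         }
         END { for (id in n) print n[id], "MSGID:" id }' "$@" | sort -rn
}
```

Called with no argument it reads stdin, so it can also be fed a pasted excerpt.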
I'm curious about your opinion of the data so far.
Love to hear from you (or other members of the list).

Regards,
Mark Lamers
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VUBWMDUN3GMKGO...
------------------------------------------------------------------------------
This message may contain confidential information and is intended exclusively for the addressee. If you receive this message unintentionally, please do not use the contents but notify the sender immediately by return e-mail. University Medical Center Utrecht is a legal person by public law and is registered at the Chamber of Commerce for Midden-Nederland under no. 30244197.
Please consider the environment before printing this e-mail.
Hi Bertjan,

If you healed everything - I think there is no need to migrate bricks. Just a precaution - keep a test VM for testing power off and power on after each node is upgraded. What version are you upgrading to?

Best Regards,
Strahil Nikolov