It seems that something went "off".

That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome.
I suspect you have a bad package; based on the info it could be glusterfs-libs, which should provide /usr/lib64/libgfrpc.so.0 (you can confirm the owner with 'rpm -qf /usr/lib64/libgfrpc.so.0'). I'm currently on gluster v7.0, so I can't check it on my installation.

Run a for loop to verify the gluster rpms:
for i in $(rpm -qa | grep gluster); do echo "$i :"; rpm -V "$i"; echo; echo; done
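If the loop flags a package, reinstalling it is usually enough to repair the files. A minimal sketch of the same check as a function (the RPM variable indirection is only there so it can be exercised without a real rpm; on the node you would just run it as-is):

```shell
# Print every installed gluster package whose files fail 'rpm -V'.
# RPM defaults to the real rpm binary; overridable only for testing.
RPM=${RPM:-rpm}
check_gluster_pkgs() {
    for pkg in $($RPM -qa | grep gluster); do
        $RPM -V "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
# On the affected node:
#   check_gluster_pkgs          # list suspect packages
#   yum reinstall <package>     # then reinstall each one listed
```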

Most probably you can safely redo (on the upgraded node) the last yum transaction:
yum history
yum history info <ID> -> verify the gluster packages were installed in this transaction
yum history redo <ID>
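To spot the right transaction quickly, you can pull the newest ID out of 'yum history list' and check whether it touched gluster. A rough sketch (the YUM variable is just for testability, and the row format is assumed from yum 3.x history output):

```shell
# Extract the newest transaction ID from 'yum history list' output.
# YUM defaults to the real yum binary; overridable only for testing.
YUM=${YUM:-yum}
last_txn_id() {
    # Data rows look like '    42 | root <root> | ...'; the first one is newest.
    $YUM history list 2>/dev/null |
        awk -F'|' '/^ *[0-9]+ *\|/ { gsub(/ /, "", $1); print $1; exit }'
}
# On the affected node:
#   id=$(last_txn_id)
#   yum history info "$id" | grep -i gluster && echo "redo candidate: $id"
```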

If kernel, glibc or systemd were not in that transaction, you can stop gluster and start it again:

Put the node in maintenance in oVirt, then:
systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd 
systemctl stop sanlock
systemctl stop glusterd
/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh

And then power up again:
systemctl start glusterd
gluster volume status -> check that all bricks are connected
systemctl start sanlock supervdsmd vdsmd
systemctl start ovirt-ha-broker ovirt-ha-agent
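Before taking the node out of maintenance, it may help to confirm that no brick is reported offline. A sketch that counts 'N' entries in the Online column of 'gluster volume status' (column positions assumed from gluster 6.x output; GLUSTER is overridable only to make the function testable):

```shell
# Count bricks that 'gluster volume status' reports as offline (Online = N).
# GLUSTER defaults to the real gluster CLI; overridable only for testing.
GLUSTER=${GLUSTER:-gluster}
offline_bricks() {
    # Brick rows: 'Brick <host:/path>  <tcp port>  <rdma port>  <Y/N>  <pid>'
    $GLUSTER volume status 2>/dev/null |
        awk '/^Brick / { if ($(NF-1) == "N") n++ } END { print n + 0 }'
}
# On the node: [ "$(offline_bricks)" -eq 0 ] || echo "some bricks are down"
```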

Check the situation.

Still, you need to make gluster stop complaining before you can take care of the heal. Usually 'rsync' is my best friend - but that's when gluster is working normally, and your case is far from normal.
If the redo doesn't work for you -> try 'yum history rollback' to recover to the last good state.
I think the 'BOOM Boot Manager' is best in such cases.

Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment.

Best Regards,
Strahil Nikolov

On Sunday, February 2, 2020, 08:56:01 GMT+2, Mark Lamers <mark.r.lamers@gmail.com> wrote:


Hi Strahil,

Bertjan is not in the office today, so I will reply if okay with you.

First I would like to describe the status of our network.

There are three bricks:

and 5 nodes:


Every host has a management and a migrate vlan iface on a different bond iface.

The last octet of the IP address is the same on both.


The output of 'gluster volume heal <volname> info' gives a long list of shards from all three nodes; the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in the list is the node that was updated to gluster 6 / oVirt 4.3, which I find curious.

Furthermore, I find the following errors ('E') in the glusterd.log of the upgraded host:

[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management)
[2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1
[2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52)
[2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53)
[2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management)
[2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details.
[2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details.
[2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api

I'm curious about your opinion of the data so far.


I'd love to hear from you (or other members of the list).

Regards, Mark Lamers

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VUBWMDUN3GMKGOT63J2Y2A5X7VXXW3H3/