Unsynced entries do not self-heal during upgrade from oVirt 4.2 -> 4.3

Hi all,

I'm in the process of upgrading oVirt nodes from 4.2 to 4.3. After upgrading the first of 3 oVirt/gluster nodes, there have been between 600 and 1200 unsynced entries for a week now on the upgraded node and on one not-yet-upgraded node. The third node (also not yet upgraded) says it's OK (no unsynced entries).

The cluster doesn't seem to be very busy, but somehow self-heal doesn't complete. Is this because of the different gluster versions across the nodes, and will it resolve as soon as I have upgraded all nodes? Since it's our production cluster, I don't want to take any risk...

Does anybody recognise this problem? Of course I can provide more information if necessary. Any hints on troubleshooting the unsynced entries are more than welcome!

Thanks in advance!

Regards,

Bertjan

I don't want to scare you, but I don't think it's related to the different versions.

Have you tried the following:

1. Run 'gluster volume heal <VOLNAME> full'
2. Run a stat to force an update from the client side (wait for the full heal to finish):

   find /rhev/data-center/mnt/glusterSD -iname '*' -exec stat {} \;

Best Regards,
Strahil Nikolov

Hi,

Thanks for the info!

I tried the full heal and the stat, but the unsynced entries still remain.

Just to be sure: the find/stat command needs to be done on files in the FUSE mount, right? Or on the brick mount itself?

And other than 'gluster volume heal vmstore1 statistics', I cannot find a way to ensure that the full heal really started, let alone whether it finished correctly...

Regards,

Bertjan
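For reference - and depending on which gluster version each node runs, so check 'gluster volume heal help' on your builds first - heal progress can usually be inspected with commands like these (the volume name vmstore1 is taken from the thread above):

   gluster volume heal vmstore1 statistics heal-count   # number of entries still pending heal, per brick
   gluster volume heal vmstore1 info summary            # per-brick counts of entries pending heal / in split-brain
   gluster volume heal vmstore1 statistics              # per-crawl history, including whether a FULL crawl ran and when it finished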

Yes, the stat is against the FUSE mount, not the bricks.

What is the output of 'gluster volume heal <volname> info'?

Best Regards,
Strahil Nikolov
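If posting the complete 'info' output is impractical, a condensed per-brick view can help - a sketch only, since the exact output labels may differ between gluster 3.12 and 6.x:

   gluster volume heal vmstore1 info | grep -E '^Brick|^Status|^Number of entries'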

Hi Strahil,

Bertjan is not in the office today, so I will reply if that's okay with you.

First I'd like to describe the status of our network.

There are three bricks:

and 5 nodes:

Every host has a management and a migrate VLAN interface on a different bond interface. The last octet of the IP address is similar.

The output from 'gluster volume heal <volname> info' gives a long list of shards from all three nodes; the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in the list is the node that was updated to gluster 6 / oVirt 4.3. That is curious, I think.

Furthermore, I find the following errors (' E ') in the glusterd.log of the upgraded host:

[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details. [2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. 
[2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management) [2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1 [2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> 
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52) [2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53) [2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management) [2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details. [2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details. [2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. 
[2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api

I'm curious for your opinion of the data so far. Love to hear from you (or other members of the list).

Regards,
Mark Lamers
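Since the peers are on mixed gluster versions right now, it may also be worth confirming that the cluster op-version is still consistent across the nodes - a hedged suggestion, these options should exist on both 3.12.x and 6.x, but verify on your builds:

   gluster volume get all cluster.op-version       # op-version the cluster is currently operating at
   gluster volume get all cluster.max-op-version   # highest op-version this node could support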

It seems that something went "off". That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome. I think that you might have a bad package, and based on the info it could be glusterfs-libs, which should provide /usr/lib64/libgfrpc.so.0. I'm currently on gluster v7.0 and I can't check it on my installation.

Run a for loop to check the rpms:

for i in $(rpm -qa | grep gluster) ; do echo "$i :" ; rpm -V $i ; echo ; echo ; done

Most probably you can safely redo (on the upgraded node) the last yum transaction:

yum history
yum history info <ID>   -> verify the gluster packages were installed here
yum history redo <ID>

If kernel, glibc or systemd were not in this transaction, you can stop gluster and start it again:

Node in maintenance in oVirt
systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd
systemctl stop sanlock
systemctl stop glusterd
/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh

And then power up again:

systemctl start glusterd
gluster volume status   -> check all connected
systemctl start sanlock supervdsmd vdsmd
systemctl start ovirt-ha-broker ovirt-ha-agent

Check the situation. Yet, you need to make gluster stop complaining before you can take care of the heal. Usually 'rsync' is my best friend - but that is when gluster is working normally - and your case is far from normal. If the redo doesn't work for you -> try "yum history rollback" to recover to the last good state. I think that 'BOOM Boot Manager' is best in such cases.

Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment.

Best Regards,
Strahil Nikolov
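To double-check which packages actually own the library and the transport module mentioned above - a small sketch that lets rpm name the owners rather than guessing:

   rpm -qf /usr/lib64/libgfrpc.so.0
   rpm -qf /usr/lib64/glusterfs/6.6/rpc-transport/socket.so
   rpm -V <package-reported-above>   # then verify exactly those packages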
Further more I find the following errors ' E ' in the glusterd.log of the upgrated host: [2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details. [2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. 
[2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management) [2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1 [2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> 
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52) [2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53) [2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management) [2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details. [2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details. [2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. 
[2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api I 'm curious for your opinion of the data sofar. Love to here from you (or ather members of the list) Regards Mark Lamers _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/VUBWMDUN3GMKGO...

Hi Strahil,

Thanks for your time so far!

The packages seem fine on all of the 3 nodes. Only /var/lib/glusterd/glusterd.info is modified, and on the not-yet-upgraded nodes these files are missing:

missing   /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
missing   /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh
missing   /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh
missing   /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh
missing   /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh

But that doesn't seem relevant...

I stopped a couple of virtual machines whose image files had unsynced entries. When they were turned off, I couldn't find them anymore in the unsynced entries list. When I turned them on again, some re-appeared, some didn't.

I really don't know where to look next. The big question is: will the problems be resolved when I upgrade the two remaining nodes, or will it get worse?

Regards,

Bertjan

Did you manage to fix the issue with the 0-xlator? If yes, most probably the issues will be OK. Yet 'probably' doesn't mean that they 'will' be OK. If it was my lab, I would go ahead only if the 0-xlator issue is over. Yet, a lab is a different thing than prod - so it is your sole decision.

Did you test the upgrade prior to moving to prod?

About the hooks - I had such an issue before and I had to reinstall the gluster rpms to solve it.

Best Regards,
Strahil Nikolov
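A minimal sketch of restoring the hook scripts by reinstalling - assuming CentOS 7 and that they are shipped by glusterfs-server, which is worth confirming with the query first:

   rpm -ql glusterfs-server | grep hooks   # check whether this package should own the missing hook scripts
   yum reinstall glusterfs-server          # reinstall to put the missing files back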
It seems that something went "off". That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome. I think that you might have a bad package and based on the info it could be glusterfs-libs which should provide /usr/lib64/libgfrpc.so.0 . I'm currently on gluster v7.0 and I can't check it on my installation. Run a for loop to check the rpms: for i in $(rpm -qa | grep gluster) ; do echo "$i :" ; rpm -V $i'; echo;echo; done Most probably you can safely redo (on the upgraded node) the last yum transaction: yum history yum history info <ID> -> verify the gluster packages were installed here yum history redo <ID> If kernel, glibc, systemd were not in this transaction , you can stop gluster and start it again: Node in maintenance in oVirt systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd systemctl stop sanlock systemctl stop glusterd /usr/share/glusterfs/scripts/stop-all-gluster-processes.sh And then power up again: systemctl start glusterd gluster volume status -> check all connected systemctl start sanlock supervdsmd vdsmd systemctl start ovirt-ha-broker ovirt-ha-agent Check situation. Yet, you need to make gluster stop complaining , before you can take care of the heal. Usually 'rsync' is my best friend - but this is when gluster is working normally - and your case is far from normal. If redo doesn't work for you -> try the "yum history rollback" to recover to last good state. I think that 'BOOM Boot Manager' is best in such cases. Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment. Best Regards, Strahil Nikolov В неделя, 2 февруари 2020 г., 08:56:01 ч. Гринуич+2, Mark Lamers <mark.r.lamers@gmail.com> написа:
Hi Strahil,
Bertjan is not in the office today, so I will reply if okay with you.
First I like to describe the status of our network
There are three bricks:
and 5 nodes:
Every host has a management and migrate vlan iface on a different bond iface.
The last octet of the ipaddress is similar
The output from 'gluster volume heal <volname> info' gives a long list of shards from all three nodes, the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in the list is the updated node to gluster6, oVirt 4.3. That is curious i think.
Further more I find the following errors ' E ' in the glusterd.log of the upgrated host:
[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management)
[2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1
[2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52)
[2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53)
[2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details.
[2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management)
[2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details.
[2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details.
[2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details.
[2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1
[2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1
[2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1
[2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1
[2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
[2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
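Two different problems show up in that log: the repeating 'Unable to acquire lock for vmstore1' lines mean glusterd could not take the cluster-wide management lock for the volume (usually another glusterd still holds it, or it was never released), while the dlsym(xlator_api) lines are what Strahil flags in his reply as a possible bad or mismatched package. A sketch of read-only checks, not a diagnosis:

  # which packages own the files named in the dlsym error, and do the gluster versions all match?
  rpm -qf /usr/lib64/glusterfs/6.6/rpc-transport/socket.so /usr/lib64/libgfrpc.so.0
  rpm -qa | grep -i glusterfs | sort

  # who is in the cluster and what state the volume is in; a stale mgmt lock often
  # clears after restarting glusterd on the node that still holds it
  gluster peer status
  gluster volume status vmstore1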
I'm curious about your opinion of the data so far.

Love to hear from you (or other members of the list).

Regards, Mark Lamers

Hi Strahil,

The 0-xlator message still occurs, but not frequently. Yesterday a couple of times, but on the 4th there were no entries at all.

What I did find out was that the unsynced entries belonged to VM images which were on specific hosts. Yesterday I migrated them all to other hosts and the unsynced entries were gone except for 3. After a 'stat' of those files/directories, they were gone too.

I think I can migrate the remaining hosts now. An option would be to move the bricks of the not-yet-upgraded hosts to upgraded hosts. I have spare disks. What do you think?

Regards,

Bertjan

On Tue, Feb 04, 2020 at 06:40:21PM +0000, Strahil Nikolov wrote:
Did you manage to fix the issue with the 0-xlator? If yes, most probably the issue will be OK. Yet 'probably' doesn't mean that it 'will' be OK. If it was my lab, I would go ahead only if the 0-xlator issue is over. Yet, a lab is a different thing than prod - so it is your sole decision. Did you test the upgrade prior to moving to prod?

About the hooks - I had such an issue before and I had to reinstall the gluster rpms to solve it.

Best Regards,
Strahil Nikolov

On Tuesday, 4 February 2020 at 16:35:27 GMT+2, Goorkate, B.J. <b.j.goorkate@umcutrecht.nl> wrote:

Hi Strahil,
Thanks for your time so far!
The packages seem fine on all of the 3 nodes. Only /var/lib/glusterd/glusterd.info is modified and on the not yet upgraded nodes these files are missing:
missing    /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
missing    /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh
missing    /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh
missing    /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh
missing    /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
But that doesn't seem relevant...
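If those hook scripts ever do need to come back, reinstalling the package that ships them should be enough. On most builds that is glusterfs-server, but it is worth confirming first - a sketch, with the rpm query run on a node that still has the files:

  rpm -qf /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh    # which package provides the hooks
  rpm -ql glusterfs-server | grep /var/lib/glusterd/hooks       # what that package is supposed to install
  yum reinstall glusterfs-server                                # puts missing packaged files back

As Strahil mentions above, reinstalling the gluster rpms fixed the same symptom for him.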
I stopped a couple of virtual machines whose image files had unsynced entries. When they were turned off, I couldn't find them anymore in the unsynced entries list. When I turned them on again, some re-appeared, some didn't.
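One trick that sometimes flushes entries that linger for a running image: look the file up through a FUSE mount of the volume, which makes the client compare all replicas and queue a heal. A rough sketch with a hypothetical mount point - on an oVirt hypervisor the volume is typically already mounted under /rhev/data-center/mnt/glusterSD/, so the stat can be done there as well:

  mount -t glusterfs localhost:/vmstore1 /mnt/vmstore1           # hypothetical server and mount point
  stat /mnt/vmstore1/<path-reported-by-heal-info> > /dev/null    # touch each path from 'heal info'

This is the same 'stat' trick that ended up clearing the last few entries for Bertjan.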
I really don't know where to look next.
The big question is: will the problems be resolved when I upgrade the two remaining nodes, or will they get worse?
Regards,
Bertjan
Hi Bertjan,

If you healed everything, I think there is no need to migrate bricks. Just as a precaution, keep a test VM for testing power off and power on after each node is upgraded.

What version are you upgrading to?

Best Regards,
Strahil Nikolov
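For completeness, a rough checklist for finishing the rolling upgrade - only a sketch, with the volume name vmstore1 taken from the logs above, and the op-version step applies only once every node runs the new gluster:

  # before touching the next node: no pending heals, all peers connected
  gluster volume heal vmstore1 info summary
  gluster peer status

  # after ALL nodes are upgraded: read the highest op-version the cluster supports,
  # then raise the cluster op-version to it
  gluster volume get all cluster.max-op-version
  gluster volume set all cluster.op-version <value reported above>

  # and if a brick really has to move to other disks/hosts, the usual form is:
  # gluster volume replace-brick vmstore1 <old-host>:<old-brick-path> <new-host>:<new-brick-path> commit force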

Hi,

There is just one unsynced entry which comes back every time: the dom_md/ids file. When I add/delete a VM it's gone for a short while, but then it reappears.

I did a brick replace to a node which is already upgraded. After that I'll do the same with the brick on the remaining not-yet-upgraded node, just to be sure. Luckily we have the hardware for it.

We're migrating from 4.2.8 (I think) to 4.3.7.2.

Regards,

Bertjan

On Thu, Feb 06, 2020 at 12:43:48PM +0200, Strahil Nikolov wrote:
On February 6, 2020 9:42:13 AM GMT+02:00, "Goorkate, B.J." <b.j.goorkate@umcutrecht.nl> wrote:
Hi Strahil,
The 0-xlator-message still occurs, but not frequent. Yesterdag a couple of times, but the 4th there were no entries at all.
What I did find out, was that the unsynced entries belonged to VM-images which were on specific hosts.
Yesterday I migrated them all to other hosts and the unsynced entries were gone except for 3. After a 'stat' of those files/directories, they were gone too.
I think I can migrate the remaining hosts now. An option would be to move the bricks of the not-yet-upgraded hosts to upgraded hosts. I have spare disks. What do you think?
Regards,
Bertjan
Did you manage to fix the issue with the 0-xlator ? If yes, most
the issue will be OK. Yet 'probably' doesn't meant that they 'will' be OK. If it was my lab - I would go ahead only if the 0-xlator issue is over.Yet, a lab is a different thing than prod - so it is your sole decision. Did you test the upgrade prior moving to Prod ? About the hooks - I had such issue before and I had to reinstall
gluster rpms to solve it. Best Regards, Strahil Nikolov В вторник, 4 февруари 2020 г., 16:35:27 ч. Гринуич+2, Goorkate, B.J. <b.j.goorkate@umcutrecht.nl> написа: Hi Strahil,
Thanks for your time so far!
The packages seem fine on all of the 3 nodes. Only /var/lib/glusterd/glusterd.info is modified and on the not yet upgraded nodes these files are missing:
missing /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh missing /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh missing /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh missing /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh missing /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
But that doesn't seem relevant...
I stopped a couple of virtual machines with image files with unsynched entries. When they were turned off, I couldn't find them anymore in the unsyced entries list. When I turned them on again, some re-appeared, some didn't.
I really don't know where to look next.
The big question is: will the problems be resolved when I upgrade
remaining nodes, or will it get worse?
Regards,
Bertjan
It seems that something went "off". That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome. I think that you might have a bad package and based on the info it could be glusterfs-libs which should provide /usr/lib64/libgfrpc.so.0 . I'm currently on gluster v7.0 and I can't check it on my installation. Run a for loop to check the rpms: for i in $(rpm -qa | grep gluster) ; do echo "$i :" ; rpm -V $i'; echo;echo; done Most probably you can safely redo (on the upgraded node) the last yum transaction: yum history yum history info <ID> -> verify the gluster packages were installed here yum history redo <ID> If kernel, glibc, systemd were not in this transaction , you can stop gluster and start it again: Node in maintenance in oVirt systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd systemctl stop sanlock systemctl stop glusterd /usr/share/glusterfs/scripts/stop-all-gluster-processes.sh And then power up again: systemctl start glusterd gluster volume status -> check all connected systemctl start sanlock supervdsmd vdsmd systemctl start ovirt-ha-broker ovirt-ha-agent Check situation. Yet, you need to make gluster stop complaining , before you can take care of the heal. Usually 'rsync' is my best friend - but this is when gluster is working normally - and your case is far from normal. If redo doesn't work for you -> try the "yum history rollback" to recover to last good state. I think that 'BOOM Boot Manager' is best in such cases. Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment. Best Regards, Strahil Nikolov В неделя, 2 февруари 2020 г., 08:56:01 ч. Гринуич+2, Mark Lamers <mark.r.lamers@gmail.com> написа:
Hi Strahil,
Bertjan is not in the office today, so I will reply if okay with you.
First I like to describe the status of our network
There are three bricks:
and 5 nodes:
Every host has a management and migrate vlan iface on a different bond iface.
The last octet of the ipaddress is similar
The output from 'gluster volume heal <volname> info' gives a long
On Sun, Feb 02, 2020 at 08:06:58PM +0000, Strahil Nikolov wrote: list of
shards from all three nodes, the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in
is the
updated node to gluster6, oVirt 4.3. That is curious i think.
Further more I find the following errors ' E ' in the glusterd.log of the upgrated host:
[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details. [2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. [2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management) [2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1 [2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
On Tue, Feb 04, 2020 at 06:40:21PM +0000, Strahil Nikolov wrote: probably the the two the list dlsym(xlator_api) dlsym(xlator_api)
missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (-->
/lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987]
(--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52) [2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (-->
/lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (-->
(--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53) [2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management) [2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details. [2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details. [2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. [2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 15:08:10.814931] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
I'm curious about your opinion of the data so far.
Love to hear from you (or other members of the list).
Regards, Mark Lamers
Hi Bertjan,
If you healed everything - I think there is no need to migrate bricks. Just a precaution - keep a test VM for testing power off and power on after each node is upgraded.
What version are you upgrading to?
Best Regards, Strahil Nikolov

On February 7, 2020 11:28:17 AM GMT+02:00, "Goorkate, B.J." <b.j.goorkate@umcutrecht.nl> wrote:
Hi,
There is just one unsynced entry which comes back every time: the dom_md/ids file.
When I add/delete a VM it's gone for a short while, but then it reappears.
I did a brick replace to a node which is already upgraded. After that I'll do the same with the brick on the remaining not-yet-upgraded node, just to be sure. Luckily we have the hardware for it.
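For reference, the brick replace itself is a single gluster command run from any node in the trusted pool. A minimal sketch, assuming the volume is vmstore1 and using made-up hostnames and brick paths (adjust to your own layout):

# Move the brick from the old node to a fresh, empty directory on an already-upgraded node.
# Hostnames and brick paths below are placeholders, not taken from this thread.
gluster volume replace-brick vmstore1 \
  oldnode.example.com:/gluster/brick1/vmstore1 \
  newnode.example.com:/gluster/brick1/vmstore1 \
  commit force
# The new brick is then populated by self-heal; watch it drain with:
gluster volume heal vmstore1 info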
We're migrating from 4.2.8 (I think) to 4.3.7.2.
Regards,
Bertjan
On February 6, 2020 9:42:13 AM GMT+02:00, "Goorkate, B.J." <b.j.goorkate@umcutrecht.nl> wrote:
Hi Strahil,
The 0-xlator message still occurs, but not frequently. Yesterday it appeared a couple of times, but on the 4th there were no entries at all.
What I did find out was that the unsynced entries belonged to VM images which were on specific hosts.
Yesterday I migrated them all to other hosts and the unsynced entries were gone, except for 3. After a 'stat' of those files/directories, they were gone too.
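For reference, that 'stat' trick can be scripted. A rough sketch, assuming the volume is vmstore1 and guessing a typical oVirt FUSE mount point (both are assumptions, not confirmed here); gfid-only entries are skipped:

#!/bin/bash
# Stat every path-based entry reported by heal info through the FUSE mount;
# the lookup this triggers can kick off self-heal for that entry.
VOL=vmstore1                                         # assumed volume name
MNT=/rhev/data-center/mnt/glusterSD/node1:_vmstore1  # assumed mount point, adjust
gluster volume heal "$VOL" info | awk '/^\//' | sort -u | while read -r entry; do
  stat "$MNT$entry" > /dev/null 2>&1 || echo "could not stat $entry"
done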
I think I can migrate the remaining hosts now. An option would be to move the bricks of the not-yet-upgraded hosts to upgraded hosts. I have spare disks. What do you think?
Regards,
Bertjan
On Tue, Feb 04, 2020 at 06:40:21PM +0000, Strahil Nikolov wrote:
Did you manage to fix the issue with the 0-xlator? If yes, most probably the issues will be OK. Yet 'probably' doesn't mean that they 'will' be OK. If it was my lab, I would go ahead only if the 0-xlator issue is over. Yet a lab is a different thing than prod, so it is your sole decision. Did you test the upgrade prior to moving to Prod? About the hooks - I had such an issue before and I had to reinstall
gluster rpms to solve it.
Best Regards, Strahil Nikolov
On Tuesday, 4 February 2020 at 16:35:27 GMT+2, Goorkate, B.J. <b.j.goorkate@umcutrecht.nl> wrote:
Hi Strahil,
Thanks for your time so far!
The packages seem fine on all of the 3 nodes. Only /var/lib/glusterd/glusterd.info is modified and on the not yet upgraded nodes these files are missing:
missing    /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
missing    /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh
missing    /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh
missing    /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh
missing    /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh
But that doesn't seem relevant...
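For anyone who wants to repeat this check, a loop along these lines (essentially the one Strahil suggests below) does it; a sketch only:

# Verify every installed gluster package against the rpm database;
# apart from expected config changes, the output should be empty.
for i in $(rpm -qa | grep gluster); do
  echo "== $i"
  rpm -V "$i"
  echo
done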
I stopped a couple of virtual machines whose image files had unsynced entries. When they were turned off, I couldn't find them in the unsynced entries list anymore. When I turned them on again, some reappeared, some didn't.
I really don't know where to look next.
The big question is: will the problems be resolved when I upgrade the two remaining nodes, or will it get worse?
Regards,
Bertjan
On Sun, Feb 02, 2020 at 08:06:58PM +0000, Strahil Nikolov wrote:
It seems that something went "off". That '0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api' is really worrisome. I think that you might have a bad package, and based on the info it could be glusterfs-libs, which should provide /usr/lib64/libgfrpc.so.0. I'm currently on gluster v7.0 and I can't check it on my installation. Run a for loop to check the rpms:
for i in $(rpm -qa | grep gluster) ; do echo "$i :" ; rpm -V $i ; echo ; echo ; done
Most probably you can safely redo (on the upgraded node) the last yum transaction:
yum history
yum history info <ID> -> verify the gluster packages were installed here
yum history redo <ID>
If kernel, glibc and systemd were not in this transaction, you can stop gluster and start it again:
Node in maintenance in oVirt
systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd
systemctl stop sanlock
systemctl stop glusterd
/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
And then power up again:
systemctl start glusterd
gluster volume status -> check all connected
systemctl start sanlock supervdsmd vdsmd
systemctl start ovirt-ha-broker ovirt-ha-agent
Check the situation. Yet, you need to make gluster stop complaining before you can take care of the heal. Usually 'rsync' is my best friend - but that is when gluster is working normally, and your case is far from normal. If redo doesn't work for you -> try "yum history rollback" to recover to the last good state. I think that 'BOOM Boot Manager' is best in such cases.
Note: Never take any of my words for granted. I'm not running oVirt in production and some of my methods might not be OK for your environment.
Best Regards, Strahil Nikolov
On Sunday, 2 February 2020 at 08:56:01 GMT+2, Mark Lamers <mark.r.lamers@gmail.com> wrote:
Hi Strahil,
Bertjan is not in the office today, so I will reply if okay with you.
First I would like to describe the status of our network.
There are three bricks:
and 5 nodes:
Every host has a management and migrate vlan iface on a different bond iface.
The last octet of the IP address is similar.
The output from 'gluster volume heal <volname> info' gives a long list of shards from all three nodes; the file is attached as 'gluster_volume_heal_info.txt'. The node without shards in the list is the updated node to gluster6, oVirt 4.3. That is curious, I think.
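As a side note, that long heal-info listing can be summarised per brick to see which replica is lagging. A small sketch (volume name assumed):

# Count pending heal entries per brick instead of reading the full listing.
gluster volume heal vmstore1 info | awk '
  /^Brick /      { brick = $2 }
  /^(\/|<gfid:)/ { count[brick]++ }
  END            { for (b in count) printf "%s\t%d\n", b, count[b] }
'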
Furthermore, I find the following errors ('E') in the glusterd.log of the upgraded host:
[2020-01-26 17:40:46.147730] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-26 22:47:16.655651] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 07:07:51.815490] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:28:14.953974] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629457] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-27 18:58:22.629595] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-27 18:58:22.756430] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-27 18:58:22.756581] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.427196] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 05:31:52.427315] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.537799] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.14. Please check log file for details. [2020-01-28 05:31:52.537973] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 05:31:52.539620] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 05:31:52.539759] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 05:31:52.539937] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. [2020-01-28 05:31:52.540446] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 08:51:45.638694] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:52:45.709950] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:54:12.455555] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 08:54:16.214779] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496842] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 08:57:59.496905] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 08:57:59.505119] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli read-only, ProgVers: 2, Proc: 5) to rpc-transport (socket.management) [2020-01-28 08:57:59.505135] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 08:57:59.647456] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 08:57:59.647508] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: 1 [2020-01-28 08:57:59.654929] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 08:57:59.654943] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:07:34.941350] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:07:34.941391] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:07:35.042466] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:07:35.042510] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 09:13:32.329172] E [MSGID: 106244] [glusterd.c:1785:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2020-01-28 09:13:44.024431] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator:
dlsym(xlator_api) missing:
/usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 09:45:46.347499] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:46:45.837466] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:47:45.976186] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 09:48:24.976568] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (-->
/lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987]
(--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2020-01-28 09:44:45.743365 (xid=0x52) [2020-01-28 09:48:24.976640] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.976743] E [MSGID: 106150] [glusterd-syncop.c:1931:gd_sync_task_begin] 0-management: Locking Peers Failed. [2020-01-28 09:48:24.977361] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f61c7c07adb] (-->
/lib64/libgfrpc.so.0(+0xd7e4)[0x7f61c79ae7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f61c79ae8fe] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61c79af987]
(--> /lib64/libgfrpc.so.0(+0xf518)[0x7f61c79b0518] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2020-01-28 09:46:03.586874 (xid=0x53) [2020-01-28 09:48:24.977417] E [MSGID: 106152] [glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Staging failed on 10.13.250.11. Please check log file for details. [2020-01-28 09:48:24.977631] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 22) to rpc-transport (socket.management) [2020-01-28 09:48:24.977664] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:48:24.977744] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.13. Please check log file for details. [2020-01-28 09:48:24.977861] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.15. Please check log file for details. [2020-01-28 09:48:25.012037] E [MSGID: 106115] [glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on 10.13.250.14. Please check log file for details. [2020-01-28 09:48:25.012148] E [MSGID: 106151] [glusterd-syncop.c:1616:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s) [2020-01-28 09:48:25.012315] E [MSGID: 106117] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Unable to release lock for vmstore1 [2020-01-28 09:48:25.012425] E [rpcsvc.c:1577:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x2, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management) [2020-01-28 09:48:25.012452] E [MSGID: 106430] [glusterd-utils.c:558:glusterd_submit_reply] 0-glusterd: Reply submission failed [2020-01-28 09:50:46.105492] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-28 10:13:16.199982] E [MSGID: 106118] [glusterd-op-sm.c:4133:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 10:13:16.200078] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 10:13:16.345918] E [MSGID: 106117] [glusterd-op-sm.c:4189:glusterd_op_ac_unlock] 0-management: Unable to release lock for vmstore1 [2020-01-28 10:13:16.346012] E [MSGID: 106376] [glusterd-op-sm.c:8150:glusterd_op_sm] 0-management: handler returned: -1 [2020-01-28 12:42:46.212783] E [MSGID: 106118] [glusterd-syncop.c:1896:gd_sync_task_begin] 0-management: Unable to acquire lock for vmstore1 [2020-01-28 13:25:49.857368] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-29 13:26:03.031179] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 12:33:18.590063] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 13:34:57.027468] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-30 15:08:10.814931] E [MSGID: 101097] 
[xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api [2020-01-31 09:40:22.725825] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.6/rpc-transport/socket.so: undefined symbol: xlator_api
I'm curious about your opinion of the data so far.
Love to hear from you (or other members of the list).
Regards, Mark Lamers
On Thu, Feb 06, 2020 at 12:43:48PM +0200, Strahil Nikolov wrote:
Hi Bertjan,
If you healed everything - I think there is no need to migrate bricks. Just a precaution - keep a test VM for testing power off and power on after each node is upgraded.
What version are you upgrading to?
Best Regards, Strahil Nikolov
You can check which of the bricks has the newest copy (usually the difference is in the timestamp inside) and copy that one onto the rest of the bricks. Then a heal can clear the pending heal entry. Sadly this has been happening for some time, and I never had such issues with oVirt 4.2 on gluster v3.
Best Regards, Strahil Nikolov
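To make that concrete: for an oVirt storage domain the file usually lives at <brick>/<domain-uuid>/dom_md/ids on every brick, so the copies can be compared by timestamp and checksum before picking the newest one. A sketch, with the brick path, hostnames and domain UUID as placeholders:

# Compare the replicas of dom_md/ids directly on the bricks (all values below are placeholders).
BRICK=/gluster/brick1/vmstore1
DOMAIN=<storage-domain-uuid>
for host in node1 node2 node3; do
  echo "== $host"
  ssh "$host" "ls -l --time-style=full-iso $BRICK/$DOMAIN/dom_md/ids && md5sum $BRICK/$DOMAIN/dom_md/ids"
done
# After copying the newest version over the stale ones (on the bricks themselves),
# trigger a heal so the pending entry can be cleared:
gluster volume heal vmstore1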

Hi,
Sorry for the delayed response... After we migrated the gluster bricks to already-upgraded nodes, everything was fine and it seems the cluster is healthy again.
Thanks for the help!
Regards,
Bertjan
On Fri, Feb 07, 2020 at 03:56:35PM +0200, Strahil Nikolov wrote:
You can check which of the bricks has the newest copy (usually the difference is in the timestamp inside) and copy that one onto the rest of the bricks. Then a heal can clear the pending heal entry.
Sadly this has been happening for some time, and I never had such issues with oVirt 4.2 on gluster v3.
Best Regards, Strahil Nikolov
Participants (3):
- Goorkate, B.J.
- Mark Lamers
- Strahil Nikolov