CentOS 7.1 failed to start glusterd after upgrading to oVirt 3.6

After upgrading oVirt from 3.5 to 3.6, glusterd fails to start when the host boots. Starting the service manually after boot works fine.

gluster log:

[2015-11-04 13:37:55.360876] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.5 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
[2015-11-04 13:37:55.447413] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536
[2015-11-04 13:37:55.447477] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /var/lib/glusterd as working directory
[2015-11-04 13:37:55.464540] W [MSGID: 103071] [rdma.c:4592:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [Nessun device corrisponde]
[2015-11-04 13:37:55.464559] W [MSGID: 103055] [rdma.c:4899:init] 0-rdma.management: Failed to initialize IB Device
[2015-11-04 13:37:55.464566] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2015-11-04 13:37:55.464616] W [rpcsvc.c:1597:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2015-11-04 13:37:55.464624] E [MSGID: 106243] [glusterd.c:1623:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2015-11-04 13:37:57.663862] I [MSGID: 106513] [glusterd-store.c:2036:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
[2015-11-04 13:37:58.284522] I [MSGID: 106194] [glusterd-store.c:3465:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
[2015-11-04 13:37:58.287477] E [MSGID: 106187] [glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2015-11-04 13:37:58.287505] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2015-11-04 13:37:58.287513] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2015-11-04 13:37:58.287518] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-11-04 13:37:58.287799] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7f29b876524d] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x126) [0x7f29b87650f6] -->/usr/sbin/glusterd(cleanup_and_exit+0x69) [0x7f29b87646d9] ) 0-: received signum (0), shutting down
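The manual start that works afterwards is just the standard service start plus a status check, e.g.:

systemctl start glusterd
systemctl status glusterd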

Did you upgrade all the nodes too? Are some of your nodes not reachable? Adding gluster-users for the glusterd error.

On 11/06/2015 12:00 AM, Stefano Danzi wrote:
After upgrading oVirt from 3.5 to 3.6, glusterd fails to start when the host boots. Starting the service manually after boot works fine.

[glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore

The above log is the culprit here. Generally this function fails when GlusterD fails to resolve the host associated with a brick. Have any of the nodes undergone an IP change during the upgrade process?
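One quick way to check this on an affected node is to compare the brick hosts recorded in the glusterd store with what the node can actually resolve. A sketch (the paths follow the standard glusterd store layout; BRICK_HOST is a placeholder for whatever name the grep reports):

# hosts recorded for each brick in the glusterd store
grep -r '^hostname=' /var/lib/glusterd/vols/*/bricks/

# confirm each recorded host resolves on this node
getent hosts BRICK_HOST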
~Atin

On 11/06/2015 09:59 AM, Sahina Bose wrote:
Did you upgrade all the nodes too? Are some of your nodes not reachable?
Adding gluster-users for the glusterd error.

Hi! I have only one node (a test system), I didn't change any IP address, and the entry is in /etc/hosts. I think that gluster now starts before networking.

On 06/11/2015 6:32, Atin Mukherjee wrote:
[glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore The above log is the culprit here. Generally this function fails when GlusterD fails to resolve the host associated with a brick. Have any of the nodes undergone an IP change during the upgrade process?
~Atin
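One way to confirm whether glusterd really comes up before the network is to compare the boot-time journal of the two units and look at the declared ordering. A sketch using standard systemd tooling (unit names as on this CentOS 7 host):

# interleaved boot logs of glusterd and the legacy network service
journalctl -b -u glusterd.service -u network.service

# the units glusterd is ordered after
systemctl show glusterd.service -p After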

Here is the output from systemd-analyze critical-chain and systemd-analyze blame. I think that glusterd now starts too early (before networking).

[root@ovirt01 tmp]# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

multi-user.target @17.148s
└─ovirt-ha-agent.service @17.021s +127ms
  └─vdsmd.service @15.871s +1.148s
    └─vdsm-network.service @11.495s +4.373s
      └─libvirtd.service @11.238s +254ms
        └─iscsid.service @11.228s +8ms
          └─network.target @11.226s
            └─network.service @6.748s +4.476s
              └─iptables.service @6.630s +117ms
                └─basic.target @6.629s
                  └─paths.target @6.629s
                    └─brandbot.path @6.629s
                      └─sysinit.target @6.615s
                        └─systemd-update-utmp.service @6.610s +4ms
                          └─auditd.service @6.450s +157ms
                            └─systemd-tmpfiles-setup.service @6.369s +77ms
                              └─rhel-import-state.service @6.277s +88ms
                                └─local-fs.target @6.275s
                                  └─home-glusterfs-data.mount @5.805s +470ms
                                    └─home.mount @3.946s +1.836s
                                      └─systemd-fsck@dev-mapper-centos_ovirt01\x2dhome.service @3.937s +7ms
                                        └─dev-mapper-centos_ovirt01\x2dhome.device @3.936s

[root@ovirt01 tmp]# systemd-analyze blame
4.476s network.service
4.373s vdsm-network.service
2.318s glusterd.service
2.076s postfix.service
1.836s home.mount
1.651s lvm2-monitor.service
1.258s lvm2-pvscan@9:1.service
1.211s systemd-udev-settle.service
1.148s vdsmd.service
1.079s dmraid-activation.service
1.046s boot.mount
904ms kdump.service
779ms multipathd.service
657ms var-lib-nfs-rpc_pipefs.mount
590ms systemd-fsck@dev-disk-by\x2duuid-e185849f\x2d2c82\x2d4eb2\x2da215\x2d97340e90c93e.service
547ms tuned.service
481ms kmod-static-nodes.service
470ms home-glusterfs-data.mount
427ms home-glusterfs-engine.mount
422ms sys-kernel-debug.mount
411ms dev-hugepages.mount
411ms dev-mqueue.mount
278ms systemd-fsck-root.service
263ms systemd-readahead-replay.service
254ms libvirtd.service
243ms systemd-tmpfiles-setup-dev.service
216ms systemd-modules-load.service
209ms rhel-readonly.service
195ms wdmd.service
192ms sanlock.service
191ms gssproxy.service
186ms systemd-udev-trigger.service
157ms auditd.service
151ms plymouth-quit-wait.service
151ms plymouth-quit.service
132ms proc-fs-nfsd.mount
127ms ovirt-ha-agent.service
117ms iptables.service
110ms ovirt-ha-broker.service
96ms avahi-daemon.service
89ms systemd-udevd.service
88ms rhel-import-state.service
77ms systemd-tmpfiles-setup.service
71ms sysstat.service
71ms microcode.service
71ms chronyd.service
69ms systemd-readahead-collect.service
68ms systemd-sysctl.service
65ms systemd-logind.service
61ms rsyslog.service
58ms systemd-remount-fs.service
46ms rpcbind.service
46ms nfs-config.service
45ms systemd-tmpfiles-clean.service
41ms rhel-dmesg.service
37ms dev-mapper-centos_ovirt01\x2dswap.swap
29ms systemd-vconsole-setup.service
26ms plymouth-read-write.service
26ms systemd-random-seed.service
24ms netcf-transaction.service
22ms mdmonitor.service
20ms systemd-machined.service
14ms plymouth-start.service
12ms systemd-update-utmp-runlevel.service
11ms systemd-fsck@dev-mapper-centos_ovirt01\x2dglusterOVEngine.service
8ms iscsid.service
7ms systemd-fsck@dev-mapper-centos_ovirt01\x2dhome.service
7ms systemd-readahead-done.service
7ms systemd-fsck@dev-mapper-centos_ovirt01\x2dglusterOVData.service
6ms sys-fs-fuse-connections.mount
4ms systemd-update-utmp.service
4ms glusterfsd.service
4ms rpc-statd-notify.service
3ms iscsi-shutdown.service
3ms systemd-journal-flush.service
2ms sys-kernel-config.mount
1ms systemd-user-sessions.service
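A related check is to ask for the critical chain of glusterd itself rather than of the default target; on systemd versions where critical-chain accepts a unit argument this is simply:

systemd-analyze critical-chain glusterd.service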
On 06/11/2015 9:27, Stefano Danzi wrote:
Hi! I have only one node (a test system), I didn't change any IP address, and the entry is in /etc/hosts. I think that gluster now starts before networking.
On 06/11/2015 6:32, Atin Mukherjee wrote:
[glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore The above log is the culprit here. Generally this function fails when GlusterD fails to resolve the host associated with a brick. Have any of the nodes undergone an IP change during the upgrade process?
~Atin

On Mon, Nov 9, 2015 at 9:06 PM, Stefano Danzi <s.danzi@hawai.it> wrote:
Here is the output from systemd-analyze critical-chain and systemd-analyze blame. I think that glusterd now starts too early (before networking).
You are nearly right. GlusterD did start too early. GlusterD is configured to start after network.target. But network.target in systemd only guarantees that the network management stack is up; it doesn't guarantee that the network devices have been configured and are usable (Ref [1]). This means that when GlusterD starts, the network is still not up, and hence GlusterD fails to resolve bricks.

While we could start GlusterD after network-online.target, it would break GlusterFS mounts configured in /etc/fstab with the _netdev option. Systemd automatically schedules _netdev mounts after network-online.target (Ref [1], network-online.target), so the GlusterFS mounts could be attempted before GlusterD is up, causing them to fail. The problem can be solved with systemd-220 [2], which introduced support for the `x-systemd.requires` fstab option that can be used to order mounts after specific services, but that is not possible on el7, which ships systemd-208.

[1]: https://wiki.freedesktop.org/www/Software/systemd/NetworkTarget/
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=812826
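For concreteness, on a distribution with systemd >= 220 that ordering could be expressed directly in /etc/fstab; a sketch, where the server, volume, and mount point names are placeholders rather than values from this thread:

ovirt01:/data  /mnt/gluster-data  glusterfs  defaults,_netdev,x-systemd.requires=glusterd.service  0 0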

To temporarily fix this problem I changed the [Unit] section in the glusterd.service file:

[Unit]
Description=GlusterFS, a clustered file-system server
After=network.target rpcbind.service network-online.target vdsm-network.service
Before=vdsmd.service

On 10/11/2015 8:02, Kaushal M wrote:
You are nearly right. GlusterD did start too early. GlusterD is configured to start after network.target, but network.target in systemd only guarantees that the network management stack is up; it doesn't guarantee that the network devices have been configured and are usable.
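A variant of the same workaround that survives package updates is to put the ordering in a systemd drop-in instead of editing the shipped unit file. A sketch (the drop-in file name is arbitrary; it adds the same After=/Before= ordering as the edit above):

# /etc/systemd/system/glusterd.service.d/order.conf
[Unit]
After=network.target rpcbind.service network-online.target vdsm-network.service
Before=vdsmd.service

Then reload unit definitions with: systemctl daemon-reload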

Hello! It's a test environment, so I have only one node. If I start glusterd manually a few seconds after boot I have no problems; the error occurs only during boot. I think that something changed during the upgrade. Maybe glusterd now starts before networking or rpc.

On 06/11/2015 5:29, Sahina Bose wrote:
Did you upgrade all the nodes too? Are some of your nodes not reachable?
Adding gluster-users for the glusterd error.
participants (4)

- Atin Mukherjee
- Kaushal M
- Sahina Bose
- Stefano Danzi