[ovirt-users] [Gluster-users] Centos 7.1 failed to start glusterd after upgrading to ovirt 3.6

Stefano Danzi s.danzi at hawai.it
Mon Nov 9 10:36:01 EST 2015


Here is the output of systemd-analyze critical-chain and systemd-analyze blame.
I think glusterd now starts too early (before the network is up).
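If that is the case, a systemd drop-in along these lines should force glusterd to wait for the network. This is just a sketch, not something I have tested here; the file name and the use of network-online.target are my own assumptions:

```ini
# /etc/systemd/system/glusterd.service.d/wait-for-network.conf
# Hypothetical drop-in: order glusterd after the network is fully up.
[Unit]
Wants=network-online.target
After=network-online.target
```

followed by `systemctl daemon-reload`. Note that network-online.target is only meaningful if something actually pulls it in (e.g. NetworkManager-wait-online.service or the network.service equivalent being enabled).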

[root@ovirt01 tmp]# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" 
character.
The time the unit takes to start is printed after the "+" character.

multi-user.target @17.148s
└─ovirt-ha-agent.service @17.021s +127ms
   └─vdsmd.service @15.871s +1.148s
     └─vdsm-network.service @11.495s +4.373s
       └─libvirtd.service @11.238s +254ms
         └─iscsid.service @11.228s +8ms
           └─network.target @11.226s
             └─network.service @6.748s +4.476s
               └─iptables.service @6.630s +117ms
                 └─basic.target @6.629s
                   └─paths.target @6.629s
                     └─brandbot.path @6.629s
                       └─sysinit.target @6.615s
                         └─systemd-update-utmp.service @6.610s +4ms
                           └─auditd.service @6.450s +157ms
                             └─systemd-tmpfiles-setup.service @6.369s +77ms
                               └─rhel-import-state.service @6.277s +88ms
                                 └─local-fs.target @6.275s
                                    └─home-glusterfs-data.mount @5.805s +470ms
                                      └─home.mount @3.946s +1.836s
                                        └─systemd-fsck@dev-mapper-centos_ovirt01\x2dhome.service @3.937s +7ms
                                          └─dev-mapper-centos_ovirt01\x2dhome.device @3.936s



[root@ovirt01 tmp]# systemd-analyze blame
           4.476s network.service
           4.373s vdsm-network.service
           2.318s glusterd.service
           2.076s postfix.service
           1.836s home.mount
           1.651s lvm2-monitor.service
           1.258s lvm2-pvscan@9:1.service
           1.211s systemd-udev-settle.service
           1.148s vdsmd.service
           1.079s dmraid-activation.service
           1.046s boot.mount
            904ms kdump.service
            779ms multipathd.service
            657ms var-lib-nfs-rpc_pipefs.mount
            590ms systemd-fsck@dev-disk-by\x2duuid-e185849f\x2d2c82\x2d4eb2\x2da215\x2d97340e90c93e.service
            547ms tuned.service
            481ms kmod-static-nodes.service
            470ms home-glusterfs-data.mount
            427ms home-glusterfs-engine.mount
            422ms sys-kernel-debug.mount
            411ms dev-hugepages.mount
            411ms dev-mqueue.mount
            278ms systemd-fsck-root.service
            263ms systemd-readahead-replay.service
            254ms libvirtd.service
            243ms systemd-tmpfiles-setup-dev.service
            216ms systemd-modules-load.service
            209ms rhel-readonly.service
            195ms wdmd.service
            192ms sanlock.service
            191ms gssproxy.service
            186ms systemd-udev-trigger.service
            157ms auditd.service
            151ms plymouth-quit-wait.service
            151ms plymouth-quit.service
            132ms proc-fs-nfsd.mount
            127ms ovirt-ha-agent.service
            117ms iptables.service
            110ms ovirt-ha-broker.service
             96ms avahi-daemon.service
             89ms systemd-udevd.service
             88ms rhel-import-state.service
             77ms systemd-tmpfiles-setup.service
             71ms sysstat.service
             71ms microcode.service
             71ms chronyd.service
             69ms systemd-readahead-collect.service
             68ms systemd-sysctl.service
             65ms systemd-logind.service
             61ms rsyslog.service
             58ms systemd-remount-fs.service
             46ms rpcbind.service
             46ms nfs-config.service
             45ms systemd-tmpfiles-clean.service
             41ms rhel-dmesg.service
             37ms dev-mapper-centos_ovirt01\x2dswap.swap
             29ms systemd-vconsole-setup.service
             26ms plymouth-read-write.service
             26ms systemd-random-seed.service
             24ms netcf-transaction.service
             22ms mdmonitor.service
             20ms systemd-machined.service
             14ms plymouth-start.service
             12ms systemd-update-utmp-runlevel.service
             11ms systemd-fsck@dev-mapper-centos_ovirt01\x2dglusterOVEngine.service
              8ms iscsid.service
              7ms systemd-fsck@dev-mapper-centos_ovirt01\x2dhome.service
              7ms systemd-readahead-done.service
              7ms systemd-fsck@dev-mapper-centos_ovirt01\x2dglusterOVData.service
              6ms sys-fs-fuse-connections.mount
              4ms systemd-update-utmp.service
              4ms glusterfsd.service
              4ms rpc-statd-notify.service
              3ms iscsi-shutdown.service
              3ms systemd-journal-flush.service
              2ms sys-kernel-config.mount
              1ms systemd-user-sessions.service
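The "resolve brick failed in restore" error in the gluster log below suggests that, at boot, glusterd cannot resolve the host name it has stored for one of the bricks. A quick way to see which names glusterd will try to resolve is something like the following sketch; it assumes the stock /var/lib/glusterd layout, where each brick file carries a "hostname=" line:

```shell
#!/bin/sh
# Sketch: check that every brick host recorded by glusterd still resolves.
# GLUSTERD_VOLS can be overridden; the default assumes a stock layout.
GLUSTERD_VOLS="${GLUSTERD_VOLS:-/var/lib/glusterd/vols}"

check_bricks() {
    for f in "$GLUSTERD_VOLS"/*/bricks/*; do
        [ -f "$f" ] || continue
        host=$(sed -n 's/^hostname=//p' "$f")
        if getent hosts "$host" >/dev/null 2>&1; then
            printf 'OK   %s (%s)\n' "$host" "$f"
        else
            printf 'FAIL %s (%s)\n' "$host" "$f"
        fi
    done
}

check_bricks
```

Running this by hand after boot: any FAIL line points at the brick host that would make glusterd_resolve_all_bricks bail out during restore.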


On 06/11/2015 9:27, Stefano Danzi wrote:
> Hi!
> I have only one node (a test system), I didn't change any IP address,
> and the entry is in /etc/hosts.
> I think gluster now starts before the network is up.
>
> On 06/11/2015 6:32, Atin Mukherjee wrote:
>>>> [glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd:
>>>> resolve brick failed in restore
>> The above log is the culprit here. Generally this function fails when
>> GlusterD cannot resolve the host associated with a brick. Have any of the
>> nodes undergone an IP change during the upgrade process?
>>
>> ~Atin
>>
>> On 11/06/2015 09:59 AM, Sahina Bose wrote:
>>> Did you upgrade all the nodes too?
>>> Are some of your nodes not-reachable?
>>>
>>> Adding gluster-users for glusterd error.
>>>
>>> On 11/06/2015 12:00 AM, Stefano Danzi wrote:
>>>> After upgrading oVirt from 3.5 to 3.6, glusterd fails to start when the
>>>> host boots.
>>>> Starting the service manually after boot works fine.
>>>>
>>>> gluster log:
>>>>
>>>> [2015-11-04 13:37:55.360876] I [MSGID: 100030]
>>>> [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running
>>>> /usr/sbin/glusterd version 3.7.5 (args: /usr/sbin/glusterd -p
>>>> /var/run/glusterd.pid)
>>>> [2015-11-04 13:37:55.447413] I [MSGID: 106478] [glusterd.c:1350:init]
>>>> 0-management: Maximum allowed open file descriptors set to 65536
>>>> [2015-11-04 13:37:55.447477] I [MSGID: 106479] [glusterd.c:1399:init]
>>>> 0-management: Using /var/lib/glusterd as working directory
>>>> [2015-11-04 13:37:55.464540] W [MSGID: 103071]
>>>> [rdma.c:4592:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>>>> channel creation failed [Nessun device corrisponde]
>>>> [2015-11-04 13:37:55.464559] W [MSGID: 103055] [rdma.c:4899:init]
>>>> 0-rdma.management: Failed to initialize IB Device
>>>> [2015-11-04 13:37:55.464566] W
>>>> [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma'
>>>> initialization failed
>>>> [2015-11-04 13:37:55.464616] W [rpcsvc.c:1597:rpcsvc_transport_create]
>>>> 0-rpc-service: cannot create listener, initing the transport failed
>>>> [2015-11-04 13:37:55.464624] E [MSGID: 106243] [glusterd.c:1623:init]
>>>> 0-management: creation of 1 listeners failed, continuing with
>>>> succeeded transport
>>>> [2015-11-04 13:37:57.663862] I [MSGID: 106513]
>>>> [glusterd-store.c:2036:glusterd_restore_op_version] 0-glusterd:
>>>> retrieved op-version: 30600
>>>> [2015-11-04 13:37:58.284522] I [MSGID: 106194]
>>>> [glusterd-store.c:3465:glusterd_store_retrieve_missed_snaps_list]
>>>> 0-management: No missed snaps list.
>>>> [2015-11-04 13:37:58.287477] E [MSGID: 106187]
>>>> [glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd:
>>>> resolve brick failed in restore
>>>> [2015-11-04 13:37:58.287505] E [MSGID: 101019]
>>>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>>> 'management' failed, review your volfile again
>>>> [2015-11-04 13:37:58.287513] E [graph.c:322:glusterfs_graph_init]
>>>> 0-management: initializing translator failed
>>>> [2015-11-04 13:37:58.287518] E [graph.c:661:glusterfs_graph_activate]
>>>> 0-graph: init failed
>>>> [2015-11-04 13:37:58.287799] W [glusterfsd.c:1236:cleanup_and_exit]
>>>> (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7f29b876524d]
>>>> -->/usr/sbin/glusterd(glusterfs_process_volfp+0x126) [0x7f29b87650f6]
>>>> -->/usr/sbin/glusterd(cleanup_and_exit+0x69) [0x7f29b87646d9] ) 0-:
>>>> received signum (0), shutting down
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users 

