
So why is this the default behavior of the oVirt Node distro?

On Sun, Feb 3, 2019 at 5:16 PM Strahil <hunter86_bg@yahoo.com> wrote:
2. Reboot all nodes. I was testing the response to a power outage. All nodes come up, but glusterd is not running (it seems to have failed for some reason). I can manually restart glusterd on all nodes, and it comes up and starts communicating normally. However, the engine does not come online. So I figured out where it last lived and tried to start it manually through the web interface. This fails because vdsm-ovirtmgmt is not up. The correct way to start the engine turned out to be through the CLI via hosted-engine --vm-start. This does work, but it takes a very long time, and it usually starts up on any node other than the one I told it to start on.
If you use fstab - prepare for pain... Systemd mount units are more reliable. Here is a sample:
[root@ovirt1 ~]# systemctl cat gluster_bricks-engine.mount
# /etc/systemd/system/gluster_bricks-engine.mount
[Unit]
Description=Mount glusterfs brick - ENGINE
Requires = vdo.service
After = vdo.service
Before = glusterd.service
Conflicts = umount.target

[Mount]
What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
Where=/gluster_bricks/engine
Type=xfs
Options=inode64,noatime,nodiratime

[Install]
WantedBy=glusterd.service
[root@ovirt1 ~]# systemctl cat glusterd.service
# /etc/systemd/system/glusterd.service
[Unit]
Description=GlusterFS, a clustered file-system server
Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount
After=network.target rpcbind.service gluster_bricks-engine.mount
Before=network-online.target

[Service]
Type=forking
PIDFile=/var/run/glusterd.pid
LimitNOFILE=65536
Environment="LOG_LEVEL=INFO"
EnvironmentFile=-/etc/sysconfig/glusterd
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL
KillMode=process
SuccessExitStatus=15

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/glusterd.service.d/99-cpu.conf
[Service]
CPUAccounting=yes
Slice=glusterfs.slice
Note : Some of the 'After=' and 'Requires=' entries were removed during copy-pasting.
So I guess two (or three) questions. What is the expected operation after a full cluster reboot (i.e., in the event of a power failure)? Why doesn't the engine start automatically? And what might be causing glusterd to fail, when it can be restarted manually and then works fine?
Expected: everything up and running. Root cause: the mounts produced by systemd's fstab generator come up after Gluster tries to start the bricks, so the brick start fails, and then everything further down the chain fails as well.
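If switching to native mount units is not an option, the same ordering can be expressed in fstab itself through the x-systemd.* mount options, which systemd's fstab generator translates into unit dependencies. A sketch, reusing the device path and mount point from the units above:

```ini
# /etc/fstab - ordering hints for the systemd fstab generator:
# x-systemd.requires pulls in vdo.service first, x-systemd.before
# orders the mount before glusterd.service (paths as in the units above)
/dev/mapper/gluster_vg_md0-gluster_lv_engine /gluster_bricks/engine xfs inode64,noatime,nodiratime,x-systemd.requires=vdo.service,x-systemd.before=glusterd.service 0 0
```

This only fixes the ordering; the explicit mount units above remain the more readable and maintainable approach.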
Just use systemd mount units (I have added automount as well) and you won't have such issues.
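The automount mentioned above pairs an .automount unit with the .mount unit, so the brick is mounted on first access even if the ordering is imperfect. A minimal sketch for the engine brick (the unit name must match the mount unit shown earlier; the exact wording of the description is an assumption):

```ini
# /etc/systemd/system/gluster_bricks-engine.automount
[Unit]
Description=Automount glusterfs brick - ENGINE

[Automount]
Where=/gluster_bricks/engine

[Install]
WantedBy=multi-user.target
```

Enable the .automount unit instead of (not in addition to) enabling the .mount unit directly; systemd activates gluster_bricks-engine.mount when the path is first touched.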
Best Regards, Strahil Nikolov