So why is this the default behavior of the oVirt Node distro?
On Sun, Feb 3, 2019 at 5:16 PM Strahil <hunter86_bg(a)yahoo.com> wrote:
2. Reboot all nodes. I was testing for power outage response. All nodes
come up, but glusterd is not running (seems to have failed for some
reason). I can manually restart glusterd on all nodes and it comes up and
starts communicating normally. However, the engine does not come online. So
I figure out where it last lived, and try to start it manually through the
web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
the correct way to start up the engine would be through the cli via
hosted-engine --vm-start. This does work, but it takes a very long time,
and it usually starts up on any node other than the one I told it to start
on.
If you use fstab - prepare for pain... Systemd mounts are more effective.
Here is a sample:
[root@ovirt1 ~]# systemctl cat gluster_bricks-engine.mount
# /etc/systemd/system/gluster_bricks-engine.mount
[Unit]
Description=Mount glusterfs brick - ENGINE
Requires = vdo.service
After = vdo.service
Before = glusterd.service
Conflicts = umount.target
[Mount]
What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
Where=/gluster_bricks/engine
Type=xfs
Options=inode64,noatime,nodiratime
[Install]
WantedBy=glusterd.service
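The glusterd unit also depends on gluster_bricks-data.mount; a matching unit for the data brick would look much the same. A sketch only - the VG/LV names below are assumptions, adjust them to your own layout:

```ini
# /etc/systemd/system/gluster_bricks-data.mount  (hypothetical sketch)
[Unit]
Description=Mount glusterfs brick - DATA
Requires = vdo.service
After = vdo.service
Before = glusterd.service
Conflicts = umount.target

[Mount]
# Assumed LV name - use your actual device path
What=/dev/mapper/gluster_vg_md0-gluster_lv_data
Where=/gluster_bricks/data
Type=xfs
Options=inode64,noatime,nodiratime

[Install]
WantedBy=glusterd.service
```

Note that systemd requires the unit file name to mirror the mount path (slashes become dashes), so /gluster_bricks/data must be gluster_bricks-data.mount.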
[root@ovirt1 ~]# systemctl cat glusterd.service
# /etc/systemd/system/glusterd.service
[Unit]
Description=GlusterFS, a clustered file-system server
Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount
After=network.target rpcbind.service gluster_bricks-engine.mount
Before=network-online.target
[Service]
Type=forking
PIDFile=/var/run/glusterd.pid
LimitNOFILE=65536
Environment="LOG_LEVEL=INFO"
EnvironmentFile=-/etc/sysconfig/glusterd
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL
KillMode=process
SuccessExitStatus=15
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/glusterd.service.d/99-cpu.conf
[Service]
CPUAccounting=yes
Slice=glusterfs.slice
Note: some of the 'After=' and 'Requires=' entries were lost during
copy-pasting.
So I guess two (or three) questions. What is the expected operation after
a full cluster reboot (ie: in the event of a power failure)? Why doesn't
the engine start automatically, and what might be causing glusterd to fail,
when it can be restarted manually and works fine?
Expected - everything to be up and running.
Root cause: the mounts generated from fstab come up after the cluster tries to
start the bricks - so the bricks fail, and then everything down the chain
fails.
Just use systemd mount units (I have added automount also) and you won't have
such issues.
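For reference, the automount mentioned above is a separate .automount unit alongside the mount unit - a minimal sketch, assuming the engine brick layout shown earlier (the unit name must again mirror the mount path):

```ini
# /etc/systemd/system/gluster_bricks-engine.automount  (sketch)
[Unit]
Description=Automount glusterfs brick - ENGINE
Before = glusterd.service

[Automount]
# Must match the Where= of gluster_bricks-engine.mount
Where=/gluster_bricks/engine

[Install]
WantedBy=multi-user.target
```

After creating or changing units, run `systemctl daemon-reload` and then `systemctl enable --now gluster_bricks-engine.automount`.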
Best Regards,
Strahil Nikolov