On Mon, Feb 4, 2019 at 4:15 PM feral <blistovmhz@gmail.com> wrote:
I think I found the answer to glusterd not starting.
https://bugzilla.redhat.com/show_bug.cgi?id=1472267

Apparently the version of gluster (3.12.15) that comes packaged with ovirt-node 4.2.8 has a known issue where gluster tries to come up before networking, fails, and crashes. This was fixed in gluster 3.13.0 (apparently). Do devs peruse this list?

Yes :)
 
Any chance someone who can update the gluster package might read this?

+Sahina might be able to help
The developers list is https://lists.ovirt.org/archives/list/devel@ovirt.org/ 


On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi <stirabos@redhat.com> wrote:


On Sat, Feb 2, 2019 at 7:32 PM feral <blistovmhz@gmail.com> wrote:
How is an oVirt hyperconverged cluster supposed to come back to life after a power outage to all 3 nodes?

Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso) to get things going, but I've run into multiple issues.

1. During the gluster setup, the volume sizes I specify are not reflected in the deployment configuration. The auto-populated values are used every time. I manually hacked on the config to get the volume sizes correct. I also noticed that if I create the deployment config with "sdb" by accident, then click back and change it to "vdb", the change is again not reflected in the config.
My deployment config does seem to work. All volumes are created (though the xfs options used don't make sense as you end up with stripe sizes that aren't a multiple of the block size).
Once gluster is deployed, I deploy the hosted engine, and everything works.

2. Reboot all nodes. I was testing for power outage response. All nodes come up, but glusterd is not running (seems to have failed for some reason). I can manually restart glusterd on all nodes and it comes up and starts communicating normally. However, the engine does not come online. So I figured out where it last lived and tried to start it manually through the web interface. This fails because vdsm-ovirtmgmt is not up. I figured out that the correct way to start the engine is through the CLI via hosted-engine --vm-start.

This is not required at all.
Are you sure that your cluster is not set in global maintenance mode?
Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and broker.log from your hosts?
 
This does work, but it takes a very long time, and it usually starts up on any node other than the one I told it to start on.

So I guess two (or three) questions. What is the expected operation after a full cluster reboot (i.e., in the event of a power failure)? Why doesn't the engine start automatically, and what might be causing glusterd to fail when it can be restarted manually and works fine?
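
For the XFS remark in item 1 above, here is a minimal sketch in Python of the alignment check. The function name and the example sizes are hypothetical and are not taken from the deployment config. It only tests whether a stripe unit is a whole multiple of the filesystem block size.

```python
def stripe_is_block_aligned(stripe_unit_bytes: int, block_size_bytes: int) -> bool:
    """Return True when the stripe unit is a whole multiple of the block size."""
    if stripe_unit_bytes <= 0 or block_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    return stripe_unit_bytes % block_size_bytes == 0


# Hypothetical example values, not taken from the thread.
print(stripe_is_block_aligned(256 * 1024, 4096))  # 256 KiB stripe unit, 4 KiB blocks
print(stripe_is_block_aligned(10_000, 4096))      # 10000 is not a multiple of 4096
```

Running the generated stripe and block sizes through a check like this before deploying would flag the mismatch described above.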

--
_____
Fact:
1. Ninjas are mammals.
2. Ninjas fight ALL the time.
3. The purpose of the ninja is to flip out and kill people.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/




--

GREG SHEREMETA

SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX

Red Hat NA

gshereme@redhat.com    IRC: gshereme