Hi Carl,
I'd recommend avoiding DNS & DHCP unless your oVirt infra consists of
hundreds of servers.
It is far more reliable to use static IPs + /etc/hosts.
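For example, something like this on every host and on the engine VM
(names/IPs below are placeholders - use your own):

10.16.0.11  ovhost1.example.com  ovhost1
10.16.0.12  ovhost2.example.com  ovhost2
10.16.0.21  engine.example.com   engine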
Since you can 'ssh' to the engine, check the logs - there should be a clue why it
failed.
Most probably it's related to the DNS/IP used.
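A few places to look (the usual locations - adjust if your setup differs):
- on the engine VM: /var/log/ovirt-engine/engine.log and 'journalctl -u ovirt-engine'
- on the hosts: /var/log/ovirt-hosted-engine-ha/agent.log and broker.log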
I think the devs can tell their opinion on Monday.
Best Regards,
Strahil Nikolov

On Jul 13, 2019 15:08, carl langlois <crl.langlois(a)gmail.com> wrote:
Hi
Thanks for the info. There has been some progress with the situation. To make the
story as short as possible, we are in the process of changing our IP address range
from 10.8.X.X to 10.16.X.X for all of the oVirt infra. This implies a new DHCP server,
new switches, etc. For now we went back to our old IP address range because we were
not able to stabilize the system.
So the last status using our new range of addresses was that gluster was all fine and
the hosted-engine storage domain was mounting okay. I suspect the DNS table was not
properly updated.. but I am not 100% sure. With the new range of addresses everything
seems to be fine except that the hosted-engine always fails the "liveliness check"
after coming up. I was not able to solve this, so I went back to our previous DHCP
server.
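For reference, the checks I know of for that state (the health URL is just my guess at
what the liveliness check polls):
  hosted-engine --vm-status
  curl http://<engine-fqdn>/ovirt-engine/services/health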
So I am not sure what is missing for the hosted-engine to use the new DHCP server. Is
there any hardcoded config in the hosted-engine that needs to be updated when changing
DHCP server (i.e. new address with the same hostname, new gateway..)?
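For instance, is the gateway= entry in /etc/ovirt-hosted-engine/hosted-engine.conf on
the hosts the kind of thing that has to be fixed by hand? (just guessing at which files
matter here)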
More info on the test I did with the new DHCP server --> all nodes have name
resolution working, and I am able to ssh to the hosted-engine.
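(name resolution checked with something like 'getent hosts <engine-fqdn>' on each node)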
Any suggestions will be appreciated as I am out of ideas for now. Do I need to redo
some sort of setup in the engine to take into account the new address range/new
gateway? There is also LDAP server access configured in the engine for username
mapping..
Carl
On Sat, Jul 13, 2019 at 6:31 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
>
> Can you mount the volume manually at another location?
> Also, have you done any changes to Gluster?
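> For the manual mount, something like this should do (host/volume names taken from
> your log, adjust as needed):
>   mount -t glusterfs ovhost1:/engine /mnt/engine-test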
>
> Please provide "gluster volume info engine". I have noticed the following
> in your logs: option 'parallel-readdir' is not recognized
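> If that option turns out to be set on the volume, something along these lines should
> show and clear it (double-check against your 'volume info' output first):
>   gluster volume get engine parallel-readdir
>   gluster volume reset engine parallel-readdir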
>
> Best Regards,
> Strahil Nikolov
>
> On Friday, July 12, 2019, 22:30:41 GMT+3, carl langlois
> <crl.langlois(a)gmail.com> wrote:
>
>
> Hi ,
>
> I am in a state where my system does not recover from a major failure. I have
> pinpointed the problem: the hosted engine storage domain is not able to mount.
>
> I have a glusterfs volume containing the storage domain, but when it attempts to
> mount it at /rhev/data-center/mnt/glusterSD/ovhost1:_engine I get
>
> +------------------------------------------------------------------------------+
> [2019-07-12 19:19:44.063608] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-engine-client-2:
changing port to 49153 (from 0)
> [2019-07-12 19:19:55.033725] I [fuse-bridge.c:4205:fuse_init] 0-glusterfs-fuse: FUSE
inited with protocol versions: glusterfs 7.24 kernel 7.22
> [2019-07-12 19:19:55.033748] I [fuse-bridge.c:4835:fuse_graph_sync] 0-fuse: switched
to graph 0
> [2019-07-12 19:19:55.033895] I [MSGID: 108006] [afr-common.c:537