[ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)

Simone Tiraboschi stirabos at redhat.com
Tue Mar 10 04:58:14 EDT 2015



----- Original Message -----
> From: "Bob Doolittle" <bob at doolittle.us.com>
> To: "Simone Tiraboschi" <stirabos at redhat.com>
> Cc: "users-ovirt" <users at ovirt.org>
> Sent: Monday, March 9, 2015 11:48:03 PM
> Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed
> state)
> 
> 
> On 03/09/2015 02:47 PM, Bob Doolittle wrote:
> > Resending with CC to list (and an update).
> >
> > On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
> >> ----- Original Message -----
> >>> From: "Bob Doolittle" <bob at doolittle.us.com>
> >>> To: "Simone Tiraboschi" <stirabos at redhat.com>
> >>> Cc: "users-ovirt" <users at ovirt.org>
> >>> Sent: Monday, March 9, 2015 6:26:30 PM
> >>> Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
> >>> F20 (Cannot add the host to cluster ... SSH
> >>> has failed)
> >>>
> >>>
> >>> On 03/09/2015 12:53 PM, Simone Tiraboschi wrote:
> >>>> ----- Original Message -----
> >>>>> From: "Bob Doolittle" <bob at doolittle.us.com>
> >>>>> To: "Simone Tiraboschi" <stirabos at redhat.com>
> >>>>> Cc: "users-ovirt" <users at ovirt.org>
> >>>>> Sent: Monday, March 9, 2015 12:48:37 PM
> >>>>> Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
> >>>>> on
> >>>>> F20 (Cannot add the host to cluster ... SSH
> >>>>> has failed)
> >>>>>
> >>>>>
> >>>>> On 03/09/2015 07:12 AM, Simone Tiraboschi wrote:
> >>>>>> ----- Original Message -----
> >>>>>>> From: "Bob Doolittle" <bob at doolittle.us.com>
> >>>>>>> To: "Simone Tiraboschi" <stirabos at redhat.com>
> >>>>>>> Sent: Monday, March 9, 2015 12:02:49 PM
> >>>>>>> Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1
> >>>>>>> on
> >>>>>>> F20 (Cannot add the host to cluster ... SSH
> >>>>>>> has failed)
> >>>>>>>
> >>>>>>> On Mar 9, 2015 5:23 AM, "Simone Tiraboschi" <stirabos at redhat.com>
> >>>>>>> wrote:
> >>>>>>>> ----- Original Message -----
> >>>>>>>>> From: "Bob Doolittle" <bob at doolittle.us.com>
> >>>>>>>>> To: "users-ovirt" <users at ovirt.org>
> >>>>>>>>> Sent: Friday, March 6, 2015 9:21:20 PM
> >>>>>>>>> Subject: [ovirt-users] Error during hosted-engine-setup for 3.5.1
> >>>>>>>>> on F20 (Cannot add the host to cluster ... SSH has failed)
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I'm following the instructions here:
> >>>>>>>>> http://www.ovirt.org/Hosted_Engine_Howto
> >>>>>>>>> My self-hosted install failed near the end:
> >>>>>>>>>
> >>>>>>>>> To continue make a selection from the options below:
> >>>>>>>>>           (1) Continue setup - engine installation is complete
> >>>>>>>>>           (2) Power off and restart the VM
> >>>>>>>>>           (3) Abort setup
> >>>>>>>>>           (4) Destroy VM and abort setup
> >>>>>>>>>
> >>>>>>>>>           (1, 2, 3, 4)[1]: 1
> >>>>>>>>> [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
> >>>>>>>>>           Enter the name of the cluster to which you want to add the host
> >>>>>>>>>           (Default) [Default]:
> >>>>>>>>> [ ERROR ] Cannot automatically add the host to cluster Default: Cannot add
> >>>>>>>>> Host. Connecting to host via SSH has failed, verify that the host is
> >>>>>>>>> reachable (IP address, routable address etc.) You may refer to the
> >>>>>>>>> engine.log file for further details.
> >>>>>>>>> [ ERROR ] Failed to execute stage 'Closing up': Cannot add the host to
> >>>>>>>>> cluster Default
> >>>>>>>>> [ INFO  ] Stage: Clean up
> >>>>>>>>> [ INFO  ] Generating answer file
> >>>>>>>>> '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150306135624.conf'
> >>>>>>>>> [ INFO  ] Stage: Pre-termination
> >>>>>>>>> [ INFO  ] Stage: Termination
> >>>>>>>>>
> >>>>>>>>> I can ssh into the engine VM both locally and remotely. There is no
> >>>>>>>>> /root/.ssh directory, however. Did I need to set that up somehow?
> >>>>>>>> It's the engine that needs to open an SSH connection to the host,
> >>>>>>>> calling it by its hostname.
> >>>>>>>> So please be sure that you can SSH to the host from the engine using
> >>>>>>>> its hostname and not its IP address.
> >>>>>>>
> >>>>>>> I'm assuming this should be a password-less login (key-based
> >>>>>>> authentication?).
> >>>>>> Yes, it is.
> >>>>>>
> >>>>>>> As what user?
> >>>>>> root
> >>>>> OK, I see a couple of problems.
> >>>>> First off, I didn't have my deploying-host hostname in the hosts map for
> >>>>> my engine.
> >>>> This is enough by itself to make the deploy procedure fail. If possible,
> >>>> we recommend relying on a DNS infrastructure, especially if you are
> >>>> deploying more than one host.
> >>> OK, I've started over. Simply removing the storage domain was
> >>> insufficient,
> >>> the hosted-engine deploy failed when it found the HA and Broker services
> >>> already configured. I decided to just start over fresh starting with
> >>> re-installing the OS on my host.
> >>>
> >>> I can't deploy DNS at the moment, so I have to simply replicate
> >>> /etc/hosts
> >>> files on my host/engine. I did that this time, but have run into a new
> >>> problem:
> >>>
> >>> [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
> >>>           Enter the name of the cluster to which you want to add the host
> >>>           (Default) [Default]:
> >>> [ INFO  ] Waiting for the host to become operational in the engine. This may
> >>> take several minutes...
> >>> [ ERROR ] The VDSM host was found in a failed state. Please check engine and
> >>> bootstrap installation logs.
> >>> [ ERROR ] Unable to add ovirt-vm to the manager
> >>>           Please shutdown the VM allowing the system to launch it as a
> >>>           monitored service.
> >>>           The system will wait until the VM is down.
> >>> [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused
> >>> [ INFO  ] Stage: Clean up
> >>> [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused
> >>>
> >>>
> >>> I've attached my engine log and the ovirt-hosted-engine-setup log. I
> >>> think I
> >>> had an issue with resolving external hostnames, or else a connectivity
> >>> issue
> >>> during the install.
> >> For some reason your engine wasn't able to deploy your hosts but the SSH
> >> session this time was established.
> >> 2015-03-09 13:05:58,514 ERROR
> >> [org.ovirt.engine.core.bll.InstallVdsInternalCommand]
> >> (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed
> >> for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.:
> >> java.io.IOException: Command returned failure code 1 during SSH session
> >> 'root at xion2.smartcity.net'
> >>
> >> Can you please attach host-deploy logs from the engine VM?
> > OK, attached.
> >
> > Like I said, it looks to me like a name-resolution issue during the yum
> > update on the engine. I think I've fixed that, but do you have a better
> > suggestion for cleaning up and re-deploying other than installing the OS
> > on my host and starting all over again?
> 
> I just finished starting over from scratch, starting with OS installation on
> my host/node, and wound up with a very similar problem - the engine couldn't
> reach the hosts during the yum operation. But this time the error was
> "Network is unreachable". Which is weird, because I can ssh into the engine
> and ping many of those hosts, after the operation has failed.
> 
> Here's my latest host-deploy log from the engine. I'd appreciate any clues.

It seems that your host is now able to resolve those addresses, but it's not able to connect to them over HTTP.
On your host, some of them resolve to IPv6 addresses; can you please try to use curl to fetch one of the files that it wasn't able to get?
Can you please check your network configuration before and after host-deploy?
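
If it helps to narrow this down alongside curl, here is a minimal Python sketch (not part of oVirt or host-deploy; the function names are made up for illustration) that lists which addresses a mirror hostname resolves to, grouped by IPv4 and IPv6, and then tries a plain TCP connection on port 80 to each one. The assumption behind it is that "Network is unreachable" may come from trying an IPv6 address on a machine without an IPv6 route, so seeing which family fails could confirm or rule that out.

```python
import socket
import sys
from collections import defaultdict


def group_by_family(addrinfo):
    """Group socket.getaddrinfo() results into {'IPv4': [...], 'IPv6': [...]}."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    grouped = defaultdict(list)
    for family, _type, _proto, _canon, sockaddr in addrinfo:
        label = names.get(family)
        # Skip unknown families and duplicate addresses.
        if label and sockaddr[0] not in grouped[label]:
            grouped[label].append(sockaddr[0])
    return dict(grouped)


def try_connect(host, port=80, timeout=5.0):
    """Try a TCP connection to every resolved address.

    Returns {address: None on success, or the error text on failure}.
    """
    results = {}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canon, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results[sockaddr[0]] = None
        except OSError as exc:
            results[sockaddr[0]] = str(exc)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_mirror.py <hostname-from-the-failed-yum-run>
    target = sys.argv[1]
    print(group_by_family(socket.getaddrinfo(target, 80, type=socket.SOCK_STREAM)))
    for address, error in try_connect(target).items():
        print(address, "OK" if error is None else "FAILED: " + error)
```

Running it once before and once after host-deploy, against one of the hostnames from the failed yum run, would show whether only the IPv6 addresses fail and whether the result changes after the network is reconfigured.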


> Thanks,
>    Bob
> 
> 
> 

