
On 03/10/2015 10:20 AM, Simone Tiraboschi wrote:
----- Original Message -----
From: "Bob Doolittle" <bob@doolittle.us.com> To: "Simone Tiraboschi" <stirabos@redhat.com> Cc: "users-ovirt" <users@ovirt.org> Sent: Tuesday, March 10, 2015 2:40:13 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)
On 03/10/2015 04:58 AM, Simone Tiraboschi wrote:
----- Original Message -----
From: "Bob Doolittle" <bob@doolittle.us.com> To: "Simone Tiraboschi" <stirabos@redhat.com> Cc: "users-ovirt" <users@ovirt.org> Sent: Monday, March 9, 2015 11:48:03 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)
On 03/09/2015 02:47 PM, Bob Doolittle wrote:
Resending with CC to list (and an update).
On 03/09/2015 01:40 PM, Simone Tiraboschi wrote:
----- Original Message -----
> From: "Bob Doolittle" <bob@doolittle.us.com>
> To: "Simone Tiraboschi" <stirabos@redhat.com>
> Cc: "users-ovirt" <users@ovirt.org>
> Sent: Monday, March 9, 2015 6:26:30 PM
> Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)
> ...
> OK, I've started over. Simply removing the storage domain was insufficient,
> the hosted-engine deploy failed when it found the HA and Broker services
> already configured. I decided to just start over fresh starting with
> re-installing the OS on my host.
>
> I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts
> files on my host/engine. I did that this time, but have run into a new
> problem:
>
> [ INFO ] Engine replied: DB Up!Welcome to Health Status!
>          Enter the name of the cluster to which you want to add the host (Default) [Default]:
> [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes...
> [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
> [ ERROR ] Unable to add ovirt-vm to the manager
>          Please shutdown the VM allowing the system to launch it as a monitored service.
>          The system will wait until the VM is down.
> [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused
> [ INFO ] Stage: Clean up
> [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused
>
> I've attached my engine log and the ovirt-hosted-engine-setup log. I think I
> had an issue with resolving external hostnames, or else a connectivity issue
> during the install.

For some reason your engine wasn't able to deploy your hosts but the SSH session this time was established.

2015-03-09 13:05:58,514 ERROR [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.: java.io.IOException: Command returned failure code 1 during SSH session 'root@xion2.smartcity.net'
Can you please attach host-deploy logs from the engine VM?

OK, attached.
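(For reference, on the engine VM these logs live under /var/log/ovirt-engine/host-deploy/, at least on this release, so something like

  ls -lt /var/log/ovirt-engine/host-deploy/ | head

picks out the log from the most recent deploy attempt.)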
Like I said, it looks to me like a name-resolution issue during the yum update on the engine. I think I've fixed that, but do you have a better suggestion for cleaning up and re-deploying other than installing the OS on my host and starting all over again?

I just finished starting over from scratch, starting with OS installation on my host/node, and wound up with a very similar problem - the engine couldn't reach the hosts during the yum operation. But this time the error was "Network is unreachable". Which is weird, because I can ssh into the engine and ping many of those hosts, after the operation has failed.
Here's my latest host-deploy log from the engine. I'd appreciate any clues.

It seems that now your host is able to resolve those addresses but it's not able to connect over http. Some of the hosts resolve as IPv6 addresses; can you please try to use curl to fetch one of the files it wasn't able to get? Can you please check your network configuration before and after host-deploy?

I can give you the network configuration after host-deploy, at least for the host/Node. The engine won't start for me this morning, after I shut down the host for the night.
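For the curl check, something along these lines should show whether DNS and plain HTTP both work from the machine doing the fetch (a sketch, reusing one of the oVirt mirror URLs from the failed yum run quoted later in this message):

  getent hosts resources.ovirt.org
  curl -v -o /tmp/sos-test.rpm http://resources.ovirt.org/pub/ovirt-3.5/rpm/fc20/noarch/sos-3.2-0.2.fc20.ovirt.noarch.rpm
  curl -4 -v -o /tmp/sos-test.rpm http://resources.ovirt.org/pub/ovirt-3.5/rpm/fc20/noarch/sos-3.2-0.2.fc20.ovirt.noarch.rpm

The second curl forces IPv4, which would tell us whether a name resolving only to an unreachable IPv6 address is the problem.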
In order to give you the config before host-deploy (or, apparently for the engine), I'll have to re-install the OS on the host and start again from scratch. Obviously I'd rather not do that unless absolutely necessary.
Here's the host config after the failed host-deploy:
Host/Node:
# ip route
169.254.0.0/16 dev ovirtmgmt scope link metric 1007
172.16.0.0/16 dev ovirtmgmt proto kernel scope link src 172.16.0.58

You are missing a default gateway, hence the issue. Are you sure that it was properly configured before trying to deploy that host?
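(As a minimal sketch, assuming 172.16.0.1 as the gateway since that is what the working config later in this message shows, the missing route could be put back by hand and made persistent like this:

  ip route add default via 172.16.0.1 dev ovirtmgmt
  ip route show
  echo "GATEWAY=172.16.0.1" >> /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt

The first command restores the default route for the running system; the GATEWAY entry keeps it across restarts of the ovirtmgmt interface.)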
It should have been, it was a fresh OS install. So I'm starting again, and keeping careful records of my network config.

Here is my initial network config of my host/node, immediately following a new OS install:

% ip route
default via 172.16.0.1 dev p3p1 proto static metric 1024
172.16.0.0/16 dev p3p1 proto kernel scope link src 172.16.0.58

% ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p3p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.58/16 brd 172.16.255.255 scope global p3p1
       valid_lft forever preferred_lft forever
    inet6 fe80::baca:3aff:fe79:2212/64 scope link
       valid_lft forever preferred_lft forever
3: wlp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 1c:3e:84:50:8d:c3 brd ff:ff:ff:ff:ff:ff

After the VM is first created, the host/node config is:

# ip route
default via 172.16.0.1 dev ovirtmgmt
169.254.0.0/16 dev ovirtmgmt scope link metric 1006
172.16.0.0/16 dev ovirtmgmt proto kernel scope link src 172.16.0.58

# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p3p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovirtmgmt state UP group default qlen 1000
    link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::baca:3aff:fe79:2212/64 scope link
       valid_lft forever preferred_lft forever
3: wlp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 1c:3e:84:50:8d:c3 brd ff:ff:ff:ff:ff:ff
4: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 92:cb:9d:97:18:36 brd ff:ff:ff:ff:ff:ff
5: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 9a:bc:29:52:82:38 brd ff:ff:ff:ff:ff:ff
6: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.58/16 brd 172.16.255.255 scope global ovirtmgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::baca:3aff:fe79:2212/64 scope link
       valid_lft forever preferred_lft forever
7: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovirtmgmt state UNKNOWN group default qlen 500
    link/ether fe:16:3e:16:a4:37 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe16:a437/64 scope link
       valid_lft forever preferred_lft forever

At this point, I was already seeing a problem on the host/node. I remembered that a newer version of the sos package is delivered from the ovirt repositories.
So I tried to do a "yum update" on my host, and got a similar problem: % sudo yum update [sudo] password for rad: Loaded plugins: langpacks, refresh-packagekit Resolving Dependencies --> Running transaction check ---> Package sos.noarch 0:3.1-1.fc20 will be updated ---> Package sos.noarch 0:3.2-0.2.fc20.ovirt will be an update --> Finished Dependency Resolution Dependencies Resolved ================================================================================================================ Package Arch Version Repository Size ================================================================================================================ Updating: sos noarch 3.2-0.2.fc20.ovirt ovirt-3.5 292 k Transaction Summary ================================================================================================================ Upgrade 1 Package Total download size: 292 k Is this ok [y/d/N]: y Downloading packages: No Presto metadata available for ovirt-3.5 sos-3.2-0.2.fc20.ovirt.noarch. FAILED http://www.gtlib.gatech.edu/pub/oVirt/pub/ovirt-3.5/rpm/fc20/noarch/sos-3.2-...: [Errno 14] curl#6 - "Could not resolve host: www.gtlib.gatech.edu" Trying other mirror. sos-3.2-0.2.fc20.ovirt.noarch. FAILED ftp://ftp.gtlib.gatech.edu/pub/oVirt/pub/ovirt-3.5/rpm/fc20/noarch/sos-3.2-0.2.fc20.ovirt.noarch.rpm: [Errno 14] curl#6 - "Could not resolve host: ftp.gtlib.gatech.edu" Trying other mirror. sos-3.2-0.2.fc20.ovirt.noarch. FAILED http://resources.ovirt.org/pub/ovirt-3.5/rpm/fc20/noarch/sos-3.2-0.2.fc20.ov...: [Errno 14] curl#6 - "Could not resolve host: resources.ovirt.org" Trying other mirror. sos-3.2-0.2.fc20.ovirt.noarch. FAILED http://ftp.snt.utwente.nl/pub/software/ovirt/ovirt-3.5/rpm/fc20/noarch/sos-3...: [Errno 14] curl#6 - "Could not resolve host: ftp.snt.utwente.nl" Trying other mirror. sos-3.2-0.2.fc20.ovirt.noarch. FAILED http://ftp.nluug.nl/os/Linux/virtual/ovirt/ovirt-3.5/rpm/fc20/noarch/sos-3.2...: [Errno 14] curl#6 - "Could not resolve host: ftp.nluug.nl" Trying other mirror. sos-3.2-0.2.fc20.ovirt.noarch. FAILED http://mirror.linux.duke.edu/ovirt/pub/ovirt-3.5/rpm/fc20/noarch/sos-3.2-0.2...: [Errno 14] curl#6 - "Could not resolve host: mirror.linux.duke.edu" Trying other mirror. Error downloading packages: sos-3.2-0.2.fc20.ovirt.noarch: [Errno 256] No more mirrors to try. This was similar to my previous failures. I took a look, and the problem was that /etc/resolv.conf had no nameservers, and the /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt file contained no entries for DNS1 or DOMAIN. So, it appears that when hosted-engine set up my bridged network, it neglected to carry over the DNS configuration necessary to the bridge. Note that I am using *static* network configuration, rather than DHCP. During installation of the OS I am setting up the network configuration as Manual. Perhaps the hosted-engine script is not properly prepared to deal with that? I went ahead and modified the ifcfg-ovirtmgmt network script (for the next service restart/boot) and resolv.conf (I was afraid to restart the network in the middle of hosted-engine execution since I don't know what might already be connected to the engine). This time it got further, but ultimately it still failed at the very end: [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes... [ INFO ] Still waiting for VDSM host to become operational... [ INFO ] The VDSM Host is now operational Please shutdown the VM allowing the system to launch it as a monitored service. 
I went ahead and modified the ifcfg-ovirtmgmt network script (for the next service restart/boot) and resolv.conf (I was afraid to restart the network in the middle of hosted-engine execution since I don't know what might already be connected to the engine). This time it got further, but ultimately it still failed at the very end:

[ INFO ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO ] Still waiting for VDSM host to become operational...
[ INFO ] The VDSM Host is now operational
         Please shutdown the VM allowing the system to launch it as a monitored service.
         The system will wait until the VM is down.
[ ERROR ] Failed to execute stage 'Closing up': Error acquiring VM status
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150310140028.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination

At that point, neither the ovirt-ha-broker nor the ovirt-ha-agent service was running.

Note there was no significant pause after it said "The system will wait until the VM is down".

After the script completed, I shut down the VM, and manually started the ha services, and the VM came up. I could log in to the Administration Portal, and finally see my HostedEngine VM. :-)

I seem to be in a bad state however: the Data Center has *no* storage domains attached. I'm not sure what else might need cleaning up.

Any assistance appreciated.

-Bob
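P.S. By "manually started the ha services" above I mean roughly this (a sketch using the service names mentioned earlier; exact unit names may differ on your install):

  systemctl start ovirt-ha-broker
  systemctl start ovirt-ha-agent
  systemctl status ovirt-ha-broker ovirt-ha-agent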
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p3p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovirtmgmt state UP group default qlen 1000
    link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::baca:3aff:fe79:2212/64 scope link
       valid_lft forever preferred_lft forever
3: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 56:56:f7:cf:73:27 brd ff:ff:ff:ff:ff:ff
4: wlp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 1c:3e:84:50:8d:c3 brd ff:ff:ff:ff:ff:ff
6: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 22:a1:01:9e:30:71 brd ff:ff:ff:ff:ff:ff
7: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.58/16 brd 172.16.255.255 scope global ovirtmgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::baca:3aff:fe79:2212/64 scope link
       valid_lft forever preferred_lft forever
The only unusual thing about my setup that I can think of, from the network perspective, is that my physical host has a wireless interface, which I've not configured. Could it be confusing hosted-engine --deploy?
-Bob