On Wed, Oct 21, 2015 at 5:44 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
On Wed, Oct 21, 2015 at 4:58 PM, Simone Tiraboschi
<stirabos(a)redhat.com>
wrote:
>
>> Ok, I will crosscheck the output...
>>
>
> The question is:
> Add lines for the appliance itself and for this host to /etc/hosts on
> the engine VM?
> Note: ensuring that this host could resolve the engine VM hostname is
> still up to you
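> For example, a single line in the host's /etc/hosts is enough (the address
> and name below are invented for illustration):
>
>   192.168.1.115   ovirtengine.localdomain.local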
>
You are right. I misunderstood the message's meaning.
>
>
>> I ran it through screen and I'm not able to scroll through its output...
>>
>
> 1. Hit your screen prefix combination (C-a / Control + A by default),
> then hit Escape.
> 2. Move up/down with the arrow keys to scroll the output buffer.
> 3. When you're done, hit Return twice to get back to the end of the
> scroll buffer.
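> Note that by default screen keeps only a small scrollback buffer; you can
> raise it if needed (10000 lines is just an example size):
>
>   screen -h 10000          # per session, from the command line
>   defscrollback 10000      # or permanently, in ~/.screenrc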
>
Ok, thank you very much! I had never used screen before and didn't know about
this useful information.
>
>
>>> Can I re-run using the answer file? In this case, do I have to
>>> pre-insert the self-hosted engine hostname inside /etc/hosts on the
>>> hypervisor?
>>>
>>> Yes.
>>>
>>>
>>>
>> What is the supposed command for using answer file?
>>
>
> hosted-engine --deploy
> --config-append=/var/lib/ovirt-hosted-engine-setup/answers/answers-20151021154529.conf
>
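> For reference, the answer file is a plain otopi-style key=value file along
> these lines (the keys shown here are only illustrative; check the generated
> file for the real ones):
>
>   [environment:default]
>   OVEHOSTED_NETWORK/fqdn=str:ovirtengine.localdomain.local
>   OVEHOSTED_VM/vmMemSizeMB=int:4096
>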
OK. It went successfully from the engine's point of view.
One note: I chose SPICE as the protocol for the self-hosted engine, and while
I was able to connect to it as the root user, I wasn't able to as a normal
user. I got:
[g.cecchi@ovc71 ~]$ remote-viewer \
    --spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem \
    spice://localhost?tls-port=5900 \
    --spice-host-subject="C=EN, L=Test, O=Test, CN=Test"
** (remote-viewer:1633): WARNING **: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-1N7zvOm2Zf: Connection refused
GLib-GIO-Message: Using the 'memory' GSettings backend. Your settings will not be saved or shared with other applications.
(remote-viewer:1633): GSpice-WARNING **: loading ca certs from /etc/pki/vdsm/libvirt-spice/ca-cert.pem failed
(/usr/bin/remote-viewer:1633): Spice-Warning **: ssl_verify.c:428:openssl_verify: Error in certificate chain verification: self signed certificate in certificate chain (num=19:depth1:/C=EN/L=Test/O=Test/CN=TestCA)
(remote-viewer:1633): GSpice-WARNING **: main-1:0: SSL_connect: error:00000001:lib(0):func(0):reason(1)
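The GSpice warning about failing to load the CA certs makes me suspect my
normal user simply cannot read that file; a quick check, sketched here (the
copy to /tmp is just a workaround idea, not a recommended fix):

ls -l /etc/pki/vdsm/libvirt-spice/ca-cert.pem
sudo install -m 0644 /etc/pki/vdsm/libvirt-spice/ca-cert.pem /tmp/ca-cert.pem
remote-viewer --spice-ca-file=/tmp/ca-cert.pem \
    "spice://localhost?tls-port=5900" \
    --spice-host-subject="C=EN, L=Test, O=Test, CN=Test"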
It seems that the hypervisor phase stalled at some point and then failed:
[ INFO ] Engine is still not reachable, waiting...
[ INFO ] Engine replied: DB Up!Welcome to Health Status!
[ INFO ] Connecting to the Engine
[ INFO ] Waiting for the host to become operational in the engine. This
may take several minutes...
[ INFO ] Still waiting for VDSM host to become operational...
[ INFO ] Still waiting for VDSM host to become operational...
..
[ INFO ] Still waiting for VDSM host to become operational...
[ ERROR ] Timed out while waiting for host to start. Please check the logs.
[ ERROR ] Unable to add hosted_engine_1 to the manager
[ INFO ] Saving hosted-engine configuration on the shared storage domain
[ INFO ] Shutting down the engine VM
[ INFO ] Enabling and starting HA services
Hosted Engine successfully set up
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file
'/var/lib/ovirt-hosted-engine-setup/answers/answers-20151021173522.conf'
[ INFO ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
During this waiting phase, on the hypervisor I got:
[root@ovc71 ~]# systemctl status vdsmd
vdsmd.service - Virtual Desktop Server Manager
Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
Active: active (running) since Wed 2015-10-21 17:10:55 CEST; 22min ago
Main PID: 506 (vdsm)
CGroup: /system.slice/vdsmd.service
├─319 /usr/libexec/ioprocess --read-pipe-fd 29 --write-pipe-fd
23 --max-threads 10 --m...
├─506 /usr/bin/python /usr/share/vdsm/vdsm
└─943 /usr/libexec/ioprocess --read-pipe-fd 57 --write-pipe-fd
56 --max-threads 10 --m...
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 client step 1
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 ask_user_info()
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 client step 1
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 ask_user_info()
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 make_client_response()
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 client step 2
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 parse_server_challenge()
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 ask_user_info()
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 make_client_response()
Oct 21 17:10:56 ovc71.localdomain.local python[506]: DIGEST-MD5 client step 3
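So vdsm itself kept answering locally; querying it directly is a quick sanity
check (vdsClient is the command-line tool shipped with this generation of
vdsm):

vdsClient -s 0 getVdsCaps | head    # -s: use SSL, 0 = localhost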
Inside vdsm.log under /var/log/vdsm/ I saw many loops of this type:
Thread-160::DEBUG::2015-10-21 17:31:23,084::fileSD::173::Storage.Misc.excCmd::(getReadDelay) /usr/bin/dd if=/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN/f538db13-330f-4a73-bf6d-1f7e3af12370/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-160::DEBUG::2015-10-21 17:31:23,091::fileSD::173::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n311 bytes (311 B) copied, 0.000233708 s, 1.3 MB/s\n'; <rc> = 0
Reactor thread::INFO::2015-10-21 17:31:30,028::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:40382
Reactor thread::DEBUG::2015-10-21 17:31:30,034::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2015-10-21 17:31:30,035::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:40382
Reactor thread::DEBUG::2015-10-21 17:31:30,035::bindingxmlrpc::1297::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 40382)
BindingXMLRPC::INFO::2015-10-21 17:31:30,036::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:40382
Thread-284::INFO::2015-10-21 17:31:30,037::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:40382 started
Thread-284::INFO::2015-10-21 17:31:30,043::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:40382 stopped
Thread-160::DEBUG::2015-10-21 17:31:33,095::fileSD::173::Storage.Misc.excCmd::(getReadDelay) /usr/bin/dd if=/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN/f538db13-330f-4a73-bf6d-1f7e3af12370/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-160::DEBUG::2015-10-21 17:31:33,104::fileSD::173::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n311 bytes (311 B) copied, 0.00030368 s, 1.0 MB/s\n'; <rc> = 0
Thread-160::DEBUG::2015-10-21 17:31:43,109::fileSD::173::Storage.Misc.excCmd::(getReadDelay) /usr/bin/dd if=/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN/f538db13-330f-4a73-bf6d-1f7e3af12370/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-160::DEBUG::2015-10-21 17:31:43,116::fileSD::173::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n311 bytes (311 B) copied, 0.00109833 s, 283 kB/s\n'; <rc> = 0
Reactor thread::INFO::2015-10-21 17:31:45,067::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from 127.0.0.1:40384
Reactor thread::DEBUG::2015-10-21 17:31:45,072::protocoldetector::82::ProtocolDetector.Detector::(__init__) Using required_size=11
Reactor thread::INFO::2015-10-21 17:31:45,073::protocoldetector::118::ProtocolDetector.Detector::(handle_read) Detected protocol xml from 127.0.0.1:40384
Reactor thread::DEBUG::2015-10-21 17:31:45,073::bindingxmlrpc::1297::XmlDetector::(handle_socket) xml over http detected from ('127.0.0.1', 40384)
BindingXMLRPC::INFO::2015-10-21 17:31:45,074::xmlrpc::73::vds.XMLRPCServer::(handle_request) Starting request handler for 127.0.0.1:40384
Thread-285::INFO::2015-10-21 17:31:45,074::xmlrpc::84::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:40384 started
Thread-285::INFO::2015-10-21 17:31:45,080::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:40384 stopped
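Those getReadDelay entries are just the storage domain monitor timing a direct
read of the domain metadata; the same command from the log can be run by hand
to check the NFS domain's responsiveness:

dd if=/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN/f538db13-330f-4a73-bf6d-1f7e3af12370/dom_md/metadata \
    iflag=direct of=/dev/null bs=4096 count=1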
Full vdsm.log in gzip format here:
https://drive.google.com/file/d/0BwoPbcrMv8mvbjRqQ1ZuTzhsUEU/view?usp=sha...
Exactly what kind of check does the install run to verify the hypervisor is
up?

It polls the engine REST API, checking the host status for up to 10 minutes,
until it becomes 'up' or 'non_operational'. In your case it reached the
10-minute timeout.
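Roughly equivalent to a loop like this (just a sketch; the URL, credentials
and host id are placeholders):

for i in $(seq 1 60); do
    # extract <state> from the host resource returned by the REST API
    state=$(curl -s -k -u 'admin@internal:password' \
        'https://<engine-fqdn>/ovirt-engine/api/hosts/<host-id>' |
        sed -n 's|.*<state>\(.*\)</state>.*|\1|p')
    [ "$state" = up ] || [ "$state" = non_operational ] && break
    sleep 10
done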
Please check engine and host-deploy logs on the engine VM.
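The usual locations, as far as I recall for this release:

less /var/log/ovirt-engine/engine.log              # on the engine VM
less /var/log/ovirt-engine/host-deploy/*.log       # on the engine VM
less /var/log/ovirt-hosted-engine-setup/*.log      # on the host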
Can I simply restart the hypervisor and see what happens? I remember similar
things in all-in-one setups, where the initial setup failed in the vdsm part
but a restart then worked...
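If I do restart it, I suppose the things to check afterwards would be the
services that the setup enabled, e.g.:

systemctl status vdsmd ovirt-ha-agent ovirt-ha-broker
hosted-engine --vm-status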