[ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

Simone Tiraboschi stirabos at redhat.com
Fri Oct 23 16:10:31 UTC 2015


On Fri, Oct 23, 2015 at 5:55 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com>
wrote:

> On Fri, Oct 23, 2015 at 5:05 PM, Simone Tiraboschi <stirabos at redhat.com>
> wrote:
>
>>
>>>
>> OK, can you please try again the whole reboot procedure just to ensure
>> that it was just a temporary NFS glitch?
>>
>
>
> It seems reproducible.
>
> This time I was able to shutdown the hypervisor without manual power off.
> Only strange thing is that I ran
>
> shutdown -h now
>
> and actually the VM at some point (I was able to see that the watchdog
> stopped...) booted.... ?
>
> Related lines in messages:
> Oct 23 17:33:32 ovc71 systemd: Unmounting RPC Pipe File System...
> Oct 23 17:33:32 ovc71 systemd: Stopping Session 11 of user root.
> Oct 23 17:33:33 ovc71 systemd: Stopped Session 11 of user root.
> Oct 23 17:33:33 ovc71 systemd: Stopping user-0.slice.
> Oct 23 17:33:33 ovc71 systemd: Removed slice user-0.slice.
> Oct 23 17:33:33 ovc71 systemd: Stopping vdsm-dhclient.slice.
> Oct 23 17:33:33 ovc71 systemd: Removed slice vdsm-dhclient.slice.
> Oct 23 17:33:33 ovc71 systemd: Stopping vdsm.slice.
> Oct 23 17:33:33 ovc71 systemd: Removed slice vdsm.slice.
> Oct 23 17:33:33 ovc71 systemd: Stopping Sound Card.
> Oct 23 17:33:33 ovc71 systemd: Stopped target Sound Card.
> Oct 23 17:33:33 ovc71 systemd: Stopping LVM2 PV scan on device 8:2...
> Oct 23 17:33:33 ovc71 systemd: Stopping LVM2 PV scan on device 8:16...
> Oct 23 17:33:33 ovc71 systemd: Stopping Dump dmesg to /var/log/dmesg...
> Oct 23 17:33:33 ovc71 systemd: Stopped Dump dmesg to /var/log/dmesg.
> Oct 23 17:33:33 ovc71 systemd: Stopping Watchdog Multiplexing Daemon...
> Oct 23 17:33:33 ovc71 systemd: Stopping Multi-User System.
> Oct 23 17:33:33 ovc71 systemd: Stopped target Multi-User System.
> Oct 23 17:33:33 ovc71 systemd: Stopping ABRT kernel log watcher...
> Oct 23 17:33:33 ovc71 systemd: Stopping Command Scheduler...
> Oct 23 17:33:33 ovc71 rsyslogd: [origin software="rsyslogd"
> swVersion="7.4.7" x-pid="690" x-info="http://www.rsyslog.com"] exiting on
> signal 15.
> Oct 23 17:36:24 ovc71 rsyslogd: [origin software="rsyslogd"
> swVersion="7.4.7" x-pid="697" x-info="http://www.rsyslog.com"] start
> Oct 23 17:36:21 ovc71 journal: Runtime journal is using 8.0M (max 500.0M,
> leaving 750.0M of free 4.8G, current limit 500.0M).
> Oct 23 17:36:21 ovc71 kernel: Initializing cgroup subsys cpuset
>
>
> Coming back up, looking at the oVirt processes I see:
>
> [root@ovc71 ~]# systemctl status ovirt-ha-broker
> ovirt-ha-broker.service - oVirt Hosted Engine High Availability
> Communications Broker
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service;
> enabled)
>    Active: inactive (dead) since Fri 2015-10-23 17:36:25 CEST; 31s ago
>   Process: 849 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-broker stop
> (code=exited, status=0/SUCCESS)
>   Process: 723 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start
> (code=exited, status=0/SUCCESS)
>  Main PID: 844 (code=exited, status=0/SUCCESS)
>    CGroup: /system.slice/ovirt-ha-broker.service
>
> Oct 23 17:36:24 ovc71.localdomain.local systemd-ovirt-ha-broker[723]:
> Starting ovirt-ha-broker: [...
> Oct 23 17:36:24 ovc71.localdomain.local systemd[1]: Started oVirt Hosted
> Engine High Availabili...r.
> Oct 23 17:36:25 ovc71.localdomain.local systemd-ovirt-ha-broker[849]:
> Stopping ovirt-ha-broker: [...
> Hint: Some lines were ellipsized, use -l to show in full.
>
> And
> [root@ovc71 ~]# systemctl status nfs-server
> nfs-server.service - NFS server and services
>    Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled)
>    Active: active (exited) since Fri 2015-10-23 17:36:27 CEST; 1min 9s ago
>   Process: 1123 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited,
> status=0/SUCCESS)
>   Process: 1113 ExecStartPre=/usr/sbin/exportfs -r (code=exited,
> status=0/SUCCESS)
>  Main PID: 1123 (code=exited, status=0/SUCCESS)
>    CGroup: /system.slice/nfs-server.service
>
> Oct 23 17:36:27 ovc71.localdomain.local systemd[1]: Starting NFS server
> and services...
> Oct 23 17:36:27 ovc71.localdomain.local systemd[1]: Started NFS server and
> services.
>
> So it seems that the broker tries to start and fails (17:36:25) before NFS
> server start phase completes (17:36:27)...?
>
> Again, if I then manually start ha-broker and ha-agent, they start OK and
> I'm able to become operational again with the self-hosted engine up.
>
> systemd file for broker is this
>
> [Unit]
> Description=oVirt Hosted Engine High Availability Communications Broker
>
> [Service]
> Type=forking
> EnvironmentFile=-/etc/sysconfig/ovirt-ha-broker
> ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start
> ExecStop=/usr/lib/systemd/systemd-ovirt-ha-broker stop
>
> [Install]
> WantedBy=multi-user.target
>
> Probably inside the [Unit] section I should add
> After=nfs-server.service
>
>
OK, understood.
You are right: the broker was failing because the NFS storage was not ready
yet; it is served in loopback and there is no explicit service dependency
on it.

We do not impose that dependency because an NFS shared domain is generally
expected to be served from an external system, while a loopback NFS is just
a degenerate case.
Simply fix it manually for now.
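
For anyone hitting the same race, a local workaround sketch (assuming a
systemd drop-in; the file name and the Wants= line are my own choice, not
something shipped by oVirt) could look like:

```ini
# /etc/systemd/system/ovirt-ha-broker.service.d/nfs.conf
# Hypothetical drop-in: only relevant when the hosted-engine storage
# domain is a loopback NFS export on the same host.
[Unit]
After=nfs-server.service
Wants=nfs-server.service
```

Then run `systemctl daemon-reload` so systemd picks up the drop-in; unlike
editing the shipped unit file, an override like this survives package
updates.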


> but this should be true only for sh engine configured with NFS.... so to
> be done at install/setup time?
>
> If you want I can set this change for my environment and verify...
>
>
>
>>
>> The issue was here:  --spice-host-subject="C=EN, L=Test, O=Test, CN=Test"
>> That one was just the temporary subject used by hosted-engine-setup
>> during the bootstrap sequence, when your engine was still to come.
>> At the end that cert got replaced by the engine-CA-signed one, so you
>> have to substitute that subject with the one you used during your setup.
>>
>>
>
> Even using the correct certificate I have a problem.
> On hypervisor
>
> [root@ovc71 ~]# openssl x509 -in /etc/pki/vdsm/libvirt-spice/ca-cert.pem
> -text | grep Subject
>         Subject: C=US, O=localdomain.local,
> CN=shengine.localdomain.local.75331
>         Subject Public Key Info:
>             X509v3 Subject Key Identifier:
>
> On engine
> [root@shengine ~]# openssl x509 -in /etc/pki/ovirt-engine/ca.pem -text |
> grep Subject
>         Subject: C=US, O=localdomain.local,
> CN=shengine.localdomain.local.75331
>         Subject Public Key Info:
>             X509v3 Subject Key Identifier:
>
> but
>
> [root@ovc71 ~]# hosted-engine --add-console-password
> Enter password:
> code = 0
> message = 'Done'
>
> [root@ovc71 ~]# remote-viewer
> --spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> spice://localhost?tls-port=5900 --spice-host-subject="C=US,
> O=localdomain.local, CN=shengine.localdomain.local.75331"
>

it should be:
remote-viewer --spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
spice://ovc71.localdomain.local?tls-port=5900 --spice-host-subject="C=US,
O=localdomain.local, CN=ovc71.localdomain.local"
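
To spell out the mismatch: --spice-host-subject has to match the subject of
the certificate presented by the host running the engine VM (ovc71), not
the engine CA subject that both ca.pem files show. A small shell sketch,
using the two subject strings from this thread:

```shell
#!/bin/sh
# Subject strings as they appear in this thread: the engine CA subject
# (what `openssl x509` printed for ca-cert.pem) vs. the subject the
# host's SPICE server certificate is expected to carry.
ca_subject='C=US, O=localdomain.local, CN=shengine.localdomain.local.75331'
host_subject='C=US, O=localdomain.local, CN=ovc71.localdomain.local'

# remote-viewer verifies the certificate the *host* presents on the
# TLS port, so the CN must be the host's FQDN, not the engine's:
echo "CA CN:   ${ca_subject##*CN=}"     # shengine.localdomain.local.75331
echo "Host CN: ${host_subject##*CN=}"   # ovc71.localdomain.local
```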



> ** (remote-viewer:4297): WARNING **: Couldn't connect to accessibility
> bus: Failed to connect to socket /tmp/dbus-Gb5xXSKiKK: Connection refused
> GLib-GIO-Message: Using the 'memory' GSettings backend.  Your settings
> will not be saved or shared with other applications.
> (/usr/bin/remote-viewer:4297): Spice-Warning **:
> ssl_verify.c:492:openssl_verify: ssl: subject 'C=US, O=localdomain.local,
> CN=shengine.localdomain.local.75331' verification failed
> (/usr/bin/remote-viewer:4297): Spice-Warning **:
> ssl_verify.c:494:openssl_verify: ssl: verification failed
>
> (remote-viewer:4297): GSpice-WARNING **: main-1:0: SSL_connect:
> error:00000001:lib(0):func(0):reason(1)
>
>
> and the remote-viewer window with
>
>
>  Unable to connect to the graphic server spice://localhost?tls-port=5900
>