
On Fri, Oct 23, 2015 at 3:57 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Oct 22, 2015 at 5:38 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:
In case I want to set up a single host with a self-hosted engine, could I configure on the hypervisor:
a) one NFS share for the SH engine
b) one NFS share for the ISO domain
c) a local filesystem, used to create a local POSIX-compliant FS storage domain
and work this way as a replacement for all-in-one?
Yes, but c) is just a workaround; using another external NFS share would help a lot if in the future you plan to add or migrate to a new server.
Why do you see this as a workaround, if I plan to have this, for example, as a personal devel infra with no other hypervisors? I would expect better performance going directly to local storage instead of adding the overhead of NFS to the same host...
Just because you are using, as shared storage, something that is not really shared.
Put the host in global maintenance (otherwise the engine VM will be restarted)
Shut down the engine VM
Shut down the host
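The shutdown sequence above can be sketched as a small script. This is a hedged sketch: `hosted-engine --set-maintenance` and `hosted-engine --vm-shutdown` are the real oVirt CLI commands, but the `run()` wrapper and the DRY_RUN guard are illustrative additions so the sketch can be inspected without an actual oVirt host.

```shell
#!/bin/sh
# Sketch of the shutdown sequence described above.
# DRY_RUN defaults to on so this is safe to run anywhere;
# set DRY_RUN= (empty) on a real hosted-engine host to execute.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run hosted-engine --set-maintenance --mode=global  # stop HA monitoring first
run hosted-engine --vm-shutdown                    # cleanly stop the engine VM
run shutdown -h now                                # finally halt the host
```

Entering global maintenance first matters: without it the HA agent treats the stopped engine VM as a failure and restarts it.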
Please note that at some point I had to power off the hypervisor in the previous step, because it stalled while trying to stop two processes: "Watchdog Multiplexing Daemon" and "Shared Storage Lease Manager"
https://drive.google.com/file/d/0BwoPbcrMv8mvTVoyNzhRNGpqN1U/view?usp=sharin...
It was apparently able to stop the "Watchdog Multiplexing Daemon" after some minutes
https://drive.google.com/file/d/0BwoPbcrMv8mvZExNNkw5LVBiXzA/view?usp=sharin...
But there was no way to stop the "Shared Storage Lease Manager", and the screen above is from when I forced a power off yesterday, after global maintenance, a correct shutdown of the SH engine, and a stalled shutdown of the hypervisor.
OK. And for starting everything again, is this correct:
a) power on the hypervisor
b) hosted-engine --set-maintenance --mode=none
Are any other steps required?
No, that's correct
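The confirmed restart sequence can be sketched the same way. Again a hedged sketch: `hosted-engine --set-maintenance --mode=none` and `hosted-engine --vm-status` are real oVirt commands, while the `run()` wrapper and DRY_RUN guard are illustrative only.

```shell
#!/bin/sh
# Sketch of the restart sequence: power the host on, then leave global
# maintenance so the HA agent can start the engine VM on its own.
# DRY_RUN defaults to on; set DRY_RUN= (empty) on a real host to execute.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# (the host itself is powered on manually or via IPMI -- not shown)
run hosted-engine --set-maintenance --mode=none  # leave global maintenance
run hosted-engine --vm-status                    # watch until the engine VM comes up
```

No explicit VM start is needed: once maintenance mode is cleared, the HA agent is expected to start the engine VM automatically.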
Today, after powering on the hypervisor and waiting about 6 minutes, I ran:
[root@ovc71 ~]# ps -ef|grep qemu
root      2104  1985  0 15:41 pts/0    00:00:00 grep --color=auto qemu
--> as expected, no VM is running
[root@ovc71 ~]# systemctl status vdsmd
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: active (running) since Fri 2015-10-23 15:34:46 CEST; 3min 25s ago
  Process: 1666 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 1745 (vdsm)
   CGroup: /system.slice/vdsmd.service
           ├─1745 /usr/bin/python /usr/share/vdsm/vdsm
           └─1900 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd 55 --max-threads 10 --...
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 2
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 parse_server_challenge()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 3
--> I think it is expected that vdsmd starts anyway, even in global maintenance; is that correct?
But then:
[root@ovc71 ~]# hosted-engine --set-maintenance --mode=none
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 73, in <module>
    if not maintenance.set_mode(sys.argv[1]):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 61, in set_mode
    value=m_global,
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 259, in set_maintenance_mode
    str(value))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 201, in set_global_md_flag
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
ovirt_hosted_engine_ha.lib.exceptions.BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)
What should I do next?
Are ovirt-ha-agent and ovirt-ha-broker up and running? Can you please try to restart them via systemd?
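The suggested check-and-restart can be sketched as follows. `ovirt-ha-broker` and `ovirt-ha-agent` are the real systemd unit names for the hosted-engine HA services; the `run()` wrapper and DRY_RUN guard are illustrative additions, and restarting the broker before the agent reflects the agent's dependency on the broker.

```shell
#!/bin/sh
# Sketch: check and restart the two hosted-engine HA services, then retry
# leaving maintenance once the broker is reachable again.
# DRY_RUN defaults to on; set DRY_RUN= (empty) on a real host to execute.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run systemctl status ovirt-ha-broker ovirt-ha-agent   # see why the broker is down
run systemctl restart ovirt-ha-broker ovirt-ha-agent  # broker first, then agent
run hosted-engine --set-maintenance --mode=none       # retry the failed command
```

The `BrokerConnectionError` traceback above means the client could not reach ovirt-ha-broker's socket, which is why checking and restarting these two services is the first step.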