[ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error
Simone Tiraboschi
stirabos at redhat.com
Fri Oct 23 14:42:38 UTC 2015
On Fri, Oct 23, 2015 at 3:57 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com>
wrote:
> On Thu, Oct 22, 2015 at 5:38 PM, Simone Tiraboschi <stirabos at redhat.com>
> wrote:
>
>>
>>
>>> In case I want to set up a single host with self hosted engine, could I
>>> configure on the hypervisor
>>> a) one NFS share for the sh engine
>>> b) one NFS share for the ISO DOMAIN
>>> c) a local filesystem to be used to then create a local POSIX compliant
>>> FS storage domain
>>> and work this way as a replacement for all-in-one?
>>>
>>
>> Yes, but c is just a workaround; using another external NFS share would
>> help a lot if in the future you plan to add or migrate to a new server.
>>
>
> Why do you see this as a workaround, if I plan to have this for example as
> a personal devel infra with no other hypervisors?
> I would expect better performance going directly to local storage instead of
> adding the overhead of NFS on top of it...
>
Just because you are using, as shared storage, something that is not really
shared.
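For what it's worth, the NFS shares in (a) and (b) can be served from the hypervisor itself; a rough sketch follows, where the paths, export options and the vdsm uid/gid 36 are assumptions of mine, not details from this thread:

# /etc/exports (hypothetical paths)
/exports/hosted_engine  *(rw,anonuid=36,anongid=36,all_squash)
/exports/iso            *(rw,anonuid=36,anongid=36,all_squash)

# ownership expected by vdsm, then publish the exports
chown 36:36 /exports/hosted_engine /exports/iso
exportfs -ra
systemctl enable nfs-server
systemctl start nfs-server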
>
>>>> Put the host in global maintenance (otherwise the engine VM will be
>>>> restarted)
>>>> Shutdown the engine VM
>>>> Shutdown the host
>>>>
>>>>
>>>
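In command form, the shutdown sequence quoted above is roughly the following (a sketch; the engine VM can also be shut down from inside the VM itself instead of via hosted-engine):

hosted-engine --set-maintenance --mode=global   # stop the HA agents from restarting the engine VM
hosted-engine --vm-shutdown                     # ask the engine VM to shut down cleanly
hosted-engine --vm-status                       # repeat until the engine VM is reported as down
shutdown -h now                                 # finally power off the host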
> Please note that at some point I had to power off the hypervisor in the
> previous step, because it was stalled trying to stop two processes:
> "Watchdog Multiplexing Daemon"
> and
> "Shared Storage Lease Manager"
>
> https://drive.google.com/file/d/0BwoPbcrMv8mvTVoyNzhRNGpqN1U/view?usp=sharing
>
> It was apparently able to stop the "Watchdog Multiplexing Daemon" after
> some minutes
>
> https://drive.google.com/file/d/0BwoPbcrMv8mvZExNNkw5LVBiXzA/view?usp=sharing
>
> But there was no way to stop the Shared Storage Lease Manager, and the screen
> above is from when I forced a power off yesterday, after global maintenance, a
> correct shutdown of the sh engine, and a stalled shutdown of the hypervisor.
>
>
>
>
>
>>
>>>>>
>>> Ok. And to start everything again, is this correct:
>>>
>>> a) power on the hypervisor
>>> b) hosted-engine --set-maintenance --mode=none
>>>
>>> Are any other steps required?
>>>
>>>
>> No, that's correct
>>
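Spelled out, the power-on side then looks like this; the --vm-status check is an extra assumption of mine, just to confirm the HA broker and agent answer before leaving maintenance:

hosted-engine --vm-status                     # should print the host/engine state, not a broker error
hosted-engine --set-maintenance --mode=none   # leave global maintenance so the agent can start the engine VM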
>
>
> Today, after powering on the hypervisor and waiting about 6 minutes, I ran:
>
> [root@ovc71 ~]# ps -ef|grep qemu
> root 2104 1985 0 15:41 pts/0 00:00:00 grep --color=auto qemu
>
> --> as expected, no VM is running
>
> [root@ovc71 ~]# systemctl status vdsmd
> vdsmd.service - Virtual Desktop Server Manager
>    Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
>    Active: active (running) since Fri 2015-10-23 15:34:46 CEST; 3min 25s ago
>   Process: 1666 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
>  Main PID: 1745 (vdsm)
>    CGroup: /system.slice/vdsmd.service
>            ├─1745 /usr/bin/python /usr/share/vdsm/vdsm
>            └─1900 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd 55 --max-threads 10 --...
>
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 2
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 parse_server_challenge()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 3
>
> --> I think it is expected that vdsmd starts anyway, even in global
> maintenance, is that correct?
>
> But then:
>
> [root@ovc71 ~]# hosted-engine --set-maintenance --mode=none
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 73, in <module>
>     if not maintenance.set_mode(sys.argv[1]):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 61, in set_mode
>     value=m_global,
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 259, in set_maintenance_mode
>     str(value))
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 201, in set_global_md_flag
>     with broker.connection(self._retries, self._wait):
>   File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
>     return self.gen.next()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
>     self.connect(retries, wait)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
>     raise BrokerConnectionError(error_msg)
> ovirt_hosted_engine_ha.lib.exceptions.BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)
>
> What to do next?
>
Are ovirt-ha-agent and ovirt-ha-broker up and running?
Can you please try to restart them via systemd?
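Along those lines, a possible sequence of checks; the service names are the standard ovirt-hosted-engine-ha units, and the last two commands are just a sketch of the retry, not something already verified here:

systemctl status ovirt-ha-broker ovirt-ha-agent
systemctl restart ovirt-ha-broker
systemctl restart ovirt-ha-agent
journalctl -u ovirt-ha-broker -u ovirt-ha-agent   # check for errors if they fail to stay up
hosted-engine --vm-status                         # should answer once the broker is reachable again
hosted-engine --set-maintenance --mode=none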