On Fri, Oct 23, 2015 at 3:57 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
On Thu, Oct 22, 2015 at 5:38 PM, Simone Tiraboschi
<stirabos(a)redhat.com>
wrote:
>
>
>> In case I want to set up a single host with self-hosted engine, could I
>> configure on the hypervisor:
>> a) one NFS share for sh engine
>> b) one NFS share for ISO DOMAIN
>> c) a local filesystem on which to then create a local POSIX-compliant
>> FS storage domain
>> and work this way as a replacement for all-in-one?
>>
>
> Yes, but c) is just a workaround; using another external NFS share would
> help a lot if in the future you plan to add or migrate to a new server.
>
Why do you see this as a workaround, if I plan to use this, for example, as
a personal devel infra with no other hypervisors?
I expect better performance from going directly local instead of adding the
overhead of NFS to itself...
Just because you are using as shared storage something that is not really
shared.
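If one follows the suggestion of keeping NFS even on a single box, the host can export local directories to itself. A minimal sketch, where the paths and export options are purely illustrative assumptions, not taken from this thread:

```shell
# Assumed layout: two local directories exported over NFS for the
# self-hosted engine domain and the ISO domain.
mkdir -p /srv/ovirt/engine /srv/ovirt/iso
chown 36:36 /srv/ovirt/engine /srv/ovirt/iso   # vdsm:kvm ownership expected by oVirt

cat >> /etc/exports <<'EOF'
/srv/ovirt/engine *(rw,anonuid=36,anongid=36,all_squash)
/srv/ovirt/iso    *(rw,anonuid=36,anongid=36,all_squash)
EOF

systemctl enable --now nfs-server
exportfs -ra
```

This keeps the storage domains "shared" from oVirt's point of view, so a future second host or a migration only needs the exports re-pointed.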
>>> Put the host in global maintenance (otherwise the engine VM will be
>>> restarted)
>>> Shutdown the engine VM
>>> Shutdown the host
>>>
>>>
>>
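The quoted shutdown steps map roughly to the following commands (a sketch using the `hosted-engine` CLI options for maintenance, VM shutdown, and status):

```shell
# Sketch of the shutdown sequence quoted above.
hosted-engine --set-maintenance --mode=global   # keep HA agents from restarting the engine VM
hosted-engine --vm-shutdown                     # ask the engine VM to shut down cleanly
hosted-engine --vm-status                       # poll until the VM is reported down
shutdown -h now                                 # then power off the host
```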
Please note that at some point I had to power off the hypervisor in the
previous step, because it stalled while trying to stop two processes:
"Watchdog Multiplexing Daemon"
and
"Shared Storage Lease Manager"
https://drive.google.com/file/d/0BwoPbcrMv8mvTVoyNzhRNGpqN1U/view?usp=sha...
It was apparently able to stop the "Watchdog Multiplexing Daemon" after
some minutes
https://drive.google.com/file/d/0BwoPbcrMv8mvZExNNkw5LVBiXzA/view?usp=sha...
But there was no way to stop the Shared Storage Lease Manager; the screen
above is from when I forced a power off yesterday, after global maintenance,
a clean shutdown of the self-hosted engine, and a host shutdown that then
stalled.
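For reference, the two services named on the console are wdmd (Watchdog Multiplexing Daemon) and sanlock (Shared Storage Lease Manager). One possible workaround, untested here, is to stop the oVirt HA services that hold the sanlock leases before powering off, so the lease manager has nothing left to wait for:

```shell
# Untested sketch: release leases before shutdown so sanlock/wdmd can exit.
systemctl stop ovirt-ha-agent ovirt-ha-broker   # stop the lease holders first
systemctl stop vdsmd
systemctl stop sanlock wdmd
shutdown -h now
```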
>
>>>>
>> Ok. And for starting all again, is this correct:
>>
>> a) power on hypervisor
>> b) hosted-engine --set-maintenance --mode=none
>>
>> other steps required?
>>
>>
> No, that's correct
>
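A possible check sequence after power-on, before leaving maintenance (a sketch based on the confirmed steps above, with a service sanity check added as an assumption):

```shell
# Verify the relevant services came up, then clear global maintenance.
systemctl status vdsmd ovirt-ha-agent ovirt-ha-broker
hosted-engine --vm-status                       # should still show the global maintenance flag
hosted-engine --set-maintenance --mode=none
```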
Today, after powering on the hypervisor and waiting about 6 minutes, I ran:
[root@ovc71 ~]# ps -ef|grep qemu
root 2104 1985 0 15:41 pts/0 00:00:00 grep --color=auto qemu
--> as expected no VM in execution
[root@ovc71 ~]# systemctl status vdsmd
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: active (running) since Fri 2015-10-23 15:34:46 CEST; 3min 25s ago
  Process: 1666 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 1745 (vdsm)
   CGroup: /system.slice/vdsmd.service
           ├─1745 /usr/bin/python /usr/share/vdsm/vdsm
           └─1900 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd 55 --max-threads 10 --...

Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 2
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 parse_server_challenge()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 3
--> I think it is expected that vdsmd starts anyway, even in global
maintenance; is that correct?
But then:
[root@ovc71 ~]# hosted-engine --set-maintenance --mode=none
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 73, in <module>
    if not maintenance.set_mode(sys.argv[1]):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 61, in set_mode
    value=m_global,
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 259, in set_maintenance_mode
    str(value))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 201, in set_global_md_flag
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
ovirt_hosted_engine_ha.lib.exceptions.BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)
What to do next?
Are ovirt-ha-agent and ovirt-ha-broker up and running?
Can you please try to restart them via systemd?
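In shell terms, the suggested recovery would look something like this (a sketch of the restart-and-retry sequence):

```shell
# Restart the HA services the broker error points at, then retry.
systemctl restart ovirt-ha-broker ovirt-ha-agent
systemctl status ovirt-ha-broker ovirt-ha-agent
hosted-engine --set-maintenance --mode=none
```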