
On Fri, Oct 23, 2015 at 3:57 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Oct 22, 2015 at 5:38 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:
In case I want to set up a single host with a self-hosted engine, could I configure on the hypervisor:
a) one NFS share for the SH engine
b) one NFS share for the ISO domain
c) a local filesystem, used to create a local POSIX-compliant FS storage domain
and work this way as a replacement for all-in-one?
Yes, but c) is just a workaround; using another external NFS share would help a lot if in the future you plan to add or migrate to a new server.
Why do you see this as a workaround, if I plan to have this, for example, as a personal devel infra with no other hypervisors? I would expect better performance going directly to local storage instead of adding the overhead of NFS to the same host...
Just because you are using, as shared storage, something that is not really shared.
Put the host in global maintenance (otherwise the engine VM will be restarted)
Shut down the engine VM
Shut down the host
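The shutdown sequence above can be sketched as a small script. This is a hedged sketch: `hosted-engine --set-maintenance` and `hosted-engine --vm-shutdown` are the real oVirt CLI commands, but the `run()` wrapper and the DRY_RUN guard are illustrative additions so the sketch can be inspected without an actual oVirt host.

```shell
#!/bin/sh
# Sketch of the shutdown sequence described above.
# DRY_RUN defaults to on so this is safe to run anywhere;
# set DRY_RUN= (empty) on a real hosted-engine host to execute.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run hosted-engine --set-maintenance --mode=global  # stop HA monitoring first
run hosted-engine --vm-shutdown                    # cleanly stop the engine VM
run shutdown -h now                                # finally halt the host
```

Entering global maintenance first matters: without it the HA agent treats the stopped engine VM as a failure and restarts it.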
Please note that at some point I had to power off the hypervisor in the previous step, because it stalled while trying to stop two processes: "Watchdog Multiplexing Daemon" and "Shared Storage Lease Manager"
https://drive.google.com/file/d/0BwoPbcrMv8mvTVoyNzhRNGpqN1U/view?usp=sharin...
It was apparently able to stop the "Watchdog Multiplexing Daemon" after some minutes
https://drive.google.com/file/d/0BwoPbcrMv8mvZExNNkw5LVBiXzA/view?usp=sharin...
But there was no way to stop the "Shared Storage Lease Manager", and the screen above is from when I forced a power off yesterday, after global maintenance, a correct shutdown of the SH engine, and a stalled shutdown of the hypervisor.
OK. And for starting everything again, is this correct:
a) power on the hypervisor
b) hosted-engine --set-maintenance --mode=none
Are any other steps required?
No, that's correct
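The confirmed restart sequence can be sketched the same way. Again a hedged sketch: `hosted-engine --set-maintenance --mode=none` and `hosted-engine --vm-status` are real oVirt commands, while the `run()` wrapper and DRY_RUN guard are illustrative only.

```shell
#!/bin/sh
# Sketch of the restart sequence: power the host on, then leave global
# maintenance so the HA agent can start the engine VM on its own.
# DRY_RUN defaults to on; set DRY_RUN= (empty) on a real host to execute.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# (the host itself is powered on manually or via IPMI -- not shown)
run hosted-engine --set-maintenance --mode=none  # leave global maintenance
run hosted-engine --vm-status                    # watch until the engine VM comes up
```

No explicit VM start is needed: once maintenance mode is cleared, the HA agent is expected to start the engine VM automatically.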
Today, after powering on the hypervisor and waiting about 6 minutes, I ran:
[root@ovc71 ~]# ps -ef|grep qemu
root      2104  1985  0 15:41 pts/0    00:00:00 grep --color=auto qemu
--> as expected, no VM is running
[root@ovc71 ~]# systemctl status vdsmd
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: active (running) since Fri 2015-10-23 15:34:46 CEST; 3min 25s ago
  Process: 1666 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 1745 (vdsm)
   CGroup: /system.slice/vdsmd.service
           ├─1745 /usr/bin/python /usr/share/vdsm/vdsm
           └─1900 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd 55 --max-threads 10 --...
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 2
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 parse_server_challenge()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client step 3
--> I think it is expected that vdsmd starts anyway, even in global maintenance; is that correct?
But then:
[root@ovc71 ~]# hosted-engine --set-maintenance --mode=none
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 73, in <module>
    if not maintenance.set_mode(sys.argv[1]):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 61, in set_mode
    value=m_global,
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 259, in set_maintenance_mode
    str(value))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 201, in set_global_md_flag
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
ovirt_hosted_engine_ha.lib.exceptions.BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)
What should I do next?
Are ovirt-ha-agent and ovirt-ha-broker up and running? Can you please try to restart them via systemd?
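The suggested check-and-restart can be sketched as follows. `ovirt-ha-broker` and `ovirt-ha-agent` are the real systemd unit names for the hosted-engine HA services; the `run()` wrapper and DRY_RUN guard are illustrative additions, and restarting the broker before the agent reflects the agent's dependency on the broker.

```shell
#!/bin/sh
# Sketch: check and restart the two hosted-engine HA services, then retry
# leaving maintenance once the broker is reachable again.
# DRY_RUN defaults to on; set DRY_RUN= (empty) on a real host to execute.
DRY_RUN=${DRY_RUN-1}

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run systemctl status ovirt-ha-broker ovirt-ha-agent   # see why the broker is down
run systemctl restart ovirt-ha-broker ovirt-ha-agent  # broker first, then agent
run hosted-engine --set-maintenance --mode=none       # retry the failed command
```

The `BrokerConnectionError` traceback above means the client could not reach ovirt-ha-broker's socket, which is why checking and restarting these two services is the first step.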