On Wed, Jun 3, 2020 at 4:30 PM Gianluca Cecchi
<gianluca.cecchi(a)gmail.com> wrote:
On Mon, May 25, 2020 at 8:50 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
>
> On Sun, May 24, 2020 at 9:36 PM Gianluca Cecchi
> <gianluca.cecchi(a)gmail.com> wrote:
> >
> > On Sun, May 24, 2020 at 11:47 AM Yedidyah Bar David <didi(a)redhat.com>
wrote:
> >>
> >>
> >>
> >> Hi, Gianluca. Replying to your email on "4.4 HCI Install Failure -
> >> Missing /etc/pki/CA/cacert.pem":
> >>
> >> On Sun, May 24, 2020 at 12:28 PM Gianluca Cecchi
> >> <gianluca.cecchi(a)gmail.com> wrote:
> >> >
> >> > If I remember correctly, it happened to me during the beta cycle, and the
only "strange" character I used for the admin password was the @.
> >> > I don't know if it is related to what you reported for the % character.
> >>
> >> Did you open a bug?
> >>
> >> In any case, my above patch is not supposed to fix '@', only
'%' (I think).
> >>
> >> Thanks and best regards,
> >>
> >
> > No, I didn't open a bug, because I scrapped the system and installed again,
this time without the error, but I don't remember whether I used the same password with the
@ character or not.
> > I will pay attention to this in future 4.4 installations.
>
> Very well, thanks :-)
> --
> Didi
Just to avoid opening a bug for a different thing, today I tried a single host HCI setup
with the wizard and it failed.
Installed from ovirt-node-ng final 4.4 iso.
I see I have no /etc/pki/CA directory on the host at the moment, but I don't know
whether the install workflow simply had not reached that step yet.
The last lines in my wizard window are the ones below.
The password used in this attempt contains only letters, numbers and the "_"
character.
I'm in the "Prepare VM" stage.
[ INFO ] TASK [ovirt.hosted_engine_setup : Stop libvirt service]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Drop vdsm config statements]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Restore initial abrt config files]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Restart abrtd service]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Drop libvirt sasl2 configuration by vdsm]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Stop and disable services]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Restore initial libvirt default network
configuration]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Start libvirt]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
"Unable to start service libvirtd: Job for libvirtd.service failed because the
control process exited with error code.\nSee \"systemctl status
libvirtd.service\" and \"journalctl -xe\" for details.\n"}
Status of libvirtd service is this one:
[root@ovirt01 g.cecchi]# systemctl status libvirtd -l --no-pager
● libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset:
enabled)
Drop-In: /etc/systemd/system/libvirtd.service.d
└─unlimited-core.conf
Active: failed (Result: exit-code) since Wed 2020-06-03 15:13:35 CEST; 7min ago
Docs: man:libvirtd(8)
https://libvirt.org
Process: 20001 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=exited, status=6)
Main PID: 20001 (code=exited, status=6)
Tasks: 2 (limit: 32768)
Memory: 70.1M
CGroup: /system.slice/libvirtd.service
├─3926 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
--leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
└─3927 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
--leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service: Service
RestartSec=100ms expired, scheduling restart.
Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service: Scheduled restart
job, restart counter is at 5.
Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: Stopped Virtualization daemon.
Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service: Start request
repeated too quickly.
This simply looks like libvirt was restarted too many times too quickly. Did
you restart it yourself before the deploy?
If not, and it's only due to the deploy process restarting it, then either:
1. That's ok, and we should simply configure systemd to allow that,
2. or there is some other problem, and restarting libvirt is just a symptom.
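To illustrate the "repeated too quickly" message, here is a rough Python sketch of a start rate limit of the kind systemd applies per unit. It is only a simplified model, not systemd's code. The 10 second interval and burst of 5 are what I believe the stock defaults to be (StartLimitIntervalSec / StartLimitBurst), and the class name StartLimiter is made up for this example; the unit on your host may override these values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartLimiter:
    """Simplified model: allow at most `burst` starts per `interval` seconds."""

    interval: float = 10.0  # assumed default interval, in seconds
    burst: int = 5          # assumed default number of starts per interval
    window_begin: Optional[float] = None
    count: int = 0

    def allow(self, now: float) -> bool:
        """Return True if a start attempt at time `now` is permitted."""
        # Open a new window on the first attempt or once the old one expired.
        if self.window_begin is None or now - self.window_begin > self.interval:
            self.window_begin = now
            self.count = 1
            return True
        # Inside the current window: permit until the burst is used up.
        if self.count < self.burst:
            self.count += 1
            return True
        return False


# Start attempts spaced 100 ms apart, as with RestartSec=100ms in the log.
limiter = StartLimiter()
attempts = [limiter.allow(i * 0.1) for i in range(6)]
print(attempts)
```

With attempts only 100 ms apart, the burst is used up within half a second, so a daemon that exits immediately on every start hits the limit almost at once. That is why the limit by itself does not tell us why libvirtd exited with status 6.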
Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service:
Failed with result 'exit-code'.
Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: Failed to start Virtualization
daemon.
[root@ovirt01 g.cecchi]#
Let me know which files you need to analyze the problem.
Under /var/log/ovirt-hosted-engine-setup I have:
[root@ovirt01 ovirt-hosted-engine-setup]# ls -lrt
total 632
-rw-r--r--. 1 root root 123814 Jun 3 15:07
ovirt-hosted-engine-setup-ansible-get_network_interfaces-20205315737-ooohyb.log
-rw-r--r--. 1 root root 126674 Jun 3 15:08
ovirt-hosted-engine-setup-ansible-validate_hostnames-20205315737-oqixuw.log
-rw-r--r--. 1 root root 127548 Jun 3 15:10
ovirt-hosted-engine-setup-ansible-validate_hostnames-202053151022-yls4qo.log
-rw-r--r--. 1 root root 261482 Jun 3 15:13
ovirt-hosted-engine-setup-ansible-initial_clean-20205315123-7x25zv.log
[root@ovirt01 ovirt-hosted-engine-setup]#
Perhaps check these for libvirt restarts, and also /var/log/messages (or
journalctl).
Do you see other errors there?
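If it helps, a small Python sketch like the one below can count the relevant libvirtd messages in a log. The patterns are copied from the systemctl status output above; the function name summarize and the category names are made up for this example, and it assumes each log entry sits on a single line (as in /var/log/messages, not as wrapped in this mail).

```python
import re

# Message fragments taken from the systemctl status output in this thread.
PATTERNS = {
    "scheduled_restart": re.compile(r"libvirtd\.service: Scheduled restart job"),
    "rate_limited": re.compile(r"libvirtd\.service: Start request repeated too quickly"),
    "failed": re.compile(r"libvirtd\.service: Failed with result"),
}


def summarize(lines):
    """Count how many lines match each libvirtd-related pattern."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


sample = [
    "Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service: "
    "Scheduled restart job, restart counter is at 5.",
    "Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service: "
    "Start request repeated too quickly.",
    "Jun 03 15:13:35 ovirt01.mydomain.local systemd[1]: libvirtd.service: "
    "Failed with result 'exit-code'.",
]
print(summarize(sample))
```

To run it on a real file, something like summarize(open("/var/log/messages", errors="replace")) should work. Lines just before the first "Scheduled restart job" entry are the ones most likely to show the actual libvirtd error.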
This environment has no DNS, only entries in /etc/hosts on the server.
The host is on 192.168.1.x on eno1, with VLAN 100 on the same interface used for a
"simulated" storage network.
For now, I'd assume that's not related, but can't be sure.
Thanks and best regards,
--
Didi