Thanks for your input, Didi.

On 12/5/21 1:20 PM, Yedidyah Bar David wrote:

On Sun, Dec 5, 2021 at 8:13 PM Valerio Luccio <valerio.luccio@nyu.edu> wrote:

I stopped the VMs and then ran "hosted-engine --vm-shutdown" followed by "hosted-engine --vm-start". I thought that would restart everything, but apparently not.

Now I just did "systemctl status ovirt-ha-broker" and got this ugly thing:

ovirt-ha-broker ovirt_hosted_engine_ha.broker.notifications.Notifications ERRO>
                                                             Traceback (most recent call last):
                                                               File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/notific>
                                                                 timeout=float(cfg["smtp-timeout"]))
                                                               File "/usr/lib64/python3.6/smtplib.py", line 251, in __init__
                                                                 (code, msg) = self.connect(host, port)
                                                               File "/usr/lib64/python3.6/smtplib.py", line 336, in connect
                                                                 self.sock = self._get_socket(host, port, self.timeout)
                                                               File "/usr/lib64/python3.6/smtplib.py", line 307, in _get_socket
                                                                 self.source_address)
                                                               File "/usr/lib64/python3.6/socket.py", line 724, in create_connection
                                                                 raise err
                                                               File "/usr/lib64/python3.6/socket.py", line 713, in create_connection
                                                                 sock.connect(sa)
                                                             ConnectionRefusedError: [Errno 111] Connection refused

This is harmless and can be ignored. It most likely happens because notifications are configured to be sent by email to an unreachable mail server (the default is localhost). I suggest checking the logs for other errors.
Thanks, I'll dig around. Glad to hear it's harmless.
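
For anyone who wants to confirm the diagnosis above, here is a minimal sketch that tries the same TCP connection that fails at the bottom of the traceback. The function name is made up for illustration, and port 25 is an assumption (the standard SMTP port); the host and port the broker actually uses come from its own notification configuration, which is not shown in this thread.

```python
import socket


def smtp_port_reachable(host="localhost", port=25, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: host/port defaults are assumptions, not values
    read from the broker configuration.
    """
    try:
        # socket.create_connection is the call that raises
        # ConnectionRefusedError in the traceback above.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError, timeouts and name-resolution failures
        # are all OSError subclasses.
        return False
```

Calling smtp_port_reachable() on the host and getting False would be consistent with nothing listening on the local mail port, which matches the "Connection refused" in the broker log.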
 

And

systemctl status --no-pager --full ovirt-ha-agent
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-11-19 15:17:19 EST; 2 weeks 1 days ago
 Main PID: 14290 (ovirt-ha-agent)
    Tasks: 2 (limit: 406449)
   Memory: 105.8M
   CGroup: /system.slice/ovirt-ha-agent.service
           └─14290 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent

Dec 05 05:34:40 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore ERROR Unable to extract HEVM OVF
Dec 05 05:34:40 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf

That's not good. You might find more details in the HA logs or vdsm logs.
I'll see what I can find out.
 
Dec 05 07:34:37 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore ERROR Unable to extract HEVM OVF
Dec 05 07:34:37 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
Dec 05 08:34:45 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore ERROR Unable to extract HEVM OVF
Dec 05 08:34:45 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
Dec 05 10:34:39 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore ERROR Unable to extract HEVM OVF
Dec 05 10:34:39 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
Dec 05 11:34:44 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore ERROR Unable to extract HEVM OVF
Dec 05 11:34:44 argus.cbi.fas.nyu.edu ovirt-ha-agent[14290]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
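
Since the same two errors repeat about once an hour, a small script can help when digging through large logs. The following is only a sketch, assuming the journal line layout shown above (timestamp, host, unit[pid], unit name, logger, level, message); the function name and the regular expression are illustrative and not part of oVirt.

```python
import re
from collections import Counter

# Assumed layout, taken from the lines quoted above:
#   "<Mon DD HH:MM:SS> <host> <unit>[<pid>]: <unit> <logger> ERROR <message>"
LINE_RE = re.compile(
    r"^\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \S+ \S+\[\d+\]: \S+ "
    r"(?P<logger>\S+) ERROR (?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count ERROR entries per (logger, message) pair; other lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("logger"), match.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Print the most frequent errors first.
    for (logger, msg), n in summarize_errors(sys.stdin).most_common():
        print(f"{n:6d}  {logger}: {msg}")
```

One way to use it would be to pipe "journalctl -u ovirt-ha-agent --no-pager" into the script, which collapses the repeated entries into one line per distinct error with a count.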

Is the broker error causing the agent error, or vice versa?

All I did a couple of weeks ago was reboot the host (after having shut down the VMs and stopped the engine).

The engine and VMs seem to be working fine.


That's good, so not all hope is lost :-). But it's better to find the root cause and handle it.
I didn't check the status because everything was working. I didn't suspect a problem until I saw those huge log files.

Best regards,
 

Thanks for all of your help.

On 12/2/21 1:15 PM, Strahil Nikolov wrote:

You need to restart ovirt-ha-broker.service and ovirt-ha-agent.service

Best Regards,
Strahil Nikolov

On Mon, Nov 29, 2021 at 18:31, Valerio Luccio
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ABCACVKMVJ6YE2UE2TSOAQSFUOXRWNMN/
--
Valerio Luccio    
High Performance Computing     10 Astor Place, Room 416D
New York University     New York, NY 10003

"In an open world, who needs windows or gates ?"


--
Didi


--
Valerio Luccio    
High Performance Computing     10 Astor Place, Room 416D
New York University     New York, NY 10003

"In an open world, who needs windows or gates ?"