[ovirt-users] Solved: Re: 3.5 to 3.6 upgrade stuck

Simone Tiraboschi stirabos at redhat.com
Fri Jul 22 07:58:16 UTC 2016


On Fri, Jul 22, 2016 at 4:11 AM, Robert Story <rstory at tislabs.com> wrote:
> On Thu, 21 Jul 2016 16:04:41 -0400 Robert wrote:
> RS> Thread-1::config::278::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
> RS>      ::(refresh_local_conf_file) local conf file was correctly written
> RS>
> RS> And then .... nothing. It just hangs. Nothing more is logged for Thread-1.
>
> So I started digging around the python source, starting from
> refresh_local_conf_file. I ended up in ./broker/notifications.py, in
> send_email. I added some logging:
>
> def send_email(cfg, email_body):
>     """Send email."""
>
>     logger = logging.getLogger("%s.Notifications" % __name__)
>
>     try:
>         logger.debug("#### setting up smtp 1")
>         server = smtplib.SMTP(cfg["smtp-server"], port=cfg["smtp-port"])
>         logger.debug("#### setting up smtp 2")
>         ...
>
> Now the final messages are:
>
> Thread-1::DEBUG::2016-07-21 21:35:05,280::config::278::
>   ovirt_hosted_engine_ha.broker.notifications.Notifications.config::
>   (refresh_local_conf_file) local conf file was correctly written
> Thread-1::DEBUG::2016-07-21 21:35:05,282::notifications::27::
>   ovirt_hosted_engine_ha.broker.notifications.Notifications::
>   (send_email) #### setting up smtp 1
>
>
> So the culprit is:
>
>         server = smtplib.SMTP(cfg["smtp-server"], port=cfg["smtp-port"])
>
> Note that this does actually send the email - 2 minutes later.

Thanks for your time and effort, Robert!
In general, the agent shouldn't get stuck if the broker is unable to
send a notification email within a certain amount of time.
I'm opening a bug to track this. Adding Martin here.
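
As a rough illustration of the kind of guard that would avoid the hang
(this is a sketch, not the actual broker code): smtplib.SMTP accepts a
timeout argument, so the connection attempt can be bounded. The helper
name open_smtp and the 10 second default are assumptions made up for
this example, and the exception handling assumes Python 3.

```python
import smtplib


def open_smtp(host, port, timeout=10.0):
    """Return a connected smtplib.SMTP object, or None if the server
    cannot be reached or does not greet us within `timeout` seconds.

    Hypothetical helper for illustration only; it is not part of
    ovirt-hosted-engine-ha.
    """
    try:
        # The timeout applies to the TCP connect and to the read of the
        # server greeting, so an unresponsive address fails after
        # `timeout` seconds instead of blocking the caller indefinitely.
        return smtplib.SMTP(host, port=port, timeout=timeout)
    except OSError:
        # On Python 3, socket timeouts, refused connections and
        # smtplib.SMTPException are all OSError subclasses.
        return None
```

A caller would then skip the notification, and log the failure, when
open_smtp() returns None rather than waiting on the mail server.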

> So I tried:
>
>   $ telnet localhost 25
>   Trying ::1...
>
> which hung, and a little bell went off in my brain...
>
> After changing /etc/hosts from:
>
> 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
> ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
>
> to
>
> 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
> ::1         localhost6 localhost6.localdomain6
>
> localhost resolves to 127.0.0.1, the delay is gone, and everything is fine.

We are also seeing similar reports of IPv4/IPv6 issues when migrating
to 4.0. See also
http://lists.ovirt.org/pipermail/users/2016-June/040578.html and
https://bugzilla.redhat.com/show_bug.cgi?id=1358530

Adding Oved here.
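
To check on a given host which address "localhost" resolves to first,
something like the following sketch may help. It lists the results of
socket.getaddrinfo in the order the resolver returns them, which is the
order socket.create_connection (used by smtplib) tries them. The
function name resolution_order is made up for this example, and it
assumes Python 3.

```python
import socket


def resolution_order(host, port):
    """Return (address family, address) pairs for `host`, in the order
    the resolver returns them for a TCP connection."""
    results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family.name, sockaddr[0])
            for family, _type, _proto, _canon, sockaddr in results]
```

Calling resolution_order("localhost", 25) on a host with the original
/etc/hosts would be expected to list an AF_INET6 entry for ::1, which
matches the "Trying ::1..." seen with telnet.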

> I don't want to update /etc/hosts on each host. Is there somewhere I can
> edit the broker config for mail?

The quickest option is to edit broker.conf inside the configuration
volume on the hosted-engine storage domain, but it's a bit tricky and
also potentially dangerous if not done carefully.
We have an RFE about letting you reconfigure it from the engine. For
now, if you are brave enough, please try something like this:

dir=$(mktemp -d) && cd "$dir"
# replace with your local mount point
mnt_point=/rhev/data-center/mnt/192.168.1.115:_Virtual_ext35u36
systemctl stop ovirt-ha-broker # on all the hosts!
sdUUID_line=$(grep sdUUID /etc/ovirt-hosted-engine/hosted-engine.conf)
sdUUID=${sdUUID_line:7:36}
conf_volume_UUID_line=$(grep conf_volume_UUID /etc/ovirt-hosted-engine/hosted-engine.conf)
conf_volume_UUID=${conf_volume_UUID_line:17:36}
conf_image_UUID_line=$(grep conf_image_UUID /etc/ovirt-hosted-engine/hosted-engine.conf)
conf_image_UUID=${conf_image_UUID_line:16:36}
# extract the current configuration archive into the temporary directory
sudo -u vdsm dd if=$mnt_point/$sdUUID/images/$conf_image_UUID/$conf_volume_UUID 2>/dev/null | tar -xvf -
# here you have to edit the locally extracted broker.conf
tar -cO * | sudo -u vdsm dd of=$mnt_point/$sdUUID/images/$conf_image_UUID/$conf_volume_UUID
systemctl restart ovirt-ha-agent # on all the hosts

I strongly advise taking a backup before editing.
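
The shell snippet above pulls the UUIDs out of hosted-engine.conf with
fixed substring offsets, which only works if each grep matches exactly
one line of the form key=value. As a less fragile alternative, here is
a small sketch that parses the key=value lines instead. It assumes the
file consists of plain key=value lines (as the grep calls above imply);
the function name read_conf_values is made up for this example.

```python
def read_conf_values(text, keys):
    """Return a dict with the requested keys found in key=value text.

    Blank lines, comment lines and lines without '=' are skipped.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in keys:
            values[key] = value.strip()
    return values


def read_conf_file(path, keys):
    """Read `path` and return the requested key=value entries."""
    with open(path) as conf:
        return read_conf_values(conf.read(), keys)
```

For example, read_conf_file("/etc/ovirt-hosted-engine/hosted-engine.conf",
{"sdUUID", "conf_volume_UUID", "conf_image_UUID"}) would return the
three values needed to build the path of the configuration volume.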

> Robert
>
> --
> Senior Software Engineer @ Parsons


