On Sun, Sep 12, 2021 at 10:35 AM Yedidyah Bar David <didi@redhat.com> wrote:

>>
>> The step where I suspect a regression in 4.4.8 (compared with 4.4.7) is updating the first hosted-engine host during the upgrade flow while retaining its hostname.

What's the regression?

I thought that 4.4.7 did not have this problem when, during the SHE upgrade from 4.3.10 to 4.4.7, the first host reuses the same hostname on different (real or virtual) hw.
But that was probably not the case and I didn't remember correctly...


>> I'm going to test with the latest 4.4.8 async 2 and see if it solves the problem. Otherwise I'm going to open a bugzilla and attach the logs.

Can you clarify what the bug is?

The automatic handling of host addition during the "hosted-engine --deploy --restore-from-file=backup.bck" step when you have different hw and want to reuse your previous hostname.
In the past I often combined system upgrades with hw refreshes (standalone hosts, rhcs clusters, also ovirt/rhv from 4.2 to 4.3 if I remember correctly, etc.), re-using an existing hostname on new hardware.
Perhaps it would be an RFE more than a bug...



> As novirt2 and novirt1 (in 4.3) are VMs running on the same hypervisor, I see that their hw details show the same serial number and the usual random uuid

Same serial number? Doesn't sound right. Any idea why it's the same?

My env is nested oVirt and my hypervisors are VMs.
I noticed that when you clone a VM in oVirt, the clone gets a new uuid but retains the serial number...

> Unfortunately I cannot try at the moment the scenario where I deploy the new novirt2 on the same virtual hw, because in the first 4.3 install I configured the OS disk as 50 GB and with this size 4.4.8 complains about insufficient space. And with the snapshot active in preview I cannot resize the disk.
> If needed, I can reinstall 4.3 on an 80 GB disk and try the same, keeping the same hw ... but this would imply that in general I cannot upgrade using different hw and reusing the same hostnames.... correct?

Yes. Either reuse a host and keep its name (what we recommend in the
upgrade guide) or use a new host and a new name (backup/restore
guide).

The condition to remove the host prior to adding it is based on
unique_id_out, which is set in (see also bz 1642440, 1654697):

      - name: Get host unique id
        shell: |
          if [ -e /etc/vdsm/vdsm.id ];
          then cat /etc/vdsm/vdsm.id;
          elif [ -e /proc/device-tree/system-id ];
          then cat /proc/device-tree/system-id; #ppc64le
          else dmidecode -s system-uuid;
          fi;
        environment: "{{ he_cmd_lang }}"
        changed_when: true
        register: unique_id_out

So if you want to "make this work", you can set the uuid (either in
your (virtual) BIOS, to affect the value dmidecode reports, or in
/etc/vdsm/vdsm.id) to match that of the old host (the one whose name
you want to reuse). I didn't test this myself, though.
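
As a rough illustration of the workaround described above, here is a minimal shell sketch. The function names host_unique_id and set_vdsm_id are hypothetical helpers, not part of vdsm or the hosted-engine tooling. The lookup order is copied from the "Get host unique id" task; the file paths are parameters only so the logic can be tried in a scratch directory. Writing the ID with a trailing newline is an assumption.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of vdsm or ovirt-hosted-engine-setup.

# Print the ID the deploy task would pick up, using the same precedence
# as the "Get host unique id" task: vdsm.id, then device-tree, then dmidecode.
host_unique_id() {
    vdsm_id_file="${1:-/etc/vdsm/vdsm.id}"
    dt_id_file="${2:-/proc/device-tree/system-id}"
    if [ -e "$vdsm_id_file" ]; then
        cat "$vdsm_id_file"
    elif [ -e "$dt_id_file" ]; then
        cat "$dt_id_file"        # ppc64le
    else
        dmidecode -s system-uuid # needs root
    fi
}

# Write the old host's ID into vdsm.id on the new host.
# Refuses to overwrite an existing file, so a host that already
# has an ID is never changed by accident.
set_vdsm_id() {
    old_id="$1"
    vdsm_id_file="${2:-/etc/vdsm/vdsm.id}"
    if [ -z "$old_id" ]; then
        echo "usage: set_vdsm_id OLD_ID [FILE]" >&2
        return 2
    fi
    if [ -e "$vdsm_id_file" ]; then
        echo "$vdsm_id_file already exists, not overwriting" >&2
        return 1
    fi
    mkdir -p "$(dirname "$vdsm_id_file")" &&
        printf '%s\n' "$old_id" > "$vdsm_id_file"
}
```

The idea would be to run host_unique_id on the old host before it is reinstalled or retired, then call set_vdsm_id with that value on the new host after the OS install and before running hosted-engine --deploy.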


I confirm that I reverted the snapshots of the 2 VMs used as hypervisors, bringing them back to the initial 4.3 state, and redid all the steps. This time, right after installing the 4.4.8 oVirt node OS, I created /etc/vdsm/vdsm.id inside novirt2 with the old 4.3 value (the file was not there at that moment). The whole flow then went as expected and I reached the final 4.4.8 async 2 env with both hosts at 4.4.8 and cluster and DC updated to 4.6 compatibility level. There was no downtime for the VMs inside the env, because I was able to live migrate them after upgrading the first host.


Perhaps, if you do want to open a bug, it should say something like:
"HE deploy should remove the old host based on its name, and not its
UUID". However, it's not completely clear to me that this won't
introduce new regressions.

I admit I didn't completely understand your flow, and especially your
considerations there. If you think the current behavior prevents an
important flow, please clarify.

Best regards,
--
Didi


My consideration, as explained at the beginning, was to allow reusing the hostname (often the oVirt admin is not responsible for hostname creation/mgmt) when you want to move to new hw as part of the upgrade process.

Thanks for all the other considerations you put into your answer.

Gianluca