This is resolved.
I manually shut down each VM, and then from within oVirt, I went to the host and, in the upper corner of the host's page, clicked 'Confirm Host has been rebooted'.

This allowed oVirt to then recognize that the VMs were down, and I was able to bring them back online on a healthy host.

..... That's what you're supposed to do, anyway.
I intentionally cheated and did things in a slightly different order. I knew that none of the VMs on that host were configured for HA, so if oVirt thought the VMs were turned off, oVirt would NOT try to bring them back online on its own.

So, just to make sure it would even work, I marked the problematic host as rebooted FIRST. Then, once I knew that worked and the VMs were showing as down in the oVirt UI (but were still online on the problematic host), I ssh'd to each server and manually shut them down before bringing them back online on a healthy host.
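For anyone following along, the manual shutdown step looked roughly like this (the hostnames below are hypothetical placeholders, not my actual guests):

```shell
# After marking the host as rebooted in the oVirt UI, shut down each
# guest over SSH so its state matches what oVirt already believes.
# vm1/vm2.example.com are example names only.
for vm in vm1.example.com vm2.example.com; do
    ssh root@"$vm" 'shutdown -h now'
done
```

Once each guest was actually off, it could then be started again on a healthy host from the oVirt UI.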

Hopefully this helps someone else!

-David

Sent with Proton Mail secure email.

------- Original Message -------
On Monday, September 19th, 2022 at 3:44 PM, David White via Users <users@ovirt.org> wrote:

Restarting the vdsmd service on 1 of the problematic hosts brought that host back, and oVirt can see it.
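For reference, the restart was just the standard systemd unit on the host (assuming an EL-based oVirt node; run as root):

```shell
# Restart the VDSM daemon and confirm it came back up.
systemctl restart vdsmd
systemctl status vdsmd --no-pager
```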

But that did not fix the problem on the last remaining host. I'm still troubleshooting...


------- Original Message -------
On Monday, September 19th, 2022 at 11:37 AM, David White via Users <users@ovirt.org> wrote:

I tried rebooting the engine to see if that would magically solve the problem (worth a try, right?). But as I expected, it didn't help.

Now one of the hosts is in a "Non Responsive" state and the other is permanently in a "Connecting" state. All VMs associated with those 2 hosts now show a question mark on the oVirt dashboard.

The storage for these VMs is good, and these VMs are online. Everything is "working" -- I just need to get these VMs moved onto hosts that oVirt is able to manage.

If it helps for troubleshooting purposes, prior to rebooting the engine, the following errors were showing up in the oVirt UI for both of these hosts:

VDSM cha1-storage.example.com command Get Host Capabilities failed: Internal JSON-RPC error: {'reason': '[Errno 24] Too many open files'}
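In case it helps anyone else hitting the same [Errno 24]: you can check how close a process is to its file-descriptor limit via /proc. This is a generic Linux sketch, not an oVirt-specific tool; it's demonstrated on the current shell's PID, but on the host you'd substitute the vdsmd PID (e.g. from pidof -s vdsmd):

```shell
# Count a process's open file descriptors and show its fd limit.
# Demonstrated on the current shell ($$); on an oVirt host use
# pid=$(pidof -s vdsmd) instead.
pid=$$
fds=$(ls "/proc/$pid/fd" | wc -l)
echo "pid $pid has $fds open file descriptors"
grep 'Max open files' "/proc/$pid/limits"
```

If the count is at or near the "Max open files" limit, that would explain the JSON-RPC failures above.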

Any ideas? If I need to take some downtime for these VMs, so be it, but I need to keep downtime at a minimum.


------- Original Message -------
On Monday, September 19th, 2022 at 8:41 AM, David White via Users <users@ovirt.org> wrote:

Ok, now that I'm able to (re)deploy oVirt to new hosts, I need to migrate the VMs that are running on hosts currently in an "unassigned" state in the cluster.

This is the result of having moved the oVirt engine OUT of a hyperconverged environment onto its own stand-alone system, while simultaneously upgrading oVirt from v4.4 to the latest v4.5.

See the following email threads: 

The oVirt engine knows about the VMs, and it knows about the storage those VMs are on. But the engine sees 2 of my hosts as "unassigned", and I've been unable to migrate the disks to new storage, live migrate a VM off an unassigned host, or clone an existing VM.

Is there a way to recover from this scenario? I was thinking of something along the lines of manually shutting down the VM on the unassigned host, and then somehow forcing the engine to bring the VM back online on a healthy host?

Thanks,
David
