Restarting the `vdsmd` service on one of the problematic hosts brought that host back, and
oVirt can see it.
But that did not fix the problem on the last remaining host. I'm still
troubleshooting...
Sent with Proton Mail secure email.
------- Original Message -------
On Monday, September 19th, 2022 at 11:37 AM, David White via Users <users(a)ovirt.org>
wrote:
I tried rebooting the engine to see if that would magically solve the
problem (worth a try, right?). But as I expected, it didn't help.
Now one of the hosts is in a "Non Responsive" state and the
other is permanently in a "Connecting" state. All VMs associated with those 2
hosts now show a question mark on the oVirt dashboard.
The storage for these VMs is good, and these VMs are online.
Everything is "working" -- I just need to get these VMs moved onto hosts that
oVirt is able to manage.
If it helps for troubleshooting purposes, prior to rebooting the
engine, the following errors were showing up in the oVirt UI for both of these hosts:
VDSM
cha1-storage.example.com command Get Host Capabilities failed:
Internal JSON-RPC error: {'reason': '[Errno 24] Too many open files'}
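Since Errno 24 means a process has hit its open file descriptor limit, one way to check this is to compare the number of descriptors a process holds against its limit. Below is a minimal sketch, assuming a Linux host with procfs; the helper names `parse_max_open_files` and `fd_usage` are made up for illustration and are not part of VDSM or oVirt. Reading another process's `/proc/<pid>/fd` usually requires root.

```python
import os
from pathlib import Path


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values are returned as strings because a limit may read 'unlimited'.
    Returns None if the line is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Layout: Max open files <soft> <hard> files
            fields = line.split()
            return fields[3], fields[4]
    return None


def fd_usage(pid, proc_root="/proc"):
    """Return (open_count, soft_limit, hard_limit) for the given pid.

    proc_root is a hypothetical parameter that exists only so the
    function can be pointed at a test directory instead of /proc.
    """
    base = Path(proc_root) / str(pid)
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open_count = len(os.listdir(base / "fd"))
    limits = parse_max_open_files((base / "limits").read_text())
    soft, hard = limits if limits else (None, None)
    return open_count, soft, hard
```

Calling `fd_usage(pid)` with the PID of the VDSM daemon on the affected host (for example, the main PID reported by `systemctl status vdsmd`) would show whether the open count is at or near the soft limit.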
> Any ideas? If I need to take some downtime for these VMs, so be it, but I need to
keep downtime at a minimum.
> ------- Original Message -------
> On Monday, September 19th, 2022 at 8:41 AM, David White via Users
<users(a)ovirt.org> wrote:
> > Ok, now that I'm able to (re)deploy oVirt to new hosts, I need to
migrate VMs that are running on hosts that are currently in an "unassigned"
state in the cluster.
>
> > This is the result of having moved the oVirt engine OUT of a hyperconverged
environment onto its own stand-alone system, while simultaneously upgrading oVirt from
v4.4 to the latest v4.5.
>
> > See the following email threads:
>
> > - https://lists.ovirt.org/archives/list/users@ovirt.org/thread/TZAUCM3GB5ER...
> > - https://lists.ovirt.org/archives/list/users@ovirt.org/thread/3IWXZ7VXM6CY...
>
>
> > The oVirt engine knows about the VMs, and oVirt knows about the storage that
those VMs are on. But the engine sees 2 of my hosts as "unassigned", and
I've been unable to migrate the disks to new storage, nor live migrate a VM from an
unassigned host, nor make a clone of an existing VM.
>
> > Is there a way to recover from this scenario? I was thinking something along the
lines of manually shutting down the VM on the unassigned host, and then somehow forcing the
engine to bring the VM online again from a healthy host.
>
> > Thanks,
> > David
>
>