Arik / Strahil,
Many thanks!
Just in case anyone else is hitting the same issue (*NOTE* your host and VM
IDs _will_ be different!):
0. Run a backup.
1. Connect to the hosted-engine and DB:
$ ssh root@vmengine
$ su - postgres
$ psql engine
2. Execute a select query to verify that the VM's run_on_vds is NULL:
# select * from vm_dynamic where vm_guid='b411e573-bcda-4689-b61f-1811c6f03ad5';
3. Execute Arik's update query:
# update vm_dynamic set
run_on_vds='82f92946-9130-4dbd-8663-1ac0b50668a1' where
vm_guid='b411e573-bcda-4689-b61f-1811c6f03ad5';
4. Restart the engine:
$ systemctl restart ovirt-engine
5. Everything seems fine now. Profit!
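For anyone who would rather script steps 2 and 3, here is a minimal shell sketch. It only builds the two queries, after checking that both IDs look like UUIDs, so a typo does not end up in the database. The helper names (is_uuid, build_select, build_update) are made up for illustration and are not part of oVirt; the table and column names are the ones used above.

```shell
#!/bin/sh
# Sketch only: is_uuid, build_select and build_update are hypothetical helpers.

# Succeed if $1 looks like a UUID (8-4-4-4-12 hex digits).
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Print the query that shows which host the VM is recorded as running on.
# $1 = VM ID
build_select() {
    is_uuid "$1" || return 1
    printf "select vm_guid, run_on_vds from vm_dynamic where vm_guid='%s';\n" "$1"
}

# Print the update query from step 3.
# $1 = VM ID, $2 = host ID
build_update() {
    is_uuid "$1" && is_uuid "$2" || return 1
    printf "update vm_dynamic set run_on_vds='%s' where vm_guid='%s';\n" "$2" "$1"
}
```

Assuming the functions are loaded in the postgres user's shell on the engine VM, the output can be piped into psql, e.g. `build_update <vm-id> <host-id> | psql engine`. Run the backup first either way.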
Thanks again,
Gilboa
On Mon, Sep 21, 2020 at 4:28 PM Arik Hadas <ahadas(a)redhat.com> wrote:
>
>
>
> On Sun, Sep 20, 2020 at 11:21 AM Gilboa Davara <gilboad(a)gmail.com> wrote:
>>
>> On Sat, Sep 19, 2020 at 7:44 PM Arik Hadas <ahadas(a)redhat.com> wrote:
>> >
>> >
>> >
>> > On Fri, Sep 18, 2020 at 8:27 AM Gilboa Davara <gilboad(a)gmail.com> wrote:
>> >>
>> >> Hello all (and happy new year),
>> >>
>> >> (Note: Also reported as
>> >> https://bugzilla.redhat.com/show_bug.cgi?id=1880251)
>> >>
>> >> Self hosted engine, single node, NFS.
>> >> Attempted to install CentOS over an existing Fedora VM with one host
>> >> device (USB printer).
>> >> Reboot failed, trying to boot from a non-existent CDROM.
>> >> Tried shutting the VM down, failed.
>> >> Tried powering off the VM, failed.
>> >> Dropped cluster to global maintenance, rebooted host + engine (was
>> >> planning to upgrade it anyhow...), VM still stuck.
>> >>
>> >> When trying to power off the VM, the following message can be found
>> >> in the engine.log:
>> >> 2020-09-18 07:58:51,439+03 INFO
>> >> [org.ovirt.engine.core.bll.StopVmCommand]
>> >> (EE-ManagedThreadFactory-engine-Thread-42)
>> >> [7bc4ac71-f0b2-4af7-b081-100dc99b6123] Running command: StopVmCommand
>> >> internal: false. Entities affected : ID:
>> >> b411e573-bcda-4689-b61f-1811c6f03ad5 Type: VMAction group STOP_VM with
>> >> role type USER
>> >> 2020-09-18 07:58:51,441+03 WARN
>> >> [org.ovirt.engine.core.bll.StopVmCommand]
>> >> (EE-ManagedThreadFactory-engine-Thread-42)
>> >> [7bc4ac71-f0b2-4af7-b081-100dc99b6123] Strange, according to the
>> >> status 'RebootInProgress' virtual machine
>> >> 'b411e573-bcda-4689-b61f-1811c6f03ad5' should be running in a host but
>> >> it isn't.
>> >> 2020-09-18 07:58:51,594+03 ERROR
>> >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> >> (EE-ManagedThreadFactory-engine-Thread-42)
>> >> [7bc4ac71-f0b2-4af7-b081-100dc99b6123] EVENT_ID:
>> >> USER_FAILED_STOP_VM(56), Failed to power off VM kids-home-srv (Host:
>> >> <UNKNOWN>, User: gilboa@internal-authz).
>> >>
>> >> My question is simple: Pending a solution to the bug, can I somehow
>> >> drop the state of the VM? It's currently holding a sizable disk image
>> >> and a USB device I need (printer).
>> >
>> >
>> > It would be best to modify the VM as if it should still be running on the
>> > host and let the system discover that it's not running there and update the VM
>> > accordingly.
>> >
>> > You can do it by changing the database with:
>> > update vm_dynamic set
>> > run_on_vds='82f92946-9130-4dbd-8663-1ac0b50668a1' where
>> > vm_guid='b411e573-bcda-4689-b61f-1811c6f03ad5';
>> >
>> >
>> >>
>> >>
>> >> As it's my private VM cluster, I have no problem dropping the site
>> >> completely for maintenance.
>> >>
>> >> Thanks,
>> >>
>> >> Gilboa
>>
>>
>> Hello,
>>
>> Thanks for the prompt answer.
>>
>> Edward,
>>
>> Full reboot of both engine and host didn't help.
>> Most likely there's a consistency problem in the oVirt DB.
>>
>> Arik,
>>
>> To which DB should I connect, and as which user?
>> E.g. psql -U user db_name
>
>
> To the 'engine' database.
> I usually connect to it by switching to the 'postgres' user as Strahil
> described.
>
>>
>>
>> Thanks again,
>> - Gilboa
>>