help: Hosted-engine lost data and how to recover vm config to deploy new cluster
by Pandey, Shreyas
Hi oVirt team,
There are a couple of questions I am struggling to get answers for.
We have an oVirt cluster set up on two servers.
1)
The cluster went down, and while troubleshooting we noticed that the hosted engine is not able to restart.
[root@j3sv7sr01ctr01 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage yet,
please ensure that ovirt-ha-agent service is running.
[root@j3sv7sr01ctr01 ~]#
Both ovirt-ha-agent and ovirt-ha-broker are failing because of a storage-related issue.
The GlusterFS volume used by the hosted engine no longer contains the hosted-engine configuration.
[root@j3sv7sr01ctr01 ~]# cd /rhev/data-center/mnt/glusterSD/
[root@j3sv7sr01ctr01 glusterSD]# ls
10.52.60.131:_j3sv7sr01datastore3
[root@j3sv7sr01ctr01 glusterSD]# cd 10.52.60.131\:_j3sv7sr01datastore3/
[root@j3sv7sr01ctr01 10.52.60.131:_j3sv7sr01datastore3]#
[root@j3sv7sr01ctr01 10.52.60.131:_j3sv7sr01datastore3]# ls -al
total 1
drwxr-xr-x 4 vdsm kvm 95 Dec 12 20:26 .
drwxr-xr-x 3 vdsm kvm 47 Dec 13 19:16 ..
[root@j3sv7sr01ctr01 10.52.60.131:_j3sv7sr01datastore3]#
[root@j3sv7sr01ctr01 10.52.60.131:_j3sv7sr01datastore3]#
Also, we don't have snapshots of the GlusterFS volume, so it looks like we can't get the hosted-engine data back now.
Is there any way to recover the cluster from this state?
We still have the metadata of the VMs, as shown below:
[root@j3sv7sr01stg01 01a2b8d8-e360-41cc-beea-4080d48f436a]# pwd
/mnt/datastore1/vms/6f2c0622-fa3b-48f4-b412-9bd6f20892cb/images/01a2b8d8-e360-41cc-beea-4080d48f436a
[root@j3sv7sr01stg01 01a2b8d8-e360-41cc-beea-4080d48f436a]#
[root@j3sv7sr01stg01 01a2b8d8-e360-41cc-beea-4080d48f436a]# cd ..
[root@j3sv7sr01stg01 images]# cd ..
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]# ls
dom_md images
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]#
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]# ls images/
01a2b8d8-e360-41cc-beea-4080d48f436a 39edc9c3-0f6a-4dc4-b0c9-8279c0d2301f 6841ade8-515b-438e-a464-87ee269b22aa b76b6031-a2ba-482b-a367-620521be9b9b e61b82ae-dae5-4131-b5f1-68050247ac11
05fc7f51-f217-419f-8dcd-781f363c6ec3 3bcf4941-9352-489d-b1cd-e81f6bac08e5 688f08fb-ec0b-4d4a-ba5f-aeb1ebea3c37 b8b48d39-6891-46a2-866c-dcfbd78d02a8 e6b473f9-e18d-49bd-af60-269baa6801ad
085fbfd5-523f-483c-b65f-b50fcdec4883 3c2c19a8-2c01-4487-b8c8-bf0e700f3a52 6ac1b668-6e17-48f9-b334-e271bfcb7788 b8e4bf92-1b68-4452-a2f5-244543a64467 ec9e26ab-fdc4-434c-bbf3-2959f3c1776c
08e0efb2-82de-4727-8527-dcdd134a75ef 3ebd222a-7f5e-4e27-bcb3-8fcdd3a2cfca 6f4ffa95-9855-46d1-b38f-c6fb90e9c92e bc7d3fd2-83a5-4c64-a68e-b50750ea1bee ef1d7d7d-ca6a-43c1-9f7c-3e52e1153842
0cba2187-6d43-4aa4-af8d-1c917aece6bb 401c7293-4f7f-4c47-b6c8-ff0a6f94fb0e 7a44c0a0-f202-423f-b390-13907fcf333e bcc9bc93-e7e3-4f6c-ae4c-f1911cfbebaf efb1b965-5e04-4537-954f-b5a70b95275b
18d527e7-822c-4faf-91d4-0be5940d3663 455dbdc1-2990-4474-ab1f-17767770bcb1 7d89ceb1-13b6-4646-8e1e-70e10a970b5f c12542d9-f9a9-417e-942e-d2b06a44d8d4 efc1d529-5fad-406d-baa5-100d187f8033
18ff181a-f620-49d8-a18d-92fb7c21e2d1 470e1c0e-f8f1-4477-9059-3220c68bad48 846bc7f5-6380-46ce-b31d-470bf2d10054 c27aeb05-e44e-447f-9563-5a8398eb73fc efd940ae-511a-4068-8c6c-97aaafbddcf4
1a2384e4-b850-4dae-a191-bb165c2833d9 48b7f23b-9041-4bd7-a4d9-6894117636e2 8cdf9897-5777-487c-967c-f2505f22755c c4874372-08e0-4ff3-9c5d-0404bdc7d194 f212e13e-3871-4c4a-8d15-450507202518
1a38eb80-6e21-4848-8518-943ac5625caf 4f709682-9159-40fa-b4c2-3080b26b72c3 903f9c51-9788-4a7d-b336-52a6fe4cc3ac c7ac51e0-4575-4d79-9cd9-247752df45a7 f4110154-4926-4019-8412-d76021aeb841
1d753f80-6811-4671-a807-865b7a04e11f 52634d4a-88f7-48e8-99c1-18e574b3cd23 9df26dbf-0370-40d4-badd-c0ad88cf96be c88fa564-2284-46ba-9784-5b66eac420be f6f910a9-0812-4fb1-8b70-bed869dcf580
21a99266-1c78-4271-a2ba-c65ad10cf26a 55356333-996e-4f13-86b3-ed064ec58ff7 a2b6ae87-71f9-41f7-97e8-90bb16e60517 cb354e40-5b5a-494f-a22d-b0a37fec09b3 fa7d2900-812f-4f9d-8f05-09a9321880d2
2330dbe1-abeb-4a0b-a4bd-e7e5fae68be2 57b097ed-9896-4a2c-a7a7-8ff387388752 a4b980d8-ff6a-4e10-a8ee-d4cfc9a4765a cb94ec8c-b29f-41cd-8b74-77687a69e75b fc63ae3e-35fb-4360-9c83-2089ec5d81c5
280fe3d4-dd16-49c7-ae7f-db8f70143f07 5bc471ba-52c9-4bbd-a1da-d2375e42b6bb a8afa44f-cf4f-4cc4-9fbb-5ce1b6be4bd5 d26d5671-6f6c-4b9b-81bc-d5a7e28eab0e ff0447dc-e950-4419-800b-495af75a5c65
31e2b9a1-946e-400b-be62-3c78575b23bc 5bf990c6-8898-41a9-bcc8-61bc81c872f9 a9fa7d64-70b3-46ce-9682-a976d86380fb da0c9b8e-66de-4283-85b4-f639205dd76a
355d27a1-ecc7-4e5d-bfe5-16d0e83d7df6 5c02176c-7ac1-4067-8cef-2be322798519 b171c219-0bb7-4619-aa45-886583f1dc5f dafdfde9-aa69-46f6-9f6d-9bf4ca7d5c94
39061360-c4d7-4852-a946-dee6a279039b 61dc998c-3ad9-404f-8103-82e66612e31d b578ddb5-da3b-4e73-9630-5ae605b92ee0 e1eb62b4-54a9-47a8-b2b2-bf8e3612f74c
39c0f701-574c-4795-91a4-60bfac700bcc 67f13533-2377-401b-8584-b0c2d70bad7c b6500cfa-d578-437c-98bb-602cd843228d e2fe200c-efa8-470c-a5cb-7a2ed6b50afa
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]#
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]# ls images/01a2b8d8-e360-41cc-beea-4080d48f436a/
e45d87e2-6fdb-41ab-9f1f-ac113db71ba5 e45d87e2-6fdb-41ab-9f1f-ac113db71ba5.meta f7529d80-f20b-46a8-bac4-5827afe2a648.lease
e45d87e2-6fdb-41ab-9f1f-ac113db71ba5.lease f7529d80-f20b-46a8-bac4-5827afe2a648 f7529d80-f20b-46a8-bac4-5827afe2a648.meta
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]#
[root@j3sv7sr01stg01 6f2c0622-fa3b-48f4-b412-9bd6f20892cb]#
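To double-check that the data domain itself is still consistent, this is roughly what we are looking at on the storage side; the paths are the ones from the listing above and reading these files is non-destructive:
# storage domain metadata (role, pool, format version, ...)
cat /mnt/datastore1/vms/6f2c0622-fa3b-48f4-b412-9bd6f20892cb/dom_md/metadata
# per-volume metadata for one of the images (capacity, parent, description, ...)
cat /mnt/datastore1/vms/6f2c0622-fa3b-48f4-b412-9bd6f20892cb/images/01a2b8d8-e360-41cc-beea-4080d48f436a/e45d87e2-6fdb-41ab-9f1f-ac113db71ba5.meta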
2)
If we redeploy a new cluster, can we use the existing datastore, which still holds the VM metadata, to bring the VMs back to their pre-outage state?
What other options do we have here?
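The rough workflow we have in mind for 2) is sketched below; the backup file name is a placeholder and we are not sure every step applies to our setup, so please correct us if this is the wrong direction:
# redeploy the hosted engine onto clean storage; if an engine-backup archive
# exists somewhere off the failed volume, it can be restored in the same step
# (the file name here is hypothetical)
hosted-engine --deploy --restore-from-file=/root/engine-backup.tar.gz
# with the new engine up, attach the old data domain via the Administration
# Portal (Storage -> Domains -> Import Domain) and then register the
# recovered VMs from that domain's "VM Import" tab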
Any help would be really appreciated.
Thanks!
Failed to migrate VM to Host ovirt3.XXX.cz due to an Error: Fatal error during migration. Trying to migrate to another Host.
by Jirka Simon
Hello there,
after today's update I have a problem with live migration to this host.
The engine log shows:
2023-12-14 10:00:01,089+01 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-11) [67218183] VM '77f85710-45e7-43ca-b0f4-69f87766cc43'(ca1.access.prod.hq.sldev.cz) was unexpectedly detected as 'Down' on VDS '044b7175-ca36-49b2-b01b-0253f9af7e4f'(ovirt3.corp.sldev.cz) (expected on '858b8951-9b5a-4b8f-994e-4e11788c34d6')
2023-12-14 10:00:01,090+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-11) [67218183] START, DestroyVDSCommand(HostName = ovirt3.corp.sldev.cz, DestroyVmVDSCommandParameters:{hostId='044b7175-ca36-49b2-b01b-0253f9af7e4f', vmId='77f85710-45e7-43ca-b0f4-69f87766cc43', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 696e7f0e
2023-12-14 10:00:01,336+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-11) [67218183] FINISH, DestroyVDSCommand, return: , log id: 696e7f0e
2023-12-14 10:00:01,337+01 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-11) [67218183] VM '77f85710-45e7-43ca-b0f4-69f87766cc43'(ca1.access.prod.hq.sldev.cz) was unexpectedly detected as 'Down' on VDS '044b7175-ca36-49b2-b01b-0253f9af7e4f'(ovirt3.corp.sldev.cz) (expected on '858b8951-9b5a-4b8f-994e-4e11788c34d6')
2023-12-14 10:00:01,337+01 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-11) [67218183] Migration of VM 'ca1.access.prod.hq.sldev.cz' to host 'ovirt3.corp.sldev.cz' failed: VM destroyed during the startup.
When I stop a VM and start it again, it starts on the affected host without any problem, but migration doesn't work.
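So far I have only looked at the engine log above; on the destination host (ovirt3) I plan to check the vdsm and qemu logs for the same VM, roughly like this (VM id and name taken from the messages above):
# vdsm's view of the failed incoming migration
grep 77f85710-45e7-43ca-b0f4-69f87766cc43 /var/log/vdsm/vdsm.log | tail -n 50
# qemu/libvirt log for the guest on the destination host
tail -n 100 /var/log/libvirt/qemu/ca1.access.prod.hq.sldev.cz.log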
Thank you for any help.
Jirka
Grafana - Origin Not Allowed
by Maton, Brett
oVirt 4.5.0.8-1.el8
I tried to connect to Grafana via the monitoring portal link from the dashboard, and all panels fail to display any data, with varying error messages that all include 'Origin Not Allowed'.
I navigated to Data Sources and ran a test on the PostgreSQL connection (localhost), which threw the same 'Origin Not Allowed' error.
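For reference, the only server-side things I know to check are the Grafana server settings and service log on the engine host, roughly as below; I'm assuming the relevant config is /etc/grafana/grafana.ini and I haven't changed anything yet:
# how does Grafana think it is being reached?
grep -nE '^(domain|root_url)' /etc/grafana/grafana.ini
# anything logged around the failing requests?
journalctl -u grafana-server --since "1 hour ago" | tail -n 50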
Any suggestions?
virsh console support
by marek
Hi,
is it possible to use virsh console on an oVirt host?
[root@server ~]# virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf console 21 --safe
Connected to domain 'server-dev-1'
Escape character is ^] (Ctrl + ])
error: internal error: character device ua-e4d8f97d-a7dd-4769-972c-27412212b955 is not using a PTY
dumpxml 21 shows
<console type='unix'>
  <source mode='bind' path='/var/run/ovirt-vmconsole-console/5a06d390-431c-48b5-9fb6-ebc34af73a88.sock'/>
  <target type='serial' port='0'/>
  <alias name='serial0'/>
</console>
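I assume the intended path is the ovirt-vmconsole serial console proxy rather than virsh directly, something like the commands below (the engine FQDN is a placeholder), but I'd still like to know whether virsh console is supposed to work on the host itself:
# interactive menu of available serial consoles
ssh -t -p 2222 ovirt-vmconsole@engine.example.com
# or connect directly by VM name
ssh -t -p 2222 ovirt-vmconsole@engine.example.com connect --vm-name=server-dev-1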
Marek
Move SHE to a LUN on another storage appliance
by Colin Coe
Hi all
I'm decommissioning an old SAN that holds our DEV/TST RHV instance.
I'm following https://access.redhat.com/solutions/6529691 (How to move the
Hosted-Engine storage to a new storage domain) and I'm on step 9 (Make sure
that the self-hosted engine is shut down).
hosted-engine --vm-status is reporting:
hosted-engine --vm-status | grep -E 'Engine status|Hostname'
Engine status    : {"vm": "up", "health": "good", "detail": "Up"}
Hostname         : rhvh01.example.com
Engine status    : {"vm": "down_unexpected", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname         : rhvh02.example.com
Even though the VM is definitely shut down.
The instructions say to call Red Hat, but it's nearly 6 PM and we only have standard support for this environment.
Any ideas on how to resolve this? I'm not sure if it's a big issue, but I don't want to cause myself more problems later.
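In case it matters, the only thing I'm considering before going further is refreshing the HA agents so they re-read the state from shared storage, along these lines (on each host, with global maintenance still set):
# restart the HA services so they refresh their view of the engine VM
systemctl restart ovirt-ha-broker ovirt-ha-agent
# then re-check what the agents report
hosted-engine --vm-status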
Thanks
Migrate Hosted-Engine to different cluster
by Michaal R
I know this question pops up regularly on this forum, but my best search-engine digging only turned up the approach of backing up the engine configuration to shared storage and then restoring it on the other cluster with hosted-engine --deploy --restore-from-file=backup.tar.gz. While this is workable (it's by no means the end of the world for me), I wanted to ask if a different or better way has been, or will be, implemented. Here's why:
I am trying to move to oVirt from ESXi (the latter has outgrown its usefulness to me and has become increasingly difficult to maintain for my purposes, as VMware is intent on pushing newer CPU technologies with each new version, leaving perfectly good systems with perfectly good processors and hardware out in the cold. On top of that, VMware won't let you easily do vGPU or GPU passthrough without significant risk to system stability.). I started with a VM on the ESXi host as a proof of concept for setting up and configuring oVirt to run the hosted engine and a couple of VMs. This worked after some great work from the members here on the forum (shout out to Tomas and Arik!). This initial cluster is Intel CPU based, as the server its VM is running on is an R720. This is the server I need to migrate off of (I have a plan). So, in VMware Workstation on my PC, I set up a second host in its own cluster (as that machine is an AMD CPU machine). Now, I know I can't live migrate between clusters to begin with, but I was hoping there was an easier way: move the hosted engine to the shared NAS storage, put it in maintenance mode, shut it down, change a variable or three with the --set-shared-config option (if that applies), and run some script on the other host that starts the hosted engine on that host in that cluster, bringing it out of maintenance mode automatically once it is up and verified stable.
I understand that process COULD be slower and more fraught with danger than simply backing up the config to shared storage and deploying on the AMD host with the restored config, but on the off-chance it ISN'T, I wanted to explore the possibility. Besides, it would be a nice feature of the hosted engine and oVirt as a whole if you could eventually right-click the hosted-engine VM in the portal and click "Move to new cluster...", which would automate the entire process for you.
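For completeness, the documented backup-and-restore path I'm referring to boils down to roughly this (file names are placeholders):
# on the current hosted-engine VM
engine-backup --mode=backup --file=engine-backup.tar.gz --log=engine-backup.log
# on a host in the new (AMD) cluster, against fresh hosted-engine storage
hosted-engine --deploy --restore-from-file=engine-backup.tar.gz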
All that said, I'm working on one last problematic VM import before I start the arduous process of pulling everything off of ESXi and reformatting the R720 for oVirt. Hopefully that goes smoother than my start with oVirt. :)
[ANN] oVirt 4.5.5 is now generally available
by Sandro Bonazzola
oVirt 4.5.5 is now generally available
The oVirt project is excited to announce the general availability of oVirt
4.5.5, as of December 1st, 2023.
Important notes before you install / upgrade
If you’re going to install oVirt 4.5.5 on RHEL or similar, please read Installing on RHEL or derivatives <https://ovirt.org/download/install_on_rhel.html> first.
Suggestion to use nightly
As discussed on the oVirt Users mailing list
<https://lists.ovirt.org/archives/list/users@ovirt.org/thread/DMCC5QCHL6EC...>
we suggest that the user community use the oVirt master snapshot repositories
<https://ovirt.org/develop/dev-process/install-nightly-snapshot.html>
to ensure that the latest fixes for platform regressions are promptly available.
This oVirt 4.5.5 release is meant to provide what has been made available in the nightly repositories as a base for new installations.
If you are already using the oVirt master snapshot repositories, you should already have received this release's content.
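For convenience, the setup on the linked nightly-snapshot page boils down to roughly the following; please check the page itself for the exact, current commands (the distribution argument here assumes CentOS Stream 9):
dnf copr enable -y ovirt/ovirt-master-snapshot centos-stream-9
dnf install -y ovirt-release-master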
Documentation
Be sure to follow instructions for oVirt 4.5!
- If you want to try oVirt as quickly as possible, follow the instructions on the Download <https://ovirt.org/download/> page.
- For complete installation, administration, and usage instructions, see the oVirt Documentation <https://ovirt.org/documentation/>.
- For upgrading from a previous version, see the oVirt Upgrade Guide <https://ovirt.org/documentation/upgrade_guide/>.
- For a general overview of oVirt, see About oVirt <https://ovirt.org/community/about.html>.
What’s new in oVirt 4.5.5 Release?
This release is available now on x86_64 architecture for:
- oVirt Node NG (based on CentOS Stream 8)
- oVirt Node NG (based on CentOS Stream 9)
- CentOS Stream 8
- CentOS Stream 9
- RHEL 8 and derivatives
- RHEL 9 and derivatives
Experimental builds are also available for ppc64le and aarch64.
See the release notes for installation instructions and a list of new
features and bugs fixed.
Additional resources:
- Read more about the oVirt 4.5.5 release highlights: https://www.ovirt.org/release/4.5.5/
- Check out the latest project news on the oVirt blog: https://blogs.ovirt.org/
--
Sandro Bonazzola
oVirt Project
ovirt node NonResponsive
by carlos.mendes@mgo.cv
Hello,
I have an oVirt setup with two nodes; one of them is NonResponsive and I can't manage it
because it is in an Unknown state.
It seems that the nodes lost connection to their gateway for a while.
The node (ovirt2), however, is having consistent problems. The following sequence of events is
reproducible and causes the host to enter a "NonOperational" state in the
cluster:
What is the proper way of restoring management?
I have a two-node cluster with the oVirt Manager running standalone on a
CentOS Stream 9 virtual machine, and the nodes are running the latest oVirt Node 4.5.4 software.
I can then re-activate ovirt2, which appears as green for approximately 5 minutes and then
repeats all of the above issues.
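The basic checks I plan to run on ovirt2 the next time it drops are along these lines (nothing more sophisticated so far; the engine FQDN is a placeholder):
# is vdsm itself healthy on the node?
systemctl status vdsmd supervdsmd
# anything suspicious around the time the host went NonResponsive?
tail -n 200 /var/log/vdsm/vdsm.log
# can the node still reach the engine and its gateway?
ping -c 3 engine.example.com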
What can I do to troubleshoot this?
Something went wrong, connection is closed web vnc console
by antonio.riggio@mail.com
Our admin updated the certificate for oVirt after we started getting "PKIX path validation failed: java.security.cert" errors when trying to log in. Since then, when I try to access a console with noVNC, I get "Something went wrong, connection is closed". Any ideas what could cause this?
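For what it's worth, the only place I know to look is the websocket proxy on the engine, something like the commands below, and I'm not sure what to look for next:
# noVNC connections go through the engine's websocket proxy
systemctl status ovirt-websocket-proxy
journalctl -u ovirt-websocket-proxy --since "1 hour ago" | tail -n 50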
Thank you