As we tested off the list, it seems that the symbolic link in /var/lib/vdsm that
ovirt-ha-agent/broker creates was missing. Even so, migration succeeds, but the donor host
loses score because the VM 'died unexpectedly'.
Try to clean up host2's metadata and then provision it, so you can proceed with
the fix of host1 & host3.
I have no clue whether engine-cleanup will affect the shared storage, but it's possible,
so use it only as a last resort.
If you fail to add host2, you can always reinstall it as host4 and try to add it fresh.
Best Regards,
Strahil Nikolov

On Fri, Apr 30, 2021 at 16:27, Marko Vrgotic <M.Vrgotic(a)activevideo.com> wrote:
Dear oVirt,
I have already reached out twice regarding issues that were caused by a power outage
but were noticed only when upgrading the engine to the latest 4.3 version.
I am unable to redeploy the engine on Host2: the hosted-engine file stays empty, and VDSM
on Host1 and Host3 keeps reporting the following error, even though I cleared the metadata
for Host2 on both Host1 and Host3:
2021-04-30 05:57:58,454-0700 ERROR (jsonrpc/7)
[ovirt_hosted_engine_ha.client.client.HAClient] Malformed metadata for host 2: received 0
of 512 expected bytes (client:137)
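
When scanning long VDSM logs for this error, a small parser can pull out the host ID and the byte counts. The following is only a minimal sketch: the function name parse_malformed_metadata and the regular expression are illustrative assumptions based solely on the log line quoted above, not part of oVirt or VDSM.

```python
import re

# Pattern derived only from the log line quoted above (an assumption, not an
# official log format specification).
MALFORMED_RE = re.compile(
    r"Malformed metadata for host (?P<host_id>\d+): "
    r"received (?P<received>\d+) of (?P<expected>\d+) expected bytes"
)


def parse_malformed_metadata(line):
    """Return (host_id, received, expected) as ints, or None if no match."""
    match = MALFORMED_RE.search(line)
    if match is None:
        return None
    return (
        int(match.group("host_id")),
        int(match.group("received")),
        int(match.group("expected")),
    )


# The log entry from the message, joined back into a single line.
sample = (
    "2021-04-30 05:57:58,454-0700 ERROR (jsonrpc/7) "
    "[ovirt_hosted_engine_ha.client.client.HAClient] Malformed metadata "
    "for host 2: received 0 of 512 expected bytes (client:137)"
)

print(parse_malformed_metadata(sample))  # (2, 0, 512)
```

A result with 0 received bytes would be consistent with an empty metadata slot for that host, but the sketch only extracts the numbers; it does not explain why the slot is empty.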
Today I tried to migrate the HE from Host3 to Host1, and it fails each time with the
following messages:
On Engine:
2021-04-30 12:57:56,961Z ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-1233892) [] EVENT_ID:
VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed (VM: HostedEngine, Source:
ovirt-sj-03.ictv.com, Destination:
ovirt-sj-01.ictv.com).
On source Host:
2021-04-30 05:57:56,705-0700 ERROR (migsrc/66b6d489) [virt.vm]
(vmId='66b6d489-ceb8-486a-951a-355e21f13627') Failed to migrate (migration:450)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 431, in _regular_run
    time.time(), migrationParams, machineParams
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 505, in _startUnderlyingMigration
    self._perform_with_conv_schedule(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 591, in _perform_with_conv_schedule
    self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 525, in _perform_migration
    self._migration_flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1781, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: operation aborted: migration out job: canceled by client
I know that this version is end of life, but I would very much appreciate it if someone
could help me assess whether this means corruption in the DB or more general damage, simply
to know how to plan further actions.
My impression was that I still had two functional HE hosts in the pool, but after seeing
the migration failure, it is pretty much down to a single host.
This is a production system, so I cannot just move on to upgrading/deploying to 4.4.
Additionally:
- Is the effect of engine-cleanup on an HE host local, or does it affect all HE hosts?
Could it help bring the host back to a state in which the HE can be redeployed?
- What is the effect of reinitialize-lockspace?
Kindly awaiting your reply. Happy to provide any additional information needed.
-----
kind regards/met vriendelijke groeten
Marko Vrgotic
Sr. System Engineer @ System Administration
ActiveVideo
o: +31 (35) 6774131
m: +31 (65) 5734174
e: m.vrgotic@activevideo.com
w: www.activevideo.com
ActiveVideo Networks BV. Mediacentrum 3745 Joop van den Endeplein 1.1217 WJ Hilversum, The
Netherlands.
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement:
https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FRTRNTSGPLQ...