On Wed, May 5, 2021 at 3:18 PM Marko Vrgotic <M.Vrgotic(a)activevideo.com> wrote:
Status Update:
During migration between Host1 and Host3, the links are now being created, and I no
longer see the earlier errors in the agent/broker logs.
With all links in place, and after waiting 24 hours for potential Engine updates, I
tried to deploy HE on Host2:
Part1:
From oVirt UI – HostedEngine Deploy
Immediately, Host2 started showing messages like “Is the HostedEngine deployed?”
hosted-engine.conf got populated only with host_id and ca_path
The deployment failed
Part2:
I noticed that during deployment no links were created in
/var/run/vdsm/storage/<hosted_storage_id>
Copied hosted-engine.conf from Host1 to Host2
Replaced the host_id with the correct value
Reran the deployment
Noticed that two links got created in /var/run/vdsm/storage/<hosted_storage_id>, one
of them being the link to metadata_image
On Host1 and Host3, hosted-engine --vm-status was showing Host2, but with status
unknown/stale data
Deployment failed
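
To compare which links exist on a working host and on the failing one, a small
script can list the symlinks in the storage link directory and flag any whose
target is missing. This is only a sketch: the function name list_links is made up
for illustration, and the directory must be passed in explicitly (the
<hosted_storage_id> placeholder above is not filled in here).

```python
import os
import sys


def list_links(storage_dir):
    """Return (name, target, target_exists) for every symlink in storage_dir."""
    results = []
    for entry in sorted(os.listdir(storage_dir)):
        path = os.path.join(storage_dir, entry)
        if os.path.islink(path):
            target = os.readlink(path)
            # os.path.exists follows the link, so it is False for a dangling link.
            results.append((entry, target, os.path.exists(path)))
    return results


if __name__ == "__main__":
    # Usage: python list_links.py <directory to inspect>
    for name, target, ok in list_links(sys.argv[1]):
        print(name, "->", target, "" if ok else "(target missing)")
```

Running it on Host1 and on Host2 and comparing the two outputs shows which links
the deployment did not create.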
Part3:
Since the link to conf_image was not created in the first phase, I added it manually
Populated the hosted-engine.conf
Reran the deployment
Same result
hosted-engine.conf on Host2 would end up with only the host_id and ca_path values,
and the deployment would fail
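
One way to see exactly which settings go missing is to compare the keys of the
configuration file from a working host against the one left on the failing host.
The sketch below assumes the file is plain key=value lines; the helper names
parse_conf and missing_keys are hypothetical, and the file paths are supplied by
the caller rather than assumed.

```python
import sys


def parse_conf(text):
    """Parse key=value lines into a dict, skipping blanks and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(reference, candidate):
    """Return the keys present in reference but absent from candidate, sorted."""
    return sorted(set(reference) - set(candidate))


if __name__ == "__main__":
    # Usage: python conf_diff.py <conf from working host> <conf from failing host>
    with open(sys.argv[1]) as ref_file, open(sys.argv[2]) as cand_file:
        reference = parse_conf(ref_file.read())
        candidate = parse_conf(cand_file.read())
    for key in missing_keys(reference, candidate):
        print("missing on failing host:", key)
```

host_id is expected to differ between hosts, so only absent keys are reported,
not differing values.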
At this point, I cleaned up all hosted-engine remains from Host2 using
ovirt-hosted-engine-cleanup and removed the metadata of Host2 from Host1 and Host3
I am out of ideas. It seems that Host1 and Host3 are happily operating, but I am unable
to add any other hosts to HE pool.
Please assist if you have any ideas.
Look, Marko - I admit it seems to me like you simply enjoy debugging
this yourself...
1. Did you try to reinstall the OS on host2? If not, is there any
reason not to? Other than the (legitimate!) wish to "understand what's
broken and fix just that"? Would reinstallation take a lot of
time/work? You can also do a full backup beforehand and then compare
later, to see what the differences were.
2. It's very hard to help you by only guessing around. If you have a
concrete issue, such as "I add a host with 'Deploy Hosted Engine' and
this fails", then please provide all relevant logs.
3. If you still want to continue debugging by yourself, fine - the
code is open, and at least I personally try quite hard to make it
easy, in the relatively small parts of the code I touched, to search
around it even without having a complete picture of its structure,
mainly by making texts inside logs be "unique enough" so that you can
easily find them in the code.
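
As a rough illustration of that kind of search, the sketch below walks a local
checkout of the source and reports every line containing a given log text. The
function name find_text and the default of searching only .py files are
assumptions made for this example; a plain grep -r does the same job.

```python
import os
import sys


def find_text(root, needle, suffixes=(".py",)):
    """Return (path, line_number, line) for each line under root containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        hits.append((path, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Usage: python find_text.py <source checkout> "<text copied from a log line>"
    for path, number, line in find_text(sys.argv[1], sys.argv[2]):
        print("{}:{}: {}".format(path, number, line))
```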
I am open to executing a restore, but I would much rather at least discover where
or what the problem is before moving on to planning one.
Good luck and best regards,
--
Didi