Dear oVirt community,
After days of investigation and numerous, very educational, attempts to restore the
HostedEngine to HA mode, we finally stumbled upon an existing bug write-up:
https://access.redhat.com/solutions/4588001
The root cause, diagnostic steps, and solution are all in there, and I can confirm it works.
In case your SHE HA host had the MDS flag within its CPU family and it got removed upon a
host upgrade, you are most likely affected by this bug.
The ways to recognize it:
* Your updated host will not be allowed back into the cluster due to the unsupported/missing
CPU flag "md-clear".
* Running "grep md_ /proc/cpuinfo" on the updated host returns nothing (only the prompt
comes back).
* In case you went as far as I did and downgraded the cluster CPU type, for example from
"Intel SandyBridge IBRS SSBD MDS" to "Intel SandyBridge IBRS SSBD" in order to be able to
join the updated host back into the cluster, you will still not be able to migrate the
HostedEngine onto that host, due to the missing md_clear flag on the host CPU.
* In my case, only the HostedEngine was affected. Hint: the HostedEngine is the only HA VM
in my environment, so that might be related. It does not have to be the same in your
environment.
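To script the flag check from the symptoms above, here is a minimal sketch. The helper name and the sample flags line are illustrative only; on a real host you would feed it the "flags" line from /proc/cpuinfo:

```shell
#!/bin/sh
# Succeeds if the md_clear flag appears in a /proc/cpuinfo-style flags line.
# The function name and the sample input below are illustrative, not from oVirt.
has_md_clear() {
    printf '%s\n' "$1" | grep -qw 'md_clear'
}

# On a real host you would use:  flags=$(grep -m1 '^flags' /proc/cpuinfo)
flags="fpu vme de pse tsc msr pae md_clear flush_l1d"

if has_md_clear "$flags"; then
    echo "md_clear present"
else
    echo "md_clear missing"
fi
```

An affected host, as described above, prints "md_clear missing" because the flag is gone after the upgrade.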
The steps to solve this, valid for each host:
* Put the host into Maintenance mode, in order to migrate all VMs off it
* # mkdir -p /etc/microcode_ctl/ucode_with_caveats
* # touch /etc/microcode_ctl/ucode_with_caveats/force-intel-06-2d-07
* # dracut -f --regenerate-all
* # reboot
* Once the host is back online, execute:
* # grep md_ /proc/cpuinfo
* From the WebUI, refresh the host capabilities; you should see the host's CPU family
updated back with MDS
* Activate the host
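The file-creation part of the steps above can be sketched as the script below. The ROOT prefix is my own addition so the sketch can be dry-run against a scratch directory; on a real host you would run it with ROOT set to empty. The caveat file name force-intel-06-2d-07 corresponds to Intel family 6, model 0x2d, stepping 7; check the microcode_ctl caveats documentation for the name matching your CPU:

```shell
#!/bin/sh
set -eu

# Illustrative dry-run prefix: defaults to a scratch directory so the sketch
# is safe to execute anywhere. On a real host, set ROOT= (empty) explicitly.
ROOT="${ROOT:-$(mktemp -d)}"

caveat_dir="$ROOT/etc/microcode_ctl/ucode_with_caveats"
caveat_file="$caveat_dir/force-intel-06-2d-07"

# Force application of the caveated microcode for this CPU model.
mkdir -p "$caveat_dir"
touch "$caveat_file"
echo "created $caveat_file"

# On the real host, then rebuild every initramfs and reboot:
#   dracut -f --regenerate-all
#   reboot
```

After the reboot, "grep md_ /proc/cpuinfo" should show the flag again, as in the steps above.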
A shoutout to Strahil Nikolov, who helped me the entire way and eventually discovered the
existing bug above, which helped us solve the problem.
-----
kind regards/met vriendelijke groeten
Marko Vrgotic
Sr. System Engineer @ System Administration
ActiveVideo
o: +31 (35) 6774131
e: m.vrgotic@activevideo.com
w: www.activevideo.com
ActiveVideo Networks BV. Mediacentrum 3745 Joop van den Endeplein 1.1217 WJ Hilversum, The
Netherlands. The information contained in this message may be legally privileged and
confidential. It is intended to be read only by the individual or entity to whom it is
addressed or by their designee. If the reader of this message is not the intended
recipient, you are on notice that any distribution of this message, in any form, is
strictly prohibited. If you have received this message in error, please immediately
notify the sender and/or ActiveVideo Networks, LLC by telephone at +1 408.931.9200 and
delete or destroy any copy of this message.
From: "Vrgotic, Marko" <M.Vrgotic(a)activevideo.com>
Date: Tuesday, 18 February 2020 at 18:51
To: Strahil Nikolov <hunter86_bg(a)yahoo.com>, "users(a)ovirt.org"
<users(a)ovirt.org>
Cc: Darko Stojchev <D.Stojchev(a)activevideo.com>
Subject: Re: [ovirt-users] Re: HostedEngine migration fails with VM destroyed during the
startup.
Hi Strahil,
Unfortunately, no luck.
When I try to start the engine with "hosted-engine --vm-start", it fails with the VM stuck
in "Wait for Launch".
The VM log says status shutdown: failed.
When I dumpxml the engine, I see that its defined CPU contains the "md-clear" flag.
Since I do not know how to check whether the engine has synced its configuration, I cannot
say whether 8 hours of waiting was long enough.
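To see which CPU features the engine VM's definition requires, the dumpxml output can be filtered. A sketch; the here-doc is a trimmed sample standing in for the output of "virsh dumpxml HostedEngine", not the real domain XML:

```shell
#!/bin/sh
# Extract the required CPU feature names from libvirt domain XML.
# The here-doc is a trimmed, illustrative sample; on a host you would pipe
# the output of `virsh dumpxml HostedEngine` in instead.
xml=$(cat <<'EOF'
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>SandyBridge</model>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='spec-ctrl'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='md-clear'/>
</cpu>
EOF
)

printf '%s\n' "$xml" \
  | grep "feature policy='require'" \
  | sed "s/.*name='\([^']*\)'.*/\1/"
```

If md-clear appears in the output but "grep md_ /proc/cpuinfo" on the host is empty, the VM cannot start or migrate there.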
On 18/02/2020, 14:16, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:

On February 18, 2020 1:01:32 PM GMT+02:00, "Vrgotic, Marko" <M.Vrgotic(a)activevideo.com> wrote:

Hi Strahil,
We have to meet at the next oVirt conf, as all beer rounds will be on me!
OK, just to be sure, the plan after those 8 hours:
* Node1 # ssh root@ovirt-engine "shutdown -h now" <= engine is currently running, via virsh start, on this Node1
* Node1 # virsh undefine HostedEngine
* Node2 # hosted-engine --vm-start <= Node3 still needs to be updated
* Node2 # hosted-engine --set-maintenance --mode=none
* Node3 # hosted-engine --set-maintenance --mode=local
* Patch Node3
* Test HA
On 18/02/2020, 11:46, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:

On February 18, 2020 12:03:31 PM GMT+02:00, "Vrgotic, Marko" <M.Vrgotic(a)activevideo.com> wrote:

>Dear Strahil,
>
>Thank you for all the knowledge sharing and support so far.
>
>The procedure went fine so far and I have the Engine running on Node1 (it was on Node3).
>
>However, I see "strange things":
>
>* The Engine is running and I have access to the WebUI as well - good.
>* None of the HA nodes actually show who is hosting the Engine atm - all crowns are gray - strange.
>* If I look at the list of VMs, I see the HostedEngine VM as powered off - strange.
>
>Can I safely assume the procedure went fine and the Engine conf sync time of 12 hours has now started, or did something go wrong?
>
>Kindly awaiting your reply.
>On 17/02/2020, 15:04, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:
>
>On February 17, 2020 1:55:13 PM GMT+02:00, "Vrgotic, Marko" <M.Vrgotic(a)activevideo.com> wrote:
>
>>Good day Strahil,
>>
>>I believe I found the causing link:
>>
>>HostedEngine.log-20200216:-cpu SandyBridge,pcid=on,spec-ctrl=on,ssbd=on,md-clear=on,vme=on,hypervisor=on,arat=on,xsaveopt=on \
>>
>>HostedEngine.log-20200216:2020-02-13T17:58:38.674630Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>HostedEngine.log-20200216:2020-02-13T17:58:38.676205Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>HostedEngine.log-20200216:2020-02-13T17:58:38.676901Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>HostedEngine.log-20200216:2020-02-13T17:58:38.677616Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>
>>The "md-clear" CPU feature seems to have been removed due to the Spectre vulnerabilities.
>>
>>However, when I check the CPU type/flags of the VMs on the same host as where the Engine currently is, as well as on the other hosts, md-clear seems to be present only on the HostedEngine:
>>
>>* HostedEngine:
>>
>>From webUI:
>>Intel SandyBridge IBRS SSBD Family
>>
>>Via virsh:
>>
>># virsh dumpxml
>>
>><cpu mode='custom' match='exact' check='full'>
>>  <model fallback='forbid'>SandyBridge</model>
>>  <topology sockets='16' cores='4' threads='1'/>
>>  <feature policy='require' name='pcid'/>
>>  <feature policy='require' name='spec-ctrl'/>
>>  <feature policy='require' name='ssbd'/>
>>  <feature policy='require' name='md-clear'/>
>>  <feature policy='require' name='vme'/>
>>  <feature policy='require' name='hypervisor'/>
>>  <feature policy='require' name='arat'/>
>>  <feature policy='require' name='xsaveopt'/>
>>  <numa>
>>    <cell id='0' cpus='0-3' memory='16777216' unit='KiB'/>
>>  </numa>
>></cpu>
>>
>>* Other VMs:
>>
>>From webUI:
>>(SandyBridge,+pcid,+spec-ctrl,+ssbd)
>>
>>Via virsh:
>>
>># virsh dumpxml
>>
>><cpu mode='custom' match='exact' check='full'>
>>  <model fallback='forbid'>SandyBridge</model>
>>  <topology sockets='16' cores='1' threads='1'/>
>>  <feature policy='require' name='pcid'/>
>>  <feature policy='require' name='spec-ctrl'/>
>>  <feature policy='require' name='ssbd'/>
>>  <feature policy='require' name='vme'/>
>>  <feature policy='require' name='hypervisor'/>
>>  <feature policy='require' name='arat'/>
>>  <feature policy='require' name='xsaveopt'/>
>>  <numa>
>>    <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
>>  </numa>
>></cpu>
>>
>>Strahil, knowing this, do you propose a different approach, or shall I just proceed with the initially suggested workaround?
>>
>>Kindly awaiting your reply.
>>On 16/02/2020, 15:28, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:
>>
>>ssh root@engine "poweroff"
>>ssh host-that-holded-engine "virsh undefine HostedEngine; virsh list --all"
>>
>>Lots of virsh - less vdsm :)
>>
>>Good luck
>>
>>Best Regards,
>>Strahil Nikolov
>>
>>On Sunday, 16 February 2020, 16:01:44 GMT+2, Vrgotic, Marko <m.vrgotic(a)activevideo.com> wrote:
>>Hi Strahil,
>>
>>Regarding step 3: "Stop and undefine the VM on the last working host"
>>
>>One question: how do I undefine the HostedEngine from the last host? The hosted-engine command does not provide such an option, or it's just not obvious.
>>
>>Kindly awaiting your reply.
>>
>>-----
>>kind regards/met vriendelijke groeten
>>
>>Marko Vrgotic
>>ActiveVideo
>>
>>On 14/02/2020, 18:44, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:
>>On February 14, 2020 4:19:53 PM GMT+02:00, "Vrgotic, Marko" <M.Vrgotic(a)activevideo.com> wrote:
>>
>>>Good answer Strahil,
>>>
>>>Thank you, I forgot.
>>>
>>>The libvirt logs are actually showing the reason why:
>>>
>>>2020-02-14T12:33:51.847970Z qemu-kvm: -drive file=/var/run/vdsm/storage/054c43fc-1924-4106-9f80-0f2ac62b9886/b019c5fa-8fb5-4bfc-8339-f5b7f590a051/f1ce8ba6-2d3b-4309-bca0-e6a00ce74c75,format=raw,if=none,id=drive-ua-b019c5fa-8fb5-4bfc-8339-f5b7f590a051,serial=b019c5fa-8fb5-4bfc-8339-f5b7f590a051,werror=stop,rerror=stop,cache=none,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead
>>>Spice-Message: 04:33:51.856: setting TLS option 'CipherString' to 'kECDHE+FIPS:kDHE+FIPS:kRSA+FIPS:!eNULL:!aNULL' from /etc/pki/tls/spice.cnf configuration file
>>>2020-02-14T12:33:51.863449Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 4 [socket-id: 1, core-id: 0, thread-id: 0], CPU 5 [socket-id: 1, core-id: 1, thread-id: 0], CPU 6 [socket-id: 1, core-id: 2, thread-id: 0], CPU 7 [socket-id: 1, core-id: 3, thread-id: 0], CPU 8 [socket-id: 2, core-id: 0, thread-id: 0], CPU 9 [socket-id: 2, core-id: 1, thread-id: 0], CPU 10 [socket-id: 2, core-id: 2, thread-id: 0], CPU 11 [socket-id: 2, core-id: 3, thread-id: 0], CPU 12 [socket-id: 3, core-id: 0, thread-id: 0], CPU 13 [socket-id: 3, core-id: 1, thread-id: 0], CPU 14 [socket-id: 3, core-id: 2, thread-id: 0], CPU 15 [socket-id: 3, core-id: 3, thread-id: 0], CPU 16 [socket-id: 4, core-id: 0, thread-id: 0], CPU 17 [socket-id: 4, core-id: 1, thread-id: 0], CPU 18 [socket-id: 4, core-id: 2, thread-id: 0], CPU 19 [socket-id: 4, core-id: 3, thread-id: 0], CPU 20 [socket-id: 5, core-id: 0, thread-id: 0], CPU 21 [socket-id: 5, core-id: 1, thread-id: 0], CPU 22 [socket-id: 5, core-id: 2, thread-id: 0], CPU 23 [socket-id: 5, core-id: 3, thread-id: 0], CPU 24 [socket-id: 6, core-id: 0, thread-id: 0], CPU 25 [socket-id: 6, core-id: 1, thread-id: 0], CPU 26 [socket-id: 6, core-id: 2, thread-id: 0], CPU 27 [socket-id: 6, core-id: 3, thread-id: 0], CPU 28 [socket-id: 7, core-id: 0, thread-id: 0], CPU 29 [socket-id: 7, core-id: 1, thread-id: 0], CPU 30 [socket-id: 7, core-id: 2, thread-id: 0], CPU 31 [socket-id: 7, core-id: 3, thread-id: 0], CPU 32 [socket-id: 8, core-id: 0, thread-id: 0], CPU 33 [socket-id: 8, core-id: 1, thread-id: 0], CPU 34 [socket-id: 8, core-id: 2, thread-id: 0], CPU 35 [socket-id: 8, core-id: 3, thread-id: 0], CPU 36 [socket-id: 9, core-id: 0, thread-id: 0], CPU 37 [socket-id: 9, core-id: 1, thread-id: 0], CPU 38 [socket-id: 9, core-id: 2, thread-id: 0], CPU 39 [socket-id: 9, core-id: 3, thread-id: 0], CPU 40 [socket-id: 10, core-id: 0, thread-id: 0], CPU 41 [socket-id: 10, core-id: 1, thread-id: 0], CPU 42 [socket-id: 10, core-id: 2, thread-id: 0], CPU 43 [socket-id: 10, core-id: 3, thread-id: 0], CPU 44 [socket-id: 11, core-id: 0, thread-id: 0], CPU 45 [socket-id: 11, core-id: 1, thread-id: 0], CPU 46 [socket-id: 11, core-id: 2, thread-id: 0], CPU 47 [socket-id: 11, core-id: 3, thread-id: 0], CPU 48 [socket-id: 12, core-id: 0, thread-id: 0], CPU 49 [socket-id: 12, core-id: 1, thread-id: 0], CPU 50 [socket-id: 12, core-id: 2, thread-id: 0], CPU 51 [socket-id: 12, core-id: 3, thread-id: 0], CPU 52 [socket-id: 13, core-id: 0, thread-id: 0], CPU 53 [socket-id: 13, core-id: 1, thread-id: 0], CPU 54 [socket-id: 13, core-id: 2, thread-id: 0], CPU 55 [socket-id: 13, core-id: 3, thread-id: 0], CPU 56 [socket-id: 14, core-id: 0, thread-id: 0], CPU 57 [socket-id: 14, core-id: 1, thread-id: 0], CPU 58 [socket-id: 14, core-id: 2, thread-id: 0], CPU 59 [socket-id: 14, core-id: 3, thread-id: 0], CPU 60 [socket-id: 15, core-id: 0, thread-id: 0], CPU 61 [socket-id: 15, core-id: 1, thread-id: 0], CPU 62 [socket-id: 15, core-id: 2, thread-id: 0], CPU 63 [socket-id: 15, core-id: 3, thread-id: 0]
>>>2020-02-14T12:33:51.863475Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
>>>2020-02-14T12:33:51.863973Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>>2020-02-14T12:33:51.865066Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>>2020-02-14T12:33:51.865547Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>>2020-02-14T12:33:51.865996Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
>>>2020-02-14 12:33:51.932+0000: shutting down, reason=failed
>>>
>>>But then I wonder if the following is related to the error above:
>>>
>>>Before I started upgrading host by host, all hosts in the cluster were showing CPU family type "Intel SandyBridge IBRS SSBD MDS Family". After the first host was upgraded, its CPU family type changed to "Intel SandyBridge IBRS SSBD Family", and that forced me to "downgrade" the cluster family type to "Intel SandyBridge IBRS SSBD Family" in order to be able to activate the host back inside the cluster. Following further, each host's CPU family type changed after the upgrade from "Intel SandyBridge IBRS SSBD MDS Family" to "Intel SandyBridge IBRS SSBD Family", except the one where the HostedEngine currently is.
>>>
>>>Could this possibly be the reason why I cannot migrate the HostedEngine now, and how do I solve it?
>>>
>>>Kindly awaiting your reply.
>>>On 14/02/2020, 14:01, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:
>>>
>>>On February 14, 2020 2:47:04 PM GMT+02:00, "Vrgotic, Marko" <M.Vrgotic(a)activevideo.com> wrote:
>>>
>>>>Dear oVirt,
>>>>
>>>>I have a problem migrating the HostedEngine, our only HA VM, to the other HA nodes.
>>>>
>>>>Bit of background story:
>>>>
>>>>* We have oVirt SHE 4.3.5
>>>>* Three nodes act as the HA pool for SHE
>>>>* Node 3 is currently hosting SHE
>>>>* Actions:
>>>>* Put Node1 in Maintenance mode, all VMs were successfully migrated, then upgraded packages and activated the host - all looks good
>>>>* Put Node2 in Maintenance mode, all VMs were successfully migrated, then upgraded packages and activated the host - all looks good
>>>>
>>>>Now the problem:
>>>>Tried to set Node3 in Maintenance mode; all VMs were successfully migrated, except the HostedEngine.
>>>>
>>>>When attempting migration of the VM HostedEngine, it fails with the following error message:
>>>>
>>>>2020-02-14 12:33:49,960Z INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Lock Acquired to object 'EngineLock:{exclusiveLocks='[66b6d489-ceb8-486a-951a-355e21f13627=VM]', sharedLocks=''}'
>>>>2020-02-14 12:33:49,984Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host 'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: null)
>>>>2020-02-14 12:33:49,984Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host 'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: null)
>>>>2020-02-14 12:33:49,997Z INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Running command: MigrateVmCommand internal: false. Entities affected : ID: 66b6d489-ceb8-486a-951a-355e21f13627 Type: VMAction group MIGRATE_VM with role type USER
>>>>2020-02-14 12:33:50,008Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host 'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>>>>2020-02-14 12:33:50,008Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host 'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>>>>2020-02-14 12:33:50,033Z INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2', vmId='66b6d489-ceb8-486a-951a-355e21f13627', srcHost='ovirt-sj-03.ictv.com', dstVdsId='9808f434-5cd4-48b5-8bbc-e639e391c6a5', dstHost='ovirt-sj-01.ictv.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='40', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='10.210.13.11'}), log id: 5c126a47
>>>>2020-02-14 12:33:50,036Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] START, MigrateBrokerVDSCommand(HostName = MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2', vmId='66b6d489-ceb8-486a-951a-355e21f13627', srcHost='ovirt-sj-03.ictv.com', dstVdsId='9808f434-5cd4-48b5-8bbc-e639e391c6a5', dstHost='ovirt-sj-01.ictv.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='40', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='10.210.13.11'}), log id: a0f776d
>>>>2020-02-14 12:33:50,043Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] FINISH, MigrateBrokerVDSCommand, return: , log id: a0f776d
>>>>2020-02-14 12:33:50,046Z INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 5c126a47
>>>>2020-02-14 12:33:50,052Z INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: HostedEngine, Source: ovirt-sj-03.ictv.com, Destination: ovirt-sj-01.ictv.com, User: mvrgotic@ictv.com(a)ictv.com-authz).
>>>>2020-02-14 12:33:52,893Z INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-8) [] VM '66b6d489-ceb8-486a-951a-355e21f13627' was reported as Down on VDS '9808f434-5cd4-48b5-8bbc-e639e391c6a5'(ovirt-sj-01.ictv.com)
>>>>2020-02-14 12:33:52,893Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] START, DestroyVDSCommand(HostName = ovirt-sj-01.ictv.com, DestroyVmVDSCommandParameters:{hostId='9808f434-5cd4-48b5-8bbc-e639e391c6a5', vmId='66b6d489-ceb8-486a-951a-355e21f13627', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 7532a8c0
>>>>2020-02-14 12:33:53,217Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] Failed to destroy VM '66b6d489-ceb8-486a-951a-355e21f13627' because VM does not exist, ignoring
>>>>2020-02-14 12:33:53,217Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] FINISH, DestroyVDSCommand, return: , log id: 7532a8c0
>>>>2020-02-14 12:33:53,217Z INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-8) [] VM '66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) was unexpectedly detected as 'Down' on VDS '9808f434-5cd4-48b5-8bbc-e639e391c6a5'(ovirt-sj-01.ictv.com) (expected on 'f8d27efb-1527-45f0-97d6-d34a86abaaa2')
>>>>2020-02-14 12:33:53,217Z ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-8) [] Migration of VM 'HostedEngine' to host 'ovirt-sj-01.ictv.com' failed: VM destroyed during the startup.
>>>>2020-02-14 12:33:53,219Z INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-15) [] VM '66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) moved from 'MigratingFrom' --> 'Up'
>>>>2020-02-14 12:33:53,219Z INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-15) [] Adding VM '66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) to re-run list
>>>>2020-02-14 12:33:53,221Z ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-15) [] Rerun VM '66b6d489-ceb8-486a-951a-355e21f13627'. Called from VDS 'ovirt-sj-03.ictv.com'
>>>>2020-02-14 12:33:53,259Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-377323) [] START, MigrateStatusVDSCommand(HostName = MigrateStatusVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2', vmId='66b6d489-ceb8-486a-951a-355e21f13627'}), log id: 62bac076
>>>>2020-02-14 12:33:53,265Z INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH, MigrateStatusVDSCommand, return: , log id: 62bac076
>>>>2020-02-14 12:33:53,277Z WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-377323) [] EVENT_ID: VM_MIGRATION_TRYING_RERUN(128), Failed to migrate VM HostedEngine to Host ovirt-sj-01.ictv.com . Trying to migrate to another Host.
>>>>2020-02-14 12:33:53,330Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host 'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: null)
>>>>2020-02-14 12:33:53,330Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host 'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: null)
>>>>2020-02-14 12:33:53,345Z INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (EE-ManagedThreadFactory-engine-Thread-377323) [] Running command: MigrateVmCommand internal: false. Entities affected : ID: 66b6d489-ceb8-486a-951a-355e21f13627 Type: VMAction group MIGRATE_VM with role type USER
>>>>2020-02-14 12:33:53,356Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host 'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>>>>2020-02-14 12:33:53,356Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host 'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>>>>2020-02-14 12:33:53,380Z INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (EE-ManagedThreadFactory-engine-Thread-377323) [] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2', vmId='66b6d489-ceb8-486a-951a-355e21f13627', srcHost='ovirt-sj-03.ictv.com', dstVdsId='33e8ff78-e396-4f40-b43c-685bfaaee9af', dstHost='ovirt-sj-02.ictv.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null',
>
> >maxBandwidth='40',
>
> >
>
> > > >enableGuestEvents='true',
maxIncomingMigrations='2',
>
> >
>
> > >
>maxOutgoingMigrations='2',
>
> >
>
>> >
>convergenceSchedule='[init=[{name=setDowntime,
>params=[100]}],
>
> >
>
>> >>stalling=[{limit=1,
action={name=setDowntime,
>params=[150]}},
>
> >
>
> > >{limit=2,
>
> >
>
> > > >action={name=setDowntime,
params=[200]}}, {limit=3,
>
> >
>
> > > >action={name=setDowntime,
params=[300]}}, {limit=4,
>
> >
>
> > > >action={name=setDowntime,
params=[400]}}, {limit=6,
>
> >
>
> > > >action={name=setDowntime,
params=[500]}},
{limit=-1,
>
> >
>
> >> >action={name=abort, params=[]}}]]',
dstQemu='10.210.13.12'}),
log
>
> >id:
>
> >
>
> > > >d99059f
>
> >
>
> > > >2020-02-14 12:33:53,380Z INFO
>
> >
>
>>>
>>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>
> >
>
>> >
>(EE-ManagedThreadFactory-engine-Thread-377323) []
START,
>
> >
>
>> > >MigrateBrokerVDSCommand(HostName =
>
> >
>
>>>>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',
>
> >
>
> > >
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
>
> >
>
> > >
>srcHost='ovirt-sj-03.ictv.com',
>
> >
>
> > >
>dstVdsId='33e8ff78-e396-4f40-b43c-685bfaaee9af',
>
> >
>
>> >
>dstHost='ovirt-sj-02.ictv.com:54321',
>migrationMethod='ONLINE',
>
> >
>
> >> >tunnelMigration='false',
migrationDowntime='0',
>
> >autoConverge='true',
>
> >
>
> >> >migrateCompressed='false',
consoleAddress='null',
>
> >maxBandwidth='40',
>
> >
>
> > > >enableGuestEvents='true',
maxIncomingMigrations='2',
>
> >
>
> > >
>maxOutgoingMigrations='2',
>
> >
>
>> >
>convergenceSchedule='[init=[{name=setDowntime,
>params=[100]}],
>
> >
>
>> >>stalling=[{limit=1,
action={name=setDowntime,
>params=[150]}},
>
> >
>
> > >{limit=2,
>
> >
>
> > > >action={name=setDowntime,
params=[200]}}, {limit=3,
>
> >
>
> > > >action={name=setDowntime,
params=[300]}}, {limit=4,
>
> >
>
> > > >action={name=setDowntime,
params=[400]}}, {limit=6,
>
> >
>
> > > >action={name=setDowntime,
params=[500]}},
{limit=-1,
>
> >
>
> >> >action={name=abort, params=[]}}]]',
dstQemu='10.210.13.12'}),
log
>
> >id:
>
> >
>
> > > >6f0483ac
>
> >
>
> > > >2020-02-14 12:33:53,386Z INFO
>
> >
>
>>>
>>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>
> >
>
>> >
>(EE-ManagedThreadFactory-engine-Thread-377323) []
>FINISH,
>
> >
>
> > > >MigrateBrokerVDSCommand, return: , log
id: 6f0483ac
>
> >
>
> > > >2020-02-14 12:33:53,388Z INFO
>
> >
>
> > >
>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
>
> >
>
>> >
>(EE-ManagedThreadFactory-engine-Thread-377323) []
>FINISH,
>
> >
>
>> > >MigrateVDSCommand, return:
MigratingFrom, log id:
>d99059f
>
> >
>
> > > >2020-02-14 12:33:53,391Z INFO
>
> >
>
>>>>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>
> >
>
>> >
>(EE-ManagedThreadFactory-engine-Thread-377323) []
>EVENT_ID:
>
> >
>
> >> >VM_MIGRATION_START(62), Migration started (VM:
HostedEngine,
>
> >Source:
>
> >
>
>> > >ovirt-sj-03.ictv.com, Destination:
ovirt-sj-02.ictv.com,
>User:
>
> >
>
> > >
>mvrgotic@ictv.com(a)ictv.com-authz).
>
> >
>
> > > >2020-02-14 12:33:55,108Z INFO
>
> >
>
>>>
>>[org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher]
>
> >
>
> >> >(EE-ManagedThreadFactory-engineScheduled-Thread-96)
[] Fetched
10
>
> >VMs
>
> >
>
> > > >from VDS
'33e8ff78-e396-4f40-b43c-685bfaaee9af'
>
> >
>
> > > >2020-02-14 12:33:55,110Z INFO
>
> >
>
>> >
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>
> >
>
>> >
>(EE-ManagedThreadFactory-engineScheduled-Thread-96) []
>VM
>
> >
>
>> >
>'66b6d489-ceb8-486a-951a-355e21f13627' is migrating to
>
VDS
>
> >
>
> >>
>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com)
>
> >ignoring
>
> >
>
> > > >it in the refresh until migration
is done
>
> >
>
> > > >2020-02-14 12:33:57,224Z INFO
>
> >
>
>> >
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>
> >
>
> > >>(ForkJoinPool-1-worker-15) [] VM
>
> >
>
> >
>'66b6d489-ceb8-486a-951a-355e21f13627'
>
> >
>
> > > >was reported as Down on VDS
>
> >
>
> >>
>>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com)
>
> >
>
> > > >2020-02-14 12:33:57,225Z INFO
>
> >
>
> >>
>>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>
> >
>
>>> >(ForkJoinPool-1-worker-15) [] START,
DestroyVDSCommand(HostName
>=
>
> >
>
> > > >ovirt-sj-02.ictv.com,
>
> >
>
>>>>DestroyVmVDSCommandParameters:{hostId='33e8ff78-e396-4f40-b43c-685bfaaee9af',
>
> >
>
>> >
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
>secondsToWait='0',
>
> >
>
> >> >gracefully='false', reason='',
ignoreNoVm='true'}), log id:
>
> >1dec553e
>
> >
>
> > > >2020-02-14 12:33:57,672Z INFO
>
> >
>
> >>
>>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>
> >
>
> > > >(ForkJoinPool-1-worker-15) [] Failed
to destroy VM
>
> >
>
>>> >'66b6d489-ceb8-486a-951a-355e21f13627'
because VM does not
>exist,
>
> >
>
> > > >ignoring
>
> >
>
> > > >2020-02-14 12:33:57,672Z INFO
>
> >
>
> >>
>>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>
> >
>
> >>>(ForkJoinPool-1-worker-15) [] FINISH,
DestroyVDSCommand, return:
,
>
> >
log
>
> >
>
> > > >id: 1dec553e
>
> >
>
> > > >2020-02-14 12:33:57,672Z INFO
>
> >
>
>> >
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>
> >
>
> > > >(ForkJoinPool-1-worker-15) [] VM
>
> >
>
> >>
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) was
>
> >unexpectedly
>
> >
>
> > > >detected as 'Down' on VDS
>
> >
>
>
>>>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com)
>
> >(expected
>
> >
>
> > > >on
'f8d27efb-1527-45f0-97d6-d34a86abaaa2')
>
> >
>
> > > >2020-02-14 12:33:57,672Z ERROR
>
> >
>
>> >
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>
> >
>
> >> >(ForkJoinPool-1-worker-15) [] Migration of VM
'HostedEngine'
to
>
>
>host
>
> >
>
>> > >'ovirt-sj-02.ictv.com' failed: VM
destroyed during the
>startup.
>
> >
>
> > > >2020-02-14 12:33:57,674Z INFO
>
> >
>
>> >
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>
> >
>
> > > >(ForkJoinPool-1-worker-8) [] VM
>
> >
>
>> >
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) moved
>from
>
> >
>
> > > >'MigratingFrom' -->
'Up'
>
> >
>
> > > >2020-02-14 12:33:57,674Z INFO
>
> >
>
>> >
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>
> >
>
> > > >(ForkJoinPool-1-worker-8) []
Adding VM
>
> >
>
>>>
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) to re-run
>
list
>
> >
>
> > > >2020-02-14 12:33:57,676Z ERROR
>
> >
>
> >>
>>[org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring]
>
> >
>
> > > >(ForkJoinPool-1-worker-8) [] Rerun
VM
>
> >
>
> > > >'66b6d489-ceb8-48
>
> >
>
> >
>
> >
>
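The rerun loop in the log above is what the missing "md-clear" flag looks like from the engine side. The host-side check suggested in this thread is "grep md_ /proc/cpuinfo"; a small self-contained sketch of that check (the helper name and the fixture files under /tmp are mine, for illustration only):

```shell
#!/bin/sh
# Sketch: report whether a cpuinfo-format file lists the md_clear CPU flag.
# On a real host you would run it against /proc/cpuinfo directly.
check_md_clear() {
    # $1: path to a file in /proc/cpuinfo format
    if grep -qw md_clear "$1"; then
        echo "md_clear: present"
    else
        echo "md_clear: missing"
    fi
}

# Synthetic fixtures: an affected host shows no md_ flags at all,
# a healthy one lists md_clear among the CPU flags.
printf 'flags\t\t: fpu vme ssbd\n' > /tmp/cpuinfo.affected
printf 'flags\t\t: fpu vme ssbd md_clear\n' > /tmp/cpuinfo.ok

check_md_clear /tmp/cpuinfo.affected   # -> md_clear: missing
check_md_clear /tmp/cpuinfo.ok         # -> md_clear: present
```

If the first form is what you see on an updated host, the microcode workaround from the top of this thread applies.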
I am afraid that your suspicions are right.

What is the host CPU and the HostedEngine's XML?

Have you checked the XML of any working VM? What CPU flags do the working VMs have?

How to solve it - I think I have a solution, but you might not like it.
1. Get the current VM XML with virsh
2. Set all nodes in maintenance: 'hosted-engine --set-maintenance --mode=global'
3. Stop and undefine the VM on the last working host
4. Edit the XML from step 1 and add/remove the flags that differ from the other (working) VMs
5. Define the HostedEngine on any of the updated hosts
6. Start the HostedEngine via virsh
7. Try with different CPU flags until the engine starts
8. Leave the engine running for at least 12 hours, so it will have enough time to update its own configuration
9. Remove the maintenance and migrate the engine to the other upgraded host
10. Patch the last HostedEngine host

I have done this procedure in order to recover my engine (except changing the CPU flags).
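The steps above can be condensed into a script skeleton. This is only a sketch of the procedure, not a drop-in script: the commands are written to a file and syntax-checked rather than executed (virsh and hosted-engine exist only on an oVirt host), and the XML path is a placeholder of my choosing.

```shell
#!/bin/sh
# Sketch of the recovery steps; written to a file for review, not executed here.
cat > /tmp/recover-engine.sh <<'EOF'
#!/bin/sh
set -eu
# 1. Dump the current HostedEngine XML (path is a placeholder)
virsh dumpxml HostedEngine > /root/HostedEngine.xml
# 2. Global HA maintenance, so the agents do not interfere
hosted-engine --set-maintenance --mode=global
# 3. Stop and undefine the VM on the last working host
virsh destroy HostedEngine
virsh undefine HostedEngine
# 4. (manual) edit /root/HostedEngine.xml, adjusting the CPU flags
# 5-7. Define and start it on an updated host, retrying flags as needed
virsh define /root/HostedEngine.xml
virsh start HostedEngine
# 8-9. After ~12h, remove maintenance and migrate to the other upgraded host
hosted-engine --set-maintenance --mode=none
EOF
chmod +x /tmp/recover-engine.sh
sh -n /tmp/recover-engine.sh && echo "syntax OK"
```

Note that step 10 (patching the last host) stays manual, and the virsh calls assume the auth alias from hiccup A below.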
Note: You may hit some hiccups:

A) virsh alias:

alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
B) HostedEngine network missing:

[root@ovirt1 ~]# virsh net-dumpxml vdsm-ovirtmgmt
<network>
  <name>vdsm-ovirtmgmt</name>
  <uuid>986c27cf-a1ec-44d8-ae61-ee09ce75c886</uuid>
  <forward mode='bridge'/>
  <bridge name='ovirtmgmt'/>
</network>

Put the definition in an XML file and add it via:

virsh net-define somefile.xml
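Recreating that network from a file can be sketched as follows; the /tmp path is mine (Strahil's "somefile.xml" stands for any file name), and the net-define call is left commented because it only works on a libvirt host:

```shell
#!/bin/sh
# Recreate the vdsm-ovirtmgmt network definition quoted above
# and write it where virsh net-define can pick it up.
cat > /tmp/somefile.xml <<'EOF'
<network>
  <name>vdsm-ovirtmgmt</name>
  <uuid>986c27cf-a1ec-44d8-ae61-ee09ce75c886</uuid>
  <forward mode='bridge'/>
  <bridge name='ovirtmgmt'/>
</network>
EOF
# On the host (with the virsh alias from hiccup A):
#   virsh net-define /tmp/somefile.xml
echo "wrote /tmp/somefile.xml"
```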
C) Missing disk:

Vdsm creates symlinks like these:

[root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]# pwd
/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74
[root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]# ls -l
total 20
lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:05 2c74697a-8bd9-4472-8a98-bf624f3462d5 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/2c74697a-8bd9-4472-8a98-bf624f3462d5
lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:09 3ec27d6d-921c-4348-b799-f50543b6f919 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/3ec27d6d-921c-4348-b799-f50543b6f919
lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:09 441abdc8-6cb1-49a4-903f-a1ec0ed88429 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/441abdc8-6cb1-49a4-903f-a1ec0ed88429
lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:09 94ade632-6ecc-4901-8cec-8e39f3d69cb0 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/94ade632-6ecc-4901-8cec-8e39f3d69cb0
lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:05 fe62a281-51e9-4b23-87b3-2deb52357304 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/fe62a281-51e9-4b23-87b3-2deb52357304
[root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]#

Just create the link, so it points to the correct destination, and power up again.
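Recreating one such link boils down to an "ln -s" of the form below. The UUIDs are taken from the listing above; the /tmp prefixes are mine so the sketch runs anywhere, whereas on a real host you would use /var/run/vdsm/storage and the actual /rhev/data-center mount path:

```shell
#!/bin/sh
# Recreate a vdsm storage symlink of the form:
#   /var/run/vdsm/storage/<sd_uuid>/<img_uuid> -> .../images/<img_uuid>
SD=808423f9-8a5c-40cd-bc9f-2568c85b8c74
IMG=2c74697a-8bd9-4472-8a98-bf624f3462d5
RUNDIR=/tmp/demo-vdsm/storage/$SD            # stands in for /var/run/vdsm/storage/$SD
TARGET=/tmp/demo-rhev/$SD/images/$IMG        # stands in for the /rhev/... image path
mkdir -p "$RUNDIR" "$TARGET"
ln -sfn "$TARGET" "$RUNDIR/$IMG"             # -f: replace a stale link if present
readlink "$RUNDIR/$IMG"
# -> /tmp/demo-rhev/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/2c74697a-8bd9-4472-8a98-bf624f3462d5
```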
Good luck!

Best Regards,
Strahil Nikolov

Hi Marko,

If the other VMs work without issues -> it's worth trying.

Best Regards,
Strahil Nikolov
Hi Marko,

As the VM was started manually (and not by the ovirt-ha-agent) - this is expected.

Keep the engine running and it will update its own OVF. To be on the safe side - 8 hours is overkill, but it will save you from repeating the procedure.

The last step is to shut down the VM from inside, undefine it ('virsh undefine HostedEngine') and then start it via vdsm ('hosted-engine --vm-start') on one of the updated nodes.

Once you do a migration from one updated node to another updated node, you can remove the global maintenance:
'hosted-engine --set-maintenance --mode=none'

Next, you can test the HA of the engine by powering it off:
A) ssh to the not-yet-updated node and set it in local maintenance: hosted-engine --set-maintenance --mode=local
B) ssh engine "poweroff"

Check if the engine is powered up on the last of the 3 nodes (also updated).

Don't forget to patch the last node (and remove the maintenance after the reboot).

The hard part is over - now you just need to verify that the HA is working properly.

Best Regards,
Strahil Nikolov
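The HA test described above (local maintenance on the not-yet-updated node, then power the engine off) can be sketched as a script skeleton; it is written to a file and only syntax-checked here, since hosted-engine and the "engine" ssh target exist only in the real environment:

```shell
#!/bin/sh
# Sketch of the HA verification; review and run the file manually on the hosts.
cat > /tmp/test-engine-ha.sh <<'EOF'
#!/bin/sh
set -eu
# A) On the not-yet-updated node: take it out of HA scheduling
hosted-engine --set-maintenance --mode=local
# B) Power the engine off from inside the guest
ssh engine "poweroff"
# Then watch where the agents restart it (expect the remaining updated node):
hosted-engine --vm-status
EOF
sh -n /tmp/test-engine-ha.sh && echo "syntax OK"
```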
Hey Marko,

Leave the patch for another day.

If something is not OK, you need to have a host from which you can power up the HostedEngine via vdsm. Call it paranoia, but I prefer to be on the safe side.

Best Regards,
Strahil Nikolov