Good day Strahil,
I believe I found the cause:
HostedEngine.log-20200216:-cpu SandyBridge,pcid=on,spec-ctrl=on,ssbd=on,md-clear=on,vme=on,hypervisor=on,arat=on,xsaveopt=on \
HostedEngine.log-20200216:2020-02-13T17:58:38.674630Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
HostedEngine.log-20200216:2020-02-13T17:58:38.676205Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
HostedEngine.log-20200216:2020-02-13T17:58:38.676901Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
HostedEngine.log-20200216:2020-02-13T17:58:38.677616Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
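(Lines like these can be pulled from the rotated libvirt qemu log on the host with something like the following; the path is libvirt's usual default and may differ on your installation:)
grep -h 'md-clear' /var/log/libvirt/qemu/HostedEngine.log*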
The "md-clear" CPU flag seems to have been removed as a feature due to the Spectre vulnerabilities.
However, when I check the CPU type/flags of the VMs on the same host where the engine currently runs, as well as on the other hosts, md-clear seems to be present only on the HostedEngine (a quick way to double-check what the hosts themselves expose is sketched after the XML excerpts below):
* HostedEngine:
From webUI:
Intel SandyBridge IBRS SSBD Family
Via virsh:
# virsh dumpxml
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>SandyBridge</model>
  <topology sockets='16' cores='4' threads='1'/>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='spec-ctrl'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='vme'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='arat'/>
  <feature policy='require' name='xsaveopt'/>
  <numa>
    <cell id='0' cpus='0-3' memory='16777216' unit='KiB'/>
  </numa>
</cpu>
* Other VMs:
From webUI:
(SandyBridge,+pcid,+spec-ctrl,+ssbd)
Via virsh:
# virsh dumpxml
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>SandyBridge</model>
  <topology sockets='16' cores='1' threads='1'/>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='spec-ctrl'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='vme'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='arat'/>
  <feature policy='require' name='xsaveopt'/>
  <numa>
    <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
  </numa>
</cpu>
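(As a quick sanity check of what a host itself exposes, something like this should work on every node; virsh may need the read-only flag or the auth alias mentioned later in this thread:)
# does the kernel/microcode on this host expose the MDS mitigation flag at all?
grep -q md_clear /proc/cpuinfo && echo "md_clear present" || echo "md_clear missing"
# what does libvirt report for the host CPU?
virsh -r domcapabilities | grep -i md-clear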
Strahil, knowing this, do you propose a different approach, or shall I just proceed with the initially suggested workaround?
Kindly awaiting your reply.
-----
kind regards/met vriendelijke groeten
Marko Vrgotic
Sr. System Engineer @ System Administration
ActiveVideo
o: +31 (35) 6774131
e: m.vrgotic@activevideo.com
w: www.activevideo.com
On 16/02/2020, 15:28, "Strahil Nikolov" <hunter86_bg(a)yahoo.com> wrote:
ssh root@engine "poweroff"
ssh host-that-held-the-engine "virsh undefine HostedEngine; virsh list --all"
Lots of virsh - less vdsm :)
Good luck
Best Regards,
Strahil Nikolov
On Sunday, 16 February 2020 at 16:01:44 GMT+2, Vrgotic, Marko
<m.vrgotic(a)activevideo.com> wrote:
Hi Strahil,
Regarding step 3: Stop and undefine the VM on the last working host
One question: how do I undefine the HostedEngine from the last host? The hosted-engine command
does not provide such an option, or it's just not obvious.
Kindly awaiting your reply.
-----
kind regards/met vriendelijke groeten
Marko Vrgotic
ActiveVideo
On 14/02/2020, 18:44, "Strahil Nikolov" <hunter86_bg(a)yahoo.com>
wrote:
On February 14, 2020 4:19:53 PM GMT+02:00, "Vrgotic, Marko"
<M.Vrgotic(a)activevideo.com> wrote:
Good answer Strahil,
Thank you, I forgot.
The libvirt logs actually show the reason why:
2020-02-14T12:33:51.847970Z qemu-kvm: -drive file=/var/run/vdsm/storage/054c43fc-1924-4106-9f80-0f2ac62b9886/b019c5fa-8fb5-4bfc-8339-f5b7f590a051/f1ce8ba6-2d3b-4309-bca0-e6a00ce74c75,format=raw,if=none,id=drive-ua-b019c5fa-8fb5-4bfc-8339-f5b7f590a051,serial=b019c5fa-8fb5-4bfc-8339-f5b7f590a051,werror=stop,rerror=stop,cache=none,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead
Spice-Message: 04:33:51.856: setting TLS option 'CipherString' to 'kECDHE+FIPS:kDHE+FIPS:kRSA+FIPS:!eNULL:!aNULL' from /etc/pki/tls/spice.cnf configuration file
2020-02-14T12:33:51.863449Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 4 [socket-id: 1, core-id: 0, thread-id: 0], CPU 5 [socket-id: 1, core-id: 1, thread-id: 0], CPU 6 [socket-id: 1, core-id: 2, thread-id: 0], CPU 7 [socket-id: 1, core-id: 3, thread-id: 0], CPU 8 [socket-id: 2, core-id: 0, thread-id: 0], CPU 9 [socket-id: 2, core-id: 1, thread-id: 0], CPU 10 [socket-id: 2, core-id: 2, thread-id: 0], CPU 11 [socket-id: 2, core-id: 3, thread-id: 0], CPU 12 [socket-id: 3, core-id: 0, thread-id: 0], CPU 13 [socket-id: 3, core-id: 1, thread-id: 0], CPU 14 [socket-id: 3, core-id: 2, thread-id: 0], CPU 15 [socket-id: 3, core-id: 3, thread-id: 0], CPU 16 [socket-id: 4, core-id: 0, thread-id: 0], CPU 17 [socket-id: 4, core-id: 1, thread-id: 0], CPU 18 [socket-id: 4, core-id: 2, thread-id: 0], CPU 19 [socket-id: 4, core-id: 3, thread-id: 0], CPU 20 [socket-id: 5, core-id: 0, thread-id: 0], CPU 21 [socket-id: 5, core-id: 1, thread-id: 0], CPU 22 [socket-id: 5, core-id: 2, thread-id: 0], CPU 23 [socket-id: 5, core-id: 3, thread-id: 0], CPU 24 [socket-id: 6, core-id: 0, thread-id: 0], CPU 25 [socket-id: 6, core-id: 1, thread-id: 0], CPU 26 [socket-id: 6, core-id: 2, thread-id: 0], CPU 27 [socket-id: 6, core-id: 3, thread-id: 0], CPU 28 [socket-id: 7, core-id: 0, thread-id: 0], CPU 29 [socket-id: 7, core-id: 1, thread-id: 0], CPU 30 [socket-id: 7, core-id: 2, thread-id: 0], CPU 31 [socket-id: 7, core-id: 3, thread-id: 0], CPU 32 [socket-id: 8, core-id: 0, thread-id: 0], CPU 33 [socket-id: 8, core-id: 1, thread-id: 0], CPU 34 [socket-id: 8, core-id: 2, thread-id: 0], CPU 35 [socket-id: 8, core-id: 3, thread-id: 0], CPU 36 [socket-id: 9, core-id: 0, thread-id: 0], CPU 37 [socket-id: 9, core-id: 1, thread-id: 0], CPU 38 [socket-id: 9, core-id: 2, thread-id: 0], CPU 39 [socket-id: 9, core-id: 3, thread-id: 0], CPU 40 [socket-id: 10, core-id: 0, thread-id: 0], CPU 41 [socket-id: 10, core-id: 1, thread-id: 0], CPU 42 [socket-id: 10, core-id: 2, thread-id: 0], CPU 43 [socket-id: 10, core-id: 3, thread-id: 0], CPU 44 [socket-id: 11, core-id: 0, thread-id: 0], CPU 45 [socket-id: 11, core-id: 1, thread-id: 0], CPU 46 [socket-id: 11, core-id: 2, thread-id: 0], CPU 47 [socket-id: 11, core-id: 3, thread-id: 0], CPU 48 [socket-id: 12, core-id: 0, thread-id: 0], CPU 49 [socket-id: 12, core-id: 1, thread-id: 0], CPU 50 [socket-id: 12, core-id: 2, thread-id: 0], CPU 51 [socket-id: 12, core-id: 3, thread-id: 0], CPU 52 [socket-id: 13, core-id: 0, thread-id: 0], CPU 53 [socket-id: 13, core-id: 1, thread-id: 0], CPU 54 [socket-id: 13, core-id: 2, thread-id: 0], CPU 55 [socket-id: 13, core-id: 3, thread-id: 0], CPU 56 [socket-id: 14, core-id: 0, thread-id: 0], CPU 57 [socket-id: 14, core-id: 1, thread-id: 0], CPU 58 [socket-id: 14, core-id: 2, thread-id: 0], CPU 59 [socket-id: 14, core-id: 3, thread-id: 0], CPU 60 [socket-id: 15, core-id: 0, thread-id: 0], CPU 61 [socket-id: 15, core-id: 1, thread-id: 0], CPU 62 [socket-id: 15, core-id: 2, thread-id: 0], CPU 63 [socket-id: 15, core-id: 3, thread-id: 0]
2020-02-14T12:33:51.863475Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2020-02-14T12:33:51.863973Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
2020-02-14T12:33:51.865066Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
2020-02-14T12:33:51.865547Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
2020-02-14T12:33:51.865996Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]
2020-02-14 12:33:51.932+0000: shutting down, reason=failed
But then I wonder if the following is related to the error above:
Before I started upgrading host by host, all hosts in the cluster were showing CPU family type "Intel SandyBridge IBRS SSBD MDS Family".
After the first host was upgraded, its CPU family type changed to "Intel SandyBridge IBRS SSBD Family", which forced me to downgrade the cluster CPU family type to "Intel SandyBridge IBRS SSBD Family" in order to be able to activate the host back in the cluster.
Going further, every host's CPU family type changed after the upgrade from "Intel SandyBridge IBRS SSBD MDS Family" to "Intel SandyBridge IBRS SSBD Family", except the one where the HostedEngine is currently running.
Could this possibly be the reason why I cannot migrate the HostedEngine now, and how do I solve it?
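(In case it helps, the CPU flags that vdsm itself reports to the engine can be checked per host with something like the following; I believe vdsm-client exposes this, but treat it as a sketch:)
vdsm-client Host getCapabilities | grep -o 'md_clear'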
Kindly awaiting your reply.
-----
kind regards/met vriendelijke groeten
Marko Vrgotic
Sr. System Engineer @ System Administration
ActiveVideo
o: +31 (35) 6774131
e: m.vrgotic(a)activevideo.com
On 14/02/2020, 14:01, "Strahil Nikolov"
<hunter86_bg(a)yahoo.com> wrote:
On February 14, 2020 2:47:04 PM GMT+02:00, "Vrgotic, Marko"
<M.Vrgotic(a)activevideo.com> wrote:
>Dear oVirt,
>
>I have a problem migrating the HostedEngine, the only HA VM, to the
>other HA nodes.
>
>Bit of background story:
>
> * We have oVirt SHE 4.3.5
> * Three Nodes act as HA pool for SHE
> * Node 3 is currently Hosting SHE
> * Actions:
>* Put Node1 in Maintenance mode, all VMs were successfully migrated,
>then upgraded packages, activated the host - all looks good
>* Put Node2 in Maintenance mode, all VMs were successfully migrated,
>then upgraded packages, activated the host - all looks good
>
>Now the problem:
>Tried to set Node3 in Maintenance mode; all VMs were successfully
>migrated, except the HostedEngine.
>
>When attempting migration of the HostedEngine VM, it fails with the
>following error message:
>
>2020-02-14 12:33:49,960Z INFO
>[org.ovirt.engine.core.bll.MigrateVmCommand] (default
task-265)
>[16f4559e-e262-4c9d-80b4-ec81c2cbf950] Lock Acquired to
object
>'EngineLock:{exclusiveLocks='[66b6d489-ceb8-486a-951a-355e21f13627=VM]',
>sharedLocks=''}'
>2020-02-14 12:33:49,984Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
(default
>task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate
host
>'ovirt-sj-04.ictv.com'
('d98843da-bd81-46c9-9425-065b196ac59d') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: null)
>2020-02-14 12:33:49,984Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
(default
>task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate
host
>'ovirt-sj-05.ictv.com'
('e3176705-9fb0-41d6-8721-367dfa2e62bd') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: null)
>2020-02-14 12:33:49,997Z INFO
>[org.ovirt.engine.core.bll.MigrateVmCommand] (default
task-265)
>[16f4559e-e262-4c9d-80b4-ec81c2cbf950] Running command:
>MigrateVmCommand internal: false. Entities affected : ID:
>66b6d489-ceb8-486a-951a-355e21f13627 Type: VMAction group
MIGRATE_VM
>with role type USER
>2020-02-14 12:33:50,008Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
(default
>task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate
host
>'ovirt-sj-04.ictv.com'
('d98843da-bd81-46c9-9425-065b196ac59d') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>2020-02-14 12:33:50,008Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
(default
>task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate
host
>'ovirt-sj-05.ictv.com'
('e3176705-9fb0-41d6-8721-367dfa2e62bd') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>2020-02-14 12:33:50,033Z INFO
>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default
task-265)
>[16f4559e-e262-4c9d-80b4-ec81c2cbf950] START,
MigrateVDSCommand(
>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
>srcHost='ovirt-sj-03.ictv.com',
>dstVdsId='9808f434-5cd4-48b5-8bbc-e639e391c6a5',
>dstHost='ovirt-sj-01.ictv.com:54321',
migrationMethod='ONLINE',
>tunnelMigration='false', migrationDowntime='0',
autoConverge='true',
>migrateCompressed='false',
consoleAddress='null', maxBandwidth='40',
>enableGuestEvents='true',
maxIncomingMigrations='2',
>maxOutgoingMigrations='2',
>convergenceSchedule='[init=[{name=setDowntime,
params=[100]}],
>stalling=[{limit=1, action={name=setDowntime, params=[150]}},
{limit=2,
>action={name=setDowntime, params=[200]}}, {limit=3,
>action={name=setDowntime, params=[300]}}, {limit=4,
>action={name=setDowntime, params=[400]}}, {limit=6,
>action={name=setDowntime, params=[500]}}, {limit=-1,
>action={name=abort, params=[]}}]]',
dstQemu='10.210.13.11'}), log id:
>5c126a47
>2020-02-14 12:33:50,036Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>(default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950]
START,
>MigrateBrokerVDSCommand(HostName =
ovirt-sj-03.ictv.com,
>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
>srcHost='ovirt-sj-03.ictv.com',
>dstVdsId='9808f434-5cd4-48b5-8bbc-e639e391c6a5',
>dstHost='ovirt-sj-01.ictv.com:54321',
migrationMethod='ONLINE',
>tunnelMigration='false', migrationDowntime='0',
autoConverge='true',
>migrateCompressed='false',
consoleAddress='null', maxBandwidth='40',
>enableGuestEvents='true',
maxIncomingMigrations='2',
>maxOutgoingMigrations='2',
>convergenceSchedule='[init=[{name=setDowntime,
params=[100]}],
>stalling=[{limit=1, action={name=setDowntime, params=[150]}},
{limit=2,
>action={name=setDowntime, params=[200]}}, {limit=3,
>action={name=setDowntime, params=[300]}}, {limit=4,
>action={name=setDowntime, params=[400]}}, {limit=6,
>action={name=setDowntime, params=[500]}}, {limit=-1,
>action={name=abort, params=[]}}]]',
dstQemu='10.210.13.11'}), log id:
>a0f776d
>2020-02-14 12:33:50,043Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>(default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950]
FINISH,
>MigrateBrokerVDSCommand, return: , log id: a0f776d
>2020-02-14 12:33:50,046Z INFO
>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default
task-265)
>[16f4559e-e262-4c9d-80b4-ec81c2cbf950] FINISH,
MigrateVDSCommand,
>return: MigratingFrom, log id: 5c126a47
>2020-02-14 12:33:50,052Z INFO
>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>(default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950]
EVENT_ID:
>VM_MIGRATION_START(62), Migration started (VM: HostedEngine,
Source:
>ovirt-sj-03.ictv.com, Destination:
ovirt-sj-01.ictv.com,
User:
>mvrgotic@ictv.com(a)ictv.com-authz).
>2020-02-14 12:33:52,893Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-8) [] VM
'66b6d489-ceb8-486a-951a-355e21f13627'
>was reported as Down on VDS
>'9808f434-5cd4-48b5-8bbc-e639e391c6a5'(ovirt-sj-01.ictv.com)
>2020-02-14 12:33:52,893Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>(ForkJoinPool-1-worker-8) [] START, DestroyVDSCommand(HostName
=
>ovirt-sj-01.ictv.com,
>DestroyVmVDSCommandParameters:{hostId='9808f434-5cd4-48b5-8bbc-e639e391c6a5',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
secondsToWait='0',
>gracefully='false', reason='',
ignoreNoVm='true'}), log id: 7532a8c0
>2020-02-14 12:33:53,217Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>(ForkJoinPool-1-worker-8) [] Failed to destroy VM
>'66b6d489-ceb8-486a-951a-355e21f13627' because VM does
not exist,
>ignoring
>2020-02-14 12:33:53,217Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>(ForkJoinPool-1-worker-8) [] FINISH, DestroyVDSCommand, return: ,
log
>id: 7532a8c0
>2020-02-14 12:33:53,217Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-8) [] VM
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) was
unexpectedly
>detected as 'Down' on VDS
>'9808f434-5cd4-48b5-8bbc-e639e391c6a5'(ovirt-sj-01.ictv.com)
(expected
>on 'f8d27efb-1527-45f0-97d6-d34a86abaaa2')
>2020-02-14 12:33:53,217Z ERROR
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-8) [] Migration of VM
'HostedEngine' to host
>'ovirt-sj-01.ictv.com' failed: VM destroyed during the
startup.
>2020-02-14 12:33:53,219Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-15) [] VM
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine)
moved from
>'MigratingFrom' --> 'Up'
>2020-02-14 12:33:53,219Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-15) [] Adding VM
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) to
re-run list
>2020-02-14 12:33:53,221Z ERROR
>[org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring]
>(ForkJoinPool-1-worker-15) [] Rerun VM
>'66b6d489-ceb8-486a-951a-355e21f13627'. Called from
VDS
>'ovirt-sj-03.ictv.com'
>2020-02-14 12:33:53,259Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] START,
>MigrateStatusVDSCommand(HostName =
ovirt-sj-03.ictv.com,
>MigrateStatusVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627'}), log id:
62bac076
>2020-02-14 12:33:53,265Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH,
>MigrateStatusVDSCommand, return: , log id: 62bac076
>2020-02-14 12:33:53,277Z WARN
>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] EVENT_ID:
>VM_MIGRATION_TRYING_RERUN(128), Failed to migrate VM
HostedEngine to
>Host
ovirt-sj-01.ictv.com . Trying to migrate to another
Host.
>2020-02-14 12:33:53,330Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate
host
>'ovirt-sj-04.ictv.com'
('d98843da-bd81-46c9-9425-065b196ac59d') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: null)
>2020-02-14 12:33:53,330Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate
host
>'ovirt-sj-05.ictv.com'
('e3176705-9fb0-41d6-8721-367dfa2e62bd') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: null)
>2020-02-14 12:33:53,345Z INFO
>[org.ovirt.engine.core.bll.MigrateVmCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] Running
command:
>MigrateVmCommand internal: false. Entities affected : ID:
>66b6d489-ceb8-486a-951a-355e21f13627 Type: VMAction group
MIGRATE_VM
>with role type USER
>2020-02-14 12:33:53,356Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate
host
>'ovirt-sj-04.ictv.com'
('d98843da-bd81-46c9-9425-065b196ac59d') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>2020-02-14 12:33:53,356Z INFO
>[org.ovirt.engine.core.bll.scheduling.SchedulingManager]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate
host
>'ovirt-sj-05.ictv.com'
('e3176705-9fb0-41d6-8721-367dfa2e62bd') was
>filtered out by 'VAR__FILTERTYPE__INTERNAL' filter
'HA' (correlation
>id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)
>2020-02-14 12:33:53,380Z INFO
>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] START,
>MigrateVDSCommand(
>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
>srcHost='ovirt-sj-03.ictv.com',
>dstVdsId='33e8ff78-e396-4f40-b43c-685bfaaee9af',
>dstHost='ovirt-sj-02.ictv.com:54321',
migrationMethod='ONLINE',
>tunnelMigration='false', migrationDowntime='0',
autoConverge='true',
>migrateCompressed='false',
consoleAddress='null', maxBandwidth='40',
>enableGuestEvents='true',
maxIncomingMigrations='2',
>maxOutgoingMigrations='2',
>convergenceSchedule='[init=[{name=setDowntime,
params=[100]}],
>stalling=[{limit=1, action={name=setDowntime, params=[150]}},
{limit=2,
>action={name=setDowntime, params=[200]}}, {limit=3,
>action={name=setDowntime, params=[300]}}, {limit=4,
>action={name=setDowntime, params=[400]}}, {limit=6,
>action={name=setDowntime, params=[500]}}, {limit=-1,
>action={name=abort, params=[]}}]]',
dstQemu='10.210.13.12'}), log id:
>d99059f
>2020-02-14 12:33:53,380Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] START,
>MigrateBrokerVDSCommand(HostName =
ovirt-sj-03.ictv.com,
>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
>srcHost='ovirt-sj-03.ictv.com',
>dstVdsId='33e8ff78-e396-4f40-b43c-685bfaaee9af',
>dstHost='ovirt-sj-02.ictv.com:54321',
migrationMethod='ONLINE',
>tunnelMigration='false', migrationDowntime='0',
autoConverge='true',
>migrateCompressed='false',
consoleAddress='null', maxBandwidth='40',
>enableGuestEvents='true',
maxIncomingMigrations='2',
>maxOutgoingMigrations='2',
>convergenceSchedule='[init=[{name=setDowntime,
params=[100]}],
>stalling=[{limit=1, action={name=setDowntime, params=[150]}},
{limit=2,
>action={name=setDowntime, params=[200]}}, {limit=3,
>action={name=setDowntime, params=[300]}}, {limit=4,
>action={name=setDowntime, params=[400]}}, {limit=6,
>action={name=setDowntime, params=[500]}}, {limit=-1,
>action={name=abort, params=[]}}]]',
dstQemu='10.210.13.12'}), log id:
>6f0483ac
>2020-02-14 12:33:53,386Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH,
>MigrateBrokerVDSCommand, return: , log id: 6f0483ac
>2020-02-14 12:33:53,388Z INFO
>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH,
>MigrateVDSCommand, return: MigratingFrom, log id: d99059f
>2020-02-14 12:33:53,391Z INFO
>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>(EE-ManagedThreadFactory-engine-Thread-377323) [] EVENT_ID:
>VM_MIGRATION_START(62), Migration started (VM: HostedEngine,
Source:
>ovirt-sj-03.ictv.com, Destination:
ovirt-sj-02.ictv.com,
User:
>mvrgotic@ictv.com(a)ictv.com-authz).
>2020-02-14 12:33:55,108Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher]
>(EE-ManagedThreadFactory-engineScheduled-Thread-96) [] Fetched 10
VMs
>from VDS '33e8ff78-e396-4f40-b43c-685bfaaee9af'
>2020-02-14 12:33:55,110Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(EE-ManagedThreadFactory-engineScheduled-Thread-96) [] VM
>'66b6d489-ceb8-486a-951a-355e21f13627' is migrating to
VDS
>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com) ignoring
>it in the refresh until migration is done
>2020-02-14 12:33:57,224Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-15) [] VM
'66b6d489-ceb8-486a-951a-355e21f13627'
>was reported as Down on VDS
>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com)
>2020-02-14 12:33:57,225Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>(ForkJoinPool-1-worker-15) [] START,
DestroyVDSCommand(HostName =
>ovirt-sj-02.ictv.com,
>DestroyVmVDSCommandParameters:{hostId='33e8ff78-e396-4f40-b43c-685bfaaee9af',
>vmId='66b6d489-ceb8-486a-951a-355e21f13627',
secondsToWait='0',
>gracefully='false', reason='',
ignoreNoVm='true'}), log id: 1dec553e
>2020-02-14 12:33:57,672Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>(ForkJoinPool-1-worker-15) [] Failed to destroy VM
>'66b6d489-ceb8-486a-951a-355e21f13627' because VM does
not exist,
>ignoring
>2020-02-14 12:33:57,672Z INFO
>[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>(ForkJoinPool-1-worker-15) [] FINISH, DestroyVDSCommand, return: ,
log
>id: 1dec553e
>2020-02-14 12:33:57,672Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-15) [] VM
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) was
unexpectedly
>detected as 'Down' on VDS
>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com)
(expected
>on 'f8d27efb-1527-45f0-97d6-d34a86abaaa2')
>2020-02-14 12:33:57,672Z ERROR
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-15) [] Migration of VM
'HostedEngine' to host
>'ovirt-sj-02.ictv.com' failed: VM destroyed during the
startup.
>2020-02-14 12:33:57,674Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-8) [] VM
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine)
moved from
>'MigratingFrom' --> 'Up'
>2020-02-14 12:33:57,674Z INFO
>[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>(ForkJoinPool-1-worker-8) [] Adding VM
>'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) to
re-run list
>2020-02-14 12:33:57,676Z ERROR
>[org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring]
>(ForkJoinPool-1-worker-8) [] Rerun VM
>'66b6d489-ceb8-48
I am afraid that your suspicions are right.
What is the host cpu and the HostedEngine's xml?
Have you checked the xml on any working VM? What cpu flags do the working VMs have?
How to solve it - I think I have a solution, but you might not like it (a rough command sketch of these steps follows below):
1. Get the current VM xml with virsh
2. Set all nodes in maintenance: 'hosted-engine --set-maintenance --mode=global'
3. Stop and undefine the VM on the last working host
4. Edit the xml from step 1 and add/remove the flags that are different from the other (working) VMs
5. Define the HostedEngine on any of the updated hosts
6. Start the HostedEngine via virsh
7. Try with different cpu flags until the engine starts
8. Leave the engine for at least 12 hours, so it will have enough time to update its own configuration
9. Remove the maintenance and migrate the engine to the other upgraded host
10. Patch the last HostedEngine's host
I have done this procedure in order to recover my engine (except changing the cpu flags).
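(A minimal command-level sketch of steps 1-7, assuming the virsh alias from hiccup A below is in place; every path here is illustrative:)
# step 1, on the host still running the engine: save the current definition
virsh dumpxml HostedEngine > /root/HostedEngine.xml
# step 2, once, from any host
hosted-engine --set-maintenance --mode=global
# step 3: power the engine off (or ssh in and poweroff, as suggested earlier), then undefine it
virsh shutdown HostedEngine
virsh undefine HostedEngine
# step 4: edit /root/HostedEngine.xml and adjust the differing flags, e.g. drop the md-clear <feature> line
# steps 5-7, on one of the upgraded hosts after copying the xml over
virsh define /root/HostedEngine.xml
virsh start HostedEngine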
Note: You may hit some hiccups:
A) virsh alias
alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
B) HostedEngine network missing:
[root@ovirt1 ~]# virsh net-dumpxml vdsm-ovirtmgmt
<network>
<name>vdsm-ovirtmgmt</name>
<uuid>986c27cf-a1ec-44d8-ae61-ee09ce75c886</uuid>
<forward mode='bridge'/>
<bridge name='ovirtmgmt'/>
</network>
Define in xml and add it via:
virsh net-define somefile.xml
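(For example - a sketch, with the file path being arbitrary - the definition can be dumped on a host where the network still exists and re-defined on the affected one:)
virsh net-dumpxml vdsm-ovirtmgmt > /tmp/vdsm-ovirtmgmt.xml
# copy the file to the affected host, then:
virsh net-define /tmp/vdsm-ovirtmgmt.xml
virsh net-start vdsm-ovirtmgmt
virsh net-autostart vdsm-ovirtmgmt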
C) Missing disk
Vdsm is creating symlinks like these:
[root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]# pwd
/var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74
[root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]# ls -l
total 20
lrwxrwxrwx. 1 vdsm kvm 129 Feb 2 19:05 2c74697a-8bd9-4472-8a98-bf624f3462d5 ->
/rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/2c74697a-8bd9-4472-8a98-bf624f3462d5
lrwxrwxrwx. 1 vdsm kvm 129 Feb 2 19:09 3ec27d6d-921c-4348-b799-f50543b6f919 ->
/rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/3ec27d6d-921c-4348-b799-f50543b6f919
lrwxrwxrwx. 1 vdsm kvm 129 Feb 2 19:09 441abdc8-6cb1-49a4-903f-a1ec0ed88429 ->
/rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/441abdc8-6cb1-49a4-903f-a1ec0ed88429
lrwxrwxrwx. 1 vdsm kvm 129 Feb 2 19:09 94ade632-6ecc-4901-8cec-8e39f3d69cb0 ->
/rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/94ade632-6ecc-4901-8cec-8e39f3d69cb0
lrwxrwxrwx. 1 vdsm kvm 129 Feb 2 19:05 fe62a281-51e9-4b23-87b3-2deb52357304 ->
/rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/fe62a281-51e9-4b23-87b3-2deb52357304
[root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]#
Just create the link so that it points to the correct destination, and power up again.
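(A sketch of recreating such a link by hand; the UUIDs below are placeholders for the actual storage-domain and image UUIDs:)
mkdir -p /var/run/vdsm/storage/<storage-domain-uuid>
ln -s /rhev/data-center/mnt/glusterSD/gluster1:_engine/<storage-domain-uuid>/images/<image-uuid> \
      /var/run/vdsm/storage/<storage-domain-uuid>/<image-uuid>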
Good luck !
Best Regards,
Strahil Nikolov