Hi Strahil,

 

While writing the previous email, I realized where and how to check whether the HostedEngine is defined on the other HA Nodes 1 & 2.

 

The HostedEngine being DEPLOYED (the crown icon) and the HostedEngine being DEFINED in libvirt are two different things.
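
For example (using the authfile trick from your note further below in this thread), a defined-but-not-running HostedEngine shows up as "shut off" on a node, while the crown in the WebUI only marks where it is deployed:

Node1 # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf list --all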

 

Please, if possible, be so kind as to review the workaround steps below.

 

 

-----

kind regards/met vriendelijke groeten

 

Marko Vrgotic
ActiveVideo

 

 

From: "Vrgotic, Marko" <M.Vrgotic@activevideo.com>
Date: Monday, 17 February 2020 at 14:09
To: Strahil Nikolov <hunter86_bg@yahoo.com>, "users@ovirt.org" <users@ovirt.org>
Cc: Darko Stojchev <D.Stojchev@activevideo.com>
Subject: Re: [ovirt-users] Re: HostedEngine migration fails with VM destroyed during the startup.

 

Hi Strahil,

 

In case your answer is to proceed as initially suggested (see my previous email), please review the following steps, as I want to make sure I am on the right path:

 

Situation:

HA Node1 – updated – SHE deployed/defined – not hosting SHE

HA Node2 – updated – SHE deployed/defined – not hosting SHE

HA Node3 – not updated – SHE deployed/defined - currently hosting SHE
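
(For completeness: which node currently hosts the SHE can be confirmed from any HA node with the status command, which prints the engine state and score per host.)

Node1 # hosted-engine --vm-status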

 

 

  1. Execute engine-backup.
  2. Get the current VM xml with virsh, from the Node where the HostedEngine is currently running:
    1. Node3 # virsh dumpxml {hosted_engine_id} > ovirt-engine-file/HostedEngine.xml
    2. Node3 # virsh net-dumpxml vdsm-ovirtmgmt > ovirt-engine-file/vdsm-ovirtmgmt.xml

  3. Set the cluster in global maintenance (the setting is cluster-wide, so running it once from any HA node is enough):
    1. Node3 # hosted-engine --set-maintenance --mode=global

  4. Stop and undefine the VM on the last working host (Node3):
    1. Node3 # hosted-engine --vm-shutdown
    2. Node3 # virsh undefine {hosted_engine_id}

  5. Edit the xml from step 2 and add/remove the flags that differ from the other (working) VMs:
    1. Remove the md-clear flag from the cpu features list in the xml (see the sketch after this list).

  6. Define and start the HostedEngine on one of the updated hosts, with both commands on the same host:
    1. Node1 # virsh define HostedEngine.xml
    2. Node1 # virsh start {hosted_engine_id}
    3. These steps might first require creating the vdsm-ovirtmgmt network and the symlinks to the images on storage (also covered in the sketch below).
  7. Leave the engine alone for at least 12 hours, so it has enough time to update its own configuration.
  8. Remove the global maintenance and migrate the engine to the other upgraded host – via the WebUI.
  9. Patch the last HostedEngine host (Node3).
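
To make sure I read steps 4-6 correctly, here they are as a condensed command sketch. The sed one-liner, running define and start on the same node, and reusing the net-dump from step 2b are my assumptions, so please correct me if any of it is off:

  Node3 # hosted-engine --vm-shutdown
  Node3 # virsh undefine {hosted_engine_id}
  Node1 # sed -i "/name='md-clear'/d" ovirt-engine-file/HostedEngine.xml
  Node1 # virsh net-define ovirt-engine-file/vdsm-ovirtmgmt.xml
  Node1 # virsh define ovirt-engine-file/HostedEngine.xml
  Node1 # virsh start {hosted_engine_id}

(the net-define is only needed if the vdsm-ovirtmgmt network is missing on the target host)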

 

Questions:

Are the above steps correct and complete, or would you approach any of them differently?

 

Kindly awaiting your reply.

 

 

-----

kind regards/met vriendelijke groeten

 

Marko Vrgotic
Sr. System Engineer @ System Administration


ActiveVideo

o: +31 (35) 6774131

e: m.vrgotic@activevideo.com
w: www.activevideo.com

 


 

 

 

From: "Vrgotic, Marko" <M.Vrgotic@activevideo.com>
Date: Monday, 17 February 2020 at 12:55
To: Strahil Nikolov <hunter86_bg@yahoo.com>, "users@ovirt.org" <users@ovirt.org>
Cc: Darko Stojchev <D.Stojchev@activevideo.com>
Subject: Re: [ovirt-users] Re: HostedEngine migration fails with VM destroyed during the startup.

 

Good day Strahil,

 

I believe I found the causal link:

 

HostedEngine.log-20200216:-cpu SandyBridge,pcid=on,spec-ctrl=on,ssbd=on,md-clear=on,vme=on,hypervisor=on,arat=on,xsaveopt=on \

HostedEngine.log-20200216:2020-02-13T17:58:38.674630Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]

HostedEngine.log-20200216:2020-02-13T17:58:38.676205Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]

HostedEngine.log-20200216:2020-02-13T17:58:38.676901Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]

HostedEngine.log-20200216:2020-02-13T17:58:38.677616Z qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.md-clear [bit 10]

 

The "md-clear" CPU seem to be removed as feature due to spectre vulnerabilities.

 

However, when I check the CPU type/flags of the VMs on the same Host where the Engine currently runs, as well as on the other hosts, md-clear seems to be present only on the HostedEngine (a quick diff of the feature lists follows the two dumps below):

 

From webUI:

Intel SandyBridge IBRS SSBD Family

 

Via virsh:

#virsh dumpxml
<cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <topology sockets='16' cores='4' threads='1'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='spec-ctrl'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='xsaveopt'/>
    <numa>
      <cell id='0' cpus='0-3' memory='16777216' unit='KiB'/>
    </numa>
</cpu>

 

 

From webUI:

(SandyBridge,+pcid,+spec-ctrl,+ssbd)

 

Via virsh:

#virsh dumpxml
<cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <topology sockets='16' cores='1' threads='1'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='spec-ctrl'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='xsaveopt'/>
    <numa>
      <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
    </numa>
  </cpu>
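
To make the comparison easier to eyeball, the two feature lists can be diffed directly; {working_vm} is a placeholder for any of the working domains:

Node3 # diff <(virsh dumpxml HostedEngine | grep '<feature') \
             <(virsh dumpxml {working_vm} | grep '<feature')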

 

 

Strahil, knowing this, do you propose a different approach, or shall I just proceed with the initially suggested workaround?

 

Kindly awaiting your reply.

 

 

-----

kind regards/met vriendelijke groeten

 

Marko Vrgotic
Sr. System Engineer @ System Administration


ActiveVideo

o: +31 (35) 6774131

e: m.vrgotic@activevideo.com
w: www.activevideo.com

 


 

 

 

 

 

On 16/02/2020, 15:28, "Strahil Nikolov" <hunter86_bg@yahoo.com> wrote:

 

    ssh root@engine "poweroff"

     ssh host-that-held-engine "virsh undefine HostedEngine; virsh list --all"

   

     Lots of virsh - less vdsm :)

   

    Good luck

   

    Best Regards,

    Strahil Nikolov

   

    

     On Sunday, 16 February 2020 at 16:01:44 GMT+2, Vrgotic, Marko <m.vrgotic@activevideo.com> wrote:

    

    

    

    

    

    Hi Strahil,

   

    Regarding step 3:  Stop and undefine the VM on the last working host

     One question: how do I undefine the HostedEngine from the last Host? The hosted-engine command does not provide such an option, or it's just not obvious.

   

    Kindly awaiting your reply.

   

    

    -----

    kind regards/met vriendelijke groeten

   

    Marko Vrgotic

    ActiveVideo

   

    

    On 14/02/2020, 18:44, "Strahil Nikolov" <hunter86_bg@yahoo.com> wrote:

   

        On February 14, 2020 4:19:53 PM GMT+02:00, "Vrgotic, Marko" <M.Vrgotic@activevideo.com> wrote:

        >Good answer Strahil,

        >

        >Thank you, I forgot.

        >

        >Libvirt logs are actually showing the reason why:

        >

        >2020-02-14T12:33:51.847970Z qemu-kvm: -drive

        >file=/var/run/vdsm/storage/054c43fc-1924-4106-9f80-0f2ac62b9886/b019c5fa-8fb5-4bfc-8339-f5b7f590a051/f1ce8ba6-2d3b-4309-bca0-e6a00ce74c75,format=raw,if=none,id=drive-ua-b019c5fa-8fb5-4bfc-8339-f5b7f590a051,serial=b019c5fa-8fb5-4bfc-8339-f5b7f590a051,werror=stop,rerror=stop,cache=none,aio=threads:

        >'serial' is deprecated, please use the corresponding option of

        >'-device' instead

        >Spice-Message: 04:33:51.856: setting TLS option 'CipherString' to

        >'kECDHE+FIPS:kDHE+FIPS:kRSA+FIPS:!eNULL:!aNULL' from

        >/etc/pki/tls/spice.cnf configuration file

        >2020-02-14T12:33:51.863449Z qemu-kvm: warning: CPU(s) not present in

        >any NUMA nodes: CPU 4 [socket-id: 1, core-id: 0, thread-id: 0], CPU 5

        >[socket-id: 1, core-id: 1, thread-id: 0], CPU 6 [socket-id: 1, core-id:

        >2, thread-id: 0], CPU 7 [socket-id: 1, core-id: 3, thread-id: 0], CPU 8

        >[socket-id: 2, core-id: 0, thread-id: 0], CPU 9 [socket-id: 2, core-id:

        >1, thread-id: 0], CPU 10 [socket-id: 2, core-id: 2, thread-id: 0], CPU

        >11 [socket-id: 2, core-id: 3, thread-id: 0], CPU 12 [socket-id: 3,

        >core-id: 0, thread-id: 0], CPU 13 [socket-id: 3, core-id: 1, thread-id:

        >0], CPU 14 [socket-id: 3, core-id: 2, thread-id: 0], CPU 15 [socket-id:

        >3, core-id: 3, thread-id: 0], CPU 16 [socket-id: 4, core-id: 0,

        >thread-id: 0], CPU 17 [socket-id: 4, core-id: 1, thread-id: 0], CPU 18

        >[socket-id: 4, core-id: 2, thread-id: 0], CPU 19 [socket-id: 4,

        >core-id: 3, thread-id: 0], CPU 20 [socket-id: 5, core-id: 0, thread-id:

        >0], CPU 21 [socket-id: 5, core-id: 1, thread-id: 0], CPU 22 [socket-id:

        >5, core-id: 2, thread-id: 0], CPU 23 [socket-id: 5, core-id: 3,

        >thread-id: 0], CPU 24 [socket-id: 6, core-id: 0, thread-id: 0], CPU 25

        >[socket-id: 6, core-id: 1, thread-id: 0], CPU 26 [socket-id: 6,

        >core-id: 2, thread-id: 0], CPU 27 [socket-id: 6, core-id: 3, thread-id:

        >0], CPU 28 [socket-id: 7, core-id: 0, thread-id: 0], CPU 29 [socket-id:

        >7, core-id: 1, thread-id: 0], CPU 30 [socket-id: 7, core-id: 2,

        >thread-id: 0], CPU 31 [socket-id: 7, core-id: 3, thread-id: 0], CPU 32

        >[socket-id: 8, core-id: 0, thread-id: 0], CPU 33 [socket-id: 8,

        >core-id: 1, thread-id: 0], CPU 34 [socket-id: 8, core-id: 2, thread-id:

        >0], CPU 35 [socket-id: 8, core-id: 3, thread-id: 0], CPU 36 [socket-id:

        >9, core-id: 0, thread-id: 0], CPU 37 [socket-id: 9, core-id: 1,

        >thread-id: 0], CPU 38 [socket-id: 9, core-id: 2, thread-id: 0], CPU 39

        >[socket-id: 9, core-id: 3, thread-id: 0], CPU 40 [socket-id: 10,

        >core-id: 0, thread-id: 0], CPU 41 [socket-id: 10, core-id: 1,

        >thread-id: 0], CPU 42 [socket-id: 10, core-id: 2, thread-id: 0], CPU 43

        >[socket-id: 10, core-id: 3, thread-id: 0], CPU 44 [socket-id: 11,

        >core-id: 0, thread-id: 0], CPU 45 [socket-id: 11, core-id: 1,

        >thread-id: 0], CPU 46 [socket-id: 11, core-id: 2, thread-id: 0], CPU 47

        >[socket-id: 11, core-id: 3, thread-id: 0], CPU 48 [socket-id: 12,

        >core-id: 0, thread-id: 0], CPU 49 [socket-id: 12, core-id: 1,

        >thread-id: 0], CPU 50 [socket-id: 12, core-id: 2, thread-id: 0], CPU 51

        >[socket-id: 12, core-id: 3, thread-id: 0], CPU 52 [socket-id: 13,

        >core-id: 0, thread-id: 0], CPU 53 [socket-id: 13, core-id: 1,

        >thread-id: 0], CPU 54 [socket-id: 13, core-id: 2, thread-id: 0], CPU 55

        >[socket-id: 13, core-id: 3, thread-id: 0], CPU 56 [socket-id: 14,

        >core-id: 0, thread-id: 0], CPU 57 [socket-id: 14, core-id: 1,

        >thread-id: 0], CPU 58 [socket-id: 14, core-id: 2, thread-id: 0], CPU 59

        >[socket-id: 14, core-id: 3, thread-id: 0], CPU 60 [socket-id: 15,

        >core-id: 0, thread-id: 0], CPU 61 [socket-id: 15, core-id: 1,

        >thread-id: 0], CPU 62 [socket-id: 15, core-id: 2, thread-id: 0], CPU 63

        >[socket-id: 15, core-id: 3, thread-id: 0]

        >2020-02-14T12:33:51.863475Z qemu-kvm: warning: All CPU(s) up to maxcpus

        >should be described in NUMA config, ability to start up with partial

        >NUMA mappings is obsoleted and will be removed in future

        >2020-02-14T12:33:51.863973Z qemu-kvm: warning: host doesn't support

        >requested feature: CPUID.07H:EDX.md-clear [bit 10]

        >2020-02-14T12:33:51.865066Z qemu-kvm: warning: host doesn't support

        >requested feature: CPUID.07H:EDX.md-clear [bit 10]

        >2020-02-14T12:33:51.865547Z qemu-kvm: warning: host doesn't support

        >requested feature: CPUID.07H:EDX.md-clear [bit 10]

        >2020-02-14T12:33:51.865996Z qemu-kvm: warning: host doesn't support

        >requested feature: CPUID.07H:EDX.md-clear [bit 10]

        >2020-02-14 12:33:51.932+0000: shutting down, reason=failed

        >

        >But then I wonder if the following is related to error above:

        >

        >Before I started upgrading Host by Host, all Hosts in Cluster were

        >showing CPU Family type: " Intel SandyBridge IBRS SSBD MDS Family"

        >After the first Host was upgraded, its CPU Family type changed to

        >"Intel SandyBridge IBRS SSBD Family", and that forced me to downgrade

        >the Cluster family type to "Intel SandyBridge IBRS SSBD Family" in

        >order to be able to Activate the Host back inside the Cluster.

        >Following further, each Host's CPU family type changed after the upgrade

        >from "Intel SandyBridge IBRS SSBD MDS Family" to "Intel SandyBridge IBRS

        >SSBD Family", except the one where the HostedEngine is currently running.

        >

        >Could this possibly be the reason why I cannot Migrate the HostedEngine

        >now and how to solve it?

        >

        >Kindly awaiting your reply.

        >

        >

        >-----

        >kind regards/met vriendelijke groeten

        >

        >Marko Vrgotic

        >Sr. System Engineer @ System Administration

        >

        >ActiveVideo

        >o: +31 (35) 6774131

        >e: m.vrgotic@activevideo.com

        >w: www.activevideo.com <http://www.activevideo.com>

        >


        >

        >On 14/02/2020, 14:01, "Strahil Nikolov" <hunter86_bg@yahoo.com> wrote:

        >

        >On February 14, 2020 2:47:04 PM GMT+02:00, "Vrgotic, Marko"

        ><M.Vrgotic@activevideo.com> wrote:

        >    >Dear oVirt,

        >    >

        > >I have problem migrating HostedEngine, only HA VM server, to other HA

        >    >nodes.

        >    >

        >    >Bit of background story:

        >    >

        >    >  *  We have oVirt SHE 4.3.5

        >    >  *  Three Nodes act as HA pool for SHE

        >    >  *  Node 3 is currently Hosting SHE

        >    >  *  Actions:

        >>*  Put Node1 in Maintenance mode, all VMs were successfully migrated,

        >    >than Upgrade packages, Activate Host – all looks good

        >>*  Put Node2 in Maintenance mode, all VMs were successfully migrated,

        >    >than Upgrade packages, Activate Host – all looks good

        >    >

        >    >Now the problem:

        >    >Try to set  Node3 in Maintenance mode, all VMs were successfully

        >    >migrated, except HostedEngine.

        >    >

        >    >When attempting Migration of the VM HostedEngine, it fails with

        >    >following error message:

        >    >

        >    >2020-02-14 12:33:49,960Z INFO

        >    >[org.ovirt.engine.core.bll.MigrateVmCommand] (default task-265)

        >    >[16f4559e-e262-4c9d-80b4-ec81c2cbf950] Lock Acquired to object

        >>'EngineLock:{exclusiveLocks='[66b6d489-ceb8-486a-951a-355e21f13627=VM]',

        >    >sharedLocks=''}'

        >    >2020-02-14 12:33:49,984Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default

        >    >task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host

        >  >'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: null)

        >    >2020-02-14 12:33:49,984Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default

        >    >task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host

        >  >'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: null)

        >    >2020-02-14 12:33:49,997Z INFO

        >    >[org.ovirt.engine.core.bll.MigrateVmCommand] (default task-265)

        >    >[16f4559e-e262-4c9d-80b4-ec81c2cbf950] Running command:

        >    >MigrateVmCommand internal: false. Entities affected :  ID:

        >  >66b6d489-ceb8-486a-951a-355e21f13627 Type: VMAction group MIGRATE_VM

        >    >with role type USER

        >    >2020-02-14 12:33:50,008Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default

        >    >task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host

        >  >'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)

        >    >2020-02-14 12:33:50,008Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default

        >    >task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] Candidate host

        >  >'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)

        >    >2020-02-14 12:33:50,033Z INFO

        >>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-265)

        >    >[16f4559e-e262-4c9d-80b4-ec81c2cbf950] START, MigrateVDSCommand(

        >>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627',

        >    >srcHost='ovirt-sj-03.ictv.com',

        >    >dstVdsId='9808f434-5cd4-48b5-8bbc-e639e391c6a5',

        >    >dstHost='ovirt-sj-01.ictv.com:54321', migrationMethod='ONLINE',

        >  >tunnelMigration='false', migrationDowntime='0', autoConverge='true',

        >  >migrateCompressed='false', consoleAddress='null', maxBandwidth='40',

        >    >enableGuestEvents='true', maxIncomingMigrations='2',

        >    >maxOutgoingMigrations='2',

        >    >convergenceSchedule='[init=[{name=setDowntime, params=[100]}],

        >>stalling=[{limit=1, action={name=setDowntime, params=[150]}},

        >{limit=2,

        >    >action={name=setDowntime, params=[200]}}, {limit=3,

        >    >action={name=setDowntime, params=[300]}}, {limit=4,

        >    >action={name=setDowntime, params=[400]}}, {limit=6,

        >    >action={name=setDowntime, params=[500]}}, {limit=-1,

        > >action={name=abort, params=[]}}]]', dstQemu='10.210.13.11'}), log id:

        >    >5c126a47

        >    >2020-02-14 12:33:50,036Z INFO

        >  >[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]

        >    >(default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] START,

        >    >MigrateBrokerVDSCommand(HostName = ovirt-sj-03.ictv.com,

        >>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627',

        >    >srcHost='ovirt-sj-03.ictv.com',

        >    >dstVdsId='9808f434-5cd4-48b5-8bbc-e639e391c6a5',

        >    >dstHost='ovirt-sj-01.ictv.com:54321', migrationMethod='ONLINE',

        >  >tunnelMigration='false', migrationDowntime='0', autoConverge='true',

        >  >migrateCompressed='false', consoleAddress='null', maxBandwidth='40',

        >    >enableGuestEvents='true', maxIncomingMigrations='2',

        >    >maxOutgoingMigrations='2',

        >    >convergenceSchedule='[init=[{name=setDowntime, params=[100]}],

        >>stalling=[{limit=1, action={name=setDowntime, params=[150]}},

        >{limit=2,

        >    >action={name=setDowntime, params=[200]}}, {limit=3,

        >    >action={name=setDowntime, params=[300]}}, {limit=4,

        >    >action={name=setDowntime, params=[400]}}, {limit=6,

        >    >action={name=setDowntime, params=[500]}}, {limit=-1,

        > >action={name=abort, params=[]}}]]', dstQemu='10.210.13.11'}), log id:

        >    >a0f776d

        >    >2020-02-14 12:33:50,043Z INFO

        >  >[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]

        >    >(default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] FINISH,

        >    >MigrateBrokerVDSCommand, return: , log id: a0f776d

        >    >2020-02-14 12:33:50,046Z INFO

        >>[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-265)

        >    >[16f4559e-e262-4c9d-80b4-ec81c2cbf950] FINISH, MigrateVDSCommand,

        >    >return: MigratingFrom, log id: 5c126a47

        >    >2020-02-14 12:33:50,052Z INFO

        >>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]

        >  >(default task-265) [16f4559e-e262-4c9d-80b4-ec81c2cbf950] EVENT_ID:

        >  >VM_MIGRATION_START(62), Migration started (VM: HostedEngine, Source:

        >    >ovirt-sj-03.ictv.com, Destination: ovirt-sj-01.ictv.com, User:

        >    >mvrgotic@ictv.com@ictv.com-authz).

        >    >2020-02-14 12:33:52,893Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >>(ForkJoinPool-1-worker-8) [] VM '66b6d489-ceb8-486a-951a-355e21f13627'

        >    >was reported as Down on VDS

        >    >'9808f434-5cd4-48b5-8bbc-e639e391c6a5'(ovirt-sj-01.ictv.com)

        >    >2020-02-14 12:33:52,893Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]

        >    >(ForkJoinPool-1-worker-8) [] START, DestroyVDSCommand(HostName =

        >    >ovirt-sj-01.ictv.com,

        >>DestroyVmVDSCommandParameters:{hostId='9808f434-5cd4-48b5-8bbc-e639e391c6a5',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627', secondsToWait='0',

        >  >gracefully='false', reason='', ignoreNoVm='true'}), log id: 7532a8c0

        >    >2020-02-14 12:33:53,217Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]

        >    >(ForkJoinPool-1-worker-8) [] Failed to destroy VM

        >    >'66b6d489-ceb8-486a-951a-355e21f13627' because VM does not exist,

        >    >ignoring

        >    >2020-02-14 12:33:53,217Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]

        > >(ForkJoinPool-1-worker-8) [] FINISH, DestroyVDSCommand, return: , log

        >    >id: 7532a8c0

        >    >2020-02-14 12:33:53,217Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(ForkJoinPool-1-worker-8) [] VM

        > >'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) was unexpectedly

        >    >detected as 'Down' on VDS

        >>'9808f434-5cd4-48b5-8bbc-e639e391c6a5'(ovirt-sj-01.ictv.com) (expected

        >    >on 'f8d27efb-1527-45f0-97d6-d34a86abaaa2')

        >    >2020-02-14 12:33:53,217Z ERROR

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >  >(ForkJoinPool-1-worker-8) [] Migration of VM 'HostedEngine' to host

        >    >'ovirt-sj-01.ictv.com' failed: VM destroyed during the startup.

        >    >2020-02-14 12:33:53,219Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(ForkJoinPool-1-worker-15) [] VM

        >    >'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) moved from

        >    >'MigratingFrom' --> 'Up'

        >    >2020-02-14 12:33:53,219Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(ForkJoinPool-1-worker-15) [] Adding VM

        >  >'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) to re-run list

        >    >2020-02-14 12:33:53,221Z ERROR

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring]

        >    >(ForkJoinPool-1-worker-15) [] Rerun VM

        >    >'66b6d489-ceb8-486a-951a-355e21f13627'. Called from VDS

        >    >'ovirt-sj-03.ictv.com'

        >    >2020-02-14 12:33:53,259Z INFO

        >  >[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] START,

        >    >MigrateStatusVDSCommand(HostName = ovirt-sj-03.ictv.com,

        >>MigrateStatusVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627'}), log id: 62bac076

        >    >2020-02-14 12:33:53,265Z INFO

        >  >[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH,

        >    >MigrateStatusVDSCommand, return: , log id: 62bac076

        >    >2020-02-14 12:33:53,277Z WARN

        >>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] EVENT_ID:

        >  >VM_MIGRATION_TRYING_RERUN(128), Failed to migrate VM HostedEngine to

        >    >Host ovirt-sj-01.ictv.com . Trying to migrate to another Host.

        >    >2020-02-14 12:33:53,330Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host

        >  >'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: null)

        >    >2020-02-14 12:33:53,330Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host

        >  >'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: null)

        >    >2020-02-14 12:33:53,345Z INFO

        >    >[org.ovirt.engine.core.bll.MigrateVmCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] Running command:

        >    >MigrateVmCommand internal: false. Entities affected :  ID:

        >  >66b6d489-ceb8-486a-951a-355e21f13627 Type: VMAction group MIGRATE_VM

        >    >with role type USER

        >    >2020-02-14 12:33:53,356Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host

        >  >'ovirt-sj-04.ictv.com' ('d98843da-bd81-46c9-9425-065b196ac59d') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)

        >    >2020-02-14 12:33:53,356Z INFO

        >    >[org.ovirt.engine.core.bll.scheduling.SchedulingManager]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] Candidate host

        >  >'ovirt-sj-05.ictv.com' ('e3176705-9fb0-41d6-8721-367dfa2e62bd') was

        >  >filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HA' (correlation

        >    >id: 16f4559e-e262-4c9d-80b4-ec81c2cbf950)

        >    >2020-02-14 12:33:53,380Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] START,

        >    >MigrateVDSCommand(

        >>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627',

        >    >srcHost='ovirt-sj-03.ictv.com',

        >    >dstVdsId='33e8ff78-e396-4f40-b43c-685bfaaee9af',

        >    >dstHost='ovirt-sj-02.ictv.com:54321', migrationMethod='ONLINE',

        >  >tunnelMigration='false', migrationDowntime='0', autoConverge='true',

        >  >migrateCompressed='false', consoleAddress='null', maxBandwidth='40',

        >    >enableGuestEvents='true', maxIncomingMigrations='2',

        >    >maxOutgoingMigrations='2',

        >    >convergenceSchedule='[init=[{name=setDowntime, params=[100]}],

        >>stalling=[{limit=1, action={name=setDowntime, params=[150]}},

        >{limit=2,

        >    >action={name=setDowntime, params=[200]}}, {limit=3,

        >    >action={name=setDowntime, params=[300]}}, {limit=4,

        >    >action={name=setDowntime, params=[400]}}, {limit=6,

        >    >action={name=setDowntime, params=[500]}}, {limit=-1,

        > >action={name=abort, params=[]}}]]', dstQemu='10.210.13.12'}), log id:

        >    >d99059f

        >    >2020-02-14 12:33:53,380Z INFO

        >  >[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] START,

        >    >MigrateBrokerVDSCommand(HostName = ovirt-sj-03.ictv.com,

        >>MigrateVDSCommandParameters:{hostId='f8d27efb-1527-45f0-97d6-d34a86abaaa2',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627',

        >    >srcHost='ovirt-sj-03.ictv.com',

        >    >dstVdsId='33e8ff78-e396-4f40-b43c-685bfaaee9af',

        >    >dstHost='ovirt-sj-02.ictv.com:54321', migrationMethod='ONLINE',

        >  >tunnelMigration='false', migrationDowntime='0', autoConverge='true',

        >  >migrateCompressed='false', consoleAddress='null', maxBandwidth='40',

        >    >enableGuestEvents='true', maxIncomingMigrations='2',

        >    >maxOutgoingMigrations='2',

        >    >convergenceSchedule='[init=[{name=setDowntime, params=[100]}],

        >>stalling=[{limit=1, action={name=setDowntime, params=[150]}},

        >{limit=2,

        >    >action={name=setDowntime, params=[200]}}, {limit=3,

        >    >action={name=setDowntime, params=[300]}}, {limit=4,

        >    >action={name=setDowntime, params=[400]}}, {limit=6,

        >    >action={name=setDowntime, params=[500]}}, {limit=-1,

        > >action={name=abort, params=[]}}]]', dstQemu='10.210.13.12'}), log id:

        >    >6f0483ac

        >    >2020-02-14 12:33:53,386Z INFO

        >  >[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH,

        >    >MigrateBrokerVDSCommand, return: , log id: 6f0483ac

        >    >2020-02-14 12:33:53,388Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] FINISH,

        >    >MigrateVDSCommand, return: MigratingFrom, log id: d99059f

        >    >2020-02-14 12:33:53,391Z INFO

        >>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]

        >    >(EE-ManagedThreadFactory-engine-Thread-377323) [] EVENT_ID:

        >  >VM_MIGRATION_START(62), Migration started (VM: HostedEngine, Source:

        >    >ovirt-sj-03.ictv.com, Destination: ovirt-sj-02.ictv.com, User:

        >    >mvrgotic@ictv.com@ictv.com-authz).

        >    >2020-02-14 12:33:55,108Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher]

        > >(EE-ManagedThreadFactory-engineScheduled-Thread-96) [] Fetched 10 VMs

        >    >from VDS '33e8ff78-e396-4f40-b43c-685bfaaee9af'

        >    >2020-02-14 12:33:55,110Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(EE-ManagedThreadFactory-engineScheduled-Thread-96) [] VM

        >    >'66b6d489-ceb8-486a-951a-355e21f13627' is migrating to VDS

        > >'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com) ignoring

        >    >it in the refresh until migration is done

        >    >2020-02-14 12:33:57,224Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >>(ForkJoinPool-1-worker-15) [] VM

        >'66b6d489-ceb8-486a-951a-355e21f13627'

        >    >was reported as Down on VDS

        >    >'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com)

        >    >2020-02-14 12:33:57,225Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]

        >    >(ForkJoinPool-1-worker-15) [] START, DestroyVDSCommand(HostName =

        >    >ovirt-sj-02.ictv.com,

        >>DestroyVmVDSCommandParameters:{hostId='33e8ff78-e396-4f40-b43c-685bfaaee9af',

        >    >vmId='66b6d489-ceb8-486a-951a-355e21f13627', secondsToWait='0',

        >  >gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dec553e

        >    >2020-02-14 12:33:57,672Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]

        >    >(ForkJoinPool-1-worker-15) [] Failed to destroy VM

        >    >'66b6d489-ceb8-486a-951a-355e21f13627' because VM does not exist,

        >    >ignoring

        >    >2020-02-14 12:33:57,672Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]

        >>(ForkJoinPool-1-worker-15) [] FINISH, DestroyVDSCommand, return: , log

        >    >id: 1dec553e

        >    >2020-02-14 12:33:57,672Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(ForkJoinPool-1-worker-15) [] VM

        > >'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) was unexpectedly

        >    >detected as 'Down' on VDS

        >>'33e8ff78-e396-4f40-b43c-685bfaaee9af'(ovirt-sj-02.ictv.com) (expected

        >    >on 'f8d27efb-1527-45f0-97d6-d34a86abaaa2')

        >    >2020-02-14 12:33:57,672Z ERROR

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >  >(ForkJoinPool-1-worker-15) [] Migration of VM 'HostedEngine' to host

        >    >'ovirt-sj-02.ictv.com' failed: VM destroyed during the startup.

        >    >2020-02-14 12:33:57,674Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(ForkJoinPool-1-worker-8) [] VM

        >    >'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) moved from

        >    >'MigratingFrom' --> 'Up'

        >    >2020-02-14 12:33:57,674Z INFO

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]

        >    >(ForkJoinPool-1-worker-8) [] Adding VM

        >  >'66b6d489-ceb8-486a-951a-355e21f13627'(HostedEngine) to re-run list

        >    >2020-02-14 12:33:57,676Z ERROR

        >    >[org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring]

        >    >(ForkJoinPool-1-worker-8) [] Rerun VM

        >    >'66b6d489-ceb8-48

       

        I am afraid that your suspicions are right.

        What is the host cpu and the HostedEngine's xml?

        Have you checked the xml on any working VM? What cpu flags do the working VMs have?

       

        How to solve - I think I have a solution, but you might not like it.

       

        1. Get current VM xml with virsh

        2. Set all nodes in maintenance 'hosted-engine --set-maintenance  --mode=global'

        3. Stop and undefine the VM on the last working host

        4. Edit the xml from step 1 and add/remove the flags  that are different from the other (working) VMs

        5. Define the HostedEngine on any of the updated hosts

        6. Start the HostedEngine via  virsh.

        7. Try with different cpu flags until the engine starts.

        8. Leave the engine for at least 12 hours, so it will have enough time to update its own configuration.

        9. Remove the maintenance and migrate the engine to the other upgraded host.

        10.  Patch the last HostedEngine's host

        

        I have done this procedure  in order to recover my engine (except changing the cpu flags).

       

        Note: You may hit some hiccups:

        A) virsh alias

        alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'

       

        B) HostedEngine network missing:

        [root@ovirt1 ~]# virsh net-dumpxml vdsm-ovirtmgmt

        <network>

          <name>vdsm-ovirtmgmt</name>

          <uuid>986c27cf-a1ec-44d8-ae61-ee09ce75c886</uuid>

          <forward mode='bridge'/>

          <bridge name='ovirtmgmt'/>

        </network>

       

        Define in xml and add  it via:

        virsh net-define somefile.xml

        C) Missing disk

        Vdsm is creating symlinks like these:

        [root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]# pwd

        /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74

        [root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]# ls -l

        total 20

        lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:05 2c74697a-8bd9-4472-8a98-bf624f3462d5 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/2c74697a-8bd9-4472-8a98-bf624f3462d5

        lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:09 3ec27d6d-921c-4348-b799-f50543b6f919 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/3ec27d6d-921c-4348-b799-f50543b6f919

        lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:09 441abdc8-6cb1-49a4-903f-a1ec0ed88429 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/441abdc8-6cb1-49a4-903f-a1ec0ed88429

        lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:09 94ade632-6ecc-4901-8cec-8e39f3d69cb0 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/94ade632-6ecc-4901-8cec-8e39f3d69cb0

        lrwxrwxrwx. 1 vdsm kvm 129 Feb  2 19:05 fe62a281-51e9-4b23-87b3-2deb52357304 -> /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/fe62a281-51e9-4b23-87b3-2deb52357304

        [root@ovirt1 808423f9-8a5c-40cd-bc9f-2568c85b8c74]#

       

        Just create the link, so it points to the correct destination, and power up again.
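
        For example, recreating the first link from the listing above (paths and the vdsm:kvm ownership taken from that listing) would look like:

        ln -s /rhev/data-center/mnt/glusterSD/gluster1:_engine/808423f9-8a5c-40cd-bc9f-2568c85b8c74/images/2c74697a-8bd9-4472-8a98-bf624f3462d5 \
              /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5
        chown -h vdsm:kvm /var/run/vdsm/storage/808423f9-8a5c-40cd-bc9f-2568c85b8c74/2c74697a-8bd9-4472-8a98-bf624f3462d5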

       

        

        Good  luck !

       

        Best Regards,

        Strahil Nikolov