After running a few OSTs manually, it seems that the patch is the cause after all.
Investigating...
On Tue, 7 May 2019 at 14:58, Andrej Krejcir <akrejcir@redhat.com> wrote:
Hi,
The issue is probably not caused by the patch.
This log line means that the VM does not exist in the DB:
2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
I will investigate further why the VM is missing.
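For context, here is a minimal, self-contained sketch of the kind of check that produces this failure. It is not the actual MigrateMultipleVmsCommand code; the class, field and method names are stand-ins. The point is that the command resolves the requested VM ids against the DB at validation time, so a VM destroyed just before host maintenance starts makes the whole validation fail:

import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy model (not engine code) of a validation that fails with
// ACTION_TYPE_FAILED_VMS_NOT_FOUND when a requested VM has no DB row.
public class MigrateValidationSketch {

    // Stand-in for the engine's VM tables.
    // Empty here: the VM was already destroyed and removed.
    static final Map<UUID, String> VMS_IN_DB = Map.of();

    static boolean validateMigrateMultipleVms(List<UUID> requestedVmIds) {
        for (UUID id : requestedVmIds) {
            if (!VMS_IN_DB.containsKey(id)) {
                // Mirrors the WARN line quoted above.
                System.out.println("Validation of action 'MigrateMultipleVms' failed. "
                        + "Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND");
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        UUID vmId = UUID.fromString("dfbd75e2-a9cb-4fca-8788-a16954db4abf");
        validateMigrateMultipleVms(List.of(vmId)); // fails: the VM is gone from the DB
    }
}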
On Tue, 7 May 2019 at 14:07, Dafna Ron <dron@redhat.com> wrote:
> Hi,
>
> We are failing test upgrade_hosts on upgrade-from-release-suite-master.
> From the logs I can see that we are calling VM migration when we have only
> one host, and the VM seems to have been shut down before the maintenance
> call was issued.
>
> Can you please look into this?
>
> The patch reported by CQ as the suspected root cause is:
>
>
> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms command and use it for host maintenance
>
>
> logs are found here:
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artif...
>
>
> I can see the issue is VM migration when putting the host into maintenance:
>
>
> 2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
> 2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
> 2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
> 2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
> 2019-05-07 06:02:04,242-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running command: ActivateVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
> 2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before acquiring lock in order to prevent monitoring for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock acquired, from now a monitoring of host will be skipped for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,252-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', status='Unassigned', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2c8aa211
> 2019-05-07 06:02:04,256-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
> 2019-05-07 06:02:04,261-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate host finished. Lock released. Monitoring can run now for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,265-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVENT_ID: VDS_ACTIVATE(16), Activation of host lago-upgrade-from-release-suite-master-host-0 initiated by admin@internal-authz.
> 2019-05-07 06:02:04,266-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
> 2019-05-07 06:02:04,484-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [05592db2-f859-487b-b779-4b32eec5bab3] Host 'lago-upgrade-from-release-suite-master-host-0' failed to move to maintenance mode. Upgrade process is terminated.
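The ordering above reads as a migrate-then-rollback flow: the move to maintenance is aborted when migration fails, and the host is re-activated. A toy sketch of that control flow, inferred from the log ordering rather than taken from the engine sources:

// Toy sketch (inferred from the log ordering, not engine code) of the
// maintenance flow: switch the host to maintenance only if all of its
// running VMs migrate; otherwise report VDS_MAINTENANCE_FAILED and
// re-activate the host, which is why ActivateVdsCommand runs right after
// the ERROR lines above.
public class MaintenanceFlowSketch {

    static boolean migrateAllVms() {
        // Stands in for MigrateMultipleVms; false models the
        // ACTION_TYPE_FAILED_VMS_NOT_FOUND validation failure.
        return false;
    }

    static void activateVds(String host) {
        System.out.println("VDS_ACTIVATE(16): Activation of host " + host + " initiated.");
    }

    static void maintenanceVds(String host) {
        if (!migrateAllVms()) {
            System.out.println("Failed to migrate one or more VMs.");
            System.out.println("VDS_MAINTENANCE_FAILED(17): Failed to switch Host "
                    + host + " to Maintenance mode.");
            activateVds(host); // roll back to the Up state
            return;
        }
        System.out.println("Host " + host + " switched to Maintenance.");
    }

    public static void main(String[] args) {
        maintenanceVds("lago-upgrade-from-release-suite-master-host-0");
    }
}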
>
> I can see there was only one VM running:
>
>
> [dron@dron post-004_basic_sanity.py]$ ls -l lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
> total 6
> -rw-rw-r--. 1 dron dron 4466 May 7 10:12 vm-with-iface.log
>
> and I can see that there was an attempt to terminate it, with an error that
> the VM does not exist:
>
>
> ...DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 24278e9b
> 2019-05-07 06:01:41,082-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 78bba2f8
> 2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVDSCommand, return: , log id: 78bba2f8
> 2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVmVDSCommand, return: , log id: 24278e9b
> 2019-05-07 06:01:42,094-04 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-4) [] VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' was reported as Down on VDS '38e1379b-c3b6-4a2e-91df-d1f346e414a9'(lago-upgrade-from-release-suite-master-host-0)
> 2019-05-07 06:01:42,096-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dbd31eb
> 2019-05-07 06:01:42,114-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] Failed to destroy VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' because VM does not exist, ignoring
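The last two entries show the ignoreNoVm flag at work: the monitoring-triggered destroy call is sent with ignoreNoVm='true', so the "VM does not exist" reply is logged and swallowed rather than treated as an error. A small sketch of the assumed semantics (hypothetical code, not the engine's DestroyVDSCommand):

// Hypothetical sketch of the ignoreNoVm semantics seen above (not the
// engine's DestroyVDSCommand): with the flag set, a missing VM is a no-op.
public class DestroyVmSketch {

    static void destroy(String vmId, boolean vmExistsOnHost, boolean ignoreNoVm) {
        if (!vmExistsOnHost) {
            if (ignoreNoVm) {
                // Mirrors the final log line: logged and ignored.
                System.out.println("Failed to destroy VM '" + vmId
                        + "' because VM does not exist, ignoring");
                return;
            }
            throw new IllegalStateException("VM does not exist: " + vmId);
        }
        System.out.println("VM '" + vmId + "' destroyed");
    }

    public static void main(String[] args) {
        // Models the second, monitoring-triggered call with ignoreNoVm='true'.
        destroy("dfbd75e2-a9cb-4fca-8788-a16954db4abf", false, true);
    }
}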