
As IL is out for Independence Day, can anyone else merge? https://gerrit.ovirt.org/#/c/99845/
On Thu, May 9, 2019 at 11:30 AM Dafna Ron <dron@redhat.com> wrote:
Thanks, Andrej. I will follow the patch and update. Dafna
On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir <akrejcir@redhat.com> wrote:
Hi,
Ok, I have posted the reverting patch: https://gerrit.ovirt.org/#/c/99845/
I'm still investigating what the problem is. Sorry for the delay; we had a public holiday yesterday.
Andrej
On Thu, 9 May 2019 at 11:20, Dafna Ron <dron@redhat.com> wrote:
Hi,
I have not heard back on this issue and ovirt-engine has been broken for the past 3 days.
As this does not seem to be a simple debug and fix, I suggest reverting the patch and investigating later.
Thanks, Dafna
On Wed, May 8, 2019 at 9:42 AM Dafna Ron <dron@redhat.com> wrote:
Any news?
Thanks, Dafna
On Tue, May 7, 2019 at 4:57 PM Dafna Ron <dron@redhat.com> wrote:
Thanks for the quick reply and investigation. Please update me if I can help any further, and let me know if you find the cause and have a patch. Note that the ovirt-engine project is broken; if we cannot find the cause relatively fast, we should consider reverting the patch to allow a new package to be built in CQ with the other changes that were submitted.
Thanks, Dafna
On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir <akrejcir@redhat.com> wrote:
After running a few OSTs manually, it seems that the patch is the cause. Investigating...
On Tue, 7 May 2019 at 14:58, Andrej Krejcir <akrejcir@redhat.com> wrote:
Hi,

The issue is probably not caused by the patch.

This log line means that the VM does not exist in the DB:

2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND

I will investigate more why the VM is missing.

On Tue, 7 May 2019 at 14:07, Dafna Ron <dron@redhat.com> wrote:
Hi,

We are failing test upgrade_hosts on upgrade-from-release-suite-master.
From the logs I can see that we are calling migrate vm when we have only one host, and the vm seems to have been shut down before the maintenance call is issued.

Can you please look into this?

The suspected patch reported as root cause by CQ is:
https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms command and use it for host maintenance

Logs are found here:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact...

I can see the issue is vm migration when putting the host in maintenance:

2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
2019-05-07 06:02:04,242-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running command: ActivateVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before acquiring lock in order to prevent monitoring for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock acquired, from now a monitoring of host will be skipped for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,252-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', status='Unassigned', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2c8aa211
2019-05-07 06:02:04,256-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
2019-05-07 06:02:04,261-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate host finished. Lock released. Monitoring can run now for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,265-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVENT_ID: VDS_ACTIVATE(16), Activation of host lago-upgrade-from-release-suite-master-host-0 initiated by admin@internal-authz.
2019-05-07 06:02:04,266-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
2019-05-07 06:02:04,484-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [05592db2-f859-487b-b779-4b32eec5bab3] Host 'lago-upgrade-from-release-suite-master-host-0' failed to move to maintenance mode. Upgrade process is terminated.

I can see there was only one vm running:

drwxrwxr-x. 2 dron dron 1024 May 7 11:49 qemu
[dron@dron post-004_basic_sanity.py]$ ls -l lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
total 6
-rw-rw-r--. 1 dron dron 4466 May 7 10:12 vm-with-iface.log

And I can see that there was an attempt to terminate it, with an error that it does not exist:

stroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 24278e9b
2019-05-07 06:01:41,082-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 78bba2f8
2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVDSCommand, return: , log id: 78bba2f8
2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVmVDSCommand, return: , log id: 24278e9b
2019-05-07 06:01:42,094-04 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-4) [] VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' was reported as Down on VDS '38e1379b-c3b6-4a2e-91df-d1f346e414a9'(lago-upgrade-from-release-suite-master-host-0)
2019-05-07 06:01:42,096-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dbd31eb
2019-05-07 06:01:42,114-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] Failed to destroy VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' because VM does not exist, ignoring
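For illustration only, below is a minimal Java sketch of the failure mode described above. It is not the actual oVirt engine code: the class, map, and method names (MigrateValidationSketch, VM_DB, validateMigrateMultipleVms) are hypothetical, and only the MigrateMultipleVms action name, the ACTION_TYPE_FAILED_VMS_NOT_FOUND reason, the VM id, and the event timing come from the logs. The assumed sequence is: host maintenance collects the VMs to migrate, but the only running VM is destroyed (06:01:42) before the MigrateMultipleVms validation runs (06:02:04), so the lookup by id finds nothing and the whole maintenance, and therefore the host upgrade, fails.

import java.util.*;

// Hypothetical illustration only -- not the actual oVirt engine code.
public class MigrateValidationSketch {

    // Stand-in for the engine's VM table, keyed by VM id.
    private static final Map<UUID, String> VM_DB = new HashMap<>();

    // Returns the validation failure reasons (empty list means valid).
    static List<String> validateMigrateMultipleVms(List<UUID> vmIds) {
        List<String> reasons = new ArrayList<>();
        boolean allFound = vmIds.stream().allMatch(VM_DB::containsKey);
        if (vmIds.isEmpty() || !allFound) {
            // Any requested VM id missing from the DB fails the whole action.
            reasons.add("ACTION_TYPE_FAILED_VMS_NOT_FOUND");
        }
        return reasons;
    }

    public static void main(String[] args) {
        UUID vmId = UUID.fromString("dfbd75e2-a9cb-4fca-8788-a16954db4abf");
        VM_DB.put(vmId, "vm-with-iface");

        // Host maintenance snapshots the VMs running on the host...
        List<UUID> vmsToMigrate = List.of(vmId);

        // ...but the VM is shut down and removed before validation runs
        // (compare the DestroyVDSCommand at 06:01:42 with the
        // MigrateMultipleVms validation failure at 06:02:04 above).
        VM_DB.remove(vmId);

        // Prints [ACTION_TYPE_FAILED_VMS_NOT_FOUND], mirroring engine.log.
        System.out.println(validateMigrateMultipleVms(vmsToMigrate));
    }
}

If that is indeed what happens, skipping VM ids that are no longer in the DB rather than failing the whole action would be one way to make the maintenance flow tolerant of this race, but that is only a guess pending Andrej's investigation; the posted revert (https://gerrit.ovirt.org/#/c/99845/) unblocks the change queue in the meantime.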