[ OST Failure Report ] [ oVirt Master (vdsm) ] [ 10-05-2019 ] [ 004_basic_sanity.vm_run ]
by Dafna Ron
Hi,
We are failing upgrade-from-release-suite.el7.x86_64 / 004_basic_sanity.vm_run.
The issue is an unexpected exception in vdsm.
The root cause based on CQ is this patch:
https://gerrit.ovirt.org/#/c/99854/ - yum: Allow downloading only 'vdsm' package
Logs can be found here:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14079/artif...
Marcin, can you please take a look?
error:
vdsm:
2019-05-10 05:06:38,329-0400 ERROR (jsonrpc/0) [api] FINISH create error=Unexpected exception (api:131)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 245, in create
    raise exception.UnexpectedError()
UnexpectedError: Unexpected exception
2019-05-10 05:06:38,330-0400 INFO (jsonrpc/0) [api.virt] FINISH create return={'status': {'message': 'Unexpected exception', 'code': 16}} from=::ffff:192.168.201.2,39486, flow_id=83e4f0c4-a39a-45aa-891f-4022765e1a87, vmId=397013a0-b7d4-4d38-86c3-e944cebd75a7 (api:54)
2019-05-10 05:06:38,331-0400 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.create failed (error 16) in 0.00 seconds (__init__:312)
2019-05-10 05:06:38,629-0400 INFO (jsonrpc/7) [vdsm.api] FINISH getStorageDomainInfo return={'info': {'uuid': '24498263-8985-46a9-9161-f65ee776cb7f', 'vgMetadataDevice': '360014051f333158820a4cc6ab3be5b55', 'vguuid': 'rXTLNc-JwEz-0dtL-Z2EC-ASqn-gzkg-jDJa1b', 'metadataDevice': '360014051f333158820a4cc6ab3be5b55', 'state': 'OK', 'version': '4', 'role': 'Master', 'type': 'ISCSI', 'class': 'Data', 'pool': ['3e330122-c587-4078-b9a4-13dbb697c5cc'], 'name': 'iscsi'}} from=::ffff:192.168.201.2,39486, flow_id=3f76d32a, task_id=084e7ee5-3c43-4905-833f-9a522043a862 (api:54)
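For context, the traceback above comes from vdsm's API wrapper: each verb runs through a `method` wrapper in vdsm/common/api.py that catches any exception and converts it into an error response, which is what the engine sees as "error 16". A minimal sketch of that pattern (the decorator below is simplified and illustrative, not vdsm's actual implementation; only the code value 16 and the 'Unexpected exception' message are taken from the log):

```python
import functools

GENERAL_EXCEPTION_CODE = 16  # 'Unexpected exception' in the log above


def api_method(func):
    """Simplified stand-in for the wrapper seen at api.py line 124."""
    @functools.wraps(func)
    def method(*args, **kwargs):
        try:
            ret = func(*args, **kwargs)
        except Exception:
            # Any uncaught error becomes a generic error response.
            return {'status': {'code': GENERAL_EXCEPTION_CODE,
                               'message': 'Unexpected exception'}}
        return ret
    return method


@api_method
def create(vm_params):
    # Stands in for API.py line 245: raise exception.UnexpectedError()
    raise RuntimeError("boom")
```

With this sketch, `create({})` returns `{'status': {'code': 16, 'message': 'Unexpected exception'}}`, matching the `FINISH create return=...` line in the vdsm log.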
engine:
2019-05-10 05:06:38,328-04 ERROR [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] VDS::create Failed creating vm 'vm0' in vds = '566eda0f-3ef0-4791-a618-1e649af4b0be' error = 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateBrokerVDS, error = Unexpected exception, code = 16'
2019-05-10 05:06:38,328-04 INFO [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] FINISH, CreateVDSCommand, return: Down, log id: 3cb7c62d
2019-05-10 05:06:38,328-04 WARN [org.ovirt.engine.core.bll.RunVmCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] Failed to run VM 'vm0': EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateBrokerVDS, error = Unexpected exception, code = 16 (Failed with error unexpected and code 16)
2019-05-10 05:06:38,328-04 INFO [org.ovirt.engine.core.bll.RunVmOnceCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] Lock freed to object 'EngineLock:{exclusiveLocks='[397013a0-b7d4-4d38-86c3-e944cebd75a7=VM]', sharedLocks=''}'
2019-05-10 05:06:38,329-04 INFO [org.ovirt.engine.core.bll.RunVmCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] Trying to rerun VM 'vm0'
2019-05-10 05:06:38,373-04 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM vm0 on Host lago-upgrade-from-release-suite-master-host-0.
2019-05-10 05:06:38,381-04 INFO [org.ovirt.engine.core.bll.RunVmOnceCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] Lock Acquired to object 'EngineLock:{exclusiveLocks='[397013a0-b7d4-4d38-86c3-e944cebd75a7=VM]', sharedLocks=''}'
2019-05-10 05:06:38,393-04 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] START, IsVmDuringInitiatingVDSCommand( IsVmDuringInitiatingVDSCommandParameters:{vmId='397013a0-b7d4-4d38-86c3-e944cebd75a7'}), log id: 96be328
2019-05-10 05:06:38,393-04 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 96be328
2019-05-10 05:06:38,403-04 WARN [org.ovirt.engine.core.bll.RunVmOnceCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] Validation of action 'RunVmOnce' failed for user admin@internal-authz. Reasons: VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,SCHEDULING_NO_HOSTS
2019-05-10 05:06:38,404-04 INFO [org.ovirt.engine.core.bll.RunVmOnceCommand] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] Lock freed to object 'EngineLock:{exclusiveLocks='[397013a0-b7d4-4d38-86c3-e944cebd75a7=VM]', sharedLocks=''}'
2019-05-10 05:06:38,413-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [83e4f0c4-a39a-45aa-891f-4022765e1a87] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM vm0 (User: admin@internal-authz).
2019-05-10 05:06:38,462-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [There are no hosts to use. Check that the cluster contains at least one host in Up state.]
2019-05-10 05:06:38,466-04 INFO [org.ovirt.engine.core.bll.ProcessDownVmCommand] (EE-ManagedThreadFactory-engine-Thread-35) [3f40b3e8] Running command: ProcessDownVmCommand internal: true.
Thanks,
Dafna
Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]
by Dafna Ron
Hi,
We are failing test upgrade_hosts on upgrade-from-release-suite-master.
From the logs I can see that we are calling migrate VM when we have only one host, and the VM seems to have been shut down before the maintenance call was issued.
Can you please look into this?
The suspected patch reported as root cause by CQ is:
https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms command and use it for host maintenance
Logs can be found here:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artif...
I can see the issue is VM migration when putting the host into maintenance:
2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
2019-05-07 06:02:04,242-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running command: ActivateVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before acquiring lock in order to prevent monitoring for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock acquired, from now a monitoring of host will be skipped for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,252-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', status='Unassigned', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2c8aa211
2019-05-07 06:02:04,256-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
2019-05-07 06:02:04,261-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate host finished. Lock released. Monitoring can run now for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,265-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVENT_ID: VDS_ACTIVATE(16), Activation of host lago-upgrade-from-release-suite-master-host-0 initiated by admin@internal-authz.
2019-05-07 06:02:04,266-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
2019-05-07 06:02:04,484-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [05592db2-f859-487b-b779-4b32eec5bab3] Host 'lago-upgrade-from-release-suite-master-host-0' failed to move to maintenance mode. Upgrade process is terminated.
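The sequence above is consistent with a check-then-act race: the maintenance flow decides which VMs to migrate, but the only VM has already been shut down by the time MigrateMultipleVms validates, so validation fails with ACTION_TYPE_FAILED_VMS_NOT_FOUND and the whole maintenance command fails. A minimal sketch of the suspected race (all function and variable names below are hypothetical, not the engine's actual code; only the reason string is taken from the log):

```python
# Emulates the engine's view of which VMs are running on the host.
running_vms = {}  # vm_id -> vm name


def validate_migrate_multiple(vm_ids):
    """Mirrors ACTION_TYPE_FAILED_VMS_NOT_FOUND: fail if any VM is gone."""
    missing = [v for v in vm_ids if v not in running_vms]
    if missing:
        return False, 'ACTION_TYPE_FAILED_VMS_NOT_FOUND'
    return True, None


def maintenance_flow(host_vm_ids):
    # Step 1: the maintenance command snapshots the VMs it must migrate.
    to_migrate = list(host_vm_ids)
    # Step 2 (the race): the VM is destroyed before migration validates.
    running_vms.clear()
    # Step 3: MigrateMultipleVms validation now fails, so maintenance fails.
    return validate_migrate_multiple(to_migrate)


running_vms['dfbd75e2'] = 'vm-with-iface'
ok, reason = maintenance_flow(list(running_vms))
# ok is False, reason is 'ACTION_TYPE_FAILED_VMS_NOT_FOUND'
```

If this is the actual mechanism, the fix would be to re-resolve the VM list (or tolerate already-down VMs) at validation time rather than failing the maintenance command outright.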
I can see there was only one VM running:
drwxrwxr-x. 2 dron dron 1024 May 7 11:49 qemu
[dron@dron post-004_basic_sanity.py]$ ls -l lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
total 6
-rw-rw-r--. 1 dron dron 4466 May 7 10:12 vm-with-iface.log
and I can see that there was an attempt to terminate it, which failed with an error saying the VM does not exist:
stroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 24278e9b
2019-05-07 06:01:41,082-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 78bba2f8
2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVDSCommand, return: , log id: 78bba2f8
2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVmVDSCommand, return: , log id: 24278e9b
2019-05-07 06:01:42,094-04 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-4) [] VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' was reported as Down on VDS '38e1379b-c3b6-4a2e-91df-d1f346e414a9'(lago-upgrade-from-release-suite-master-host-0)
2019-05-07 06:01:42,096-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dbd31eb
2019-05-07 06:01:42,114-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] Failed to destroy VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' because VM does not exist, ignoring
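Note the difference between the two destroy attempts: the first runs with ignoreNoVm='false' while the VM still exists, and the second, triggered after the monitor reports the VM as Down, runs with ignoreNoVm='true', so the "VM does not exist" condition is logged and ignored rather than treated as a failure. A tiny sketch of that flag's behavior (names are illustrative, not the engine's real API):

```python
class NoSuchVm(Exception):
    """Raised when asked to destroy a VM that is not on the host."""


def destroy_on_host(running, vm_id, ignore_no_vm=False):
    # Emulates the ignoreNoVm flag seen in DestroyVmVDSCommandParameters:
    # with ignore_no_vm=True a missing VM is a no-op, not an error.
    if vm_id not in running:
        if ignore_no_vm:
            return 'ignored'  # "... because VM does not exist, ignoring"
        raise NoSuchVm(vm_id)
    running.remove(vm_id)
    return 'destroyed'


running = {'dfbd75e2-a9cb-4fca-8788-a16954db4abf'}
first = destroy_on_host(running, 'dfbd75e2-a9cb-4fca-8788-a16954db4abf')
second = destroy_on_host(running, 'dfbd75e2-a9cb-4fca-8788-a16954db4abf',
                         ignore_no_vm=True)
```

Here `first` is 'destroyed' and `second` is 'ignored', mirroring the last two log lines above.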