Hi,

Things are going from bad to worse it seems.

I've created a new VM to be used as a template and installed CentOS 7 on it. I then made a template from this VM and created a new pool based on that template.

When I try to boot one of the VMs from the pool, it fails and logs the following error:

2019-05-17 14:48:01,709+0200 ERROR (vm/f7da02e4) [virt.vm] (vmId='f7da02e4-725c-4c6c-bdd4-9f2cae8b10e4') The vm start process failed (vm:937)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 866, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
    dom = self._connection.defineXML(self._domain.xml)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3743, in defineXML
    if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
libvirtError: XML error: requested USB port 3 not present on USB bus 0
2019-05-17 14:48:01,709+0200 INFO  (vm/f7da02e4) [virt.vm] (vmId='f7da02e4-725c-4c6c-bdd4-9f2cae8b10e4') Changed state to Down: XML error: requested USB port 3 not present on USB bus 0 (code=1) (vm:1675)

The strange thing is that this error was not present when I created the initial master VM.

I get similar errors when I select Q35-type VMs instead of the default machine type.
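
To make sure I understand where this comes from: vdsm hands the generated domain XML to libvirt's defineXML(), and libvirt rejects it because a USB device is addressed on a port that the controller on that bus doesn't expose. Below is a minimal standalone sketch of that failure mode; the controller model and port count are my assumption, not values taken from the XML the engine actually generated for the pool VM.

import libvirt

# Minimal domain XML to illustrate the suspected failure mode: a USB device
# addressed on bus 0, port 3 while the controller on that bus declares fewer
# ports. The qemu-xhci model and ports='2' are assumptions for illustration.
DOMAIN_XML = """
<domain type='kvm'>
  <name>usb-port-repro</name>
  <memory unit='MiB'>256</memory>
  <os><type arch='x86_64' machine='q35'>hvm</type></os>
  <devices>
    <controller type='usb' index='0' model='qemu-xhci' ports='2'/>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='3'/>
    </input>
  </devices>
</domain>
"""

conn = libvirt.open('qemu:///system')
try:
    # If the assumption above is right, this should fail with a libvirtError
    # like "requested USB port 3 not present on USB bus 0".
    conn.defineXML(DOMAIN_XML)
except libvirt.libvirtError as err:
    print(err)
finally:
    conn.close()

If that assumption holds, the interesting question is why the engine generates a USB address beyond the controller's ports for the pool VMs (and for Q35), but not for the original master VM.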

Did your test pool have VMs with USB enabled?

Regards,

Rik

On 5/17/19 10:48 AM, Rik Theys wrote:

Hi Lucie,

On 5/16/19 6:27 PM, Lucie Leistnerova wrote:

Hi Rik,

On 5/14/19 2:21 PM, Rik Theys wrote:

Hi,

It seems VM pools are completely broken since our upgrade to 4.3. Is anybody else also experiencing this issue?

I've tried to reproduce this issue, and I can use pool VMs as expected, no problem. I've tested both a clean install and an upgrade from 4.2.8.7.
Version: ovirt-engine-4.3.3.7-0.1.el7.noarch with ovirt-web-ui-1.5.2-1.el7ev.noarch
That is strange. I will create a new pool to verify whether the problem also occurs with a newly created pool. We are currently seeing this issue with two different pools, both created in August or September of last year. I believe that was on 4.2, but it could still have been 4.1.

Only a single instance from a pool can be used. Afterwards the pool becomes unusable due to a lock not being released. Once ovirt-engine is restarted, another (single) VM from a pool can be used.

What users are running the VMs? What are the permissions?

The users take VMs from the pools using the user portal. They are all members of a group that has the UserRole permission on the pools.

Is each VM run by a different user? Were some VMs already running before the upgrade?

A user can take at most one VM from each pool, so it's possible for a user to have two VMs running (but not two from the same pool). It doesn't matter which user takes a VM from the pool: once one user has taken a VM, no other user can take one. If the user who was able to take a VM powers it down and tries to run a new one, that also fails.

During the upgrade of the host, no VMs were running.

Please provide exact steps.

1. ovirt-engine is restarted.

2. User A takes a VM from the pool and can work.

3. User B can not take a VM from that pool.

4. User A powers off the VM (s)he was using. Once the VM is down, (s)he tries to take a new VM, which now also fails.

It seems the VM pool is locked when the first user takes a VM and the lock is never released.

In our case, there are no prestarted VMs. I can check whether that makes a difference.


Are there any more steps I can take to debug this issue regarding the locks?

Regards,

Rik

I've added my findings to bug 1462236, but I'm no longer sure the issue is the same as the one initially reported.

When the first VM of a pool is started:

2019-05-14 13:26:46,058+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] START, IsVmDuringInitiatingVDSCommand( IsVmDuringInitiatingVDSCommandParameters:{vmId='d8a99676-d520-425e-9974-1b1efe6da8a5'}), log id: 2fb4f7f5
2019-05-14 13:26:46,058+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 2fb4f7f5
2019-05-14 13:26:46,208+02 INFO  [org.ovirt.engine.core.bll.VmPoolHandler] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Lock Acquired to object 'EngineLock:{exclusiveLocks='[d8a99676-d520-425e-9974-1b1efe6da8a5=VM]', sharedLocks=''}'

-> it has acquired a lock (lock1)

2019-05-14 13:26:46,247+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Lock Acquired to object 'EngineLock:{exclusiveLocks='[a5bed59c-d2fe-4fe4-bff7-52efe089ebd6=USER_VM_POOL]', sharedLocks=''}'

-> it has acquired another lock (lock2)

2019-05-14 13:26:46,352+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Running command: AttachUserToVmFromPoolAndRunCommand internal: false. Entities affected :  ID: 4c622213-e5f4-4032-8639-643174b698cc Type: VmPoolAction group VM_POOL_BASIC_OPERATIONS with role type USER
2019-05-14 13:26:46,393+02 INFO  [org.ovirt.engine.core.bll.AddPermissionCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Running command: AddPermissionCommand internal: true. Entities affected :  ID: d8a99676-d520-425e-9974-1b1efe6da8a5 Type: VMAction group MANIPULATE_PERMISSIONS with role type USER
2019-05-14 13:26:46,433+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Succeeded giving user 'a5bed59c-d2fe-4fe4-bff7-52efe089ebd6' permission to Vm 'd8a99676-d520-425e-9974-1b1efe6da8a5'
2019-05-14 13:26:46,608+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] START, IsVmDuringInitiatingVDSCommand( IsVmDuringInitiatingVDSCommandParameters:{vmId='d8a99676-d520-425e-9974-1b1efe6da8a5'}), log id: 67acc561
2019-05-14 13:26:46,608+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 67acc561
2019-05-14 13:26:46,719+02 INFO  [org.ovirt.engine.core.bll.RunVmCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Running command:RunVmCommand internal: true. Entities affected :  ID: d8a99676-d520-425e-9974-1b1efe6da8a5 Type: VMAction group RUN_VM with role type USER
2019-05-14 13:26:46,791+02 INFO  [org.ovirt.engine.core.vdsbroker.UpdateVmDynamicDataVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] START, UpdateVmDynamicDataVDSCommand( UpdateVmDynamicDataVDSCommandParameters:{hostId='null', vmId='d8a99676-d520-425e-9974-1b1efe6da8a5', vmDynamic='org.ovirt.engine.core.common.businessentities.VmDynamic@6db8c94d'}), log id: 2c110e4
2019-05-14 13:26:46,795+02 INFO  [org.ovirt.engine.core.vdsbroker.UpdateVmDynamicDataVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] FINISH, UpdateVmDynamicDataVDSCommand, return: , log id: 2c110e4
2019-05-14 13:26:46,804+02 INFO  [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] START,CreateVDSCommand( CreateVDSCommandParameters:{hostId='eec7ec2b-cae1-4bb9-b933-4dff47a70bdb', vmId='d8a99676-d520-425e-9974-1b1efe6da8a5', vm='VM [stud-c7-1]'}), log id: 71d599f2
2019-05-14 13:26:46,809+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] START, CreateBrokerVDSCommand(HostName = studvirt1, CreateVDSCommandParameters:{hostId='eec7ec2b-cae1-4bb9-b933-4dff47a70bdb', vmId='d8a99676-d520-425e-9974-1b1efe6da8a5', vm='VM [stud-c7-1]'}), log id: 3aa6b5ff
2019-05-14 13:26:46,836+02 INFO  [org.ovirt.engine.core.vdsbroker.builder.vminfo.VmInfoBuildUtils] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Kernel FIPS - Guid: eec7ec2b-cae1-4bb9-b933-4dff47a70bdb fips: false
2019-05-14 13:26:46,903+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] VM <?xml version="1.0" encoding="UTF-8"?><domain type="kvm" xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
[domain xml stripped]
2019-05-14 13:26:46,928+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] FINISH, CreateBrokerVDSCommand, return: , log id: 3aa6b5ff
2019-05-14 13:26:46,932+02 INFO  [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] FINISH, CreateVDSCommand, return: WaitForLaunch, log id: 71d599f2
2019-05-14 13:26:46,932+02 INFO  [org.ovirt.engine.core.bll.RunVmCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Lock freed to object 'EngineLock:{exclusiveLocks='[a5bed59c-d2fe-4fe4-bff7-52efe089ebd6=USER_VM_POOL]', sharedLocks=''}'

-> it has released lock2

2019-05-14 13:26:47,004+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] EVENT_ID: VDS_INITIATED_RUN_VM(506), Trying to restart VM stud-c7-1 on Host studvirt1
2019-05-14 13:26:47,094+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] EVENT_ID: USER_ATTACH_USER_TO_VM_FROM_POOL(316), Attaching User u0045469 to VM stud-c7-1 in VM Pool stud-c7-? was initiated by u0045469@esat.kuleuven.be-authz.
2019-05-14 13:26:47,098+02 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Trying to release exclusive lock which does not exist, lock key: 'a5bed59c-d2fe-4fe4-bff7-52efe089ebd6USER_VM_POOL'

-> it's trying to release the same lock2 as above and failing

2019-05-14 13:26:47,098+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-6) [e3c5745c-e593-4aed-ba67-b173808140e8] Lock freed to object 'EngineLock:{exclusiveLocks='[a5bed59c-d2fe-4fe4-bff7-52efe089ebd6=USER_VM_POOL]', sharedLocks=''}'

-> now it's indicating that it released/freed the lock (lock2)

2019-05-14 13:26:48,518+02 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-3) [e3c5745c-e593-4aed-ba67-b173808140e8] Command 'AttachUserToVmFromPoolAndRun' id: '0148c91d-b053-4dc9-960c-f10bf0d3f343' child commands '[0470802d-09fa-4579-b95b-3d9603feff7b, 47dbfc58-3bae-4229-96eb-d1fc08911237]' executions were completed, status 'SUCCEEDED'
2019-05-14 13:26:49,584+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-14) [e3c5745c-e593-4aed-ba67-b173808140e8] Ending command 'org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand' successfully.
2019-05-14 13:26:49,650+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-14) [e3c5745c-e593-4aed-ba67-b173808140e8] EVENT_ID: USER_ATTACH_USER_TO_VM_FROM_POOL_FINISHED_SUCCESS(318), User u0045469 successfully attached to VM stud-c7-1 in VM Pool stud-c7-?.
2019-05-14 13:26:50,584+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-2) [] EVENT_ID: VM_CONSOLE_DISCONNECTED(168), User <UNKNOWN> got disconnected from VM stud-c7-1.
2019-05-14 13:26:50,585+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-2) [] VM 'd8a99676-d520-425e-9974-1b1efe6da8a5'(stud-c7-1) moved from 'WaitForLaunch' --> 'PoweringUp'


The first lock (the one taken by VmPoolHandler on the pool VM) is never released. It seems this lock should also be released. Maybe it's this lock that should be released the second time, instead of the second lock being released twice?
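
To make the pattern explicit, here is a small standalone sketch (plain Python, not oVirt code; the class only mimics the acquire/free messages, and the lock keys are abbreviated) of what the log above suggests is happening:

import threading

class InMemoryLockManager(object):
    """Toy lock manager mimicking the acquire/free messages in engine.log."""
    def __init__(self):
        self._held = set()
        self._mutex = threading.Lock()

    def acquire(self, key):
        with self._mutex:
            if key in self._held:
                print("Failed to Acquire Lock to object '%s'" % key)
                return False
            self._held.add(key)
            print("Lock Acquired to object '%s'" % key)
            return True

    def release(self, key):
        with self._mutex:
            if key not in self._held:
                print("Trying to release exclusive lock which does not exist: '%s'" % key)
                return
            self._held.discard(key)
            print("Lock freed to object '%s'" % key)

locks = InMemoryLockManager()

# First user takes a VM from the pool:
locks.acquire("d8a99676...=VM")            # lock1: taken by VmPoolHandler, never freed below
locks.acquire("a5bed59c...=USER_VM_POOL")  # lock2: taken by AttachUserToVmFromPoolAndRun
locks.release("a5bed59c...=USER_VM_POOL")  # lock2 freed after CreateVDS
locks.release("a5bed59c...=USER_VM_POOL")  # lock2 "freed" again -> warning, as in the log
# lock1 is still held, so the next attempt to take that pool VM fails:
locks.acquire("d8a99676...=VM")            # -> Failed to Acquire Lock

If that reading is correct, the per-VM lock lives only in the engine's in-memory lock manager, which would also explain why restarting ovirt-engine makes exactly one more VM from the pool usable.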



When we then try to launch another instance from the pool, it fails:

2019-05-14 13:49:32,656+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-11) [55cc0796-4f53-49cd-8739-3b7e7dd2d95b] START, IsVmDuringInitiatingVDSCommand( IsVmDuringInitiatingVDSCommandParameters:{vmId='d8a99676-d520-425e-9974-1b1efe6da8a5'}), log id: 7db2f4fc
2019-05-14 13:49:32,656+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-11) [55cc0796-4f53-49cd-8739-3b7e7dd2d95b] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 7db2f4fc
2019-05-14 13:49:32,688+02 INFO  [org.ovirt.engine.core.bll.VmPoolHandler] (default task-11) [55cc0796-4f53-49cd-8739-3b7e7dd2d95b] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[d8a99676-d520-425e-9974-1b1efe6da8a5=VM]', sharedLocks=''}'
2019-05-14 13:49:32,700+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-11) [55cc0796-4f53-49cd-8739-3b7e7dd2d95b] Lock Acquired to object 'EngineLock:{exclusiveLocks='[a5bed59c-d2fe-4fe4-bff7-52efe089ebd6=USER_VM_POOL]', sharedLocks=''}'
2019-05-14 13:49:32,700+02 WARN  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-11) [55cc0796-4f53-49cd-8739-3b7e7dd2d95b] Validation of action 'AttachUserToVmFromPoolAndRun' failed for user u0045469@esat.kuleuven.be-authz. Reasons: VAR__ACTION__ALLOCATE_AND_RUN,VAR__TYPE__VM_FROM_VM_POOL,ACTION_TYPE_FAILED_NO_AVAILABLE_POOL_VMS
2019-05-14 13:49:32,700+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-11) [55cc0796-4f53-49cd-8739-3b7e7dd2d95b] Lock freed to object 'EngineLock:{exclusiveLocks='[a5bed59c-d2fe-4fe4-bff7-52efe089ebd6=USER_VM_POOL]', sharedLocks=''}'
2019-05-14 13:49:32,706+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot allocate and run VM from VM-Pool. There are no available VMs in the VM-Pool.]


Regards,
Rik


-- 
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07
----------------------------------------------------------------
<<Any errors in spelling, tact or fact are transmission errors>>

Thank you
Lucie
-- 
Lucie Leistnerova
Quality Engineer, QE Cloud, RHVM
Red Hat EMEA

IRC: lleistne @ #rhev-qe
-- 
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07
----------------------------------------------------------------
<<Any errors in spelling, tact or fact are transmission errors>>