Hello,
I have a system that was upgraded from 4.1.7 (with libgfapi not enabled) to 4.2, with 3 hosts in a hyperconverged (HC) configuration. Now I'm trying to enable libgfapi.
Before the change, a CentOS 6 VM booted with a qemu-kvm command line of this type:
-drive
file=/rhev/data-center/mnt/glusterSD/ovirt01.localdomain.local:data/190f4096-003e-4908-825a-6c231e60276d/images/02731d5e-c222-4697-8f1f-d26a6a23ec79/1836df76-835b-4625-9ce8-0856176dc30c,format=raw,if=none,id=drive-virtio-disk0,serial=02731d5e-c222-4697-8f1f-d26a6a23ec79,cache=none,werror=stop,rerror=stop,aio=thread
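For comparison, my understanding is that with libgfapi enabled, qemu-kvm should access the disk through a gluster:// URL instead of the FUSE mount path. A hypothetical sketch of what I would expect the drive line to become (the exact server, port, and options may differ):

```
-drive
file=gluster://ovirt01.localdomain.local/data/190f4096-003e-4908-825a-6c231e60276d/images/02731d5e-c222-4697-8f1f-d26a6a23ec79/1836df76-835b-4625-9ce8-0856176dc30c,format=raw,if=none,id=drive-virtio-disk0,serial=02731d5e-c222-4697-8f1f-d26a6a23ec79,cache=none,werror=stop,rerror=stop,aio=native
```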
I shut down the VM named centos6, then configured the engine:
[root@ovengine log]# engine-config -s LibgfApiSupported=true
Please select a version:
1. 3.6
2. 4.0
3. 4.1
4. 4.2
4
[root@ovengine log]# engine-config -g LibgfApiSupported
LibgfApiSupported: false version: 3.6
LibgfApiSupported: false version: 4.0
LibgfApiSupported: false version: 4.1
LibgfApiSupported: true version: 4.2
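(As an aside, I believe the same setting can also be applied non-interactively by passing the compatibility version on the command line, something like the following — not verified on my setup:

```
engine-config -s LibgfApiSupported=true --cver=4.2
```
)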
Then I restarted the engine:
[root@ovengine log]# systemctl restart ovirt-engine
[root@ovengine log]#
I reconnected to the web admin portal and powered on the centos6 VM, but I get "Failed to run VM" on all 3 configured hosts:
Jan 1, 2018, 11:53:35 PM Failed to run VM centos6 (User:
admin@internal-authz).
Jan 1, 2018, 11:53:35 PM Failed to run VM centos6 on Host
ovirt02.localdomain.local.
Jan 1, 2018, 11:53:35 PM Failed to run VM centos6 on Host
ovirt03.localdomain.local.
Jan 1, 2018, 11:53:35 PM Failed to run VM centos6 on Host
ovirt01.localdomain.local.
In engine.log I see:
2018-01-01 23:53:34,996+01 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] Failed in 'CreateBrokerVDS' method,
for vds: 'ovirt01.localdomain.local'; host: 'ovirt01.localdomain.local':
1
2018-01-01 23:53:34,996+01 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] Command
'CreateBrokerVDSCommand(HostName = ovirt01.localdomain.local,
CreateVDSCommandParameters:{hostId='e5079118-1147-469e-876f-e20013276ece',
vmId='64da5593-1022-4f66-ae3f-b273deda4c22', vm='VM [centos6]'})'
execution
failed: 1
2018-01-01 23:53:34,996+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] FINISH, CreateBrokerVDSCommand, log
id: e3bbe56
2018-01-01 23:53:34,996+01 ERROR
[org.ovirt.engine.core.vdsbroker.CreateVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] Failed to create VM: 1
2018-01-01 23:53:34,997+01 ERROR
[org.ovirt.engine.core.vdsbroker.CreateVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] Command 'CreateVDSCommand(
CreateVDSCommandParameters:{hostId='e5079118-1147-469e-876f-e20013276ece',
vmId='64da5593-1022-4f66-ae3f-b273deda4c22', vm='VM [centos6]'})'
execution
failed: java.lang.ArrayIndexOutOfBoundsException: 1
2018-01-01 23:53:34,997+01 INFO
[org.ovirt.engine.core.vdsbroker.CreateVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] FINISH, CreateVDSCommand, return:
Down, log id: ab299ce
2018-01-01 23:53:34,997+01 WARN [org.ovirt.engine.core.bll.RunVmCommand]
(EE-ManagedThreadFactory-engine-Thread-2885)
[8d7c68c6-b236-4e76-b7b2-f000e2b07425] Failed to run VM 'centos6':
EngineException: java.lang.RuntimeException:
java.lang.ArrayIndexOutOfBoundsException: 1 (Failed with error ENGINE and
code 5001)
The full engine.log file is here:
https://drive.google.com/file/d/1UZ9dWnGrBaFVnfx1E_Ch52CtYDDtzT3p/view?us...
The VM fails to start on all 3 hosts, but I don't see any particular error on them; e.g. ovirt01's vdsm.log.1.xz is here:
https://drive.google.com/file/d/1yIlKtRtvftJVzWNlzV3WhJ3DaP4ksQvw/view?us...
The storage domain where the VM disk resides seems OK:
[root@ovirt01 vdsm]# gluster volume info data
Volume Name: data
Type: Replicate
Volume ID: 2238c6db-48c5-4071-8929-879cedcf39bf
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt01.localdomain.local:/gluster/brick2/data
Brick2: ovirt02.localdomain.local:/gluster/brick2/data
Brick3: ovirt03.localdomain.local:/gluster/brick2/data (arbiter)
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root@ovirt01 vdsm]#
[root@ovirt01 vdsm]# gluster volume heal data info
Brick ovirt01.localdomain.local:/gluster/brick2/data
Status: Connected
Number of entries: 0
Brick ovirt02.localdomain.local:/gluster/brick2/data
Status: Connected
Number of entries: 0
Brick ovirt03.localdomain.local:/gluster/brick2/data
Status: Connected
Number of entries: 0
[root@ovirt01 vdsm]#
I followed the instructions here:
https://www.ovirt.org/develop/release-management/features/storage/gluster...
Is there any other action to take on the host side?
The cluster and DC are already at compatibility version 4.2.
Thanks
Gianluca