Hi,
I'm having trouble getting my Nvidia vGPU to work on a VM.
Setup: RHEV 4.2 on RHEL 7.5, Tesla M60 (switched to graphics mode).
I'm using the NVIDIA-GRID-RHEL-7.5-410.92-410.91-412.16.zip package
from Nvidia.
On the hypervisor, I've installed the
NVIDIA-vGPU-rhel-7.5-410.91.x86_64 rpm. vfio kernel modules are
loaded, nvidia-smi shows the card, and I can see all the vGPUs via
vdsm-client.
I've created a CentOS 7.4 VM and added a 'B' type vGPU instance in
'custom properties'. I've configured gridd.conf to point to the
license server and it reports picking up a license in
/var/log/messages. I installed the driver via the .run file
(NVIDIA-Linux-x86_64-410.92-grid.run). The nvidia kernel module is
loaded, but so is the 'qxl' paravirtual driver.
lspci reports:
00:02.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual
graphic card (rev 04)
00:07.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla
M60] (rev a1)
The Xorg.0.log reports:
[ 1622.212] (--) PCI:*(0:0:2:0) 1b36:0100:1af4:1100 rev 4, Mem @
0xf0000000/134217728, 0xfb000000/8388608, 0xfb870000/8192, I/O @
0x0000c100/32, BIOS @ 0x????????/65536
[ 1622.212] (--) PCI: (0:0:7:0) 10de:13f2:10de:1177 rev 161, Mem @
0xfa000000/16777216, 0xd0000000/268435456, 0xf8000000/33554432, I/O @
0x0000c000/128, BIOS @ 0x????????/131072
[ 1622.212] (II) LoadModule: "glx"
[ 1622.212] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
[ 1622.213] (II) Module glx: vendor="X.Org Foundation"
[ 1622.213] compiled for 1.19.3, module version = 1.0.0
[ 1622.213] ABI class: X.Org Server Extension, version 10.0
[ 1622.213] (II) LoadModule: "nvidia"
[ 1622.214] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[ 1622.214] (II) Module nvidia: vendor="NVIDIA Corporation"
[ 1622.214] compiled for 4.0.2, module version = 1.0.0
[ 1622.214] Module class: X.Org Video Driver
[ 1622.214] (II) NVIDIA dlloader X Driver 410.92 Thu Dec 20 04:48:17 CST 2018
[ 1622.214] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 1622.214] (++) using VT number 1
[ 1622.214] (EE) No devices detected.
[ 1622.214] (EE)
Fatal server error:
[ 1622.214] (EE) no screens found(EE)
[ 1622.214] (EE)
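For anyone skimming the log: as far as I understand it, the asterisk in the PCI probe lines marks the device Xorg treats as primary. Here that is the QXL card (vendor ID 1b36), not the NVIDIA one (10de). Below is a minimal Python sketch for pulling that out of a log. The function name parse_pci_lines and the shortened sample lines are made up for illustration and are not part of any tool.

```python
import re

# Matches Xorg's PCI probe lines, e.g.
# "(--) PCI:*(0:0:2:0) 1b36:0100:1af4:1100 rev 4"
# The optional "*" is assumed to flag the primary device.
PCI_LINE = re.compile(
    r"PCI:(?P<primary>\*?)\s*\((?P<addr>[\d:]+)\)\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
)


def parse_pci_lines(log_text):
    """Return one dict per PCI probe line found in the log text."""
    devices = []
    for m in PCI_LINE.finditer(log_text):
        devices.append({
            "address": m.group("addr"),
            "vendor": m.group("vendor"),
            "device": m.group("device"),
            "primary": m.group("primary") == "*",
        })
    return devices


# Shortened copies of the two probe lines from the log above,
# with the mail-wrapped continuation lines dropped.
SAMPLE = (
    "[ 1622.212] (--) PCI:*(0:0:2:0) 1b36:0100:1af4:1100 rev 4\n"
    "[ 1622.212] (--) PCI: (0:0:7:0) 10de:13f2:10de:1177 rev 161\n"
)

for dev in parse_pci_lines(SAMPLE):
    tag = "primary" if dev["primary"] else "secondary"
    print(dev["address"], dev["vendor"] + ":" + dev["device"], tag)
```

To run it against a real log, the wrapped lines would need to be joined first, since the regex expects the address and IDs on one line.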
Could the qxl module somehow be blocking the nvidia driver? I tried
blacklisting qxl in grub, but that didn't work.
Thanks in advance for any help.
Cam