Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

I noticed that this document, https://docs.nvidia.com/vgpu/16.0/grid-vgpu-release-notes-generic-linux-kvm/..., has this to say: "In pass through mode, all GPUs connected to each other through NVLink must be assigned to the same VM. If a subset of GPUs connected to each other through NVLink is passed through to a VM, unrecoverable error XID 74 occurs when the VM is booted. This error corrupts the NVLink state on the physical GPUs and, as a result, the NVLink bridge between the GPUs is unusable." You may need to pass through all of the NVLink-connected GPUs to the VM.

Hi, in pass-through mode it is essential to assign all GPUs connected through NVLink to the same VM. If only a subset of these GPUs is assigned to a VM, it triggers the unrecoverable error XID 74 during boot, which corrupts the NVLink state and renders the NVLink bridge unusable. To avoid this, ensure that all GPUs that are connected to each other through NVLink are passed through to the VM. Thanks
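For anyone who wants to script this check before starting a VM, here is a minimal Python sketch. It assumes you already know which GPUs are linked by NVLink (for example from the topology matrix that `nvidia-smi topo -m` prints). The PCI addresses, the `links` list, and the function names `nvlink_groups` and `split_groups` are made up for illustration; they are not part of libvirt or of any NVIDIA tool.

```python
from collections import defaultdict


def nvlink_groups(gpus, links):
    """Return the sets of GPUs joined, directly or indirectly, by NVLink."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    groups, seen = [], set()
    for gpu in gpus:
        if gpu in seen:
            continue
        # Depth-first walk over the NVLink graph starting from this GPU.
        group, stack = set(), [gpu]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


def split_groups(assigned, groups):
    """Return the NVLink groups that are only partly included in `assigned`."""
    assigned = set(assigned)
    return [g for g in groups if g & assigned and not g <= assigned]


# Hypothetical host: four GPUs (PCI addresses are invented), two NVLink pairs.
gpus = ["0000:17:00.0", "0000:65:00.0", "0000:ca:00.0", "0000:e3:00.0"]
links = [("0000:17:00.0", "0000:65:00.0"), ("0000:ca:00.0", "0000:e3:00.0")]
groups = nvlink_groups(gpus, links)

# Passing through one GPU of a pair splits its NVLink group.
print(split_groups(["0000:17:00.0"], groups))
# Passing through both GPUs of the pair leaves no group split.
print(split_groups(["0000:17:00.0", "0000:65:00.0"], groups))
```

A non-empty result from `split_groups` means the planned assignment would hand the VM only part of an NVLink group, which is the situation the release notes warn about.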
participants (2)
- Maria Jonas
- Zhengyi Lai