Virtual GPU (vGPU) enables multiple virtual machines (VMs) to have simultaneous, direct access to a single physical GPU, using the same graphics drivers that are deployed on non-virtualized operating systems. By doing this, vGPU provides VMs with unparalleled graphics performance, compute performance, and application compatibility, together with the cost-effectiveness and scalability brought about by sharing a GPU among multiple workloads.
NVIDIA vGPU
NVIDIA vGPU software supports GPU instances on GPUs that support the Multi-Instance GPU (MIG) feature, in both NVIDIA vGPU and GPU pass-through deployments. MIG enables a physical GPU to be securely partitioned into multiple separate GPU instances, providing multiple users with separate GPU resources to accelerate their applications. With MIG, a GPU can be split into several GPU instances of different sizes, with each instance mapped to one vGPU. MIG requires a GPU based on the NVIDIA Ampere architecture, but not all such cards support it.
vGPU types
The number of physical GPUs that a board has depends on the board. Each physical GPU can support several different types of virtual GPU (vGPU). vGPU types have a fixed amount of frame buffer, number of supported display heads, and maximum resolutions. They are grouped into different series according to the different classes of workload for which they are optimized. Each series is identified by the last letter of the vGPU type name.
Series     Optimal Workload
Q-series   Virtual workstations for creative and technical professionals who require the performance and features of Quadro technology, 3D rendering
C-series   Compute-intensive server workloads, such as artificial intelligence (AI), deep learning, or high-performance computing (HPC)
B-series   Virtual desktops for business professionals and knowledge workers
A-series   App streaming or session-based solutions for virtual applications users
A vGPU type determines:
frame buffer size
number of display heads (virtual display outputs)
maximum resolution
number of vGPUs that can be created on a physical GPU
For example, the M60-2Q vGPU type is allocated 2048 MB of frame buffer on a Tesla M60 board.
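As a rough sanity check on these numbers: each physical GPU on a Tesla M60 board has 8 GB of frame buffer, so the per-GPU count for a given vGPU type can be estimated by simple division. A sketch (the 8192 MB figure is an assumption for illustration; the authoritative limit is the available_instances sysfs attribute shown later):

```shell
# Estimate how many vGPUs of one type fit on a physical GPU.
# 8192 MB per-GPU frame buffer is assumed here for a Tesla M60;
# M60-2Q takes 2048 MB per vGPU.
fb_per_gpu_mb=8192
fb_per_vgpu_mb=2048   # M60-2Q
echo $((fb_per_gpu_mb / fb_per_vgpu_mb))   # vGPUs of this type per physical GPU
```

This prints 4, i.e. four M60-2Q vGPUs per physical GPU (eight per two-GPU board).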
NVIDIA vGPU is a licensed product on all supported GPU boards.
Q-series vGPU types require a vWS license.
C-series vGPU types require an NVIDIA Virtual Compute Server (vCS) license but can also be used with a vWS license.
B-series vGPU types require a vPC license but can also be used with a vWS license.
A-series vGPU types require a vApps license.
ARCH
In the high-level NVIDIA vGPU architecture, NVIDIA physical GPUs, under the control of the NVIDIA Virtual GPU Manager running in the hypervisor, can support multiple virtual GPU devices (vGPUs) that are assigned directly to guest VMs.
Time-Sliced NVIDIA vGPU Internal Architecture
A time-sliced vGPU is a vGPU that resides on a physical GPU that is not partitioned into multiple GPU instances. All time-sliced vGPUs resident on a GPU share access to the GPU’s engines including the graphics (3D), video decode, and video encode engines
This is the vGPU architecture for traditional (non-MIG) GPUs, which most GPU cards support.
In a time-sliced vGPU, processes that run on the vGPU are scheduled to run in series. Each vGPU waits while processes run on other vGPUs. While processes are running on a vGPU, the vGPU has exclusive use of the GPU's engines.
For time-sliced vGPUs, all vGPUs on a single physical GPU must be of the same type.
MIG-Backed NVIDIA vGPU Internal Architecture
A MIG-backed vGPU is a vGPU that resides on a GPU instance in a MIG-capable physical GPU. Each MIG-backed vGPU resident on a GPU has exclusive access to the GPU instance’s engines, including the graphics (3D), and video decode engines.
In a MIG-backed vGPU, processes that run on the vGPU run in parallel with processes running on other vGPUs on the GPU. Processes run on all vGPUs resident on a physical GPU simultaneously.
For MIG-backed vGPUs, the vGPU types on a single physical GPU can differ.
GPU Product
Two main GPU series
Quadro - The workstation line. Higher priced and aimed at corporate customers, so the cards are better tested, carry more memory, and use the highest-quality chips. Because they are higher priced, NVIDIA also offers better support, easier exchanges, etc.
Tesla - The range focused on HPC; some cards have no video output. Intended for people using CUDA.
Note
Some products are only for graphics while others are only for compute
Tesla M60 and M6 GPUs support both compute mode and graphics mode, and can switch between them (for example, with NVIDIA's gpumodeswitch tool)
# show GPU
$ nvidia-smi
Mon Aug  9 11:24:19 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.46       Driver Version: 430.46       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P40           On   | 00000000:03:00.0 Off |                    0 |
| N/A   27C    P8    19W / 250W |     41MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P40           On   | 00000000:04:00.0 Off |                    0 |
| N/A   27C    P8    19W / 250W |     41MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P40           On   | 00000000:84:00.0 Off |                    0 |
| N/A   26C    P8    19W / 250W |     50MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P40           On   | 00000000:85:00.0 Off |                    0 |
| N/A   31C    P8    18W / 250W |     41MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
# Compute M.: the GPU's compute mode (Default here)
# Disp.A: whether the GPU is used for display
# Processes: processes running on each GPU
# For GPUs that support MIG, a MIG M. column appears alongside Compute M.
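nvidia-smi also has a machine-readable query interface (nvidia-smi --query-gpu=... --format=csv) that is easier to script against than the table above. A sketch parsing that CSV output; the sample lines are hard-coded here so the snippet runs on a machine without a GPU:

```shell
# Sample output of:
#   nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader
# hard-coded so the sketch is runnable without a GPU.
sample='0, Tesla P40, 23039 MiB
1, Tesla P40, 23039 MiB'

# Split on ", " and print one summary line per GPU.
echo "$sample" | awk -F', ' '{ printf "GPU %s: %s (%s)\n", $1, $2, $3 }'
# prints:
# GPU 0: Tesla P40 (23039 MiB)
# GPU 1: Tesla P40 (23039 MiB)
```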
# check the BDF (bus, device, function) of each GPU
$ lspci | grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
04:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
84:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
85:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
# when vGPU is enabled for a GPU, a link is created under /sys/class/mdev_bus/
# pointing to the GPU device (by PCI address)
# ls /sys/class/mdev_bus/
0000:03:00.0  0000:04:00.0  0000:84:00.0  0000:85:00.0
# list all supported vGPU types of the given GPU
# nvidia-156 is an mdev type identifier; the vGPU type name is in mdev_supported_types/nvidia-156/name
$ cd /sys/class/mdev_bus/0000:03:00.0
$ ls mdev_supported_types/
nvidia-156  nvidia-215  nvidia-241  nvidia-283  nvidia-284  nvidia-285  nvidia-286  nvidia-287
nvidia-46   nvidia-47   nvidia-48   nvidia-49   nvidia-50   nvidia-51   nvidia-52   nvidia-53
nvidia-54   nvidia-55   nvidia-56   nvidia-57   nvidia-58   nvidia-59   nvidia-60   nvidia-61
nvidia-62
# in GRID P40-2B, P40 is the GPU type, while 2B is the vGPU type
$ cat mdev_supported_types/nvidia-156/name
GRID P40-2B
# check how many vGPUs (of this type) can still be created on the given GPU
$ cat mdev_supported_types/nvidia-156/available_instances
12
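To survey every supported type at once, you can loop over the mdev_supported_types entries and print each type's name and available_instances. A sketch: it builds a tiny mock of the sysfs layout in a temp directory so it runs anywhere; on a real host, point root at /sys/class/mdev_bus/<bdf>/mdev_supported_types instead.

```shell
# Mock of /sys/class/mdev_bus/0000:03:00.0/mdev_supported_types so the loop
# is runnable without a GPU; on a real host set root to that sysfs path.
root=$(mktemp -d)
mkdir -p "$root/nvidia-156"
echo "GRID P40-2B" > "$root/nvidia-156/name"
echo 12 > "$root/nvidia-156/available_instances"

# One line per mdev type: identifier, vGPU type name, remaining capacity.
for t in "$root"/nvidia-*; do
  printf '%s: %s (available: %s)\n' \
    "$(basename "$t")" "$(cat "$t/name")" "$(cat "$t/available_instances")"
done
# prints: nvidia-156: GRID P40-2B (available: 12)
```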
# create a vGPU
$ uuidgen
2794ee88-7932-4c37-9927-97ef3a5e76c4
$ echo "2794ee88-7932-4c37-9927-97ef3a5e76c4" > mdev_supported_types/nvidia-156/create
# after this, an mdev device (vGPU device) is created
$ ls /sys/bus/mdev/devices/2794ee88-7932-4c37-9927-97ef3a5e76c4
driver  iommu_group  mdev_type  nvidia  power  remove  subsystem  uevent
# assign a vGPU to a VM
# prerequisite: the VM to which you want to add the vGPU is shut down
$ virsh edit <vm-name>
# the uuid is the vGPU's mdev UUID (for full-GPU passthrough, a PCI BDF address is used instead)
<devices>
  ...
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='2794ee88-7932-4c37-9927-97ef3a5e76c4'/>
    </source>
  </hostdev>
</devices>
# check which VM is using this vGPU
$ cat /sys/bus/mdev/devices/2794ee88-7932-4c37-9927-97ef3a5e76c4/nvidia/vm_name
# to remove a vGPU, you must know its UUID
# prerequisite: the VM to which the vGPU is assigned is shut down
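The removal itself goes through the vGPU's remove node in sysfs: writing 1 to it causes the kernel to destroy the mdev device. A sketch using the UUID created above; a mock directory stands in for sysfs so the snippet is runnable without a GPU.

```shell
uuid=2794ee88-7932-4c37-9927-97ef3a5e76c4
# Real path (requires root, and the VM must be shut down):
#   echo "1" > /sys/bus/mdev/devices/$uuid/remove
# Mock of that path so the sketch runs anywhere:
dev=$(mktemp -d)/$uuid
mkdir -p "$dev"
echo "1" > "$dev/remove"
```

On the mock this only writes a file; on a real host the same write tears down the vGPU device.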
# monitor vGPU engine usage by applications across multiple vGPUs
$ nvidia-smi vgpu -p
# GPU  vGPU  process  sm  mem  enc  dec  process
# Idx  Id    Id       %   %    %    %    name
  0    -     -        -   -    -    -    -
  1    -     -        -   -    -    -    -
  2    -     -        -   -    -    -    -
  3    -     -        -   -    -    -    -
# If MIG mode is not enabled for the GPU, or the GPU does not support MIG,
# this property reflects the number and type of vGPUs already running on the GPU:
# 1. If no vGPUs are running on the GPU, all vGPU types that the GPU supports are listed.
# 2. If one or more vGPUs are running but the GPU is not fully loaded, only the type of the vGPUs already running is listed.
# 3. If the GPU is fully loaded, no vGPU types are listed.
The mdev device file that you create to represent the vGPU does not persist across host reboots and must be re-created each time the host boots.
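Because of this, hosts typically re-create their vGPUs from a boot-time script (for example an init script or a systemd unit). A minimal sketch, reusing the GPU address, mdev type, and UUID from the examples above; SYS points at a mock tree here so the snippet is runnable without a GPU, whereas on a real host you would use SYS=/sys.

```shell
#!/bin/sh
# Re-create vGPU mdev devices at boot. GPU address, type, and UUID are the
# example values from above; adjust for your host.
SYS=$(mktemp -d)            # mock sysfs root; use SYS=/sys on a real host
GPU=0000:03:00.0
TYPE=nvidia-156
mkdir -p "$SYS/class/mdev_bus/$GPU/mdev_supported_types/$TYPE"  # mock only

for uuid in 2794ee88-7932-4c37-9927-97ef3a5e76c4; do
  echo "$uuid" > "$SYS/class/mdev_bus/$GPU/mdev_supported_types/$TYPE/create"
done
```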