hardware-vgpu

Introduction

Virtual GPU (vGPU) enables multiple virtual machines (VMs) to have simultaneous, direct access to a single physical GPU, using the same graphics drivers that are deployed on non-virtualized operating systems. By doing this, vGPU provides VMs with unparalleled graphics performance, compute performance, and application compatibility, together with the cost-effectiveness and scalability brought about by sharing a GPU among multiple workloads.

NVIDIA vGPU

NVIDIA vGPU software supports GPU instances on GPUs that support the Multi-Instance GPU (MIG) feature, in both NVIDIA vGPU and GPU pass-through deployments. MIG enables a physical GPU to be securely partitioned into multiple separate GPU instances, providing multiple users with separate GPU resources to accelerate their applications. With MIG, a GPU can be split into several GPU instances of different sizes, with each instance mapped to one vGPU. MIG requires a GPU based on the Ampere architecture, and not every such card supports it.

vGPU type

The number of physical GPUs that a board has depends on the board. Each physical GPU can support several different types of virtual GPU (vGPU). vGPU types have a fixed amount of frame buffer, number of supported display heads, and maximum resolutions. They are grouped into different series according to the different classes of workload for which they are optimized. Each series is identified by the last letter of the vGPU type name.

Series     Optimal workload
Q-series   Virtual workstations for creative and technical professionals who require the performance and features of Quadro technology, 3D rendering
C-series   Compute-intensive server workloads, such as artificial intelligence (AI), deep learning, or high-performance computing (HPC)
B-series   Virtual desktops for business professionals and knowledge workers
A-series   App streaming or session-based solutions for virtual application users

A vGPU type determines:

  • the amount of frame buffer
  • the number of display heads (virtual display outputs)
  • the maximum resolution
  • the number of vGPUs that can be created on one physical GPU

Example
M60-2Q is allocated 2048 MB of frame buffer on a Tesla M60 board; the trailing Q marks it as a Q-series type.
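
The naming rule can be illustrated with a small helper. This is only a sketch: the function name `parse_vgpu_type` is made up for this note, and it assumes names of the form shown in the example (board, a dash, frame buffer size in GB, then the series letter).

```shell
# Split a vGPU type name such as "M60-2Q" into board, frame buffer size (GB)
# and series letter. Hypothetical helper, assumes "<board>-<size><series>".
parse_vgpu_type() {
    name=$1
    board=${name%-*}                    # text before the last "-"        -> M60
    profile=${name##*-}                 # text after the last "-"         -> 2Q
    size_gb=${profile%?}                # profile without its last char   -> 2
    series=${profile#"$size_gb"}        # what remains after the size     -> Q
    printf '%s %s %s\n' "$board" "$size_gb" "$series"
}

parse_vgpu_type "M60-2Q"    # prints: M60 2 Q
```

The 2 in the output matches the 2048 MB of frame buffer in the example above.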

NVIDIA vGPU is a licensed product on all supported GPU boards.

  • Q-series vGPU types require a vWS license.
  • C-series vGPU types require an NVIDIA Virtual Compute Server (vCS) license but can also be used with a vWS license.
  • B-series vGPU types require a vPC license but can also be used with a vWS license.
  • A-series vGPU types require a vApps license.
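
The license rules above amount to a lookup on the series letter. A minimal sketch, where `license_for_series` is a hypothetical name used only for illustration:

```shell
# Map a vGPU series letter to the license(s) listed above.
# Hypothetical helper; returns non-zero for a letter that is not in the list.
license_for_series() {
    case $1 in
        Q) echo "vWS" ;;
        C) echo "vCS or vWS" ;;
        B) echo "vPC or vWS" ;;
        A) echo "vApps" ;;
        *) echo "unknown series: $1" >&2; return 1 ;;
    esac
}

license_for_series B    # prints: vPC or vWS
```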

Architecture

In the high-level architecture of NVIDIA vGPU, NVIDIA physical GPUs are under the control of the NVIDIA Virtual GPU Manager running under the hypervisor, and are capable of supporting multiple virtual GPU devices (vGPUs) that can be assigned directly to guest VMs.

NVIDIA vGPU System Architecture

Time-Sliced NVIDIA vGPU Internal Architecture

A time-sliced vGPU is a vGPU that resides on a physical GPU that is not partitioned into multiple GPU instances. All time-sliced vGPUs resident on a GPU share access to the GPU’s engines, including the graphics (3D), video decode, and video encode engines.

This is the vGPU architecture for traditional (non-MIG) GPUs, and most GPU cards support it.

In a time-sliced vGPU, processes that run on the vGPU are scheduled to run in series. Each vGPU waits while other processes run on other vGPUs. While processes are running on a vGPU, the vGPU has exclusive use of the GPU’s engine.

For time-sliced vGPUs, all vGPUs on a single physical GPU must be of the same vGPU type.

MIG-Backed NVIDIA vGPU Internal Architecture

A MIG-backed vGPU is a vGPU that resides on a GPU instance in a MIG-capable physical GPU. Each MIG-backed vGPU resident on a GPU has exclusive access to the GPU instance’s engines, including the graphics (3D) and video decode engines.

In a MIG-backed vGPU, processes that run on the vGPU run in parallel with processes running on other vGPUs on the GPU. Processes run on all vGPUs resident on a physical GPU simultaneously.

For MIG-backed vGPUs, the vGPUs on a single physical GPU can be of different vGPU types.

GPU Product

Two main GPU series

  • Quadro - The workstation line. It is higher priced and aimed at corporate customers, so the cards are better tested, carry more memory, and use the highest-quality chips. Because of the higher price, NVIDIA also offers better support and easier exchanges.
  • Tesla - The line focused on HPC. Some cards have no video output. It is intended for people using CUDA.

Note

  • Some products are only for graphics, while others are only for compute.
  • Tesla M60 and M6 GPUs support both compute mode and graphics mode and can be switched between them.

Debug

GPU

# CentOS
# install the vGPU driver
$ rpm -iv NVIDIA-vGPU-rhel-7.5-460.73.02.x86_64.rpm
$ reboot

# verify vgpu driver is loaded correctly
$ lsmod | grep vfio
nvidia_vgpu_vfio 27099 0
nvidia 12316924 1 nvidia_vgpu_vfio
vfio_mdev 12841 0
mdev 20414 2 vfio_mdev,nvidia_vgpu_vfio
vfio_iommu_type1 22342 0
vfio 32331 3 vfio_mdev,nvidia_vgpu_vfio,vfio_iommu_type1

# show GPU
$ nvidia-smi
Mon Aug  9 11:24:19 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.46       Driver Version: 430.46       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P40           On   | 00000000:03:00.0 Off |                    0 |
| N/A   27C    P8    19W / 250W |     41MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P40           On   | 00000000:04:00.0 Off |                    0 |
| N/A   27C    P8    19W / 250W |     41MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P40           On   | 00000000:84:00.0 Off |                    0 |
| N/A   26C    P8    19W / 250W |     50MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P40           On   | 00000000:85:00.0 Off |                    0 |
| N/A   31C    P8    18W / 250W |     41MiB / 23039MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

# Compute M.: the compute mode of the GPU (Default here)
# Disp.A: whether a display is active on the GPU
# Processes: the processes running on each GPU
# On GPUs that support MIG, a MIG M. field is shown alongside Compute M.

# check the BDF (bus, device, function) of each GPU
$ lspci | grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
04:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
84:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
85:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)

Create vGPU

# when vGPU is enabled for a GPU, a link pointing to the GPU device (named by its PCI address) is created under /sys/class/mdev_bus/
$ ls /sys/class/mdev_bus/
0000:03:00.0 0000:04:00.0 0000:84:00.0 0000:85:00.0

# list all vGPU types supported by the given GPU
# nvidia-156 is an mdev type identifier; the vGPU type name is in mdev_supported_types/nvidia-156/name
$ cd /sys/class/mdev_bus/0000:03:00.0
$ ls mdev_supported_types/
nvidia-156 nvidia-241 nvidia-284 nvidia-286 nvidia-46 nvidia-48 nvidia-50 nvidia-52 nvidia-54 nvidia-56 nvidia-58 nvidia-60 nvidia-62
nvidia-215 nvidia-283 nvidia-285 nvidia-287 nvidia-47 nvidia-49 nvidia-51 nvidia-53 nvidia-55 nvidia-57 nvidia-59 nvidia-61

# in P40-2B, P40 is the GPU model and 2B is the vGPU type (B-series)
$ cat mdev_supported_types/nvidia-156/name
GRID P40-2B

# check how many vGPUs of this type can still be created on the given GPU
$ cat mdev_supported_types/nvidia-156/available_instances
12

# create a VGPU
$ uuidgen
2794ee88-7932-4c37-9927-97ef3a5e76c4
$ echo "2794ee88-7932-4c37-9927-97ef3a5e76c4" > mdev_supported_types/nvidia-156/create

# after this, an mdev device (the vGPU device) exists
$ ls /sys/bus/mdev/devices/2794ee88-7932-4c37-9927-97ef3a5e76c4
driver iommu_group mdev_type nvidia power remove subsystem uevent

# assign the vGPU to a VM
# prerequisite: the VM to which you want to add the vGPU must be shut down

$ virsh edit "$vm_name"

# uuid is the UUID of the vGPU (mdev device) created above
<devices>
  ...
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='2794ee88-7932-4c37-9927-97ef3a5e76c4'/>
    </source>
  </hostdev>
</devices>

# check which VM is using this vGPU
$ cat /sys/bus/mdev/devices/2794ee88-7932-4c37-9927-97ef3a5e76c4/nvidia/vm_name

# remove a vGPU (its UUID must be known)
# prerequisite: the VM to which the vGPU is assigned must be shut down

$ echo "1" > /sys/bus/mdev/devices/2794ee88-7932-4c37-9927-97ef3a5e76c4/remove
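
The manual steps above (find the mdev type for a vGPU name, check `available_instances`, write a UUID to `create`) could be wrapped roughly as follows. This is a sketch, not a tested tool: `find_mdev_type` and `create_vgpu` are hypothetical names, and the GPU's sysfs directory is passed as an argument so the logic can be tried against a scratch directory first.

```shell
# find_mdev_type <gpu-sysfs-dir> <vgpu-name>
# Prints the mdev type id (e.g. nvidia-156) whose name file matches <vgpu-name>.
find_mdev_type() {
    gpu_dir=$1
    wanted=$2
    for type_dir in "$gpu_dir"/mdev_supported_types/*; do
        [ -r "$type_dir/name" ] || continue
        if [ "$(cat "$type_dir/name")" = "$wanted" ]; then
            basename "$type_dir"
            return 0
        fi
    done
    return 1
}

# create_vgpu <gpu-sysfs-dir> <vgpu-name>
# Creates one vGPU of the given type and prints its UUID.
create_vgpu() {
    type_id=$(find_mdev_type "$1" "$2") || {
        echo "vGPU type '$2' is not supported by $1" >&2
        return 1
    }
    type_dir=$1/mdev_supported_types/$type_id
    # Refuse early when the GPU has no room left for this type.
    if [ "$(cat "$type_dir/available_instances")" -lt 1 ]; then
        echo "no free instances of '$2' on $1" >&2
        return 1
    fi
    uuid=$(uuidgen) || return 1
    echo "$uuid" > "$type_dir/create" && echo "$uuid"
}
```

On a host like the one above, usage would look like `create_vgpu /sys/class/mdev_bus/0000:03:00.0 "GRID P40-2B"`, run as root.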

Monitoring GPU performance

# get GPU INFO
$ nvidia-smi

# get VGPU INFO
$ nvidia-smi vgpu
Mon Aug  9 12:14:14 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.46                 Driver Version: 430.46                    |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  Tesla P40                  | 00000000:03:00.0             |   0%       |
+---------------------------------+------------------------------+------------+
|   1  Tesla P40                  | 00000000:04:00.0             |   0%       |
|      3251639763  GRID P40-6Q    | 9750...   i-y7i5cxa4x4       |      0%    |
+---------------------------------+------------------------------+------------+
|   2  Tesla P40                  | 00000000:84:00.0             |   0%       |
+---------------------------------+------------------------------+------------+
|   3  Tesla P40                  | 00000000:85:00.0             |   0%       |
+---------------------------------+------------------------------+------------+

# get VGPU Details
$ nvidia-smi vgpu -q
GPU 00000000:03:00.0
    Active vGPUs : 0

GPU 00000000:04:00.0
    Active vGPUs : 1
    vGPU ID : 3251639763
        VM UUID : 97502756-1798-4014-9979-8595f909eeb3
        VM Name : i-y7i5cxa4x4
        vGPU Name : GRID P40-6Q
        vGPU Type : 50
        vGPU UUID : 0c4b9906-7a2e-2e4e-133d-ca368087e5ab
        Guest Driver Version : Not Available
        License Status : Unlicensed
        Accounting Mode : Disabled
        ECC Mode : Disabled
        Accounting Buffer Size : 4000
        Frame Rate Limit : N/A
        FB Memory Usage
            Total : 6144 MiB
            Used : 0 MiB
            Free : 6144 MiB
        Utilization
            Gpu : 0 %
            Memory : 0 %
            Encoder : 0 %
            Decoder : 0 %
        Encoder Stats
            Active Sessions : 0
            Average FPS : 0
            Average Latency : 0
        FBC Stats
            Active Sessions : 0
            Average FPS : 0
            Average Latency : 0

GPU 00000000:84:00.0
    Active vGPUs : 0

GPU 00000000:85:00.0
    Active vGPUs : 0

# To monitor vGPU engine usage across multiple vGPUs

$ nvidia-smi vgpu -u
# GPU vGPU sm mem enc dec
# Idx Id % % % %
0 - - - - -
1 3251639763 0 0 0 0
2 - - - - -
3 - - - - -
0 - - - - -

# To monitor vGPU engine usage by applications across multiple vGPUs
$ nvidia-smi vgpu -p
# GPU vGPU process sm mem enc dec process
# Idx Id Id % % % % name
0 - - - - - - -
1 - - - - - - -
2 - - - - - - -
3 - - - - - - -

# list the vGPU types that can currently be created on each GPU
# If MIG mode is not enabled for the GPU, or if the GPU does not support MIG, this list reflects the number and type of vGPUs that are already running on the GPU:
# 1. If no vGPUs are running on the GPU, all vGPU types that the GPU supports are listed.
# 2. If one or more vGPUs are running on the GPU, but the GPU is not fully loaded, only the type of the vGPUs that are already running is listed.
# 3. If the GPU is fully loaded, no vGPU types are listed.

$ nvidia-smi vgpu -c
GPU 00000000:03:00.0
GRID P40-6Q

GPU 00000000:04:00.0
GRID P40-6Q

GPU 00000000:84:00.0
GRID P40-6Q

GPU 00000000:85:00.0
GRID P40-6Q
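
For scripting, the `nvidia-smi vgpu -q` report shown earlier can be reduced to one line per GPU. A minimal sketch, assuming the report keeps the `GPU <bus-id>` and `Active vGPUs : <n>` lines seen in the sample output; `active_vgpus_per_gpu` is a hypothetical name:

```shell
# Reads `nvidia-smi vgpu -q` text on stdin and prints "<bus-id> <active-count>"
# for every physical GPU. awk splits on whitespace, so indentation is ignored.
active_vgpus_per_gpu() {
    awk '
        $1 == "GPU" && NF == 2            { gpu = $2 }        # "GPU 00000000:04:00.0"
        $1 == "Active" && $2 == "vGPUs"   { print gpu, $NF }  # "Active vGPUs : 1"
    '
}
```

Usage would be `nvidia-smi vgpu -q | active_vgpus_per_gpu`, which for the output above should list one active vGPU on 00000000:04:00.0 and zero on the others.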

sysfs for NVIDIA GPU
The sysfs directory for each physical GPU is at

  • /sys/bus/pci/devices/
  • /sys/class/mdev_bus/
/sys/class/mdev_bus/
|-parent-physical-device
  |-mdev_supported_types
    |-nvidia-vgputype-id
      |-available_instances
      |-create
      |-description
      |-device_api
      |-devices
      |-name

The mdev device file that you create to represent the vGPU does not persist across host reboots and must be re-created after each reboot.
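
One way to handle this is to keep a small list of the vGPUs that should exist and replay it at boot. The sketch below assumes a hypothetical list file with one `<uuid> <pci-address> <mdev-type-id>` entry per line; the function name `restore_vgpus` and the file format are inventions of this note, not part of the NVIDIA tooling.

```shell
# restore_vgpus <list-file> [mdev-bus-dir]
# Re-creates every vGPU named in <list-file>; lines starting with "#" are ignored.
# The second argument defaults to /sys/class/mdev_bus and exists mainly for dry runs.
restore_vgpus() {
    list=$1
    bus=${2:-/sys/class/mdev_bus}
    while read -r uuid pci type_id; do
        case $uuid in ''|'#'*) continue ;; esac     # skip blank lines and comments
        create=$bus/$pci/mdev_supported_types/$type_id/create
        if [ -e "/sys/bus/mdev/devices/$uuid" ]; then
            echo "skip $uuid: already present"
        elif echo "$uuid" > "$create"; then
            echo "created $uuid on $pci ($type_id)"
        else
            echo "failed $uuid on $pci ($type_id)" >&2
        fi
    done < "$list"
}
```

A list entry for the vGPU created earlier would read `2794ee88-7932-4c37-9927-97ef3a5e76c4 0000:03:00.0 nvidia-156`. Reusing the same UUID keeps the `<hostdev>` entry in the VM definition valid.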

REF