Commit 2158d65

Merge pull request #6 from NVIDIA/gh-needs-open
GH200 systems require open GPU kernel module driver
2 parents 0bc66c9 + e0139d4 commit 2158d65

4 files changed, +21 -6 lines changed

gpu-operator/gpu-operator-rdma.rst

+2 -2
@@ -35,7 +35,7 @@ new kernel module ``nvidia-peermem`` is included in the standard NVIDIA driver i
 kernel module provides Mellanox Infiniband-based HCAs direct peer-to-peer read and write access to the GPU's memory.

 Starting with v23.9.1 of the Operator, the Operator uses GDS driver version 2.17.5 or newer.
-This version and higher is only supported with the NVIDIA open kernel driver.
+This version and higher is only supported with the NVIDIA Open GPU Kernel module driver.
 The sample commands for installing the Operator include the ``--set useOpenKernelModules=true``
 command-line argument for Helm.

@@ -386,7 +386,7 @@ The following section is applicable to the following configurations and describe
 * Kubernetes on bare metal and on vSphere VMs with GPU passthrough and vGPU.

 Starting with v22.9.1, the GPU Operator provides an option to load the ``nvidia-fs`` kernel module during the bootstrap of the NVIDIA driver daemonset.
-Starting with v23.9.1, the GPU Operator deploys a version of GDS that requires using the NVIDIA open kernel driver.
+Starting with v23.9.1, the GPU Operator deploys a version of GDS that requires using the NVIDIA Open GPU Kernel module driver.

 The following sample command applies to clusters that use the Network Operator to install the MLNX_OFED drivers.


gpu-operator/life-cycle-policy.rst

+1 -1
@@ -159,7 +159,7 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.
 .. _gds-open-kernel:

 :sup:`1`
-This release of the GDS driver requires that you use the NVIDIA open kernel driver for the GPUs.
+This release of the GDS driver requires that you use the NVIDIA Open GPU Kernel module driver for the GPUs.
 Refer to :doc:`gpu-operator-rdma` for more information.

 .. note::

gpu-operator/platform-support.rst

+14 -2
@@ -41,19 +41,31 @@ Supported NVIDIA Data Center GPUs and Systems

 The following NVIDIA data center GPUs are supported on x86 based platforms:

+.. _open-kern-module: #requires-open-kernel-module
+.. |open-kern-module| replace:: :sup:`1`
+
 .. tab-set::

 .. tab-item:: GH-series Products

+
 .. list-table::
 :header-rows: 1

 * - Product
 - Architecture

-* - NVIDIA GH200
+* - NVIDIA GH200 |open-kern-module|_
 - NVIDIA Grace Hopper

+.. _requires-open-kernel-module:
+
+:sup:`1`
+NVIDIA GH200 systems require the NVIDIA Open GPU Kernel module driver.
+You can install the open kernel modules by specifying the ``driver.useOpenKernelModules=true``
+argument to the ``helm`` command.
+Refer to :ref:`chart customization options` for more information.
+
 .. tab-item:: A, H and L-series Products
 :selected:

@@ -466,7 +478,7 @@ Supported operating systems and NVIDIA GPU Drivers with GPUDirect Storage.
 .. note::

 Version v2.17.5 and higher of the NVIDIA GPUDirect Storage kernel driver, ``nvidia-fs``,
-requires the NVIDIA open kernel modules.
+requires the NVIDIA Open GPU Kernel module driver.
 You can install the open kernel modules by specifying the ``driver.useOpenKernelModules=true``
 argument to the ``helm`` command.
 Refer to :ref:`chart customization options` for more information.

gpu-operator/release-notes.rst

+4 -1
@@ -50,15 +50,18 @@ New Features

 - Run Ubuntu 22.04 and an NVIDIA Linux kernel, such as one provided with a ``linux-nvidia-<x.x>`` package.
 - Add ``init_on_alloc=0`` and ``memhp_default_state=online_movable`` as Linux kernel boot parameters.
+- Run the NVIDIA Open GPU Kernel module driver.

-* Added support for configuring the driver container to use the NVIDIA open kernel modules.
+* Added support for configuring the driver container to use the NVIDIA Open GPU Kernel module driver.
 Support is limited to installation using the runfile installer.
 Support for precompiled driver containers with open kernel modules is not available.

 For clusters that use GPUDirect Storage (GDS), beginning with CUDA toolkit 12.2.2 and
 the NVIDIA GPUDirect Storage kernel driver version v2.17.5, are only supported
 with the open kernel modules.

+NVIDIA GH200 Grace Hopper Superchip systems are only supported with the open kernel modules.
+
 - Refer to :ref:`gpu-operator-helm-chart-options` for information about setting
 ``useOpenKernelModules`` if you manage the driver containers with the NVIDIA cluster policy custom resource definition.
 - Refer to :doc:`gpu-driver-configuration` for information about setting ``spec.useOpenKernelModules``
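
For readers who want to see how the ``driver.useOpenKernelModules=true`` argument named in these changes might be passed to ``helm``, the following is a minimal sketch and not part of the commit. Only the ``--set`` argument comes from the changed text. The ``gpu-operator`` namespace, the ``nvidia/gpu-operator`` chart reference, the remaining flags, and the helper function name ``gpu_operator_install_cmd`` are assumptions for illustration. The function only prints the command so that it can be reviewed before it is run against a cluster.

```shell
# Sketch only: the namespace, --generate-name, and the nvidia/gpu-operator
# chart reference are assumptions. Only the --set argument is taken from
# the documentation changes in this commit.
gpu_operator_install_cmd() {
  # Backslash-newline inside double quotes joins the pieces into one line.
  printf '%s\n' "helm install --wait --generate-name \
--namespace gpu-operator --create-namespace \
nvidia/gpu-operator \
--set driver.useOpenKernelModules=true"
}

# Print the command for review instead of executing it.
gpu_operator_install_cmd
```

Printing rather than executing keeps the sketch safe to try on a machine without cluster access. Copy the printed line into a terminal only after confirming the chart reference and namespace for your environment.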
