SDB:NVIDIA drivers



By default, openSUSE provides the open-source nouveau driver for NVIDIA GPUs. However, certain features may not be supported, which requires using the NVIDIA drivers instead.

NVIDIA provides an open-source kernel module and a proprietary one. NVIDIA recommends the open-source module where the device supports it. You can find more details about installing the drivers in this openSUSE blog article.

Warning: While it is possible to use the official installer from NVIDIA, it is recommended to use openSUSE's package management tools instead.

If you disabled multiversion kernel packages, make sure you re-enable them in /etc/zypp/zypp.conf. It should contain the line multiversion = provides:multiversion(kernel).

The drivers are provided via NVIDIA's openSUSE repository for licensing reasons. You'll be asked whether you trust NVIDIA's GPG key the first time you reload the software repositories. The fingerprint of this key should be 2FB0 3195 DECD 4949 2BD1 C17A B1D0 D788 DB27 FD5A.

If you are upgrading to a new GPU, we recommend that you uninstall any NVIDIA drivers that may be specific to your old hardware before replacing the card.

Adding the NVIDIA repository

Via terminal on Leap, Tumbleweed, Slowroll

As root, enter one of the following commands (matching your openSUSE distribution flavour) in a terminal.

# zypper install openSUSE-repos-Leap-NVIDIA
# zypper install openSUSE-repos-Tumbleweed-NVIDIA
# zypper install openSUSE-repos-Slowroll-NVIDIA

or

# zypper addrepo --refresh 'https://download.nvidia.com/opensuse/leap/$releasever' NVIDIA
# zypper addrepo --refresh https://download.nvidia.com/opensuse/tumbleweed NVIDIA

(At the moment, the Tumbleweed repository is also used by Slowroll.)

Via terminal on Aeon, Kalpa, Leap Micro

A successful driver installation requires a few additional steps here. As root, enter the following in a terminal:

# cp /usr/etc/transactional-update.conf /etc/transactional-update.conf

Open the newly created file with super user privileges:

# vim /etc/transactional-update.conf

Adjust and uncomment ZYPPER_AUTO_IMPORT_KEYS:

ZYPPER_AUTO_IMPORT_KEYS=1

Save the file and quit by pressing ESC and entering :wq. As root, enter the following in a terminal:

# transactional-update -i pkg install openSUSE-repos-MicroOS-NVIDIA
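The edit to transactional-update.conf can also be done non-interactively instead of via vim; a minimal sketch using GNU sed, demonstrated on a temporary copy (on a real system, point the sed command at /etc/transactional-update.conf):

```shell
# Demonstrate the uncomment-and-set edit on a temporary stand-in file.
conf=$(mktemp)
echo '# ZYPPER_AUTO_IMPORT_KEYS=0' > "$conf"    # sample commented-out line
sed -i 's/^#\? *ZYPPER_AUTO_IMPORT_KEYS=.*/ZYPPER_AUTO_IMPORT_KEYS=1/' "$conf"
result=$(cat "$conf")
echo "$result"    # prints: ZYPPER_AUTO_IMPORT_KEYS=1
rm -f "$conf"
```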

Installation

Automated installation (tested on Tumbleweed)

To auto-detect and install the right driver for your hardware, run

# zypper install-new-recommends

Installation of Open driver on Leap 15.6 and Tumbleweed

Unfortunately, the Leap 15.6 and Tumbleweed repositories still contain packages for the older proprietary driver (version 550), which are still registered for Turing+ GPUs; at the time, the open driver was not yet considered stable for the desktop. Therefore, if you own a Turing+ GPU (check with inxi -aG) and would like to use the open driver (which is recommended!) on Leap 15.6 or Tumbleweed, please use the following command instead of the one right above.

# zypper in nvidia-open-driver-G06-signed-kmp-meta

Otherwise you will initially end up with the proprietary driver release 550, which will later be updated to the current proprietary driver version but will not be replaced by the open driver automatically.

Manual installation

Get the hardware information

In a terminal:

 # lspci | grep VGA
 # lscpu | grep Arch

Alternatively:

 # hwinfo --gfxcard | grep Model
 # hwinfo --arch

Using inxi utility:

# inxi -G
# inxi -Ga

Determination of driver version

To determine the appropriate driver version for your card, put your hardware information into NVIDIA's driver search engine.

Note: NVIDIA driver versions map to the package naming convention listed below. You will need this information when installing via the command line with zypper.

  • G03 = driver v340 = legacy driver
  • G04 = driver v390 = legacy driver
  • G05 = driver v470 = legacy driver
  • G06 = driver v550/v580
Warning: Legacy drivers G03 and G04 won't work with the current kernel 6.6.x series. Use the open-source nouveau driver instead.
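The mapping above can be expressed as a small helper; a sketch for illustration only (the function name is ours, not part of any openSUSE tooling):

```shell
# Map an NVIDIA driver major version to its openSUSE package generation,
# following the table above. Hypothetical helper, not a real command.
driver_generation() {
  case "$1" in
    340) echo G03 ;;      # legacy
    390) echo G04 ;;      # legacy
    470) echo G05 ;;      # legacy
    550|580) echo G06 ;;
    *) echo unknown ;;
  esac
}
driver_generation 580    # prints: G06
```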


Manual installation via Myrlyn (for Leap and Tumbleweed)

  1. Open Myrlyn
  2. View > Repositories > NVIDIA
  3. Choose the appropriate driver, e.g. x11-video-nvidiaG04 or x11-video-nvidiaG05 or nvidia-video-G06
  4. Optionally choose the corresponding OpenGL acceleration libraries; nvidia-glG04 or nvidia-glG05 or nvidia-gl-G06.
  5. Press Accept.
  6. Restart your computer.

Manual installation via Command line

At the moment, this is the only way to install the drivers on openSUSE Aeon and Kalpa.

Once you know which driver generation maps to your hardware (e.g. G04, G05 or G06), you can use:

Leap and Tumbleweed
 # zypper in <x11-video-nvidiaG04 or x11-video-nvidiaG05 or nvidia-video-G06>
 # zypper in <nvidia-glG04 or nvidia-glG05 or nvidia-gl-G06>
 # zypper in <nvidia-computeG04 or nvidia-computeG05 or nvidia-compute-G06 and nvidia-compute-utils-G06>
Aeon, Kalpa and MicroOS
Whenever non-free drivers are installed or updated, the user by default has to accept the licenses manually. This is not an issue unless you want to automate system updates on distributions such as MicroOS, Aeon or Kalpa: the auto-updater will stop working whenever zypper needs you to agree to the license of an NVIDIA driver update manually. In these cases, you can make zypper auto-accept proprietary licenses by editing /etc/zypp/zypper.conf as described below:
# vim /etc/zypp/zypper.conf

uncomment the line:

#autoAgreeWithLicenses = no

and change it to:

autoAgreeWithLicenses = yes

Install the correct GPU driver:

  • For nvidia-open-driver-G06:
# transactional-update -i pkg in nvidia-open-driver-G06-signed-kmp-default
# version=$(rpm -qa --queryformat '%{VERSION}\n' nvidia-open-driver-G06-signed-kmp-default | cut -d "_" -f1 | sort -u | tail -n 1)
# transactional-update --continue -i pkg in nvidia-video-G06==${version} nvidia-compute-utils-G06==${version}
  • For nvidia-video-G06:
# transactional-update -i pkg in nvidia-driver-G06-kmp-default nvidia-video-G06 nvidia-gl-G06 nvidia-compute-G06
  • For x11-video-nvidiaG05:
# transactional-update -i pkg in nvidia-gfxG05-kmp-default x11-video-nvidiaG05 nvidia-glG05 nvidia-computeG05
  • For x11-video-nvidiaG04:
# transactional-update -i pkg in nvidia-gfxG04-kmp-default x11-video-nvidiaG04 nvidia-glG04 nvidia-computeG04
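The version-pinning step in the nvidia-open-driver-G06 variant above works by cutting the kernel-flavor suffix off the kmp package's version string; a sketch on a sample value (the version string is hypothetical, the real one comes from rpm -qa):

```shell
# Sample VERSION field as rpm might report it for the signed kmp package
# (hypothetical value; everything after the first "_" is kernel metadata).
sample='570.133.07_k6.4.0_150600.23.17'
version=$(echo "$sample" | cut -d '_' -f1 | sort -u | tail -n 1)
echo "$version"    # prints: 570.133.07
```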

Append rd.driver.blacklist=nouveau to /etc/kernel/cmdline, and run:

# transactional-update --continue initrd
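Appending the parameter can be done idempotently; a sketch demonstrated on a temporary stand-in file (on a real system, the file is /etc/kernel/cmdline):

```shell
# Append rd.driver.blacklist=nouveau only if it is not already present,
# using a temporary file in place of /etc/kernel/cmdline.
cmdline=$(mktemp)
echo 'root=/dev/sda2 quiet' > "$cmdline"    # sample contents
grep -q 'rd\.driver\.blacklist=nouveau' "$cmdline" || \
  sed -i 's/$/ rd.driver.blacklist=nouveau/' "$cmdline"
cmdline_now=$(cat "$cmdline")
echo "$cmdline_now"    # root=/dev/sda2 quiet rd.driver.blacklist=nouveau
rm -f "$cmdline"
```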

Restart your computer, and afterwards verify with dmesg and lsmod that the NVIDIA modules are loaded successfully.

Secureboot

Kernels in Leap and Tumbleweed will, by default, refuse to load any unsigned kernel modules on machines with secure boot enabled.

During the NVIDIA driver installation on a secureboot system a MOK keypair is created and the kernel modules are signed with the created private key. The created certificate (public key) remains on the storage below /usr/share/nvidia-pubkeys, and the installer attempts to import it to the list of to-be-enrolled MOK pubkeys.

During the following reboot this certificate can easily be enrolled to the MOK database. The EFI tool for this (mokutil) is automatically started: inside the tool select "Enroll MOK", then "Continue", then "Yes". Use your root password (US keyboard layout!) when prompted for a password. The certificate is now added to the MOK database and is considered trusted, which will allow kernel modules with matching signatures to load. To finish, select "Reboot".


In case the import fails (for example, when no root password is set), or you missed the timeout for certificate enrollment after the first reboot, you can import the certificate again by running the following command:

  • For nvidia-driver-G0X (X >= 6):
# mokutil --import /usr/share/nvidia-pubkeys/MOK-nvidia-driver-G0<X>-<driver_version>-<kernel_flavor>.der --root-pw
  • For nvidia-gfxG0X (X < 6):
# mokutil --import /usr/share/nvidia-pubkeys/MOK-nvidia-gfxG0<X>-<driver_version>-<kernel_flavor>.der --root-pw
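The file name is composed from the driver package name, driver version and kernel flavor; a sketch with hypothetical example values (substitute your own, or simply list the real files):

```shell
# Hypothetical example values -- substitute the ones on your system,
# or list the real files with: ls /usr/share/nvidia-pubkeys/MOK-*.der
driver=nvidia-driver-G06
version=570.133.07
flavor=default
der="/usr/share/nvidia-pubkeys/MOK-${driver}-${version}-${flavor}.der"
echo "$der"
```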

Then reboot the machine and enroll the certificate as described before.

As a last resort, in case you are having problems with secure boot, you can, at your own risk, disable validation for kernel modules:

# mokutil --disable-validation


Driver Update

During a driver update, the old and no longer used public key is registered for deletion from the MOK database. So in addition to the "Enroll MOK" menu entry, a "Delete MOK" entry will appear in the EFI tool once you reboot the machine. To finally remove the key from the MOK database, select "Delete MOK", then "Continue", then "Yes". Use your root password (US keyboard layout!) when prompted for a password. To make sure you do not delete the wrong key, you can display the certificate/description of the public key by selecting "View Key X"; press any key to continue from there.


Uninstalling the NVIDIA driver

Identify every package that has been installed from the NVIDIA repo. As the repository name may differ depending on how it was added, you first need to identify the correct repository:

# zypper lr
# zypper search -ir <repo name or number>

For a proper cleanup, you may want to uninstall all listed packages.

When installing the NVIDIA driver, the nouveau driver has been blacklisted. To ensure the open-source driver is allowed to take over again, make sure that there are no files containing the words blacklist nouveau in /etc/modprobe.d/. Also check that /etc/default/grub no longer contains rd.driver.blacklist=nouveau and that grub2-mkconfig -o /boot/grub2/grub.cfg has been run to update grub.cfg. Also remove or rename /lib/modprobe.d/09-nvidia-modprobe-bbswitch-G04.conf if that hasn't happened automatically. After uninstalling the packages, you might need to recreate the initrd by running:

# dracut -f --regenerate-all
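The modprobe.d check above can be scripted; a sketch demonstrated on a temporary directory (on a real system, point grep at /etc/modprobe.d/ instead):

```shell
# Find files that still blacklist nouveau, using a temporary directory
# as a stand-in for /etc/modprobe.d/.
d=$(mktemp -d)
echo 'blacklist nouveau' > "$d/50-blacklist.conf"    # sample leftover file
hits=$(grep -rl 'blacklist nouveau' "$d")
echo "$hits"    # lists the offending file(s)
rm -rf "$d"
```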

Troubleshooting

  • https://en.opensuse.org/SDB:NVIDIA_troubleshooting
  • If your computer freezes before the login screen after installing the proprietary drivers and you are using GDM (usually the case if you are using GNOME), try adding WaylandEnable=false to /etc/gdm/custom.conf. This will disable the GNOME Wayland session as an option during login. If you want Wayland to remain enabled, run the command below and reboot.
 # sudo ln -sfv /dev/null /etc/udev/rules.d/61-gdm.rules
  • You can verify the driver was actually loaded by running lsmod | grep nvidia in the terminal. The output should be like:
nvidia_drm             57344  2
nvidia_modeset       1187840  3 nvidia_drm
nvidia_uvm           1110016  0
nvidia              19771392  81 nvidia_uvm,nvidia_modeset
drm_kms_helper        229376  2 nvidia_drm,i915
drm                   544768  13 drm_kms_helper,nvidia_drm,i915

The numbers in the middle column do not need to be the same. If the driver is loaded, the problem lies elsewhere, since that means it was installed successfully.
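That check can be scripted as well; a sketch using sample lsmod-style text in place of the real lsmod output:

```shell
# Check whether the nvidia module appears in (sample) lsmod output;
# on a real system, pipe `lsmod` in instead of echoing sample text.
sample='nvidia_drm 57344 2
nvidia 19771392 81 nvidia_uvm,nvidia_modeset'
loaded=$(echo "$sample" | awk '$1 == "nvidia" { print "loaded" }')
echo "${loaded:-not loaded}"    # prints: loaded
```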

  • As stated before in this guide, if you are using secure boot make sure you accept the MOK, else the module won't load. One way to know if secure boot could be blocking the module is to look at the output of dmesg and search for warnings like the following:
Lockdown: modprobe: unsigned module loading is restricted; see man kernel_lockdown.7
modprobe: ERROR: could not insert 'nvidia': Required key not available

See also

Optimus and Switcheroo Control

Users on hardware configurations with NVIDIA Optimus (usually the case on laptops) are advised to read the Switcheroo Control instructions.

CUDA

Developers and users involved in High Performance Computing applications may want to install the CUDA libraries. Additional information is provided in the CUDA Toolkit Documentation, with download links at NVIDIA Developer.
These sources do not include support for Aeon, Kalpa and MicroOS users and can only be used for informational background.

openSUSE now has a cuda-cloud-opengpu meta package which maintains a working combination of CUDA tools, driver versions, container tools and available kernel. Previously, different repositories with different update cycles would cause mismatches between these packages.

NOTE: All previously installed separate NVIDIA packages must be removed first! The meta package must be the only directly installed package; otherwise it won't be able to control when linked updates between the driver, libraries, and kernel occur.

Note for Aeon, Kalpa, MicroOS users: The following actions will need to be done within an interactive transactional-update shell, i.e.: sudo transactional-update -c shell

1. Find what NVIDIA tools and drivers are installed:

# zypper search -si nvidia

2. Remove the installed NVIDIA packages to have a clean starting point:

# zypper rm --clean-deps '*nvidia*'

3. Verify that you've added the NVIDIA repository as described in the first chapter.

4. Add the CUDA repository:

# zypper addrepo -f https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64/ cuda

5. Install the cuda-cloud-opengpu meta package:

# zypper install cuda-cloud-opengpu

NVIDIA Container Toolkit

Developers and users working with containerized High Performance Computing applications may want to install the NVIDIA Container Toolkit to enable GPU usage within containers. Additional instructions are provided in the NVIDIA Container Toolkit Documentation.

The NVIDIA Container Toolkit is provided in the standalone NVIDIA-maintained repository mentioned in the NVIDIA documentation. It was split from the CUDA drivers repo in late 2024, since the two are only loosely coupled. Some CUDA driver repos may still include some versions of the container tools for backwards compatibility, but their usage from that repo is deprecated.

NVIDIA UVM Driver Not Loaded at Boot

The NVIDIA UVM driver is not loaded by nvidia-persistenced.service; it is normally loaded on demand when a library makes use of it, for example when running the nvidia-smi tool. If you use the NVIDIA GPU within a container that starts automatically with the system, the driver will not have been loaded and will not get loaded when used from within the container. To fix this, add a systemd service unit that runs nvidia-smi -L to list your NVIDIA GPUs, which has the side effect of loading the NVIDIA UVM driver. Set it as a weak dependency of the systemd service(s) that run containers using NVIDIA GPUs. Example:

# In file /etc/systemd/system/nvidia-uvm.service
[Unit]
Description=Create the NVIDIA UVM device node by calling nvidia-smi
After=nvidia-persistenced.service
Wants=nvidia-persistenced.service
ConditionPathExists=/usr/bin/nvidia-smi
# Don't run if there are no NVIDIA devices. The nvidia-persistenced.service will have created
# this device if any GPUs exist, as well as /dev/nvidia# nodes.
ConditionPathExists=/dev/nvidia-modeset

[Service]
Type=oneshot
# Use the call that just enumerates, which will load the devices for all GPUs.
ExecStart=/usr/bin/nvidia-smi -L
# Verify the UVM driver is present now, triggering a failure exit code if not
ExecStartPost=/bin/bash -c 'test -c /dev/nvidia-uvm'
RemainAfterExit=yes
User=root
TimeoutStartSec=30
Restart=no

[Install]
WantedBy=multi-user.target
# Make sure we get run any time docker.service needs to start, but don't block docker if we fail.
WantedBy=docker.service
# Add any other services that start containers needing the NVIDIA GPU as additional WantedBy= here.

Reload your systemd daemon so it knows about your new service, then enable it.

sudo systemctl daemon-reload
sudo systemctl enable nvidia-uvm.service

CDI Updating

If you use the CDI method of supplying NVIDIA GPU support to containers (using --device=nvidia.com/gpu=... instead of --gpus=...), you may run into breakage when the drivers update. The CDI configuration files need to be regenerated on most driver updates, or they will point to the wrong locations. For unattended servers that auto-update and auto-reboot, this can be automated by simply re-running the generation step on each boot.
Since a failure to generate the CDI description file properly will not cause the nvidia-ctk command to overwrite an existing file, this is generally safe to have as part of your startup.

Recommended: Give all your systemd services that start containers depending on the NVIDIA GPU a soft dependency on this service.

Example systemd service unit file:

# Within file /etc/systemd/system/nvidia-ctk-update.service
[Unit]
Description=Regenerate the NVIDIA CDI config
After=nvidia-persistenced.service
Wants=nvidia-persistenced.service
ConditionPathExists=/usr/bin/nvidia-ctk
# Don't run if there are no nvidia devices
ConditionPathExists=/dev/nvidia-modeset

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
RemainAfterExit=yes
User=root
TimeoutStartSec=30
Restart=no

[Install]
WantedBy=docker.service
# Add any other services that start containers and need NVIDIA GPU access via CDI as additional WantedBy= values

Then reload the systemd daemon so it finds your new file, and enable it to run at boot.

sudo systemctl daemon-reload
sudo systemctl enable nvidia-ctk-update.service

External links