[Solution] Modular Filtering Issue for NVIDIA Drivers/CUDA Sources Conflict

Recently, several users reported modular filtering issues that appear when the NVIDIA drivers can be fetched from more than one source. In that situation, dnf fails to resolve dependencies and reports them as broken, even though I investigated the sources themselves and found nothing wrong with them.

To give some deeper context: whenever someone enables the RPM Fusion NVIDIA repository by executing the following commands, they opt in to installing the driver from there.

$ sudo dnf install -y fedora-workstation-repositories
$ sudo dnf config-manager --set-enabled rpmfusion-nonfree-nvidia-driver
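To confirm that the repository really ended up enabled, one option is to inspect the .repo file directly. The following is a minimal sketch, not part of the installer: the function name repo_is_enabled is hypothetical, and it assumes the repository is defined in a file such as /etc/yum.repos.d/rpmfusion-nonfree-nvidia-driver.repo with a section named after the repository ID.

```shell
# Succeeds when the given [section] of a .repo file contains enabled=1.
# repo_is_enabled is a hypothetical helper; both arguments are supplied by the caller.
repo_is_enabled() {
    local repo_file="$1" repo_id="$2"
    [ -r "$repo_file" ] || return 2
    awk -v id="[$repo_id]" '
        /^\[/ { in_section = ($0 == id) }    # track which [section] we are in
        in_section && /^enabled[ \t]*=[ \t]*1[ \t]*$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$repo_file"
}

# Example (path and section name are assumptions):
# repo_is_enabled /etc/yum.repos.d/rpmfusion-nonfree-nvidia-driver.repo \
#     rpmfusion-nonfree-nvidia-driver && echo "enabled"
```

The function returns 2 when the file cannot be read, so a missing repository file can be told apart from a disabled one.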

From what @boistordu was able to deduce, this enables a module which includes packages like akmod-nvidia and its dependencies. So whenever someone later executes the following command, dnf looks up those packages in the list of enabled modules, which now includes the rpmfusion-nonfree-nvidia-driver module as well.

$ sudo dnf install -y gcc kernel-headers kernel-devel akmod-nvidia xorg-x11-drv-nvidia xorg-x11-drv-nvidia-libs

The above command runs just fine on fresh and upgraded installations that have only the rpmfusion-nonfree-nvidia-driver module enabled, but not on those that have both the rpmfusion-nonfree-nvidia-driver and cuda-fedora33 modules enabled (observed on Fedora 34; the behaviour on Fedora 33 still needs investigation). Attempting to install or upgrade the drivers there would most likely result in the following output (command-line excerpt provided by @boistordu).

[ # ] NVIDIA AUTOINSTALLER FOR FEDORA
[ ★ ] CHECKING SUPERUSER PERMISSIONS...
[ ✓ ] Superuser privilege acquired
[ ★ ] CHECKING AVAILABILITY OF RPM FUSION NVIDIA REPOSITORY...
[ ! ] RPM Fusion repository for Proprietary NVIDIA Driver was detected
[ ★ ] ATTEMPTING CONNECTION TO RPM FUSION SERVERS...
[ ✓ ] Connection to RPM Fusion servers was established
[ ★ ] LOOKING FOR EXISTING DRIVER PACKAGES...
[ ! ] No existing NVIDIA driver packages were detected
[ ★ ] INSTALLING PROPRIETARY DRIVERS...
Last metadata expiration check: 0:05:15 ago on Fri 30 Apr 2021 12:50:01 AM CEST.
Package gcc-11.0.1-0.3.fc34.x86_64 is already installed.
Package kernel-headers-5.11.16-300.fc34.x86_64 is already installed.
Package kernel-devel-5.11.15-200.fc33.x86_64 is already installed.
Package kernel-devel-5.11.16-200.fc33.x86_64 is already installed.
Package kernel-devel-5.11.16-300.fc34.x86_64 is already installed.
All matches were filtered out by modular filtering for argument: xorg-x11-drv-nvidia
Error: Unable to find a match: xorg-x11-drv-nvidia
[ ✗ ] Proprietary drivers could not be installed
[ ✗ ] Leaving installer
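If you capture dnf's output in a script, this failure mode can be recognised from the message itself. Below is a small sketch under that assumption; the function names hit_modular_filtering and filtered_packages are hypothetical, and the matched text is copied from the excerpt above.

```shell
# Succeeds when captured dnf output contains the modular filtering message.
# hit_modular_filtering is a hypothetical helper; $1 is the captured output.
hit_modular_filtering() {
    printf '%s\n' "$1" | grep -q 'filtered out by modular filtering'
}

# Prints the package names that dnf reported as filtered out, one per line.
filtered_packages() {
    printf '%s\n' "$1" |
        sed -n 's/^All matches were filtered out by modular filtering for argument: //p'
}

# Example:
# out=$(sudo dnf install -y xorg-x11-drv-nvidia 2>&1)
# hit_modular_filtering "$out" && filtered_packages "$out"
```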

So what exactly is happening here? The device on which the driver installation was attempted had the rpmfusion-nonfree-nvidia-driver module enabled first, and then the cuda-fedora33 module was enabled. As a result, the attempt to fetch one of the akmod-nvidia dependencies (xorg-x11-drv-nvidia) failed. One might say that the inability to find xorg-x11-drv-nvidia makes no sense, as it is present in the rpmfusion-nonfree-nvidia-driver module, which is in fact enabled here.

But here’s the thing: the cuda-fedora33 module seems to override the rpmfusion-nonfree-nvidia-driver module, which is why dnf claims it cannot find xorg-x11-drv-nvidia even though it is present. At first glance, this suggests that the two repositories are incompatible, cannot coexist, and that any attempt to install or upgrade your drivers would fail. Let us take the example of my ageing laptop here, which already has the drivers installed.

I have a PRIME configuration enabled, which is why you see just one GPU (the discrete card) in the output of screenfetch. Please note that this has no bearing on the problem we are discussing, but I have stated it anyway for the sake of clarity. I have appended the output of date as well, should you have trust issues, since I will be experimenting with adding repositories in the coming screenshots.

My current working directory is /etc/yum.repos.d. This article about adding and removing software repositories in Fedora (which you can find here) states that this is where the repository files are located. Also, please note that I do not have a cuda-fedora33.repo file here, which means that I do not have the cuda-fedora33 module enabled.
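The same check can be done without a screenshot. This is a minimal sketch; has_repo_file is a hypothetical helper, and the directory defaults to /etc/yum.repos.d as described above.

```shell
# Succeeds when the named .repo file exists in the repository directory.
# has_repo_file is a hypothetical helper; the second argument is optional.
has_repo_file() {
    local name="$1" dir="${2:-/etc/yum.repos.d}"
    [ -f "$dir/$name" ]
}

# Example:
# has_repo_file cuda-fedora33.repo || echo "CUDA repository not added"
```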

I attempt updating the applications, and this happens just fine, without any errors. Now, I will enable the cuda-fedora33 module by executing the following command, and then try updating the applications again.

$ sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora33/x86_64/cuda-fedora33.repo

This is where the problems start to appear: dnf reports that newer versions of the installed dependencies are available, though attempting to install them would most likely fail because their own dependencies are unavailable. Having the cuda-fedora33 module enabled is supposedly mandatory for installing and upgrading cuda and its dependencies, but is that really so?

Attempting to install cuda once the cuda-fedora33 module is enabled results in the following output, which is full of errors stating that some dependencies are unavailable and that others are available in two different places. The documentation for installing CUDA (which you can find here) states that the nvidia-driver module must be disabled before proceeding to install cuda. Let us try to do that by executing the following command.

$ sudo dnf module disable nvidia-driver

For diagnostic purposes, I have shown the count of repositories listed in the /etc/yum.repos.d directory before and after disabling the said module, and there is no difference between the two. This is important to note, as we can infer that this operation does not affect the repository from which you originally acquired the drivers, and rpmfusion-nonfree-nvidia-driver stays put. This contradicts our earlier assumption that the rpmfusion-nonfree-nvidia-driver and cuda-fedora33 modules cannot coexist. Let us try installing cuda again by executing the following command.
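The before-and-after comparison can be reproduced with a small counting helper. This is only a sketch: count_repo_files is a hypothetical name, and it assumes repository definitions are the files ending in .repo directly inside the directory.

```shell
# Prints how many .repo files sit directly inside the repository directory.
# count_repo_files is a hypothetical helper; the directory argument is optional.
count_repo_files() {
    local dir="${1:-/etc/yum.repos.d}"
    find "$dir" -maxdepth 1 -type f -name '*.repo' | wc -l | tr -d ' '
}

# Example:
# before=$(count_repo_files)
# sudo dnf module disable nvidia-driver
# after=$(count_repo_files)
# [ "$before" = "$after" ] && echo "repository files untouched"
```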

$ sudo dnf install cuda

It seems to run just fine now. The update also runs fine without any errors.

So it seems that both modules can indeed coexist and are compatible with each other to a large extent, except for the nvidia-driver module, which needs to be explicitly disabled by executing the following command after cuda-fedora33.repo has been added.

$ sudo dnf module disable nvidia-driver
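Put together, the order of operations that worked for me can be sketched as a short script. This is not how the installer implements it; the step and setup_cuda functions and the RUN switch are hypothetical, and by default the script only prints what it would do.

```shell
# Repository URL as used earlier in this post.
REPO_URL="https://developer.download.nvidia.com/compute/cuda/repos/fedora33/x86_64/cuda-fedora33.repo"

# Runs a command with sudo when RUN=1, otherwise only prints it.
# RUN is a hypothetical switch introduced for this sketch.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        sudo "$@"
    else
        echo "would run: sudo $*"
    fi
}

# Adds the CUDA repository, disables the nvidia-driver module, installs cuda.
# Stops at the first step that fails.
setup_cuda() {
    step dnf config-manager --add-repo "$REPO_URL" &&
    step dnf -y module disable nvidia-driver &&
    step dnf -y install cuda
}

# Example: RUN=1; setup_cuda
```

Keeping the module disable step between adding the repository and installing cuda mirrors the sequence described above.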

I will introduce this change in the coming version of NVIDIA Auto Installer for Fedora, and I hope that this discussion can be of assistance to those facing this issue. As these are my own deductions and inferences, they might contain inaccuracies and mistakes, so please feel free to let me know if you find any.

References

  1. Modularity - dnf latest documentation
  2. Why does rpmfusion CUDA install HOWTO say "sudo dnf module disable nvidia-driver" - disable nvidia-driver?
  3. Howto/CUDA - RPM Fusion
  4. Rpm - Nvidia driver on Fedora 34, seems to have issues - #2 by ankursinha
  5. Fedora 33 support · Issue #55 · t0xic0der/nvidia-auto-installer-for-fedora · GitHub

v0.3.6 of NVIDIA Auto Installer for Fedora is now generally available in COPR, including the change mentioned in the above discussion. The discussions leading to this change and its consequences can be found in this pull request.

Upgrading is highly recommended.

Please execute the following commands in succession, if you do not have it installed already.

# dnf install dnf-plugins-core -y
# dnf copr enable t0xic0der/nvidia-auto-installer-for-fedora -y
# dnf install nvautoinstall -y
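If you already have it installed and want to check whether you are on v0.3.6 or later, a version comparison along these lines may help. This is a sketch: version_at_least is a hypothetical helper, and it relies on the version ordering of GNU sort (sort -V).

```shell
# Succeeds when version $1 is equal to or newer than version $2.
# version_at_least is a hypothetical helper built on GNU sort's -V ordering.
version_at_least() {
    local have="$1" want="$2"
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Example:
# have=$(rpm -q --qf '%{VERSION}' nvautoinstall)
# version_at_least "$have" 0.3.6 || echo "please upgrade nvautoinstall"
```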

For those interested, builds for v0.3.6 fail for the openSUSE Leap 15.1 x86_64 chroot (builds 2174254 and 2174257 in t0xic0der/nvidia-auto-installer-for-fedora). Not that it matters here anyway: for Fedora, the builds work just fine.

There goes my Sunday :frowning_face:
