NVIDIA driver not working after Fedora 33 install

This is the case with kernel 5.9, not with 5.8

Also, there is no need to download any 5.8.x kernel from koji; the latest (5.8.16) is in the repos.

With 5.8.16-300, as far as I remember, with my Nvidia 1060, I had the same problem as the author of the post.

If we all have Fedora 33 installed and are up to date, wouldn’t we all have the same kernel? Why would some Nvidia cards work and not others? Why does everything work fine on my other Fedora computer with a 1080 GPU?

This is the kernel version I am on:

$ uname -a
Linux main 5.8.16-300.fc33.x86_64 #1 SMP Mon Oct 19 13:18:33 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

I’m on 5.8, so the kernel should be supported then. What else could be the problem?
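
A quick way to double-check both points raised so far (which kernel series is actually running, and whether the nvidia module got loaded at all) is sketched below. It assumes a shell with uname and lsmod available; kernel_series is just a helper name made up for this example, not part of any Fedora tool.

```shell
#!/bin/sh
# Print the major.minor series of a kernel release string,
# e.g. "5.8.16-300.fc33.x86_64" -> "5.8".
kernel_series() {
    local release="$1"
    local major="${release%%.*}"   # text before the first dot
    local rest="${release#*.}"     # text after the first dot
    local minor="${rest%%.*}"      # text before the next dot
    printf '%s.%s\n' "$major" "$minor"
}

# Report the series of the running kernel.
echo "Running kernel series: $(kernel_series "$(uname -r)")"

# Report whether a module whose name starts with "nvidia" is currently loaded.
if lsmod 2>/dev/null | grep -q '^nvidia'; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is NOT loaded"
fi
```

If the second message says the module is not loaded on a 5.8 kernel, the problem is more likely the driver build or configuration than the kernel version itself.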

There were no errors during the installation.

One sad possibility is that the card is physically shot, but then you probably wouldn’t get a picture at all.

You are putting ideas in my head about getting one of Nvidia’s new graphics cards…

This computer happens to be dual-boot, so I switched over to Windows and confirmed that Windows can use the GPU just fine. So it can’t be the hardware.

Wanna give this a try?

By the way, AMD and Intel release open-source drivers for their GPUs, so there is no need to hassle with proprietary drivers that stop working from time to time…

sudo dnf install gcc kernel-headers kernel-devel akmod-nvidia xorg-x11-drv-nvidia xorg-x11-drv-nvidia-libs xorg-x11-drv-nvidia-libs.i686

to get the driver and all necessary dependencies.

Then wait at least 5 minutes for the modules to load, then

sudo akmods --force
sudo dracut --force

This would force the configuration to be read from the updated kernel modules which now have the NVIDIA drivers in them.

Then wait again, for at least 3 minutes, then reboot.
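
Instead of guessing how long to wait, one option is to ask the system whether a module exists for the running kernel before rebooting. This is only a sketch: it relies on `modinfo -F version nvidia`, which prints the module version when a module for the running kernel is installed and fails otherwise. driver_status is a hypothetical helper name, not part of akmods.

```shell
#!/bin/sh
# Turn the output of `modinfo -F version nvidia` into a readable status line.
# An empty argument means modinfo found no nvidia module for this kernel.
driver_status() {
    if [ -n "$1" ]; then
        echo "built: nvidia $1"
    else
        echo "missing: no nvidia module for this kernel yet"
    fi
}

# Query the module version; stay quiet and continue if modinfo fails.
version="$(modinfo -F version nvidia 2>/dev/null || true)"
driver_status "$version"
```

If this reports "missing", rebooting will most likely not help yet; if it reports a version, the module is in place for the running kernel.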

Thanks for listing it here, @florian. :heart:

Indeed. If you are getting yourself a new GPU, just follow the brilliant advice given by Linus Torvalds (find it here) and stay away from Nvidia.

The kernel module forcing part is now included in the post-installation hook, so it no longer needs to be run explicitly. The wait is also optional, as the modules would be loaded completely before poweroff and before boot.

(Find the reference here. Even the tool I wrote used to make users wait for this long until v0.2.5 and it was so inconvenient :laughing:)

OK, good to know.

@hx2a, so there is no need to wait as I suggested; you can run one command straight after another.

Thanks, and I appreciate that AMD and Intel release open-source drivers. I would go with that but I need the Nvidia card and drivers so I can use CUDA for ML research. I wish Nvidia would release open-source drivers also.

I’m happy to be a team player and help with the testing. To be clear on the steps here, I need to install the Python package in this repo and run the NVAutoInstFedora32 tool, even though I have Fedora 33 installed. I should try each of the different “installation modes” and see if any of them work. Right? If there are errors or something doesn’t work, should I report back here or open an issue in that repo?

I tried all these steps and it didn’t work. :cry: Thank you for the suggestion though. :grinning:

I just took a moment to validate my assumption that pytorch can only run on Nvidia graphics cards and it seems I might not have been correct. I actually don’t care at all about Nvidia cards, I just want pytorch to work.

Exactly :100:

That is what you would need to do.

No dice. I tried every installation mode, most of which said “nothing to do” because the packages were already installed. The --plcuda and --vidacc ones installed a bunch of stuff.

Are there other ideas I can try?

[ # ] NVIDIA AUTOINSTALLER FOR FEDORA 32 AND ABOVE
[ # ] CHECKING FOR GPU COMPATIBILITY...
[ ! ] Compatibility infomation was obtained
[ ✔ ] One or more active NVIDIA GPUs were detected
      01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)
[ ✔ ] An single dedicated GPU setup was detected
[ # ] GATHERING CURRENT HOST INFORMATION...
[ ! ] Host information was gathered
      System: Linux v5.8.16-300.fc33.x86_64
      Hostname: main
      Version: #1 SMP Mon Oct 19 13:18:33 UTC 2020
      Distribution: Fedora x86_64
[ # ] CHECKING FOR HOST COMPATIBILITY...
[ ✔ ] Supported OS detected
      This tool is expected to work correctly here
[ ✘ ] Leaving installer

That means the binary is working. (Hooray!) @florian @alciregi

Now let’s help you with your problem.

Yes? What? What happens?

It’s probably some configuration issue. Maybe the RPM Fusion documentation for NVIDIA drivers and CUDA can help. Or the installer tool from t0xic0der, though you seem to have run into some trouble.
https://rpmfusion.org/Howto/NVIDIA
https://rpmfusion.org/Howto/CUDA

If those don’t work, there’s also the negativo17 repo (https://negativo17.org/nvidia-driver/). The downside is that it’s a third party repo, and I prefer to keep those to a minimum. The plus side is that it’s focused on making it easy to install nvidia and CUDA drivers, at least as easy as that mess can be made. It’s been working fine for me through many Fedora releases and upgrades, especially with the akmod package (dkms gave me occasional trouble). I’m currently on kernel 5.8.16-200 (F32, but will be upgrading to F33 soon).

As a side note, gotta love it when someone asks for help with something known to work and the recommendations are to buy different hardware or make major system changes… Tis the internet.

Thank you, I appreciate that.

At the moment, when I run sudo dnf upgrade I get a list of skipped packages with conflicts:

Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 nvidia-driver-libs                          x86_64                3:455.32.00-1.el8                 cuda-rhel8-x86_64                 78 M
 nvidia-persistenced                         x86_64                3:455.32.00-1.el8                 cuda-rhel8-x86_64                 98 k
Skipping packages with broken dependencies:
 cuda-11-1                                   x86_64                11.1.1-1                          cuda-rhel8-x86_64                2.8 k
 cuda                                        x86_64                11.1.1-1                          cuda-rhel8-x86_64                2.7 k
 cuda-drivers                                x86_64                455.32.00-1                       cuda-rhel8-x86_64                7.0 k
 cuda-runtime-11-1                           x86_64                11.1.1-1                          cuda-rhel8-x86_64                2.7 k
 dnf-plugin-nvidia                           noarch                2.0-1.el8                         cuda-rhel8-x86_64                 12 k
 nvidia-driver                               x86_64                3:455.32.00-1.el8                 cuda-rhel8-x86_64                2.4 M
 nvidia-modprobe                             x86_64                3:455.32.00-1.el8                 cuda-rhel8-x86_64                 74 k
 nvidia-settings                             x86_64                3:455.32.00-1.el8                 cuda-rhel8-x86_64                1.7 M
 nvidia-xconfig                              x86_64                3:455.32.00-1.el8                 cuda-rhel8-x86_64                262 k

How do I fix these? These appeared after using the installation tool.

I see your point but in this case I actually appreciated the suggestion. I didn’t realize that non-Nvidia GPUs could meet my needs. Still, I need my current setup to work, and I know that there are probably other people out there who run into the same problem, get frustrated, don’t know how to get help, and then switch to Windows.

I have a GeForce GTX 1050, and I got the drivers I needed from rpmfusion and installed them with “dnf install kernel-devel kernel-headers akmod-nvidia nvidia* xorg-x11-drv-nvidia*”. When that was done I just rebooted after a short wait and it worked with no errors.

I looked up your card and found it is a legacy card supported by the 340 drivers. Instructions for downloading and installing the appropriate drivers are here. I see that you have the 455 drivers installed and they are too new to support your card. You will need to remove the currently installed drivers and install the ones that are correct for your card by explicitly giving the version number in the dnf command. To follow those directions you will have to enable the rpmfusion repo and disable the cuda-rhel8-x86_64 repo. That should also fix the conflict issues seen when trying to do an upgrade. In most cases you do not need all those cuda packages, so the packages skipped with dependency issues should be removed, as well as any nvidia packages already installed.
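
To see which of the skipped packages actually come from the cuda-rhel8-x86_64 repo before removing anything, something like the following might help. It is a rough sketch that assumes the six-column layout of the dnf output quoted earlier in this thread (name, arch, version, repo, size, unit); packages_from_repo and skipped.txt are made-up names for illustration. The script only prints the repo-disabling command rather than running it.

```shell
#!/bin/sh
# Read dnf-style table rows on stdin and print the package name (column 1)
# of every row whose repository column (column 4) equals the given repo id.
packages_from_repo() {
    awk -v repo="$1" '$4 == repo { print $1 }'
}

# Repo id as it appears in the dnf output quoted in this thread.
repo_id="cuda-rhel8-x86_64"

# Dry run: only print the command that would disable the repo
# (config-manager is provided by dnf-plugins-core).
echo "sudo dnf config-manager --set-disabled $repo_id"

# Example (skipped.txt is a hypothetical file holding the pasted dnf output):
#   packages_from_repo "$repo_id" < skipped.txt
```

Review the resulting package list by hand before passing any of it to dnf remove.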

It is quite possible the nvidia drivers that are the correct version (340) to support your card will not work with the latest kernels. You are free to try them but may need to get a newer card.

Just for your information though. Here you can see that support for your card was EOL in November 2019. It may or may not work for you.
