How to silence the nouveau driver?

asked 2016-08-10 09:59:11 -0500

techmago

updated 2016-08-10 18:01:25 -0500


I installed Fedora 24 on an old PC that I gave to a friend of mine who doesn't have one. It's an old machine, but it should be enough for his needs. After installing the desktop, I found that the nouveau driver is flooding the logs with error messages at a rate of 4-10+ messages per second. A good part of the machine's capacity is now spent writing error logs.

The video card is a really old Nvidia Riva TNT. I don't have the exact log right now, but from the research I did while I had the PC, it seems to happen because there is no fan connected to the card. I fitted an external fan, so this shouldn't be an issue, but the nouveau driver won't shut up about it.

I tried masking sysconfig and rsyslog, and tried every level of dmesg -n (1 to 6), but I can't silence the nouveau driver. The card is running fine. Is there any way to completely silence the nouveau driver or, if that's not possible, to disable dmesg entirely?

I think the error logs look like these:

    kernel: [ 1267.201053] [drm] nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 1/3 Class 0x0062 Mthd 0x0c0c Data 0x00000000:0x00000000
    kernel: [ 1267.201061] [drm] nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: ILLEGAL_MTHD, nStatus: PROTECTION_FAULT

If you silence nouveau, what would you do without a graphics driver?

powergame ( 2016-08-10 12:36:24 -0500 )

The best way to get help for this is to provide us the exact error message being generated; the range of possible solutions here is large. We may be able to address this issue with a more precise modification than attempting to prevent the nouveau kernel module from logging anything (not a good idea).

bitwiseoperator ( 2016-08-10 16:51:51 -0500 )

I added the log.

techmago ( 2016-08-11 15:31:17 -0500 )

answered 2016-08-12 11:22:07 -0500

updated 2016-08-12 11:23:18 -0500

As you know, the error you're seeing is resulting from nouveau's attempt to automatically control your graphics card's dysfunctional fan. Since you know the fan is bad and you have put in place a separate fan to cool the device (and you're satisfied with that solution), you can instruct the nouveau module to cease its attempts at automatically controlling the fan.

The setting that controls this behavior is a sysfs attribute, found in your /sys directory under the PCI address of the card, e.g.:

    /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable

You can find it by searching /sys with the find utility:

    $ sudo find /sys -name pwm1_enable
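To see the current value alongside each path, the search can be wrapped in a small helper. This is only a sketch: the function name find_pwm_enable and its optional root argument are made up for illustration, and on a real system it would need to run with enough privileges to read the files under /sys.

```shell
# find_pwm_enable [ROOT]: print every pwm1_enable file under ROOT (default /sys)
# together with its current value. The name and the ROOT argument are
# illustrative only; they are not part of any existing tool.
find_pwm_enable() {
    root="${1:-/sys}"
    find "$root" -name pwm1_enable 2>/dev/null | while IFS= read -r f; do
        # Each file holds a single number describing the fan control mode.
        printf '%s = %s\n' "$f" "$(cat "$f")"
    done
}
```

Calling find_pwm_enable with no argument searches /sys; if nothing is printed, the driver did not expose a pwm1_enable file for the card.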

Yours is likely set to the default value of "2". You should set it to "0" to tell the nouveau driver not to control the fan automatically. To make sure this persists across reboots, I'd set a udev rule by creating a file in /etc/udev/rules.d:

    $ sudo vim /etc/udev/rules.d/50-nouveau-hwmon.rules

Add the content below, then save and quit with :wq

    ACTION=="add", SUBSYSTEM=="hwmon", DRIVERS=="nouveau", ATTR{pwm1_enable}="0"
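If you'd rather not use an editor, the same rule file can be written from the shell. The sketch below assumes a helper named write_nouveau_rule (a made-up name) that takes an optional target directory, so you can try it against a scratch directory before writing to /etc/udev/rules.d as root.

```shell
# write_nouveau_rule [DIR]: write the udev rule from this answer into DIR.
# DIR defaults to /etc/udev/rules.d, which requires root. The function name
# and the DIR argument are illustrative only.
write_nouveau_rule() {
    dir="${1:-/etc/udev/rules.d}"
    # Quoted heredoc delimiter: the rule is written exactly as typed.
    cat > "$dir/50-nouveau-hwmon.rules" <<'EOF'
ACTION=="add", SUBSYSTEM=="hwmon", DRIVERS=="nouveau", ATTR{pwm1_enable}="0"
EOF
}

# From a root shell, with no argument, this writes the real rule file:
#   write_nouveau_rule
# Rebooting applies it. Reloading the rules with "udevadm control --reload"
# should also make udev pick up the new file, but I have not verified that
# it takes effect on this card without a reboot.
```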

Once you've added that rule, reboot and check the kernel ring buffer (dmesg) to confirm that the error no longer appears.
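One rough way to do that check is to count the matching lines instead of reading through the whole buffer. The helper name count_pgraph_errors is made up, and the pattern is based only on the two log lines quoted in the question, so adjust it if your messages look different.

```shell
# count_pgraph_errors: read kernel log text on stdin and print how many lines
# look like the nouveau PGRAPH_ERROR messages quoted in the question.
# The function name is illustrative only.
count_pgraph_errors() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps that case from being treated as a failure.
    grep -c 'nouveau.*PGRAPH_ERROR' || true
}

# Usage after the reboot (dmesg may need sudo on some setups):
#   dmesg | count_pgraph_errors
```

Running it twice a few seconds apart shows whether the count is still growing.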

You may want to monitor the temperature of the GPU by some other means (perhaps using lm_sensors). Or you could just try to find a low-cost replacement card.

Props to the ArchWiki for being a good resource.

