Where to report AMD GPU lockups/hangs seemingly triggered by Firefox / VA-API after resuming from suspend?

Since June 24th (or maybe a few days earlier) I’ve been experiencing debilitating AMD GPU lockups (radeonsi; my card is a “Pitcairn” R9 270) on my main Fedora 35 workstation, running GNOME on Xorg with the default open source AMD drivers and the default Firefox package provided by Fedora, fully up to date. They typically happen when opening a page in a new tab in Firefox, particularly (or always?) when the page contains a video (a YouTube tab, for example).

The system then locks up solidly, with this typical dreaded error in journalctl:

jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10082msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000001761 last fence id 0x0000000000001762 on ring 5)
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 10120msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000002855a last fence id 0x0000000000028565 on ring 0)
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 10080msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000007dc7 last fence id 0x0000000000007dcd on ring 3)
jun 30 13:22:46 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10585msec
jun 30 13:22:46 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000001761 last fence id 0x0000000000001762 on ring 5)

…etc.
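
To see at a glance which rings are involved, the "ring N stalled" lines can be tallied with a short script. This is only a sketch in Python: the regular expression is written against the exact message format shown above, and `summarize_stalls` is a made-up helper name, not part of any existing tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   radeon 0000:02:00.0: ring 5 stalled for more than 10082msec
STALL_RE = re.compile(r"radeon (\S+): ring (\d+) stalled for more than (\d+)msec")

# A few lines copied from the log above.
SAMPLE = """\
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10082msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 10120msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 10080msec
jun 30 13:22:46 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10585msec
"""


def summarize_stalls(text):
    """Return (rings in order of first appearance, stall-message count per ring)."""
    counts = Counter()
    order = []
    for match in STALL_RE.finditer(text):
        ring = int(match.group(2))
        if ring not in counts:
            order.append(ring)
        counts[ring] += 1
    return order, dict(counts)


print(summarize_stalls(SAMPLE))  # ([5, 0, 3], {5: 2, 0: 1, 3: 1})
```

In practice the text could come from something like `journalctl -k -b -1` (kernel messages from the previous boot) saved to a file. Which ring number corresponds to which engine is not stated in these lines, so the script deliberately reports numbers only.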

Only a full reboot via SSH works (and even then, it takes forever to do so, because you have to wait for systemd to “give up” waiting for Firefox and the filesystems to unmount at the end).

Those “ring stalled” / “GPU lockup” errors are immediately preceded by these, so I’m not sure if it’s actually caused by the VA-API implementation in Firefox, or if it’s just triggered by it and the bug is in Mesa, the kernel, etc.:

jun 30 13:22:32 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:32 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:32 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:35 workstation gnome-shell[5654]: [2022-06-30T17:22:35Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
jun 30 13:22:35 workstation rtkit-daemon[1456]: Recovering from system lockup, not allowing further RT threads.
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:35 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:36 workstation gnome-shell[5654]: [2022-06-30T17:22:36Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
jun 30 13:22:36 workstation gnome-shell[5654]: [2022-06-30T17:22:36Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
jun 30 13:22:36 workstation gnome-shell[5654]: [2022-06-30T17:22:36Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
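
One way to check how closely the two groups of messages line up is to subtract the reported stall duration from the time the stall was logged. The sketch below does that arithmetic in Python. It assumes that "stalled for more than N msec" counts from roughly the moment the ring stopped making progress, and that the short journal timestamps have one-second resolution; `stall_start_window` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def stall_start_window(logged_at, stalled_msec):
    """Return (earliest, latest) datetimes at which the ring may have stopped.

    The journal line only carries whole seconds, so the message was logged
    somewhere in [logged_at, logged_at + 1s). The date part is a placeholder.
    """
    logged = datetime.strptime(logged_at, "%H:%M:%S")
    stall = timedelta(milliseconds=stalled_msec)
    return logged - stall, logged + timedelta(seconds=1) - stall


# First stall in the log above: logged at 13:22:45, "more than 10082msec".
earliest, latest = stall_start_window("13:22:45", 10082)
print(earliest.time(), latest.time())
```

For the first stall message this gives a window of roughly 13:22:34.9 to 13:22:35.9, which overlaps the last batch of libva lines at 13:22:35. That is consistent with the VA-API initialisation being the trigger, but it does not show where the underlying bug lives.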

At first I thought, maybe Mesa 21.3.9 fixes this, since it reportedly fixes “a crash in radeonsi driver”, but nope, it still occurs with that version.

Now, after days of headbanging, trying different kernels, trying with the amdgpu.dpm=0 kernel boot option, etc., I think I narrowed down the bug trigger to these conditions, which make it nearly 100% reproducible for me:

  • The system must have been suspended (put to sleep) once, then resumed
  • The system must be running in the Xorg version of GNOME; much to my surprise, the hang doesn’t seem to occur when running under the Wayland version of GNOME // Update: it does happen with Wayland too.
  • The issue is then triggered by trying to load a YouTube video tab (or play a video in an existing tab)

My question to you now is: where do I file a bug about this?

As you can see, the main issue is that whenever I encounter GPU lockups, I’m never sure who the culprit is: upstream, downstream, Firefox, Mesa, Mutter/GNOME-Shell, Xorg vs Wayland, the Linux kernel, etc., so I’m at a loss as to where the bug report should effectively go. Fedora’s “How to file a bug” guide (if that’s the right place to look) doesn’t have a section explaining how to work out which part of this complex middleware+userland mix is to blame, and how to triage/troubleshoot those types of mandelbugs.
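
As a first triage step, it can help to list which journal identifiers were logging in the seconds before the first "GPU lockup" line. The Python sketch below does only that. It does not decide which component is at fault. The line format is assumed to be the short journalctl format seen in this post, the 15-second window is an arbitrary choice, and `sources_before_lockup` is a hypothetical helper name.

```python
import re
from collections import Counter

# "jun 30 13:22:33 workstation gnome-shell[4682]: message" (the PID part is optional)
LINE_RE = re.compile(
    r"^\w{3} \d{2} (\d{2}):(\d{2}):(\d{2}) \S+ ([^\s\[:]+)(?:\[\d+\])?: (.*)$"
)

SAMPLE = [
    "jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0",
    "jun 30 13:22:35 workstation rtkit-daemon[1456]: Recovering from system lockup, not allowing further RT threads.",
    "jun 30 13:22:35 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0",
    "jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10082msec",
    "jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000001761 last fence id 0x0000000000001762 on ring 5)",
]


def sources_before_lockup(lines, window_s=15):
    """Count messages per identifier in the window_s seconds before the first lockup.

    Times are compared as seconds since midnight, so a log that crosses
    midnight is not handled by this sketch.
    """
    parsed = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            hours, minutes, seconds, ident, message = match.groups()
            stamp = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            parsed.append((stamp, ident, message))
    lockup = next(
        (t for t, ident, msg in parsed if ident == "kernel" and "GPU lockup" in msg),
        None,
    )
    if lockup is None:
        return Counter()
    return Counter(ident for t, ident, _ in parsed if lockup - window_s <= t < lockup)


print(sources_before_lockup(SAMPLE))
```

On the sample lines this reports gnome-shell (which is where Firefox's libva output ends up in this journal) and rtkit-daemon as the only identifiers active just before the kernel reported the lockup.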

If I didn’t miss something obvious here, and unless the ask.fedora forums are the main place to do the initial troubleshooting, then maybe this is an opportunity for the Fedora community to improve its guidance on how to report those types of bugs? :thinking:


Update: I spoke too soon / was too optimistic. Although I thought the issue had stopped occurring when running under Wayland, it turns out it still happens, maybe just less frequently? After letting the computer auto-suspend and waking it up, I tried resuming playback of a YouTube video and it immediately froze the system; the screen went black after a few seconds. The machine was still accessible over SSH as usual; below is what I saw in the logs.

When the computer goes to sleep by itself:

Jun 30 18:14:30 workstation kernel: Freezing user space processes ... (elapsed 0.003 seconds) done.
Jun 30 18:14:30 workstation kernel: OOM killer disabled.
Jun 30 18:14:30 workstation kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Jun 30 18:14:30 workstation kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Jun 30 18:14:30 workstation kernel: serial 00:03: disabled
Jun 30 18:14:30 workstation kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Jun 30 18:14:30 workstation kernel: parport_pc 00:02: disabled
Jun 30 18:14:30 workstation kernel: sd 0:0:0:0: [sda] Stopping disk
Jun 30 18:14:30 workstation kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Jun 30 18:14:30 workstation kernel: sd 3:0:0:0: [sdc] Synchronizing SCSI cache
Jun 30 18:14:30 workstation kernel: sd 1:0:0:0: [sdb] Stopping disk
Jun 30 18:14:30 workstation kernel: sd 3:0:0:0: [sdc] Stopping disk
Jun 30 18:14:30 workstation kernel: PM: suspend devices took 7.421 seconds
Jun 30 18:14:30 workstation kernel: ACPI: PM: Preparing to enter system sleep state S3
Jun 30 18:14:30 workstation kernel: ACPI: PM: Saving platform NVS memory
Jun 30 18:14:30 workstation kernel: Disabling non-boot CPUs ...
Jun 30 18:14:30 workstation kernel: smpboot: CPU 1 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 2 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 3 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 4 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 5 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 6 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 7 is now offline

When waking the system from suspend:

Jun 30 18:14:30 workstation kernel: ACPI: PM: Low-level resume complete
Jun 30 18:14:30 workstation kernel: ACPI: PM: Restoring platform NVS memory
Jun 30 18:14:30 workstation kernel: Enabling non-boot CPUs ...
Jun 30 18:14:30 workstation kernel: x86: Booting SMP configuration:
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 1 APIC 0x2
Jun 30 18:14:30 workstation kernel: CPU1 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 2 APIC 0x4
Jun 30 18:14:30 workstation kernel: CPU2 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 3 APIC 0x6
Jun 30 18:14:30 workstation kernel: CPU3 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 4 APIC 0x1
Jun 30 18:14:30 workstation kernel: CPU4 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 5 APIC 0x3
Jun 30 18:14:30 workstation kernel: CPU5 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 6 APIC 0x5
Jun 30 18:14:30 workstation kernel: CPU6 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 7 APIC 0x7
Jun 30 18:14:30 workstation kernel: CPU7 is up
Jun 30 18:14:30 workstation kernel: ACPI: PM: Waking up from system sleep state S3
Jun 30 18:14:30 workstation kernel: usb usb3: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb4: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb5: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb6: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb7: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb8: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: sd 0:0:0:0: [sda] Starting disk
Jun 30 18:14:30 workstation kernel: sd 1:0:0:0: [sdb] Starting disk
Jun 30 18:14:30 workstation kernel: sd 3:0:0:0: [sdc] Starting disk
Jun 30 18:14:30 workstation kernel: tg3 0000:05:00.0 enp5s0: Link is down
Jun 30 18:14:30 workstation kernel: parport_pc 00:02: activated
Jun 30 18:14:30 workstation kernel: [drm] PCIE gen 2 link speeds already enabled
Jun 30 18:14:30 workstation kernel: [drm] PCIE GART of 2048M enabled (table at 0x00000000001D6000).
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: WB enabled
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
Jun 30 18:14:30 workstation kernel: serial 00:03: activated
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 6 use gpu addr 0x0000000080000c18
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 7 use gpu addr 0x0000000080000c1c
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: [drm] ring test on 0 succeeded in 3 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 1 succeeded in 1 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 2 succeeded in 1 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 3 succeeded in 6 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 4 succeeded in 5 usecs
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_uvd' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: usb 8-1: reset full-speed USB device number 2 using uhci_hcd
Jun 30 18:14:30 workstation kernel: usb 3-2: reset full-speed USB device number 3 using uhci_hcd
Jun 30 18:14:30 workstation kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jun 30 18:14:30 workstation kernel: ata6: SATA link down (SStatus 0 SControl 300)
Jun 30 18:14:30 workstation kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 30 18:14:30 workstation kernel: ata5: SATA link down (SStatus 0 SControl 300)
Jun 30 18:14:30 workstation kernel: ata1.00: configured for UDMA/133
Jun 30 18:14:30 workstation kernel: [drm] ring test on 5 succeeded in 2 usecs
Jun 30 18:14:30 workstation kernel: [drm] UVD initialized successfully.
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_vce1' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_vce2' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: [drm] ring test on 6 succeeded in 14 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 7 succeeded in 4 usecs
Jun 30 18:14:30 workstation kernel: [drm] VCE initialized successfully.
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 1 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 2 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 4 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: usb 3-1: reset low-speed USB device number 2 using uhci_hcd
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 5 succeeded
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 6 succeeded
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 7 succeeded
Jun 30 18:14:30 workstation kernel: tg3 0000:05:00.0 enp5s0: Link is up at 1000 Mbps, full duplex
Jun 30 18:14:30 workstation kernel: tg3 0000:05:00.0 enp5s0: Flow control is on for TX and on for RX
Jun 30 18:14:30 workstation kernel: [drm:si_dpm_set_power_state [radeon]] *ERROR* si_set_sw_state failed
Jun 30 18:14:30 workstation kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 30 18:14:30 workstation kernel: ata2.00: configured for UDMA/133
Jun 30 18:14:30 workstation kernel: ata4: link is slow to respond, please be patient (ready=0)
Jun 30 18:14:30 workstation kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jun 30 18:14:30 workstation kernel: ata4.00: configured for UDMA/133
Jun 30 18:14:30 workstation kernel: PM: resume devices took 7.609 seconds
Jun 30 18:14:30 workstation kernel: OOM killer enabled.
Jun 30 18:14:30 workstation kernel: Restarting tasks ... done.
Jun 30 18:14:30 workstation kernel: PM: suspend exit
Jun 30 18:14:30 workstation kernel: rfkill: input handler enabled

Jun 30 18:14:30 workstation rtkit-daemon[1007]: The canary thread is apparently starving. Taking action.
Jun 30 18:14:30 workstation systemd-resolved[970]: Clock change detected. Flushing caches.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Demoting known real-time threads.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Successfully demoted thread 24438 of process 24267 (/usr/lib64/firefox/firefox).
Jun 30 18:14:30 workstation systemd-sleep[27081]: System returned from sleep state.

Jun 30 18:14:30 workstation rtkit-daemon[1007]: Successfully demoted thread 16380 of process 14136 (/usr/lib64/firefox/firefox).
Jun 30 18:14:30 workstation systemd[1]: systemd-suspend.service: Deactivated successfully.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Successfully demoted thread 13894 of process 13727 (/usr/lib64/firefox/firefox).
Jun 30 18:14:30 workstation systemd[1]: Finished System Suspend.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Demoted 3 threads.
Jun 30 18:14:30 workstation systemd[1]: Stopped target Sleep.
Jun 30 18:14:30 workstation gdm[1101]: GLib: Source ID 91 was not found when attempting to remove it
Jun 30 18:14:30 workstation systemd[1]: Reached target Suspend.
Jun 30 18:14:30 workstation systemd[1]: Stopped target Suspend.
Jun 30 18:14:30 workstation systemd-logind[1010]: Operation 'sleep' finished.
Jun 30 18:14:30 workstation kernel: rfkill: input handler disabled

Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-2
Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1
Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-2
Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1
Jun 30 18:14:30 workstation systemd-resolved[970]: enp5s0: Bus client set DNS server list to: fdd0:edd8:b735::1

Jun 30 18:14:30 workstation chronyd[1025]: Forward time jump detected!

Jun 30 18:14:31 workstation gnome-shell[1889]: Object .MetaInputDeviceNative (0x7f9e7c14e0f0), has been already disposed — impossible to get any property from it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
Jun 30 18:14:31 workstation gnome-shell[1889]: == Stack trace for context 0x559715bb31f0 ==
Jun 30 18:14:31 workstation gnome-shell[1889]: #0   559715e9ce18 i   resource:///org/gnome/shell/ui/keyboard.js:1175 (21877dcc9470 @ 3)
Jun 30 18:14:31 workstation gnome-shell[27301]: The XKEYBOARD keymap compiler (xkbcomp) reports:
Jun 30 18:14:31 workstation gnome-shell[27301]: > Warning:          Unsupported maximum keycode 708, clipping.
Jun 30 18:14:31 workstation gnome-shell[27301]: >                   X11 cannot support keycodes above 255.
Jun 30 18:14:31 workstation gnome-shell[27301]: Errors from xkbcomp are not fatal to the X server

Jun 30 18:14:33 workstation wireplumber[2058]: <WpSiAudioAdapter:0x565336853250> Object activation aborted: proxy destroyed
Jun 30 18:14:33 workstation wireplumber[2058]: <WpSiAudioAdapter:0x565336853250> failed to activate item: Object activation aborted: proxy destroyed

Jun 30 18:14:34 workstation audit: BPF prog-id=73 op=LOAD
Jun 30 18:14:34 workstation systemd[1]: Starting Fingerprint Authentication Daemon...

Jun 30 18:14:35 workstation gnome-shell[1889]: Timelines with detached actors are not supported

And when the bug happens, upon attempting to play a YouTube video:

Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: VA-API version 1.13.0
Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: Found init function __vaDriverInit_1_13
Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: va_openDriver() returns 0
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10080msec
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000f3cb last fence id 0x000000000000f3cd on ring 5)
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 10081msec
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000007120a last fence id 0x0000000000071238 on ring 3)
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 10424msec
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10584msec
[this blah blah gets repeated hundreds of times]

Jun 30 18:15:01 workstation systemd[1]: session-20.scope: Deactivated successfully.
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 18648msec
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000007120a last fence id 0x0000000000071238 on ring 3)
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 18992msec
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 19152msec
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000f3cb last fence id 0x000000000000f3cd on ring 5)
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 19153msec
[this blah blah gets repeated a dozen more times]

Jun 30 18:15:04 workstation systemd[1]: fprintd.service: Deactivated successfully.
Jun 30 18:15:04 workstation audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 30 18:15:04 workstation audit: BPF prog-id=0 op=UNLOAD

Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 20664msec
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000007120a last fence id 0x0000000000071238 on ring 3)
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 21009msec
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 21168msec
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000f3cb last fence id 0x000000000000f3cd on ring 5)
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 21168msec
[this blah blah gets repeated two dozen more times]
Jun 30 18:15:08 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0: Saved 1217 dwords of commands on ring 0.
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0: GPU softreset: 0x0000034C
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS               = 0xA0003028
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS               = 0x200A0FC0
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000802
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x800000E3
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44CFC046
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: Wait for MC idle timedout !
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: GRBM_SOFT_RESET=0x0000DDFF
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: SRBM_SOFT_RESET=0x00120500
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS               = 0x00003028
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS               = 0x200806C0
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: GPU reset succeeded, trying to resume

Jun 30 18:15:15 workstation kernel: [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
Jun 30 18:15:15 workstation kernel: [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BFBA (len 254, WS 0, PS 4) @ 0xBFE4
Jun 30 18:15:15 workstation kernel: [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B68E (len 94, WS 12, PS 8) @ 0xB6D7
Jun 30 18:15:15 workstation kernel: [drm] PCIE gen 2 link speeds already enabled
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: Wait for MC idle timedout !
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: Wait for MC idle timedout !
Jun 30 18:15:15 workstation kernel: [drm] PCIE GART of 2048M enabled (table at 0x00000000001D6000).
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: WB enabled
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: failed VCE resume (-22).
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
Jun 30 18:15:16 workstation gsd-power[2215]: Error setting property 'PowerSaveMode' on interface org.gnome.Mutter.DisplayConfig: Timeout was reached (g-io-error-quark, 24)
Jun 30 18:15:16 workstation kernel: [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
Jun 30 18:15:16 workstation kernel: [drm:si_resume [radeon]] *ERROR* si startup failed on resume
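
The two register dumps around the soft reset are easier to compare mechanically than by eye. The Python sketch below parses "NAME = 0x..." pairs and reports which registers changed, along with the bits that flipped. It does not interpret the bits, since that needs the register definitions from the radeon driver source. `parse_registers` and `changed_bits` are hypothetical helper names, and the sample values are copied from the dumps above.

```python
import re

REG_RE = re.compile(r"(\w+)\s+=\s+(0x[0-9A-Fa-f]+)")

BEFORE = """\
radeon 0000:02:00.0:   GRBM_STATUS               = 0xA0003028
radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000802
"""

AFTER = """\
radeon 0000:02:00.0:   GRBM_STATUS               = 0x00003028
radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
"""


def parse_registers(text):
    """Map register name -> integer value for every 'NAME = 0x...' pair found."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def changed_bits(before_text, after_text):
    """Return {register: XOR of old and new value} for registers that changed."""
    before = parse_registers(before_text)
    after = parse_registers(after_text)
    return {
        name: before[name] ^ after[name]
        for name in before
        if name in after and before[name] != after[name]
    }


for name, bits in changed_bits(BEFORE, AFTER).items():
    print(f"{name}: changed bits 0x{bits:08X}")
```

Registers whose value is the same in both dumps (GRBM_STATUS_SE0 in this sample) are left out, which keeps the comparison short enough to paste into a bug report.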

I’ve experienced some very similar issues, though my use case is slightly different. I have the proprietary AMD OpenCL running alongside Mesa (for rendering in Blender using HIP), but I’ve seen terrible lockups that take about 120 seconds to clear up, and then my screen (under the X Window System) is a terrible mismatch of graphic pain.

For me, it happens every time I am utilizing my 6700 XT for rendering in Blender.

I’m seeing something similar. I’m using Fedora 36 on a desktop with an AMD GPU (FirePro W2100) and two monitors. I’ve been using GNOME with Xorg. If I suspend the machine, when I wake it up, everything works fine for at least a few seconds. Then the monitors go blank, go into power save for a second, then wake back up. The display is all black and white noise with some kind of square sprite that moves along with the mouse. No response to keyboard. Haven’t tried SSH-ing in. I’m always using Firefox, and it seemed that this morning when I saw this happen, the actual lockup occurred when I clicked on a Firefox window.

I see these errors using journalctl:

Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: VA-API version 1.14.0
Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: Found init function __vaDriverInit_1_14
Jul 01 10:57:13 atreyu gnome-shell[8661]: ATTENTION: default value of option mesa_glthread overridden by environment.
Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: va_openDriver() returns 0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: VA-API version 1.14.0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Found init function __vaDriverInit_1_14
Jul 01 10:57:14 atreyu gnome-shell[8661]: ATTENTION: default value of option mesa_glthread overridden by environment.
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: va_openDriver() returns 0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: VA-API version 1.14.0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Found init function __vaDriverInit_1_14
Jul 01 10:57:14 atreyu gnome-shell[8661]: ATTENTION: default value of option mesa_glthread overridden by environment.
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: va_openDriver() returns 0
Jul 01 10:57:15 atreyu gnome-shell[24971]: [25010:25010:0701/105715.165461:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 1 times!
Jul 01 10:57:15 atreyu gnome-shell[24971]: [25010:25010:0701/105715.171032:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 2 times!
Jul 01 10:57:15 atreyu gnome-shell[24971]: [25010:25010:0701/105715.187573:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 3 times!

followed by

Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: ring 5 stalled for more than 10079msec
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000003b3d5 last fence id 0x000000000003b3d7 on ring 5)
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: Saved 7617 dwords of commands on ring 0.
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU softreset: 0x0000034D
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0xA7482028
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x69000004
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200A0FC0
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010800
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00008802
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x800302E3
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44CFC046
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00120500
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x20080EC0
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jul 01 10:57:27 atreyu /usr/libexec/gdm-x-session[2592]: radeon: Failed to deallocate virtual address for buffer:
Jul 01 10:57:27 atreyu /usr/libexec/gdm-x-session[2592]: radeon:    size      : 4096 bytes
Jul 01 10:57:27 atreyu /usr/libexec/gdm-x-session[2592]: radeon:    va        : 0x11a6fd000
Jul 01 10:57:30 atreyu kernel: [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
Jul 01 10:57:30 atreyu kernel: [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BBC8 (len 237, WS 0, PS 4) @ 0xBBD6
Jul 01 10:57:30 atreyu kernel: [drm] PCIE gen 3 link speeds already enabled
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Jul 01 10:57:30 atreyu kernel: [drm] PCIE GART of 2048M enabled (table at 0x0000000000165000).
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: WB enabled
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
Jul 01 10:57:31 atreyu kernel: [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
Jul 01 10:57:31 atreyu kernel: [drm:si_resume [radeon]] *ERROR* si startup failed on resume

and a long list of other error messages involving gdm-x-session, leading up to a kernel Oops.
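
For anyone wanting to compare their own logs against the ones above, here is a minimal Python sketch that pulls the "ring N stalled" and "GPU lockup" lines out of kernel log text. The function name `summarize_lockups` is my own invention, and reading the difference between the two fence ids as the number of fences emitted but not yet signalled is my assumption about what the driver prints, not something confirmed in this thread.

```python
import re

# Patterns copied from the message shapes in the log above.
STALL_RE = re.compile(r"ring (?P<ring>\d+) stalled for more than (?P<msec>\d+)msec")
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (?P<cur>0x[0-9a-f]+) "
    r"last fence id (?P<last>0x[0-9a-f]+) on ring (?P<ring>\d+)\)"
)


def summarize_lockups(lines):
    """Return (stalls, pending) for an iterable of log lines.

    stalls  maps ring number -> longest reported stall in milliseconds.
    pending maps ring number -> last fence id minus current fence id
            (assumed to be the count of unsignalled fences).
    """
    stalls = {}
    pending = {}
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            ring = int(m["ring"])
            stalls[ring] = max(stalls.get(ring, 0), int(m["msec"]))
            continue
        m = LOCKUP_RE.search(line)
        if m:
            pending[int(m["ring"])] = int(m["last"], 16) - int(m["cur"], 16)
    return stalls, pending


# Two lines taken verbatim from the log above.
SAMPLE = [
    "Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: ring 5 stalled for more than 10079msec",
    "Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU lockup "
    "(current fence id 0x000000000003b3d5 last fence id 0x000000000003b3d7 on ring 5)",
]
stalls, pending = summarize_lockups(SAMPLE)
```

The same function can be fed the saved output of `journalctl -k`, one line at a time, to see which rings stall on a given machine.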

The abrt message mentions these errors:

reason: WARNING: CPU: 4 PID: 2592 at drivers/gpu/drm/radeon/radeon_object.c:62 radeon_ttm_bo_destroy+0xde/0xf0 [radeon] [radeon]

backtrace:
WARNING: CPU: 4 PID: 2592 at drivers/gpu/drm/radeon/radeon_object.c:62 radeon_ttm_bo_destroy+0xde/0xf0 [radeon]
Modules linked in: cdc_acm tls tun ntfs3 rfcomm snd_seq_dummy snd_hrtimer xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT bridge stp llc xt_comment nf_nat_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast nft_objref nf_conntrack_tftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_syslog nft_log nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security vboxnetadp(OE) vboxnetflt(OE) ip_set nfnetlink ebtable_filter ebtables vboxdrv(OE) ip6table_filter iptable_filter qrtr bnep sunrpc binfmt_misc vfat fat squashfs loop intel_rapl_msr mei_pxp mei_wdt mei_hdcp iTCO_wdt ee1004 intel_pmc_bxt iTCO_vendor_support btusb dell_smm_hwmon btrtl btbcm btintel uvcvideo btmtk videobuf2_vmalloc videobuf2_memops bluetooth videobuf2_v4l2 intel_rapl_common
 snd_usb_audio videobuf2_common snd_usbmidi_lib videodev snd_hda_codec_realtek snd_rawmidi ecdh_generic intel_tcc_cooling rfkill mc x86_pkg_temp_thermal snd_hda_codec_generic snd_hda_codec_hdmi intel_powerclamp coretemp kvm_intel snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi pktcdvd snd_hda_codec kvm snd_hda_core dell_wmi irqbypass snd_hwdep ledtrig_audio rapl snd_seq intel_cstate snd_seq_device dell_smbios intel_uncore snd_pcm dcdbas intel_wmi_thunderbolt sparse_keymap wmi_bmof snd_timer dell_wmi_descriptor pcspkr mei_me snd i2c_i801 mei intel_pch_thermal i2c_smbus ie31200_edac soundcore acpi_pad zram amdgpu hid_logitech_hidpp iommu_v2 gpu_sched hid_logitech_dj wacom hid_multitouch ums_realtek i915 crct10dif_pclmul crc32_pclmul crc32c_intel radeon firewire_ohci ghash_clmulni_intel e1000e serio_raw firewire_core crc_itu_t drm_ttm_helper drm_buddy uas ttm usb_storage drm_dp_helper wmi video ip6_tables ip_tables analog gameport joydev ipmi_devintf ipmi_msghandler fuse
CPU: 4 PID: 2592 Comm: Xorg Tainted: G           OE     5.18.7-200.fc36.x86_64 #1
Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 2.18.1 07/09/2021
RIP: 0010:radeon_ttm_bo_destroy+0xde/0xf0 [radeon]
Code: 00 00 00 74 0f 48 8b b3 b0 01 00 00 48 89 df e8 c8 ee 25 fc 48 89 df e8 70 f9 24 fc 4c 89 e7 5b 5d 41 5c 41 5d e9 32 da cf fb <0f> 0b eb cd 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00
RSP: 0018:ffffb27f83af7ce8 EFLAGS: 00010283
RAX: ffff94bccd196270 RBX: ffff94bccd196078 RCX: ffff94bbaa3a8800
RDX: ffff94bacecf8480 RSI: ffff94bccd196000 RDI: ffff94bdfc1c1cc8
RBP: ffffffffffffffff R08: 0000000000000000 R09: 000000008020001f
R10: ffff94bc9fe76ba8 R11: ffffb27f83af7d20 R12: ffff94bccd196000
R13: ffff94baea98c058 R14: ffff94baea98c040 R15: 000000000000008f
FS:  00007f06bb428fc0(0000) GS:ffff94bdea500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000011109f109fb8 CR3: 000000043c2f0005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 radeon_bo_unref+0x1a/0x30 [radeon]
 radeon_gem_object_free+0x20/0x30 [radeon]
 drm_gem_object_release_handle+0x69/0x80
 ? drm_gem_handle_create+0x40/0x40
 drm_gem_handle_delete+0x59/0xa0
 ? drm_gem_handle_create+0x40/0x40
 drm_ioctl_kernel+0x9b/0x140
 drm_ioctl+0x21c/0x410
 ? drm_gem_handle_create+0x40/0x40
 ? ioctl_has_perm.constprop.0.isra.0+0xaa/0xf0
 radeon_drm_ioctl+0x49/0x80 [radeon]
 __x64_sys_ioctl+0x8a/0xc0
 do_syscall_64+0x58/0x80
 ? do_syscall_64+0x67/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f06bab0776f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007ffc5f62de60 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00005638ec7693b0 RCX: 00007f06bab0776f
RDX: 00007ffc5f62df08 RSI: 0000000040086409 RDI: 0000000000000011
RBP: 00007ffc5f62df08 R08: 00007f06babf8410 R09: 00005638eb3e0780
R10: 0000000000000011 R11: 0000000000000246 R12: 0000000040086409
R13: 0000000000000011 R14: 000000011a360000 R15: 00005638ed66afe0
 </TASK>

crash function: radeon_bo_unref
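
As a rough way to see which component the backtrace points at, here is a minimal Python sketch that counts the call trace frames per kernel module. In these traces a trailing `[name]` marks a symbol that lives in a loadable module, and a leading `?` marks a frame the unwinder does not consider reliable. The names `frames_by_module` and the "(core kernel)" label are my own, and the sketch only understands the frame shape shown above.

```python
import re
from collections import Counter

# Matches frames such as " radeon_bo_unref+0x1a/0x30 [radeon]"
# or " ? drm_gem_handle_create+0x40/0x40".
FRAME_RE = re.compile(
    r"^\s*(?P<unreliable>\? )?(?P<symbol>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?\s*$"
)


def frames_by_module(trace_lines):
    """Count reliable call trace frames per module.

    Frames without a [module] tag are counted under "(core kernel)".
    Frames prefixed with "? " and lines that are not frames are skipped.
    """
    counts = Counter()
    for line in trace_lines:
        m = FRAME_RE.match(line)
        if m is None or m["unreliable"]:
            continue
        counts[m["module"] or "(core kernel)"] += 1
    return counts


# A few frames taken verbatim from the backtrace above.
SAMPLE_TRACE = [
    " <TASK>",
    " radeon_bo_unref+0x1a/0x30 [radeon]",
    " ? drm_gem_handle_create+0x40/0x40",
    " drm_ioctl+0x21c/0x410",
    " radeon_drm_ioctl+0x49/0x80 [radeon]",
]
counts = frames_by_module(SAMPLE_TRACE)
```

Run over the full trace above, the frames at the top of the stack fall under `radeon`, which at least shows that the warning fires inside the kernel's radeon module rather than in user space.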

This problem is relatively new. I generally keep this machine up to date. I ran dnf update yesterday and the problem occurred again this morning. I don’t remember seeing it a month ago. I’ve been traveling so I can’t be much more precise about the timing.

I have the same general question: I’m not sure where to report this. My best guess is the radeon driver, but does that come under kernel, xorg-x11-drv-ati (which is what abrt suggests), or something else?

Possibly related:

https://bugzilla.redhat.com/show_bug.cgi?id=2022980

https://bugzilla.redhat.com/show_bug.cgi?id=2089380 (I’m adding a link to this forum entry there)

https://bugzilla.redhat.com/show_bug.cgi?id=2091306