Fedora 33 unusable due to freeze (gsd-housekeeping or kerneloops)

Hi!

I upgraded my desktop from F30 to F33. After getting everything settled, I’m finding the machine is virtually unusable due to periodic, random freezing.

I don’t have an AMD card. No change in hardware. Workloads vary, but the machine freezes regardless. No change in workload from when I had F30.

Looking at journalctl output, it seems to be freezing after gsd-housekeeping core dumps. When that occurs, I can’t log into the box remotely, it does not answer pings, the mouse freezes, I can’t switch to a console, etc. I have to totally reset/reboot the computer. Not sure if gsd-housekeeping is a symptom or the culprit. I also have a ton of kerneloops.

I updated the system with the latest patches yesterday and it still freezes.

Any suggestions on how to work around this or how to submit a bug?

My Problem Reporting tool shows lots of kerneloops system failures but says there is not enough info. It says the kernel is “tainted” and thus kernel maintainers cannot diagnose tainted reports. I don’t know what this means.

A kernel problem occurred, but your kernel has been tainted (flags:GD). Explanation:
D - Kernel has oopsed before
Kernel maintainers are unable to diagnose tainted reports.

Suggestions?

Thanks,

Bobby

Hi Bobby, are you aware of what modules are tainting the kernel? It sounds like you may be using some additional kernel support from external repositories. Is that the case?

Hi!

Hey, I wasn’t aware of any tainted modules. I did enable the rpmfusion repo but I’m not sure that I installed any kernel modules from it. Is there any easy way to tell? Thank you for the reply.

Bobby

Hi!

dmesg | grep -i taint does not show any match. I’ll keep googling for how to find a tainted module.

Bobby
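For anyone following along with the same question, here is a rough sketch of one way to check taint without relying on dmesg. It assumes the flag letters and bit order from the kernel's tainted-kernels documentation for kernels of roughly this vintage (newer kernels may define extra bits that this ignores). decode_taint is a made-up helper name, not an existing tool. As far as I know, the "G" in "flags:GD" only means that no proprietary module is loaded (it is printed in place of "P"), so GD would correspond to just the D bit being set.

```shell
#!/bin/sh
# Decode a kernel taint bitmask into flag letters.
# Assumption: bit order (bit 0 first) as listed in the kernel's
# tainted-kernels documentation; bits beyond T are ignored here.
decode_taint() (
    value=$1
    bit=0
    out=""
    for letter in P F S R M B U D A W C I O E L K X T; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            out="$out$letter"
        fi
        bit=$((bit + 1))
    done
    echo "${out:-none}"
)

# System-wide taint value of the running kernel (0 means not tainted).
if [ -r /proc/sys/kernel/tainted ]; then
    echo "kernel taint: $(decode_taint "$(cat /proc/sys/kernel/tainted)")"
fi

# Per-module taint flags; an empty file means the module does not taint.
for f in /sys/module/*/taint; do
    [ -r "$f" ] || continue
    flags=$(cat "$f")
    if [ -n "$flags" ]; then
        echo "$(basename "$(dirname "$f")"): $flags"
    fi
done
```

If the loop prints nothing and the only letter reported is D, the taint most likely comes from an earlier oops rather than from a third-party module.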

Hi!

dnf list installed does not show any rpms coming from rpmfusion. The only other non-Fedora repos are insync (for google syncing) and google (for chrome). Everything else is being installed from fedora, anaconda, updates.

One thing I did do is disable wayland so that I could run cairo-dock. But my machine freezes with or without cairo-dock running.

However, this is the same setup as when I ran F30 and it never froze.

Bobby

Why did you update from F30 to F33? You worked almost a year with an unsupported version. In your case I would have installed F32.

Do you at least have older kernels in your boot options? You could boot one of them to see whether the machine is usable.

Note that the D flag means that there was an earlier crash. The system doesn’t want to report this one because everything might already be insane because of that earlier crash. You should be able to find the first oops and get more information.

The command

sudo journalctl -tkernel -perr -b-1

should give you all kernel errors recorded in the previous boot (-b-1 does that; just -b for the current boot) and hopefully you can find something meaningful there.
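If the output is long, a small filter can make the noisy sources stand out. This is only a sketch: summarize_kernel_errors is a hypothetical helper, and it assumes the default short journalctl format, where the token after "kernel:" names the driver or subsystem.

```shell
#!/bin/sh
# Count kernel error lines per reporting driver/subsystem.
# Reads journalctl short-format lines on stdin, for example:
#   Feb 01 10:23:41 kernel: nouveau 0000:01:00.0: bus: MMIO write ...
summarize_kernel_errors() {
    awk '
        {
            # Locate the "kernel:" tag; the next token is taken as the source.
            for (i = 1; i < NF; i++) {
                if ($i == "kernel:") {
                    src = $(i + 1)
                    sub(/:$/, "", src)   # "Bluetooth:" -> "Bluetooth"
                    count[src]++
                    break
                }
            }
        }
        END { for (s in count) printf "%d %s\n", count[s], s }
    ' | sort -k1,1nr -k2,2
}

# Usage (previous boot, kernel messages at priority err and above):
#   sudo journalctl -t kernel -p err -b -1 | summarize_kernel_errors
# journalctl --list-boots shows which boot offsets are available.
```

The counts only show which component logs the most errors; the entry closest to the freeze is usually still worth reading in full.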

That said: it’s my experience that the current kernel series is somewhat unstable on some people’s hardware. Can you try enabling the “Rawhide Nodebug” kernel (the latest testing kernel but built for daily use rather than with slow debugging turned on):

https://fedoraproject.org/wiki/RawhideKernelNodebug

… and see if this helps with stability?

Hi!

Hey, I did not “upgrade” from F30 to F33. I did a clean install to F33. The reason I waited so long is that every time I do a new install, something like this happens :frowning:

Bobby

Hi!

journalctl -tkernel -perr -b-1

I’ve had 3-4 freezes this morning before I took a break for lunch and a hike. The output is below.

I’ll try the RawhideKernelNodebug link and see.

Thank you!

Bobby

Feb 01 10:23:40 kernel: usb usb5-port1: Cannot enable. Maybe the USB cable is bad?
Feb 01 10:23:41 kernel: nouveau 0000:01:00.0: bus: MMIO write of 8000001e FAULT at 10eb14 [ IBUS ]
Feb 01 10:23:41 kernel: usb usb5-port1: Cannot enable. Maybe the USB cable is bad?
Feb 01 10:23:43 kernel: usb usb5-port1: Cannot enable. Maybe the USB cable is bad?
Feb 01 10:23:44 kernel: usb usb5-port1: Cannot enable. Maybe the USB cable is bad?
Feb 01 10:23:44 kernel: usb usb5-port1: unable to enumerate USB device
Feb 01 10:23:45 kernel: gspca_vc032x: reg_r err -32
Feb 01 10:23:46 kernel: Bluetooth: hci0: BCM: firmware Patch file not found, tried:
Feb 01 10:23:46 kernel: Bluetooth: hci0: BCM: ‘brcm/BCM20702A1-0a5c-21e8.hcd’
Feb 01 10:23:46 kernel: Bluetooth: hci0: BCM: ‘brcm/BCM-0a5c-21e8.hcd’
Feb 01 10:35:54 kernel: watchdog: watchdog0: watchdog did not stop!

Okay, hmmm. Another thing to try is the proprietary nvidia driver, since I see you’re getting an error with nouveau. That will not be compatible with the rawhide kernel and has its own set of problems, but will definitely help narrow things down.

Hi!

Hey, what made you think the errors were in the nouveau driver? I hate going to the proprietary nvidia driver since every time a new kernel is installed, my graphical boot gets grok’d.

Thanks,

Bobby

It is mentioned in the output of your journalctl -tkernel -perr -b-1.

Hi!

Thanks. Just completely missed the nouveau entry. Not sure what the hell I was seeing.

Yes, looks like nouveau. Ugh. So far, the rawhide kernel I’m using has helped. If I can go another few days w/o freezing, I’ll switch back to the regular kernel.

What’s the easiest way to switch back to regular kernels? Delete the rawhide kernel repo and then update?

Thanks,

Bobby

You can leave the rawhide repo active, so you can go back if needed. Just install whichever kernel you like. The link shows you how to select one.
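As a rough illustration of picking the boot default by hand: the sketch below assumes the stock F33 kernels carry an "fc33" dist tag in their file names while the rawhide nodebug builds carry a different one (fc34 at the time, as far as I know). newest_kernel_for_dist is a hypothetical helper, and any version strings you see in examples are placeholders, not taken from this thread.

```shell
#!/bin/sh
# Print the newest kernel image path whose name carries the given dist tag.
# Reads one /boot/vmlinuz-* path per line on stdin.
# Assumption: stock and rawhide kernels differ in their dist tag.
newest_kernel_for_dist() {
    grep -F ".$1." | sort -V | tail -n 1
}

# Usage (left commented out because it changes the boot default):
#   ls /boot/vmlinuz-* | newest_kernel_for_dist fc33
#   sudo grubby --set-default="$(ls /boot/vmlinuz-* | newest_kernel_for_dist fc33)"
#   sudo grubby --default-kernel    # confirm which kernel will boot
```

Both kernels stay installed this way, so switching back is just another grubby --set-default call.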

Hi!

Well, rawhide kernel just froze on me this morning. Bummer. Same thing – can’t report on the kerneloops.

nouveau 0000:01:00.0: bus: MMIO write of 800000f0 FAULT at 10eb14 [ IBUS ]

Should I file a bug report on the nouveau driver? Or, is that beating a dead horse deader?

Thanks,

Bobby

I think filing a bug is helpful. It’s not a dead horse, but it is a few heroic and slightly crazy (in a good way) people pushing a large rock up an ever-growing hill. So, filing the bug is useful as more data but may not result in a quick solution. Definitely make sure to provide as much information as you can about your system.
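One possible way to gather that information into a single file to attach to the report is sketched below. collect_bug_info and the default file name are made up for illustration. Which sections the maintainers actually want is an assumption on my part, so adjust as needed.

```shell
#!/bin/sh
# Collect basic system details for a graphics driver bug report.
# Each command is allowed to fail so the report is still written
# on systems where a tool is missing.
collect_bug_info() (
    out=${1:-nouveau-bug-info.txt}
    {
        echo "== kernel =="
        uname -r
        echo "== graphics hardware and driver =="
        lspci -nnk 2>/dev/null | grep -Ei -A 3 'vga|3d controller' || true
        echo "== session type =="
        echo "${XDG_SESSION_TYPE:-unknown}"
        echo "== kernel errors, previous boot =="
        journalctl -t kernel -p err -b -1 --no-pager 2>/dev/null || true
    } > "$out"
    echo "wrote $out"
)

# Usage (journalctl may need sudo to see the full system journal):
#   collect_bug_info ~/nouveau-bug-info.txt
```

Running it right after rebooting from a freeze should capture the errors from the boot that hung, since -b -1 refers to the previous boot.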

Will do, thanks! I’ll post bug number here once I file it.

Bug 1924235

:+1:

For further testing, I re-enabled wayland, turned off cairo-dock, and switched to dash-to-dock. Cairo-dock doesn’t work with wayland anyway (or, I could not get it to work properly). If the system remains stable, then xorg-x11 combined with nouveau is probably the culprit. I’ll update the bug report as well.
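To double-check which display stack a test run actually used, something like the following might help. It assumes Wayland was turned off through WaylandEnable=false in /etc/gdm/custom.conf, which is one common way to do it. The thread does not say which method was used, and gdm_wayland_state is a hypothetical helper.

```shell
#!/bin/sh
# Report whether a GDM config file disables Wayland.
# Prints "disabled" if an uncommented WaylandEnable=false line exists,
# otherwise "enabled" (the key being absent or commented out).
gdm_wayland_state() (
    conf=${1:-/etc/gdm/custom.conf}
    if [ -r "$conf" ] && grep -Eiq \
        '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false' "$conf"; then
        echo disabled
    else
        echo enabled
    fi
)

echo "GDM Wayland setting: $(gdm_wayland_state)"
# The session actually in use (x11 or wayland), as exported at login:
echo "current session type: ${XDG_SESSION_TYPE:-unknown}"
```

Noting the session type alongside each freeze (or each quiet day) in the bug report would make the xorg versus wayland comparison easier to follow.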