Kernel Memory Problems in Fedora 30-31 Using Kernel 5.3

I’ve been having an issue on my main workstation (and only this machine) ever since the update to kernel 5.3.
No current patches to the kernel have fixed it, and neither did upgrading from 30 to 31. The issue did not occur on any kernel 5.2 releases, and doesn’t seem to be related to any userland applications.

I have tried everything to diagnose and resolve the issue but I cannot find any useful information.

The issue is simple. After booting my system, all I have to do is wait, and my system will slowly consume all of my 16GB of RAM until the entire system starts to thrash from swapping and becomes unusable.

Application memory usage will be nominal, and no buffers will be allocated as all the memory is in use. The memory just disappears over the course of about 24 hours, with the system reporting it all being used with no indication of what is using it. ‘meminfo’ and ‘slabinfo’ both show little change over what is reported.

Rebooting is the only way to alleviate the issue, but is, in itself, an issue.

Anyone that can help, or that knows where to look, please give me some direction here.
I’m not even sure what info to give at this point. Everything seems normal, other than I have no free memory.
I can’t even go back to 5.2 now, as it’s no longer available in the repos for 31.

When I’ve had such problems in the past, I’ve used top to find the culprit.

I’ve already been using top. There’s a discrepancy that mysteriously grows between the total used memory, and the sum of all resident memory.

Normally, if you sum the resident memory of all running applications, it actually comes out to more than the total used (due to shared memory and the like). However, on my system, the total resident memory remains nominal while the memory usage continues to climb. It’s being consumed by some mysterious unseen something that must be living in kernel space, or at least somewhere it can’t be seen by the likes of top (or anything else I’ve tried) beyond the total memory usage.
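The discrepancy described above can be put into numbers. The following is only a minimal sketch, assuming a Linux system with '/proc/meminfo' exposing 'MemAvailable' and per-process 'VmRSS' in '/proc/<pid>/status'; the function names are made up for illustration. Because shared pages are counted once per process, the RSS sum normally overestimates, so a large positive gap hints at memory held outside of userland.

```python
import os
import re


def parse_kb_fields(text):
    """Parse 'Name:   123 kB' style lines into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+):\s+(\d+)(?:\s+kB)?\s*$", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def total_resident_kb(proc_root="/proc"):
    """Sum VmRSS over every process directory; kernel threads report none."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                total += parse_kb_fields(handle.read()).get("VmRSS", 0)
        except OSError:
            continue  # the process exited while we were scanning
    return total


def unexplained_kb(meminfo_text, resident_kb):
    """Used memory (MemTotal - MemAvailable) minus the summed RSS."""
    info = parse_kb_fields(meminfo_text)
    used = info["MemTotal"] - info["MemAvailable"]
    return used - resident_kb


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        gap = unexplained_kb(handle.read(), total_resident_kb())
    print(f"used memory not explained by process RSS: {gap} kB")
```

Running it shortly after boot and again a few hours later should show whether the gap is what grows.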

To sum up all the things I’ve looked for:

Excessive memory usage by userland applications:
Nothing obvious in ‘top’, ‘/proc/meminfo’, or system monitor.

Excessive buffer/cache allocations:
I wouldn’t expect this to cause the problem I’m seeing, but even so, ‘free’ reports less and less of such allocations as the free memory drops, and using ‘sysctl vm.drop_caches=3’ has little to no effect.

Excessive kernel slab usage:
There doesn’t seem to be anything serious in ‘/proc/slabinfo’, and ‘slabtop’ just mirrors that back.

Kernel modules:
Nothing reported by lsmod looks suspicious. I wish amdgpu didn’t take up so much memory, but it doesn’t look any worse than normal.
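The checks above can be rolled into one rough number: how much of 'MemTotal' is not covered by the categories that '/proc/meminfo' itself reports. This is a sketch only. The field names are real '/proc/meminfo' fields, but the choice of which ones to add up is an assumption (a heuristic, not exact kernel accounting), and the function names are hypothetical. Pages allocated directly by drivers generally do not show up in any of these fields.

```python
# Fields that together cover most memory the kernel accounts for explicitly.
# Assumption: this list is a heuristic, not an exact accounting.
ACCOUNTED_FIELDS = (
    "MemFree", "Buffers", "Cached", "SwapCached",
    "AnonPages", "Slab", "PageTables", "KernelStack",
)


def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of integer values (mostly kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def unaccounted_kb(fields):
    """MemTotal minus the sum of the explicitly accounted categories."""
    accounted = sum(fields.get(name, 0) for name in ACCOUNTED_FIELDS)
    return fields["MemTotal"] - accounted


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        print(f"unaccounted: {unaccounted_kb(parse_meminfo(handle.read()))} kB")
```

If this number keeps climbing while every individual category stays flat, that would be consistent with a leak in a driver or elsewhere in kernel space.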

I really don’t know here.
If you can give me something new to try, please do. I’ve tried everything I can think of.

I do keep getting dmesg errors like this:

xdg-mime[265913]: segfault at 7fff3463bfd8 ip 00007f9d01f4e61b sp 00007fff3463bfb0 error 6 in libc-2.30.so[7f9d01f46000+14f000]

But I don’t know if it’s related. (Also my second screen just went black for a few seconds, but there’s no telling what that was. It’s probably either just old, or the cheap DP to VGA adapter is acting up.)

I am no expert, but this sounds like a hidden process. Could your machine be compromised? I know that malware is pretty good at preventing detection.

That would be extraordinarily bizarre, and doesn’t quite explain why it only occurs on kernel 5.3.

Not only is Linux malware rare, but it’s also typically rather specific, and I’m not a very large target.
A very small, low-value target with enterprise grade network security to help keep things secure.
My network exists behind a dedicated firewall, with most of my machines running minimal installs with internal software firewalls and SELinux enabled, that can only be accessed externally via ssh tunnel through another dedicated server using key based authentication with dedicated ed25519 or rsa-4096bit keys.
That seems like a lot to go through for so little gain (not saying there aren’t people who would try).

The only packages I’ve installed are from known-good repos, and on occasion, installed from source directly from github. What’s more, I’ve done nothing to prompt any sort of attack, and this didn’t start under any mysterious circumstances. It started exactly after the update to 5.3.

If someone were to go after my machine, I’d expect them to be using it for something like cryptomining, but that likely would leave some sort of trace in either CPU or GPU usage, and there’s no sign of that.

While this machine is probably one of my least secure, I’d expect any attacker to go after my brother’s machine that’s still running Windows 7.
It might be possible, but I think it’s astronomically unlikely.

Seeing as this is userland stuff, shouldn’t that be visible in top, atop or htop? I personally like htop because it has a column MEM% that you can click on to sort it. It should show you which processes are the culprits. Also, userland means non-kernel stuff right? So it should be an application or a service. It has to be visible somewhere.

I hope this helps.

Whatever it is doesn’t seem to be living in userland (or ipc shm, or kernel slabs for that matter).

If it were, I should have seen it by now.
The highest memory usage in userland is Firefox, which is normal, and killing it has minimal difference.

I recorded ‘/proc/meminfo’ over ~24 hours (twice).
This is the only consistent thing I can see.

DirectMap4k: 2056464 kB
DirectMap2M: 14667776 kB
DirectMap1G: 1048576 kB

That’s the only value that I can see changing dramatically over an extended period (other than my free memory disappearing).
This seems like it could be arbitrary though, as the difference doesn’t seem that big (relatively speaking).

I’ll try mapping things to a graph and see if that shows anything else obvious.
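Before graphing, a quick diff of two recorded '/proc/meminfo' snapshots can rank the fields by how much they moved. A minimal sketch, assuming each snapshot was saved as plain text; the file names in the usage note and the function names are hypothetical.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of integer values (mostly kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def largest_changes(before_text, after_text, limit=10):
    """Return (field, delta) pairs sorted by absolute change, largest first."""
    before = parse_meminfo(before_text)
    after = parse_meminfo(after_text)
    deltas = [(name, after[name] - before[name])
              for name in before if name in after]
    deltas.sort(key=lambda item: abs(item[1]), reverse=True)
    return deltas[:limit]


if __name__ == "__main__":
    import sys
    # Usage: python meminfo_diff.py snapshot_early.txt snapshot_late.txt
    with open(sys.argv[1]) as first, open(sys.argv[2]) as second:
        for name, delta in largest_changes(first.read(), second.read()):
            print(f"{name:<20} {delta:+d}")
```

Fields that only shrink as free memory runs out (cache, buffers) are expected; a field that grows steadily is the more interesting one.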

Sorry m8, I am out of my depth. You might want to submit a bug to the Fedora specialists. The last time I did it was on a Red Hat account, but that may have changed. They may want to do a binary search on which patch is to blame.

If you find the cause, please let us know. Good luck, and I hope you find it.

Maybe some data will help. Try running vmstat -a 5 20 > mem.log shortly after rebooting and then post the result. If something is really eating memory this will give some indication as to how fast and we can then proceed from there.
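To turn that log into a rate, something like the following could work. It is a sketch that assumes the usual vmstat layout (a header row naming the columns, followed by rows of integers) and that the 'free' column is in the default unit of kB; the function names are made up here.

```python
def free_column(vmstat_text):
    """Extract the 'free' column from vmstat output, skipping header rows."""
    index = None
    values = []
    for line in vmstat_text.splitlines():
        parts = line.split()
        if "free" in parts:
            index = parts.index("free")  # vmstat repeats this header now and then
        elif (index is not None and len(parts) > index
              and all(p.lstrip("-").isdigit() for p in parts)):
            values.append(int(parts[index]))
    return values


def free_change_per_second(values, interval_seconds):
    """Average change in free memory per second across the samples."""
    if len(values) < 2:
        raise ValueError("need at least two samples")
    return (values[-1] - values[0]) / ((len(values) - 1) * interval_seconds)


if __name__ == "__main__":
    with open("mem.log") as handle:
        samples = free_column(handle.read())
    # 5 matches the sampling interval passed to vmstat above.
    print(f"{free_change_per_second(samples, 5):+.1f} kB/s")
```

A negative value means free memory is shrinking. Over a window of only 100 seconds the number will be noisy, so a longer run gives a better picture.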

@cptgraywolf did you find out what your problem is?

I am experiencing similar OS behavior.

I have to reboot my F31 once a day, because it will suck up my 12GB RAM + 8GB Swap.

Closing all apps does not free the memory. But I noticed that exiting the GNOME session frees most of it.

Like you, I suspected the kernel because it all started around the 5.x upgrade, or perhaps with the F31 upgrade. Not sure.

I do not know what the problem was. It went away after upgrading to kernel 5.5, confirming my suspicion that it was a kernel issue.

I did try restarting my graphical server at one time, to no avail. So it’s hard to say if it’s the same issue.

I’m also reluctant to mark this solved, as I never really found the problem. It just got fixed blindly by a random update.

Not for me, I still have the problem… and I am also on 5.5

Did you make any other changes? Are you on F31?

I’m on 31, and have made no major changes that might have caused the issue to stop other than normal updates.

Out of curiosity. What hardware are you using?

I use an Asus Zenbook. It is about 2 years old. 12GB Ram and Core i7.

Actually, it was because of this laptop that I moved to Fedora, because it would recognize all the HW at the time, while Ubuntu was many versions behind.

And… my Fedora experience has been good. I think I joined in F19 or something. Never had a problem before. Like you, I thought that my problem came around with the new kernel v5.