The system is unstable: it reboots spontaneously at random, about three times a day. The cause appears to be a CPU machine check exception (MCE), possibly related to a CPU erratum.
Auto reboot I:
mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1600268852 SOCKET 0 APIC c microcode 8001138
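For anyone trying to read that line: the value after the colon in the PROCESSOR field (800f11 here) is the CPUID signature in hex, and it can be split into family, model and stepping. The sketch below is only an illustration of the standard x86 bit layout; the function name decode_cpuid is made up for this example.

```shell
# decode_cpuid SIG: split a hex x86 CPUID signature into family/model/stepping.
# (decode_cpuid is a hypothetical helper name, not part of any tool.)
decode_cpuid() {
    sig=$(( 0x$1 ))
    stepping=$(( sig & 0xf ))
    model=$(( (sig >> 4) & 0xf ))
    family=$(( (sig >> 8) & 0xf ))
    ext_model=$(( (sig >> 16) & 0xf ))
    ext_family=$(( (sig >> 20) & 0xff ))
    # The extended model bits count only for base family 6 or 15.
    if [ "$family" -eq 6 ] || [ "$family" -eq 15 ]; then
        model=$(( (ext_model << 4) | model ))
    fi
    # The extended family bits count only for base family 15.
    if [ "$family" -eq 15 ]; then
        family=$(( family + ext_family ))
    fi
    printf 'family=0x%x model=0x%x stepping=%d\n' "$family" "$model" "$stepping"
}

decode_cpuid 800f11   # prints: family=0x17 model=0x1 stepping=1
```

Family 17h is AMD's Zen family, which would be consistent with a first-generation Ryzen.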
After checking kernel.org and the Gentoo and Arch Linux wikis, I added some kernel boot parameters and disabled ASLR, but the machine still restarts on its own no matter what I change.
This computer also has Windows 10 installed, and it is very stable there.
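To double-check that such settings actually took effect after a reboot, something like the following might help. It is a rough sketch: the post does not say which parameters were added, so the three names in the loop (idle, processor.max_cstate, rcu_nocbs) are only assumed examples of the kind the wikis discuss, and has_param is a made-up helper.

```shell
# has_param NAME [CMDLINE]: succeed if NAME or NAME=value appears on the
# kernel command line (read from /proc/cmdline unless CMDLINE is given).
# has_param is a hypothetical helper for this example.
has_param() {
    cmdline=${2-$(cat /proc/cmdline)}
    for word in $cmdline; do
        case $word in
            "$1"|"$1"=*) return 0 ;;
        esac
    done
    return 1
}

# Assumed example parameters; replace with the ones you actually added.
for p in idle processor.max_cstate rcu_nocbs; do
    if has_param "$p"; then echo "$p: set"; else echo "$p: not set"; fi
done

# ASLR state: 0 means disabled, 1 or 2 means enabled.
if [ -r /proc/sys/kernel/randomize_va_space ]; then
    echo "randomize_va_space: $(cat /proc/sys/kernel/randomize_va_space)"
fi
```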
If you’re experienced with computer troubleshooting, some help or tips would be greatly appreciated!
As a normal user, this problem is a fking disaster.
Automatic reboots caused by MCE errors seem to be very common in the Linux world, and a lot of hardware can’t run Linux properly, producing almost the same error output:
These questions are from the Fedora community, and few people are participating.
There are many similar reports on other forums, and almost none of them has a clear solution.
This problem has been reported on bugzilla.kernel.org for almost 5 years by numerous users with different hardware.
It’s impossible to tell if it’s a problem with the hardware manufacturer or a flaw in Linux itself.
I don’t know how much help my experience with MCE errors will be, but here was my problem and solution.
I have an Intel I9-9720X, an ASUS motherboard, and Fedora Linux 29.
My MCE errors were not random: I could trigger them by running a certain piece of software which, it turns out, relied on Intel’s AVX advanced instruction set.
It seems these instructions require their own timings.
After some reading, I decided to upgrade my BIOS to the latest version, and that solved the issue.
Like I said, don’t know if that helps but I thought I would put in my two cents…
Random reboots while idle are a known issue with 1st gen Ryzen.
Look in your BIOS for an option called Power Supply Idle Control or something similar and set it to Typical current idle. My mobo is an ASRock as well, but a different model; this option for me is under Advanced->AMD CBS->Zen Common Options. Hope it helps.
A few days ago I updated the BIOS to the latest version and disabled the “global c-state options”. This means that the computer will lose the important “deep energy saving” feature and the CPU will always be in the C1/C0 state.
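To see which idle states the kernel still offers after a BIOS change like this, the cpuidle directory in sysfs can be listed. This is just a sketch; list_cstates is a made-up function name, and the state names you see depend on the CPU and idle driver.

```shell
# list_cstates [CPUDIR]: print each cpuidle state's name and its "disable" flag.
# Defaults to cpu0; list_cstates is a hypothetical helper for this example.
list_cstates() {
    dir=${1:-/sys/devices/system/cpu/cpu0}/cpuidle
    for state in "$dir"/state[0-9]*; do
        # Skip the unexpanded glob when no cpuidle states are exposed.
        [ -r "$state/name" ] || continue
        printf '%s: %s (disable=%s)\n' \
            "${state##*/}" "$(cat "$state/name")" "$(cat "$state/disable")"
    done
}

list_cstates
```

If only shallow states show up, the deep C-states really are gone.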
Normally, upgrading the BIOS on a machine that is otherwise stable is a risk. (I’ve been using this computer for three years now, and it had been very stable under Windows 10 the whole time.)
This wiki page from Gentoo is very comprehensive: Ryzen - Gentoo Wiki
I’ve translated the wiki pages from several sites to understand what the parameters mean.
I’ll try different solutions and I think it will be solved eventually.
Disable multithreading (SMT) and it becomes very stable, e.g. by executing this at boot. The lockup issue disappeared completely for me. It appears to be a bug involving AMD CPUs and the way Linux uses them.
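# As posted, this loop only brings every CPU online; the step that takes
# the sibling threads offline is not shown. Run as root.
for CPU in /sys/devices/system/cpu/cpu[0-9]*; do
    CPUID=$(basename "$CPU")
    echo "CPU: $CPUID"
    # Not every CPU directory has an "online" file, hence the check.
    if test -e "$CPU/online"; then
        echo "1" > "$CPU/online"
    fi
done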
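As a possible alternative to looping over CPUs, recent kernels expose a single SMT switch in sysfs. The sketch below only reads it; smt_status is a made-up helper name, and whether the file exists depends on your kernel version.

```shell
# smt_status [FILE]: print the kernel's SMT state (for example on, off,
# forceoff or notsupported), or "unavailable" if the file is missing.
# smt_status is a hypothetical helper for this example.
smt_status() {
    file=${1:-/sys/devices/system/cpu/smt/control}
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unavailable"
    fi
}

echo "SMT: $(smt_status)"

# To switch SMT off until the next reboot (needs root):
#   echo off > /sys/devices/system/cpu/smt/control
```

The nosmt kernel boot parameter should have the same effect from boot, which avoids needing a startup script at all.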
When experiencing any of these problems, I would always suggest updating the BIOS. My mobo is new and in less than 8 months there have been 4 BIOS upgrades (2 were only a month apart). Some addressed CPU issues and some addressed memory issues.