Failed upgrade from F32 to F33 -- Kernel Panic

Hello,

Last night I started updating my system from F32 to F33 through the “Software” program, under the upgrade tab. Unfortunately, there was a power outage during the night and my machine was left stuck between F32 and F33. When I tried to start it up, I was faced with the “Oh no! Something has gone wrong” screen. I could still press Ctrl-Alt-F2 to reach a terminal and log in normally. From there, I attempted to roll back to F32 via

sudo dnf distro-sync --releasever=32 --best --allowerasing

where I needed the “allowerasing” flag because many packages were obsolete and would not install otherwise.

This has left me in a worse position. Now I cannot get past the Fedora splash screen, and hitting Esc yields the following wall of text after “Starting Switch Root…” (everything before PASSED):

Wall of Text
Hardware name: LENOVO ***, BIOS J5ET41WW (1.12 ) 12/08/2014
RIP: 0010:native_smp_send_reschedule+0x34/0x40
Code: 05 11 ba 31 01 73 15 48 8b 05 e8 1e 13 01 be fd 00 00 00 48 8b 40 30 e9 8a ad 00 00 00 00 00 0f 1f 44 00 00 53 48 83 ec 20
RSP: 0018:ffff990dbac43ec8 EFLAGS: 00010002
RAX: 0000000000000000 RBX: ffff990db9ed0000 RCX: ffffffff95258ba8
RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000046
RBP: 0000000000000000 R08: 00000000000002f9 R09: 0000000000aaaaaa
R10: 0000000000000000 R11: ffffb2edd040faa0 R12: ffffb2edc067bd68
R13: ffff990dbac5af80 R14: ffffffff9416f520 R15: ffff990dbac5b0b8
FS: 0000000000000000(0000) GS:ffff990dbac40000(0000) knlGS:00000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb4cd564127 CR3: 000000001220e004 CR4: 00000000003606e0
Call Trace:
  <IRQ>
  update_process_times+0x4f/0x60
  tick_sched_handle+0x22/0x60
  tick_sched_timer+0x37/0x70
  __hrtimer_run_queues+0x100/0x280
  hrtimer_interrupt+0x100/0x220
  ? ktime_get+0x36/0xa0
  smp_apic_timer_interrupt+0x6a/0x140
  apic_timer_interrupt+0xf/0x20
  </IRQ>
RIP: 0010:panic+0x261/0x2a7
then another "code" followed by the 6 lines of values for R* again
  do_exit.cold.22+0x1a/0x81
  do_group_exit+0x3a/0xa8
  __x64_sys_exit_group+0x14/0x20
  do_syscall_64+0x5b/0x168
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f8ad1fd9151
  Code: Bad RIP value.
Six more lines for the R* values
----[end trace 431b009f370414ac ]----

Since I could do nothing with the system in that state, I attempted to reinstall Fedora from a Live USB (thankfully, /home is on a separate partition). However, I am unable to do so because the boot loader complains of a lack of room on the drive. This is likely because the drive has two partitions: a large LVM one holding /, /home, and swap, and a smaller ext4 one for /boot.

At this point, I am hoping someone can help explain some way of (ideally) cleaning up my install or otherwise accessing the /home partition in this state. That way, I can copy my /home to some external storage device so I can wipe my machine and start clean.

Thanks for the help!

Wow, this sounds awful! A thought would be to boot a live USB version and then mount the /home partition (maybe even as read-only) and copy it to an external drive that way.
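
A minimal sketch of that approach, run from the live session. The volume group name fedora matches the one in the rdsosreport quoted later in this thread, but the logical volume name home and the mount points /mnt/home and /mnt/backup are assumptions for illustration. By default the function only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: copy /home off an LVM volume from a live session.
# Assumptions (not from the thread): the logical volume is /dev/fedora/home
# and the external drive is already mounted at /mnt/backup.

# Print each command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

rescue_home() {
    home_dev=$1   # e.g. /dev/fedora/home (hypothetical name)
    mount_pt=$2   # e.g. /mnt/home
    backup=$3     # e.g. /mnt/backup

    run vgchange -ay || return 1                          # activate LVM volume groups
    run mkdir -p "$mount_pt" || return 1
    run mount -o ro "$home_dev" "$mount_pt" || return 1   # read-only, to avoid accidental writes
    run cp -a "$mount_pt/." "$backup/" || return 1        # -a keeps ownership, modes, timestamps
    run umount "$mount_pt"
}

# Dry run (prints the commands only):
#   rescue_home /dev/fedora/home /mnt/home /mnt/backup
# Real run, as root:
#   DRY_RUN=0 rescue_home /dev/fedora/home /mnt/home /mnt/backup
```

Running lvs first should show the actual logical volume names to substitute.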

In short, it should be like this:

  • Boot into Fedora 33 Live session.
  • Mount your root and boot volumes/partitions.
  • Use dnf --installroot and/or rpm --root to fix your installation.
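
The steps above could be sketched roughly as follows. The device names /dev/fedora/root and /dev/sda1 are assumptions (check lsblk and lvs for the real ones), and /mnt/root is simply a convenient mount point. The functions print the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: mount an installed Fedora system from a live session, then sync
# its packages to a single release with dnf.
# Assumptions (not from the thread): root is the logical volume
# /dev/fedora/root and /boot is the partition /dev/sda1.

# Print each command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

mount_installed_system() {
    root_dev=$1   # e.g. /dev/fedora/root (hypothetical name)
    boot_dev=$2   # e.g. /dev/sda1 (hypothetical name)
    target=$3     # e.g. /mnt/root

    run vgchange -ay || return 1                  # activate LVM volume groups
    run mkdir -p "$target" || return 1
    run mount "$root_dev" "$target" || return 1
    run mount "$boot_dev" "$target/boot"          # /boot lives on its own partition
}

sync_release() {
    target=$1    # mounted root of the installed system
    release=$2   # Fedora release to sync to, e.g. 32 or 33

    run dnf --installroot="$target" --releasever="$release" distro-sync
}

# Dry run:
#   mount_installed_system /dev/fedora/root /dev/sda1 /mnt/root
#   sync_release /mnt/root 32
```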

It was certainly a bad way to start the day! But yes, if all else fails I can always fall back to this (thankfully).

@vgaetera Thank you for the links, they were certainly helpful.

I believe everything is correctly rolled back on the root partition; I was able to successfully distro-sync back to F32 after deleting some F33 packages. However, the kernel panic issue persists. I found that an older kernel is being loaded (5.0.16-100.fc28), so I tried to load a newer one instead (5.8.18-200.fc32) via grub2-mkconfig. As I understand it, to use the grub2 utilities I first need to run

sudo chroot /mnt/root 

but this returns the following Traceback, which complains about glibc 2.32:

Traceback
Traceback (most recent call last):
File "/usr/bin/register-python-argcomplete", line 64, in <module>
  sys.stdout.write(argcomplete.shellcode(
  TypeError: shellcode() takes from 1 to 4 positional arguments but 5 were given
flatpak: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by /lib64/liblzma.so.5)
/bin/ps: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by /lib64/liblzma.so.5)
/bin/basename: missing operand
Try '/bin/basename --help' for more information.

The latest version of glibc for F32 (as far as I can tell) is 2.31:

sudo dnf --installroot=/mnt/root install glibc
Package glibc-2.31-4.fc32.x86_64 is already installed.
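
The GLIBC_2.32 messages name liblzma as the library that needs the newer glibc, which suggests that some F33 packages may still be installed alongside the F32 glibc. A small sketch for spotting them: the helper name leftover_fc33 is made up for illustration, and it only filters a package list by its release tag.

```shell
#!/bin/sh
# Sketch: list packages that still carry the fc33 release tag.
# Reads package names (one per line, as printed by "rpm -qa") on stdin.
leftover_fc33() {
    grep -E '\.fc33(\.|$)'
}

# Usage from the live session, with the root volume mounted at /mnt/root:
#   rpm --root /mnt/root -qa | leftover_fc33 | sort
```

Any package this prints was not rolled back to F32 and is a candidate for another distro-sync or a reinstall.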

I am now at the point where I have this newer kernel on my root partition (/root/usr/src/kernels), but I cannot find a way to boot with it without the grub2 utilities, which I cannot use. If you could point me to some documentation explaining how to manually update the kernel used in /boot (all references I find use grub2-mkconfig), I could make some headway again.

You may need to verify package integrity:

sudo rpm --root /mnt/root -V -a

And reinstall corrupted packages if any.

Use RPM/DNF to list, remove, or install kernels:

sudo rpm --root /mnt/root -q -a kernel\*
sudo dnf --installroot=/mnt/root --releasever=32 install kernel

Then you can use chroot to restore GRUB.
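
A rough sketch of that last step. It assumes a BIOS system where the GRUB configuration lives at /boot/grub2/grub.cfg, and that the root and boot volumes are already mounted at /mnt/root. Running one command through chroot avoids starting an interactive shell inside the installed system, whose startup scripts appear to be where the Traceback messages come from; the command itself can still fail if it links against mismatched libraries. The function prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: regenerate the GRUB configuration of a mounted installation.
# Assumption (not confirmed in the thread): BIOS boot, so the config file
# is /boot/grub2/grub.cfg inside the installed system.

# Print each command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

regenerate_grub() {
    target=$1   # mounted root of the installed system, e.g. /mnt/root

    # Make the live session's device, process and sysfs trees visible inside.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done

    # Run only grub2-mkconfig inside the chroot, not an interactive shell.
    run chroot "$target" grub2-mkconfig -o /boot/grub2/grub.cfg
    status=$?

    # Undo the bind mounts in reverse order.
    for fs in sys proc dev; do
        run umount "$target/$fs"
    done
    return "$status"
}

# Dry run:
#   regenerate_grub /mnt/root
```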

@vgaetera there are no corrupted packages present. I have the appropriate kernel installed as well. Unfortunately, I am still unable to run the chroot command on the mounted root partition (see Traceback in previous post).

I did try simply reinstalling GRUB2 onto /dev/sda. While this did not resolve the kernel panic issue with F32, I made some progress with my F30 rescue entry, which now drops me into the dracut emergency shell. It recommended attaching the lengthy report; because of the character limit, I have included only the tail end (with the “timeout” messages removed).

rdsosreport.txt
[   28.490433] localhost systemd[1]: Received SIGRTMIN+20 from PID 473 (plymouthd).
[   63.745752] localhost systemd[1]: Received SIGRTMIN+20 from PID 473 (plymouthd).
[  120.152380] localhost systemd[1]: Received SIGRTMIN+20 from PID 473 (plymouthd).
[  121.245438] localhost systemd[1]: Received SIGRTMIN+20 from PID 473 (plymouthd).
[  133.472207] localhost dracut-initqueue[471]: Warning: dracut-initqueue timeout - starting timeout scripts
[  133.505382] localhost dracut-initqueue[471]: Scanning devices sda2  for LVM volume groups
[  133.513797] localhost dracut-initqueue[471]: Reading all physical volumes. This may take a while...
[  133.546965] localhost dracut-initqueue[471]: Found volume group "fedora" using metadata type lvm2
[  133.553622] localhost dracut-initqueue[471]: PARTIAL MODE. Incomplete logical volumes will be processed.
[  133.598924] localhost dracut-initqueue[471]: 3 logical volume(s) in volume group "fedora" now active
[  195.077822] localhost dracut-initqueue[471]: Warning: Could not boot.
[  195.127158] localhost systemd[1]: Received SIGRTMIN+20 from PID 473 (plymouthd).
[  195.127459] localhost dracut-initqueue[471]: Warning: /dev/disk/by-uuid/2b82edc2-4eb2-44a0-8b5b-c71da0de9b3a does not exist
[  195.135773] localhost systemd[1]: Starting Setup Virtual Console...
[  195.139275] localhost systemd[1]: Started Setup Virtual Console.
[  195.139641] localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-vconsole-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  195.140181] localhost kernel: kauditd_printk_skb: 2 callbacks suppressed
[  195.140618] localhost kernel: audit: type=1130 audit(1605382638.301:13): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-vconsole-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  195.140525] localhost systemd[1]: Starting Dracut Emergency Shell...
[  195.140999] localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-vconsole-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  195.141520] localhost kernel: audit: type=1131 audit(1605382638.301:14): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-vconsole-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  195.158527] localhost systemd[1]: Received SIGRTMIN+21 from PID 473 (plymouthd).
[  195.159225] localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=plymouth-start comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  195.159761] localhost kernel: audit: type=1131 audit(1605382638.321:15): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=plymouth-start comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

Again I have hit a bit of a wall. I am not familiar with dracut (I did check that all my partitions are ACTIVE), and the chroot command still fails, so I cannot use the grub2 utilities.
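
The report says that /dev/disk/by-uuid/2b82edc2-4eb2-44a0-8b5b-c71da0de9b3a does not exist, so one thing worth checking is whether any block device still carries that UUID. A small sketch, assuming blkid is available in the shell being used (it is from a live session); the helper name uuid_present is made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether a filesystem UUID appears in blkid-style output.
# Reads lines such as '/dev/sda1: UUID="..." TYPE="ext4"' on stdin.
# The leading space in the pattern keeps PARTUUID="..." from matching.
uuid_present() {
    grep -q " UUID=\"$1\""
}

# Usage:
#   if blkid | uuid_present 2b82edc2-4eb2-44a0-8b5b-c71da0de9b3a; then
#       echo "UUID found"
#   else
#       echo "UUID missing"
#   fi
```

If the UUID is missing, the entry that refers to it (typically the kernel command line of that boot entry, or /etc/fstab) points at a filesystem that no longer exists under that identifier.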

@vgaetera the issues have been resolved, and I am able to run F33.

Firstly, I had to upgrade to F33 again on my mounted root partition. This gave me the latest glibc, so I was able to use chroot. Once I had chrooted into my sda root directory, I was able to reinstall the latest kernel, which incidentally removed the active one. Just to verify that everything was correct in /boot, I ran grub2-mkconfig.

I additionally followed the Fedora Upgrade Documentation, just to ensure the installation was 100% correct.

Unfortunately, the /home filesystem had some corruption (turned out to be python/opencv), but running fsck in rescue mode solved the remaining issues.

Thanks for the help, this was certainly an exciting upgrade.
