F32 Kernel 5.8.9 fails to boot on Macbook Pro

Hi guys, Fedora 32 Xfce worked very well on my MacBook Pro 13" (Early 2015) until the kernel was upgraded from 5.8.4 to 5.8.6 (and later to 5.8.9). Since then, the LUKS-encrypted LVM ext4 /home volume can no longer be mounted.

I have already filed a bug on RHBZ, but there has been no response:

Any suggestions?

Hi,
it looks like your problem is actually with the root FS, not /home.
Looking at those boot messages, it seems like there is filesystem corruption on an XFS filesystem (apparently root), so it fails to mount. The home, efi etc. errors are dependency failures, probably because it can’t mount /home if there’s no /. Have you tried to fsck/xfs_repair that filesystem? Can you still boot with kernel 5.8.4?

I could boot with kernel 5.8.4.

Since I didn’t know what was wrong, I tried a few things:

  1. Downloaded a netinstall ISO and reinstalled the OS, keeping the LUKS LVM ext4 /home. The latest kernel was 5.8.7 at the time; it failed.
  2. Downloaded an Xfce Live ISO and reinstalled the OS, keeping the LUKS LVM ext4 /home. The kernel fell back to the ISO’s version, 5.6.6-300, which is the one I’m using now, and it works very well.
  3. Updated the kernel to the latest version (5.8.9-200); it fails to boot again.

So I have to boot with kernel 5.6.6 for now. Obviously, I lost kernel 5.8.4 when I reinstalled the OS.

OK, that’s very strange, because the boot messages imply a filesystem error, and that should be there regardless of which kernel you’re using. Can you

  • fsck your filesystems (particularly the XFS one)
  • check if you can boot with a newer kernel when you only mount the necessary partitions (/, /boot, possibly /boot/efi)? You can just comment out everything else in /etc/fstab. You’ll have an empty home directory, so logging in won’t work, but it should not hinder the boot.
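
A minimal sketch of that second suggestion, assuming the usual whitespace-separated fstab columns with the mount point in the second field. The function name minimal_fstab and the output path /tmp/fstab.minimal are made up for illustration; the sketch only prints a modified copy and never touches /etc/fstab itself.

```shell
# Print a copy of an fstab file in which every entry except /, /boot
# and /boot/efi is commented out. The input file is left unchanged.
minimal_fstab() {
    awk '
        # Keep existing comments and blank lines as they are.
        /^[ \t]*#/ || NF < 2 { print; next }
        # Keep the mounts needed for a bare boot.
        $2 == "/" || $2 == "/boot" || $2 == "/boot/efi" { print; next }
        # Disable everything else (here: /home and swap).
        { print "#" $0 }
    ' "$1"
}
```

Typical use would be `minimal_fstab /etc/fstab > /tmp/fstab.minimal`, then reviewing the result before copying it over /etc/fstab (with a backup of the original) and running `systemctl daemon-reload`, as the header of the generated fstab asks.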

What’s your disk layout?

My disk layout:

$ sudo lsblk
NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                             8:0    0   113G  0 disk  
├─sda1                                          8:1    0   200M  0 part  
└─sda2                                          8:2    0 112.2G  0 part  
sdb                                             8:16   1   116G  0 disk  
├─sdb1                                          8:17   1   600M  0 part  
├─sdb2                                          8:18   1   600M  0 part  /boot/efi
├─sdb3                                          8:19   1     1G  0 part  /boot
└─sdb4                                          8:20   1 113.8G  0 part  
  └─luks-65d9ed28-ea08-4ea5-a1dd-7b2b086f5e09 253:0    0 113.8G  0 crypt 
    ├─fedora_localhost--live-root             253:1    0    70G  0 lvm   /
    ├─fedora_localhost--live-swap             253:2    0   7.8G  0 lvm   [SWAP]
    └─fedora_localhost--live-home             253:3    0    36G  0 lvm   /home

sda is for macOS and sdb is for Fedora.

And my fstab:

$ cat /etc/fstab

#
# /etc/fstab
# Created by anaconda on Fri Sep 11 20:08:01 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/fedora_localhost--live-root /                       xfs     defaults,x-systemd.device-timeout=0 0 0
UUID=1a700fc4-99a3-41e8-8d5d-9a3cde05f805 /boot                   ext4    defaults        1 2
UUID=eae8f09d-88a6-3797-896b-9e2c6b3a61bf /boot/efi               hfsplus defaults        0 2
/dev/mapper/fedora_localhost--live-home /home                   ext4    defaults,x-systemd.device-timeout=0 1 2
/dev/mapper/fedora_localhost--live-swap none                    swap    defaults,x-systemd.device-timeout=0 0 0

I tried commenting out the last 2 lines (/home and swap) then rebooted the OS with kernel 5.8.9, still no luck:

I’ll try to boot with a Live ISO and fsck /home as well as xfs_repair / later.

Then /home is not the culprit.

That will be helpful. Probably best to also fsck /boot and /boot/efi.
But all this is very weird.

  • FS errors should not be kernel-dependent
  • the system is trying (and failing) to activate dm-raid sets when you have no RAID array
  • fsck is failing for /boot & /boot/efi, when the system obviously has no issue booting from them

Since you have no RAID, you can disable dmraid-activation.service. I doubt that it is the underlying cause, but best to cross it off the list.
You can also check the journal for the failed boot (you can do that from the live system too, by mounting the root FS and using journalctl --directory=<mountpoint>/var/log/journal). Hopefully some of these failing services dumped more information there.
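
A rough sketch of reading that journal from a live session. The partition /dev/sdb4 and the root LV name come from the lsblk and fstab output earlier in the thread; the mapping name luks_root and the mount point /mnt/sysroot are arbitrary choices for illustration. By default the sketch only prints the commands (DRY_RUN=1), so nothing is unlocked or mounted until that is changed on purpose.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: LUKS partition, $2: mount point for the installed root FS.
read_installed_journal() {
    luks_dev=$1
    mnt=$2
    run cryptsetup open "$luks_dev" luks_root      # asks for the passphrase
    run vgchange -ay                               # activate the LVM volumes
    run mkdir -p "$mnt"
    run mount -o ro /dev/mapper/fedora_localhost--live-root "$mnt"
    # List the boots recorded in the installed system's journal,
    # then show only messages of priority "err" and worse.
    run journalctl --directory="$mnt/var/log/journal" --list-boots
    run journalctl --directory="$mnt/var/log/journal" -p err --no-pager
}

read_installed_journal /dev/sdb4 /mnt/sysroot
```

Running it with DRY_RUN=0 as root would execute the same steps for real.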

I booted with a Live ISO:

[liveuser@localhost-live ~]$ sudo lsblk 
NAME                              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0                               7:0    0   1.3G  1 loop  
loop1                               7:1    0     5G  1 loop  
├─live-rw                         253:0    0     5G  0 dm    /
└─live-base                       253:1    0     5G  1 dm    
loop2                               7:2    0    32G  0 loop  
└─live-rw                         253:0    0     5G  0 dm    /
sda                                 8:0    0   113G  0 disk  
├─sda1                              8:1    0   200M  0 part  
└─sda2                              8:2    0 112.2G  0 part  
sdb                                 8:16   1   116G  0 disk  
├─sdb1                              8:17   1   600M  0 part  
├─sdb2                              8:18   1   600M  0 part  
├─sdb3                              8:19   1     1G  0 part  
└─sdb4                              8:20   1 113.8G  0 part  
  └─sdb4_crypt                    253:2    0 113.8G  0 crypt 
    ├─fedora_localhost--live-swap 253:3    0   7.8G  0 lvm   
    ├─fedora_localhost--live-home 253:4    0    36G  0 lvm   
    └─fedora_localhost--live-root 253:5    0    70G  0 lvm   
sdc                                 8:32   1   3.8G  0 disk  
├─sdc1                              8:33   1   1.4G  0 part  /run/initramfs/live
├─sdc2                              8:34   1    11M  0 part  
└─sdc3                              8:35   1  22.9M  0 part  

I checked all the partitions:

[liveuser@localhost-live ~]$ sudo xfs_repair /dev/mapper/fedora_localhost--live-root
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 0
        - agno = 2
clearing reflink flag on inodes when possible
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
[liveuser@localhost-live ~]$ sudo fsck /dev/mapper/fedora_localhost--live-home 
fsck from util-linux 2.35.1
e2fsck 1.45.5 (07-Jan-2020)
/dev/mapper/fedora_localhost--live-home: clean, 383326/2359296 files, 6465478/9425920 blocks
[liveuser@localhost-live ~]$ sudo fsck /dev/sdb1
fsck from util-linux 2.35.1
fsck.fat 4.1 (2017-01-24)
/dev/sdb1: 0 files, 1/153296 clusters
[liveuser@localhost-live ~]$ sudo fsck /dev/sdb2
fsck from util-linux 2.35.1
** /dev/sdb2
   Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
   The volume name is Linux_HFS+_ESP
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume Linux_HFS+_ESP appears to be OK.
[liveuser@localhost-live ~]$ sudo fsck /dev/sdb3
fsck from util-linux 2.35.1
e2fsck 1.45.5 (07-Jan-2020)
/dev/sdb3: clean, 99/65536 files, 64981/262144 blocks
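
For reference, a small sketch of how a read-only first pass could be chosen per filesystem type before running a repairing check like the ones above. The helper name readonly_check_cmd is hypothetical, and the use of fsck.hfsplus -n for the HFS+ ESP is an assumption based on the fsck_hfs tool that shows up in the output; the helper only prints the command, it does not run it.

```shell
# Print a no-modify check command for a filesystem type ($1) and device ($2).
readonly_check_cmd() {
    case "$1" in
        xfs)            echo "xfs_repair -n $2" ;;    # -n: no-modify mode
        ext2|ext3|ext4) echo "e2fsck -n $2" ;;        # -n: read-only, answer "no"
        vfat)           echo "fsck.fat -n $2" ;;      # -n: check without writing
        hfsplus)        echo "fsck.hfsplus -n $2" ;;  # assumed: never repair
        *)              echo "unsupported filesystem type: $1" >&2
                        return 1 ;;
    esac
}
```

For example, `readonly_check_cmd xfs /dev/mapper/fedora_localhost--live-root` prints the xfs_repair -n invocation for the root LV; the type itself can be looked up with `lsblk -no FSTYPE <device>`.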

Then I rebooted with kernel 5.8.9, which failed again.
I rebooted with the Live ISO and got the journal:

Hm, unless my kinda tired eyes deceive me there are no failures of any kind in that log? Yet it clearly boots from 5.8.9, this is extremely weird. No more FS errors and RAID weirdness is a good thing, I guess, but I have to say I’m at a bit of a loss here.

F33 ISOs should have kernel 5.8.6. You could check whether those boot, and whether you can unlock and mount your drives from within that live system.

Yep, I did this:

sudo systemctl disable dmraid-activation.service

And I tried the following steps:

  1. I created an F33 Xfce Live USB and booted from it successfully. Treated as external storage, the LUKS volume can be decrypted, and both LVM root (xfs) and LVM home (ext4) can be mounted (rw). So far so good.
  2. Then I started the F33 (pre-release) installation procedure: reformatted the LVM root with xfs and kept the LVM home (ext4). The installation went perfectly, after which I tried booting F33 (kernel 5.8.6). Unfortunately, the issue was still there.
    See the journal here:
    https://paste.centos.org/view/b88fed31
  3. I repeated the F33 installation, but reformatted the LVM root with ext4 and kept the LVM home (ext4). The failure seemed to be the same.
    See the journal here:
    https://paste.centos.org/view/f28de1e3
  4. Finally, I reinstalled F32, reformatted the LVM root with ext4 and kept the LVM home (ext4). F32 (kernel 5.6.6) works very well.
  5. I ran an update with sudo dnf up -y, then rebooted into kernel 5.8.10, and it hung there again.
    See the journal here:
    https://paste.centos.org/view/00428ad5

@lcts Thank you so much for your help.

It’s still very weird to me that those errors do not seem to be reflected in the log. Also interesting that the F33 live system boots - so 5.8.6 works when run from the live system, but not when run from an installed root FS. Maybe something to do with some kernel module that went missing?

Anyways, given that we’ve successfully excluded any filesystem problems, or anything related to the /home filesystem this really does look like some weird kernel bug.

I’d suggest you update your bug report with the info that this error

  • persists after a system reinstall with both F32 & F33
  • occurs with 5.8.6 on the installed system, but not on the live system
  • is independent of the root FS (ext4/xfs)
  • is independent of whether /home gets mounted or not

Hopefully the kernel devs will have some ideas.

Thank you for your suggestion, I’ve updated the bug report.

I am assuming that you opened LUKS after the live system had fully booted.

Can you list all packages to be upgraded?

My point is that the problem may be related to dracut and the initramfs. The live CD has a working initramfs, and after the kernel loads you use tools from / to decrypt LUKS. But the initramfs that dracut generates on the installed system may not be able to decrypt your / from LUKS.
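
One way to test that hypothesis, sketched under the assumption that `lsinitrd -m` prints the dracut module names of an image one per line (any extra header lines are simply ignored). The helper name missing_dracut_modules is made up; crypt, dm and lvm are the dracut modules that an encrypted LVM root would normally need.

```shell
# Read a list of dracut module names on stdin and print every module
# named in the arguments that does not appear in that list.
missing_dracut_modules() {
    modules=$(cat)
    for want in "$@"; do
        # -x: the whole line must equal the module name
        printf '%s\n' "$modules" | grep -qx "$want" || echo "$want"
    done
}
```

On the installed system this could be fed with `sudo lsinitrd -m /boot/initramfs-<kernel version>.img | missing_dracut_modules crypt dm lvm`. If anything is reported, `sudo dracut --force /boot/initramfs-<kernel version>.img <kernel version>` rebuilds the image for that kernel.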

When booting failed with kernel 5.8.10, the journal was still written to /var/log/journal, so AFAICS / had been decrypted successfully.

Looks similar to this one. It happened in Rawhide once. Your log shows lots of lsetfilecon errors, and the EFI partition, being vfat, should not be able to carry SELinux labels. So I guess grub was not run properly for the new kernel.

I ran grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg and added enforcing=0 in the GRUB menu; that didn’t work.

Another post suggests reinstalling the EFI packages.

$ rpm -qa | grep efi   
grub2-efi-ia32-cdboot-2.04-23.fc32.x86_64
efibootmgr-16-7.fc32.x86_64
efi-filesystem-4-4.fc32.noarch
grub2-efi-x64-cdboot-2.04-23.fc32.x86_64
python3-olefile-0.46-9.fc32.noarch
efi-srpm-macros-4-4.fc32.noarch
grub2-efi-ia32-2.04-23.fc32.x86_64
grub2-efi-x64-2.04-23.fc32.x86_64
efivar-libs-37-7.fc32.x86_64
grub2-tools-efi-2.04-23.fc32.x86_64
$ sudo dnf history info 16                  
Transaction ID : 16
Begin time     : Mon 28 Sep 2020 01:41:12 PM CST
Begin rpmdb    : 1828:ded5bd04afdf83b8889f65a7b139429a21edd83a
End time       : Mon 28 Sep 2020 01:41:19 PM CST (7 seconds)
End rpmdb      : 1828:ded5bd04afdf83b8889f65a7b139429a21edd83a
User           : Pany <pany>
Return-Code    : Success
Releasever     : 32
Command Line   : reinstall efi-filesystem.noarch efi-srpm-macros.noarch efibootmgr.x86_64 efivar-libs.x86_64 grub2-efi-ia32.x86_64 grub2-efi-ia32-cdboot.x86_64 grub2-efi-x64.x86_64 grub2-efi-x64-cdboot.x86_64 grub2-tools-efi.x86_64 python3-olefile.noarch
Comment        : 
Packages Altered:
    Reinstall   grub2-efi-ia32-1:2.04-23.fc32.x86_64        @updates
    Reinstalled grub2-efi-ia32-1:2.04-23.fc32.x86_64        @@System
    Reinstall   grub2-efi-ia32-cdboot-1:2.04-23.fc32.x86_64 @updates
    Reinstalled grub2-efi-ia32-cdboot-1:2.04-23.fc32.x86_64 @@System
    Reinstall   grub2-efi-x64-1:2.04-23.fc32.x86_64         @updates
    Reinstalled grub2-efi-x64-1:2.04-23.fc32.x86_64         @@System
    Reinstall   grub2-efi-x64-cdboot-1:2.04-23.fc32.x86_64  @updates
    Reinstalled grub2-efi-x64-cdboot-1:2.04-23.fc32.x86_64  @@System
    Reinstall   grub2-tools-efi-1:2.04-23.fc32.x86_64       @updates
    Reinstalled grub2-tools-efi-1:2.04-23.fc32.x86_64       @@System
    Reinstall   efi-filesystem-4-4.fc32.noarch              @fedora
    Reinstalled efi-filesystem-4-4.fc32.noarch              @@System
    Reinstall   efi-srpm-macros-4-4.fc32.noarch             @fedora
    Reinstalled efi-srpm-macros-4-4.fc32.noarch             @@System
    Reinstall   efibootmgr-16-7.fc32.x86_64                 @fedora
    Reinstalled efibootmgr-16-7.fc32.x86_64                 @@System
    Reinstall   efivar-libs-37-7.fc32.x86_64                @fedora
    Reinstalled efivar-libs-37-7.fc32.x86_64                @@System
    Reinstall   python3-olefile-0.46-9.fc32.noarch          @fedora
    Reinstalled python3-olefile-0.46-9.fc32.noarch          @@System
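
A side note on the package list above: a plain grep for "efi" also matches python3-olefile, which has nothing to do with EFI. A minimal sketch of a tighter filter, with efi_packages as a made-up helper name; it keeps only lines where "efi" starts the name or follows a dash.

```shell
# Filter a package list on stdin down to names in which "efi" begins
# the name or a dash-separated part of it (so "olefile" is skipped).
efi_packages() {
    grep -E '(^|-)efi'
}
```

It would be used as `rpm -qa | efi_packages`.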

And the journal shows nothing different.

You are right. The log shows that LUKS is decrypted and swap is activated too. The initrd has done its job and / is loaded.

I just installed a fresh F33 on a KVM instance for comparison, and it seems your boot stops right after the last "Starting Create Static Device Nodes in /dev", so the coldplug of devices and the rebuild of the hardware database are never processed at all.

So it looks like a driver issue, and I think it is safe to submit a bug report to Bugzilla.
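
A small sketch of how such a stall could be spotted in a saved journal, assuming the default journalctl output format and systemd's usual "Starting X..." / "Started X." / "Finished X." messages. The helper name pending_units is hypothetical; it lists every job that was started but never reported as started or finished.

```shell
# Read journal text on stdin and print the description of each systemd
# job that logged "Starting ..." without a later "Started"/"Finished".
pending_units() {
    awk '
        match($0, /systemd\[1\]: Starting .*\.\.\.$/) {
            desc = substr($0, RSTART, RLENGTH)
            sub(/^systemd\[1\]: Starting /, "", desc)
            sub(/\.\.\.$/, "", desc)
            if (!(desc in seen)) { seen[desc] = 1; order[++n] = desc }
            pending[desc] = 1
            next
        }
        match($0, /systemd\[1\]: (Started|Finished) .*\.$/) {
            desc = substr($0, RSTART, RLENGTH)
            sub(/^systemd\[1\]: (Started|Finished) /, "", desc)
            sub(/\.$/, "", desc)
            delete pending[desc]
        }
        END {
            # Print in the order the jobs were first started.
            for (i = 1; i <= n; i++)
                if (order[i] in pending) print order[i]
        }
    '
}
```

It could be fed with `journalctl --directory=<mountpoint>/var/log/journal --no-pager | pending_units`, ideally limited to the failed boot with -b and a boot ID taken from --list-boots.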

Thank you for giving me a hand.

Hello all, I am having a similar issue on an early 2015 Macbook Air, and I’ve reported my bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1878596

I noticed that this was related to booting off an SD card, but I am having the same issue with the internal SSD. In the OP’s bug thread, it seems a kernel patch fixing this issue has been committed. I’m wondering if the OP still experiences this issue on kernel 5.8.14 or 5.8.15?