Moving GPU drivers out of the initramfs
The firmware which drm/kms drivers need is becoming bigger and bigger and there is a push to move to generating a generic initramfs on distro's builders and signing the initramfs with the distro's keys for security reasons. When targetting desktops/laptops (as opposed to VMs) this means including firmware for all possible GPUs which leads to a very big initramfs.
This has made me think about dropping the GPU drivers from the initramfs and instead make plymouth work well/better with simpledrm (on top of efifb). A while ago I discussed making this change for Fedora with the Red Hat graphics team spoiler: For now nothing is going to change.
Let me repeat that: For now there are no plans to implement this idea so if you believe you would be impacted by such a change: Nothing is going to change.
Still this is something worthwhile to explore further.
Advantages:
1. Smaller initramfs size:
* E.g. a host specific initramfs with amdgpu goes down from 40MB to 20MB
* No longer need to worry about Nvidia GSP firmware size in initrd
* This should also significantly shrink the initrd used in liveimages
2. Faster boot times:
* Loading + unpacking the initrd can take a surprising amount of time. E.g. on my old AMD64 embedded PC (with BobCat cores) the reduction of 40MB -> 20MB in initrd size shaves approx. 3 seconds of initrd load time + 0.6s seconds from the time it takes to unpack the initrd
* Probing drm connectors can be slow and plymouth blocks the initrd -> rootfs transition while it is busy probing
3. Earlier showing of splash. By using simpledrm for the splash the splash can be shown earlier, avoiding the impression the machine is hanging during boot. An extreme example of this is my old AMD64 embedded PC, where the time to show the first frame of the splash goes down from 47 to 9 seconds.
4. One less thing to worry about when trying to create a uniform desktop pre-generated and signed initramfs (these would still need support for nvme + ahci and commonly used rootfs + lvm + luks).
Disadvantages:
Doing this will lead to user visible changes in the boot process:
1. Secondary monitors not lit up by the efifb will stay black during full-disk encryption password entry, since the GPU drivers will now only load after switching to the encrypted root. This includes any monitors connected to the non boot GPU in dual GPU setups.
2. With simpledrm plymouth does not get the physical size of the monitor, so plymouth will need to switch to using heuristics on the resolution instead of DPI info to decide whether or not to use hidpi (e.g. 2x size) rendering and even when switching to the real GPU driver plymouth needs to stay with its initial heuristics based decision to avoid the scaling changing when switching to the real driver which would lead to a big visual glitch / change halfway through the boot.
This may result in a different scaling factor for some setups, but I do not expect this really to be an issue.
3. On some (older) systems the efifb will not come up in native mode, but rather in 800x600 or 1024x768.
This will lead to a pretty significant discontinuity in the boot experience when switching from say 800x600 to 1920x1080 while plymouth was already showing the spinner at 800x600.
One possible workaround here is to add: 'video=efifb:auto' to the kernel commandline which will make the efistub switch to the highest available resolution before starting the kernel. But it seems that the native modes are simply not there on systems which come up at 800x600 / 1024x768 so this does not really help.
This does not actually break anything but it does look a bit ugly. So we will just need to document this as an unfortunate side-effect of the change and then we (and our users) will have to live with this (on affected hardware).
4. On systems where a full modeset is done the monitor going briefly black from the modeset will move from being just before plymouth starts to the switch from simpledrm drm to the real driver. So that is slightly worse. IMHO the answer here is to try and get fast modesets working on more systems.
5. On systems where the efifb comes up in the panel's native mode and a fast modeset can be done, the spinner will freeze for a (noticeable) fraction of a second as the switch to the real driver happens.
Preview:
To get an impression what this will look / feel like on your own systems, you can implement this right now on Fedora 40 with some manual configuration changes:
1. Create /etc/dracut.conf.d/omit-gpu-drivers.conf with:
omit_drivers+=" amdgpu radeon nouveau i915 "
And then run "sudo dracut -f" to regenerate your current initrd.
2. Add to kernel commandline: "plymouth.use-simpledrm"
3. Edit /etc/selinux/config, set SELINUX=permissive this is necessary because ATM plymouth has issues with accessing drm devices after the chroot from the initrd to the rootfs.
Note this all assumes EFI booting with efifb used to show the plymouth boot splash. For classic BIOS booting it is probably best to stick with having the GPU drivers inside the initramfs.
This has made me think about dropping the GPU drivers from the initramfs and instead make plymouth work well/better with simpledrm (on top of efifb). A while ago I discussed making this change for Fedora with the Red Hat graphics team spoiler: For now nothing is going to change.
Let me repeat that: For now there are no plans to implement this idea so if you believe you would be impacted by such a change: Nothing is going to change.
Still this is something worthwhile to explore further.
Advantages:
1. Smaller initramfs size:
* E.g. a host specific initramfs with amdgpu goes down from 40MB to 20MB
* No longer need to worry about Nvidia GSP firmware size in initrd
* This should also significantly shrink the initrd used in liveimages
2. Faster boot times:
* Loading + unpacking the initrd can take a surprising amount of time. E.g. on my old AMD64 embedded PC (with BobCat cores) the reduction of 40MB -> 20MB in initrd size shaves approx. 3 seconds of initrd load time + 0.6s seconds from the time it takes to unpack the initrd
* Probing drm connectors can be slow and plymouth blocks the initrd -> rootfs transition while it is busy probing
3. Earlier showing of splash. By using simpledrm for the splash the splash can be shown earlier, avoiding the impression the machine is hanging during boot. An extreme example of this is my old AMD64 embedded PC, where the time to show the first frame of the splash goes down from 47 to 9 seconds.
4. One less thing to worry about when trying to create a uniform desktop pre-generated and signed initramfs (these would still need support for nvme + ahci and commonly used rootfs + lvm + luks).
Disadvantages:
Doing this will lead to user visible changes in the boot process:
1. Secondary monitors not lit up by the efifb will stay black during full-disk encryption password entry, since the GPU drivers will now only load after switching to the encrypted root. This includes any monitors connected to the non boot GPU in dual GPU setups.
Generally speaking this is not really an issue, the secondary monitors will light up pretty quickly after the switch to the real rootfs. However when booting a docked laptop, with the lid closed and the only visible monitor(s) are connected to the non boot GPU, then the full-disk encryption password dialog will simply not be visible at all.
This is the main deal-breaker for not implementing this change.
Note because of the strict version lock between kernel driver and userspace with nvidia binary drivers, the nvidia binary drivers are usually already not part of the initramfs, so this problem already exists and moving the GPU drivers out of the initramfs does not really make this worse.
This is the main deal-breaker for not implementing this change.
Note because of the strict version lock between kernel driver and userspace with nvidia binary drivers, the nvidia binary drivers are usually already not part of the initramfs, so this problem already exists and moving the GPU drivers out of the initramfs does not really make this worse.
2. With simpledrm plymouth does not get the physical size of the monitor, so plymouth will need to switch to using heuristics on the resolution instead of DPI info to decide whether or not to use hidpi (e.g. 2x size) rendering and even when switching to the real GPU driver plymouth needs to stay with its initial heuristics based decision to avoid the scaling changing when switching to the real driver which would lead to a big visual glitch / change halfway through the boot.
This may result in a different scaling factor for some setups, but I do not expect this really to be an issue.
3. On some (older) systems the efifb will not come up in native mode, but rather in 800x600 or 1024x768.
This will lead to a pretty significant discontinuity in the boot experience when switching from say 800x600 to 1920x1080 while plymouth was already showing the spinner at 800x600.
One possible workaround here is to add: 'video=efifb:auto' to the kernel commandline which will make the efistub switch to the highest available resolution before starting the kernel. But it seems that the native modes are simply not there on systems which come up at 800x600 / 1024x768 so this does not really help.
This does not actually break anything but it does look a bit ugly. So we will just need to document this as an unfortunate side-effect of the change and then we (and our users) will have to live with this (on affected hardware).
4. On systems where a full modeset is done the monitor going briefly black from the modeset will move from being just before plymouth starts to the switch from simpledrm drm to the real driver. So that is slightly worse. IMHO the answer here is to try and get fast modesets working on more systems.
5. On systems where the efifb comes up in the panel's native mode and a fast modeset can be done, the spinner will freeze for a (noticeable) fraction of a second as the switch to the real driver happens.
Preview:
To get an impression what this will look / feel like on your own systems, you can implement this right now on Fedora 40 with some manual configuration changes:
1. Create /etc/dracut.conf.d/omit-gpu-drivers.conf with:
omit_drivers+=" amdgpu radeon nouveau i915 "
And then run "sudo dracut -f" to regenerate your current initrd.
2. Add to kernel commandline: "plymouth.use-simpledrm"
3. Edit /etc/selinux/config, set SELINUX=permissive this is necessary because ATM plymouth has issues with accessing drm devices after the chroot from the initrd to the rootfs.
Note this all assumes EFI booting with efifb used to show the plymouth boot splash. For classic BIOS booting it is probably best to stick with having the GPU drivers inside the initramfs.