I previously came across a general description of the Linux boot process.
I wanted to know more of the specifics, so I looked up some material and summarized it here.
1. CPU Reset
When we press the power button, the power supply begins powering the machine. Its output is unstable at first, so the motherboard keeps asserting the RESET signal to the CPU until the supply has stabilized. While held in reset, the CPU clears any residual data in its registers and loads them with predefined initial values.
The most important registers in this process are CS and EIP. The hidden base of CS is added to EIP to form the address of the first instruction the CPU fetches. The instruction stored at this address (also known as the reset vector) is a jump, and the destination of the jump is the entry address of the BIOS.
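As a worked example of the base-plus-offset calculation described above, here is a minimal sketch. The concrete values are an assumption: they are the reset values documented for 80386-and-later processors (hidden CS base 0xFFFF0000, EIP 0x0000FFF0), not values taken from the text above.

```python
# Assumed reset values for 80386 and later CPUs (not stated in the text above).
CS_BASE_AT_RESET = 0xFFFF0000  # hidden base cached alongside the CS selector
EIP_AT_RESET = 0x0000FFF0      # instruction pointer right after reset


def first_fetch_address(cs_base: int, eip: int) -> int:
    """Return the address of the first instruction: segment base + offset."""
    return (cs_base + eip) & 0xFFFFFFFF  # keep the result within 32 bits


reset_vector = first_fetch_address(CS_BASE_AT_RESET, EIP_AT_RESET)
print(hex(reset_vector))  # 0xfffffff0
```

Under these assumptions the first fetch lands 16 bytes below the 4 GB boundary, which is where the firmware ROM is mapped.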
Why is the reset vector not directly set to the entry address of the BIOS, but a jump is needed?
x86 chips start out in real mode, which has a 20-bit address space (1 MB). The reset vector is at offset FFF0H, only 16 bytes below FFFFH, so the BIOS program cannot fit there. Therefore, a jump is needed.
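The 16-byte figure can be checked with the real-mode address formula (segment shifted left by 4, plus offset). This is a small sketch; the segment value 0xF000 is an assumption (the CS selector documented at reset) and does not appear in the text above.

```python
ADDRESS_SPACE = 1 << 20  # 20 address bits give 1 MB


def real_mode_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a 20-bit physical address."""
    return ((segment << 4) + offset) & (ADDRESS_SPACE - 1)  # wrap at 1 MB


# Assumed segment 0xF000 with the offset FFF0H from the text.
addr = real_mode_address(0xF000, 0xFFF0)
bytes_left = ADDRESS_SPACE - addr
print(hex(addr), bytes_left)  # 0xffff0 16
```

Sixteen bytes is enough for a far jump instruction and little else, which is why the reset vector only redirects execution.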
As for why the reset vector is placed at a high address in the address space: this keeps the low addresses free for memory.
Memory layout in real mode
CPU Reset in Multi-Core Systems
In a multi-core computer, the CPUs run a selection protocol at startup, and one of them is chosen to carry on with the boot. The selected CPU is called the Bootstrap Processor (BSP). Only the BSP continues to execute, while the other CPUs (the Application Processors, or APs) stay in a waiting state until they receive instructions from the BSP.
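On x86, the outcome of this selection is visible to software as a flag bit: bit 8 of the IA32_APIC_BASE model-specific register is set only on the BSP. The sketch below just decodes that bit from a value; the two sample register values are hypothetical, and actually reading the MSR requires privileged code that is not shown here.

```python
IA32_APIC_BASE_MSR = 0x1B  # MSR index, listed for reference only (not read here)
BSP_FLAG_BIT = 8           # bit 8 is set on the bootstrap processor


def is_bsp(apic_base_value: int) -> bool:
    """Return True if the given IA32_APIC_BASE value has the BSP flag set."""
    return bool((apic_base_value >> BSP_FLAG_BIT) & 1)


# Hypothetical register values for two different cores.
print(is_bsp(0xFEE00900))  # True
print(is_bsp(0xFEE00800))  # False
```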
2. BIOS Execution
The BIOS is a program stored in non-volatile memory on the motherboard (historically ROM or EPROM, today usually flash). It is generally written by hardware manufacturers and is responsible for booting the system.
Power-On Self Test (POST)
The first thing the BIOS runs is POST, which checks the hardware components when the computer is powered on. If a non-serious error is detected during the self-test, the system reports it with an on-screen message or a beep code, depending on the detection code.
Why use beeps to sound the alarm?
Because there was no integrated graphics in the past, the graphics card was always an external device. When POST is executed, the graphics card has not completed initialization, so it cannot display error messages on the screen and can only sound the alarm.
The devices are tested one by one, and the BIOS manufacturer assigns a POST code to each device. When a device is being tested, its POST code is written to the diagnostic port. If the test fails, that code stays in the port and is used to raise the alarm.
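The test-then-latch behavior described above can be simulated in a few lines. This is only a sketch: the device names, POST codes, and test functions are hypothetical, and appending to a list stands in for writing to the real diagnostic port (conventionally I/O port 0x80 on PC-compatible machines).

```python
DIAGNOSTIC_PORT = 0x80  # conventional POST-code port; only simulated here


def run_post(tests):
    """Run (code, name, test) triples in order.

    Returns (failing_code_or_None, list_of_codes_written_to_the_port).
    """
    port_log = []
    for code, name, test in tests:
        port_log.append(code)      # stands in for writing `code` to the port
        if not test():
            return code, port_log  # the failing code stays in the port
    return None, port_log


# Hypothetical codes and devices; the memory test is made to fail.
tests = [
    (0x01, "cpu", lambda: True),
    (0x02, "memory", lambda: False),
    (0x03, "keyboard", lambda: True),
]
failed, log = run_post(tests)
print(failed, log)  # 2 [1, 2]
```

Because the code is written before each test starts, the last value left in the port identifies the device that was under test when things went wrong.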
Device Initialization
After POST, BIOS will also call the BIOS of each peripheral device for self-test and initialization, such as the BIOS of the graphics card. All devices are initialized and started at this time.
In addition to initializing devices, BIOS also initializes the interrupt vector table at this time.
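For reference, the real-mode interrupt vector table is a fixed-layout array, so locating and decoding an entry is simple arithmetic. A minimal sketch, assuming the standard real-mode layout (256 entries of 4 bytes each starting at address 0, with the offset stored before the segment); the sample entry bytes are hypothetical.

```python
import struct

IVT_BASE = 0x0000   # the table starts at physical address 0 in real mode
ENTRY_SIZE = 4      # 2-byte offset followed by 2-byte segment
NUM_VECTORS = 256


def ivt_entry_address(vector: int) -> int:
    """Return the physical address of the table entry for an interrupt vector."""
    if not 0 <= vector < NUM_VECTORS:
        raise ValueError("vector must be in the range 0..255")
    return IVT_BASE + vector * ENTRY_SIZE


def decode_entry(raw: bytes):
    """Decode a 4-byte entry into a (segment, offset) handler address."""
    offset, segment = struct.unpack("<HH", raw)
    return segment, offset


print(hex(ivt_entry_address(0x10)))       # 0x40
print(NUM_VECTORS * ENTRY_SIZE)           # 1024
# Hypothetical entry bytes pointing at handler F000:F065.
print(decode_entry(b"\x65\xf0\x00\xf0"))  # (61440, 61541)
```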
Bootloader Startup
After detecting and initializing the various devices, the BIOS consults the user-configured boot order. The default is usually the hard disk, but it can also be a CD-ROM or a USB drive (which used to be the way to reinstall the operating system).
Following this order, the BIOS reads the Master Boot Record (MBR) of the first device in the list, which is the first sector of the first track (512 bytes), loads it at the absolute address 0x7C00 in RAM, and then jumps to that address.
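The search-and-load step can be sketched as a simulation. Everything here is illustrative: the device names and their contents are hypothetical, a bytearray stands in for RAM, and the check for the 0x55 0xAA signature in the last two bytes is how a legacy BIOS conventionally decides whether a sector is bootable.

```python
SECTOR_SIZE = 512
LOAD_ADDRESS = 0x7C00
BOOT_SIGNATURE = b"\x55\xaa"


def find_and_load(boot_order, devices, memory):
    """Copy the first bootable sector found into memory at 0x7C00.

    Returns the name of the device booted from, or None if none qualifies.
    """
    for name in boot_order:
        sector = devices.get(name, b"")[:SECTOR_SIZE]
        if len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE:
            memory[LOAD_ADDRESS:LOAD_ADDRESS + SECTOR_SIZE] = sector
            return name  # a real BIOS would now jump to 0x7C00
    return None


memory = bytearray(1 << 20)  # simulated 1 MB of RAM
devices = {
    "cdrom": bytes(SECTOR_SIZE),          # no signature, so not bootable
    "disk": bytes(510) + BOOT_SIGNATURE,  # minimal bootable sector
}
booted = find_and_load(["cdrom", "disk"], devices, memory)
print(booted)  # disk
```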
MBR format
MBR does not belong to any operating system. Its 512-byte content is as follows:
- Boot code (446B): Checks the accuracy of the partition table and transfers control to the bootloader program on the hard disk (such as GRUB).
- Disk partition table (4 x 16B): the DPT consists of 4 partition entries of 16B each, which describe the disk's partitions.
- End flag (2B): the boot signature 0x55 0xAA.
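The layout above maps directly onto byte offsets, so a sector can be split up with a few slices. The following is a minimal sketch, assuming the classic MBR partition entry layout (status byte, 3 CHS bytes, type byte, 3 CHS bytes, 32-bit starting LBA, 32-bit sector count); the sample sector is built in memory, and its partition values are hypothetical.

```python
import struct

ENTRY_FORMAT = "<B3xB3xII"  # status, CHS (skipped), type, CHS (skipped), LBA, count


def parse_mbr(sector: bytes):
    """Split a 512-byte MBR into boot code, partition entries, and signature."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly 512 bytes")
    boot_code = sector[:446]
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i:446 + 16 * (i + 1)]
        status, ptype, lba_start, num_sectors = struct.unpack(ENTRY_FORMAT, raw)
        entries.append({
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "num_sectors": num_sectors,
        })
    valid = sector[510:512] == b"\x55\xaa"
    return boot_code, entries, valid


# Build a hypothetical sector: one bootable Linux (type 0x83) partition.
sector = bytearray(512)
sector[446:462] = struct.pack(ENTRY_FORMAT, 0x80, 0x83, 2048, 204800)
sector[510:512] = b"\x55\xaa"

boot_code, entries, valid = parse_mbr(bytes(sector))
print(len(boot_code), valid)                             # 446 True
print(hex(entries[0]["type"]), entries[0]["lba_start"])  # 0x83 2048
```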
UEFI
When we talk about BIOS now, we mainly refer to UEFI, not the traditional BIOS. Whether it is the traditional BIOS or UEFI, they both go through the process of ROM→RAM→BOOT, and the main process is the same. So why do we need UEFI?
You can refer to this answer: What is the difference between UEFI and BIOS in terms of principles?
3. Bootloader
The so-called Bootloader program is used to load the operating system kernel file into memory. The main functions of such programs are:
- Transition from real mode to protected mode, moving from the 20-bit real-mode address space to a 32-bit address space, and enable the protected-mode segmentation mechanism.
- Read the ELF-format kernel from the hard disk (from the sectors following the MBR) and place it at a fixed position in memory. This process is generally done in two steps, and the final step is to execute the boot instruction, which loads the system boot menu (/boot/grub/menu.lst or grub.conf), the kernel vmlinuz, and the RAM disk initrd.
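For the ELF case mentioned above, the loader needs at least to recognize the file and find its entry point. A minimal sketch of that check for a 32-bit little-endian ELF header follows; the sample header is constructed in memory, and its entry address 0x100000 is a hypothetical value.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def elf32_entry_point(image: bytes) -> int:
    """Validate a 32-bit little-endian ELF header and return its entry address."""
    if image[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    if image[4] != 1:  # EI_CLASS: 1 means 32-bit
        raise ValueError("not a 32-bit ELF image")
    if image[5] != 1:  # EI_DATA: 1 means little-endian
        raise ValueError("not a little-endian ELF image")
    # After the 16-byte e_ident: e_type, e_machine, e_version, e_entry.
    _e_type, _e_machine, _e_version, e_entry = struct.unpack_from("<HHII", image, 16)
    return e_entry


# Hypothetical header: executable (2), x86 (3), version 1, entry at 0x100000.
header = ELF_MAGIC + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HHII", 2, 3, 1, 0x100000)
print(hex(elf32_entry_point(header)))  # 0x100000
```

A real loader would go on to walk the program headers and copy each loadable segment to its target address before jumping to the entry point.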
Here, we take GNU's GRUB as an example. GRUB can be used to select different kernels on the operating system partition and to pass startup parameters to these kernels. Loading GRUB generally involves two steps:
- Load the basic boot program, stage1, whose main function is to load the second boot program, stage2.
- Load the second boot program, stage2, which provides the more advanced functions that let users load a specific operating system. In GRUB, this step presents a menu to the user or lets the user enter commands. The final state of this stage is to execute the boot command, which loads the operating system kernel into memory and transfers control to it.
After the kernel is loaded, the memory is mapped as follows:
4. Linux Kernel Configuration and Startup
After the Linux kernel starts, it performs a series of checks. Once the checks are complete, it jumps to the start_kernel function, which initializes the various subsystems in sequence, such as page tables and interrupt vectors. The thread running start_kernel then becomes process 0.
Process 0 will fork process 1, the kernel_init process. The kernel_init process will execute the init program, roughly as follows:
init program
- The init process reads the /etc/inittab file, which sets the run level of Linux and determines the process running mode.
- Linux executes the first user-level file, /etc/rc.d/rc.sysinit, which handles tasks such as setting the PATH environment variable, configuring the network, enabling the swap partition, setting up /proc, and so on.
- Read /etc/modules.conf and the files in the /etc/modules.d directory to load kernel modules.
- Start some services, and then execute the /etc/rc.d/rc.local file.
- Finally, execute the /bin/login program to start the login interface and prompt the user to enter a username and password.
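To make the first step above concrete, here is a small sketch of how the default run level can be read from inittab-style text. It assumes the SysV init line format id:runlevels:action:process; the sample content is hypothetical and not copied from a real system.

```python
def default_runlevel(inittab_text: str):
    """Return the run level named by the initdefault entry, or None."""
    for line in inittab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":", 3)  # id, runlevels, action, process
        if len(fields) == 4 and fields[2] == "initdefault":
            return fields[1]
    return None


# Hypothetical inittab content.
sample = """# default run level
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
"""
print(default_runlevel(sample))  # 3
```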
After user-space startup is complete, process 0 enters cpu_idle and becomes the idle process. At this point, Linux has essentially finished booting.