Bootstrapping BSD

Chapter 2 FreeBSD System Programming

Copyright(C) 2001,2002,2003,2004 Nathan Boeger and Mana Tominaga

2.1 Bootstrapping BSD
Bootstrap : to promote or develop by initiative and effort with little or no assistance <bootstrapped herself to the top> www.m-w.com

Bootstrapping a computer is the process of loading an operating system: the hardware is initialized, a small amount of code is read into memory, and that code is executed. This small bit of code then loads the larger operating system, which, once loaded, needs to create its entire environment. The whole process is complicated and highly platform specific.

In this chapter we explore the i386 bootstrap process on FreeBSD in detail. The concepts and processes are similar to those of the NetBSD and OpenBSD bootstrap programs on i386. Some assembly code is needed to actually boot an i386-based system, but we don't review it in detail; we focus mainly on the high-level concepts, so the discussion should make sense even if you aren't an assembly expert.

Note: Certain concepts discussed in this chapter, particularly real and protected modes, do not exist on other hardware architectures such as PPC and Alpha. Still, the i386 BSD install base is by far the largest and will continue to be (with the notable exception of Mac OS X), so the material should be relevant for many readers. If you're interested in the details of the boot system, you're likely to have custom kernel needs, custom filesystems, or device drivers to write. i386 architectures are also widely used in embedded systems. Given the install base, the i386 platform and its issues will remain relevant for the next few years. As far as we know, even the newer 64-bit CPUs will have the same boot process.

2.2 FreeBSD's Bootstrap Process

FreeBSD uses a three-phase bootstrap process. When you power on or reboot a computer, once the BIOS completes its system tests it loads the first sector of disk0 into memory. (Each of the first two boot programs is exactly 512 bytes, taking up one block of a hard disk.) This first sector is known as the master boot record (MBR), and it holds the boot0 program, the first program the computer loads and executes. The second program, boot1, is again fixed at 512 bytes and knows just enough to read the disk slice information and load boot2. Once boot2 is loaded, it can either boot the system directly or load the loader program, which is fairly sophisticated and designed to allow more control over exactly how the system boots.

boot0

The first program loaded by the BIOS, boot0, is a small program with a fixed size of 512 bytes that resides in the Master Boot Record (MBR). You can find the source for this program at /usr/src/sys/boot/i386/boot0. Of course, the BIOS of most modern computers can be set to boot from different drives, including CDROM, floppy, and IDE disks. For this chapter, we'll assume the computer is booting from the first hard disk, also known as disk drive 0, C:, or, to the BIOS, 0x80.

From this first disk's first sector, 512 bytes are read into memory at location 0x7C00. After that, the BIOS checks for the number 0xAA55 at memory location 0x7DFE (the last two bytes of the boot block code). This value, 0xAA55, is so important on i386 that it's been given a suitable name: the magic number. Only if the magic number exists at memory location 0x7DFE will the BIOS transfer control to memory location 0x7C00, where the boot0 code lies.
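The signature test the BIOS performs can be reproduced by hand on a running system. What follows is only a minimal sketch, not code from the FreeBSD sources: the function name has_boot_signature is made up for illustration, and it assumes the boot block has already been copied into an ordinary file. Because the i386 is little-endian, the value 0xAA55 is stored as the byte 0x55 followed by the byte 0xAA, at offsets 510 and 511 of the sector (0x7DFE and 0x7DFF once the sector is loaded at 0x7C00).

```sh
#!/bin/sh

# has_boot_signature FILE   (hypothetical helper, for illustration only)
# Succeeds if bytes 510 and 511 of FILE are 0x55 0xAA, which is the
# little-endian encoding of the magic number 0xAA55.
has_boot_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

For example, after saving the first sector of a disk to a file with dd (using bs=512 count=1), running has_boot_signature on that file tells you whether the BIOS would be willing to jump to it.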

This raises an important point when writing boot code on Intel i386 systems: the first location in your code (offset 0x0) has to be an instruction. When the BIOS transfers control to memory location 0x7C00, that location must contain an instruction, whether a simple jump to another location or the entry to the boot program's main routine. Otherwise you have no control over what the CPU actually does when the boot code is executed. In addition, because the state of the registers is unknown, you can't rely on having proper segment or stack registers set. This small amount of setup must be done by the boot code itself. And because no operating system is loaded yet, all I/O must be done using the BIOS routines. (The Intel CPU documentation includes a full list.)

After boot0 is loaded and control is transferred to it, it sets up its registers and stack. Then boot0 relocates itself to a lower memory location and jumps to the new address of its main routine. At this point boot0 must still be able to locate and boot other bootable disks and partitions. At its end, the boot0 code has a small list of known bootable types; a candidate must also contain the magic number in its last 2 bytes to be bootable.

Finally, when boot0 has finished searching for bootable disks and partitions, it prompts the user with a choice. If no selection is made within a short time, or if a key is pressed, boot0 loads the corresponding boot block into memory and transfers control to it. Again, this could be any operating system's boot code: you could set it up to load the bootstrap code for Linux or even DOS. For BSD, the next stage boot program is boot1.
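To get a feel for the data boot0 works from, you can inspect a saved copy of the MBR yourself. The sketch below is not boot0's logic (which is written in assembly); it only prints the fields of the standard PC slice table, which sits at offset 446 of the sector as four 16-byte entries. The function name mbr_list_slices is hypothetical. In each entry, byte 0 is the boot flag (80 marks the active slice) and byte 4 is the type (a5 is the type used by FreeBSD).

```sh
#!/bin/sh

# mbr_list_slices FILE   (hypothetical helper, for illustration only)
# Prints the boot flag and type byte of the four slice table entries
# stored at offset 446 of a 512-byte MBR image.
mbr_list_slices() {
    image=$1
    n=0
    while [ "$n" -lt 4 ]; do
        offset=$((446 + n * 16))
        # Split the 16 hex bytes of this entry into $1 .. $16.
        set -- $(dd if="$image" bs=1 skip="$offset" count=16 2>/dev/null | od -An -tx1)
        n=$((n + 1))
        echo "slice $n: flag=$1 type=$5"
    done
}
```

On an image whose first slice is an active FreeBSD slice, the first line of output would read "slice 1: flag=80 type=a5".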

boot1

Similar to boot0, boot1 is a very simple program with a total size of 512 bytes; it must also have the magic number in its final 2 bytes. Its purpose is to locate and read the FreeBSD disk partition, then locate and load the boot2 program.

Although in most situations boot1 is loaded by boot0, that order isn't the only available option. With flexible FreeBSD you have the option of using what's known as a dedicated disk. (What's telling is that it's more notoriously known as a dangerously dedicated disk.) A dedicated disk is a disk where the entire disk, that is, every sector the BIOS can see, belongs to FreeBSD. Normally you'd find an fdisk table, or slice table, on a PC disk; it's used to allow multiple operating systems to boot off a single PC disk. You can choose to use a dedicated disk and boot straight off boot1; the boot0 block does not need to be installed on the disk or used at all. Whichever method you choose, though, boot1 is a very important boot block and needs to be loaded.

The boot1 program is loaded at memory location 0x7C00 and operates in real mode; the environment isn't set up, and the registers are not in known states. The boot1 program has to set up the stack, define all segment registers, and use the BIOS interface for all I/O. As with boot0, the first location of boot1 (offset 0x0) must contain an instruction, because that is where control is transferred once it is loaded. After all this succeeds, boot1 reads the system disks in search of boot2.

Once boot2 is located, boot1 must set up boot2's environment; boot2 is a BTX (Boot Extender) client and is a little more sophisticated than boot0 and boot1. The boot1 program loads boot2 at memory location 0x9000, and its entry point is located at 0x9010. However, even after boot1 loads and transfers control to boot2, boot2 still uses a routine contained in boot1. If you read the source for the boot1 program you will notice a function called xread. This function is used to read data from the disk using the BIOS. So the location of boot1 in memory is very important, and boot2 has to be aware of that location to function properly.

boot2

So far we've loaded two boot blocks and one larger program into memory, transferred control twice, each time setting up a small environment again (stack, segment registers, etc.), and performed some limited I/O using the BIOS. We still haven't reached the point of loading an operating system. If you watch your computer screen during a FreeBSD boot, all you will have seen so far is perhaps F1 and that cute spinning line of ASCII. You might not think that's all that impressive, but it's the exact, precise nature of assembly code that makes the boot process seem so elegant and effortless.

Now on to the final bootstrap process, boot2. This final stage is simple, and one of two things can happen: the boot2 program loads the loader (we discuss this in the following section), or the boot2 program loads a kernel directly and boots without using the loader program at all. If you've ever interrupted the boot2 program while it's loading, you may have seen this, which boot2 prints to the screen:

     >> FreeBSD/i386 BOOT 
     Default: 0:ad(0,a)/boot/loader
     boot:

If you press enter, boot2 simply loads the default loader, as listed. However, if you type "boot kernel" it will load the kernel (/kernel) and boot it. You can, of course, change these default values. To find out more, read the documentation for boot(8).

We mentioned earlier that boot2 is a BTX (Boot Extender) client. What does that entail? BTX provides a basic virtual 8086 addressing environment. To explain what that means, a discussion of the history of memory addressing on Intel hardware is in order.

So far we've avoided mention of memory addressing schemes, which can be confusing because Intel CPUs suffer from legacy issues, and boot code design is usually left to those developers who absolutely need to write it. Unless you are porting a system to a new architecture, where your code is completely platform dependent, you will probably never be tasked with writing a boot loader. However, the boot process is very important to developers who write device drivers or do other kernel-related programming. This is where some developers will encounter the BTX loader.

Starting from around the 8088 until the 80186, Intel processors had only one way to address memory, called real mode. These early CPUs had whopping 16-bit registers and 20-bit memory addresses. The question then arises: how do you make a 20-bit address with 16-bit registers? The answer was to use two 16-bit registers, one serving as a base and the other as an offset from this base. The base register is shifted left 4 bits and the offset is added to it, and thus an astounding 20-bit address can be calculated. With all these nifty segment registers and bit shifting the early Intel processors could address a total of 1 megabyte. Today this would not even be large enough to hold a Word document, as a bloated example.
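The segment arithmetic is easy to try out with ordinary shell arithmetic. The helper below, real_mode_addr, is our own illustration and not part of any FreeBSD tool; it simply computes (segment << 4) + offset and truncates the result to 20 bits, which is our model of the address bus on those early CPUs.

```sh
#!/bin/sh

# real_mode_addr SEGMENT OFFSET   (hypothetical helper, for illustration only)
# Prints the 20-bit physical address formed by a real-mode
# segment:offset pair: (segment << 4) + offset, truncated to 20 bits.
real_mode_addr() {
    printf '0x%05X\n' $(( (($1 << 4) + $2) & 0xFFFFF ))
}

real_mode_addr 0x07C0 0x0000   # prints 0x07C00
real_mode_addr 0x0000 0x7C00   # prints 0x07C00: same byte, different pair
real_mode_addr 0xFFFF 0x000F   # prints 0xFFFFF: the last byte of the 1 megabyte range
```

Note that many different segment:offset pairs name the same physical byte, which is one reason boot code sets its segment registers explicitly instead of trusting whatever values it inherits.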

By the time the 80386 rolled around, addressing 1 megabyte was not enough; users demanded more memory and programs started to use more of it. A new addressing mode called protected mode was devised (it first appeared in 16-bit form on the 80286). On the 80386, protected mode allows addressing of up to 4 gigabytes of memory.

Another advantage of this new scheme was that it was easier for assembly programmers to work with. The main difference is that the extended registers (the same 16-bit registers, widened to 32 bits since the 386) can each contain a full 32-bit address. The segment registers, meanwhile, are now protected: a program can no longer load them with arbitrary base addresses. Instead they select descriptors maintained by the operating system, which the CPU uses to locate the real address in memory. This lookup includes checking permission bits (read, write, etc.) and involves the MMU (memory management unit).

Now back to BTX. What advantage does using BTX give us? Simple: flexibility. BTX provides enough services that a small program with a nice interface can be written, allowing greater flexibility in loading the kernel. On FreeBSD systems that program is the loader. In the next section you will see just how nice and flexible the loader really is. For the rest of this section we'll cover the basic BTX services.

The BTX services can be categorized into two basic groups. The first group is system services provided by direct function calls (similar to system calls). The second group is environment services, which are not called directly by the client. These services resemble those of an operating system; however, BTX operates as a single-task environment.

The BTX services provided by direct calls consist of two system calls, exit and exec. When exit is called, the BTX loader terminates and the system is rebooted. The second system call, exec, transfers control to the provided address. This transfer of control is done in supervisor mode, and the new program can leave protected mode.

The environment services the BTX loader provides are very basic. The BTX loader handles all hardware interrupts. These interrupts are then sent to the proper BIOS handlers. BTX also emulates the BIOS extended memory transfer call. And finally several Assembler instructions are emulated. These are pushf, popf, cli, sti, iret, and hlt.

A final note of caution: all programs written to run in the BTX environment need to be compiled and then linked with btxld. For more information, please read the btxld(8) man page.

2.3 Loader

The final boot stage consists of the loader. The loader program is a combination of standard commands (referred to as "built-in commands") and a Forth interpreter (based on ficl). The loader lets the user control how the system boots and allows for system recovery and maintenance. From the loader the user can load or unload the kernel or kernel modules, and can set and unset specific variables, such as rootdevice and module_path. These can also be set in /boot/loader.conf. The defaults the loader reads are located in /boot/defaults/loader.conf, which also lists many of the available options. Both of these files are constructed similarly to the /etc/rc.conf file.
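Because both files are plain lists of name="value" lines, it is easy to check from a running system which value would win. The following is a rough sketch under stated assumptions: loader_conf_get is a hypothetical helper of ours, it only understands simple name="value" lines, and it merely approximates the layering by letting later files on the command line override earlier ones. The loader itself does its own parsing.

```sh
#!/bin/sh

# loader_conf_get NAME FILE...   (hypothetical helper, for illustration only)
# Prints the value of NAME from the last FILE that sets it, looking
# only at simple lines of the form NAME="value". Later files override
# earlier ones. Prints an empty line if no file sets NAME.
loader_conf_get() {
    name=$1
    shift
    value=
    for file in "$@"; do
        [ -r "$file" ] || continue
        if grep -q "^$name=\"" "$file"; then
            value=$(sed -n "s/^$name=\"\([^\"]*\)\".*/\1/p" "$file" | tail -n 1)
        fi
    done
    printf '%s\n' "$value"
}
```

A typical call would be loader_conf_get module_path /boot/defaults/loader.conf /boot/loader.conf, which prints the value from /boot/loader.conf if that file sets one and the default otherwise.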

The loader program is very useful for kernel and device driver debugging. From the loader you can tell the kernel to boot with the debugger (ddb) enabled. Or, you can load a specific kernel module for device driver testing. If you're going to be writing any kernel modules or device drivers you should read all the documentation on the loader. First start with the man page and then review all the options contained in the /boot/loader.conf. The loader program could be very useful down the road when you need to extend your BSD system or diagnose a kernel crash.

2.4 Beginning Kernel Services

We're finally at the stage where the kernel is loaded into memory and control of the CPU is transferred to it. Once the kernel is loaded it needs to run through its initialization and prepare the system for multitasking. This initialization includes three main components. The first two are machine specific and written in a combination of assembly and C. These first two stages prepare the system and initialize the CPU's Memory Management Unit (MMU, if it exists) as well as handle initialization of hardware. The final stage entails setting up the basic kernel environment and getting the system ready to run process 1 (/sbin/init). The first two stages are highly platform dependent. Because every architecture has specific needs, we'll provide a high-level overview of these first two stages here, and go into the concepts in detail later in the book when we cover device drivers.

Stage 1 & 2 kernel assembly and C start-up routines

Although once loaded the kernel will assume nothing about the system, the loader program does pass some information to the kernel, such as the boot device and the boot flags. In addition, the kernel must create its environment and prepare the system for process 0 (explained below). These tasks include CPU detection, creation of a run-time task, and detection of memory amount.

CPU identification is an important step. Because each platform can include several different types of CPU (the i386 being one example), not all CPUs will support the same features. Take the MMX instruction set: it's not that important a feature for the kernel. The floating point unit is, however, so if a CPU lacks one it has to be emulated in software. The same goes for any other unsupported feature, as well as for known bugs or idiosyncrasies of the CPU.

The second stage will initialize the system's hardware and memory. This includes probing for hardware, initializing I/O devices, allocating memory for kernel structures, and creating the system message buffer. This stage is what you see during the boot screen, with lists of hardware flashing by. Traditionally in BSD, this stage is initialized by calling the function cpu_startup().

Stage 3 and process 0

Once the function cpu_startup() returns, the kernel needs to create process 0. This process 0 is commonly known as the swapper. If you run the ps command you will see it listed. However, it does not really exist in the sense that there is no binary named swapper associated with this process. The same is true of four other important processes found on a modern FreeBSD system: pagedaemon, vmdaemon, bufdaemon and syncer. To avoid complications, we'll just say that these processes are part of the VM subsystem; we discuss them in the process chapter. What's important to understand is that they are created during boot by the kernel, that they are not binary programs in the filesystem, and that they are fairly platform independent. They are written in C and are started after the initial platform environment is set up.

init and system shell scripts

Finally, after all of that assembly and platform-dependent code has executed, the kernel runs the first real program, /sbin/init. This program is quite simple and fairly small (on FreeBSD, it totals about 1,700 lines). As we discuss in the chapter on BSD processes, it is the one process from which all other processes descend. The strength of the design is that, because /sbin/init is just a binary in the file system, you can write a custom version. The main goal of /sbin/init on start-up is to run the system start-up scripts and prepare the system for multiuser mode. Be aware of signals: /sbin/init must handle them with some grace, or it could end up in a strange state and the system could crash on boot. During runtime, /sbin/init can also be sent signals to perform certain tasks. For example, if you want to tell the system not to spawn a process for a specific terminal listed in /etc/ttys, you can mark that terminal off and run the following command, which makes init reread /etc/ttys and spawn processes only for the terminals listed as on:

bash$ kill -HUP 1

Note that unless you're careful, you could end up with a system that you can't log into! (Look at the chapter on signals for more details.)
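Marking a terminal off is just a matter of changing one word in its /etc/ttys entry. The sketch below shows one cautious way to do it; ttys_disable is a hypothetical helper of ours, and it assumes the usual ttys layout (name, getty command, terminal type, status, flags) with the word "on" appearing only as the status. It prints the edited file instead of changing anything in place, so you can inspect the result first.

```sh
#!/bin/sh

# ttys_disable TTY FILE   (hypothetical helper, for illustration only)
# Prints FILE with the status field of the entry for TTY changed from
# "on" to "off". The fields of the edited line are re-joined with
# single spaces; all other lines are printed unchanged.
ttys_disable() {
    awk -v tty="$1" '
        $1 == tty {
            for (i = 2; i <= NF; i++)
                if ($i == "on")
                    $i = "off"
        }
        { print }
    ' "$2"
}

# Typical use, as root and only after checking the output by hand:
#   ttys_disable ttyv1 /etc/ttys > /etc/ttys.new
#   mv /etc/ttys.new /etc/ttys
#   kill -HUP 1
```

Keeping at least one terminal you can reach marked on is what saves you from the locked-out system mentioned above.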

On boot, the init program sets up the system for multiuser mode. It's quite a feat, involving tedious tasks such as starting every daemon and setting up the network. Unix systems offer a few ways to perform these tasks, mostly involving shell scripts. Some versions of Linux and System V systems use /etc/rc<n>.d/ directories, each corresponding to the runlevel on which its scripts should be started. The BSDs have a much simpler method: the rc scripts found in /etc/.

These scripts normally should not be edited; instead, set the values in /etc/rc.conf. With custom installations such as PicoBSD you might have to create your own scripts; PicoBSD is highly conscious of disk space and has specific filesystem needs. One important note: the /usr/local/etc/rc.d/ directory is special, in the sense that every executable found in it with a .sh suffix is executed after the /etc/rc scripts. For good system admins this directory replaces the older /etc/rc.local file. (The /etc/rc.local file was the older method of running custom scripts or programs at the very end of system start-up.)
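The handling of /usr/local/etc/rc.d/ boils down to a loop over the directory. Here is a simplified sketch of that idea, not the actual code from /etc/rc: the function name run_local_startup is ours, and passing the single argument start to each script reflects our understanding of how /etc/rc invokes them, so check the script on your own release before relying on it.

```sh
#!/bin/sh

# run_local_startup DIR   (hypothetical helper, for illustration only)
# Runs every executable file in DIR whose name ends in .sh, in the
# order the shell expands the glob (alphabetical), passing "start"
# as the only argument. Non-executable files are skipped.
run_local_startup() {
    for script in "$1"/*.sh; do
        if [ -x "$script" ]; then
            # Report a failing script but keep going with the rest.
            "$script" start || echo "warning: $script failed" >&2
        fi
    done
}
```

Since the scripts run in alphabetical order, a numeric prefix in the file name is a simple way to control the order in which your own services start.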

The BSD rc scripts include the following notable ones:

/etc/rc - main script and first to be called. Mounts the file system and then runs all the needed rc scripts.
/etc/rc.atm - used to configure ATM networking
/etc/rc.devfs - set up /dev/ permissions and links
/etc/rc.diskless1 - first diskless client script
/etc/rc.diskless2 - second diskless client script
/etc/rc.firewall - used for firewall rules
/etc/rc.firewall6 - used for IPv6 firewall rules
/etc/rc.i386 - Intel system specific needs
/etc/rc.isdn - ISDN network settings
/etc/rc.network - IPv4 network settings
/etc/rc.network6 - IPv6 network settings
/etc/rc.pccard - for laptop PC card controllers
/etc/rc.serial - set up serial devices
/etc/rc.sysctl - used to set sysctl options at boot time
/usr/local/etc/rc.d/ - general directory with custom start up scripts

Here is an example:

If you want to start rsync as a daemon then this script (although very simple) will do it:

#!/bin/sh

if [ -x  /usr/local/bin/rsync ] ; then
         /usr/local/bin/rsync --daemon
fi

This first checks that the rsync program exists and is executable, then runs it in daemon mode.
