Preface

This document reflects what I have learned through porting several MIPS machines and other related Linux work. Hopefully it will help beginners to get started and give the experienced a reference point.

This document goes through all the major steps to port Linux to a MIPS machine. The focus can perhaps be called "MIPS machine abstraction layer", i.e., the interface between machine-specific code and, mostly, MIPS common code. Another useful document focuses on "Linux hardware abstraction layer", i.e., the interface between Linux common code and architecture-specific code. The document is written by C Hanish Menon (www.hanishkvc.com).

There are some notations used in this document.

TODO
Reminder for incomplete part
HELP
Really need your help on this
DEBATE
Here is my opinion. What do you think?
THANKS
Thanks to the person who pointed this out
NOTE
Additional note added later, but the original comment may still be useful

Chapter 1: An overview

Prerequisites:

  • Know C programming.

  • Have some knowledge of OS concepts, such as interrupt handling, system calls, memory management.

  • Know how to configure and build the Linux kernel. Plenty of help is available if you are not yet comfortable with this.

  • Have some knowledge of the MIPS CPU. More than likely you will need to deal with CP0 registers, enable or disable interrupts, etc.

  • You don't have to be an expert in MIPS assembly, but total ignorance of it might make you handicapped in some situations.

  • Obviously, you need MIPS hardware to play with.

  • Finally but most importantly, you need a willing-to-learn heart and perhaps many restless debugging hours. :-)

It is also highly recommended to read through the Linux MIPS HOWTO by Ralf Bächle, ralf@gnu.org. By the way, as part of the prerequisites, you should also remember Ralf's name. :-)

Kernel source trees

The common MIPS tree is the CVS tree at linux-mips.org. See the instructions in "Anonymous CVS servers" section in Linux/MIPS HOWTO. The current kernel version as of 2004/01/26 is 2.6.1. You can always check out earlier stable revisions by using "linux_2_4" or even "linux_2_2" branch tag.

Kernel patches

For various reasons, a kernel tree may carry bugs for quite a long time before a suitable fix is checked in. There are various places to get patches. Here are some of the more common ones:

Jun Sun patches
Linux/MIPS FTP archive
Maciej W. Rozycki patches
Brad LaRonde's patches

Cross-compilation and toolchains

More than likely your MIPS box does not run Linux yet (why would you bother otherwise?). Therefore you will need another machine to build the kernel image. Once the image is built, you need to download this image to your MIPS machine and let it run your MIPS kernel. This is called cross-development. Your MIPS box is often called the target machine and the machine used to build the kernel image is called the host machine.

Cross-development is common for developing on embedded targets, because usually embedded targets do not have enough power or the peripherals to develop natively.

The most common host machine is probably Linux on i386/PCs.

You need to have cross-development tools set up on your host before you can start. While you can find instructions, such as in the Linux/MIPS HOWTO, for building cross-compilation tools, your best bet is probably to get some ready-made ones.

MontaVista used to offer its Journeyman edition for free, which included a full-featured toolchain. Unfortunately, it no longer does. Instead you can download the preview kit, which includes a "slim" version of the toolchain. You can get the kit from http://www.mvista.com/previewkit/index.html

Dan Kegel has a set of scripts that build cross-compiling tools. You can check it out here.

The following are links to pre-built toolchains, instructions for building your own toolchain and, finally, pre-compiled distributions for MIPS boards:

Brad LaRonde's cross toolchain for Linux
Steve Hill's toolchains for glibc and uClibc
MIPS free toolchains
Distributions for MIPS

Overall porting steps:

Depending on your specific cases, some of the following steps can be skipped.

  1. Hello World! - Get the board set up, get the serial port working, and print out "Hello, world!" through the serial port.

  2. Add your own code.

  3. Get early printk working - Make the first MIPS image and see the printk output from kernel.

  4. Serial driver and serial console - Get the real printk working with the serial console.

  5. KGDB - KGDB can be enormously helpful in your development. It is highly recommended and it is not that difficult to set up.

  6. CPU support - If your MIPS CPU is not currently supported, you need to add new code that supports it.

  7. Board specific support - Create your board-specific directory. Setup interrupt routing/handling and kernel timer services.

  8. PCI subsystems - If your machine has PCI, you need to get the PCI subsystem working before you can use PCI devices.

  9. Ethernet drivers - You should already have the serial port working before attempting this. Usually the next driver you want is the ethernet driver. With the ethernet driver working, you can set up an NFS root file system, which gives you a fully working Linux userland.

  10. ROMFS root file system - Alternatively you can create a userland file system as a ROMFS image stored in a ramdisk.

Chapter 2: "Hello, world!"

In cross development, the serial port is usually the most important interface: that is where you can see anything happening! It is worthwhile to make sure the serial port works before you even start playing with Linux. You can find the sample code or gzipped tar ball of a stand-alone program that can do printf. Such a program can even be useful in later debugging stages, e.g., printing out hardware register values.

Before you rush to type 'make', check and modify the following configurations:

  1. The sample code assumes R4K style CP0 structure. It should apply to most CPUs named above number 4000 and the recent MIPS32/MIPS64.

  2. Check that you have at least 1MB of RAM. (You really need at least 1MB to run Linux at all.) 8MB of RAM or more is recommended.

  3. Is your serial port a standard UART type? If yes, modify the serial code and parameters. If not, you will have to supply your own functions to utilize the UART.

  4. What is your cross-tool name and path? Modify the Makefile accordingly.

Now, fire your "make" command.

Depending on the downloader on your MIPS box, you may need to generate an ELF image, a binary image or an SREC image.

Download the barebone image to your target and give it a run! Connect the serial port to your host machine. Start minicom and hopefully you can see the "Hello, world!" message.

Troubleshooting:

  • Make sure your bootloader downloads the image to the uncached KSEG1 segment. If your bootloader downloads to the cached KSEG0 area, you will want to run the image from the KSEG0 area too.

  • If your bootloader has already initialized the serial port, you may want to skip your own initialization.

  • Did you set up minicom correctly? Test it with other machines.

  • Hopefully it is not a toolchain problem...
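The KSEG0/KSEG1 point above is easier to check with the addresses written out. On 32-bit MIPS both segments are windows onto the same low 512MB of physical memory: KSEG0 (0x80000000) is cached and KSEG1 (0xa0000000) is uncached. The helper names below are made up for this sketch and are not taken from the sample program.

```c
#include <stdint.h>

/* 32-bit MIPS kernel segments. Both map the low 512MB of physical memory;
 * KSEG0 is cached, KSEG1 is uncached. */
#define KSEG0_BASE      0x80000000u
#define KSEG1_BASE      0xa0000000u
#define KSEG_PHYS_MASK  0x1fffffffu

/* Strip the segment bits to get the physical address. */
static uint32_t kseg_to_phys(uint32_t vaddr)
{
    return vaddr & KSEG_PHYS_MASK;
}

/* Cached view of a physical address. */
static uint32_t phys_to_kseg0(uint32_t paddr)
{
    return (paddr & KSEG_PHYS_MASK) | KSEG0_BASE;
}

/* Uncached view of a physical address. */
static uint32_t phys_to_kseg1(uint32_t paddr)
{
    return (paddr & KSEG_PHYS_MASK) | KSEG1_BASE;
}

/* True if vaddr lies in the uncached KSEG1 window. */
static int is_kseg1(uint32_t vaddr)
{
    return (vaddr & 0xe0000000u) == KSEG1_BASE;
}
```

For example, an image linked at 0x80002000 has the uncached alias phys_to_kseg1(kseg_to_phys(0x80002000)), which is 0xa0002000. If the bootloader reports a load address, is_kseg1() tells you which window it used.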

Chapter 3: Add your own code

Let us add some code to the tree and make a Linux image. For convenience's sake, let us say we are porting Linux to a MIPS board called Aloha.

Create the right directory for your board

Your code for a new board can be classified into board-support code (or board-specific code) and drivers. Driver code should be placed under the 'drivers' directory and board specific code should be placed under 'arch/mips' directory.

The easiest choice is to create a directory called 'arch/mips/aloha'.

However, a couple of other considerations might make it slightly complicated.

  • If Aloha uses a chipset or System on a Chip (SOC) that is already supported or belongs to a bigger family, such as NEC VR41xx and gt64120, it makes sense to put Aloha code under those sub-directories. You can re-use and share a lot of common code.

  • Similarly, if Aloha is the first board that uses a chipset or SOC which is expected to be used in many other boards, you may want to create a similar directory structure. However, if you are not sure, just create your own board-specific directory.

In the past people have created directories based on the board manufacturer's name, such as "mips-boards". This generally is not a good idea. It is almost certain that some of these boards do not share anything common at all.

To make things worse, sometimes boards made by different companies use the same chipset or SOC. Now what are you going to do? Are you going to duplicate the common code? Or are you going to stick one company's board under another company's name?

For header files, you usually create a similar directory or header files under include/asm-mips. [DEBATE] For board-specific header files, I would encourage people to place them under the corresponding 'arch/mips' directory if possible.

In our example, we will create the 'arch/mips/aloha' directory.

Write the minimum Aloha code

Let us write some code for the Aloha board which can generate a complete Linux image without complaining about missing symbols.

Go to this directory to browse 'arch/mips/aloha' directory. Or download the gzipped file of the directory.

Obviously the code is not complete yet, but if you follow the following steps and everything is correct, you should be able to generate a Linux/MIPS kernel image of your very own!

Hook up your code with the Linux tree

Most of the steps are fairly straightforward:

  • include/asm-mips/bootinfo.h - Add your machine group ID, machine group name and Aloha machine ID.

  • arch/mips/kernel/setup.c - Add 'aloha_setup' function declaration and invocation code.

  • arch/mips/Makefile - Add a section that links your Aloha code in.

    #
    # Hawaii Aloha board
    #
    ifdef CONFIG_ALOHA
    SUBDIRS += arch/mips/aloha
    LIBS += arch/mips/aloha/aloha.o
    LOADADDR += 0x80002000
    endif

    LOADADDR is the starting address for your Linux image when it is loaded into RAM. Note that the first 0x200 bytes are used by the exception vectors on most CPUs. Some CPUs require a larger space, so modify LOADADDR accordingly. Due to the linker's addressing limit, the start address is aligned on an 8KB boundary, so setting your LOADADDR to 0x80002000 should be reasonable.

  • arch/mips/config-shared.in - Add necessary config information for Aloha board.

    1. Add the following to 'Machine selection'.

      dep_bool 'Support for Hawaii Aloha board  (EXPERIMENTAL)' CONFIG_ALOHA $CONFIG_EXPERIMENTAL
    2. Add a set of default configs for the board, which depend on the features and drivers that the Linux port will support. Here is a very simple example for our minimum Aloha board configurations.

      if [ "$CONFIG_ALOHA" = "y" ]; then
      define_bool CONFIG_CPU_R4X00 y
      define_bool CONFIG_CPU_LITTLE_ENDIAN y
      define_bool CONFIG_SERIAL y
      define_bool CONFIG_SERIAL_MANY_PORTS y
      define_bool CONFIG_NEW_IRQ y
      define_bool CONFIG_NEW_TIME_C y
      define_bool CONFIG_SCSI n
      fi

    There are two kinds of configuration options here. The first kind are those you cannot select interactively during 'make config' or 'make menuconfig' or 'make xconfig'. Examples are CONFIG_NEW_IRQ and CONFIG_NEW_TIME_C. You must put them here, or else they will not get selected. The second kind of options are those you can select interactively, such as CONFIG_CPU_R4X00 and CONFIG_SERIAL. However you may also put them here if you know which selection is right for the board. This way people will make fewer mistakes when they configure for the board.
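Returning to the LOADADDR value in the Makefile fragment above, the arithmetic behind 0x80002000 can be sketched as follows. It assumes the image lives in KSEG0 and that the only constraints are the size of the exception vector area and the 8KB alignment mentioned earlier; pick_loadaddr() is a made-up helper, not something in the kernel tree.

```c
#include <stdint.h>

#define KSEG0_BASE  0x80000000u
#define ALIGN_8K    0x2000u

/* Round addr up to the next multiple of align (align must be a power of 2). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* First 8KB-aligned KSEG0 address above a vector area of vector_bytes bytes. */
static uint32_t pick_loadaddr(uint32_t vector_bytes)
{
    return align_up(KSEG0_BASE + vector_bytes, ALIGN_8K);
}
```

With the usual 0x200 bytes of vectors, pick_loadaddr(0x200) gives 0x80002000. A CPU that reserves more than 8KB at the bottom of RAM would push the result to the next 8KB boundary.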

For instant gratification, you can find a complete patch for adding the Aloha board support to the Linux/MIPS CVS tree checked out on January 20, 2004.
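To give a feel for the 'setup.c' hook-up described above, here is a host-side sketch of the dispatch. It is not the actual patch: MACH_GROUP_ALOHA and its value are hypothetical, the switch is a simplified stand-in for the one in 'arch/mips/kernel/setup.c', and in the real tree the case would normally sit inside an #ifdef CONFIG_ALOHA block.

```c
#include <stdio.h>

/* Stand-ins for kernel definitions. MACH_GROUP_ALOHA and its value are
 * hypothetical; the real ID would be added to include/asm-mips/bootinfo.h. */
#define MACH_GROUP_UNKNOWN  0
#define MACH_GROUP_ALOHA    99

static unsigned long mips_machgroup = MACH_GROUP_UNKNOWN;
static int aloha_setup_called;

/* Board setup hook; on a real port it lives in arch/mips/aloha/setup.c. */
static void aloha_setup(void)
{
    aloha_setup_called = 1;
}

/* Simplified stand-in for the machine-group switch in setup.c. */
static void setup_arch_dispatch(void)
{
    switch (mips_machgroup) {
    case MACH_GROUP_ALOHA:      /* wrapped in #ifdef CONFIG_ALOHA in a kernel */
        aloha_setup();
        break;
    default:
        printf("unsupported machine group %lu\n", mips_machgroup);
        break;
    }
}
```

The machine group itself has to be set before this switch runs, typically by the board's early boot code.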

Configure and build a kernel image

Now you are ready to run your favorite configuration tool. Since we do not have much code added yet, do not be too greedy with selecting options. Just pick a couple of simple options such as the serial and serial console.

  • If you denoted the Aloha board support to be EXPERIMENTAL, select 'Prompt for development and/or incomplete code/drivers' under 'Code maturity level options'.

  • Select 'Support for Hawaii Aloha board' and unselect all other machines under 'Machine selection'.

  • Select the right CPU. Under 'CPU selection' select your CPU. If there is no entry for the CPU on your board, you will need to add support for it. Most recent CPUs can generally run to some degree with CPU_R4X00.

  • Under 'Character devices', select 'Standard/generic (8250/16550 and compatible UARTs) serial support' and 'Support for console on serial port'. Unselect the 'Virtual terminal' option.

  • Under the 'Kernel hacking' option, select 'Are you using a crosscompiler'.

  • For other options either take the default or select 'no'.

Here is a sample minimum config for our Aloha board.

Before you type 'make', double-check the 'arch/mips/Makefile' and make sure the cross-toolchain program names are correct and in your execution path i.e. your PATH environment variable.

Now type 'make dep' and 'make'. Then wait for a miracle to happen!

Chapter 4: Early printk

Assuming you are lucky and actually generate an image from the last chapter, don't bother running it because you won't see anything. This is not strange because all our board-specific code is empty and we have not told Linux kernel anything about our serial port or I/O devices yet.

The sign of a live Linux kernel comes from the output of printk, which is routed to the first console. Since we have configured a serial console, we should be able to see something on the serial wire if we have set it up correctly.

Unfortunately, setup of the serial console happens much later during the kernel startup process. (See Appendix A for a chart of the kernel start-up sequence). Chances are your new kernel probably dies even before that. That is where the early printk patch comes in handy. It allows you to see printk as early as the first line of C code.

By the way, the first line of C code for Linux/MIPS is the first line of the 'init_arch()' function in the 'arch/mips/kernel/setup.c' file.

For kernel versions earlier than 2.4.10, you can find the early printk patch here for boards with standard UART serial ports. From 2.4.10 onward, a new printk patch is needed. If you have already got the stand-alone "Hello, world!" program running, the early printk should be easy to get going, and you should have printk output from the Linux kernel very soon.
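The idea behind an early printk can be sketched in a few lines: format the message into a fixed buffer, then push it out one character at a time through the same polled output routine the stand-alone program uses. The code below is an illustration, not the patch itself. early_putc() and the capture buffer are stand-ins so that the sketch runs on a host; on the target, early_putc() would poll the UART.

```c
#include <stdarg.h>
#include <stdio.h>

/* Host-side stand-in for the serial port: output is appended to a buffer.
 * On real hardware early_putc() would poll the UART instead. */
static char early_out[256];
static unsigned int early_len;

static void early_putc(char c)
{
    if (early_len < sizeof(early_out) - 1)
        early_out[early_len++] = c;
    early_out[early_len] = '\0';
}

/* Format into a fixed buffer, then emit it character by character. */
static int early_printk(const char *fmt, ...)
{
    char buf[128];
    va_list ap;
    int n, i;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);

    if (n < 0)
        return n;
    if (n >= (int)sizeof(buf))
        n = (int)sizeof(buf) - 1;   /* message was truncated */

    for (i = 0; i < n; i++) {
        if (buf[i] == '\n')
            early_putc('\r');       /* serial terminals want CR before LF */
        early_putc(buf[i]);
    }
    return n;
}
```

Calling early_printk("CPU %d\n", 0) leaves "CPU 0" followed by CR and LF in the capture buffer. Because nothing here depends on interrupts or the console layer, the same structure can be called from the very first lines of C code.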

Chapter 5: Serial driver and console

While early printk is rather useful, you still need to get the real serial driver working.

Assuming you have a standard serial port, there are two ways to add serial support: static defines and run-time setup.

With static defines, you modify the 'include/asm-mips/serial.h' file. Looking through the code, it is not difficult to figure out how to add support for your board's serial port(s).

As more boards are supported by Linux/MIPS, the 'serial.h' file gets crowded. One potential solution is to do run-time serial setup. Run-time serial setup is sometimes necessary anyway, when a parameter can only be determined at run time, for example when settings are read from a non-volatile memory device or an option is passed on the kernel command line.

There are two elements to consider for doing run-time serial setup:

  • Reserve the 'rs_table[]' size, see the 'drivers/char/serial.c' file. Unfortunately there is not a clean way to accomplish this yet. A temporary workaround is to define CONFIG_SERIAL_MANY_PORTS in 'arch/mips/config-shared.in' for your board. This configuration reserves up to 64 serial port entries for your board!

  • Call the 'early_serial_setup()' routine in your board setup routine. Here is a piece of sample run-time initialization code.
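A run-time setup along these lines might look like the sketch below. To keep it buildable on a host, 'struct serial_struct', 'rs_table[]' and 'early_serial_setup()' are reduced mock-ups here; in a 2.4 tree the real definitions come from the kernel headers and 'drivers/char/serial.c', and they carry more fields than shown. The Aloha UART address, IRQ number and clock are invented for illustration.

```c
#include <string.h>

/* Host-side stand-ins. The field subset and constant values are
 * illustrative only; use the kernel's own definitions on a real port. */
#define SERIAL_IO_MEM  2
#define RS_TABLE_SIZE  4

struct serial_struct {
    int             line;             /* index into rs_table[] */
    int             irq;
    int             baud_base;
    int             io_type;
    unsigned char  *iomem_base;
    unsigned short  iomem_reg_shift;
};

static struct serial_struct rs_table[RS_TABLE_SIZE];

/* Mock: copy the request into the slot selected by req->line. */
static int early_serial_setup(struct serial_struct *req)
{
    if (req->line < 0 || req->line >= RS_TABLE_SIZE)
        return -1;
    rs_table[req->line] = *req;
    return 0;
}

/* Hypothetical Aloha values: memory-mapped UART seen through KSEG1,
 * registers 4 bytes apart, 1.8432MHz clock, IRQ 4. */
static int aloha_serial_init(void)
{
    struct serial_struct s;

    memset(&s, 0, sizeof(s));
    s.line = 0;
    s.irq = 4;
    s.baud_base = 1843200 / 16;
    s.io_type = SERIAL_IO_MEM;
    s.iomem_base = (unsigned char *)0xbf000000UL;
    s.iomem_reg_shift = 2;

    return early_serial_setup(&s);
}
```

The board setup routine would call aloha_serial_init() once, early enough that the serial console finds the port already described when it initializes.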

Serial parameters

Most of the parameter settings are rather obvious. Here is a list of some less obvious ones:

line
Only used for run-time serial configuration. It is the index into the 'rs_table[]' array.
io_type
io_type determines how your serial registers are accessed. Two common types are SERIAL_IO_PORT and SERIAL_IO_MEM. A SERIAL_IO_PORT type driver uses the inb/outb macros to access registers whose base address is at port. In other words, if you specify SERIAL_IO_PORT as the io_type, you should also specify the port parameter.
For SERIAL_IO_MEM, the driver uses the readb/writeb macros to access registers whose address is iomem_base plus a shifted offset. The number of bits shifted is specified by iomem_reg_shift. For example, if all the serial registers are placed on 4-byte boundaries, then you have an iomem_reg_shift of 2.
Generally SERIAL_IO_PORT is for serial ports on an ISA bus and SERIAL_IO_MEM is for memory-mapped serial ports.
There are also SERIAL_IO_HUB6 and SERIAL_IO_GSC. The HUB6 was a multi-port serial card produced by Bell Technologies. The GSC is a special bus found on PA-RISC systems. These options will not be used on any MIPS boards.
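The SERIAL_IO_MEM address calculation described above can be written out as a small helper. uart_reg_addr() is a made-up name for this sketch; the register offsets are the usual 16550 ones.

```c
#include <stdint.h>

/* Standard 16550 register offsets used in this sketch. */
#define UART_TX   0   /* transmit holding register */
#define UART_LSR  5   /* line status register */

/* Address of register `offset` on a memory-mapped port: the offset is
 * shifted left by iomem_reg_shift before being added to the base. */
static uintptr_t uart_reg_addr(uintptr_t iomem_base, unsigned int offset,
                               unsigned int iomem_reg_shift)
{
    return iomem_base + ((uintptr_t)offset << iomem_reg_shift);
}
```

With registers on 4-byte boundaries (shift of 2) and a base of 0xbf000000, the line status register lands at 0xbf000014. With byte-spaced registers (shift of 0) it would be at 0xbf000005.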

Non-standard serial ports

If you have a non-standard serial port, you will have to write your own serial driver.

Some people derive their code from the standard serial driver. Unfortunately this is a very daunting task. The 'drivers/char/serial.c' file has over 6000 lines of code!

Fortunately, there is an alternative [THANKS: Fillod Stephane]. There is a generic serial driver, 'drivers/char/generic_serial.c'. This file provides a set of serial routines that are hardware independent. Then your task is to provide only the routines that are hardware dependent such as the interrupt handler routine and low-level I/O functions. There are plenty of examples. Just look inside the 'drivers/char/Makefile' and look for drivers that link with 'generic_serial.o' file.

[HELP: Would appreciate if you can share your experience of writing proprietary serial driver code or using the 'generic_serial.c' file.]

Chapter 6: KGDB

For many Linux kernel developers, KGDB is a life-saving tool. With KGDB, you can debug the kernel while it runs! You can set breakpoints or do single stepping at the source code level.

To do this, you will need a dedicated serial port on your target, and use a crossover-cable (also known as a null-modem) to connect it to your development host. If you are also using a serial console, this implies you will need two serial ports on your target. It is possible to do both kernel debug and serial console through a single serial port. This will be mentioned later in this chapter.

When you configure the kernel, select the 'Remote GDB kernel debugging' which is listed under "Kernel hacking". Do a 'make clean' and recompile the kernel so that debugging symbols are compiled into your kernel image. Try to make a new image. You will soon discover two missing symbols in the final linking stage:

arch/mips/kernel/kernel.o: In function `getpacket':
arch/mips/kernel/kernel.o(.text+0x85ac): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x85cc): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x8634): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x864c): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x8670): undefined reference to `putDebugChar'
arch/mips/kernel/kernel.o(.text+0x8680): undefined reference to `putDebugChar'
arch/mips/kernel/kernel.o(.text+0x8698): undefined reference to `putDebugChar'
arch/mips/kernel/kernel.o(.text+0x86a0): undefined reference to `putDebugChar'

You need to supply these two functions for your own boards:

int putDebugChar(uint8 byte)
uint8 getDebugChar(void)

As an example, here is the dbg_io.c for DDB5476 board. DDB5476 uses a standard UART serial port to implement those two functions.
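For a standard 16550-style UART, the two functions can be as small as the following sketch. It is not the DDB5476 code. It assumes byte-spaced registers and that the port has already been initialized, and it uses a small array as a stand-in for the hardware so that it can be tried on a host; on a board, dbg_uart would point at the UART registers instead.

```c
typedef unsigned char uint8;

/* 16550 register offsets and line-status bits. */
#define UART_RX        0      /* receive buffer (read) */
#define UART_TX        0      /* transmit holding register (write) */
#define UART_LSR       5      /* line status register */
#define UART_LSR_DR    0x01   /* receive data ready */
#define UART_LSR_THRE  0x20   /* transmit holding register empty */

/* Host-side stand-in for the UART registers; the transmitter starts idle. */
static volatile uint8 fake_uart[8] = { [UART_LSR] = UART_LSR_THRE };
static volatile uint8 *dbg_uart = fake_uart;

/* Busy-wait until the transmitter is free, then send one byte. */
int putDebugChar(uint8 byte)
{
    while (!(dbg_uart[UART_LSR] & UART_LSR_THRE))
        ;
    dbg_uart[UART_TX] = byte;
    return 1;
}

/* Busy-wait until a byte has arrived, then return it. */
uint8 getDebugChar(void)
{
    while (!(dbg_uart[UART_LSR] & UART_LSR_DR))
        ;
    return dbg_uart[UART_RX];
}
```

Both routines poll rather than use interrupts, since the debugger stub has to work while the rest of the kernel is stopped.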

After supplying those two functions, you are ready to debug the kernel. Run the new kernel image. If you also use the early printk patch, you should be able to see something like this on your console:

Wait for gdb client connection ...

Assuming you have already connected a cross-over serial cable between the dedicated serial port on the target and a serial port on your host (say, COM0), you can then set the appropriate baud rate and start the cross gdb on your host:

stty -F /dev/ttyS0 38400
mipsel-linux-gdb vmlinux

At the gdb prompt, type

target remote /dev/ttyS0

And, voila! You should be talking to the kernel through KGDB - if you are lucky enough!

A couple of tips on using KGDB:

  • Any functions preceded with the '__init' label will not break very well with breakpoints. Sometimes it will screw up the line numbers of other functions in the same file. Try undefining __init to be an empty macro in your 'include/linux/init.h' file. Refer to the patch. [NOTE: This problem is fixed in the latest gdb version, at least in gdb 5.2]

  • Sometimes if you break on a function, you cannot see the correct value in variables and cannot do back-tracing. This is probably because certain registers are still not initialized [HELP: because kernel is compiled with -O2 flag?]. Step into the function a couple of lines, and you should see the variable and back-tracing fine.

What if the board only has one serial port?

Some boards only have one serial port. If you use it as serial console, you cannot really use it for KGDB - unless you do some tricks to it.

There are two solutions. One is GDB console, and the other is to use a KGDB demuxing script.

It is easy to use the GDB console. When you select 'Remote GDB kernel debugging' under the 'Kernel hacking' sub-menu, you are also prompted for 'Console output to GDB'. Simply selecting that choice will work! In fact, this option is so easy to use you might want to use it even if you have a second serial port.

However, this option has a limit. When the kernel goes to userland, the console stops working. This is because the KGDB stub in the kernel and GDB are not designed to provide interactive output. [HELP: any volunteers?]

The second option uses a script called 'kgdb_demux' written by Brian Moyle. It creates two virtual ports, typically ttya0 and ttya1. It then listens to the real serial port (such as ttyS0). It will forward console traffic to ttya0 and KGDB traffic to ttya1. All you have to do then is to start minicom on /dev/ttya0 (port setting does not matter) and KGDB on /dev/ttya1.

You can download the tarball here. A couple of usage tips.

  • Untar the file to some place.

  • Copy kgdb_demux script to your execution path, and modify it properly.

  • Set the port parameters properly before you start kgdb_demux.

Chapter 7: CPU support

In the MIPS world, there are different families of MIPS cores. For example Toshiba TX49, SiByte SB1, NEC VR41XX, MIPS32, and MIPS64 to name a few. These different families may have different cache architectures, may be 32-bit or 64-bit, and may or may not have a FP unit. A complete list of families is found in the 'arch/mips/config-shared.in' and 'include/asm-mips/cpu.h' files. When you are configuring your kernel the complete list will appear under the 'CPU type' menu selection. If you are adding support for an entirely new family of MIPS cores, you will need to change the following files:

  • arch/mips/config-shared.in - You will need to add your processor family to the list under 'CPU type'.

  • include/asm-mips/cpu.h - You will need to add PRID_COMP and PRID_IMP entries for your processor family if they do not already exist. You will also need to then add the CPU types that you will support in that family. Adding CPU types is covered below.

  • arch/mips/kernel/cpu-probe.c - For each processor family, there is typically a cpu_probe_XXX function that probes and fills in the struct cpuinfo_mips which contains information about the CPU type, whether a FP unit exists, the number of TLBs available, etc. Your function is responsible for properly probing and filling in this information for the entire family of processors. Finally, if your family does something special for CPU idle, then you will have to define a 'wait' function. Most likely, you will be able to use one that already exists. If not, add your 'wait' function and enable detection of it in check_wait.

  • arch/mips/mm - If your processor family has a unique cache architecture or anything out of the ordinary, you will need to either add or modify files in this directory.
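The family probe described in the list above boils down to decoding the CP0 PRId register: bits 23..16 hold the company ID, bits 15..8 the implementation ID and bits 7..0 the revision. The sketch below shows the shape of such a probe on a host. The "Aloha" IDs, the reduced cpuinfo structure and the TLB size are invented for illustration and do not correspond to any real core.

```c
#include <stdint.h>

/* Field layout of the CP0 PRId register. */
#define PRID_COMP_MASK  0x00ff0000u   /* company ID */
#define PRID_IMP_MASK   0x0000ff00u   /* implementation (processor) ID */
#define PRID_REV_MASK   0x000000ffu   /* revision */

/* Hypothetical IDs for an imaginary "Aloha" core family. */
#define PRID_COMP_ALOHA  0x00630000u
#define PRID_IMP_ALOHA1  0x00004200u

enum cputype { CPU_UNKNOWN, CPU_ALOHA1 };

/* Reduced stand-in for struct cpuinfo_mips. */
struct cpuinfo {
    enum cputype cputype;
    unsigned int revision;
    int tlbsize;
};

static struct cpuinfo current_cpu;

/* Per-family probe: fill in what is known about each member. */
static void cpu_probe_aloha(struct cpuinfo *c, uint32_t prid)
{
    c->revision = prid & PRID_REV_MASK;
    switch (prid & PRID_IMP_MASK) {
    case PRID_IMP_ALOHA1:
        c->cputype = CPU_ALOHA1;
        c->tlbsize = 48;              /* made-up value */
        break;
    default:
        c->cputype = CPU_UNKNOWN;
        break;
    }
}

/* Top-level probe: pick the family handler from the company ID. */
static void cpu_probe(struct cpuinfo *c, uint32_t prid)
{
    c->cputype = CPU_UNKNOWN;
    c->revision = 0;
    c->tlbsize = 0;

    switch (prid & PRID_COMP_MASK) {
    case PRID_COMP_ALOHA:
        cpu_probe_aloha(c, prid);
        break;
    }
}
```

In the kernel the PRId value is read from CP0 rather than passed in; taking it as a parameter here just makes the decoding easy to exercise.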

Within each family of MIPS cores, there are multiple processors. For example, Alchemy processors and Broadcom processors are in the MIPS32 family, the Sibyte 1250 is in the SB1 family, and so on. Once you have decided what family your processor is in, you need to verify that a unique CPU identification exists for it. The 'include/asm-mips/cpu.h' file contains all of the currently supported CPU types for the Linux/MIPS kernel.

If your processor is already listed in the 'include/asm-mips/cpu.h' file, the existing Linux/MIPS code should auto-detect it. If your processor is not listed, you will need to change three files to properly detect your processor:

  • include/asm-mips/cpu.h - Add your processor at the end of the CPU_XXX definitions. Make sure you update the 'CPU_LAST' value to match that of the processor you added.

  • arch/mips/kernel/proc.c - Add an entry to the cpu_name array so that your CPU name can be properly displayed in /proc/cpuinfo.

  • arch/mips/kernel/cpu-probe.c - Add your processor into the case statement in the function check_wait if your CPU can do some sort of wait or idle operation. Add the necessary code in the cpu_probe_XX function to detect and set your CPU type.
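The relationship between the CPU_XXX list, 'CPU_LAST' and the cpu_name array can be sketched as follows. The three-entry list and the "Aloha1" name are hypothetical; the real list in 'include/asm-mips/cpu.h' is much longer, and cpu_type_name() is a helper invented for this example.

```c
#include <stddef.h>

/* Hypothetical CPU type list; new entries go at the end. */
enum {
    CPU_UNKNOWN = 0,
    CPU_R4000,
    CPU_ALOHA1,             /* newly added processor */
    CPU_LAST = CPU_ALOHA1   /* must track the final entry */
};

/* Names indexed by CPU type, as shown in /proc/cpuinfo. */
static const char *cpu_name[CPU_LAST + 1] = {
    [CPU_UNKNOWN] = "unknown",
    [CPU_R4000]   = "R4000",
    [CPU_ALOHA1]  = "Aloha1",
};

/* Look up a name, falling back to "unknown" for out-of-range types. */
static const char *cpu_type_name(unsigned int cputype)
{
    if (cputype > CPU_LAST || cpu_name[cputype] == NULL)
        return "unknown";
    return cpu_name[cputype];
}
```

If 'CPU_LAST' is not updated when a type is appended, the new entry falls outside the table, which is why the two changes belong together.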

Once you have support for your processor family and specific CPU type, you should be able to take full advantage of its capabilities.

Chapter 8: Board support - interrupts

If you followed the previous steps, most likely you will see the kernel hanging at the BogoMIPS calibration step. The reason is simple: the interrupt code is not there and jiffies is never updated. Therefore the calibration can never finish.
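To see why the boot stalls there: the calibration code begins by waiting for jiffies to change so that it starts on a tick boundary, and with no timer interrupt that wait never ends. The sketch below is a host-side illustration, not kernel code. wait_for_tick() and its spin limit are invented for this example; it performs the same kind of wait but gives up after a bounded number of polls, which can be handy as a temporary debugging aid.

```c
/* Stand-in for the kernel's tick counter; nothing ever increments it here,
 * just as on a board whose timer interrupt is not hooked up yet. */
static volatile unsigned long fake_jiffies;

/* Poll *jiffies_p until it changes; give up after max_spins polls.
 * Returns 1 if a tick was seen, 0 if the counter never moved. */
static int wait_for_tick(const volatile unsigned long *jiffies_p,
                         unsigned long max_spins)
{
    unsigned long start = *jiffies_p;

    while (max_spins-- > 0) {
        if (*jiffies_p != start)
            return 1;
    }
    return 0;
}
```

A return value of 0 from wait_for_tick(&fake_jiffies, 100000) corresponds to the hang described above: the timer interrupt is not reaching the kernel.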

Before you start writing interrupt code, it really pays to study the hardware first. Pay particular attention to identify all interrupt sources, their corresponding controllers and how they are routed.

Then you need to come up with a strategy, which typically includes:

  • a static interrupt routing map

  • a list of interrupt sources

  • a list of their corresponding controllers

  • how interrupt controllers cascade from each other

Interrupt code overview

To completely service an interrupt, four different pieces of code work together:

IRQ detection/dispatching
This is typically assembly code in a file called 'int_handler.S'. Sometimes there is also secondary-level dispatching code written in C for complicated IRQ detection. The end result is that we identify and select a single IRQ source, represented by an integer, and then pass it to function 'do_IRQ()'.
do_IRQ()
do_IRQ() is provided in the 'arch/mips/kernel/irq.c' file. It provides a common framework for IRQ handling. It invokes the individual IRQ controller code to enable/disable a particular interrupt. It calls the driver supplied interrupt handling routine that does the real processing.
hw_irq_handler
It is a structure associated with each IRQ source. The structure is a collection of function pointers, which tells do_IRQ() how it should deal with this particular IRQ.
driver interrupt handling code
The code that does the real job.

Obviously for our porting purposes we need to write IRQ detection/dispatching code and the hw_irq_handler code for any new IRQ controller used in the system. In addition, there is also IRQ setup and initialization code.

CPU as an IRQ controller

Most R4K-compatible MIPS CPUs have the ability to support eight interrupt sources. The first two are software interrupts, and the last one is typically used for CPU counter interrupt. Therefore you can have up to five external interrupt sources.

Luckily if the CPU is the only IRQ controller in the system, you are done! The hw_irq_handler code is written in arch/mips/kernel/irq_cpu.c file. All you have to do is to define "CONFIG_IRQ_CPU" for your machine. And in your irq setup routine make sure you call mips_cpu_irq_init(0). This call will initialize the interrupt descriptor and set up the IRQ numbers to range from 0 to 7.

You can also find a matching interrupt dispatching code in this file.
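The dispatching logic for the CPU-only case can be expressed in C for clarity, even though the real code is normally assembly. On R4K-style CPUs the pending bits IP0..IP7 sit in bits 8..15 of the Cause register and the matching mask bits IM0..IM7 sit in the same positions of the Status register. cpu_pending_irq() is a made-up name, and scanning from the highest number down is just one possible priority choice.

```c
#include <stdint.h>

#define CAUSE_IP_SHIFT  8
#define CAUSE_IP_MASK   0xff00u   /* Cause.IP0..IP7; Status.IM0..IM7 use the same bits */

/* Return the highest-numbered pending and unmasked CPU interrupt (0..7),
 * or -1 if none is pending. */
static int cpu_pending_irq(uint32_t cause, uint32_t status)
{
    uint32_t pending = (cause & status & CAUSE_IP_MASK) >> CAUSE_IP_SHIFT;
    int irq;

    for (irq = 7; irq >= 0; irq--) {
        if (pending & (1u << irq))
            return irq;
    }
    return -1;
}
```

With the IRQ base set to 0, the returned value is the number that would be handed to do_IRQ(). A result of -1 corresponds to a spurious interrupt.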

Set up cascading interrupts

More than likely you will have more interrupt sources than those that can directly connect to the CPU interrupt pins. A second or even third-level interrupt controller may connect to one or more of those CPU pins. In that case, you have cascading interrupts.

There are plenty of examples of how cascading interrupt works, such as the DDB5477 and the Osprey. Here is a short summary:

  • Assign blocks of IRQ numbers to various interrupt controllers in the whole system. For example, in the case of the Osprey, CPU interrupts occupy IRQ 0 to 7. Vr4181 system interrupts occupy IRQ 8 to 39, and GPIO interrupts occupy 40 to 56. In most cases, the actual IRQ numbers do not matter, as long as the driver knows which IRQ number it should use. However, if you have an i8259 interrupt controller and an ISA bus, you should try to assign IRQ numbers 0 to 15 for the i8259 interrupts because it will make the legacy PC drivers happy. (Please note that before ~12/08/2001 in version 2.4.16 of the Linux kernel, the 'i8259.c' file set the base vector to be 0x20. If you use the IRQ acknowledgement cycle to obtain the interrupt vector, you will get an IRQ number from 0x20 to 0x2f. You will then need to subtract 0x20 from the return value to get the correct IRQ number.)

  • Write the 'hw_irq_controller' member functions for your specific controllers. Note that CPU and i8259 already have their code written. You just need to define appropriate CONFIG options for your board. See the next sub-section for more details about writing 'hw_irq_controller' member functions.

  • In your IRQ setup routine, initialize all the controllers, usually by calling 'interrupt_controller_XXX_init()' functions.

  • In your IRQ setup routine, set up the cascading IRQs. This setup will enable interrupts for the upper interrupt controller so that the lower-level interrupts can cascade through once they are enabled. A typical way of doing this is to have a dummy 'irqaction' struct and set it up as follows:

    static struct irqaction cascade =
    { no_action, SA_INTERRUPT, 0, "cascade", NULL, NULL };

    extern int setup_irq(unsigned int irq, struct irqaction *irqaction);

    void __init <my>_irq_init(void)
    {
    ....
    setup_irq(CPU_IP3, &cascade);
    }
  • You need to expand your interrupt dispatching code properly to identify the added interrupt sources. If the code is simple enough, you can do it in the same int_handler.S file. If it is more complicated, you may do it in a separate C function (such as in the DDB5476 board).
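The IRQ numbering scheme in the list above can be illustrated with a small secondary-level dispatch helper. The block bases follow the Osprey layout quoted earlier; cascade_irq() and i8259_vector_to_irq() are made-up names, and picking the lowest set bit first is an arbitrary priority choice for this sketch.

```c
#include <stdint.h>

/* Start of each controller's block of IRQ numbers (Osprey-style layout). */
#define CPU_IRQ_BASE    0
#define SYS_IRQ_BASE    8
#define GPIO_IRQ_BASE   40

/* Base vector used by older versions of i8259.c. */
#define I8259_OLD_VECTOR_BASE  0x20

/* Translate the lowest set bit of a second-level pending register into a
 * global IRQ number. Returns -1 if nothing is pending. */
static int cascade_irq(uint32_t pending, int irq_base)
{
    int bit;

    for (bit = 0; bit < 32; bit++) {
        if (pending & (1u << bit))
            return irq_base + bit;
    }
    return -1;
}

/* Convert a vector from the acknowledgement cycle into an IRQ number. */
static int i8259_vector_to_irq(int vector)
{
    return vector - I8259_OLD_VECTOR_BASE;
}
```

For example, bit 4 pending in the system controller maps to IRQ 12, and a vector of 0x2f from an old-style i8259 setup maps to IRQ 15.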

The hw_irq_controller struct

The 'hw_irq_controller' structure is defined in the 'include/linux/irq.h' file as an alias for the 'hw_interrupt_type' structure.

struct hw_interrupt_type {
const char * typename;
unsigned int (*startup)(unsigned int irq);
void (*shutdown)(unsigned int irq);
void (*enable)(unsigned int irq);
void (*disable)(unsigned int irq);
void (*ack)(unsigned int irq);
void (*end)(unsigned int irq);
void (*set_affinity)(unsigned int irq, unsigned long mask);
};

The 'arch/mips/kernel/irq_cpu.c' file is a good example of how to write 'hw_irq_controller' member functions. Here are some more programming notes for each of these functions:

const char * typename;
Controller name. Will be displayed under /proc/interrupts.
unsigned int (*startup)(unsigned int irq);
Invoked when request_irq() or setup_irq() is called. You need to enable this interrupt here. You may also want to do some IRQ-specific initialization (such as turning on power for this interrupt, perhaps).
void (*shutdown)(unsigned int irq);
Invoked when free_irq() is called. You need to disable this interrupt and perhaps do some other IRQ-specific cleanup.
void (*enable)(unsigned int irq) and void (*disable)(unsigned int irq)
They are used to implement enable_irq(), disable_irq() and disable_irq_nosync(), which in turn are used by driver code.
void (*ack)(unsigned int irq)
ack() is invoked at the beginning of do_IRQ() when we want to acknowledge an interrupt. I think you also need to disable this interrupt here so that you don't get recursive interrupts on the same interrupt source. [HELP: can someone confirm?]
void (*end)(unsigned int irq)
This is called by do_IRQ() after it has handled this interrupt. If you disabled the interrupt in the ack() function, you should enable it here. [HELP: generally what else should we do here?]
void (*set_affinity)(unsigned int irq, unsigned long mask)
This is used in SMP machines to set up interrupt handling affinity with certain CPUs. [TODO] [HELP]
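To make the notes above more concrete, here is a minimal stand-alone sketch of the member functions for an imaginary controller that has a single enable-mask register. Everything prefixed with my_ or MY_ is hypothetical: the "register" is just a variable, and 'struct my_hw_interrupt_type' is a local mirror of the structure shown above so that the sketch compiles outside the kernel. It follows the assumption made in the notes that ack() masks the source and end() unmasks it.

```c
#include <stddef.h>
#include <stdint.h>

#define MY_IRQ_BASE 8   /* hypothetical first IRQ number of this controller */

/* Stand-in for the controller's memory-mapped enable-mask register. */
static uint32_t my_mask_reg;

static void my_enable(unsigned int irq)
{
    my_mask_reg |= UINT32_C(1) << (irq - MY_IRQ_BASE);
}

static void my_disable(unsigned int irq)
{
    my_mask_reg &= ~(UINT32_C(1) << (irq - MY_IRQ_BASE));
}

static unsigned int my_startup(unsigned int irq)
{
    my_enable(irq);
    return 0;           /* no interrupt was pending */
}

static void my_shutdown(unsigned int irq)
{
    my_disable(irq);
}

static void my_ack(unsigned int irq)
{
    my_disable(irq);    /* mask the source while its handler runs */
}

static void my_end(unsigned int irq)
{
    my_enable(irq);     /* unmask again once the handler is done */
}

/* Local mirror of the structure shown above, for illustration only. */
struct my_hw_interrupt_type {
    const char *typename;
    unsigned int (*startup)(unsigned int irq);
    void (*shutdown)(unsigned int irq);
    void (*enable)(unsigned int irq);
    void (*disable)(unsigned int irq);
    void (*ack)(unsigned int irq);
    void (*end)(unsigned int irq);
    void (*set_affinity)(unsigned int irq, unsigned long mask);
};

struct my_hw_interrupt_type my_irq_type = {
    "my_ctrl", my_startup, my_shutdown, my_enable, my_disable,
    my_ack, my_end, NULL
};
```

A real end() routine would probably also check that the interrupt has not been disabled in the meantime before unmasking it; the sketch leaves that out to stay short.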

The IRQ initialization code

The IRQ initialization is done in 'init_IRQ()'. Currently it is supplied by each individual board. In the future, it will probably be a MIPS common routine, which will in turn invoke a board-specific function, board_irq_init(). board_irq_init will be a function pointer that the <my_board>_setup() function needs to set to a proper value.

In any case, the following is skeleton code for a typical init_IRQ() routine.

extern asmlinkage void my_irq_dispatcher(void);
extern void breakpoint(void);
extern int setup_irq(unsigned int irq, struct irqaction *irqaction);
extern void mips_cpu_irq_init(u32 irq_base);
extern void init_generic_irq(void);

static struct irqaction cascade =
{ no_action, SA_INTERRUPT, 0, "cascade", NULL, NULL };
static struct irqaction reserved =
{ no_action, SA_INTERRUPT, 0, "reserved", NULL, NULL };
static struct irqaction error_irq =
{ error_action, SA_INTERRUPT, 0, "error", NULL, NULL };

void __init init_IRQ(void)
{
int i;
extern irq_desc_t irq_desc[];

/* hardware initialization code */
....

/* this is the routine defined in int_handler.S file */
set_except_vector(0, my_irq_dispatcher);

/* setup the default irq descriptor */
init_generic_irq();

/* init all interrupt controllers */
mips_cpu_irq_init(CPU_IRQ_BASE);
....

/* set up cascading IRQ */
setup_irq(CPU_IRQ_IP3, &cascade);

/* set up reserved IRQ so that others can not mistakenly request
* it later.
*/
setup_irq(CPU_IRQ_IP4, &reserved);

#ifdef CONFIG_DEBUG
/* setup debug IRQ so that if that interrupt happens, we can
* capture it.
*/
setup_irq(CPU_IRQ_IP4, &error_irq);
#endif

#ifdef CONFIG_REMOTE_DEBUG
printk("Setting debug traps - please connect the remote debugger.\n");
set_debug_traps();
breakpoint();
#endif
}

Final notes

What is described in this chapter is the so-called new-style interrupt handling. We used to have three different ways to handle interrupts: the new style (CONFIG_NEW_IRQ), the old style (CONFIG_ROTTEN_IRQ), and board-private ad hoc routines. The new style has been the only valid method since October 2002.

Chapter 9 : Board support - system time and timer

Linux relies on an RTC (real-time clock) device to obtain the real calendar date and time when it boots up. It relies on a system timer to advance the tick count, jiffies. If you don't provide proper time and timer code, Linux won't run. In fact it will get stuck in 'calibrate_delay()' during the startup process because jiffies is never incremented. (See Appendix A for more details about the Linux/MIPS startup sequence.)

There is an excellent document (I am being a little shameless here. :-0) under 'Documentation/mips/time.README' that should be read to further understand timekeeping for Linux/MIPS. It is a must read.

Here are some comments on implementing time and timer services:

  • If your system has a CPU counter and another hardware timer, use the hardware timer over the CPU counter, even though the CPU counter might be easier to set up and use. This is because the CPU counter relies on the CPU frequency, which is more likely to change in the future. In addition, performance-critical code may need to access the CPU counter for its own measurements. Some CPUs may have a variable CPU frequency, which makes the CPU counter unusable as a timer source. [DEBATE: 03/12/04, I changed my preference on this issue. Linux in the future will have richer and higher resolution time support. If all MIPS boards use the CPU counter as the system timer, we can maximize the code sharing.]

  • Unless you have to use interrupts to calibrate the CPU frequency, you can generally avoid implementing the 'board_time_init()' function. Most of its work can be done in the board setup routine.

  • When you implement 'rtc_set_time()', more than likely you need to call the 'to_tm()' function, which converts a single seconds-since-1970 value to a full 'struct rtc_time'. This function is provided in 'arch/mips/kernel/time.c' and declared in 'include/asm-mips/time.h'.
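For illustration, here is a stand-alone sketch of the kind of conversion 'to_tm()' performs, assuming the input is a count of seconds since 1970-01-01 00:00:00. It is not the kernel's implementation: 'struct my_tm' and my_to_tm() are hypothetical names, and the day-to-date step uses a well-known days-to-civil-date algorithm rather than whatever 'time.c' actually does.

```c
#include <stdint.h>

/* Hypothetical broken-down time; the kernel uses 'struct rtc_time' instead. */
struct my_tm {
    int year;           /* full year, e.g. 2004 */
    int mon;            /* 1..12 */
    int mday;           /* 1..31 */
    int hour, min, sec;
};

/* Convert seconds since 1970-01-01 00:00:00 into a broken-down time. */
void my_to_tm(uint32_t secs, struct my_tm *tm)
{
    uint32_t days = secs / 86400;
    uint32_t rem = secs % 86400;
    uint32_t z, era, doe, yoe, doy, mp;

    tm->hour = (int)(rem / 3600);
    tm->min = (int)((rem % 3600) / 60);
    tm->sec = (int)(rem % 60);

    /* Shift the epoch to 0000-03-01 so that leap days end a "year". */
    z = days + 719468;
    era = z / 146097;                   /* 400-year cycle number */
    doe = z % 146097;                   /* day within the cycle */
    yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    mp = (5 * doy + 2) / 153;           /* month index, March = 0 */

    tm->mday = (int)(doy - (153 * mp + 2) / 5 + 1);
    tm->mon = (int)(mp < 10 ? mp + 3 : mp - 9);
    tm->year = (int)(yoe + era * 400) + (tm->mon <= 2 ? 1 : 0);
}
```

An 'rtc_set_time()' implementation would take the resulting fields and write them into the RTC chip's registers in whatever format the chip expects (often BCD).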

Chapter 10 : Board support - PCI subsystem

The PCI subsystem is perhaps the most complex code you have to deal with during the porting process. The key to making the PCI subsystem work properly is a good understanding of the PCI bus itself, the code layout, and the execution flow in Linux. As with many other parts of porting, you will find in the end that the actual code written is minimal.

References

Pete Popov wrote a fine document on PCI and Linux/MIPS. It is under 'Documentation/mips/pci/pci.README'. It is highly recommended reading.

For those who want to know more about the PCI bus itself, I recommend the book PCI System Architecture published by MindShare Inc.

Overview of the PCI bus

Here we summarize some facts of the PCI bus:

  • The PCI bus has three separate address spaces: config, I/O, and memory space.

  • Every PCI device responds to config commands, and it can respond to I/O accesses and/or memory accesses.

  • During the boot time, the BIOS or the OS sets the base address registers (BARs) through configuration space. BARs determine address ranges in I/O or memory space that a device should respond to. Obviously, those ranges should not be duplicated anywhere else in the I/O space or memory space on the same PCI bus.

  • Multiple PCI buses can be connected through PCI-PCI bridges.

BAR assignment in Linux

On all IBM PC-compatible machines, BARs are assigned by the BIOS. Linux simply scans through the buses and records the BAR values.

Some MIPS boards adopt similar approaches, where BARs are assigned by firmware. However, the quality of BAR assignment by firmware varies quite a bit. Some firmware simply assigns BARs to on-board PCI devices and ignores all add-on PCI cards. In that case, Linux cannot solely rely on the firmware's assignment.

There is another issue with depending on the firmware assignment. You need to stick with the address range set up by the firmware. In other words, if the firmware assigns PCI memory space from 0x10000000 to 0x14000000, you cannot easily move it to a different address space somewhere else in Linux.

There are three possible ways to fix this:

  • The first way is to fix the BAR assignment manually in your board setup routine. This only works if your board does not have a PCI slot to plug in an arbitrary PCI card. You need to carefully examine the existing PCI resource assignment done by firmware so that you do not assign overlapping address ranges.

  • The second way is to do a complete PCI resource assignment *before* Linux starts PCI bus scanning. In other words, we discard any PCI resource assignment done in firmware, if there is any, and do a new assignment by ourselves. This approach gives us complete control over the address range and resource allocation. With the CONFIG_PCI_AUTO option used in 'arch/mips/config-shared.in' and 'arch/mips/kernel/pci_auto.c' file, it turns out to be quite easy to do. This approach is the focus of this chapter.

  • The third way is to call the 'pci_assign_unassigned_resources()' function, which is defined in the 'drivers/pci/setup-bus.c' file in recent 2.4.x kernels, *after* Linux completes the PCI bus scan. With earlier versions of this function, Linux will assign resources to PCI devices whose BARs have *not* been properly assigned. With the recent versions (that have "optimal" resource assignment based on sizes), this PCI routine apparently does a complete resource re-assignment. In other words, it does almost exactly the same as what the 'pci_auto.c' file does.
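Whichever approach you take, the core of a resource assignment pass is the same: hand out naturally aligned, non-overlapping address ranges from a fixed window. The following is a minimal sketch of that idea only. It is not the code in 'pci_auto.c' or 'setup-bus.c', and 'struct my_window' and my_assign_bar() are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical description of a PCI memory or IO window being handed out. */
struct my_window {
    uint32_t next;      /* next free address in the window */
    uint32_t end;       /* one past the last usable address */
};

/*
 * Pick an address for a BAR of the given size. PCI BAR sizes are powers
 * of two and the assigned address must be aligned to the size. Returns 0
 * and stores the address on success, -1 if the size is invalid or the
 * window is exhausted.
 */
int my_assign_bar(struct my_window *w, uint32_t size, uint32_t *addr)
{
    uint32_t base;

    if (size == 0 || (size & (size - 1)) != 0)
        return -1;                      /* not a power of two */

    base = (w->next + size - 1) & ~(size - 1);   /* round up to alignment */
    if (base < w->next || base >= w->end || size > w->end - base)
        return -1;                      /* wrapped around or does not fit */

    *addr = base;
    w->next = base + size;
    return 0;
}
```

With a window of [0x10000000, 0x14000000), requests of 0x1000, 0x100000 and 0x100 bytes would land at 0x10000000, 0x10100000 and 0x10200000 respectively. Assigning the largest BARs first wastes less space on alignment, which is presumably what the "optimal" assignment mentioned above is about.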

[DEBATE] The 'pci_auto' and 'assign_unassigned_resource' approaches have their own advantages and disadvantages. Ideally, the whole PCI subsystem should be completely re-written so that several things can be taken into consideration which are not currently addressed:

  • A notion of a host-PCI controller, and supporting multiple host-PCI controllers.

  • A kernel-independent abstraction layer to access configuration space, distinguishing Type 0 and Type 1 configuration accesses. This removes the need for 'pci_dev' in the lowest-level PCI routines.

  • Pass 1 scan to do bus number assignment and record bus topology through the top-level host-PCI controller structure.

  • Pass 2 scan to discover all other PCI devices and assign the resources along the way.

  • If we want to do optimal resource assignment, we need to do the resource assignment in Pass 3 instead of Pass 2.

  • A complete PCI device list is built during the above passes.

  • The address ranges for PCI memory space and I/O space are set when host-PCI controller structures are initially created.

PCI startup sequence

  1. do_basic_setup() calls pci_init(), which is defined in 'drivers/pci/pci.c'.

  2. pci_init() first calls pcibios_init(). If you enable CONFIG_NEW_PCI, pcibios_init() is implemented in the 'arch/mips/kernel/pci.c' file. Otherwise you need to provide it in your own board-dependent code.

  3. Optionally, pcibios_init() may call pciauto_assign_resources() to do a complete PCI resource assignment.

  4. Somewhere inside pcibios_init(), pci_scan_bus() is called. If a machine has multiple host-PCI controllers, pci_scan_bus() should be called for each of the top-level PCI buses. Apparently, bus numbers should have already been set up before pci_scan_bus() can properly run.

  5. Optionally, after pci_scan_bus() is called, pcibios_init() may choose to call pci_assign_unassigned_resources() to do a complete PCI resource assignment.

  6. pcibios_init() will do some more fixups (resources, IRQs, etc.).

  7. Returning from pcibios_init(), pci_init() will do a final round of device-based fixups.

In this chapter, we focus on the approach where both CONFIG_NEW_PCI and CONFIG_PCI_AUTO are enabled.

PCI driver interface

All of this work for PCI is to eventually set up a structure in which all PCI device drivers can run happily. Knowing how PCI device drivers access PCI resources can greatly help you understand how you should do PCI initialization and setup.

  • inb()/outb()/inw()/outw()/inl()/outl()
    Defined in 'include/asm-mips/io.h'. Drivers use these macros to read from and write to PCI IO space. The address arguments are addresses in PCI IO space, and should correspond to one of the BAR values of the device. On MIPS machines, we assume PCI IO space is mapped into a contiguous physical address block. The base address of the block is mips_io_port_base, which you need to set up at the beginning of board setup time. The proper way to set it up is to call set_io_port_base().

  • readb()/writeb()/readw()/writew()/readl()/writel()
    Defined in the 'include/asm-mips/io.h' file. Drivers use them to access PCI memory space. On MIPS machines, we assume PCI memory space is 1:1 mapped into a block of physical address space. Therefore those macros are equivalent to direct physical memory access.

  • pci_read_config_word() and friends
    Defined in 'include/linux/pci.h' (through PCI_OP()). Drivers use them to read or write the configuration registers of devices. Low-level routines are abstracted as struct pci_ops, of which each board must supply one.

  • pci_map_single() and friends
    Defined in 'include/asm-mips/pci.h'. Drivers use these macros to map a virtual address to a bus address (so you can tell the device to do DMA). They don't usually affect PCI porting.
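To show why mips_io_port_base matters, here is a simplified user-space model of the port accessors: a port number is just an offset added to a base address. This is a sketch under stated assumptions, not the kernel's macros. The my_ names are hypothetical, and the real accessors in 'io.h' handle more details than this.

```c
#include <stdint.h>

/* Hypothetical stand-in for mips_io_port_base. */
static uintptr_t my_io_port_base;

/* Model of set_io_port_base(): remember where PCI IO space is mapped. */
void my_set_io_port_base(uintptr_t base)
{
    my_io_port_base = base;
}

/* Model of inb(): read one byte at the given PCI IO port. */
uint8_t my_inb(uint32_t port)
{
    return *(volatile uint8_t *)(my_io_port_base + port);
}

/* Model of outb(): write one byte to the given PCI IO port. */
void my_outb(uint8_t val, uint32_t port)
{
    *(volatile uint8_t *)(my_io_port_base + port) = val;
}
```

If the base is wrong, every driver that uses port IO will silently read and write the wrong locations, which is why it has to be set early in board setup.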

Setting up the host-PCI controller

As we can see from the above discussion, we need to set up the host-PCI controller such that:

  1. It has a 1:1 mapping between PCI memory space and CPU physical address space

  2. It maps the beginning part of PCI IO space into an address block in physical address space.

The host-PCI controller usually allows you to map PCI memory space into a window in physical address space. Let us say the base address is ba_mem and the size is sz_mem. It usually allows you to translate that address into another one with a fixed offset, say off. (Note that off can be either positive or negative.) So if a driver accesses the address ba_mem+x (0 <= x < sz_mem), the host-PCI controller will intercept the command and translate it into a PCI memory access at address ba_mem+off+x.

Maintaining a 1:1 mapping implies that we must set up the PCI addressing such that off is 0. Also note that with this setup, we cannot access the PCI memory ranges [0x0, ba_mem] and [ba_mem + sz_mem, 0xffffffff].

Additionally, we must also make system RAM visible on the PCI memory bus at address 0x0 (assuming that is its address in physical address space) in order for PCI devices to do DMA transfers.

The beginning part of PCI IO space is usually mapped into another window in physical address space, say [ba_io, ba_io + sz_io]. In other words, a range [0, sz_io] in PCI IO space corresponds to the range [ba_io, ba_io + sz_io]. Obviously, mips_io_port_base should be set to ba_io.
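The two windows described above can be modelled with one small translation helper. This is only a sketch of the arithmetic: 'struct my_pci_window' and my_phys_to_pci() are hypothetical names, and the window is treated as the half-open range [ba, ba + sz).

```c
#include <stdint.h>

/* Hypothetical description of one host-PCI controller window. */
struct my_pci_window {
    uint32_t ba;        /* base of the window in CPU physical address space */
    uint32_t sz;        /* size of the window in bytes */
    int32_t off;        /* fixed offset the controller adds to the address */
};

/*
 * Translate a CPU physical address into the PCI address the controller
 * would generate. Returns 0 on success, -1 if the address is outside
 * the window.
 */
int my_phys_to_pci(const struct my_pci_window *w, uint32_t phys,
                   uint32_t *pci)
{
    if (phys < w->ba || phys - w->ba >= w->sz)
        return -1;
    *pci = phys + (uint32_t)w->off;     /* wraps correctly for negative off */
    return 0;
}
```

For the memory window, off is 0, so the PCI address equals the physical address. For the IO window, off is effectively -ba_io, so the physical address ba_io + x turns into PCI IO port x.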

The above setup is typically done in the board-specific setup routine (i.e., <board>_setup()). You typically set up ioport_resource and iomem_resource there as well:

        ioport_resource.start = 0x0;
        ioport_resource.end = sz_io;
        iomem_resource.start = 0x0;
        iomem_resource.end = sz_mem;

These variables are the roots of all IO and memory resources (roughly corresponding to the ancient ISA IO space and ISA memory space). For simplicity you can also set the end to be 0xffffffff.

Board-specific functions and variables

Here is a list of board-specific functions you must implement. Again, I assume this board has CONFIG_NEW_PCI and CONFIG_PCI_AUTO options enabled.

  • struct pci_ops my_pci_ops
    You implement six functions to fill in this structure, which is needed by pci_scan_bus() and pciauto_assign_resources(). Note that you need to distinguish between Type 0 and Type 1 configuration accesses in those functions. You can typically do that by checking whether the bus's parent (dev->bus->parent) is NULL.

  • mips_pci_channels[]
    You need to define this array in order to use pciauto resource assignment. For each top-level PCI bus, you need to supply one element in this array. The array ends with an all-NULL element. Each element is a structure that consists of a pci_ops, a pci_io_resource, and a pci_mem_resource, and usually represents a top-level PCI bus connected to the CPU. pci_ops defines the functions that access the PCI bus's config space. pci_io_resource and pci_mem_resource specify the address ranges that pciauto will use when assigning the BARs of PCI devices. pci_io_resource starts at 0x0 (or 0x1000 to leave some room for legacy ISA devices) and ends at sz_io. pci_mem_resource starts at ba_mem and ends at ba_mem + sz_mem. Note these addresses are in PCI IO and PCI memory space respectively. However, since we maintain a 1:1 mapping between PCI memory space and CPU physical address space, pci_mem_resource also represents the PCI memory space window in CPU physical address space.

  • pcibios_fixup_resources()
    This routine is passed to pci_scan_bus() and invoked right after a PCI device is discovered. This is the place where you can do device-specific fixups (BARs, pci_dev structure, etc.). Note you can do the same fixups in pcibios_fixup(). I recommend leaving this function empty unless you have some specific need that requires immediate fixup.

  • pcibios_fixup()
    This function is invoked after pci_scan_bus() is done, i.e., after all PCI bridges and devices have been discovered by Linux. Here you can enumerate the PCI devices and do device-based fixups. You can also do bus- or controller-related fixups.

  • pcibios_fixup_irqs()
    This is the place to fix up PCI-related IRQs. It is invoked after pci_scan_bus() is done, around the same time as pcibios_fixup(). A typical strategy is to assign the IRQ based on the slot number, and possibly the bus number if there is more than one top-level bus in the system. Note that you can do the fixup in pcibios_fixup() as well and leave this function empty.

  • pcibios_assign_all_busses()
    Return 1 to indicate that all bus numbers have been assigned by pciauto. [TODO: Should this function be included in pciauto by default?] [TODO: the 'drivers/pci/pci.c' file seems to have a typo when it calls this function. The logic is inverted.]
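As an illustration of the Type 0 versus Type 1 distinction mentioned under pci_ops, the sketch below builds a configuration address for an imaginary host-PCI controller. The bit layout follows the generic PCI address-phase format (register in bits 7:2, function in bits 10:8, low bits 00 for Type 0 and 01 for Type 1), but the IDSEL wiring (slot n on address line 11 + n) and all my_ names are assumptions. Your controller's manual is the real authority here.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel structures named in the text. */
struct my_pci_bus {
    struct my_pci_bus *parent;  /* NULL for a top-level bus */
    unsigned char number;
};

struct my_pci_dev {
    struct my_pci_bus *bus;
    unsigned int devfn;         /* slot in bits 7:3, function in bits 2:0 */
};

#define MY_SLOT(devfn)  (((devfn) >> 3) & 0x1fu)
#define MY_FUNC(devfn)  ((devfn) & 0x07u)

/*
 * Build the configuration address for a device and register offset.
 * Returns 0 if the slot cannot be addressed with a Type 0 cycle.
 */
uint32_t my_config_addr(const struct my_pci_dev *dev, int where)
{
    uint32_t slot = MY_SLOT(dev->devfn);
    uint32_t func = MY_FUNC(dev->devfn) << 8;
    uint32_t reg = (uint32_t)where & 0xfcu;

    if (dev->bus->parent == NULL) {
        /* Type 0: device sits on the top-level bus, selected by IDSEL. */
        if (slot > 20)
            return 0;           /* assumed: only lines 11..31 carry IDSEL */
        return (UINT32_C(1) << (11 + slot)) | func | reg;
    }

    /* Type 1: device is behind a bridge, so pass bus and slot numbers. */
    return ((uint32_t)dev->bus->number << 16) | (slot << 11) |
           func | reg | 0x1u;
}
```

The six pci_ops functions would each call a helper like this, program the result into the controller, and then perform the byte, word, or dword access.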

Tips for writing PCI code

  • Some ill-behaved PCI devices may spoil the party. The best way to deal with them is to single them out in the PCI configuration routines (in pci_ops). Inside those routines you check for the slot number and function number corresponding to the bad devices and return some dummy values.

  • If you have multiple top-level PCI buses, it is tricky to do PCI IO assignment. Assume you have two PCI buses and their IO spaces are mapped into [ba_io1, ba_io1 + sz_io1] and [ba_io2, ba_io2 + sz_io2] (ba_io1 + sz_io1 <= ba_io2). You need to

    1. Set mips_io_port_base to be ba_io1

    2. Set pci_io_resource to be [0x0, sz_io1] for the first PCI bus

    3. Set pci_io_resource to be [ba_io2 - ba_io1, ba_io2 - ba_io1 + sz_io2] for the 2nd PCI bus. In addition, the legacy ISA devices on the second PCI bus cannot be used without modifying their drivers.
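The arithmetic in the three steps above can be written down as a small helper. This is just a sketch to check the numbers. 'struct my_range' and my_two_bus_io() are hypothetical names and do not exist in the kernel.

```c
#include <stdint.h>

/* Hypothetical [start, end] range in PCI IO space. */
struct my_range {
    uint32_t start;
    uint32_t end;
};

/*
 * Given the physical windows of two top-level PCI buses, compute the
 * value for mips_io_port_base and the pci_io_resource range of each bus.
 * Returns 0 on success, -1 if the windows overlap or are out of order.
 */
int my_two_bus_io(uint32_t ba_io1, uint32_t sz_io1,
                  uint32_t ba_io2, uint32_t sz_io2,
                  uint32_t *io_port_base,
                  struct my_range *bus1, struct my_range *bus2)
{
    if (ba_io2 < ba_io1 || sz_io1 > ba_io2 - ba_io1)
        return -1;

    *io_port_base = ba_io1;             /* step 1 */

    bus1->start = 0x0;                  /* step 2 */
    bus1->end = sz_io1;

    bus2->start = ba_io2 - ba_io1;      /* step 3 */
    bus2->end = ba_io2 - ba_io1 + sz_io2;
    return 0;
}
```

For example, with windows at 0x14000000 and 0x16000000, each 0x10000 bytes long, the second bus gets the IO range [0x02000000, 0x02010000]. Port numbers that large are the reason legacy ISA drivers, which expect fixed low port numbers, do not work on the second bus.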

Chapter 11 : Ether drivers and networking

[TODO]

Chapter 12 : Romfs root filesystem

[TODO]

Chapter 13 : Debugging

Although it is the last chapter, it is probably the most necessary one. In this chapter, I will list some commonly used debugging tips and tricks.

Kernel Oops

ksymoops is a program that deciphers all the secret numbers in a kernel core dump. Old versions of ksymoops may not work well for MIPS.

I sometimes use a script, call2sym, written by Phil Hollenback. You just cut and paste the call trace part and feed it to the script, and it will display a possible function call stack at the time of the crash. Note it is likely that some symbols are bogus, but the real ones should be displayed. You will have to use your own judgement.

In 2.6, the back trace is automatically printed for you when a kernel core dump happens.
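The general idea behind decoding a call trace by hand is simple: for each address, find the symbol with the greatest start address that is not above it. The sketch below shows that lookup over a sorted table, such as one built from System.map. It is not Phil Hollenback's script; 'struct my_sym' and my_lookup() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a symbol table sorted by ascending address. */
struct my_sym {
    uint32_t addr;
    const char *name;
};

/*
 * Return the symbol that most likely contains the given address, i.e.
 * the last entry whose start address is <= addr, or NULL if the address
 * lies below the first symbol.
 */
const struct my_sym *my_lookup(const struct my_sym *tab, size_t n,
                               uint32_t addr)
{
    size_t lo = 0, hi = n;      /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo ? &tab[lo - 1] : NULL;
}
```

Stack words that merely look like kernel addresses will also resolve to some symbol this way, which is where the bogus entries in a decoded trace come from.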

Appendix A: Linux startup sequence in 2.4.x

kernel_entry() - arch/mips/kernel/head.S
    set stack;
    prepare argc/argp/envp;
    jal init_arch - arch/mips/kernel/setup.c
        cpu_probe() -
        prom_init(...) - arch/mips/ddb5074/prom.c
        loadmmu()
        start_kernel() - init/main.c
            setup_arch(&command_line); - arch/mips/kernel/setup.c
                ddb_setup() - arch/mips/ddb5074/setup.c
            parse_options(command_line);
            trap_init();
            init_IRQ(); - arch/mips/kernel/irq.c
            sched_init();
            softirq_init();
            time_init();
                if (board_time_init) board_time_init();
                set xtime by calling rtc_get_time();
                pick appropriate do_gettimeoffset()
                board_timer_setup(&timer_irqaction);
            console_init();
            init_modules();
            kmem_cache_init();
            sti(); /* interrupt first open */
            calibrate_delay();
            mem_init();
            kmem_cache_sizes_init();
            pgtable_cache_init();
            fork_init();
            proc_caches_init();
            vfs_caches_init();
            buffer_init();
            page_cache_init();
            signals_init();
            smp_init();
            kernel_thread(init, ...)
            cpu_idle();


init() - init/main.c
    lock_kernel();
    do_basic_setup();
        [MTRR] mtrr_init();
        [SYSCTL] sysctl_init();
        [S390] s390_init_machine_check();
        [ACPI] acpi_init();
        [PCI] pci_init();
        [SBUS] sbus_init();
        [PPC] ppc_init();
        [MCA] mca_init();
        [ARCH_ACORN] ecard_init();
        [ZORRO] zorro_init();
        [DIO] dio_init();
        [MAC] nubus_init();
        [ISAPNP] isapnp_init();
        [TC] tc_init();
        sock_init();
        start_context_thread();
        do_initcalls();
        [IRDA] irda_proto_init();
        [IRDA] irda_device_init();
        [PCMCIA] init_pcmcia_ds();

    prepare_namespace();
    free_initmem();
    unlock_kernel();

    files = current->files;
    if(unshare_files())
        panic("unshare");
    put_files_struct(files);

    if (open("/dev/console", O_RDWR, 0) < 0)
        printk("Warning: unable to open an initial console.\n");
    (void) dup(0);
    (void) dup(0);

    if (execute_command)
        run_init_process(execute_command);
    run_init_process("/sbin/init");
    run_init_process("/etc/init");
    run_init_process("/bin/init");
    run_init_process("/bin/sh");

    panic("No init found. Try passing init= option to kernel.");

Appendix B: Credits and thanks

People in the following list have generously given their feedback to me. In spite of my effort to keep the list as complete as possible, I am afraid many people are still missing here.

        Dirk Behme <dirk.behme@de.bosch.com>
        Fillod Stephane <FillodS@thmulti.com>
        Geoffrey Espin <espin@idiom.com>
        Gerald Champagne <gerald.champagne@esstech.com>
        Henri Girard <khgirard@broadbandnetdevices.com>
        Neal Crook <ncrook@micron.com>
        Steven J. Hill <James.Hill@timesys.com>
        TAKANO Ryousei <takano@axe-inc.co.jp>
        Motoya Kurotsu <kurotsu@allied-telesis.co.jp>

Appendix C: Change logs

2004/03/29
Included the second batch of changes and updates from Steven Hill (Humongous thanks!). The content is more or less current with respect to the 2.4 tree. Notes for porting to 2.6 will hopefully follow soon.
2004/01/26
Included initial batch of changes and updates from Steven Hill (Big thanks!). External links are now shown in a separate window.
2002/08/23
Add change log. Make external links the _top frame. Update tools (thanks to Henri Girard)
2002/12/11
Incorporate comments from Neal Crook.