Preface
This document reflects what I have learned through porting several MIPS
machines and other related Linux work. Hopefully it will help
beginners to get started and give the experienced a reference point.
This document goes through all the major steps to port Linux to a MIPS
machine. The focus can perhaps be called the "MIPS machine
abstraction layer", i.e., the interface between machine-specific
code and, mostly, MIPS common code. Another useful document focuses
on the "Linux hardware abstraction layer", i.e., the interface
between Linux common code and architecture-specific code. That
document is written by C Hanish Menon (www.hanishkvc.com).
There are some notations used in this document.
- TODO - Reminder for incomplete part
- HELP - Really need your help on this
- DEBATE - Here is my opinion. What do you think?
- THANKS - Thanks to the person who pointed this out
- NOTE - Additional note added later, but the original comment may be still useful
Chapter
1: An overview
Prerequisites:
- Know C programming.
- Have some knowledge of OS concepts, such as interrupt handling, system
calls, and memory management.
- Know how to configure and build the Linux kernel. Plenty of help is
available on this if you are not very comfortable with it.
- Have some knowledge of the MIPS CPU. More than likely you will need to
deal with CP0 registers, enable or disable interrupts, etc.
- You don't have to be an expert in MIPS assembly, but total ignorance of
it might leave you stuck in some situations.
- Obviously, you need MIPS hardware to play with.
- Finally but most importantly, you need a willing-to-learn heart and
perhaps many restless debugging hours. :-)
It is also highly recommended to read through the Linux
MIPS HOWTO by Ralf Bächle, ralf@gnu.org. By the way, as part of
the prerequisites, you should also remember Ralf's name. :-)
Kernel source
trees
The common MIPS tree is the
CVS tree at linux-mips.org. See the instructions in "Anonymous
CVS servers" section in Linux/MIPS HOWTO. The current kernel
version as of 2004/01/26 is 2.6.1. You can always check out earlier
stable revisions by using "linux_2_4" or even "linux_2_2"
branch tag.
Kernel patches
For various reasons, bugs may remain in a kernel
tree for quite a long time before a suitable fix
is checked in. There are various places to get patches. Here are some
of the more common ones:
Jun
Sun patches
Linux/MIPS
FTP archive
Maciej
W. Rozycki patches
Brad
LaRonde's patches
Cross-compilation
and toolchains
More than likely your MIPS box
does not run Linux yet (why would you bother otherwise?). Therefore
you will need another machine to build the kernel image. Once the
image is built, you need to download this image to your MIPS machine
and let it run your MIPS kernel. This is called cross-development.
Your MIPS box is often called the target machine and the machine used
to build the kernel image is called the host machine.
Cross-development is common
for developing on embedded targets, because usually embedded targets
do not have enough power or the peripherals to develop natively.
The most common host machine
is probably Linux on i386/PCs.
You need to have
cross-development tools set up on your host before you can start.
While you can find instructions, such as in the Linux/MIPS HOWTO, to
build cross-compilation tools, your best bet is probably to get some
ready-made ones.
MontaVista used to offer the
Journeyman edition for free, which included a full-featured toolchain.
Unfortunately, it does not offer that anymore. Instead you can
download the preview kit, which includes a "slim" version
of the toolchain. You can get the kit from
http://www.mvista.com/previewkit/index.html
Dan Kegel has a set of scripts
that build cross-compiling tools. You can check it out here.
The following are links to
pre-built toolchains, instructions to build your own toolchain and
finally pre-compiled distributions for MIPS boards:
Brad
LaRonde's cross toolchain for Linux
Steve
Hill's toolchains for glibc and uClibc
MIPS
free toolchains
Distributions
for MIPS
Overall porting
steps :
Depending on your specific
cases, some of the following steps can be skipped.
-
Hello
World! - Get the board set up, get the serial port working, and print out
"Hello, world!" through the serial port.
-
Add
your own code.
-
Get
early printk working - Make the first MIPS image and see the printk
output from kernel.
-
Serial
driver and serial console - Get the real printk working with the
serial console.
-
KGDB
- KGDB can be enormously helpful in your development. It is highly
recommended and it is not that difficult to set up.
-
CPU
support - If your MIPS CPU is not currently supported, you need to
add new code that supports it.
-
Board
specific support - Create your board-specific directory. Set up
interrupt routing/handling and kernel timer services.
-
PCI
subsystems - If your machine has PCI, you need to get the PCI
subsystem working before you can use PCI devices.
-
Ethernet
drivers - You should already have the serial port working before
attempting this. Usually the next driver you want is the ethernet
driver. With the ethernet driver working, you can set up an NFS root file
system which gives you a fully working Linux userland.
-
ROMFS root file system -
Alternatively you can create a userland file system as a ROMFS image
stored in a ramdisk.
Chapter
2: "Hello, world!"
In
cross development, the serial port is usually the most important
interface: that is where you can see what is happening! It is
worthwhile to make sure the serial port works before you even
start playing with Linux. You can find the sample
code or gzipped
tar ball of a stand-alone program that can do printf. Such a
program can also be useful in later debugging stages, e.g., for printing
out hardware register values.
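As an illustration of what such a stand-alone output routine involves, here is a minimal sketch of polled output on a 16550-style UART. The names struct uart, uart_putc and uart_puts are made up for this example and are not taken from the sample code; only the register offsets and the status bit are the standard 16550 ones. On real hardware, base would point at the UART's uncached (KSEG1) address.

```c
#include <stdint.h>

/* Standard 16550 register offsets and status bit. */
#define UART_THR       0     /* transmit holding register */
#define UART_LSR       5     /* line status register */
#define UART_LSR_THRE  0x20  /* transmit holding register empty */

/* Hypothetical descriptor: where the UART lives and how far apart
 * its registers are spaced (0 = byte-wide, 2 = every 4 bytes). */
struct uart {
    volatile uint8_t *base;
    unsigned int reg_shift;
};

static uint8_t uart_read(const struct uart *u, unsigned int reg)
{
    return u->base[reg << u->reg_shift];
}

static void uart_write(const struct uart *u, unsigned int reg, uint8_t val)
{
    u->base[reg << u->reg_shift] = val;
}

/* Busy-wait until the transmitter is free, then send one byte. */
void uart_putc(const struct uart *u, char c)
{
    while (!(uart_read(u, UART_LSR) & UART_LSR_THRE))
        ;
    uart_write(u, UART_THR, (uint8_t)c);
}

/* Send a string, expanding "\n" to "\r\n" for terminal programs. */
void uart_puts(const struct uart *u, const char *s)
{
    for (; *s; s++) {
        if (*s == '\n')
            uart_putc(u, '\r');
        uart_putc(u, *s);
    }
}
```

No interrupts are involved; the routine simply polls, which is why it keeps working even when very little of the system has been initialized.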
Before you rush to type
'make', check and modify the following configurations:
-
The
sample code assumes R4K style CP0 structure. It should apply to most
CPUs named above number 4000 and the recent MIPS32/MIPS64.
-
Check
if you have at least 1MB of RAM. (You really need at least 1MB to run
Linux at all.) It is recommended that you have 8MB of RAM or more.
-
Is
your serial port a standard UART type? If yes, modify the serial
code and parameters. If not, you will have to supply your own
functions to utilize the UART.
-
What is your cross-tool
name and path? Modify the Makefile accordingly.
Now, fire your "make"
command.
Depending on the downloader
on your MIPS box, you may need to generate an ELF image, a binary image or
an SREC image.
Download the barebone image to
your target and give it a run! Connect the serial port to your host
machine. Start minicom and hopefully you can see the "Hello,
world!" message.
Troubleshooting:
-
Make
sure your bootloader downloads the image to the uncached KSEG1 segment.
If your bootloader downloads to the cached KSEG0 area, you will want
to run the image from the KSEG0 area too.
-
If
your bootloader has already initialized the serial port, you may
want to skip your own initialization.
-
Did
you set up minicom correctly? Test it with other machines.
-
Hopefully it is not the
toolchain problem...
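Since the first item above depends on telling KSEG0 and KSEG1 addresses apart, here is a small sketch of the address arithmetic on a 32-bit MIPS CPU. Both segments are windows onto the same low 512MB of physical memory: KSEG0 is cached and KSEG1 is uncached. The function names are made up for this example.

```c
#include <stdint.h>

#define KSEG0_BASE     0x80000000u  /* cached, unmapped */
#define KSEG1_BASE     0xa0000000u  /* uncached, unmapped */
#define KSEG_PHYS_MASK 0x1fffffffu  /* low 512MB of physical space */

/* Strip the segment bits to get the physical address. */
uint32_t kseg_to_phys(uint32_t vaddr)
{
    return vaddr & KSEG_PHYS_MASK;
}

/* Cached view of a physical address. */
uint32_t phys_to_kseg0(uint32_t paddr)
{
    return (paddr & KSEG_PHYS_MASK) | KSEG0_BASE;
}

/* Uncached view of a physical address. */
uint32_t phys_to_kseg1(uint32_t paddr)
{
    return (paddr & KSEG_PHYS_MASK) | KSEG1_BASE;
}
```

For example, an image linked at 0x80002000 and one linked at 0xa0002000 occupy the same physical RAM at 0x00002000; only the caching behavior differs.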
Chapter
3 : Add your own code
Let
us add some code to the tree and make a Linux image. For convenience's
sake, let us say we are porting Linux to a MIPS board called Aloha.
Create
the right directory for your board
Your
code for a new board can be classified into board-support code (or
board-specific code) and drivers. Driver code should be placed under
the 'drivers' directory and board specific code should be placed
under 'arch/mips' directory.
The
easiest choice is to create a directory called 'arch/mips/aloha'.
However,
a couple of other considerations might make it slightly complicated.
-
If
Aloha uses a chipset or System on a Chip (SOC) that is already
supported or belongs to a bigger family, such as NEC VR41xx and
gt64120, it makes sense to put Aloha code under those
sub-directories. You can re-use and share a lot of common code.
-
Similarly,
if Aloha is the first board that uses a chipset or SOC which is
expected to be used in many other boards, you may want to create
similar directory structure. However, if you are not sure, just
create your own board specific directory.
In
the past people have created directories based on the board
manufacturer's name, such as "mips-boards". This generally
is not a good idea. It is almost certain that some of these boards do
not share anything common at all.
To
make things worse, sometimes boards made by different companies use
the same chipset or SOC. Now what are you going to do? Are you going
to duplicate the common code? Or are you going to stick one company's
board under another company's name?
For
header files, you usually create similar directory or header files
under include/asm-mips. [DEBATE] For board specific header files, I
would encourage people to place them under the corresponding
'arch/mips' directory if possible.
In
our example, we will create the 'arch/mips/aloha' directory.
Write
the minimum Aloha code
Let
us write some code for the Aloha board which can generate a complete
Linux image without complaining about missing symbols.
Go
to this
directory to browse 'arch/mips/aloha' directory. Or download the
gzipped
file of the directory.
Obviously the code is not
complete yet, but if you follow the following steps and everything is
correct, you should be able to generate a Linux/MIPS kernel image of
your very own!
Hook up your code
with the Linux tree
Most of the steps are fairly
straightforward:
-
include/asm-mips/bootinfo.h
- Add your machine group ID, machine group name and Aloha machine
ID.
-
arch/mips/kernel/setup.c
- Add 'aloha_setup' function declaration and invocation code.
-
arch/mips/Makefile
- Add a section that links your Aloha code in.
#
# Hawaii Aloha board
#
ifdef CONFIG_ALOHA
SUBDIRS += arch/mips/aloha
LIBS += arch/mips/aloha/aloha.o
LOADADDR += 0x80002000
endif
LOADADDR is the starting address
for your Linux image when it is loaded into RAM. Note that the first
0x200 bytes are used by the exception vectors on most CPUs. Some
CPUs require a larger space, so modify the LOADADDR
accordingly. Due to the linker's addressing limit, the start address
is aligned on an 8KB boundary, so setting your LOADADDR to 0x80002000
should be reasonable.
-
arch/mips/config-shared.in
- Add necessary config information for Aloha board.
-
Add the following to
'Machine selection'.
dep_bool 'Support for Hawaii Aloha board (EXPERIMENTAL)' CONFIG_ALOHA $CONFIG_EXPERIMENTAL
-
Add a set of default
configs for the board, which depend on the features and drivers
that the Linux port will support. Here is a very simple example for
our minimum Aloha board configuration.
if [ "$CONFIG_ALOHA" = "y" ]; then
define_bool CONFIG_CPU_R4X00 y
define_bool CONFIG_CPU_LITTLE_ENDIAN y
define_bool CONFIG_SERIAL y
define_bool CONFIG_SERIAL_MANY_PORTS y
define_bool CONFIG_NEW_IRQ y
define_bool CONFIG_NEW_TIME_C y
define_bool CONFIG_SCSI n
fi
There are two kinds of
configuration options here. The first kind are those you cannot
select interactively during 'make config' or 'make menuconfig' or
'make xconfig'. Examples are CONFIG_NEW_IRQ and CONFIG_NEW_TIME_C.
You must put them here, or else they will not get selected. The
second kind of options are those you can select interactively, such
as CONFIG_CPU_R4X00 and CONFIG_SERIAL. However you may also put them
here if you know which selection is right for the board. This way
people will make fewer mistakes when they configure for the board.
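To make the LOADADDR reasoning from the Makefile step concrete, here is a small host-side sketch. It assumes the rule described there: skip the exception vector area at the start of KSEG0, then round up to the next 8KB boundary. The function name pick_loadaddr and the vector sizes passed to it are illustrative only.

```c
#include <stdint.h>

#define KSEG0_BASE     0x80000000u
#define LOADADDR_ALIGN 0x2000u      /* 8KB */

/* Return the lowest 8KB-aligned KSEG0 address that lies at or above
 * the end of the exception vector area. */
uint32_t pick_loadaddr(uint32_t vector_bytes)
{
    uint32_t first_free = KSEG0_BASE + vector_bytes;

    return (first_free + LOADADDR_ALIGN - 1) & ~(LOADADDR_ALIGN - 1);
}
```

With the usual 0x200 bytes of vectors this yields 0x80002000, the value used in the Makefile fragment. A CPU that needed more than 8KB for its vectors would end up at 0x80004000 instead.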
For instant gratification, you
can find a
complete patch for adding the Aloha board support to the
Linux/MIPS CVS tree checked out on January 20, 2004.
Configure and
build a kernel image
Now you are ready to run your
favorite configuration tool. Since we do not have much code added
yet, do not be too greedy with selecting options. Just pick a couple
of simple options such as the serial and serial console.
-
If
you denoted the Aloha board support to be EXPERIMENTAL, select
'Prompt for development and/or incomplete code/drivers' under 'Code
maturity level options'.
-
Select
'Support for Hawaii Aloha board' and unselect all other machines
under 'Machine selection'.
-
Select
the right CPU. Under 'CPU selection' select your CPU. If there is no
entry for the CPU on your board, you will need to add support for
it. Most recent CPUs can generally run to some degree with
CPU_R4X00.
-
Under
'Character devices', select 'Standard/generic (8250/16550 and
compatible UARTs) serial support' and 'Support for console on serial
port'. Unselect the 'Virtual terminal' option.
-
Under
the 'Kernel hacking' option, select 'Are you using a crosscompiler'.
-
For other options either
take the default or select 'no'.
Here is a sample
minimum config for our Aloha board.
Before you type 'make',
double-check the 'arch/mips/Makefile' and make sure the
cross-toolchain program names are correct and in your execution path
i.e. your PATH environment variable.
Now type 'make dep' and
'make'. Then wait for a miracle to happen!
Chapter
4 : Early printk
Assuming
you are lucky and actually generated an image in the last chapter,
don't bother running it because you won't see anything. This is not
strange because all our board-specific code is empty and we have not
told the Linux kernel anything about our serial port or I/O devices yet.
The
sign of a live Linux kernel comes from the output of printk, which is
routed to the first console. Since we have configured a serial
console, we should be able to see something on the serial wire if we
have set it up correctly.
Unfortunately,
setup of the serial console happens much later during the kernel
startup process. (See Appendix A for a chart of the kernel start-up
sequence). Chances are your new kernel probably dies even before
that. That is where the early printk patch comes in handy. It allows
you to see printk as early as the first line of C code.
By
the way, the first line of C code for Linux MIPS is the first line of
code of the 'init_arch()' function in the 'arch/mips/kernel/setup.c' file.
For
kernel versions earlier than 2.4.10, you can find the early
printk patch here for boards with standard UART serial ports.
Starting from 2.4.10 and beyond, a new
printk patch is needed. If you have already got the stand-alone
"Hello, world!" program running, the early printk should be
easy to get going, and you should have printk output from the Linux
kernel very soon.
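The idea behind early printk can be sketched as follows: format the message into a fixed buffer, then push the bytes out through a polled character routine. This is only a host-side illustration and not the patch itself. The names early_printk_sketch, early_putc and early_log are made up, and the kernel has its own formatting routines rather than the C library's vsnprintf used here.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Host-side stand-in for the UART: output is collected in memory.
 * On a board this would be the polled output routine from the
 * stand-alone "Hello, world!" program. */
char early_log[512];
size_t early_log_len;

static void early_putc(char c)
{
    if (early_log_len < sizeof(early_log) - 1)
        early_log[early_log_len++] = c;
}

/* Format into a fixed buffer, then send the bytes one by one.
 * Returns the number of formatted characters that were sent. */
int early_printk_sketch(const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int n, i;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    if (n < 0)
        return n;
    if (n >= (int)sizeof(buf))
        n = (int)sizeof(buf) - 1;   /* output was truncated */

    for (i = 0; i < n; i++) {
        if (buf[i] == '\n')
            early_putc('\r');       /* terminals expect CR+LF */
        early_putc(buf[i]);
    }
    return n;
}
```

Nothing here depends on interrupts, the console layer or the serial driver, which is what makes this style of output usable from the first line of C code.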
Chapter
5 : Serial driver and console
While
early printk is rather useful, you still need to get the real serial
driver working.
Assuming
you have a standard serial port, there are two ways to add serial
support: static defines and run-time setup.
With
static defines, you modify the 'include/asm-mips/serial.h' file.
Looking through the code, it is not difficult to figure out how to
add support for your board's serial port(s).
As
more boards are supported by Linux/MIPS, the 'serial.h' file gets
crowded. One potential solution is to do run-time serial setup.
Run-time serial setup is sometimes necessary, namely when some of the
parameters can only be determined at run time, for example when
settings are read from a non-volatile memory device or an option is
passed on the kernel command line.
There
are two elements to consider for doing run-time serial setup:
-
Reserve
the 'rs_table[]' size, see the 'drivers/char/serial.c' file.
Unfortunately there is not a clean way to accomplish this yet. A
temporary workaround is to define CONFIG_SERIAL_MANY_PORTS in
'arch/mips/config-shared.in' for your board. This configuration
reserves up to 64 serial port entries for your board!
-
Call
the 'early_serial_setup()' routine in your board setup routine. Here
is a piece of sample
run-time initialization code.
Serial parameters
Most of the parameter settings
are rather obvious. Here is a list of some less obvious ones:
- line
-
Only used for run-time serial
configuration. It is the index into the 'rs_table[]' array.
-
io_type
-
io_type determines how your
serial registers are accessed. Two common types are SERIAL_IO_PORT
and SERIAL_IO_MEM. A SERIAL_IO_PORT
type driver uses the inb/outb macros to access registers whose base
address is at port. In other words, if you
specify SERIAL_IO_PORT as the io_type,
you should also specify the port parameter.
For SERIAL_IO_MEM, the driver uses
readb/writeb macros to access registers whose address is
iomem_base plus a shifted offset. The
number of bits shifted is specified by iomem_reg_shift. For
example, if all the serial registers are placed on 4-byte boundaries,
then you have an iomem_reg_shift of 2.
Generally SERIAL_IO_PORT is for serial
ports on an ISA bus and SERIAL_IO_MEM is
for memory-mapped serial ports.
There are also SERIAL_IO_HUB6
and SERIAL_IO_GSC. The HUB6 was a
multi-port serial card produced by Bell Technologies. The GSC is a
special bus found on PA-RISC systems. These options will not be used
on any MIPS boards.
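The SERIAL_IO_MEM address calculation described above can be written out as a one-line function. This is a sketch of the arithmetic only; the function name serial_reg_addr and the base address in the example are made up, while the parameter names follow the serial parameters.

```c
#include <stdint.h>

/* SERIAL_IO_MEM style addressing: UART register number 'reg' lives at
 * iomem_base + (reg << iomem_reg_shift). */
uintptr_t serial_reg_addr(uintptr_t iomem_base, unsigned int reg,
                          unsigned int iomem_reg_shift)
{
    return iomem_base + ((uintptr_t)reg << iomem_reg_shift);
}
```

With registers on 4-byte boundaries (iomem_reg_shift of 2) and a hypothetical base of 0xb8000000, register 5 (the line status register on a 16550) ends up at 0xb8000014. With a shift of 0 it would be at 0xb8000005.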
Non-standard serial ports
If you have a non-standard
serial port, you will have to write your own serial driver.
Some people derive their code
from the standard serial driver. Unfortunately this is a very
daunting task. The 'drivers/char/serial.c' file has over 6000 lines
of code!
Fortunately, there is an
alternative [THANKS: Fillod Stephane]. There is a generic serial
driver, 'drivers/char/generic_serial.c'. This file provides a set of
serial routines that are hardware independent. Then your task is to
provide only the routines that are hardware dependent such as the
interrupt handler routine and low-level I/O functions. There are
plenty of examples. Just look inside the 'drivers/char/Makefile' and
look for drivers that link with 'generic_serial.o' file.
[HELP:
Would appreciate if you can share your experience of writing
proprietary serial driver code or using the 'generic_serial.c' file.]
Chapter
6: KGDB
For
many Linux kernel developers, KGDB is a life-saving tool. With KGDB,
you can debug the kernel while it runs! You can set breakpoints or do
single stepping at the source code level.
To
do this, you will need a dedicated serial port on your target, and
a crossover cable (also known as a null-modem cable) to connect it to
your development host. If you are also using a serial console, this
implies you will need two serial ports on your target. It is possible
to do both kernel debugging and serial console through a single serial
port. This will be mentioned later in this chapter.
When
you configure the kernel, select the 'Remote GDB kernel debugging'
which is listed under "Kernel hacking". Do a 'make clean'
and recompile the kernel so that debugging symbols are compiled into
your kernel image. Try to make a new image. You will soon discover
two missing symbols in the final linking stage:
arch/mips/kernel/kernel.o: In function `getpacket':
arch/mips/kernel/kernel.o(.text+0x85ac): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x85cc): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x8634): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x864c): undefined reference to `getDebugChar'
arch/mips/kernel/kernel.o(.text+0x8670): undefined reference to `putDebugChar'
arch/mips/kernel/kernel.o(.text+0x8680): undefined reference to `putDebugChar'
arch/mips/kernel/kernel.o(.text+0x8698): undefined reference to `putDebugChar'
arch/mips/kernel/kernel.o(.text+0x86a0): undefined reference to `putDebugChar'
You
need to supply these two functions for your own boards:
int putDebugChar(uint8 byte)
uint8 getDebugChar(void)
As
an example, here is the
dbg_io.c for DDB5476 board. DDB5476 uses a standard UART serial
port to implement those two functions.
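For a board with a standard UART, the two functions can be as simple as the following polled sketch. It is not the DDB5476 code. The kgdb_uart pointer is a hypothetical name that board code would set to the debug port's register base, the uint8 typedef is supplied here only to make the fragment self-contained, and returning 1 from putDebugChar is an assumption about the expected convention.

```c
#include <stdint.h>

typedef uint8_t uint8;   /* local stand-in for the kernel's own type */

/* Standard 16550 register offsets and status bits. */
#define UART_RBR       0     /* receive buffer (read) */
#define UART_THR       0     /* transmit holding register (write) */
#define UART_LSR       5     /* line status register */
#define UART_LSR_DR    0x01  /* received data ready */
#define UART_LSR_THRE  0x20  /* transmit holding register empty */

/* Hypothetical: register base of the UART dedicated to KGDB. */
volatile uint8 *kgdb_uart;

/* Block until the transmitter is free, then send one byte. */
int putDebugChar(uint8 byte)
{
    while (!(kgdb_uart[UART_LSR] & UART_LSR_THRE))
        ;
    kgdb_uart[UART_THR] = byte;
    return 1;
}

/* Block until a byte has arrived, then return it. */
uint8 getDebugChar(void)
{
    while (!(kgdb_uart[UART_LSR] & UART_LSR_DR))
        ;
    return kgdb_uart[UART_RBR];
}
```

Both routines poll on purpose: the KGDB stub runs with interrupts off, so it cannot rely on the regular serial driver. The port must already be initialized (baud rate, 8N1) before the stub first calls them.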
After supplying those two
functions, you are ready to debug the kernel. Run the new kernel
image. If you also use the early printk patch, you should be able to
see something like this on your console:
Wait for gdb client connection ...
Assuming you have already
connected a cross-over serial cable between the dedicated serial port
on the target and a serial port on your host (say, COM0), you can
then set the appropriate baud rate and start the cross gdb on your
host:
stty -F /dev/ttyS0 38400
mipsel-linux-gdb vmlinux
At the gdb prompt, type
target remote /dev/ttyS0
And, voila! You should be talking
to the kernel through KGDB - if you are lucky enough!
A couple of tips on using
KGDB:
-
Functions marked with '__init' do not work well with breakpoints.
Sometimes they also throw off the line numbers of
other functions in the same file. Try redefining __init as an
empty macro in your 'include/linux/init.h' file. Refer to the patch.
[NOTE: This problem is fixed in the latest gdb version, at least in
gdb 5.2]
-
Sometimes if you break on
a function, you cannot see the correct values of variables and cannot
do back-tracing. This is probably because certain registers are
still not initialized [HELP: because kernel is compiled with -O2
flag?]. Step a couple of lines into the function, and the variables
and back-trace should then display correctly.
What if the board
only has one serial port?
Some boards only have one
serial port. If you use it as serial console, you cannot really use
it for KGDB - unless you do some tricks to it.
There are two solutions. One
is GDB console, and the other is to use a KGDB demuxing script.
It is easy to use the GDB
console. When you select 'Remote GDB kernel debugging' under the
'Kernel hacking' sub-menu, you are also prompted for 'Console output
to GDB'. Simply selecting that choice will work! In fact, this option
is so easy to use you might want to use it even if you have a second
serial port.
However, this option has a
limitation. When the kernel goes to userland, the console stops working.
This is because the KGDB stub in the kernel and GDB are not designed
to provide interactive output. [HELP: any volunteers?]
The second option uses a
script called 'kgdb_demux' written by Brian Moyle. It creates two
virtual ports, typically ttya0 and ttya1. It then listens to the real
serial port (such as ttyS0). It will forward console traffic to ttya0
and KGDB traffic to ttya1. All you have to do then is to start
minicom on /dev/ttya0 (port setting does not matter) and KGDB on
/dev/ttya1.
You can download the
tarball here. A couple of usage tips.
-
Untar
the file to some place.
-
Copy
kgdb_demux script to your execution path, and modify it properly.
-
Set the port parameters
properly before you start kgdb_demux.
Chapter
7 : CPU support
In
the MIPS world, there are different families of MIPS cores. For
example Toshiba TX49, SiByte SB1, NEC VR41XX, MIPS32, and MIPS64 to
name a few. These different families may have different cache
architectures, may be 32-bit or 64-bit, and may or may not have a FP
unit. A complete list of families is found in the
'arch/mips/config-shared.in' and 'include/asm-mips/cpu.h' files. When
you are configuring your kernel the complete list will appear under
the 'CPU type' menu selection. If you are adding support for an
entirely new family of MIPS cores, you will need to change the
following files:
-
arch/mips/config-shared.in
- You will need to add your processor family to the list under 'CPU
type'.
-
include/asm-mips/cpu.h
- You will need to add PRID_COMP and PRID_IMP entries for your
processor family if they do not already exist. You will also need to
then add the CPU types that you will support in that family. Adding
CPU types is covered below.
-
arch/mips/kernel/cpu-probe.c
- For each processor family, there is typically a cpu_probe_XXX
function that probes and fills in the struct
cpuinfo_mips which contains information about the CPU type,
whether a FP unit exists, the number of TLBs available, etc. Your
function is responsible for properly probing and filling in this
information for the entire family of processors. Finally, if your
family does something special for CPU idle, then you will have to
define a 'wait' function. Most likely, you will be able to use one
that already exists. If not, add your 'wait' function and enable
detection of it in check_wait.
-
arch/mips/mm
- If your processor family has a unique cache architecture or
anything out of the ordinary, you will need to either add or modify
files in this directory.
Within
each family of MIPS cores, there are multiple processors. For
example, Alchemy processors and Broadcom processors are in the MIPS32
family, the Sibyte 1250 is in the SB1 family, and so on. Once you
have decided what family your processor is in, you need to verify
that a unique CPU identification exists for it. The
'include/asm-mips/cpu.h' file contains all of the currently supported
CPU types for the Linux/MIPS kernel.
If
your processor is already listed in the 'include/asm-mips/cpu.h'
file, the existing Linux/MIPS code should auto-detect it. If your
processor is not listed, you will need to change three files to
properly detect your processor:
-
include/asm-mips/cpu.h
- Add your processor at the end of the CPU_XXX definitions. Make
sure you update the 'CPU_LAST' value to match that of the processor
you added.
-
arch/mips/kernel/proc.c
- Add an entry to the cpu_name array so
that your CPU name can be properly displayed in /proc/cpuinfo.
-
arch/mips/kernel/cpu-probe.c
- Add your processor into the case
statement in the function check_wait if
your CPU can do some sort of wait or idle operation. Add the
necessary code in the cpu_probe_XX function
to detect and set your CPU type.
Once
you have support for your processor family and specific CPU type, you
should be able to take full advantage of its capabilities.
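The probing described in this chapter starts from the CP0 processor ID (PRId) register, which is where the PRID_COMP and PRID_IMP values come from. The sketch below shows how the register splits into its fields. The mask names follow what I remember from 'include/asm-mips/cpu.h', so treat the exact names as an assumption and check your tree; struct prid_fields and prid_split are made up for this example.

```c
#include <stdint.h>

#define PRID_COMP_MASK 0xff0000u  /* company ID, bits 23..16 */
#define PRID_IMP_MASK  0x00ff00u  /* implementation, bits 15..8 */
#define PRID_REV_MASK  0x0000ffu  /* revision, bits 7..0 */

/* The fields are kept in place (not shifted down), so they can be
 * compared directly against PRID_COMP_xxx / PRID_IMP_xxx constants. */
struct prid_fields {
    uint32_t comp;
    uint32_t imp;
    uint32_t rev;
};

struct prid_fields prid_split(uint32_t prid)
{
    struct prid_fields f;

    f.comp = prid & PRID_COMP_MASK;
    f.imp  = prid & PRID_IMP_MASK;
    f.rev  = prid & PRID_REV_MASK;
    return f;
}
```

A probe function typically switches on the company field first to pick the family, then on the implementation field to pick the CPU type. Older CPUs leave the company field as zero, so for those only the implementation field is meaningful.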
Chapter
8 : Board support - interrupts
If
you followed the previous steps, most likely you will see the kernel
hang at the BogoMIPS calibration step. The reason is simple: the
interrupt code is not there and jiffies is never updated. Therefore
the calibration can never finish.
Before
you start writing interrupt code, it really pays to study the
hardware first. Pay particular attention to identify all interrupt
sources, their corresponding controllers and how they are routed.
Then
you need to come up with a strategy, which typically includes:
-
a
static interrupt routing map
-
a
list of interrupt sources
-
a
list of their corresponding controllers
-
how
interrupt controllers cascade from each other
Interrupt
code overview
To
completely service an interrupt, four different pieces of code work
together:
- IRQ
detection/dispatching
-
This
is typically assembly code in a file called 'int_handler.S'.
Sometimes there is also secondary-level dispatching code written in
C for complicated IRQ detection. The end result is that we identify
and select a single IRQ source, represented by an integer, and then
pass it to function 'do_IRQ()'.
-
do_IRQ()
-
do_IRQ()
is provided in the 'arch/mips/kernel/irq.c' file. It provides a
common framework for IRQ handling. It invokes the individual IRQ
controller code to enable/disable a particular interrupt. It calls
the driver supplied interrupt handling routine that does the real
processing.
-
hw_irq_handler
-
It
is a structure associated with each IRQ source. The structure is a
collection of function pointers, which tells do_IRQ() how it should
deal with this particular IRQ.
-
driver
interrupt handling code
-
The
code that does the real job.
Obviously,
for our porting purposes we need to write the IRQ detection/dispatching
code and the hw_irq_handler code for any new IRQ controller used in
the system. In addition, there is also IRQ setup and initialization
code.
CPU
as an IRQ controller
Most
R4K-compatible MIPS CPUs have the ability to support eight interrupt
sources. The first two are software interrupts, and the last one is
typically used for the CPU counter interrupt. Therefore you can have up
to five external interrupt sources.
Luckily,
if the CPU is the only IRQ controller in the system, you are done!
The hw_irq_handler code is already written in the
'arch/mips/kernel/irq_cpu.c' file. All you have to do is to define
"CONFIG_IRQ_CPU" for your machine and make sure your IRQ setup
routine calls mips_cpu_irq_init(0). This call will
initialize the interrupt descriptor and set up the IRQ numbers to
range from 0 to 7.
You
can also find a matching interrupt dispatching code in this
file.
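For readers who prefer C to assembly, the decision that the CPU-level dispatching code makes can be sketched like this. The pending bits are in the Cause register (IP0 to IP7, bits 8 to 15) and the enable bits are in the Status register (IM0 to IM7, same bit positions). The function name is made up, and scanning from IP7 downwards is just one possible priority order; a real board would choose the order that suits its hardware.

```c
#include <stdint.h>

#define CAUSE_IP_SHIFT   8   /* Cause.IP0..IP7 are bits 8..15 */
#define STATUS_IM_SHIFT  8   /* Status.IM0..IM7 are bits 8..15 */

/* Return the highest-numbered CPU interrupt (0..7) that is both
 * pending and unmasked, or -1 if there is none. */
int cpu_irq_pending(uint32_t cause, uint32_t status)
{
    uint32_t pending = (cause >> CAUSE_IP_SHIFT) &
                       (status >> STATUS_IM_SHIFT) & 0xffu;
    int irq;

    for (irq = 7; irq >= 0; irq--)
        if (pending & (1u << irq))
            return irq;
    return -1;
}
```

With an IRQ base of 0, the returned number is exactly the IRQ number to pass to do_IRQ(). A result of -1 corresponds to a spurious interrupt.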
Set up cascading
interrupts
More than likely you will have
more interrupt sources than those that can directly connect to the
CPU interrupt pins. A second or even third-level interrupt controller
may connect to one or more of those CPU pins. In that case, you have
cascading interrupts.
There are plenty of examples
of how cascading interrupts work, such as the DDB5477 and the Osprey.
Here is a short summary:
-
Assign
blocks of IRQ numbers to various interrupt controllers in the whole
system. For example, in the case of the Osprey, CPU interrupts
occupy IRQ 0 to 7. Vr4181 system interrupts occupy IRQ 8 to 39, and
GPIO interrupts occupy 40 to 56. In most cases, the actual IRQ
numbers do not matter, as long as the driver knows which IRQ number
it should use. However, if you have an i8259 interrupt controller
and an ISA bus, you should try to assign IRQ numbers 0 to 15 to the
i8259 interrupts because it will make the legacy PC drivers happy.
(Please note that before ~12/08/2001, in version 2.4.16 of the Linux
kernel, the 'i8259.c' file set the base vector to be 0x20. If you
use the IRQ acknowledgement cycle to obtain the interrupt vector,
you will get an IRQ number from 0x20 to 0x2f. You will then need to
subtract 0x20 from the return value to get the correct IRQ number.)
-
Write
the 'hw_irq_controller' member functions for your specific
controllers. Note that CPU and i8259 already have their code
written. You just need to define appropriate CONFIG options for your
board. See the next sub-section for more details about writing
'hw_irq_controller' member functions.
-
In
your IRQ setup routine, initialize all the controllers, usually by
calling 'interrupt_controller_XXX_init()' functions.
-
In your IRQ setup
routine, set up the cascading IRQs. This setup will enable interrupts
for the upper interrupt controller so that the lower-level
interrupts can cascade through once they are enabled. A typical way
of doing this is to have a dummy 'struct irqaction' and set it up as
follows:
static struct irqaction cascade =
{ no_action, SA_INTERRUPT, 0, "cascade", NULL, NULL };
extern int setup_irq(unsigned int irq, struct irqaction *irqaction);
void __init <my>_irq_init(void)
{
....
setup_irq(CPU_IP3, &cascade);
}
-
You need to expand your
interrupt dispatching code properly to identify the added interrupt
sources. If the code is simple enough, you can do it in the same
int_handler.S file. If it is more complicated, you may do it in a
separate C function (such as in the DDB5476 board).
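A second-level dispatching function written in C usually boils down to the following pattern: read the cascaded controller's pending and enable registers, pick one active source, and translate its bit position into the IRQ number block assigned to that controller. The sketch below borrows the numbering from the Osprey example above (system interrupts start at IRQ 8); the function name and the lowest-bit-first priority are assumptions made for illustration.

```c
#include <stdint.h>

#define SYS_IRQ_BASE 8   /* first IRQ number assigned to this controller */

/* Given the controller's pending and enable register values, return
 * the IRQ number of the lowest-numbered active source, or -1 if no
 * enabled source is pending. */
int sys_irq_decode(uint32_t pending, uint32_t enabled)
{
    uint32_t active = pending & enabled;
    unsigned int bit;

    for (bit = 0; bit < 32; bit++)
        if (active & (1u << bit))
            return SYS_IRQ_BASE + (int)bit;
    return -1;
}
```

The dispatcher would then hand the returned number to do_IRQ(). Masking with the enable register matters: a source that is pending but disabled must not be dispatched.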
The
hw_irq_controller struct
The 'hw_irq_controller'
structure is defined in the 'include/linux/irq.h' file as an alias
for the 'hw_interrupt_type' structure.
struct hw_interrupt_type {
const char * typename;
unsigned int (*startup)(unsigned int irq);
void (*shutdown)(unsigned int irq);
void (*enable)(unsigned int irq);
void (*disable)(unsigned int irq);
void (*ack)(unsigned int irq);
void (*end)(unsigned int irq);
void (*set_affinity)(unsigned int irq, unsigned long mask);
};
The 'arch/mips/kernel/irq_cpu.c' file
is a good example of how to write 'hw_irq_controller' member functions.
Here are some more programming notes for each of these functions:
- const
char * typename;
-
Controller name. Will be
displayed under /proc/interrupts.
-
unsigned
int (*startup)(unsigned int irq);
-
Invoked when request_irq()
or setup_irq() is called. You need to
enable this interrupt here. You may also want to do
some IRQ-specific initialization (such as turning on power for this
interrupt, perhaps).
-
void
(*shutdown)(unsigned int irq);
-
Invoked when free_irq()
is called. You need to disable this interrupt and perhaps do some
other IRQ-specific cleanup.
-
void
(*enable)(unsigned int irq) and void
(*disable)(unsigned int irq)
-
They are used to implement
enable_irq(), disable_irq()
and disable_irq_nosync(), which in turn
are used by driver code.
-
void
(*ack)(unsigned int irq)
-
ack() is invoked at the
beginning of do_IRQ() when we want to acknowledge an interrupt. I
think you also need to disable this interrupt here so that you
don't get recursive interrupts on the same interrupt source. [HELP:
can someone confirm?]
-
void
(*end)(unsigned int irq)
-
This is called by do_IRQ()
after it has handled this interrupt. If you disabled the interrupt in
the ack() function, you should re-enable it here. [HELP: generally what
else should we do here?]
-
void
(*set_affinity)(unsigned int irq, unsigned long mask)
-
This is used in SMP machines to
set up interrupt handling affinity with certain CPUs. [TODO] [HELP]
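To make the notes above more concrete, here is a minimal user-space
model of a controller that follows them. It is only a sketch: the
'MyPIC' name, MYPIC_IRQ_BASE, and the mypic_mask variable (which
stands in for a hardware enable register) are hypothetical, and the
structure is a local copy of the one shown above with set_affinity
left out.

```c
#include <stdint.h>

/* Local copy of the structure shown above (set_affinity omitted). */
struct hw_interrupt_type {
    const char *typename;
    unsigned int (*startup)(unsigned int irq);
    void (*shutdown)(unsigned int irq);
    void (*enable)(unsigned int irq);
    void (*disable)(unsigned int irq);
    void (*ack)(unsigned int irq);
    void (*end)(unsigned int irq);
};

#define MYPIC_IRQ_BASE 8u     /* hypothetical first IRQ of this controller */
static uint32_t mypic_mask;   /* stands in for the hardware enable register */

static void mypic_enable(unsigned int irq)
{
    mypic_mask |= 1u << (irq - MYPIC_IRQ_BASE);
}

static void mypic_disable(unsigned int irq)
{
    mypic_mask &= ~(1u << (irq - MYPIC_IRQ_BASE));
}

/* startup: enable the line; return 0 to say nothing was pending */
static unsigned int mypic_startup(unsigned int irq)
{
    mypic_enable(irq);
    return 0;
}

/* ack: mask the source so it cannot recurse while its handler runs */
static void mypic_ack(unsigned int irq)
{
    mypic_disable(irq);
}

/* end: undo what ack() did */
static void mypic_end(unsigned int irq)
{
    mypic_enable(irq);
}

struct hw_interrupt_type mypic_irq_type = {
    .typename = "MyPIC",
    .startup  = mypic_startup,
    .shutdown = mypic_disable,
    .enable   = mypic_enable,
    .disable  = mypic_disable,
    .ack      = mypic_ack,
    .end      = mypic_end,
};
```

A real end() routine would normally also leave the line masked if
the IRQ was disabled while its handler was running; the model skips
that check because it has no IRQ descriptor to consult.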
The IRQ initialization code
The IRQ initialization is done
in 'init_IRQ()'. Currently it is supplied by each individual board.
In the future, it will probably be a MIPS common routine, which will
further invoke a board-specific function, board_irq_init().
board_irq_init will be a function pointer to which the
<my_board>_setup() function needs to assign a proper value.
In any case, the following is
skeleton code for a typical init_IRQ() routine.
extern asmlinkage void my_irq_dispatcher(void);
extern void breakpoint(void);
extern int setup_irq(unsigned int irq, struct irqaction *irqaction);
extern void mips_cpu_irq_init(u32 irq_base);
extern void init_generic_irq(void);
static struct irqaction cascade =
{ no_action, SA_INTERRUPT, 0, "cascade", NULL, NULL };
static struct irqaction reserved =
{ no_action, SA_INTERRUPT, 0, "reserved", NULL, NULL };
static struct irqaction error_irq =
{ error_action, SA_INTERRUPT, 0, "error", NULL, NULL };
void __init init_IRQ(void)
{
int i;
extern irq_desc_t irq_desc[];
/* hardware initialization code */
....
/* this is the routine defined in int_handler.S file */
set_except_vector(0, my_irq_dispatcher);
/* setup the default irq descriptor */
init_generic_irq();
/* init all interrupt controllers */
mips_cpu_irq_init(CPU_IRQ_BASE);
....
/* set up cascading IRQ */
setup_irq(CPU_IRQ_IP3, &cascade);
/* set up reserved IRQ so that others cannot mistakenly request
* it later.
*/
setup_irq(CPU_IRQ_IP4, &reserved);
#ifdef CONFIG_DEBUG
/* setup debug IRQ so that if that interrupt happens, we can
* capture it.
*/
setup_irq(CPU_IRQ_IP4, &error_irq);
#endif
#ifdef CONFIG_REMOTE_DEBUG
printk("Setting debug traps - please connect the remote debugger.\n");
set_debug_traps();
breakpoint();
#endif
}
Final notes
What is described in this
chapter is the so-called new-style interrupt handling. We used to
have three different ways to handle interrupts: the new style
(CONFIG_NEW_IRQ), the old style (CONFIG_ROTTEN_IRQ), and board-private
ad hoc routines. The new style has been the only valid method since
October 2002.
Chapter
9 : Board support - system time and timer
Linux
relies on an RTC (real-time clock) device to obtain the real calendar
date and time when it boots up. It relies on a system timer to
advance the tick count, jiffies. If you don't provide proper
time and timer code, Linux won't run. In fact it will get stuck in
'calibrate_delay()' during the startup process because jiffies
is never incremented. (See Appendix A for more details about the
Linux/MIPS startup sequence).
There
is an excellent document (I am being a little shameless here. :-0) under
'Documentation/mips/time.README' that should be read to further
understand timekeeping for Linux/MIPS. It is a must-read.
Here
are some comments on implementing time and timer services:
-
If
your system has a CPU counter and another hardware timer, use the
hardware timer over the CPU counter, even though the CPU counter might
be easier to set up and use. This is because the CPU counter relies on
the CPU frequency, which is more likely to change in the future. In
addition, performance-critical code may need to access the CPU
counter for its own measurements. Some CPUs may have a variable CPU
frequency, which makes the CPU counter unusable as a timer source.
[DEBATE: 03/12/04, I changed my preference on this issue. Linux in
the future will have richer and higher resolution time support. If
all MIPS boards use CPU counter as the system timer, we can maximize
the code sharing.]
-
Unless
you have to use interrupts to calibrate the CPU frequency, you can
generally avoid implementing the 'board_time_init()' function. Most
of its work can be done in the board setup routine.
-
When
you implement 'rtc_set_time()', more than likely you need to call
the 'to_tm()' function, which converts a single time value (a count
of seconds) to a full 'struct rtc_time'. This function is provided in
'arch/mips/kernel/time.c' and declared in 'include/asm-mips/time.h'.
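For readers who do use the CPU counter as the system timer, the
arithmetic behind it is small enough to sketch. The code below
assumes a 32-bit counter that increments at a fixed, known
frequency and raises the timer interrupt when it equals a compare
value; the function names and the HZ value of 100 are chosen for the
example and are not taken from the kernel source.

```c
#include <stdint.h>

#define HZ 100   /* assumed tick rate: timer interrupts per second */

/* Counter increments per jiffy, rounded to the nearest integer. */
uint32_t cycles_per_jiffy(uint32_t counter_hz)
{
    return (counter_hz + HZ / 2) / HZ;
}

/*
 * Compare value for the next tick. Adding to the previous compare value
 * (rather than to the current count) avoids accumulating drift, and
 * unsigned arithmetic wraps around exactly like the 32-bit counter does.
 */
uint32_t next_compare(uint32_t last_compare, uint32_t counter_hz)
{
    return last_compare + cycles_per_jiffy(counter_hz);
}
```

For a 100 MHz counter this gives 1000000 counts per jiffy. If the
counter frequency changes at run time, these numbers are wrong,
which is the main objection raised in the first comment above.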
Chapter
10 : Board support - PCI subsystem
The
PCI subsystem is perhaps the most complex code you have to deal with
during the porting process. The key to making the PCI subsystem work
properly is a good understanding of the PCI bus itself, the code
layout, and the execution flow in Linux. Like many other parts of
porting, you will find in the end that the actual code written is
minimal.
References
Pete
Popov wrote a fine document on PCI and Linux/MIPS. It is under
'Documentation/mips/pci/pci.README'. It is highly recommended
reading.
For
those who want to know more about the PCI bus itself, I recommend the
book PCI System Architecture published by MindShare
Inc.
Overview of the
PCI bus
Here we summarize some facts
of the PCI bus:
-
The
PCI bus has three separate address spaces, config, I/O, and memory
space.
-
Every
PCI device responds to config commands, and it can respond to I/O
accesses and/or memory accesses.
-
During
the boot time, the BIOS or the OS sets the base address registers
(BARs) through configuration space. BARs determine address ranges in
I/O or memory space that a device should respond to. Obviously,
those ranges should not be duplicated anywhere else in the I/O space
or memory space on the same PCI bus.
-
Multiple PCI buses can be
connected through PCI-PCI bridges.
BAR assignment in
Linux
On all IBM PC-compatible
machines, BARs are assigned by the BIOS. Linux simply scans through
the buses and records the BAR values.
Some MIPS boards adopt a similar
approach, where BARs are assigned by firmware. However, the quality
of BAR assignment by firmware varies quite a bit. Some firmware simply
assigns BARs to on-board PCI devices and ignores all add-on PCI cards.
In that case, Linux cannot solely rely on the firmware's assignment.
There is another issue with
depending on the firmware assignment. You need to stick with the
address range set up by the firmware. In other words, if the firmware
assigns PCI memory space from 0x10000000 to 0x14000000, you cannot
easily move it to a different address range somewhere else in Linux.
There are three possible ways to
fix this:
-
The
first way is to fix the BAR assignment manually in your board setup
routine. This only works if your board does not have a PCI slot to
plug in an arbitrary PCI card. You need to carefully examine the
existing PCI resource assignment done by firmware so that you do not
assign overlapping address ranges.
-
The
second way is to do a complete PCI resource assignment *before*
Linux starts PCI bus scanning. In other words, we discard any PCI
resource assignment done in firmware, if there is any, and do a new
assignment by ourselves. This approach gives us complete control
over the address range and resource allocation. With the
CONFIG_PCI_AUTO option used in 'arch/mips/config-shared.in' and
'arch/mips/kernel/pci_auto.c' file, it turns out to be quite easy to
do. This approach is the focus of this chapter.
-
The third way is to
call the 'pci_assign_unassigned_resources()' function *after* Linux
completes the PCI bus scan. The function is defined in the
'drivers/pci/setup-bus.c' file in recent 2.4.x kernels. With earlier
versions of this function, Linux will assign resources to PCI
devices whose BARs have *not* been properly assigned. With the recent
versions (that have "optimal" resource assignment based on
sizes), this PCI routine apparently does a complete resource
re-assignment. In other words, it does almost exactly the same as
what the 'pci_auto.c' file does.
[DEBATE] The 'pci_auto' and
'pci_assign_unassigned_resources' approaches have their own advantages
and disadvantages. Ideally, the whole PCI subsystem should be completely
rewritten so that several things that are not currently addressed
can be taken into consideration:
-
A
notion of a host-PCI controller, and supporting multiple host-PCI
controllers.
-
A
kernel-independent abstraction layer to access configuration space,
distinguishing Type 0 and Type 1 configuration accesses. This
removes the need for 'pci_dev' in the lowest-level PCI routines.
-
Pass
1 scan to do bus number assignment and record bus topology through
the top-level host-PCI controller structure.
-
Pass
2 scan to discover all other PCI devices and assign the resources
along the way.
-
If
we want to do optimal resource assignment, we need to do the
resource assignment in Pass 3 instead of Pass 2.
-
A
complete PCI device list is built during the above passes.
-
The address ranges for PCI
memory space and I/O space are set when host-PCI controller
structures are initially created.
PCI startup
sequence
-
do_basic_setup()
calls pci_init(), which is defined in
'drivers/pci/pci.c'.
-
pci_init()
first calls pcibios_init(). If you enable
CONFIG_NEW_PCI, pcibios_init() is
implemented in the 'arch/mips/kernel/pci.c' file. Otherwise you need
to provide it in your own board-dependent code.
-
Optionally,
pcibios_init() may call
pciauto_assign_resources() to do a complete
PCI resource assignment.
-
Somewhere
inside pcibios_init(), pci_scan_bus()
is called. If a machine has multiple host-PCI controllers,
pci_scan_bus() should be called for each of
the top-level PCI buses. Apparently, bus numbers should have already
been set up before pci_scan_bus() can
properly run.
-
Optionally,
after pci_scan_bus() is called,
pcibios_init() may choose to call
pci_assign_unassigned_resources() to do a
complete PCI resource assignment.
-
pcibios_init()
will do some more fixups (resources, IRQs, etc.).
-
Returning from
pcibios_init(), pci_init()
will do a final round of device-based fixups.
In this chapter, we focus on
the approach where both CONFIG_NEW_PCI and CONFIG_PCI_AUTO are
enabled.
PCI driver
interface
The goal of all this PCI work is to
eventually set up an environment where all PCI device drivers can run
happily. Knowing how PCI device drivers access PCI resources can
greatly help you understand how you should do PCI initialization and
setup.
-
inb()/outb()/inw()/outw()/inl()/outl()
Defined in 'include/asm-mips/io.h'. Drivers use these macros to
read/write into PCI IO space. The address arguments are addresses in
PCI IO space, and should correspond to one of the BAR values of the
device. On MIPS machines, we assume PCI IO space is mapped into a
contiguous physical address block. The base address of the block is
mips_io_port_base, which you need to set up
at the beginning of board setup time. The proper way to set it up is
to call set_io_port_base().
-
readb()/writeb()/readw()/writew()/readl()/writel()
Defined in the 'include/asm-mips/io.h' file. Drivers use them to
access PCI memory space. On MIPS machines, we assume PCI memory
space is 1:1 mapped into a block of physical address. Therefore
those macros are equivalent to direct physical memory access.
-
pci_read_config_word()
and friends
Defined in 'include/linux/pci.h' (through PCI_OP()).
Drivers use them to read or write the configuration registers of
devices. Low-level routines are abstracted as struct
pci_ops, where each board must supply one.
-
pci_map_single()
and friends
Defined in 'include/asm-mips/pci.h'. Drivers
use these macros to map a virtual address to a bus address (so you
can tell the device to do DMA). They don't usually affect PCI
porting.
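The relationship between a port number and mips_io_port_base can be
modelled in a few lines of user-space C. This is only a sketch of
the idea: the model_* names are invented for the example, and
io_port_base stands in for mips_io_port_base (which in the kernel is
an address the CPU can dereference to reach the I/O window).

```c
#include <stdint.h>

static uintptr_t io_port_base;   /* stands in for mips_io_port_base */

/* Counterpart of set_io_port_base(): remember where the window lives. */
static void model_set_io_port_base(uintptr_t base)
{
    io_port_base = base;
}

/* A port access is just a memory access at base + port number. */
static uint8_t model_inb(unsigned int port)
{
    return *(volatile uint8_t *)(io_port_base + port);
}

static void model_outb(uint8_t val, unsigned int port)
{
    *(volatile uint8_t *)(io_port_base + port) = val;
}
```

This is why the base must be set before any driver touches a port:
until then every inb()/outb() lands at the wrong address.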
Setting up the
host-PCI controller
As we can see from the above
discussion, we need to set up the host-PCI controller such that:
-
It
has a 1:1 mapping between PCI memory space and CPU physical address
space
-
It maps the beginning
part of PCI IO space into an address block in physical address space.
The Host-PCI controller
usually allows you to map PCI memory space into a window in physical
address space. Let us say the base address is ba_mem
and size is sz_mem. It usually allows you to
translate that address into another one with a fixed offset, say off.
(Note that off can be either positive or negative.)
So if a driver accesses the address ba_mem+x (0
<= x < sz_mem), the host-PCI controller will intercept the
command and translate it into a PCI memory access at address
ba_mem+off+x.
To maintain the 1:1 mapping,
we must set up the PCI addressing such that off
is 0. Also note that with this setup, the CPU cannot access the PCI
memory ranges outside the window, i.e., [0x0, ba_mem) and [ba_mem
+ sz_mem, 0xffffffff].
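The translation described above can be written down as a small
user-space model. It is a sketch only: the structure and function
names are invented for the example, and the window values in the
usage note reuse the 0x10000000 to 0x14000000 range mentioned
earlier in this chapter.

```c
#include <stdint.h>

/* One memory window of a host-PCI controller (names are hypothetical). */
struct pci_mem_window {
    uint32_t ba_mem;   /* base of the window in CPU physical space */
    uint32_t sz_mem;   /* size of the window in bytes */
    int32_t  off;      /* fixed offset the controller adds; 0 means 1:1 */
};

/*
 * Translate a CPU physical address into a PCI memory address.
 * Returns 1 and stores the result if phys falls inside the window,
 * 0 if the controller would not claim the access at all.
 */
int phys_to_pci_mem(const struct pci_mem_window *w, uint32_t phys,
                    uint32_t *pci)
{
    if (phys < w->ba_mem || phys - w->ba_mem >= w->sz_mem)
        return 0;
    *pci = phys + (uint32_t)w->off;   /* wraps correctly for negative off */
    return 1;
}
```

With ba_mem = 0x10000000, sz_mem = 0x04000000 and off = 0, physical
address 0x10000100 maps to the same PCI address, and anything
outside the window is simply not forwarded to the PCI bus.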
Additionally, we must make
system RAM visible on the PCI memory bus at address 0x0 (assuming
that is the address in physical address space) in order for PCI
devices to do DMA transfers.
The beginning part of PCI IO
space is usually mapped into another window in physical address
space, say [ba_io, ba_io + sz_io]. In other
words, a range [0, sz_io] in PCI IO space
corresponds to the range [ba_io, ba_io + sz_io].
Obviously, mips_io_port_base should be set
to ba_io.
The above setup is typically
done in the board-specific setup routine (i.e., <board>_setup()).
You typically also set up ioport_resource and
iomem_resource there:
ioport_resource.start = 0x0;
ioport_resource.end = sz_io;
iomem_resource.start = 0x0;
iomem_resource.end = sz_mem;
These variables are the roots of
all IO and memory resources (roughly corresponding to the ancient ISA
IO space and ISA memory space). For simplicity you can also set the
end to be 0xffffffff.
Board-specific
functions and variables
Here is a list of
board-specific functions you must implement. Again, I assume this
board has CONFIG_NEW_PCI and CONFIG_PCI_AUTO options enabled.
-
struct
pci_ops my_pci_ops
You implement six functions to fill into
this structure, which is needed by pci_scan_bus() and
pciauto_assign_resources(). Note that you need to distinguish type 0
and type 1 configuration accesses in those functions. You can
typically do that by checking whether the bus's parent
(dev->bus->parent) is NULL.
-
mips_pci_channels[]
You need to define this array in order to use pciauto resource
assignment. For each top-level PCI bus, you need to supply one
element in this array. The array ends with an all-NULL element.
Each element is a structure that consists of a pci_ops,
a pci_io_resource, and a pci_mem_resource, and usually represents a
top-level PCI bus connected to the CPU. pci_ops defines the functions
that access the PCI bus's config space. pci_io_resource and
pci_mem_resource specify the address ranges that pciauto will use
to assign to the BARs of PCI devices. pci_io_resource starts
at 0x0 (or 0x1000 to leave some room for legacy ISA devices) and
ends at sz_io. pci_mem_resource
starts at ba_mem and ends at ba_mem
+ sz_mem. Note these addresses are in PCI IO and PCI memory
space respectively. However, since we maintain a 1:1 mapping between
PCI memory space and CPU physical address space, pci_mem_resource
also represents the PCI memory space window in CPU physical address
space.
-
pcibios_fixup_resources()
This routine is passed to pci_scan_bus() and invoked right after
a PCI device is discovered. This is the place where you can do
some device-specific fixup (BARs, pci_dev structure, etc.). Note you
can do the same fixup in pcibios_fixup(). I recommend leaving this
function empty unless you have some specific need that requires
immediate fixup.
-
pcibios_fixup()
This function is invoked after pci_scan_bus() is done, i.e., after all
PCI bridges and devices have been discovered by Linux. Here you can
enumerate through PCI devices and do device-based fixups, or you can
do bus- or controller-related fixups.
-
pcibios_fixup_irqs()
This is the place to fix up PCI-related IRQs. It is invoked
after pci_scan_bus() is done, right after pcibios_fixup(). A typical
strategy is to assign the IRQ based on the slot number, and possibly the
bus number if there is more than one top-level bus in the system. Note
that you can do the fixup in pcibios_fixup() as well and leave this
function empty.
-
pcibios_assign_all_busses()
Return 1 to indicate that all bus numbers have been assigned by
pciauto. [TODO: Should this function be included in pciauto by
default?] [TODO: the 'drivers/pci/pci.c' file seems to have a typo where
it calls this function. The logic is inverted.]
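To illustrate the type 0 versus type 1 distinction mentioned under
my_pci_ops above, the sketch below builds the address that appears
on the PCI bus during a configuration cycle. It is a user-space
model with stand-in structures, not a driver for any real
controller. In particular, wiring the IDSEL of slot n to address
line AD[11+n] is only a common convention and an assumption here;
real host-PCI controllers often have their own register layout for
generating these cycles, so check your controller manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel's pci_bus and pci_dev structures. */
struct model_bus { struct model_bus *parent; unsigned char number; };
struct model_dev { struct model_bus *bus; unsigned int devfn; };

#define MODEL_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define MODEL_FUNC(devfn) ((devfn) & 0x07)

/*
 * Compute the configuration-cycle address for register 'where'.
 * Returns 0 on success, -1 if the device cannot be addressed.
 */
int model_config_addr(const struct model_dev *dev, unsigned int where,
                      uint32_t *addr)
{
    uint32_t slot = MODEL_SLOT(dev->devfn);
    uint32_t func = MODEL_FUNC(dev->devfn);

    if (dev->bus->parent == NULL) {
        /* Type 0: device sits on the top-level bus; AD[1:0] = 00 */
        if (slot > 20)
            return -1;   /* assumed wiring has only AD11..AD31 as IDSEL */
        *addr = (1u << (11 + slot)) | (func << 8) | (where & 0xfc);
    } else {
        /* Type 1: device is behind a PCI-PCI bridge; AD[1:0] = 01 */
        *addr = ((uint32_t)dev->bus->number << 16) | (slot << 11) |
                (func << 8) | (where & 0xfc) | 0x1;
    }
    return 0;
}
```

Each of the six read/write routines in pci_ops would start with a
computation of this kind before touching the controller.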
Tips for writing
PCI code
-
Some
ill-behaved PCI devices may spoil the party. The best way to deal
with them is to single them out in the PCI configuration routines
(in pci_ops). Inside those routines you check for the slot number
and function number corresponding to the bad devices and return
dummy values instead of accessing the hardware.
-
If
you have multiple top-level PCI buses, it is tricky to do PCI IO
assignment. Assume you have two PCI buses and their IO spaces are
mapped into [ba_io1, ba_io1 + sz_io1] and
[ba_io2, ba_io2 + sz_io2] (ba_io1
+ sz_io1 <= ba_io2). You need to
-
Set
mips_io_port_base to be ba_io1
-
Set
pci_io_resource to be [0x0,
sz_io1] for the first PCI bus
-
Set pci_io_resource
to be [ba_io2 - ba_io1, ba_io2 - ba_io1 +
sz_io2] for the 2nd PCI bus. In addition, the legacy ISA
devices on the second PCI bus cannot be used without modifying
their drivers.
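The arithmetic for the two-bus case can be checked with a short
user-space sketch. The structure and function names are invented for
the example, and the ranges follow the [start, start + size]
convention used in the text above.

```c
#include <stdint.h>

/* A range in PCI IO (port) space, as seen by inb()/outb(). */
struct io_range { uint32_t start; uint32_t end; };

/*
 * io_base: the value chosen for mips_io_port_base (ba_io1 in the text)
 * ba_io, sz_io: physical base and size of this bus's IO window
 * Returns the pci_io_resource range to use for that bus.
 */
struct io_range bus_io_resource(uint32_t io_base, uint32_t ba_io,
                                uint32_t sz_io)
{
    struct io_range r;

    r.start = ba_io - io_base;     /* port number of the window's first byte */
    r.end   = r.start + sz_io;
    return r;
}
```

For example, with windows at 0x18000000 and 0x18100000 of size
0x10000 each, the first bus gets ports [0x0, 0x10000] and the second
gets [0x100000, 0x110000]. A legacy driver with a hard-coded port
such as 0x3f8 therefore always lands on the first bus, which is why
ISA-style devices on the second bus need driver changes.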
Chapter
11 : Ether drivers and networking
[TODO]
Chapter
12 : Romfs root filesystem
[TODO]
Chapter
13 : Debugging
Although
it is the last chapter, it is probably the most necessary one. In
this chapter, I will list some commonly used debugging tips and
tricks.
Kernel
Oops
ksymoops
is a program that deciphers all the secret numbers in a kernel core
dump. Old versions of ksymoops may not work well for MIPS.
I
sometimes use a script, call2sym,
written by Phil Hollenback. You just cut and paste the call trace part
to feed the script and it will display a possible function
call stack at the time of the crash. Note it is likely some symbols are
bogus, but the real ones should be displayed. You will have to use
your own judgement.
In
2.6, the back trace is automatically printed for you when a kernel
core dump happens.
Appendix
A: Linux startup sequence in 2.4.x
kernel_entry() - arch/mips/kernel/head.S
set stack;
prepare argc/argp/envp;
jal init_arch - arch/mips/kernel/setup.c
cpu_probe() -
prom_init(...) - arch/mips/ddb5074/prom.c
loadmmu()
start_kernel() - init/main.c
setup_arch(&command_line); - arch/mips/kernel/setup.c
ddb_setup() - arch/mips/ddb5074/setup.c
parse_options(command_line);
trap_init();
init_IRQ(); - arch/mips/kernel/irq.c
sched_init();
softirq_init();
time_init();
if (board_time_init) board_time_init();
set xtime by calling rtc_get_time();
pick appropriate do_gettimeoffset()
board_timer_setup(&timer_irqaction);
console_init();
init_modules();
kmem_cache_init();
sti(); /* interrupt first open */
calibrate_delay();
mem_init();
kmem_cache_sizes_init();
pgtable_cache_init();
fork_init();
proc_caches_init();
vfs_caches_init();
buffer_init();
page_cache_init();
signals_init();
smp_init();
kernel_thread(init, ...)
cpu_idle();
init() - init/main.c
- lock_kernel();
do_basic_setup();
- [MTRR] mtrr_init();
[SYSCTL] sysctl_init();
[S390] s390_init_machine_check();
[ACPI] acpi_init();
[PCI] pci_init();
[SBUS] sbus_init();
[PPC] ppc_init();
[MCA] mca_init();
[ARCH_ACORN] ecard_init();
[ZORRO] zorro_init();
[DIO] dio_init();
[MAC] nubus_init();
[ISAPNP] isapnp_init();
[TC] tc_init();
sock_init();
start_context_thread();
do_initcalls();
[IRDA] irda_proto_init();
[IRDA] irda_device_init();
[PCMCIA] init_pcmcia_ds();
prepare_namespace();
free_initmem();
unlock_kernel();
files = current->files;
if(unshare_files())
panic("unshare");
put_files_struct(files);
if (open("/dev/console", O_RDWR, 0) < 0)
printk("Warning: unable to open an initial console.\n");
(void) dup(0);
(void) dup(0);
if (execute_command)
run_init_process(execute_command);
run_init_process("/sbin/init");
run_init_process("/etc/init");
run_init_process("/bin/init");
run_init_process("/bin/sh");
panic("No init found. Try passing init= option to kernel.");
Appendix
B: Credits and thanks
People
in the following list have generously given their feedback to me. In
spite of my effort to keep the list as complete as possible, I am
afraid many people are still missing here.
Dirk Behme <dirk.behme@de.bosch.com>
Fillod Stephane <FillodS@thmulti.com>
Geoffrey Espin <espin@idiom.com>
Gerald Champagne <gerald.champagne@esstech.com>
Henri Girard <khgirard@broadbandnetdevices.com>
Neal Crook <ncrook@micron.com>
Steven J. Hill <James.Hill@timesys.com>
TAKANO Ryousei <takano@axe-inc.co.jp>
Motoya Kurotsu <kurotsu@allied-telesis.co.jp>
Appendix
C: Change logs
- 2004/03/29
-
Include
the second batch of changes and updates from Steven Hill (Humongous
thanks!). The content is more or less current with respect to the 2.4 tree.
Notes for porting to 2.6 will hopefully follow soon.
-
2004/01/26
-
Included
initial batch of changes and updates from Steven Hill (Big thanks!).
External links are now shown in a separate window.
-
2002/08/23
-
Add
change log. Make external links the _top frame. Update tools (thanks
to Henri Girard)
-
2002/12/11
-
Incorporate comments from Neal
Crook.