Now that the process of building a cross-toolchain for the ARM family of microprocessors has been described, we turn our attention to the issues involved in porting the Minix kernel to the Ipaq.
The Minix kernel, like any other operating system kernel, has its own set of libraries. These are usually called kernel libraries and are used only inside the kernel. The user-land libraries are libc. Since the user-land libraries rely heavily on the kernel through system calls (syscalls), the kernel must implement the syscall facility, and that is what the different parts of the kernel do. When a syscall is invoked, a function inside the kernel is called and handles the dispatch. Simple, eh?
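The dispatch step described above can be sketched as a table of function pointers indexed by call number. This is only an illustration: the call numbers, the handler names and the two-integer argument convention are all made up for the example, and Minix itself passes messages between processes rather than using a table like this one.

```c
#include <stddef.h>

/* Hypothetical call numbers, invented for this sketch. */
enum { SYS_GETPID = 0, SYS_ADD = 1, SYS_COUNT };

/* Assumed convention: every handler takes two ints and returns an int. */
typedef int (*syscall_handler)(int a, int b);

/* Placeholder handlers; a real kernel would do actual work here. */
static int do_getpid(int a, int b) { (void)a; (void)b; return 42; }
static int do_add(int a, int b)    { return a + b; }

/* Table indexed by call number. */
static const syscall_handler syscall_table[SYS_COUNT] = {
    [SYS_GETPID] = do_getpid,
    [SYS_ADD]    = do_add,
};

/* The function the trap handler would call.
 * Returns -1 for call numbers that are out of range or unassigned. */
int syscall_dispatch(int nr, int a, int b)
{
    if (nr < 0 || nr >= SYS_COUNT || syscall_table[nr] == NULL)
        return -1;
    return syscall_table[nr](a, b);
}
```

A user-land stub would place the call number and arguments where the kernel expects them and trap into the kernel, which then ends up in something like syscall_dispatch().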
Unfortunately, some operating systems don't keep all the library functions completely separated. Thus you end up using some functions from the standard libc library, such as printf, strcpy, etc. Granted, if these were reimplemented in the kernel libraries, a certain unnecessary duplication would arise.
And this is where your dilemma lies. You can port both the kernel libraries and the standard C library (libc), or you can port the kernel libraries and only the necessary functions from the standard C library. The latter is the easiest to do, but later on you have to clean up the code, because it gets messy.
We chose to port the kernel libraries and only the necessary user-land functions from the standard C library (libc).
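To give an idea of what "only the necessary functions" can look like, here is a minimal sketch of two string routines written without any libc dependency. The k_ prefix is an assumption made for this example so the names do not clash with a host libc; it is not a Minix naming convention.

```c
#include <stddef.h>

/* Length of a NUL-terminated string, not counting the terminator. */
size_t k_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Copy src, including its terminator, into dst and return dst.
 * The caller must make sure dst is large enough. */
char *k_strcpy(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}
```

Routines like these only need the compiler itself, so they link into a kernel image without dragging in the rest of libc.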
The whole point of going through the complex process of building a cross-toolchain for ARM processors under Linux was the fact that there is no version of Minix available for the ARM processor family that includes the Amsterdam C Compiler (ACK) that Minix uses. Therefore the following steps had to be undertaken to compile the Minix kernel under the cross-toolchain environment that we set up under Linux i386:
It's always easier to write code when you have a Makefile. It will save you hours of retyping commands. For more information on how to write and use Makefiles, check out the Makefile documentation or any website that teaches about Makefiles.
For testing programs, the following Makefile is pretty good:
###############################################################################
#
###############################################################################
ROOT = ..
ARCH = /ipaq/local/bin/arm-coff
AR = ${ARCH}-ar
CC = ${ARCH}-gcc
LD = ${ARCH}-ld -N
OC = ${ARCH}-objcopy
OD = ${ARCH}-objdump
CFLAGS = -g -Wall -D_MINIX -D_WORD_SIZE=4 -D_EM_WSIZE=4
BASIC = Makefile

all: uart.o hello.o
	$(LD) hello.o uart.o -o A.o
	$(OD) -DSs A.o > debug
	$(OC) --output-format=binary A.o A
###############################################################################
uart.o:
	$(CC) $(CFLAGS) -c uart.S -o uart.o
hello.o:
	$(CC) $(CFLAGS) -c hello.c -o hello.o
The most important flag in this case is -c, which instructs the gcc compiler not to run the linker. The -o flag names the output file.
arm-coff-ld is run with the input object files (hello.o uart.o) and the output file A.o. Only these two files are linked. So if you are using printf or other functions from libc, the linker won't be able to find the libc library and will complain about undefined symbols. You will have to add the -L flag (the library search path) together with the matching -l options to pull in libc and libgcc, or implement (port) those functions yourself.
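As an illustration of the "implement those functions yourself" route, the sketch below builds a tiny puts-like routine on top of a single character-output hook. The names uart_putc and k_puts are hypothetical. On the target, uart_putc would be the routine provided by uart.S (what that file actually exports is not shown here); in this sketch a host stand-in stores the characters in a buffer so the code can be tried on a PC.

```c
/* Host stand-in for the hardware: characters land in this buffer.
 * On the target this would be replaced by real UART code. */
static char out_buf[64];
static unsigned out_len;

/* Hypothetical low-level hook: emit one character. */
void uart_putc(char c)
{
    if (out_len < sizeof out_buf - 1)
        out_buf[out_len++] = c;
    out_buf[out_len] = '\0';   /* keep the buffer NUL-terminated */
}

/* Minimal puts-like routine: writes s followed by a newline and
 * returns the number of characters written. */
int k_puts(const char *s)
{
    int n = 0;
    while (*s != '\0') {
        uart_putc(*s++);
        n++;
    }
    uart_putc('\n');
    return n + 1;
}
```

Because k_puts only depends on uart_putc, linking hello.o against uart.o is enough and no -L or -l options are needed for it.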
The arm-coff-objdump utility disassembles all the opcodes in the file and interleaves them with the source code. It requires the input file to have a header (which we get after linking the files). The example below shows the C source, the assembler source, and the resulting debug file:
void gccmain()
{
    someweirdfunction("boo!");
}

        .text
        .global _someweirdfunction
        .text
_someweirdfunction:
        mov     pc,lr

A.o:     file format coff-arm-little

Contents of section .text:
 8000 0dc0a0e1 00d82de9 04b04ce2 04009fe5  ......-...L.....
... bla bla ..
Disassembly of section .text:

00008000 <_gccmain>:
void gccmain() {
    8000:       e1a0c00d        mov     r12, sp
    8004:       e92dd800        stmdb   sp!, {r11, r12, lr, pc}
    8008:       e24cb004        sub     r11, r12, #4    ; 0x4

0000800c <LBB2>:
someweirdfunction("boo!");
    800c:       e59f0004        ldr     r0, [pc, #4]    ; 8018 <L4>
    8010:       eb000019        bl      807c <_puts>
    8014:       ea000000        b       801c <L2>

00008018 <L4>:
    8018:       0000809c        muleq   r0, r12, r0

0000801c <L2>:
}
    801c:       e91ba800        ldmdb   r11, {r11, sp, pc}

00008020 <_someweirdfunction>:
    8020:       e1a0f00e        mov     pc, lr
.. bla bla ..
The arm-coff-objcopy program strips the header and debug information out of the executable file and makes it a flat binary, something equivalent to MS-DOS COM files. This means that the file starts executing at the first byte of the code. It has no stack reserved and assumes nothing about the operating system. All of that is left to you, the reader :p
How does the compiler link your assembler code against your C code (so that you can use the assembler functions from C)? Very easy: the linker just links the C object code against the assembler object code. It basically inserts branch instructions where the calls are (or you use inline assembler statements). Look at the debug file above.
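To see that those branch instructions really encode the call targets, the following sketch decodes an ARM B/BL instruction word by hand. The function names are invented for this example. It assumes the classic 32-bit ARM encoding, where bits 27..25 are 101, the low 24 bits are a signed word offset, and the offset is relative to the instruction address plus 8. The condition field is ignored.

```c
#include <stdint.h>

/* Returns 1 if insn is a B or BL instruction (bits 27..25 == 101). */
int arm_is_branch(uint32_t insn)
{
    return ((insn >> 25) & 0x7u) == 0x5u;
}

/* Target address of the B/BL instruction word insn located at addr. */
uint32_t arm_branch_target(uint32_t addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;
    /* Sign-extend the 24-bit field by hand. */
    int32_t off = (imm24 & 0x00800000u) ? (int32_t)imm24 - 0x01000000
                                        : (int32_t)imm24;
    /* The offset counts 4-byte words from addr + 8 (the ARM pipeline offset). */
    return addr + 8u + (uint32_t)(off * 4);
}
```

Fed with the two branch words from the debug file, arm_branch_target(0x8010, 0xeb000019) gives 0x807c and arm_branch_target(0x8014, 0xea000000) gives 0x801c, which matches what objdump printed.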
This is a problem you are going to run into sooner or later. When you compile programs using GCC, the resulting files have a header. This header specifies the starting address, how much stack to allocate, etc. For an example of a header, check out Startup state of Linux/i386 ELF binary. It gives a good overview of what the header does.
But in our case, after we link the program, we strip off the header so that only the binary opcodes are left. That is of course what we want, but then the addresses in the code are wrong. You see, when the linker links the files it assumes relocatable addresses, and the operating system adds the program's starting memory location to the addresses in the program when it loads it. This of course works only when somebody takes care of relocation. When the header is stripped, the program has the wrong addresses.
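What "taking care of relocation" means can be sketched in a few lines: a loader walks a list of fixups and adds the load address to every word that holds an absolute address. The function name and the fixup-list layout below are assumptions made for this example; real COFF relocation records carry more information than a plain list of word indexes.

```c
#include <stddef.h>
#include <stdint.h>

/* Add load_base to every 32-bit word of image whose index is listed in
 * fixups[]. Words that are not listed (ordinary instructions) stay as is. */
void apply_relocations(uint32_t *image, const size_t *fixups,
                       size_t nfixups, uint32_t load_base)
{
    for (size_t i = 0; i < nfixups; i++)
        image[fixups[i]] += load_base;
}
```

In a flat binary nobody runs this step, so any word holding an absolute address has to be correct already when the linker writes it.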
It's quite easy to fix this. You manually specify the memory address the program will run from. Thus, if your bootloader loads the program code at location 0x100000, that is the address you should tell the linker to link the program at. The flag to add is -Ttext=0x100000. Your new entry in the Makefile would look like this:
LD = ${ARCH}-ld -Ttext=0x100000
Why does every function name show up with a leading _ character? Good question. The GCC cross-compiler (and only the cross-compiler, not the user-land compiler) assumes that all function names, regardless of whether they are defined in assembler or C code, are prefixed with the _ character. This problem (or feature) makes it possible to differentiate between user-land functions and kernel functions. Both might have the same name, and prefixing the _ alleviates a lot of headaches that could otherwise be stumbled upon. Once the operating system is ported, the new compiler (which would be natively compiled) would not assume any such thing.
Actually, the answer lies in the configuration files for the target platform. If you build the cross-compiler for arm-elf, this _ issue will not be present, because ELF files can only be run in a user-land process, while arm-coff and arm-aout files don't have this restriction and can be run from memory without headers.
Since headers are not being used, a logical question to ask is: how do I know where the program will start? In most C books that you may have read, it is assumed that every meaningful program starts in a function called main. This is not necessarily true. In reality, it can start anywhere; it's the linker that decides what gets called when the program is invoked. Look at the -e parameter in ld --help. If you have a program with the function wazzup(), just link the program with ld -e wazzup and the program will start executing at wazzup(). This behavior only works if the program has a header.
We won't be using a header. So what we have to do is stack the object files together in the right order, with the entry code first:
arm-coff-ld entry.o klib.o kernel.o end.o -o kernel.out
arm-coff-objcopy --output-format=binary kernel.out kernel
This will produce a kernel binary file whose instructions start with those in entry.o. So make sure you don't have any data variables at the beginning of the entry.o file; otherwise the processor will try to execute them as instructions, and you will be pulling your hair out.