Porting Minix C/assembler source code to iPAQ (StrongARM CPU)
Konrad Rzeszutek, Paul Gonin, Venkata Mahadevan, Suchada Phalachaipiromsil

A HOWTO guide explaining the necessary steps in porting an operating system (in this case Minix) to another platform: how to set up and fix a cross-compiler, how to compile assembler code along with C code, and how to deal with weird errors and problems.
______________________________________________________________________

Table of Contents

1. Introduction
2. How this guide is organized.
3. Explanation of terms.
4. Building a cross-toolchain
   4.1 Getting the right sources
   4.2 Installing binutils
   4.3 Setting up headers for the cross-compiler
   4.4 Building GCC.
   4.5 Thoughts
5. Porting Minix
   5.1 The problems with Cross-compiling Minix
   5.2 Obtaining a Bootloader for the Ipaq
   5.3 Makefile
      5.3.1 arm-coff-gcc
      5.3.2 arm-coff-ld
      5.3.3 arm-coff-objdump
         5.3.3.1 hello.c
         5.3.3.2 uart.s
         5.3.3.3 debug
      5.3.4 arm-coff-objcopy
   5.4 C/Assembler linkage
   5.5 Text address
   5.6 What's the deal with the "_function_name"?
   5.7 Since I'm not using headers, how do I know where the program will start from?
6. The porting process.
   6.1 Get to know thyself
   6.2 Some thoughts
   6.3 Providing Input to the Ipaq
   6.4 Bootloader installation
   6.5 Printk and Standard Output
      6.5.1 boo.s
   6.6 Assembler Libraries (klib and mpx)
   6.7 Setting up a stack for the C Environment
      6.7.1 head.s
7. Conclusions and Future work
8. Links
   8.1 Websites
   8.2 Newsgroups
   8.3 IRC channels
______________________________________________________________________

1. Introduction

This document describes the steps required to build a cross-platform compiler and how to use it, gives some information about iPAQ StrongARM assembler, and explains how to make a libc library and kernel libraries, how to compile assembler code against C code, and how to deal with weird errors.

The authors of this guide assume prior experience with C and assembler language. Knowledge of the C language is a must. We also assume that you know UNIX and have basic knowledge of the GNU suite of programs.

2. How this guide is organized.

Steps:

  o Getting/installing a cross-compiler suite.
  o Getting/using a bootloader on the target platform.
  o Compiling assembler code.
  o Compiling C code along with assembler code and linking 'em together.
  o Drinking lots of coffee.

3. Explanation of terms.

This guide uses a lot of weird names, phrases and such. To facilitate quicker assimilation of the information, this little section is a must-read.

Big-endian vs little-endian
  Imagine that you are trying to assemble a simple instruction:

      mov r1,r0

  When the assembler assembles it, it comes out as something like this:

      e1a01000

  But if you are using a little-endian target, this will be written into the binary file as:

      0010a0e1

  Side note: endianness can also describe the ordering of bits within a byte (bit-ordering endianness), in which respect the StrongARM is big-endian. In our case we are concerned with byte-ordering endianness, which deals with the order of the bytes within a 4-byte sequence. The StrongARM is a 32-bit processor which uses little-endian words (4 bytes).

Minix
  A simple microkernel operating system. Check out the source code at the Minix repository.

text location
  The location (an absolute 32-bit address) that the linker should assume the kernel will be running from.
  You see, when the linker links the code, it uses absolute addresses to refer to string variables and such. So if you don't specify the location in memory at which the linker should assume the program (kernel) will be loaded, your program (kernel) will display garbage. Here is an example. Let's assume we want to load the pointer to a string variable into a register:

      puts("boo!");
          800c: e59f0004    ldr   r0, [pc, #4]    ; 8018
          8010: eb000019    bl    807c <_puts>
          8014: ea000000    b     801c

      00008018 :
          8018: 0000809c    muleq r0, r12, r0

      0000809c :
          809c: 216f6f62    cmncs pc, r2, ror #30 ; "boo!"

  ldr loads a word into a register, and bl is a branch with return (roughly equivalent to the i386 call). As you can see, the register r0 now contains the contents of location 8018, which is 0000809c. That value is the address of the string; the word stored at 809c, 216f6f62, is "boo!" in hexadecimal (the bytes 62 6f 6f 21 in little-endian order). The problem is that when you deploy this program (kernel) onto a system, you might not put the code at 00008000, but somewhere else. So when your program (kernel) runs, it will fetch the data from the wrong location! That is why you need to change the text location to the address where you will put the program (kernel). The argument for the linker is -Ttext=0x[some-value-here]

Cross-toolchain
  A toolchain actually consists of a number of components. The main one is the compiler itself, gcc, which can be native to the host or a cross-compiler. It is supported by binutils, a set of tools for manipulating binaries. These components are all you need for compiling the kernel, but almost anything else you compile also needs the C library, glibc. As you will realize if you think about it for a moment, compiling the compiler poses a bootstrapping problem, which is the main reason why generating a toolset is not a simple exercise.

4. Building a cross-toolchain

If you have access to a prebuilt cross-toolchain, it might be easier to install and use that one.
However, you might run into problems with it: the headers might contain wrong information, you might not be able to change the linker's text location, or it might be built for big-endian instead of little-endian. On the other hand, most (if not all) of them work without problems. Here is a list of websites with cross-toolchains:

  o Arm Linux v4l Cross Tool Chain
  o ARM Software Development Toolkit release 2.02u (non-commercial license for educational institutions)

But if you want to be a geek and have a desire to build a cross-toolchain from scratch, go ahead and read this section. This is naturally not the only cross-toolchain build guide in the world. If you have trouble comprehending this section, you might consider visiting these great sites:

  o Building the Toolchain
  o HOWTO Build a Cross Toolchain in Brief

Keep in mind that most of the documentation regarding building the cross-compiler came from the HOWTO Build a Cross Toolchain in Brief.

4.1. Getting the right sources

We are going to build an ARM cross-toolchain (cross-platform compiler, cross-platform binutils) for the arm-coff file format (you could pick the arm-elf or arm-aout format as well). This format lets us produce flat (standalone) binaries that are not tied to any operating system. The sources will reside in /ipaq/src, with /ipaq/build as the build directory and /ipaq/local as the installation prefix.

Necessary steps:

  o Get the source code for binutils: GNU binutils or the Chain Tools sources.
  o Get the source code for the GCC compiler: GNU GCC or the Chain Tools sources.
  o Get the patches for the gcc 2.95.2 compiler: gcc-2.95.2-diff.991022 and gcc-fold-const.patch. Both are available in the Toolchain sources or the Chain Tools sources.
  o Get the Linux kernel sources for StrongARM, for example from the Chain Tools sources (section 4.3 says where else to get them).

Uncompress all the files in the /ipaq/src source directory.
You should now have three directories: binutils-2.9.5.0.22, gcc-2.95.2, and linux. Apply the two patches:
______________________________________________________________________
cd /ipaq/src/gcc-2.95.2
patch -p0 < ../gcc-2.95.2-diff-991022
cd gcc
patch -p0 < ../../gcc-fold-const.patch
______________________________________________________________________

Create the build directories. These are the directories where the programs will be built. In this document, /ipaq/build is the build directory.
______________________________________________________________________
mkdir /ipaq/build
mkdir /ipaq/build/binutils-2.9.5.0.22
mkdir /ipaq/build/gcc-2.95.2
______________________________________________________________________

4.2. Installing binutils

Binutils are your friends. They are essentially the basic tools the cross-compiler needs in order to function. They include utilities such as objcopy, objdump, as, ar, strip, ld and a bunch of others.

  o Choose the prefix for the new toolchain. This is a fixed directory where the toolchain will forever reside (unless you rebuild the toolchain). As mentioned before, we will use /ipaq/local in this document.
  o Choose the target for the new toolchain. In our case we are using arm-coff, since it is one of the formats that provides a flat format (standalone, not tied to any operating system).
  o In the build directory (/ipaq/build/binutils-2.9.5.0.22), run:
______________________________________________________________________
/ipaq/src/binutils-2.9.5.0.22/configure --target=arm-coff --prefix=/ipaq/local --host=i386
make
make install
______________________________________________________________________

And now you will find a bunch of new applications in /ipaq/local/bin.
Make sure you add this directory to your PATH before proceeding with compiling the cross-compiler!

4.3. Setting up headers for the cross-compiler

For the cross-compiler to build correctly, it is necessary to have the include files from Linux. Even though you are going to compile programs using your own include headers, this step is still necessary. If you skip it, you won't be able to compile the GNU gcc cross-compiler. The only reason for doing this is so that GCC will compile; you can remove the include files later on.

Dirty secret: GCC uses the include files to make its libgcc.a file. If you are going to link source files using gcc, then you need this file. If you aren't, and you are going to use your own libraries, you won't use this file.

  o Get the ARM Linux kernel. Get it from the Handhelds.org website or the Chain Tools sources.
  o Uncompress it and, in its source directory (kernel/linux), type:
______________________________________________________________________
make dep
______________________________________________________________________
  o Copy the include/asm-arm and include/linux directories:
______________________________________________________________________
cd /ipaq/local/arm-linux
mkdir include
cd include
cp -dR /ipaq/src/kernel/linux/include/asm-arm ./
cp -dR /ipaq/src/kernel/linux/include/linux ./
______________________________________________________________________

4.4. Building GCC.

  o Get into the gcc source directory:
______________________________________________________________________
cd /ipaq/src/gcc-2.95.2/gcc/config/arm
______________________________________________________________________
    and with your favorite editor (hint: vi) edit the file t-linux.
    Append -Dinhibit_libc -D__gthr_posix_h to the line that says:

        TARGET_LIBGCC2_CFLAGS = -fomit-frame-pointer -fPIC

    which should result in:

        TARGET_LIBGCC2_CFLAGS = -fomit-frame-pointer -fPIC -Dinhibit_libc -D__gthr_posix_h

  o Get into your build directory and run the configure program:

        cd /ipaq/build/gcc-2.95.2
        /ipaq/src/gcc-2.95.2/configure --target=arm-coff --host=i386-pc-linux-gnu --prefix=/ipaq/local --disable-threads --with-cpu=strongarm110 --enable-languages=c

  o Compile the program:

        make
        make install

If there are problems, such as:

  o Couldn't find stdlib.h and unistd.h. That means your -Dinhibit_libc flag wasn't passed to the Makefiles during auto-configuration. Edit the Makefile in the /ipaq/build/gcc-2.95.2/gcc directory. Find where it says:

        GCC_CFLAGS=$(INTERNAL_CFLAGS) $(X_CFLAGS) $(T_CFLAGS) $(CFLAGS) -I./include $(TCFLAGS)

    and add -Dinhibit_libc to it, so that it looks like:

        GCC_CFLAGS=$(INTERNAL_CFLAGS) $(X_CFLAGS) $(T_CFLAGS) $(CFLAGS) -I./include $(TCFLAGS) -Dinhibit_libc

  o Error: no such 386 instruction. The host's i386 tools are being picked up instead of the cross tools. Make sure your PATH variable contains the directory /ipaq/local/bin, where the arm-coff tools (arm-coff-ld and friends) reside.
  o That's it. Just do make install.

4.5. Thoughts

Instead of arm-coff you can use arm-aout if you want to.

5. Porting Minix

Now that the process of building a cross-toolchain for the ARM family of microprocessors has been described, we turn our attention to the issues involved in porting the Minix kernel to the Ipaq.

The Minix kernel (like any other operating system) has its own set of libraries. Usually all of those are called kernel libraries and are only used in the kernel. The user-land libraries are libc.
Since the user-land libraries use the kernel heavily (syscalls), naturally the kernel must implement the syscall facility. That is what the different parts of the kernel do. When a syscall gets invoked, a function inside the kernel gets called and handles the dispatch. Simple, eh? Unfortunately, some operating systems don't keep all the library functions completely separated. Thus you end up using some of the functions from the standard libc libraries, such as printf, strcpy, etc. Granted, if these were implemented in the kernel libraries, a certain unnecessary duplication would arise. And this is where your dilemma is. You can port the kernel libraries and the standard C libraries (libc). Or you can port the kernel libraries and only the necessary functions from the standard C libraries. The latter is the easiest to do (but later on you have to clean up the code, because it's a mess).

We chose to port the kernel libraries and only the necessary user-land functions from the standard C libraries (libc).

5.1. The problems with Cross-compiling Minix

The whole point of going through the complex process of building a cross-toolchain for ARM processors under Linux is that there is no version of Minix available for the ARM processor family that includes the Amsterdam Compiler Kit (ACK) that Minix uses. Therefore the following steps had to be undertaken to compile the Minix kernel under the cross-toolchain environment that we set up under Linux i386:

  o Makefiles: must be modified to work under Linux Make, including setting explicit paths to find the include files and libraries.
  o Compiler: GCC 2.95.2 built as a cross-compiler for ARM targets (arm-coff-gcc).
  o Assembler: GNU assembler.
  o Linker: GNU ld (arm-coff-ld).
  o Assembler code: must be translated from Amsterdam assembler format to GNU assembler format.
    This is a time-consuming process because the syntax of the two assemblers is quite different. Minix i386 assembler code must also be converted to ARM assembler code in order to compile and execute on the Ipaq.
  o Build tools: binutils, with tools such as objcopy, objdump, as, ar, strip, and ld.

5.2. Obtaining a Bootloader for the Ipaq

In order to boot the Minix kernel (or any other kernel for that matter), a bootloader for the target platform must be procured. The bootloader initializes the "raw" hardware of the computer and allows an operating system (or stand-alone application) to be started by jumping to a particular memory address that contains the binary code of the program. A bootloader for the Ipaq developed by Compaq's Cambridge Research Laboratory is available for download at Handhelds.org.

5.3. Makefile

It's always easier to write code when you have a Makefile. It will save you hours of retyping commands. For more information on how to use and write Makefiles, check out the Makefile documentation or any website that teaches about Makefiles.

For testing programs, the following Makefile is pretty good:
______________________________________________________________________
###############################################################################
# Test build: link hello.c and uart.S into a flat binary called A
###############################################################################
ROOT = ..
ARCH = /ipaq/local/bin/arm-coff
AR = ${ARCH}-ar
CC = ${ARCH}-gcc
LD = ${ARCH}-ld -N
OC = ${ARCH}-objcopy
OD = ${ARCH}-objdump
CFLAGS = -g -Wall -D_MINIX -D_WORD_SIZE=4 -D_EM_WSIZE=4
BASIC = Makefile

# Link, dump a disassembly into "debug", then strip down to raw binary.
all: uart.o hello.o
	$(LD) hello.o uart.o -o A.o
	$(OD) -DSs A.o > debug
	$(OC) --output-format=binary A.o A
###############################################################################
uart.o: uart.S
	$(CC) $(CFLAGS) -c uart.S -o uart.o
hello.o: hello.c
	$(CC) $(CFLAGS) -c hello.c -o hello.o
______________________________________________________________________

5.3.1. arm-coff-gcc

The most important flag in this case is -c, which instructs the gcc compiler not to run the linker. The -o flag names the output file.

5.3.2. arm-coff-ld

arm-coff-ld is run with the input object files (hello.o uart.o) and with the output A.o. Only these two files are linked. So if you are using printf or other functions from libc, the linker won't be able to find the libc library and will complain about it. You will have to add the -L flag (together with the matching -l flag) to point the linker at a library such as libgcc, or implement (port) those functions yourself.

5.3.3. arm-coff-objdump

This handy little utility disassembles all the binary opcodes in the file and interleaves them with the source code. It requires the input file to have a header (which we get after linking the files). Look below for the example:

5.3.3.1. hello.c
______________________________________________________________________
/* Defined in uart.s; declared here so that -Wall does not complain. */
void someweirdfunction(const char *s);

void gccmain()
{
    someweirdfunction("boo!");
}
______________________________________________________________________

5.3.3.2. uart.s
______________________________________________________________________
        .text
        .global _someweirdfunction      @ C name: someweirdfunction (see section 5.6)
_someweirdfunction:
        mov     pc, lr                  @ return to the caller
______________________________________________________________________

5.3.3.3. debug
______________________________________________________________________
A.o:     file format coff-arm-little

Contents of section .text:
 8000 0dc0a0e1 00d82de9 04b04ce2 04009fe5  ......-...L.....
 ... bla bla ..
Disassembly of section .text:

00008000 <_gccmain>:
void gccmain()
{
    8000: e1a0c00d    mov   r12, sp
    8004: e92dd800    stmdb sp!, {r11, r12, lr, pc}
    8008: e24cb004    sub   r11, r12, #4    ; 0x4

0000800c :
        someweirdfunction("boo!");
    800c: e59f0004    ldr   r0, [pc, #4]    ; 8018
    8010: eb000019    bl    807c <_puts>
    8014: ea000000    b     801c

00008018 :
    8018: 0000809c    muleq r0, r12, r0

0000801c :
}
    801c: e91ba800    ldmdb r11, {r11, sp, pc}

00008020 <_someweirdfunction>:
    8020: e1a0f00e    mov   pc, lr
 .. bla bla ..
______________________________________________________________________

5.3.4. arm-coff-objcopy

This program enables one to strip the header and debug information out of the executable file and make it completely binary, something equivalent to MS-DOS COM files. This means that the file starts executing at the first byte of the code, has no stack reserved, and assumes nothing about the operating system. All is left to you, the reader :p

5.4. C/Assembler linkage

How does the compiler link your assembler code against the C libraries (so that you can use the functions from your C code)? Very easy: it just links your C code against the assembler code. It basically inserts branch instructions to the assembler routines (or inline assembler statements). Look above in the debug file.

5.5. Text address

This is a problem you are going to run into sooner or later. When you compile programs using GCC, the files have a header. This header specifies the starting address, how much stack to allocate, etc. For an example of a header, check out Startup state of Linux/i386 ELF binary. It gives a good overview of what the header does. But in our case, after we link the program, we strip off the header so that only binary opcodes are left.
That is of course what we want, but then the addresses in the code are wrong. You see, when the linker links the files it assumes a relocatable address, and the operating system adds the program's starting memory location to (or subtracts it from) the memory addresses in the program. This of course works only when you have somebody taking care of relocation. When the header is stripped, the program has the wrong addresses.

It's quite easy to fix this. You manually specify the address the program should run from (in memory). Thus, if your bootloader allows you to load the program code at location 0x00100000, then that is the address you should tell the linker to link the program with. The flag you add is -Ttext=0x00100000. Your new entry in the Makefile would look like this:

    LD = ${ARCH}-ld -Ttext=0x100000

5.6. What's the deal with the "_function_name"?

Good question. The GCC cross-compiler built here (and only the cross-compiler, not the native compiler on the Linux host) prefixes the name of every function, regardless of whether it is defined in assembler or in C code, with the _ character. A C function foo is therefore the symbol _foo as far as the assembler and the linker are concerned, and an assembler routine that is to be called from C must carry the leading underscore in its label. One side effect is that C names cannot collide with plain assembler labels of the same name, which avoids some headaches.

The actual answer lies in the configuration files for the target platform. The arm-coff and arm-aout targets follow the older convention of prefixing C symbols with an underscore, while ELF targets do not. If you build the cross-compiler for arm-elf, this _ issue will not be present.

5.7. Since I'm not using headers, how do I know where the program will start from?

Since headers are not being used, a logical question to ask is: how do I know where the program will start from? Most C books you may have read assume that every meaningful program starts in a function called main. This is not necessarily true. In reality, it can start anywhere; it is the linker that decides what gets called when the program is invoked. Look at the -e parameter in your ld --help. If you have a program with the function wazzup(), just link the program with ld -e wazzup and the program will start executing at the function wazzup().

This behavior only works if the program has a header. We won't be using a header. So what we have to do is to stack the object files together in the right order:

    arm-coff-ld entry.o klib.o kernel.o end.o -o kernel.coff
    arm-coff-objcopy --output-format=binary kernel.coff kernel

(The linked image is written to kernel.coff here so that it does not overwrite the kernel.o input.) This will produce a kernel binary file whose first instructions are the ones at the start of entry.o. So do make sure you don't have any data variables at the beginning of the entry.o file, otherwise you will be pulling your hair out.

6. The porting process.

This section explains how we attacked the problem of porting Minix v2.0.2 (i386) to an iPAQ (StrongARM CPU). It's quite technical, and many hours of sweat have been poured over our stupid mistakes, so don't dare to e-mail us any corrections/ideas. We will silence you :p

6.1. Get to know thyself

Actually, it should be "Get to know the compiler". Understand from the previous section how the compiler works, how to cross-link assembler and C source code, how to deal with the text location, and how to work around the lack of libc support. If you think you have a grasp on that, then you are good to go.

6.2. Some thoughts

Modularize, modularize, and modularize everything you can. Write little test suites to make sure that your code works. Do daily backups of your data. Don't drink too much coffee; you will end up spilling it on the keyboard.

6.3. Providing Input to the Ipaq

As mentioned in an earlier section, input to the Ipaq can be achieved in two ways: via its touch screen or via its serial port. The latter is the only feasible method in this case, because the touch screen requires some sort of windowing system and a touch screen driver to be usable. Therefore, the Ipaq had to be connected to the host PC (running Linux) via a serial cable. The terminal program Minicom was used to transfer data (such as files and keystrokes from the host PC) to the Ipaq. Output from the Ipaq was returned to Minicom via the same serial cable. Any terminal program can be used, provided the following settings are maintained: 115,200 bps, 8 data bits, no parity bits, and 1 stop bit. Flow control must also be disabled in order for this to work. For screenshots of our accomplishment, visit the iPAQ Linux success! URL.

6.4. Bootloader installation

Instructions for installing the bootloader can be found at the Handhelds.org website. The bootloader was exactly what we wanted. After we looked at the source code, it became evident that we had multiple ways of loading our kernel. One of them is to load it into memory (load ram), which means that our starting position becomes 0xc0000000. That is of course the address that must be passed to the linker as the text location (-Ttext=0xc0000000). Otherwise you are screwed.

6.5. Printk and Standard Output

It is obviously essential to be able to print to standard output in order to print messages and other information when testing and debugging the kernel.
On the PC, output to the screen is achieved by simply writing data to the frame buffer of the video card. Due to the Ipaq's radically different architecture, it is not possible to do this in a manner similar to the PC. Therefore, the simplest means of outputting information from the Ipaq is to write data to its serial port. The information then shows up in the window of the terminal program (Minicom) running on the host PC. The following snippet of ARM assembler code does this:

6.5.1. boo.s
______________________________________________________________________
        .text
        .global putc            @ void putc (char x)
        .global uart_init       @ void uart_init (void)

#define UTCR0   0x00
#define UTCR1   0x04
#define UTCR2   0x08
#define UTCR3   0x0c
#define UTDR    0x14
#define UTSR0   0x1c
#define UTSR1   0x20

#define BAUDRATE 115200
#define BAUD_DIV ((230400/BAUDRATE)-1)

        .align
test_putc:
        mov     r0, #'M'
        b       putc                    @ tail call: putc returns to our caller

putc:
        ldr     r3, UART_BASE
        @ NOTE: the original listing branched to a local label "1" that was
        @ not shown. The wait-and-send loop below is a reconstruction and
        @ assumes UTSR1 bit 2 (0x04) is TNF, "transmit FIFO not full".
1:      ldr     r2, [r3, #UTSR1]
        tst     r2, #0x04
        beq     1b                      @ FIFO full, keep waiting
        str     r0, [r3, #UTDR]         @ send the character
        mov     pc, lr

uart_init:
        ldr     r3, UART_BASE
        mov     r1, #0
        str     r1, [r3, #UTCR3]        @ disable the UART while configuring
        mov     r1, #0x08               @ 8N1
        str     r1, [r3, #UTCR0]
        mov     r1, #BAUD_DIV
        str     r1, [r3, #UTCR2]        @ low bits of the baud divisor
        mov     r1, r1, lsr #8
        str     r1, [r3, #UTCR1]        @ high bits of the baud divisor
        mov     r1, #0x03               @ RXE + TXE
        str     r1, [r3, #UTCR3]
        mov     r1, #0xff               @ flush status reg
        str     r1, [r3, #UTSR0]
        mov     pc, lr

UART_BASE: .long 0x80050000             @ UART3
______________________________________________________________________

The test_putc routine outputs the character 'M' to standard output, which in this case is the serial port of the Ipaq. At the host PC, the character 'M' appears in the terminal program's window. The function uart_init is responsible for initializing the serial port on the Ipaq, and putc writes a single character to the serial port.

Modifying this code to print a sequence of characters, i.e. a string, is relatively simple. Since the putc function has been declared global, it can be accessed from a C program (with the arm-coff toolchain the assembler label has to be spelled _putc, as explained in section 5.6).
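A minimal sketch of such a C caller is shown here. It is not taken from the Minix sources: kputs is a hypothetical helper name, and kputc is a host-side stand-in for the assembler putc that appends to a buffer instead of touching the UART, so that the loop can be compiled and tried on the development PC first.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-in for the assembler putc (hypothetical name): instead
 * of writing to the UART it appends to a buffer that can be inspected.
 * On the iPAQ this would be replaced by the routine from boo.s. */
#define OUT_MAX 128
static char out_buf[OUT_MAX];
static size_t out_len = 0;

static void kputc(char c)
{
    if (out_len < OUT_MAX - 1) {        /* keep room for the terminator */
        out_buf[out_len++] = c;
        out_buf[out_len] = '\0';
    }
}

/* Print a NUL-terminated string by handing it to kputc one character
 * at a time (hypothetical helper, not part of Minix). */
static void kputs(const char *s)
{
    assert(s != NULL);
    while (*s != '\0') {
        kputc(*s);
        s++;
    }
}
```

On the target, the stand-in would be dropped in favour of a declaration of the assembler routine, and a call such as kputs("boo!") would then send the four characters down the serial line in order.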
Within the C program, a loop that repeatedly calls the putc function, passing it one character at a time, can then print out a character sequence. Alternatively, a puts function that does the same thing can be coded in assembler. The latter method is probably more efficient.

However, this code is not entirely sufficient for our purposes. For example, suppose we want to print formatted output such as decimal and hexadecimal numbers. Since we are not using any libc, we cannot use functions such as printf. However, the Minix kernel has a function called printk, which is short for kernel print. Kernel print uses a put-character function (which we already have, see above) to print formatted output. This function is written in C, so it can be linked with the assembler code above. Please refer to the Minix v2.0.2 source for a full listing of this function.

6.6. Assembler Libraries (klib and mpx)

There are two main libraries used in the Minix i386 kernel. These are klib386.s and mpx.s. Hence, the first task in porting the Minix kernel to the Ipaq is to convert these libraries to run on the StrongARM CPU. Due to the vast differences in hardware between an Ipaq and an IBM PC, some functions required on the PC side may not be required on the Ipaq and vice versa.

The klib contains a number of assembly code utility routines required by the kernel. In the klib386.s for the IBM PC, there are many routines that are PC-specific, such as functions to copy and move data to the frame buffer of the video card.
To get a minimalist Minix kernel running on the Ipaq, the following functions from klib386.s were ported to the StrongARM CPU to form a new library called klibsa1110.s:
______________________________________________________________________
void phys_copy(phys_bytes source, phys_bytes destination, phys_bytes bytecount);
void lock();
void unlock();
void exit();
______________________________________________________________________

The phys_copy function simply copies a block of physical memory. The source, destination, and bytecount are represented as unsigned longs. The implementation for the StrongARM CPU loads 4 words at a time and stores them back to memory.

The lock() and unlock() functions disable and enable CPU interrupts respectively.

The exit() function is provided merely as a convenience for library routines that use exit. Since no calls to exit can actually occur within the kernel, a dummy version that simply returns to the caller is implemented.

The mpx library handles process switching and message handling. All transitions to the kernel go through this library. Interrupts (software and hardware) and calls that send or receive messages can cause transitions to the kernel. The most important function in this library is the system call (s_call) function. The system call function handles interrupts caused by system calls (software interrupts, SWIs). This was the only function ported to the StrongARM from the mpx.s library.

6.7. Setting up a stack for the C Environment

The kernel must be allocated a stack in main memory for it to operate in. The following snippet of assembler code does this:

6.7.1. head.s
______________________________________________________________________
        .text
        .align  2
        .global boo
boo:
        ldr     sp, STACK               @ point sp at the top of the kernel stack
        bl      _start                  @ enter the C environment
hang:   b       hang                    @ if _start returns, do not run into the data below
STACK:  .long   0xc0080000
______________________________________________________________________

7. Conclusions and Future work

The following goals were achieved in our attempt to port the Minix 2.0.2 OS kernel to the Ipaq:

  o A usable cross-compiler suite that generates code executable on the Ipaq was built.
  o Several assembler language functions from the kernel libraries were ported to run on the StrongARM. Testing confirmed their validity.
  o Standard output functions, including formatted output, were successfully demonstrated on the Ipaq.
  o C and ARM assembly language functions were successfully linked.
  o A good deal was learned about the Minix operating system and operating systems in general.

This project could be continued and completed given additional time. A lot of the groundwork has already been done. What remains is finishing the remaining assembler functions in the kernel, modifying some of the include files, and attempting a full kernel compile.

Another possibility for continuing this project would be switching from Minix to uCLinux. Although uCLinux is still a monolithic operating system like Linux, the small size of its kernel and its embedded capabilities make it more akin to Minix. The main advantage of using uCLinux is that GCC is used as its C compiler, so there is no need to spend time modifying C code and include files when porting. The uCLinux developer community is also much more active than the Minix one, so obtaining assistance when undertaking a project of this magnitude is easier.

A Corel Netwinder would also be useful in continuing this project. A Netwinder is a StrongARM-based computer system intended to serve as a web server. NetBSD and Linux have already been ported to this platform, so using this system as a development environment for the Ipaq is feasible.

8. Links

8.1. Websites

  o GNU Toolchain for ARM targets info
  o Arm Linux v4l Cross Tool Chain
  o ARM Software Development Toolkit release 2.02u
  o Building the Toolchain
  o HOWTO Build a Cross Toolchain in Brief
  o GNU binutils
  o Linux Kernel for StrongARM
  o Startup state of Linux/i386 ELF binary

8.2. Newsgroups

  o comp.sys.arm
  o comp.os.minix

8.3. IRC channels

  o #ipaq on irc.openprojects.net