The instructions that a CPU fetches from memory and executes are not at all understandable
to human beings.
They are machine codes which tell the computer precisely what to do.
The hexadecimal number 0x89E5 is an Intel 80486 instruction which copies the
contents of the ESP register to the EBP register.
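To see how such a number carries meaning, the two bytes can be picked apart by hand. The following is only a minimal sketch, assuming 32-bit operand size and handling just the register-to-register form of this one opcode; the names decode_mov, mov_decoding and reg32_names are invented for illustration.

```c
#include <stdint.h>

/* The eight 32-bit general registers, indexed by their 3-bit encoding. */
const char *const reg32_names[8] = {
    "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"
};

/* Hypothetical result type: which register is copied to which. */
typedef struct {
    int valid;        /* 1 if the bytes were a register-to-register MOV */
    unsigned source;  /* index into reg32_names */
    unsigned dest;    /* index into reg32_names */
} mov_decoding;

/*
 * Decode the two-byte form "opcode 0x89, ModRM byte", passed with the
 * opcode in the high byte of insn.
 * Opcode 0x89 is MOV r/m32, r32: the ModRM 'reg' field (bits 5..3) names
 * the source register and, when the 'mod' field (bits 7..6) is 3, the
 * 'rm' field (bits 2..0) names the destination register.
 */
mov_decoding decode_mov(uint16_t insn)
{
    uint8_t opcode = (uint8_t)(insn >> 8);
    uint8_t modrm  = (uint8_t)(insn & 0xFF);
    mov_decoding d = { 0, 0, 0 };

    if (opcode == 0x89 && (modrm >> 6) == 3) {
        d.valid  = 1;
        d.source = (modrm >> 3) & 7;
        d.dest   = modrm & 7;
    }
    return d;
}
```

Calling decode_mov(0x89E5) gives source 4 and dest 5, which reg32_names maps to ESP and EBP.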
One of the first software tools invented for the earliest computers was an assembler,
a program which takes a human readable source file and assembles it into machine code.
Assembly languages explicitly handle registers and operations on
data and they are specific to a particular microprocessor.
The assembly language for an Intel X86 microprocessor is very different
from the assembly language for an Alpha AXP microprocessor.
The following simplified, Alpha AXP style assembly code (the mnemonics are illustrative rather than exact Alpha AXP instructions) shows the sort of operations that a
program can perform:
ldr r16, (r15) ; Line 1
ldr r17, 4(r15) ; Line 2
beq r16,r17,100 ; Line 3
str r17, (r15) ; Line 4
100: ; Line 5
The first statement (on line 1) loads register 16 from the
address held in register 15.
The next instruction loads register 17 from the next location in memory.
Line 3 compares the contents of register 16 with that of register 17
and, if they are equal, branches to label 100.
If the registers do not contain the same value then the program
continues to line 4 where the contents of r17 are saved into memory.
If the registers do contain the same value then no data needs to be saved.
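The same steps can be written out in C, one statement per assembly line, with local variables standing in for the registers. This is only a sketch: the function name store_if_different is invented, and the 4 byte spacing between the two memory locations is modelled by using 32-bit integers.

```c
#include <stdint.h>

/*
 * Mirror of the assembly example. 'base' plays the role of r15: it
 * holds the address of the first of two adjacent 32-bit values.
 */
void store_if_different(int32_t *base)
{
    int32_t r16 = base[0];   /* Line 1: load from the address in r15 */
    int32_t r17 = base[1];   /* Line 2: load from 4 bytes further on */

    if (r16 == r17)          /* Line 3: branch to the label if equal */
        goto label_100;

    base[0] = r17;           /* Line 4: save r17 into memory */

label_100:                   /* Line 5: both paths meet again here */
    return;
}
```

The goto is used here only to keep the correspondence with the branch instruction visible; ordinary C code would write the test the other way round and drop the label.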
Assembly level programs are tedious and tricky to write and prone to errors.
Very little of the Linux kernel is written in assembly language;
those parts that are have been written that way only for efficiency, and they are
specific to particular microprocessors.
The C Programming Language and Compiler
Writing large programs in assembly language is a difficult and time consuming task.
It is prone to error and the resulting program is not portable, being
tied to one particular processor family.
It is far better to use a machine independent language like C.
C allows you to describe programs in terms of their logical algorithms and the data
that they operate on.
Special programs called compilers read the C program and translate
it into assembly language, generating machine specific code from it.
A good compiler can generate assembly instructions that are very
nearly as efficient as those written by a good assembly programmer.
Most of the Linux kernel is written in the C language.
The following C fragment:
if (x != y)
x = y ;
performs exactly the same operations as the previous example assembly code.
If the contents of the variable x are not the same as the contents
of variable y then the contents of y will be copied to x.
C code is organized into routines, each of which performs a task.
Routines may return a value of almost any data type supported by C (arrays being the main exception), or no value at all.
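As a small illustration, the fragment shown earlier can be wrapped in a routine of its own, alongside a second routine that returns a different data type. The names sync_value and average are invented for this sketch.

```c
/*
 * Apply the "if (x != y) x = y;" fragment to local copies of the
 * arguments and hand the resulting value of x back to the caller.
 */
int sync_value(int x, int y)
{
    if (x != y)
        x = y;
    return x;
}

/* A routine returning a floating point value rather than an integer. */
double average(int a, int b)
{
    return (a + b) / 2.0;
}
```

Each routine does one job and reports its result through its return value, so the caller does not need to know how the work was done.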
Large programs like the Linux kernel comprise many separate C source
modules each with its own routines and data structures.
Each of these C source code modules groups together logically related functions, such as the
filesystem handling code.
C supports many types of variables, or locations in memory which
can be referenced by a symbolic name.
In the above C fragment x and y refer to locations in memory.
The programmer does not care where in memory the variables are put; it
is the linker (see below) that has to worry about that.
Variables can contain different sorts of data, such as integers
and floating point numbers, and others are pointers.
Pointers are variables that contain the address, that is the location in memory,
of other data.
Consider a variable called x; it might live in memory at address 0x80010000.
You could have a pointer, called px, which points at x.
px might live at address 0x80010030.
The value of px would be 0x80010000: the address of the variable x.
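The relationship between x and px can be sketched in a few lines of C. The function name pointer_demo is invented, and the addresses printed will be whatever the system happens to choose rather than the example values used above.

```c
#include <stdio.h>

int pointer_demo(void)
{
    int x = 10;      /* x lives at some address in memory */
    int *px = &x;    /* px is a pointer: its value is the address of x */

    printf("x  lives at %p and holds %d\n", (void *)&x, x);
    printf("px lives at %p and holds %p\n", (void *)&px, (void *)px);

    *px = 42;        /* writing through px changes x itself */
    return x;
}
```

The second line of output shows that px has an address of its own, and that the value stored there is the address printed for x on the first line.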
C allows you to bundle together related variables into data structures.
typedef struct {
int i ;
char b ;
} my_struct ;
is a data structure called my_struct which contains two elements,
an integer (32 bits of data storage) called i and a character (8 bits
of data) called b.
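A short sketch of how such a structure might be filled in and measured follows. The helper names make_my_struct and my_struct_padding are invented, and the size of an int is whatever the compiler in use provides (commonly, but not always, 32 bits).

```c
#include <stddef.h>

typedef struct {
    int i;
    char b;
} my_struct;

/* Build a my_struct with both of its elements filled in. */
my_struct make_my_struct(int i, char b)
{
    my_struct s;

    s.i = i;    /* elements are reached with the '.' operator */
    s.b = b;
    return s;
}

/*
 * The compiler may add unused padding bytes so that elements stay
 * suitably aligned; this reports how many bytes of padding it added.
 */
size_t my_struct_padding(void)
{
    return sizeof(my_struct) - (sizeof(int) + sizeof(char));
}
```

Because of that padding, the structure as a whole often takes up more memory than the sum of its elements.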
Linkers
Linkers are programs that link together several object modules and libraries
to form a single, coherent program.
Object modules are the machine code output from an assembler or compiler and contain
executable machine code and data together with information that allows the linker
to combine the modules together to form a program.
For example one module might contain all of a program's database functions and
another module its command line argument handling functions.
Linkers fix up references between these object modules, where a routine or data
structure referenced in one module actually exists in another module.
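The following sketch shows the kind of cross-module reference that a linker resolves. Both "modules" are shown in one listing so that it compiles on its own; in a real program each part would be a separate source file compiled to its own object module. The file names args.c and db.c and all of the identifiers are invented for illustration.

```c
/* ---- would live in args.c: uses things defined in another module ---- */

extern int db_record_count;      /* data structure defined in db.c */
int db_add_record(int value);    /* routine defined in db.c */

/* Add one record per command line argument; return the record count. */
int handle_arguments(int argc)
{
    int i;

    for (i = 1; i < argc; i++)
        db_add_record(i);
    return db_record_count;
}

/* ---- would live in db.c: the actual definitions ---- */

int db_record_count = 0;

int db_add_record(int value)
{
    (void)value;                 /* a real database would store this */
    return ++db_record_count;
}
```

With the two parts in separate files, something like "cc -c args.c db.c" would produce args.o and db.o, and the object module args.o would contain unresolved references to db_record_count and db_add_record. Linking the two object modules together is what fills in their final addresses.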
The Linux kernel is a single, large program linked together from its
many constituent object modules.