The diagram in Figure 1 shows how a process in Unix is typically laid out. The absolute addresses are for SPARCs running SunOS; where the user text (code) begins, where the separation between kernel address space and user address space lies, where the user stack starts, and whether it grows up or down are all operating system / processor architecture dependent. (For example, HPPAs running HP-UX have a stack which grows up rather than down; Alphas running OSF/1 have gaps between user stack and kernel space, and between user data and user text.)
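Since the layout is platform dependent, it can be probed at run time. The following is a minimal sketch, not part of the original notes: the names `stack_grows_down`, `callee_is_lower`, and `print_layout` are hypothetical, and comparing addresses of objects in different frames is only meaningful as an experiment on a conventional flat address space.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int global_data = 1;   /* lives in the initialized data segment */

/* Hypothetical helper: reports whether a local in the callee's frame sits
 * at a lower address than a local in the caller's frame. */
static int callee_is_lower(uintptr_t caller_local_addr)
{
    volatile char callee_local = 0;
    return (uintptr_t)&callee_local < caller_local_addr;
}

/* Returns 1 if the stack appears to grow toward lower addresses, else 0. */
int stack_grows_down(void)
{
    volatile char caller_local = 0;
    /* Calling through a volatile function pointer discourages inlining,
     * which would place both locals in the same frame. */
    int (*volatile probe)(uintptr_t) = callee_is_lower;
    return probe((uintptr_t)&caller_local);
}

/* Prints one sample address from each region of the process image. */
void print_layout(void)
{
    int stack_var = 0;
    void *heap_var = malloc(1);

    printf("text  (function): %p\n", (void *)(uintptr_t)&print_layout);
    printf("data  (global):   %p\n", (void *)&global_data);
    printf("heap  (malloc):   %p\n", heap_var);
    printf("stack (local):    %p\n", (void *)&stack_var);
    printf("stack grows %s\n", stack_grows_down() ? "down" : "up");

    free(heap_var);
}
```

Calling `print_layout()` from `main` on different systems should show different absolute addresses, and on an upward-growing stack the last line would be expected to read "up".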
C programs use the (downward-growing) stack for two major tasks:
(1) storing the return address to the caller, and (2) allocating space for
automatic variables (variables local in scope to a C function that are
not declared static). Suppose we had the following C
fragment:
void procA(int argA1,
           int argA2)
{
    char buffA1[BUFSZ];
    int autoA1;
    ...
}

void procB(int argB1,
           int argB2,
           int argB3)
{
    int autoB1, autoB2;
    ...
    procA(argB1 + argB2, argB3);
    ...
}
and that procB is called, which in turn calls
procA. When procB calls procA,
the following sequence of events happens:
The arguments to procA are evaluated. (The
order of evaluation is implementation dependent.)
stdarg mechanism.) Most
RISC architectures pass the first N arguments in registers rather than
on the stack; the C compiler will allocate space on the stack for
these arguments, but will not save them into the stack unless the C code
takes the address of an argument.
The function prolog (of procA) is code automatically generated by the compiler
to be run on entry into the function. It saves
procB's frame pointer on the stack and allocates space on
the stack (by directly setting the stack pointer (%sp) and the frame
pointer (%fp)) for procA's local variables; the frame
pointer is a register which holds the base address of this area
of memory holding the function's local variables. Initially the frame
pointer and the stack pointer contain the same value, but the stack
pointer changes as procA calls other functions. The
frame pointer is used to access the current function's local variables
using constant indexed addressing. (Some architectures/compilers do
not use an explicit frame pointer, especially with leaf functions. Some
instruction sets' call instructions will save the current frame pointer
and other registers as well as the return address on the stack
automatically.)
When procA returns, the frame pointer and other
saved registers are popped off of the stack. This is done by the
function epilog code automatically generated by the compiler.
Finally the instruction pointer is loaded with the saved return
address.
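The values set up by the prolog can be observed from C on some compilers. The sketch below assumes GCC or Clang, whose `__builtin_frame_address` and `__builtin_return_address` extensions expose the current frame address and saved return address; the `frame_info` structure and `inspect_callee` function are hypothetical names introduced for illustration, and the exact relationship between the three addresses depends on the architecture and optimization level.

```c
#include <stddef.h>

/* Hypothetical record of what one function invocation sees. */
struct frame_info {
    void *frame;        /* frame address of the function that filled this in */
    void *return_addr;  /* address the function will return to */
    void *local;        /* address of one of its automatic variables */
};

/* Fills in *out with this invocation's frame address, return address,
 * and the address of a local variable.
 * Assumption: compiled with GCC or Clang (the builtins are extensions). */
void inspect_callee(struct frame_info *out)
{
    volatile int autoA1 = 0;   /* automatic variable, as in procA */

    out->frame = __builtin_frame_address(0);
    out->return_addr = __builtin_return_address(0);
    out->local = (void *)&autoA1;
}
```

Printing the three fields with `%p` from a caller gives a rough picture of how close the local variables sit to the saved frame pointer and return address.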
When procA is run, the stack looks like this:
If procA allows its buffer to overflow, the
process may be vulnerable to attackers. We assume for the moment that
the attacker can influence the amount and value of data to be written
into the buffer -- this is true when it is an I/O buffer for data
supplied by the attacker.
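For contrast, a copy that cannot overflow looks roughly like the sketch below. It is an illustration rather than the notes' own code: `copy_bounded` and `procA_checked` are hypothetical names, and the value of `BUFSZ` is an assumption, since the fragment above does not define it.

```c
#include <stddef.h>
#include <string.h>

#define BUFSZ 16   /* assumed size; the original fragment leaves it undefined */

/* Copies at most dstsz - 1 bytes of src into dst and NUL-terminates it.
 * Returns the number of bytes actually stored (excluding the NUL). */
size_t copy_bounded(char *dst, size_t dstsz, const char *src, size_t srclen)
{
    if (dstsz == 0)
        return 0;

    size_t n = srclen < dstsz - 1 ? srclen : dstsz - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* A procA-like function whose buffer cannot be overrun: input longer than
 * the buffer is truncated instead of spilling toward the saved frame
 * pointer and return address. */
size_t procA_checked(const char *input, size_t inputlen)
{
    char buffA1[BUFSZ];

    /* The unsafe pattern would be an unchecked copy such as
     * strcpy(buffA1, input), which writes past buffA1 for long input. */
    return copy_bounded(buffA1, sizeof buffA1, input, inputlen);
}
```

With `BUFSZ` set to 16, a 19-byte input is cut to 15 bytes plus the terminating NUL, so nothing above the buffer on the stack is touched.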
Suppose the attacker knows the approximate stack pointer value
when procA is called. S/he does the following: write a
string of NOPs into buffA1 followed by the actual foreign
(malicious) code, and then overflow the buffer so that the return
address slot will contain the approximate address of the middle of the
string of NOPs. When procA returns, control is
transferred somewhere into the string of NOPs and eventually will
arrive at the foreign (malicious) code.
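Why an approximate address suffices can be shown with a toy model that contains no machine code at all. In the sketch below, which is purely illustrative, a byte array stands in for buffA1, the marker bytes `TOY_NOP` and `TOY_PAYLOAD` are made-up placeholders, and `slide` imitates an instruction pointer stepping over NOPs; all of these names are assumptions introduced here.

```c
#include <stddef.h>
#include <string.h>

/* Made-up marker bytes for the toy model; these are not real opcodes. */
enum { TOY_NOP = 'N', TOY_PAYLOAD = 'P' };

/* Fills buf with a run of toy NOPs followed by payload_len placeholder
 * bytes. Returns the index at which the placeholder region starts. */
size_t build_layout(unsigned char *buf, size_t bufsz, size_t payload_len)
{
    if (payload_len > bufsz)
        payload_len = bufsz;

    size_t sled_len = bufsz - payload_len;
    memset(buf, TOY_NOP, sled_len);
    memset(buf + sled_len, TOY_PAYLOAD, payload_len);
    return sled_len;
}

/* Imitates a processor that starts at index entry and steps over toy NOPs.
 * Returns the index of the first non-NOP byte reached (or bufsz). */
size_t slide(const unsigned char *buf, size_t bufsz, size_t entry)
{
    size_t ip = entry;
    while (ip < bufsz && buf[ip] == TOY_NOP)
        ip++;
    return ip;
}
```

For a 64-byte buffer with an 8-byte placeholder, every entry index from 0 through 56 ends at index 56, which is why the guessed return address only needs to land somewhere inside the run of NOPs.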
Compilers tend to word align objects allocated on the stack -- for performance (and sometimes correctness) reasons -- so that's not a problem. The only slightly tricky part is to write the machine language code so that it is position independent, i.e., will execute correctly regardless of where the code block is placed. (Hint: use the instruction pointer as a source of correct addresses.)

bsy@cse.ucsd.edu, last updated

