Play and learn 300 000+ tabs online

Thursday, June 3, 2010

How Function Calls Works - A Deep Analysis

This is the first time am peeking into low-level of programming. This topic is not part of my curriculum but had to explore this in order to understand few advanced features of functional programming language. I’ll detail about those features in my future blog.

Function, is just a piece of code which can be referred using a name. The computer stores the data of the function (such as arguments, local variables, return address etc.,) when a function is invoked and destroys them when the function returns. Similarly when a function calls other functions, the caller function returns only after all the callee functions are returned. We can infer a LIFO (last-in-first-out) scenario in this, ie., the function that is invoked last completes first and the one invoked first completes last. So the stack becomes an obvious choice for function calls implementation.

In the memory, the stack grows from the higher memory address to the lower memory addresses. So we have to visualize an inverted stack. Have look at the below diagram.

Stack frame
Stack frame

Let me brief through all the terminologies before we get into the working.

Stack frame – The section of memory where the local variables, arguments, return address and other information of a function are stored, is called stack frame or activation record.

Stack pointer – For function calls, a bulk of data is pushed and popped from the memory, whenever a function is invoked and returned respectively. But we may need to access the data which is deep into the stack and it will be inefficient to pop off all the data to do that. Hence, unlike conventional stack implementation, we have a register called stack pointer that points to top of the stack such that all the addresses smaller than the address to which the stack pointer points are considered as garbage and all the address larger are considered as valid.

Stack bottom – It is the highest valid address of the stack. When the stack is initialized the stack pointer points to stack bottom.

Stack limit – It is the smallest valid address of the stack. When the stack pointer goes below this address, then there’s stack overflow.

Frame pointer – When a new function is invoked the frame pointer points to where the stack pointer was and when the function returns the stack pointer points back to where the frame pointer is.

Return address – It points to address (within the caller function) to which the control should pass when the callee (current) function returns.

Static link – This comes into picture for the programming languages that support nested functions. When a function (nested function) is defined within another function (enclosing function), the nested function may need to access the variables of the enclosing function. Therefore, the nested function’s stack frame should be able to access the enclosing function’s frame. This made possible by static link which points to the address of the enclosing function’s frame.

How it works?

The working of stack differs with different programming languages. Programming languages like C, doesn’t support nested functions and they dont need static link. Programming lanuages like Pascal support nested functions and they use static link. There are programming languages like ML and Scheme which support both nested functions and function-valued variables (ie., a function returned as a result of another function). I am not aware of how these function calls work. I have discussed the working only for programming languages that support nested functions.

Consider a pseudo-C code: (This doesn’t work in C. Just using a comfortable syntax.)

int foo (int arg1)
{
int f (int arg2)
{
int i=10;
return arg2 * i;
}
int g(int arg3)
{
return f(arg3);
}
return g(arg1);
}

f and g are nested functions of foo. foo invokes g which inturn invokes f.

Stack frame working
Stack frame – working
When the function foo invokes function g, the argument arg3 is pushed into the stack. A new section memory is allocated for the function g and the stack pointer points to the highest unused address. The frame pointer points to where stack pointer was. The old frame pointer is pushed ie., 1. The static link which points to the address of the enclosing function foo is pushed, ie., 1. The return address is pushed ie., to foo.

Then when the function f is invoked the argument arg2 is pushed. Now the frame pointer points to where the stack pointer was. The old FP is pushed ie., 3. The static link pointing to foo is pushed ie., 1. The return address is pushed ie., to g. Then the local variable i is pushed.

When the function f returns. The stack pointer points to where stack pointer was and the frame pointer points to the address stored as old FP in f’s stack frame.

This simply doesn’t work with languages like ML and scheme because in those, the local variables should not be destroyed while the function returns! as they may be required for the nested callee function.

How recursive functions work?

The recursive functions work in same way as the other functions. For every recursive call a new stack frame of the function is created. But this avoided in tail recursive calls in functional languages. In tail recursive call, each recursive call is made to loop to the same stack frame.

Stack overflow

When the stack pointer goes below the stack limit, then there’s stack overflow. The most common cause for stack overflow is the infinite loop in the recursive function calls. Imagine if a function is invoked recursively for infinite times, a stack frame gets appended for every recursive call and at one point the stack pointer goes below the stack limit and thus a stack overflow is caused.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.