x86 Disassembly/Calling Convention Examples


Microsoft C Compiler

Here is a simple function in C:

 int MyFunction(int x, int y)
 {
 	return (x * 2) + (y * 3);
 }

Using cl.exe, we will generate 3 separate listings for MyFunction: one with the CDECL, one with the FASTCALL, and one with the STDCALL calling convention. On the command line, several switches set the compiler's default calling convention:

  • /Gd : The default calling convention is CDECL
  • /Gr : The default calling convention is FASTCALL
  • /Gz : The default calling convention is STDCALL

Using these command-line options, here are the listings:

CDECL

 int MyFunction(int x, int y)
 {
 	return (x * 2) + (y * 3);
 }

becomes:

 PUBLIC	_MyFunction
 _TEXT	SEGMENT
 _x$ = 8						; size = 4
 _y$ = 12						; size = 4
 _MyFunction	PROC NEAR
 ; Line 4
 	push  ebp
 	mov   ebp, esp
 ; Line 5
 	mov   eax, _y$[ebp]
 	imul  eax, 3
 	mov   ecx, _x$[ebp]
 	lea	  eax, [eax+ecx*2]
 ; Line 6 
 	pop	  ebp
 	ret	  0
 _MyFunction	ENDP
 _TEXT	ENDS
 END

On entry to a function, ESP points to the return address pushed on the stack by the call instruction (that is, the previous contents of EIP). Any stack argument at a higher address than the entry ESP was pushed by the caller before the call was made; in this example, the first argument is at offset +4 from the entry ESP (the return address is 4 bytes wide), plus 4 more bytes once EBP is pushed on the stack. Thus, at line 5, ESP and EBP both point to the saved frame pointer, and the arguments are located at addresses ESP+8 (x) and ESP+12 (y).
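The stack layout described above can be modelled in a few lines of C. This is only a sketch: the SimStack type, the helper names, and the return-address and saved-EBP values are made up for illustration and do not come from the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit stack: an array of 4-byte slots that
   grows toward lower indices, as the x86 stack grows toward lower addresses. */
enum { STACK_SLOTS = 16 };

typedef struct {
    uint32_t slot[STACK_SLOTS];
    int esp;   /* index of the slot on top of the stack */
    int ebp;   /* index held by the frame pointer */
} SimStack;

static void sim_push(SimStack *s, uint32_t value)
{
    assert(s->esp > 0);
    s->slot[--s->esp] = value;
}

/* Read the dword at [ebp + byte_offset]. */
static uint32_t sim_read_ebp(const SimStack *s, int byte_offset)
{
    int idx = s->ebp + byte_offset / 4;
    assert(byte_offset % 4 == 0 && idx >= 0 && idx < STACK_SLOTS);
    return s->slot[idx];
}

/* Model a CDECL call to MyFunction(x, y) followed by its prologue. */
static SimStack sim_cdecl_entry(uint32_t x, uint32_t y)
{
    SimStack s = { {0}, STACK_SLOTS, 0 };
    sim_push(&s, y);            /* caller: push y (rightmost argument first) */
    sim_push(&s, x);            /* caller: push x                            */
    sim_push(&s, 0x00401000u);  /* call: push return address (made-up value) */
    sim_push(&s, 0x0012FF80u);  /* callee: push ebp (made-up old EBP)        */
    s.ebp = s.esp;              /* callee: mov ebp, esp                      */
    return s;
}
```

With this model, sim_read_ebp(&s, 8) yields x and sim_read_ebp(&s, 12) yields y, matching the _x$ = 8 and _y$ = 12 equates in the listing.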

For CDECL, the caller pushes arguments onto the stack in right-to-left order. Because ret 0 is used, it must be the caller who cleans up the stack.

As a point of interest, notice how lea is used in this function to simultaneously perform the multiplication (ecx * 2), and the addition of that quantity to eax. Unintuitive instructions like this will be explored further in the chapter on unintuitive instructions.
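To see that the imul/lea sequence really computes the original expression, the listing can be mirrored step by step in C. This is a sketch for illustration: the local variables are simply named after the registers, and MyFunction_as_compiled and check_versions_agree are hypothetical helper names.

```c
#include <assert.h>

/* The original source function. */
static int MyFunction(int x, int y)
{
    return (x * 2) + (y * 3);
}

/* Step-by-step mirror of the listing; variable names follow the registers. */
static int MyFunction_as_compiled(int x, int y)
{
    int eax, ecx;
    eax = y;              /* mov  eax, _y$[ebp]    */
    eax = eax * 3;        /* imul eax, 3           */
    ecx = x;              /* mov  ecx, _x$[ebp]    */
    eax = eax + ecx * 2;  /* lea  eax, [eax+ecx*2] */
    return eax;
}

/* Compare both versions over a small range of inputs. */
static void check_versions_agree(void)
{
    for (int x = -50; x <= 50; x++)
        for (int y = -50; y <= 50; y++)
            assert(MyFunction(x, y) == MyFunction_as_compiled(x, y));
}
```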

FASTCALL

 int MyFunction(int x, int y)
 {
 	return (x * 2) + (y * 3);
 }

becomes:

 PUBLIC	@MyFunction@8
 _TEXT	SEGMENT
 _y$ = -8						; size = 4
 _x$ = -4						; size = 4
 @MyFunction@8 PROC NEAR
 ; _x$ = ecx
 ; _y$ = edx
 ; Line 4
 	push   ebp
 	mov	   ebp, esp
 	sub	   esp, 8
 	mov	   _y$[ebp], edx
 	mov	   _x$[ebp], ecx
 ; Line 5
 	mov	   eax, _y$[ebp]
 	imul   eax, 3
 	mov	   ecx, _x$[ebp]
 	lea	   eax, [eax+ecx*2]
 ; Line 6
 	mov	   esp, ebp
 	pop	   ebp
 	ret	   0
 @MyFunction@8 ENDP
 _TEXT	ENDS
 END

This function was compiled with optimizations turned off. Here we see that the arguments are first saved on the stack and then fetched from the stack, rather than being used directly. This gives the compiler a consistent way to access all arguments through the stack, and cl.exe is not the only compiler that does this.

No argument is accessed at a positive offset from the entry ESP, so the caller apparently did not push any, and the function can therefore use ret 0. Let's investigate further:

 int FastTest(int x, int y, int z, int a, int b, int c)
 {
     return x * y * z * a * b * c;
 }

and the corresponding listing:

 PUBLIC	@FastTest@24
 _TEXT	SEGMENT
 _y$ = -8						; size = 4
 _x$ = -4						; size = 4
 _z$ = 8						; size = 4
 _a$ = 12						; size = 4
 _b$ = 16						; size = 4
 _c$ = 20						; size = 4
 @FastTest@24 PROC NEAR
 ; _x$ = ecx
 ; _y$ = edx
 ; Line 2
 	push    ebp
 	mov	    ebp, esp
 	sub	    esp, 8
 	mov	    _y$[ebp], edx
 	mov	    _x$[ebp], ecx
 ; Line 3
 	mov	    eax, _x$[ebp]
 	imul	eax, DWORD PTR _y$[ebp]
 	imul	eax, DWORD PTR _z$[ebp]
 	imul	eax, DWORD PTR _a$[ebp]
 	imul	eax, DWORD PTR _b$[ebp]
 	imul	eax, DWORD PTR _c$[ebp]
 ; Line 4
 	mov	    esp, ebp
 	pop	    ebp
 	ret	    16					; 00000010H

Now we have 6 arguments: the last four (z, a, b, and c) are pushed by the caller from right to left, and the first two (x and y) are again passed in ecx and edx and handled the same way as in the previous example. Stack cleanup is done by ret 16, which corresponds to the 4 arguments pushed before the call.

For FASTCALL, the compiler passes the first two arguments in registers (ecx and edx); if there are more, the caller pushes the rest onto the stack, still in right-to-left order. Stack cleanup is done by the callee. It is called FASTCALL because, when all arguments fit in registers, no stack push or cleanup is needed. (The 64-bit calling conventions pass more arguments in registers: four in the Microsoft x64 convention and six integer arguments in the System V AMD64 convention.)

The name-decoration scheme of the function is @MyFunction@n, where n is the total size in bytes of all arguments, including those passed in registers (8 for MyFunction, 24 for FastTest).
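The decoration patterns seen in the cl.exe listings on this page (_MyFunction for CDECL, _MyFunction@8 for STDCALL, @MyFunction@8 for FASTCALL) can be summarised in a small C helper. This is a sketch only: decorate_name and the Convention enum are hypothetical names, and the helper assumes every argument is a 4-byte int, as in the examples here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } Convention;

/* Write the decorated name into out and return snprintf's result.
   Assumes all arguments are 4-byte ints. */
static int decorate_name(char *out, size_t out_size, const char *name,
                         Convention conv, int int_arg_count)
{
    int arg_bytes = int_arg_count * 4;
    assert(out != NULL && name != NULL && strlen(name) > 0);
    assert(int_arg_count >= 0);
    switch (conv) {
    case CONV_STDCALL:   /* _name@bytes */
        return snprintf(out, out_size, "_%s@%d", name, arg_bytes);
    case CONV_FASTCALL:  /* @name@bytes */
        return snprintf(out, out_size, "@%s@%d", name, arg_bytes);
    case CONV_CDECL:     /* _name */
    default:
        return snprintf(out, out_size, "_%s", name);
    }
}
```

For example, calling decorate_name with "FastTest", CONV_FASTCALL, and 6 arguments produces @FastTest@24, the symbol in the listing above.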

STDCALL

 int MyFunction(int x, int y)
 {
 	return (x * 2) + (y * 3);
 }

becomes:

 PUBLIC	_MyFunction@8
 _TEXT	SEGMENT
 _x$ = 8						; size = 4
 _y$ = 12						; size = 4
 _MyFunction@8 PROC NEAR
 ; Line 4
 	push	ebp
 	mov	    ebp, esp
 ; Line 5
 	mov	    eax, _y$[ebp]
 	imul	eax, 3
 	mov	    ecx, _x$[ebp]
 	lea	    eax, [eax+ecx*2]
 ; Line 6
 	pop	    ebp
 	ret	    8
 _MyFunction@8 ENDP
 _TEXT	ENDS
 END

Apart from the decorated name (_MyFunction@8), the STDCALL listing differs from the CDECL listing in only one way: it uses "ret 8" so that the function cleans up its own stack. Let's look at an example with more parameters:

 int STDCALLTest(int x, int y, int z, int a, int b, int c)
 {
 	return x * y * z * a * b * c;
 }

Let's take a look at how this function gets translated into assembly by cl.exe:

 PUBLIC	_STDCALLTest@24
 _TEXT	SEGMENT
 _x$ = 8						; size = 4
 _y$ = 12						; size = 4
 _z$ = 16						; size = 4
 _a$ = 20						; size = 4
 _b$ = 24						; size = 4
 _c$ = 28						; size = 4
 _STDCALLTest@24 PROC NEAR
 ; Line 2
 	push	ebp
 	mov	    ebp, esp
 ; Line 3
 	mov	    eax, _x$[ebp]
 	imul	eax, DWORD PTR _y$[ebp]
 	imul	eax, DWORD PTR _z$[ebp]
 	imul	eax, DWORD PTR _a$[ebp]
 	imul	eax, DWORD PTR _b$[ebp]
 	imul	eax, DWORD PTR _c$[ebp]
 ; Line 4
 	pop	    ebp
 	ret	    24					; 00000018H
 _STDCALLTest@24 ENDP
 _TEXT	ENDS
 END

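So the only difference in the generated code between STDCALL and CDECL is that the former cleans up the stack in the callee and the latter in the caller. On x86 this saves a little code, because "ret n" performs the cleanup once in the callee instead of at every call site.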
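The cleanup rules observed in the cl.exe listings can be written down as two small C functions. This is a sketch under the same assumption as the examples (every argument is a 4-byte int, and FASTCALL passes the first two in registers); the function and enum names are hypothetical.

```c
#include <assert.h>

typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } Convention;

/* Operand of the callee's ret instruction for int_arg_count 4-byte arguments. */
static int callee_ret_operand(Convention conv, int int_arg_count)
{
    assert(int_arg_count >= 0);
    switch (conv) {
    case CONV_STDCALL:   /* callee removes every argument */
        return int_arg_count * 4;
    case CONV_FASTCALL:  /* first two are in ecx/edx, the rest are on the stack */
        return int_arg_count > 2 ? (int_arg_count - 2) * 4 : 0;
    case CONV_CDECL:     /* callee removes nothing */
    default:
        return 0;
    }
}

/* Bytes the caller must remove after the call returns. */
static int caller_cleanup_bytes(Convention conv, int int_arg_count)
{
    assert(int_arg_count >= 0);
    return conv == CONV_CDECL ? int_arg_count * 4 : 0;
}
```

For six int arguments this gives ret 24 for STDCALL and ret 16 for FASTCALL, the operands that appear in the STDCALLTest and FastTest listings.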

GNU C Compiler

We will be using 2 example C functions to demonstrate how GCC implements calling conventions:

 int MyFunction1(int x, int y)
 {
 	return (x * 2) + (y * 3);
 }

and

 int MyFunction2(int x, int y, int z, int a, int b, int c)
 {
 	return x * y * (z + 1) * (a + 2) * (b + 3) * (c + 4);
 }

Rather than changing GCC's default calling convention (CDECL for C) from the command line, the functions are marked explicitly in the source with the specifiers __cdecl, __fastcall, and __stdcall.

CDECL

The first function (MyFunction1) provides the following assembly listing:

 _MyFunction1:
 	pushl	%ebp
 	movl	%esp, %ebp
 	movl	8(%ebp), %eax
 	leal	(%eax,%eax), %ecx
 	movl	12(%ebp), %edx
 	movl	%edx, %eax
 	addl	%eax, %eax
 	addl	%edx, %eax
 	leal	(%eax,%ecx), %eax
 	popl	%ebp
 	ret

First of all, we can see the name-decoration is the same as in cl.exe. We can also see that the ret instruction doesn't have an argument, so the calling function is cleaning the stack. However, since GCC doesn't provide us with the variable names in the listing, we have to deduce which parameters are which. After the stack frame is set up, the first instruction of the function is "movl 8(%ebp), %eax". Once we remember (or learn for the first time) that GAS instructions have the general form:

instruction src, dest

We realize that the value at offset +8 from ebp (the last parameter pushed on the stack) is moved into eax. The leal instruction is a little more difficult to decipher, especially if we don't have any experience with GAS instructions. The form "leal (reg1,reg2), dest" adds the values in the parentheses together, and stores the result in dest. Translated into Intel syntax, we get the instruction:

 lea ecx, [eax + eax]

Which is clearly the same as a multiplication by 2. The first value accessed must then have been the last value passed, which would seem to indicate that values are passed right-to-left here. To prove this, we will look at the next section of the listing:

 movl	12(%ebp), %edx
 movl	%edx, %eax
 addl	%eax, %eax
 addl	%edx, %eax
 leal	(%eax,%ecx), %eax

The value at offset +12 from ebp is moved into edx. edx is then moved into eax. eax is then added to itself (eax * 2), and then edx is added to eax (eax + edx). Remember, though, that eax = 2 * edx at that point, so the result is edx * 3. This then is clearly the y parameter, which is furthest on the stack, and was therefore the first pushed. CDECL then on GCC is implemented by passing arguments on the stack in right-to-left order, same as cl.exe.

FASTCALL

 .globl @MyFunction1@8
 	.def	@MyFunction1@8;	.scl	2;	.type	32;	.endef
 @MyFunction1@8:
 	pushl	%ebp
 	movl	%esp, %ebp
 	subl	$8, %esp
 	movl	%ecx, -4(%ebp)
 	movl	%edx, -8(%ebp)
 	movl	-4(%ebp), %eax
 	leal	(%eax,%eax), %ecx
 	movl	-8(%ebp), %edx
 	movl	%edx, %eax
 	addl	%eax, %eax
 	addl	%edx, %eax
 	leal	(%eax,%ecx), %eax
 	leave
 	ret

Notice first that the same name decoration is used as in cl.exe. The astute observer will already have realized that GCC uses the same trick as cl.exe, of moving the fastcall arguments from their registers (ecx and edx again) onto a negative offset on the stack. Again, optimizations are turned off. ecx is moved into the first position (-4) and edx is moved into the second position (-8). Like the CDECL example above, the value at -4 is doubled, and the value at -8 is tripled. Therefore, -4 (ecx) is x, and -8 (edx) is y. It would seem from this listing then that values are passed left-to-right, although we will need to take a look at the larger, MyFunction2 example:

 .globl @MyFunction2@24
 	.def	@MyFunction2@24;	.scl	2;	.type	32;	.endef
 @MyFunction2@24:
 	pushl	%ebp
 	movl	%esp, %ebp
 	subl	$8, %esp
 	movl	%ecx, -4(%ebp)
 	movl	%edx, -8(%ebp)
 	movl	-4(%ebp), %eax
 	imull	-8(%ebp), %eax
 	movl	8(%ebp), %edx
 	incl	%edx
 	imull	%edx, %eax
 	movl	12(%ebp), %edx
 	addl	$2, %edx
 	imull	%edx, %eax
 	movl	16(%ebp), %edx
 	addl	$3, %edx
 	imull	%edx, %eax
 	movl	20(%ebp), %edx
 	addl	$4, %edx
 	imull	%edx, %eax
 	leave
 	ret	    $16

By following the fact that in MyFunction2, successive parameters are added to increasing constants, we can deduce the positions of each parameter. -4 is still x, and -8 is still y. +8 gets incremented by 1 (z), +12 gets increased by 2 (a). +16 gets increased by 3 (b), and +20 gets increased by 4 (c). Let's list these values then:

z = [ebp + 8]
a = [ebp + 12]
b = [ebp + 16]
c = [ebp + 20]

c is at the highest address, and was therefore the first pushed. z is closest to the top of the stack, and was therefore the last pushed. Arguments are therefore pushed in right-to-left order, just like cl.exe.
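The argument locations deduced above follow a simple pattern that can be sketched in C. The helper name fastcall_arg_location is hypothetical, and the sketch assumes 4-byte int arguments and a standard ebp-based frame, as in the listings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Describe where the FASTCALL argument with the given zero-based index
   (counted left to right) is found on entry, relative to a standard frame.
   Assumes every argument is a 4-byte int. */
static int fastcall_arg_location(char *out, size_t out_size, int index)
{
    assert(out != NULL && index >= 0);
    if (index == 0)
        return snprintf(out, out_size, "ecx");
    if (index == 1)
        return snprintf(out, out_size, "edx");
    /* The third argument is the last one pushed, so it sits just above the
       return address at [ebp + 8]; each later argument is 4 bytes higher. */
    return snprintf(out, out_size, "[ebp + %d]", 8 + 4 * (index - 2));
}
```

For MyFunction2, indices 2 through 5 (z, a, b, c) map to [ebp + 8], [ebp + 12], [ebp + 16], and [ebp + 20], the same values as in the list above.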

STDCALL

Let's compare then the implementation of MyFunction1 in GCC:

 .globl _MyFunction1@8
 	.def	_MyFunction1@8;	.scl	2;	.type	32;	.endef
 _MyFunction1@8:
 	pushl	%ebp
 	movl	%esp, %ebp
 	movl	8(%ebp), %eax
 	leal	(%eax,%eax), %ecx
 	movl	12(%ebp), %edx
 	movl	%edx, %eax
 	addl	%eax, %eax
 	addl	%edx, %eax
 	leal	(%eax,%ecx), %eax
 	popl	%ebp
 	ret	    $8

The name decoration is the same as in cl.exe, so STDCALL functions (and CDECL and FASTCALL for that matter) can be assembled with either compiler, and linked with either linker, it seems. The stack frame is set up, then the value at [ebp + 8] is doubled. After that, the value at [ebp + 12] is tripled. Therefore, +8 is x, and +12 is y. Again, these values are pushed in right-to-left order. This function also cleans its own stack with the "ret 8" instruction.

Looking at a bigger example:

 .globl _MyFunction2@24
 	.def	_MyFunction2@24;	.scl	2;	.type	32;	.endef
 _MyFunction2@24:
 	pushl	%ebp
 	movl	%esp, %ebp
 	movl	8(%ebp), %eax
 	imull	12(%ebp), %eax
 	movl	16(%ebp), %edx
 	incl	%edx
 	imull	%edx, %eax
 	movl	20(%ebp), %edx
 	addl	$2, %edx
 	imull	%edx, %eax
 	movl	24(%ebp), %edx
 	addl	$3, %edx
 	imull	%edx, %eax
 	movl	28(%ebp), %edx
 	addl	$4, %edx
 	imull	%edx, %eax
 	popl	%ebp
 	ret	    $24

We can see here that the values at +8 and +12 from ebp are still x and y, respectively. The value at +16 is incremented by 1, the value at +20 is incremented by 2, and so on, all the way to the value at +28. We can therefore create the following table:

x = [ebp + 8]
y = [ebp + 12]
z = [ebp + 16]
a = [ebp + 20]
b = [ebp + 24]
c = [ebp + 28]

With c being pushed first, and x being pushed last. Therefore, these parameters are also pushed in right-to-left order. This function then also cleans 24 bytes off the stack with the "ret 24" instruction.

Example: C Calling Conventions

Identify the calling convention of the following C function:

 int MyFunction(int a, int b)
 {
    return a + b;
 }

The function is written in C, and has no other specifiers, so it is CDECL by default.

Example: Named Assembly Function

Identify the calling convention of the function MyFunction:

 _MyFunction@12:
 push ebp
 mov  ebp, esp
 ...
 pop  ebp
 ret  12

The function includes the decorated name of an STDCALL function, and cleans up its own stack. It is therefore an STDCALL function.
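The reasoning from the decorated name can be turned into a rough C classifier. This is a sketch only: guess_convention and the Guess enum are hypothetical names, and the rules cover just the three decoration patterns shown on this page, so any other name (for example a C++ mangled name) is reported as unknown.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum { GUESS_UNKNOWN, GUESS_CDECL, GUESS_STDCALL, GUESS_FASTCALL } Guess;

/* Returns 1 if s ends in "@<digits>" with at least one digit
   and the '@' is not the first character. */
static int has_byte_count_suffix(const char *s)
{
    const char *at = strrchr(s, '@');
    if (at == NULL || at == s || at[1] == '\0')
        return 0;
    for (const char *p = at + 1; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;
    return 1;
}

/* Guess the convention from a decorated symbol name. */
static Guess guess_convention(const char *decorated)
{
    assert(decorated != NULL);
    if (decorated[0] == '@' && has_byte_count_suffix(decorated))
        return GUESS_FASTCALL;   /* @name@bytes */
    if (decorated[0] == '_' && has_byte_count_suffix(decorated))
        return GUESS_STDCALL;    /* _name@bytes */
    if (decorated[0] == '_')
        return GUESS_CDECL;      /* _name */
    return GUESS_UNKNOWN;
}
```

A name alone is only a hint; as in the example above, the ret operand in the function body should agree with the byte count in the name.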

Example: Unnamed Assembly Function

This code snippet is the entire body of an unnamed assembly function. Identify the calling convention of this function.

 push ebp
 mov  ebp, esp
 add  eax, edx
 pop  ebp
 ret

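The function sets up a stack frame, so we know the compiler hasn't done anything "funny" to it. It reads the eax and edx registers without initializing them first, so its arguments must arrive in registers. It is therefore a FASTCALL-style function. Note that the cl.exe and GCC FASTCALL listings above pass arguments in ecx and edx; passing the first arguments in eax and edx is characteristic of other register-based conventions, such as Borland's.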

Example: Another Unnamed Assembly Function

 push ebp 
 mov  ebp, esp
 mov  eax, [ebp + 8]
 pop  ebp
 ret  16

The function has a standard stack frame, and the ret instruction has a parameter to clean its own stack. Also, it accesses a parameter from the stack. It is therefore an STDCALL function.

Example: Name Mangling

What can we tell about the following function call?

 mov    ecx, x
 push   eax
 mov    eax, ss:[ebp - 4]
 push   eax
 mov    al, ss:[ebp - 3]
 call   @__Load?$Container__XXXY_?Fcii

Two things should get our attention immediately. The first is that before the function call, a value is stored into ecx. The second is that the function name itself is heavily mangled. This example must use the C++ THISCALL convention. Inside the mangled name of the function, we can pick out two English words, "Load" and "Container". Without knowing the specifics of this name mangling scheme, it is not possible to determine which word is the function name, and which word is the class name.

We can pick out two 32-bit variables being passed to the function, and a single 8-bit variable. The first is located in eax, the second is originally located on the stack at offset -4 from ebp, and the third is located at offset -3 from ebp. In C++, these would likely correspond to two int variables, and a single char variable. Notice that at the end of the mangled function name are three lower-case characters, "cii". We can't know for certain, but it appears these three letters correspond to the three parameters (char, int, int). We do not know from this whether the function returns a value or not, so we will assume the function returns void.

Assuming that "Load" is the function name and "Container" is the class name (it could just as easily be the other way around), here is our function definition:

class Container
{
  void Load(char, int, int);
};