x86 Assembly/High-Level Languages

x86 Assembly
quick links: registers • move • jump • calculate • logic • rearrange • misc. • FPU

Very few projects are written entirely in assembly. It's often used for accessing processor-specific features, optimizing critical sections of code, and very low-level work, but for many applications, it can be simpler and easier to implement the basic control flow and data manipulation routines in a higher level language, like C. For this reason, it is often necessary to interface between assembly language and other languages.

Compilers

The first compilers were simply text translators that converted a high-level language into assembly language. The assembly language code was then fed into an assembler, to create the final machine code output. The GCC compiler still performs this sequence (code is compiled into assembly, and fed to the AS assembler). However, many modern compilers will skip the assembly language and create the machine code directly.

Assembly language code has the benefit that it is in one-to-one correspondence with the underlying machine code. Each machine instruction is mapped directly to a single Assembly instruction. Because of this, even when a compiler directly creates the machine code, it is still possible to interface that code with an assembly language program. The important part is knowing exactly how the language implements its data structures, control structures, and functions. The method in which function calls are implemented by a high-level language compiler is called a calling convention.

The calling convention is a contract between the function and caller of the function and specifies several parameters:

How the arguments are passed to the function, and in what order? Are they pushed onto the stack, or are they passed in via the registers?
How are return values passed back to the caller? This is usually via registers or on the stack.
What processor states are volatile (available for modification)? Volatile registers are available for modification by the function. The caller is responsible for saving the state of those registers if needed. Non-volatile registers are guaranteed to be preserved by the function. The called function is responsible for saving the state of those registers and restoring those registers on exit.
The function prologue and epilogue, which sets up the registers and stack for use within the function and then restores the stack and registers before exiting.

C Calling Conventions

Wikipedia has related information at x86 calling conventions

CDECL

For C compilers, the CDECL calling convention is the de facto standard. It varies by compiler, but the programmer can specify that a function be implemented using CDECL usually by pre-appending the function declaration with a keyword, for example __cdecl in Visual studio:

int __cdecl func()

in gcc it would be __attribute__( (__cdecl__ )):

int __attribute__((__cdecl__ )) func()

CDECL calling convention specifies a number of different requirements:

Function arguments are passed on the stack, in right-to-left order.
Function result is stored in EAX/AX/AL
Floating point return values will be returned in ST0
The function name is pre-appended with an underscore.
The arguments are popped from the stack by the caller itself.
8-bit and 16-bit integer arguments are promoted to 32-bit arguments.
The volatile registers are: EAX, ECX, EDX, ST0 - ST7, ES and GS
The non-volatile registers are: EBX, EBP, ESP, EDI, ESI, CS and DS
The function will exit with a RET instruction.
The function is supposed to return values types of class or structure via a reference in EAX/AX. The space is supposed to be allocated by the function, which unable to use the stack or heap is left with fixed address in static non-constant storage. This is inherently not thread safe. Many compilers will break the calling convention:
1. GCC has the calling code allocate space and passes a pointer to this space via a hidden parameter on the stack. The called function writes the return value to this address.
2. Visual C++ will:
  1. Pass POD return values 32 bits or smaller in the EAX register.
  2. Pass POD return values 33-64 bits in size via the EAX:EDX registers
  3. For non-POD return values or values larger than 64-bits, the calling code will allocate space and passes a pointer to this space via a hidden parameter on the stack. The called function writes the return value to this address.

CDECL functions are capable of accepting variable argument lists. Below is example using cdecl calling convention:

global main

extern printf

section .data
	align 4
	a:	dd 1
	b:	dd 2
	c:	dd 3
	fmtStr:	db "Result: %d", 0x0A, 0

section .bss
	align 4

section .text
				
;
; int func( int a, int b, int c )
; {
;	return a + b + c ;
; }
;
func:
	push	ebp		; Save ebp on the stack
	mov	ebp, esp	; Replace ebp with esp since we will be using
				; ebp as the base pointer for the functions
				; stack.
				;
				; The arguments start at ebp+8 since calling the
				; the function places eip on the stack and the
				; function places ebp on the stack as part of
				; the preamble.
				;
	mov	eax, [ebp+8]	; mov a int eax
	mov	edx, [ebp+12]	; add b to eax
	lea	eax, [eax+edx]	; Using lea for arithmetic adding a + b into eax
	add	eax, [ebp+16]	; add c to eax
	pop	ebp		; restore ebp
	ret			; Returning, eax contains result

	;
	; Using main since we are using gcc to link
	;
	main:

	;
	; Set up for call to func(int a, int b, int c)
	;
	; Push variables in right to left order
	;
	push	dword [c]
	push	dword [b]
	push	dword [a]
	call	func
	add	esp, 12		; Pop stack 3 times 4 bytes
	push	eax
	push	dword fmtStr
	call	printf
	add	esp, 8		; Pop stack 2 times 4 bytes

	;
	; Alternative to using push for function call setup, this is the method
	; used by gcc
	;
	sub	esp, 12		; Create space on stack for three 4 byte variables
	mov	ecx, [b]
	mov	eax, [a]
	mov	[esp+8], dword 4
	mov	[esp+4], ecx
	mov	[esp],	 eax
	call	func
	;push	eax
	;push	dword fmtStr
	mov	[esp+4], eax
	lea	eax, [fmtStr]
	mov	[esp], eax
	call	printf

				;
				; Call exit(3) syscall
				;	void exit(int status)
				;
	mov	ebx, 0		; Arg one: the status
	mov	eax, 1		; Syscall number:
	int 	0x80

In order to assemble, link and run the program we need to do the following:

nasm -felf32 -g cdecl.asm
gcc -o cdecl cdecl.o
./cdecl

STDCALL

STDCALL is the calling convention that is used when interfacing with the Win32 API on Microsoft Windows systems. STDCALL was created by Microsoft, and therefore isn't always supported by non-Microsoft compilers. It varies by compiler but, the programmer can specify that a function be implemented using STDCALL usually by pre-appending the function declaration with a keyword, for example __stdcall in Visual studio:

int __stdcall func()

in gcc it would be __attribute__( (__stdcall__ )):

int __attribute__((__stdcall__ )) func()

STDCALL has the following requirements:

Function arguments are passed on the stack in right-to-left order.
Function result is stored in EAX/AX/AL
Floating point return values will be returned in ST0
64-bits integers and 32/16 bit pointers will be returned via the EAX:EDX registers.
8-bit and 16-bit integer arguments are promoted to 32-bit arguments.
Function name is prefixed with an underscore
Function name is suffixed with an "@" sign, followed by the number of bytes of arguments being passed to it.
The arguments are popped from the stack by the callee (the called function).
The volatile registers are: EAX, ECX, EDX, and ST0 - ST7
The non-volatile registers are: EBX, EBP, ESP, EDI, ESI, CS, DS, ES, FS and GS
The function will exit with a RET n instruction, the called function will pop n additional bytes off the stack when it returns.
POD return values 32 bits or smaller will be returned in the EAX register.
POD return values 33-64 bits in size will be returned via the EAX:EDX registers.
Non-POD return values or values larger than 64-bits, the calling code will allocate space and passes a pointer to this space via a hidden parameter on the stack. The called function writes the return value to this address.

STDCALL functions are not capable of accepting variable argument lists.

For example, the following function declaration in C:

_stdcall void MyFunction(int, int, short);

would be accessed in assembly using the following function label:

_MyFunction@12

Remember, on a 32 bit machine, passing a 16 bit argument on the stack (C "short") takes up a full 32 bits of space.

FASTCALL

FASTCALL functions can frequently be specified with the __fastcall keyword in many compilers. FASTCALL functions pass the first two arguments to the function in registers, so that the time-consuming stack operations can be avoided. FASTCALL has the following requirements:

The first 32-bit (or smaller) argument is passed in ECX/CX/CL (see [1])
The second 32-bit (or smaller) argument is passed in EDX/DX/DL
The remaining function arguments (if any) are passed on the stack in right-to-left order
The function result is returned in EAX/AX/AL
The function name is prefixed with an "@" symbol
The function name is suffixed with an "@" symbol, followed by the size of passed arguments, in bytes.

C++ Calling Conventions (THISCALL)

The C++ THISCALL calling convention is the standard calling convention for C++. In THISCALL, the function is called almost identically to the CDECL convention, but the this pointer (the pointer to the current class) must be passed.

The way that the this pointer is passed is compiler-dependent. Microsoft Visual C++ passes it in ECX. GCC passes it as if it were the first parameter of the function. (i.e. between the return address and the first formal parameter.)

Ada Calling Conventions

Pascal Calling Conventions

The Pascal convention is essentially identical to cdecl, differing only in that:

The parameters are pushed left to right (logical western-world reading order)
The routine being called must clean the stack before returning

Additionally, each parameter on the 32-bit stack must use all four bytes of the DWORD, regardless of the actual size of the datum.

This is the main calling method used by Windows API routines, as it is slightly more efficient with regard to memory usage, stack access and calling speed.

Note: the Pascal convention is NOT the same as the Borland Pascal convention, which is a form of fastcall, using registers (eax, edx, ecx) to pass the first three parameters, and also known as Register Convention.

Fortran Calling Conventions

Inline Assembly

C/C++

This Borland C++ example splits byte_data into two bytes in buf, the first containing high 4 bits and low 4 bits in the second.

void ByteToHalfByte(BYTE *buf, int pos, BYTE byte_data)
{
  asm
  {
    mov al, byte_data
    mov ah, al
    shr al, 04h
    and ah, 0Fh
    mov ecx, buf
    mov edx, pos
    mov [ecx+edx], al
    mov [ecx+edx+1], ah
  }
}

Pascal

The FreePascal Compiler (FPC) and GNU Pascal Compiler (GPC) allow asm-blocks. While GPC only accepts AT&T-syntax, FPC can work with both, and allows a direct pass-through to the assembler. The following two examples are written to work with FPC (regarding compiler directives).

program asmDemo(input, output, stderr);

// The $asmMode directive informs the compiler
// which syntax is used in asm-blocks.
// Alternatives are 'att' (AT&T syntax) and 'direct'.
{$asmMode intel}

var
	n, m: longint;
begin
	n := 42;
	m := -7;
	writeLn('n = ', n, '; m = ', m);
	
	// instead of declaring another temporary variable
	// and writing "tmp := n; n := m; m := tmp;":
	asm
		mov rax, n // rax := n
		// xchg can only operate at most on one memory address
		xchg rax, m // swaps values in rax and at m
		mov n, rax // n := rax (holding the former m value)
	// an array of strings after the asm-block closing 'end'
	// tells the compiler which registers have changed
	// (you don't wanna mess with the compiler's notion
	// which registers mean what)
	end ['rax'];
	
	writeLn('n = ', n, '; m = ', m);
end.

In FreePascal you can also write whole functions in assembly language. Also note, that if you use labels, you have to declare them beforehand (FPC requirement):

// the 'assembler' modifier allows us
// to implement the whole function in assembly language
function iterativeSquare(const n: longint): qword; assembler;
// you have to familiarize the compiler with symbols
// which are meant to be jump targets
{$goto on}
label
	iterativeSquare_iterate, iterativeSquare_done;
// note, the 'asm'-keyword instead of 'begin'
{$asmMode intel}
asm
	// ecx is used as counter by loop instruction
	mov ecx, n // ecx := n
	mov rax, 0 // rax := 0
	mov r8, 1 // r8 := 1
	
	cmp ecx, rax // ecx = rax [n = 0]
	je iterativeSquare_done // n = 0
	
	// ensure ecx is positive
	// so we'll run against zero while decrementing
	jg iterativeSquare_iterate // if n > 0 then goto iterate
	neg ecx // ecx := ecx * -1
	
	// n^2 = sum over first abs(n) odd integers
iterativeSquare_iterate:
	add rax, r8 // rax := rax + r8
	inc r8 // inc(r8) twice
	inc r8 // to get next odd integer
	loop iterativeSquare_iterate // dec(ecx)
	// if ecx <> 0 then goto iterate
	
iterativeSquare_done:
	// the @result macro represents the functions return value
	mov @result, rax // result := rax
// note, a list of modified registers (here ['rax', 'ecx', 'r8'])
//    is ignored for pure assembler routines
end;