X86 Assembly/Floating Point

While integers are sufficient for some applications, it is often necessary to use the floating point coprocessor to manipulate numbers with fractional parts.

x87 Coprocessor

The original x86 family members had a separate math coprocessor that handled floating point arithmetic. The original coprocessor was the 8087, and all FPUs since have been dubbed "x87" chips. Later variants integrated the floating point unit (FPU) into the microprocessor itself. Having the capability to manage floating point numbers means a few things:

  1. The microprocessor must have space to store floating point numbers
  2. The microprocessor must have instructions to manipulate floating point numbers

The FPU, even when it is integrated into an x86 chip, is still called the "x87" section. For instance, literature on the subject will frequently call the FPU Register Stack the "x87 Stack", and the FPU operations will frequently be called the "x87 instruction set".

FPU Register Stack

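The FPU has 8 registers, st0 to st7, organized as a stack. Loading a value from memory pushes it onto the stack: it becomes st0, and the previous contents move down to st1, st2, and so on. The popping forms of the store instructions write st0 back to memory and remove it from the stack. Arithmetic instructions typically operate on st0 and one other operand (another stack register or a memory location). Their popping variants, such as FADDP, combine the top two items and leave the answer on top of a stack that is one entry shorter.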
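
The stack behaviour described above can be traced in a short sketch. The labels x, y and out are made up for this illustration, and the exit sequence assumes a 32-bit Linux target; neither comes from the original text.

```nasm
global _start

section .data
    x: dq 1.5           ;hypothetical input values
    y: dq 2.25

section .bss
    out: resq 1         ;hypothetical result slot

section .text
_start:
    finit               ;reset the FPU: all eight registers are empty
    fld qword [x]       ;push x: st0 = 1.5
    fld qword [y]       ;push y: st0 = 2.25, st1 = 1.5
    faddp st1, st0      ;st1 = st1 + st0, then pop: st0 = 3.75
    fstp qword [out]    ;store st0 to memory and pop: the stack is empty again

    mov eax, 1          ;sys_exit (32-bit Linux, assumed)
    xor ebx, ebx        ;exit status 0
    int 0x80
```

Every push is matched by a pop here, so the register stack ends up empty. Pushing a ninth value without popping would overflow the stack.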

Floating point numbers may generally be either 32 bits long (C "float" type), or 64 bits long (C "double" type). However, in order to reduce round-off errors, the FPU stack registers are all 80 bits wide.
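
As a rough illustration of the three sizes, the sketch below declares one constant in each format and moves it through the register stack. The label names are invented for the example, and the exit sequence again assumes 32-bit Linux.

```nasm
global _start

section .data
    f32: dd 1.5         ;32 bits, single precision (C "float")
    f64: dq 1.5         ;64 bits, double precision (C "double")
    f80: dt 1.5         ;80 bits, extended precision (the register format)

section .bss
    out32: resd 1       ;room for one 32-bit value
    out64: resq 1       ;room for one 64-bit value
    out80: rest 1       ;room for one 80-bit value

section .text
_start:
    fld dword [f32]     ;widened to 80 bits as it is loaded
    fld qword [f64]     ;widened to 80 bits as it is loaded
    fld tword [f80]     ;already 80 bits wide

    fstp tword [out80]  ;the 80-bit store exists only in the popping form
    fstp qword [out64]  ;rounded to 64 bits on store
    fstp dword [out32]  ;rounded to 32 bits on store

    mov eax, 1          ;sys_exit (32-bit Linux, assumed)
    xor ebx, ebx        ;exit status 0
    int 0x80
```

Intermediate results stay at 80 bits for as long as they remain in the registers; rounding to the narrower formats happens only when a value is stored.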

Most 32-bit x86 calling conventions return floating point values in the st0 register. The 64-bit conventions return float and double values in xmm0 instead.
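
A minimal sketch of returning a value in st0 follows. It assumes the 32-bit cdecl convention, and the function name half and the constant two are hypothetical names chosen for this example.

```nasm
;double half(double x)  (hypothetical function, 32-bit cdecl assumed)
global half

section .data
    two: dq 2.0

section .text
half:
    fld qword [esp+4]   ;x is passed on the stack, just above the return address
    fdiv qword [two]    ;st0 = x / 2
    ret                 ;the result is left in st0 for the caller
```

Assembled with nasm -f elf32 and linked against C code built with gcc -m32, this could be declared on the C side as double half(double); the caller then reads the result from st0.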

Examples

The following program (using NASM syntax) calculates the square root of 123.45.

global _start
 
section .data
    val: dq 123.45  ;declare quad word (double precision)
 
section .bss
    res: resq 1     ;reserve 1 quad word for result
 
section .text
    _start:
 
    fld qword [val] ;load value into st0
    fsqrt           ;compute square root of st0 and store in st0
    fst qword [res] ;store st0 in result
 
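    ;end program (this exit sequence assumes 32-bit Linux)
    mov eax, 1      ;sys_exit
    xor ebx, ebx    ;exit status 0
    int 0x80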

Essentially, programs that use the FPU load values onto the stack with FLD and its variants, perform operations on these values, then store them into memory with one of the forms of FST. Because the x87 stack can only be accessed by FPU instructions ‒ you cannot write mov eax, st0 ‒ it is necessary to store values to memory if you want to print them, for example.
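
One way to get a result into an integer register is to go through memory, as in the sketch below. The tmp label is made up for the example, and the exit sequence assumes 32-bit Linux. FISTP converts using the current rounding mode, which is round-to-nearest unless the program has changed the control word.

```nasm
global _start

section .data
    val: dq 123.45

section .bss
    tmp: resd 1         ;hypothetical scratch slot for the integer result

section .text
_start:
    fld qword [val]     ;st0 = 123.45
    fsqrt               ;st0 = sqrt(123.45), roughly 11.11
    fistp dword [tmp]   ;convert st0 to a 32-bit integer, store it, and pop
    mov ebx, [tmp]      ;ebx = 11, now usable by ordinary integer instructions

    mov eax, 1          ;sys_exit (32-bit Linux, assumed); ebx is the exit status
    int 0x80
```

Run under a shell, the program should finish with exit status 11, which echo $? can display.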

A more complex example that evaluates the Law of Cosines:

;; c^2 = a^2 + b^2 - cos(C)*2*a*b
;; C is stored in ang
 
global _start
 
section .data
    a: dq 4.56   ;length of side a
    b: dq 7.89   ;length of side b
    ang: dq 1.5  ;angle opposite side c, in radians (around 85.94 degrees)
 
section .bss
    c: resq 1    ;the result ‒ length of side c
 
section .text
    _start:
 
    fld qword [a]   ;load a into st0
    fmul st0, st0   ;st0 = a * a = a^2
 
    fld qword [b]   ;load b into st0 (a^2 moves to st1)
    fmul st0, st0   ;st0 = b * b = b^2
 
    faddp st1, st0  ;st1 = a^2 + b^2, then pop: st0 = a^2 + b^2
 
    fld qword [ang] ;load angle into st0
    fcos            ;st0 = cos(ang)
 
    fmul qword [a]  ;st0 = cos(ang) * a
    fmul qword [b]  ;st0 = cos(ang) * a * b
    fadd st0, st0   ;st0 = cos(ang) * a * b * 2
 
    fsubp st1, st0  ;st1 = st1 - st0 = (a^2 + b^2) - (2 * a * b * cos(ang))
                    ;and pop st0
 
    fsqrt           ;take square root of st0 = c
 
    fst qword [c]   ;store st0 in c ‒ and we're done!
 
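    ;end program (this exit sequence assumes 32-bit Linux)
    mov eax, 1      ;sys_exit
    xor ebx, ebx    ;exit status 0
    int 0x80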

Floating-Point Instruction Set

You may notice that some of the instructions below differ from one another in name by just one letter: a P appended to the end. This suffix signifies that, in addition to performing the normal operation, the instruction also pops the x87 stack once execution is complete. FCOMPP and FUCOMPP pop it twice.
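
The difference the suffix makes can be seen by storing the same value twice, once with each form. The labels kept and popped are invented for this sketch, and the exit sequence assumes 32-bit Linux.

```nasm
global _start

section .bss
    kept:   resq 1       ;hypothetical destination for the non-popping store
    popped: resq 1       ;hypothetical destination for the popping store

section .text
_start:
    fld1                 ;push the constant 1.0: st0 = 1.0
    fst  qword [kept]    ;store st0; st0 still holds 1.0 afterwards
    fstp qword [popped]  ;store st0, then pop: the stack is empty again

    mov eax, 1           ;sys_exit (32-bit Linux, assumed)
    xor ebx, ebx         ;exit status 0
    int 0x80
```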

Original 8087 instructions

F2XM1, FABS, FADD, FADDP, FBLD, FBSTP, FCHS, FCLEX, FCOM, FCOMP, FCOMPP, FDECSTP, FDISI, FDIV, FDIVP, FDIVR, FDIVRP, FENI, FFREE, FIADD, FICOM, FICOMP, FIDIV, FIDIVR, FILD, FIMUL, FINCSTP, FINIT, FIST, FISTP, FISUB, FISUBR, FLD, FLD1, FLDCW, FLDENV, FLDENVW, FLDL2E, FLDL2T, FLDLG2, FLDLN2, FLDPI, FLDZ, FMUL, FMULP, FNCLEX, FNDISI, FNENI, FNINIT, FNOP, FNSAVE, FNSAVEW, FNSTCW, FNSTENV, FNSTENVW, FNSTSW, FPATAN, FPREM, FPTAN, FRNDINT, FRSTOR, FRSTORW, FSAVE, FSAVEW, FSCALE, FSQRT, FST, FSTCW, FSTENV, FSTENVW, FSTP, FSTSW, FSUB, FSUBP, FSUBR, FSUBRP, FTST, FWAIT, FXAM, FXCH, FXTRACT, FYL2X, FYL2XP1

Added in specific processors

Added with 80287

FSETPM

Added with 80387

FCOS, FLDENVD, FNSAVED, FNSTENVD, FPREM1, FRSTORD, FSAVED, FSIN, FSINCOS, FSTENVD, FUCOM, FUCOMP, FUCOMPP

Added with Pentium Pro

FCMOVB, FCMOVBE, FCMOVE, FCMOVNB, FCMOVNBE, FCMOVNE, FCMOVNU, FCMOVU, FCOMI, FCOMIP, FUCOMI, FUCOMIP
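
As an illustration of how FCOMI and the FCMOV family can work together, the sketch below picks the larger of two values without a branch. The labels x, y and max are hypothetical, the exit sequence assumes 32-bit Linux, and the code needs a Pentium Pro or later processor.

```nasm
global _start

section .data
    x: dq 3.0            ;hypothetical inputs
    y: dq 7.0

section .bss
    max: resq 1          ;hypothetical result slot

section .text
_start:
    fld qword [y]        ;st0 = y
    fld qword [x]        ;st0 = x, st1 = y
    fcomi st0, st1       ;compare st0 with st1, setting ZF, PF and CF in EFLAGS
    fcmovb st0, st1      ;if st0 is below st1 (CF = 1), copy st1 into st0
    fstp qword [max]     ;store the larger value (7.0 here) and pop
    fstp st0             ;discard the remaining copy of y

    mov eax, 1           ;sys_exit (32-bit Linux, assumed)
    xor ebx, ebx         ;exit status 0
    int 0x80
```

Because FCOMI writes the result straight into EFLAGS, ordinary conditional jumps such as JA or JB could be used after it as well. The older FCOM needs FNSTSW AX and SAHF before a jump can test the outcome.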

Added with SSE

FXRSTOR, FXSAVE

These are also supported on later Pentium IIs, which do not have SSE support.

Added with SSE3

FISTTP (x87 to integer conversion that always truncates, regardless of the rounding mode set in the control word)
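
A small sketch contrasting FISTP with FISTTP on the same value is given below. The labels are made up for the example, the exit sequence assumes 32-bit Linux, the FPU is assumed to be in its default round-to-nearest mode, and the processor must support SSE3.

```nasm
global _start

section .data
    val: dq 2.7              ;hypothetical input

section .bss
    rounded: resd 1          ;hypothetical slot for the FISTP result
    chopped: resd 1          ;hypothetical slot for the FISTTP result

section .text
_start:
    fld qword [val]          ;st0 = 2.7
    fld st0                  ;duplicate it: st0 = st1 = 2.7
    fistp dword [rounded]    ;uses the current rounding mode: stores 3, pops
    fisttp dword [chopped]   ;always truncates toward zero: stores 2, pops

    mov eax, 1               ;sys_exit (32-bit Linux, assumed)
    xor ebx, ebx             ;exit status 0
    int 0x80
```

Without FISTTP, truncating conversions (such as a C cast from double to int) require saving the control word, switching the rounding mode, converting, and restoring the control word.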

Undocumented instructions

FFREEP performs FFREE ST(i) and then pops the stack.


Last modified on 30 January 2014, at 20:41