Java Programming/Byte Code

Java byte code is the language that Java source is compiled to and that the Java Virtual Machine understands. Unlike languages that must be compiled separately for each type of computer, a Java program needs to be converted to byte code only once, after which it can run on any platform for which a Java Virtual Machine exists.

Bytecode is the compiled format for Java programs. Once a Java program has been converted to bytecode, it can be transferred across a network and executed by a Java Virtual Machine (JVM). Bytecode files generally have a .class extension. A Java programmer does not normally need to know bytecode, but understanding it can be useful.
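As a small illustration of the .class format, the sketch below inspects the first eight bytes of a compiled class. According to the Java Virtual Machine Specification, a class file starts with the magic number 0xCAFEBABE, followed by a minor and a major version number. The class name ClassFileHeader and its helper methods are hypothetical names chosen for this example, and the main method assumes Java 9 or later (for InputStream.readAllBytes) and that the class is compiled in the default package.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ClassFileHeader {
    // Returns the 4-byte magic number stored at the start of a class file.
    static int magicOf(byte[] classBytes) {
        return ByteBuffer.wrap(classBytes).getInt(0);
    }

    // Returns the unsigned 2-byte major version stored at offset 6.
    static int majorVersionOf(byte[] classBytes) {
        return ByteBuffer.wrap(classBytes).getShort(6) & 0xFFFF;
    }

    public static void main(String[] args) throws IOException {
        // Look up the compiled form of this very class on the class path.
        try (InputStream in =
                 ClassFileHeader.class.getResourceAsStream("ClassFileHeader.class")) {
            if (in == null) {
                System.err.println("ClassFileHeader.class was not found");
                return;
            }
            byte[] bytes = in.readAllBytes();
            System.out.printf("magic=0x%08X major=%d%n",
                              magicOf(bytes), majorVersionOf(bytes));
        }
    }
}
```

The major version printed depends on the compiler that produced the class file, so no particular value should be expected.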



Other Languages

Java is not the only language that compiles to Java byte code. A number of other languages, such as Groovy, also target the Java Virtual Machine.

GNAT
The GNU Ada compiler is capable of compiling Ada into Java-style bytecode.
ftp://cs.nyu.edu/pub/gnat
JPython
Compiles Python to Java-style bytecode.
http://www.jpython.org/
Kawa
Compiles Scheme to Java-style bytecode.
http://www.gnu.org/software/kawa/


Example

Consider the following Java code.

 outer:
 for (int i = 2; i < 1000; i++) {
   for (int j = 2; j < i; j++) {
     if (i % j == 0)
       continue outer;
   }
   System.out.println(i);
 }

A Java compiler might translate the Java code above into byte code as follows, assuming the above was put in a method:

  Code:
   0:   iconst_2
   1:   istore_1
   2:   iload_1
   3:   sipush  1000
   6:   if_icmpge       44
   9:   iconst_2
   10:  istore_2
   11:  iload_2
   12:  iload_1
   13:  if_icmpge       31
   16:  iload_1
   17:  iload_2
   18:  irem             # remainder
   19:  ifne    25
   22:  goto    38
   25:  iinc    2, 1
   28:  goto    11
   31:  getstatic       #84; //Field java/lang/System.out:Ljava/io/PrintStream;
   34:  iload_1
   35:  invokevirtual   #85; //Method java/io/PrintStream.println:(I)V
   38:  iinc    1, 1
   41:  goto    2
   44:  return
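To reproduce a listing like the one above, the loop has to live inside a method of a class. The following is a minimal sketch under that assumption; the class name PrimeLoop and the method primesBelow are hypothetical, and the loop collects its values in a list instead of printing them so that the result is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

public class PrimeLoop {
    // Same labeled-continue loop as in the example, with a configurable limit.
    static List<Integer> primesBelow(int limit) {
        List<Integer> primes = new ArrayList<>();
        outer:
        for (int i = 2; i < limit; i++) {
            for (int j = 2; j < i; j++) {
                if (i % j == 0)
                    continue outer; // i has a divisor, try the next i
            }
            primes.add(i); // reached only if no divisor was found
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesBelow(30));
    }
}
```

After compiling with javac, running javap -c PrimeLoop should show a similar pattern of iload, irem and conditional branch instructions, although offsets and constant pool indices will likely differ from the listing above.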

Example 2

As an example we can write a simple Foo.java source:

public class Foo {
  public static void main(final String[] args) {
    System.out.println("This is a simple example of decompilation using javap");
    a();
    b();
  }
	
  public static void a() {
    System.out.println("Now we are calling a function...");
  }

  public static void b() {
    System.out.println("...and now we are calling b");
  }
}

Compile it, then move Foo.java to another directory or delete it if you wish. What can we do with javap and Foo.class?

$javap Foo

produces this result:

Compiled from "Foo.java"
public class Foo extends java.lang.Object {
    public Foo();
    public static void main(java.lang.String[]);
    public static void a();
    public static void b();
}

As you can see, the javac compiler does not strip the names of (public) members from the .class file. As a result, the names of the methods, their parameter types and their return types are exposed. (This is necessary in order for other classes to access them.)
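Because this information is stored in the class file, it is also available to running programs through reflection. The sketch below lists the public method names of a class in a way loosely comparable to the javap output; ListMethods and publicMethodNames are hypothetical names, and the nested Foo class is only a stand-in for the Foo.class compiled earlier.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

public class ListMethods {
    // Stand-in for Foo with the same public static methods (bodies omitted).
    static class Foo {
        public static void main(String[] args) {}
        public static void a() {}
        public static void b() {}
    }

    // Collects the names of the public methods declared directly in the class.
    static Set<String> publicMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(publicMethodNames(Foo.class));
    }
}
```

Unlike javap, getDeclaredMethods does not report constructors, so the implicit Foo() constructor is not part of the result.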

Let's do a bit more, try:

$javap -c Foo
Compiled from "Foo.java"
public class Foo extends java.lang.Object{
public Foo();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   ldc             #3; //String This is a simple example of decompilation using javap
   5:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8:   invokestatic    #5; //Method a:()V
   11:  invokestatic    #6; //Method b:()V
   14:  return

public static void a();
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   ldc             #7; //String Now we are calling a function...
   5:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8:   return

public static void b();
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   ldc             #8; //String ...and now we are calling b
   5:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8:   return

}

The Java bytecodes

See Oracle's Java Virtual Machine Specification[1] for more detailed descriptions.

The manipulation of the operand stack is notated as [before]→[after], where [before] is the stack before the instruction is executed and [after] is the stack after the instruction is executed. A stack with the element 'b' on top and the element 'a' just below it is denoted 'a,b'.
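The notation can be tried out with a small simulation. The sketch below models the operand stack with a java.util.Deque and implements the effect of irem, whose stack entry reads "value1, value2 → result". It is only an illustration of the notation, not how a JVM is implemented; OperandStackDemo is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackDemo {
    // Applies "value1, value2 → result" for irem; value2 is on top of the stack.
    static int irem(Deque<Integer> stack) {
        int value2 = stack.pop(); // top element
        int value1 = stack.pop(); // element just below the top
        int result = value1 % value2;
        stack.push(result);
        return result;
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(17); // comparable to iload_1 when local variable 1 holds 17
        stack.push(5);  // comparable to iload_2 when local variable 2 holds 5
        irem(stack);    // stack before: 17, 5    stack after: 2
        System.out.println(stack.peek());
    }
}
```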

Each entry below lists, in order: Mnemonic, Opcode (in hex), Other bytes, Stack ([before]→[after]), Description.
A
aaload 32 arrayref, index → value loads onto the stack a reference from an array
aastore 53 arrayref, index, value → stores a reference into an array
aconst_null 01 → null pushes a null reference onto the stack
aload 19 index → objectref loads a reference onto the stack from a local variable #index
aload_0 2a → objectref loads a reference onto the stack from local variable 0
aload_1 2b → objectref loads a reference onto the stack from local variable 1
aload_2 2c → objectref loads a reference onto the stack from local variable 2
aload_3 2d → objectref loads a reference onto the stack from local variable 3
anewarray bd indexbyte1, indexbyte2 count → arrayref creates a new array of references of length count and component type identified by the class reference index (indexbyte1 << 8 + indexbyte2) in the constant pool
areturn b0 objectref → [empty] returns a reference from a method
arraylength be arrayref → length gets the length of an array
astore 3a index objectref → stores a reference into a local variable #index
astore_0 4b objectref → stores a reference into local variable 0
astore_1 4c objectref → stores a reference into local variable 1
astore_2 4d objectref → stores a reference into local variable 2
astore_3 4e objectref → stores a reference into local variable 3
athrow bf objectref → [empty], objectref throws an error or exception (notice that the rest of the stack is cleared, leaving only a reference to the Throwable)
B
baload 33 arrayref, index → value loads a byte or boolean value from an array
bastore 54 arrayref, index, value → stores a byte or boolean value into an array
bipush 10 byte → value pushes a byte onto the stack as an integer value
C
caload 34 arrayref, index → value loads a char from an array
castore 55 arrayref, index, value → stores a char into an array
checkcast c0 indexbyte1, indexbyte2 objectref → objectref checks whether an objectref is of a certain type, the class reference of which is in the constant pool at index (indexbyte1 << 8 + indexbyte2)
D
d2f 90 value → result converts a double to a float
d2i 8e value → result converts a double to an int
d2l 8f value → result converts a double to a long
dadd 63 value1, value2 → result adds two doubles
daload 31 arrayref, index → value loads a double from an array
dastore 52 arrayref, index, value → stores a double into an array
dcmpg 98 value1, value2 → result compares two doubles
dcmpl 97 value1, value2 → result compares two doubles
dconst_0 0e → 0.0 pushes the constant 0.0 onto the stack
dconst_1 0f → 1.0 pushes the constant 1.0 onto the stack
ddiv 6f value1, value2 → result divides two doubles
dload 18 index → value loads a double value from a local variable #index
dload_0 26 → value loads a double from local variable 0
dload_1 27 → value loads a double from local variable 1
dload_2 28 → value loads a double from local variable 2
dload_3 29 → value loads a double from local variable 3
dmul 6b value1, value2 → result multiplies two doubles
dneg 77 value → result negates a double
drem 73 value1, value2 → result gets the remainder from a division between two doubles
dreturn af value → [empty] returns a double from a method
dstore 39 index value → stores a double value into a local variable #index
dstore_0 47 value → stores a double into local variable 0
dstore_1 48 value → stores a double into local variable 1
dstore_2 49 value → stores a double into local variable 2
dstore_3 4a value → stores a double into local variable 3
dsub 67 value1, value2 → result subtracts a double from another
dup 59 value → value, value duplicates the value on top of the stack
dup_x1 5a value2, value1 → value1, value2, value1 inserts a copy of the top value into the stack two values from the top
dup_x2 5b value3, value2, value1 → value1, value3, value2, value1 inserts a copy of the top value into the stack two (if value2 is double or long it takes up the entry of value3, too) or three values (if value2 is neither double nor long) from the top
dup2 5c {value2, value1} → {value2, value1}, {value2, value1} duplicate top two stack words (two values, if value1 is not double nor long; a single value, if value1 is double or long)
dup2_x1 5d value3, {value2, value1} → {value2, value1}, value3, {value2, value1} duplicate two words and insert beneath third word (see explanation above)
dup2_x2 5e {value4, value3}, {value2, value1} → {value2, value1}, {value4, value3}, {value2, value1} duplicate two words and insert beneath fourth word
F
f2d 8d value → result converts a float to a double
f2i 8b value → result converts a float to an int
f2l 8c value → result converts a float to a long
fadd 62 value1, value2 → result adds two floats
faload 30 arrayref, index → value loads a float from an array
fastore 51 arrayref, index, value → stores a float in an array
fcmpg 96 value1, value2 → result compares two floats
fcmpl 95 value1, value2 → result compares two floats
fconst_0 0b → 0.0f pushes 0.0f on the stack
fconst_1 0c → 1.0f pushes 1.0f on the stack
fconst_2 0d → 2.0f pushes 2.0f on the stack
fdiv 6e value1, value2 → result divides two floats
fload 17 index → value loads a float value from a local variable #index
fload_0 22 → value loads a float value from local variable 0
fload_1 23 → value loads a float value from local variable 1
fload_2 24 → value loads a float value from local variable 2
fload_3 25 → value loads a float value from local variable 3
fmul 6a value1, value2 → result multiplies two floats
fneg 76 value → result negates a float
frem 72 value1, value2 → result gets the remainder from a division between two floats
freturn ae value → [empty] returns a float from method
fstore 38 index value → stores a float value into a local variable #index
fstore_0 43 value → stores a float value into local variable 0
fstore_1 44 value → stores a float value into local variable 1
fstore_2 45 value → stores a float value into local variable 2
fstore_3 46 value → stores a float value into local variable 3
fsub 66 value1, value2 → result subtracts two floats
G
getfield b4 index1, index2 objectref → value gets a field value of an object objectref, where the field is identified by field reference in the constant pool index (index1 << 8 + index2)
getstatic b2 index1, index2 → value gets a static field value of a class, where the field is identified by field reference in the constant pool index (index1 << 8 + index2)
goto a7 branchbyte1, branchbyte2 [no change] goes to another instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
goto_w c8 branchbyte1, branchbyte2, branchbyte3, branchbyte4 [no change] goes to another instruction at branchoffset (signed int constructed from unsigned bytes branchbyte1 << 24 + branchbyte2 << 16 + branchbyte3 << 8 + branchbyte4)
I
i2b 91 value → result converts an int into a byte
i2c 92 value → result converts an int into a character
i2d 87 value → result converts an int into a double
i2f 86 value → result converts an int into a float
i2l 85 value → result converts an int into a long
i2s 93 value → result converts an int into a short
iadd 60 value1, value2 → result adds two ints together
iaload 2e arrayref, index → value loads an int from an array
iand 7e value1, value2 → result performs a bitwise and on two integers
iastore 4f arrayref, index, value → stores an int into an array
iconst_m1 02 → -1 loads the int value -1 onto the stack
iconst_0 03 → 0 loads the int value 0 onto the stack
iconst_1 04 → 1 loads the int value 1 onto the stack
iconst_2 05 → 2 loads the int value 2 onto the stack
iconst_3 06 → 3 loads the int value 3 onto the stack
iconst_4 07 → 4 loads the int value 4 onto the stack
iconst_5 08 → 5 loads the int value 5 onto the stack
idiv 6c value1, value2 → result divides two integers
if_acmpeq a5 branchbyte1, branchbyte2 value1, value2 → if references are equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_acmpne a6 branchbyte1, branchbyte2 value1, value2 → if references are not equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpeq 9f branchbyte1, branchbyte2 value1, value2 → if ints are equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpne a0 branchbyte1, branchbyte2 value1, value2 → if ints are not equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmplt a1 branchbyte1, branchbyte2 value1, value2 → if value1 is less than value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpge a2 branchbyte1, branchbyte2 value1, value2 → if value1 is greater than or equal to value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmpgt a3 branchbyte1, branchbyte2 value1, value2 → if value1 is greater than value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
if_icmple a4 branchbyte1, branchbyte2 value1, value2 → if value1 is less than or equal to value2, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifeq 99 branchbyte1, branchbyte2 value → if value is 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifne 9a branchbyte1, branchbyte2 value → if value is not 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
iflt 9b branchbyte1, branchbyte2 value → if value is less than 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifge 9c branchbyte1, branchbyte2 value → if value is greater than or equal to 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifgt 9d branchbyte1, branchbyte2 value → if value is greater than 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifle 9e branchbyte1, branchbyte2 value → if value is less than or equal to 0, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifnonnull c7 branchbyte1, branchbyte2 value → if value is not null, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
ifnull c6 branchbyte1, branchbyte2 value → if value is null, branch to instruction at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2)
iinc 84 index, const [No change] increment local variable #index by signed byte const
iload 15 index → value loads an int value from a variable #index
iload_0 1a → value loads an int value from variable 0
iload_1 1b → value loads an int value from variable 1
iload_2 1c → value loads an int value from variable 2
iload_3 1d → value loads an int value from variable 3
imul 68 value1, value2 → result multiply two integers
ineg 74 value → result negate int
instanceof c1 indexbyte1, indexbyte2 objectref → result determines if an object objectref is of a given type, identified by class reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokedynamic ba indexbyte1, indexbyte2, 0, 0 [arg1, arg2, ...] → result invokes a dynamic method and puts the result on the stack (might be void); the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokeinterface b9 indexbyte1, indexbyte2, count, 0 objectref, [arg1, arg2, ...] → invokes an interface method on object objectref, where the interface method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2) and count is the number of arguments to pop from the stack frame including the object on which the method is being called and must always be greater than or equal to 1
invokespecial b7 indexbyte1, indexbyte2 objectref, [arg1, arg2, ...] → invoke instance method on object objectref requiring special handling (instance initialization method, a private method, or a superclass method), where the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokestatic b8 indexbyte1, indexbyte2 [arg1, arg2, ...] → invoke a static method, where the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
invokevirtual b6 indexbyte1, indexbyte2 objectref, [arg1, arg2, ...] → invoke virtual method on object objectref, where the method is identified by method reference index in constant pool (indexbyte1 << 8 + indexbyte2)
ior 80 value1, value2 → result bitwise int or
irem 70 value1, value2 → result int remainder
ireturn ac value → [empty] returns an integer from a method
ishl 78 value1, value2 → result int shift left
ishr 7a value1, value2 → result int arithmetic shift right
istore 36 index value → store int value into variable #index
istore_0 3b value → store int value into variable 0
istore_1 3c value → store int value into variable 1
istore_2 3d value → store int value into variable 2
istore_3 3e value → store int value into variable 3
isub 64 value1, value2 → result int subtract
iushr 7c value1, value2 → result int logical shift right
ixor 82 value1, value2 → result int xor
J
jsr a8 branchbyte1, branchbyte2 → address jump to subroutine at branchoffset (signed short constructed from unsigned bytes branchbyte1 << 8 + branchbyte2) and place the return address on the stack
jsr_w c9 branchbyte1, branchbyte2, branchbyte3, branchbyte4 → address jump to subroutine at branchoffset (signed int constructed from unsigned bytes branchbyte1 << 24 + branchbyte2 << 16 + branchbyte3 << 8 + branchbyte4) and place the return address on the stack
L
l2d 8a value → result converts a long to a double
l2f 89 value → result converts a long to a float
l2i 88 value → result converts a long to an int
ladd 61 value1, value2 → result add two longs
laload 2f arrayref, index → value load a long from an array
land 7f value1, value2 → result bitwise and of two longs
lastore 50 arrayref, index, value → store a long to an array
lcmp 94 value1, value2 → result compares two long values
lconst_0 09 → 0L pushes the long 0 onto the stack
lconst_1 0a → 1L pushes the long 1 onto the stack
ldc 12 index → value pushes a constant #index from a constant pool (String, int, float or class type) onto the stack
ldc_w 13 indexbyte1, indexbyte2 → value pushes a constant #index from a constant pool (String, int, float or class type) onto the stack (wide index is constructed as indexbyte1 << 8 + indexbyte2)
ldc2_w 14 indexbyte1, indexbyte2 → value pushes a constant #index from a constant pool (double or long) onto the stack (wide index is constructed as indexbyte1 << 8 + indexbyte2)
ldiv 6d value1, value2 → result divide two longs
lload 16 index → value load a long value from a local variable #index
lload_0 1e → value load a long value from a local variable 0
lload_1 1f → value load a long value from a local variable 1
lload_2 20 → value load a long value from a local variable 2
lload_3 21 → value load a long value from a local variable 3
lmul 69 value1, value2 → result multiplies two longs
lneg 75 value → result negates a long
lookupswitch ab <0-3 bytes padding>, defaultbyte1, defaultbyte2, defaultbyte3, defaultbyte4, npairs1, npairs2, npairs3, npairs4, match-offset pairs... key → a target address is looked up from a table using a key and execution continues from the instruction at that address
lor 81 value1, value2 → result bitwise or of two longs
lrem 71 value1, value2 → result remainder of division of two longs
lreturn ad value → [empty] returns a long value
lshl 79 value1, value2 → result bitwise shift left of a long value1 by value2 positions
lshr 7b value1, value2 → result bitwise shift right of a long value1 by value2 positions
lstore 37 index value → store a long value in a local variable #index
lstore_0 3f value → store a long value in a local variable 0
lstore_1 40 value → store a long value in a local variable 1
lstore_2 41 value → store a long value in a local variable 2
lstore_3 42 value → store a long value in a local variable 3
lsub 65 value1, value2 → result subtract two longs
lushr 7d value1, value2 → result bitwise shift right of a long value1 by value2 positions, unsigned
lxor 83 value1, value2 → result bitwise exclusive or of two longs
M
monitorenter c2 objectref → enter monitor for object ("grab the lock" - start of synchronized() section)
monitorexit c3 objectref → exit monitor for object ("release the lock" - end of synchronized() section)
multianewarray c5 indexbyte1, indexbyte2, dimensions count1, [count2,...] → arrayref create a new array of dimensions dimensions with elements of type identified by class reference in constant pool index (indexbyte1 << 8 + indexbyte2); the sizes of each dimension is identified by count1, [count2, etc]
N
new bb indexbyte1, indexbyte2 → objectref creates new object of type identified by class reference in constant pool index (indexbyte1 << 8 + indexbyte2)
newarray bc atype count → arrayref creates new array with count elements of primitive type identified by atype
nop 00 [No change] performs no operation
P
pop 57 value → discards the top value on the stack
pop2 58 {value2, value1} → discards the top two values on the stack (or one value, if it is a double or long)
putfield b5 indexbyte1, indexbyte2 objectref, value → set field to value in an object objectref, where the field is identified by a field reference index in constant pool (indexbyte1 << 8 + indexbyte2)
putstatic b3 indexbyte1, indexbyte2 value → set static field to value in a class, where the field is identified by a field reference index in constant pool (indexbyte1 << 8 + indexbyte2)
R
ret a9 index [No change] continue execution from address taken from a local variable #index (the asymmetry with jsr is intentional)
return b1 → [empty] return void from method
S
saload 35 arrayref, index → value load short from array
sastore 56 arrayref, index, value → store short to array
sipush 11 byte1, byte2 → value pushes a signed integer (byte1 << 8 + byte2) onto the stack
swap 5f value2, value1 → value1, value2 swaps two top words on the stack (note that value1 and value2 must not be double or long)
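T
tableswitch aa <0-3 bytes padding>, defaultbyte1, defaultbyte2, defaultbyte3, defaultbyte4, lowbyte1, lowbyte2, lowbyte3, lowbyte4, highbyte1, highbyte2, highbyte3, highbyte4, jump offsets... index → continue execution from an address in the table at offset index
W
wide c4 opcode, indexbyte1, indexbyte2 or iinc, indexbyte1, indexbyte2, countbyte1, countbyte2 [same as for corresponding instructions] execute opcode, where opcode is either iload, fload, aload, lload, dload, istore, fstore, astore, lstore, dstore, or ret, but assume the index is 16 bits; or execute iinc, where the index is 16 bits and the constant to increment by is a signed 16-bit short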
Unused
impdep1 fe reserved for implementation-dependent operations within debuggers; should not appear in any class file
impdep2 ff reserved for implementation-dependent operations within debuggers; should not appear in any class file
(no name) cb-fd these values are currently unassigned for opcodes and are reserved for future use
xxxunusedxxx ba this opcode was formerly reserved "for historical reasons"; since Java SE 7 it is assigned to invokedynamic (see the entry under I)
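Many entries above build a 16-bit value from two operand bytes, written as (indexbyte1 << 8 + indexbyte2) or (branchbyte1 << 8 + branchbyte2). The sketch below shows one way to express this in Java; OperandDecoding and its method names are hypothetical. Note that in Java source the + operator binds more tightly than <<, so the shift has to be parenthesized explicitly, and each byte has to be masked with 0xFF because Java bytes are signed.

```java
public class OperandDecoding {
    // Unsigned 16-bit constant pool index built from two operand bytes.
    static int poolIndex(byte indexbyte1, byte indexbyte2) {
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }

    // Signed 16-bit branch offset built from two operand bytes.
    static short branchOffset(byte branchbyte1, byte branchbyte2) {
        return (short) (((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF));
    }

    public static void main(String[] args) {
        // Operand bytes 0x00 0x54 give constant pool index 84.
        System.out.println(poolIndex((byte) 0x00, (byte) 0x54));
        // Operand bytes 0xFF 0xEF give the backward branch offset -17.
        System.out.println(branchOffset((byte) 0xFF, (byte) 0xEF));
    }
}
```

The branch offset is relative to the address of the branch instruction itself, whereas javap prints the resulting absolute target. In the first example, the instruction "28: goto 11" would therefore carry the offset 11 - 28 = -17.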

References

  1. Oracle's Java Virtual Machine Specification