Compiler Construction/Known Languages Conceptual Implementations

Known Languages Conceptual Implementations

Since this book is about compilers, the following studies focus on implementation techniques rather than on syntax or language design. If you are familiar with the languages below, seeing how their familiar concepts (like dynamic binding) are implemented may help you understand compilers.

Furthermore, we don't consider the practical implications of language design decisions. Some tricky situations, like problems of reference dependency or operator precedence, rarely occur in practice, but they are often of interest to language implementors.

Java

Invocation

In Java, there are four kinds of method invocation instructions, namely invokestatic, invokespecial, invokevirtual and invokeinterface. (A fifth, invokedynamic, was added in Java 7 and is not discussed here.) As the names suggest, the first is used to invoke static methods and the rest instance methods. Since a static method cannot be overridden, invokestatic is very simple; it is essentially the same as calling a function in C.

We now look at the mechanism for invoking instance methods. Consider the following piece of code.

class A {
  public static int f () { return 1; }
  private int g_private () { return 2; }
  public final int g_final () { return 3; }
  public int g_non_final () { return 4; }

  public void test (A a) {
    a.f ();  // static; this is always 1.
    a.g_private ();  // special; this is always 2.
    a.g_final (); // virtual instruction, but final methods cannot be overridden; this is always 3.
    a.g_non_final (); // virtual; this may be 4 or something else.
  }
}

class B extends A {
  public int g_non_final () { return 6; }
}

class C extends B {
  public int g_non_final () { return 7; }
  public int foo () { return super.g_non_final (); }  // special; calls B's version, so this is 6.
}

invokestatic takes references to the class name and the method name, and pops the arguments from the stack. For example, if f took a single int argument, the expression A.f (2) would be compiled to:

iconst_2           // push a constant 2 onto the stack 
invokestatic A.f   // invoke a static method
// the return value is at the top of the stack.

In Java, a private method cannot be overridden. Thus, the method to call is determined by the class alone, regardless of the runtime class of the object. invokespecial allows this; the instruction is the same as invokestatic except that it also pops the object reference in addition to the supplied arguments. Thus far, dynamic binding is not in use, and no binding information about private methods is needed at runtime.

Specifically, invokespecial is used either (1) to call a private method or (2) to invoke a method of the superclass (including the superclass constructor, namely <init>). To call a superclass method other than <init>, one writes super.f (), where f is the name of the method.
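The two cases above can be observed from plain Java source. The following is a minimal sketch with hypothetical classes (Base, Derived and SpecialDemo are not part of the example above); it shows only the observable behaviour, and the exact instruction a given compiler version emits for the private call may vary.

```java
// Hypothetical classes for illustration only.
class Base {
    private int secret() { return 2; }   // private: cannot be overridden

    public int describe() { return 10; }

    // The call below is bound to Base.secret() no matter what the
    // runtime class of `b` is.
    public int callSecret(Base b) { return b.secret(); }
}

class Derived extends Base {
    // Same name, but unrelated to Base.secret(); it does not override it.
    public int secret() { return 99; }

    @Override
    public int describe() {
        // super.describe() is a non-virtual call to Base.describe().
        return super.describe() + 1;
    }
}

class SpecialDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.callSecret(b));         // prints 2
        System.out.println(b.describe());            // prints 11
        System.out.println(new Derived().secret());  // prints 99
    }
}
```

Even though the object is a Derived, the private call inside Base still reaches Base.secret(), and the super call reaches Base.describe() without any dynamic lookup.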

Semantically, invokeinterface doesn't differ from invokevirtual, but it gives the virtual machine a hint about the invocation: the static type of the receiver is an interface rather than a class.
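As a small illustration, the sketch below calls the same method once through an interface-typed reference and once through a class-typed reference. Shape, Triangle, Square and InterfaceDemo are hypothetical names introduced here; the comments about which instruction is chosen describe what a typical Java compiler does for these static types.

```java
// Hypothetical types for illustration only.
interface Shape {
    int sides();
}

class Triangle implements Shape {
    public int sides() { return 3; }
}

class Square implements Shape {
    public int sides() { return 4; }
}

class InterfaceDemo {
    // The receiver's static type is an interface: an invokeinterface call.
    static int viaInterface(Shape s) { return s.sides(); }

    // The receiver's static type is a class: an invokevirtual call.
    static int viaClass(Square s) { return s.sides(); }

    public static void main(String[] args) {
        System.out.println(viaInterface(new Triangle())); // prints 3
        System.out.println(viaInterface(new Square()));   // prints 4
        System.out.println(viaClass(new Square()));       // prints 4
    }
}
```

Both calls on a Square select the same method at runtime; only the way the call site is described to the virtual machine differs.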

Class methods

Class methods are declared with the static modifier. Private class methods with the same name may coexist if they belong to different classes. Class methods cannot be overridden: a subclass that declares a static method with the same signature hides the inherited one, and each call is resolved from the static type of the expression. This also means that the final modifier has no effect on how class methods are dispatched; in Java it only forbids a subclass from hiding the method.
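The difference between hiding and overriding can be seen in a short sketch. Parent, Child and StaticDemo are hypothetical names; calling a static method through an instance reference is legal Java, although compilers usually warn about it.

```java
// Hypothetical classes for illustration only.
class Parent {
    static String who() { return "Parent"; }
}

class Child extends Parent {
    // Hides Parent.who(); it does not override it.
    static String who() { return "Child"; }
}

class StaticDemo {
    static String viaParentReference() {
        Parent p = new Child();
        // Resolved from the static type Parent; the runtime class is ignored.
        return p.who();
    }

    public static void main(String[] args) {
        System.out.println(viaParentReference()); // prints Parent
        System.out.println(Child.who());          // prints Child
    }
}
```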

Fields

Each field is accessed based on the static (compile-time) class of the reference, not on the runtime class of the object. Consider the following.

class A {
  public int i = 2;
}

class B extends A {
  public int i = 3;
}

B b = new B ();
A a = b;
b.i++;  // this would be 3 + 1 = 4
a.i++;  // this would be 2 + 1 = 3

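Both fields exist in the same object; which one is used depends only on the class through which it is accessed. An access control modifier (none, public, private or protected) only affects whether clients of the class can access a given field. This means that, apart from checking access permission when a field reference is resolved, the Java virtual machine can handle every field in the same manner.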
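To contrast field access with method invocation, here is a minimal runnable sketch. FieldBase, FieldSub and FieldDemo are hypothetical names that mirror the A and B classes above, with a get () method added for comparison.

```java
// Hypothetical classes for illustration only.
class FieldBase {
    public int i = 2;
    public int get() { return i; }   // reads FieldBase.i
}

class FieldSub extends FieldBase {
    public int i = 3;                // a second field; FieldBase.i still exists
    @Override
    public int get() { return i; }   // reads FieldSub.i
}

class FieldDemo {
    public static void main(String[] args) {
        FieldSub sub = new FieldSub();
        FieldBase base = sub;            // same object, different static type
        sub.i++;                         // FieldSub.i: 3 -> 4
        base.i++;                        // FieldBase.i: 2 -> 3
        System.out.println(sub.i);       // prints 4
        System.out.println(base.i);      // prints 3
        System.out.println(base.get());  // prints 4: methods dispatch dynamically
    }
}
```

The field reads follow the static type of the reference, while the call to get () follows the runtime class of the object.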

References

How the Java virtual machine handles method invocation and return

Objective-C

Objects and fields

In Objective-C, the instance variables of each class are laid out as a C struct. That is,

@interface A
{
  int a, b, c;
}

@end

would be implemented as something like:

struct A {
  int a, b, c;
  .... // some runtime information
};

Thus, each object in Objective-C is a pointer to a memory block on the heap, and its fields are accessed in the same way as the members of a struct. That is,

id obj = [A alloc];

One implication of this scheme is that an object fits naturally into a non-OOP C program. One disadvantage is that fields cannot be "shadowed." That is,

@interface A
{
  @private
  int a;
}
@end

@interface B : A
{
  @private
  int a;
}
@end

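This would result in a duplicate member error. This contrasts with the situation in Java, where a subclass may declare a field with the same name.

Finally, since methods are selected by name at runtime (in contrast to Java or C++, where the compiler checks each call against the static type of the receiver), methods are handled differently from fields.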

Methods

In Objective-C, the selection of methods occurs at runtime. A compiler may warn about likely mistyped names, because it knows the set of selector names declared in the program. This, however, is not semantically necessary; any message can be sent to any object.

Semantically, sending a message checks whether the class of the given object responds to the message; if not, its superclass is tried, then that class's superclass, and so on.
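The lookup just described can be modelled in a few lines. The sketch below is written in Java to match the other examples in this chapter, and it is only a toy model: ClassModel, MessageDemo and send are hypothetical names, and this is not how an actual Objective-C runtime is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of a class: a method table plus a superclass link.
class ClassModel {
    final String name;
    final ClassModel superclass;   // null for a root class
    final Map<String, Supplier<String>> methods = new HashMap<>();

    ClassModel(String name, ClassModel superclass) {
        this.name = name;
        this.superclass = superclass;
    }
}

class MessageDemo {
    // Look up the selector in the receiver's class, then in each superclass.
    static String send(ClassModel receiverClass, String selector) {
        for (ClassModel c = receiverClass; c != null; c = c.superclass) {
            Supplier<String> impl = c.methods.get(selector);
            if (impl != null) {
                return impl.get();
            }
        }
        return "does not respond to " + selector;
    }

    public static void main(String[] args) {
        ClassModel root = new ClassModel("Root", null);
        root.methods.put("description", () -> "Root description");
        ClassModel sub = new ClassModel("Sub", root);
        sub.methods.put("greet", () -> "Sub greet");

        System.out.println(send(sub, "greet"));       // prints Sub greet
        System.out.println(send(sub, "description")); // prints Root description
        System.out.println(send(sub, "missing"));     // prints does not respond to missing
    }
}
```

Because the table is consulted at the time of the send, nothing in the model stops a caller from sending a selector that no class declares; the failure only shows up at runtime.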

A complication may arise, for example, when two classes declare the same selector with differing return types. Consider the following case.

@interface A { }
- (float) func: (int) i;
@end

@interface B { }
- (int) func: (int) i;
@end

In this case, the compiler cannot know which method, (float) func: or (int) func:, an object would respond to. It therefore cannot reliably generate code that sends the message, because the way a float value is returned usually differs from the way an int value is returned.