Programming Languages/Semantics Specification

The Semantics of Programming Languages

Semantics, roughly, are meanings given for groups of symbols: ab+c, "ab"+"c", mult(5,4).

For example, to express the syntax of adding 5 with 4, we can say: Put a "+" sign in between the 5 and 4, yielding "5 + 4". However, we must also define the semantics of 5+4. If we use arithmetic semantics, 5+4 would be a representation/equivalent(=) of 9.

In computer science, the author of a programming language 'could' either use or create a new definition for '+'. For example, '+' can defined as arithmetic subtraction(-). More commonly however, '+' is defined as a function(takes in a value to output another) for complex numbers, and even strings(an ordered set of symbols) like "123a","aFsd", and "./etc".

Semantics are far from being mere theory or philosophy. Many properties of programming languages can only be determined through rigorous analysis. For example, we'd like to make statements such as "this programming language is secure." But to demonstrate that a programming language is secure requires more than just a discussion of the language's syntax. Without a formal proof of security, systems can be vulnerable to the unanticipated consequences of many separate interacting concerns. One way to make such proofs is by using mathematical models.

Given the semantics of one programming language we have the option of describing other programming languages in terms of that base language. So, one way to specify the semantics of a programming language is to relate it to another language. However, this creates a bootstrapping problem: what to use for the original base language. Programming language researchers have solved the problem by using a mathematical model that was originally invented for logic, but has extensive computational applications: Lambda Calculus.

Lambda Calculus, invented in the 1920s and 30s by Church, Kleene, and others, is one of the simplest to describe Turing Complete languages. It's a language that only has a single value: the function; and only two operations: defining a function and calling a function. To make things even simpler, functions are only allowed to have one parameter.

In the following discussion, gear your mind to thinking about functions that take functions as arguments and return other functions as values. It is a very abstract concept, but, as we'll find, soon we can build up the language through clever definitions into something that looks more like the programming languages we are used to. For example, one of the first tricks we'll explore is how to achieve the same result of allowing multiple parameters even though each function can only take a single parameter.

Building Up Operations

Before we start working with lambda calculus, let's begin with a puzzle in a more standard programming language. Suppose that Java only had the binary "<" operator, and we needed to define functions to give us ">", ">=", and "<=" operators. We also don't have unary operators either, but we still have if-statements.

First, let's look at what we're given:

 // < - Returns true if a is less than b
 public static boolean lessThan(int a, int b) { ... }

Our task is to define greaterThan, greaterThanOrEqualTo, and lessThanOrEqualTo. Remember, for this puzzle, we can only make calls to lessThan, use if-statements, and return boolean values. Think of how you might do this before reading on.

The fist function is not too complicated:

 // > - Returns true if a is greater than b
 public static boolean greaterThan(int a, int b) {
   return lessThan(b, a);
 }

This works because whenever we have $b<a$ we already know that $a>b$ . We can use similar reasoning to write implementations for greaterThanOrEqualTo and lessThanOrEqualTo:

 // >= - Returns true if a is greater than or equal to b
 public static boolean greaterThanOrEqualTo(int a, int b) {
   return not(lessThan(a, b));
 }
 
 // <= - Returns true if a is less than or equal to b
 public static boolean lessThanOrEqualTo(int a, int b) {
   return greaterThanOrEqualTo(b, a);
 }
 
 // ! - Returns the negation of b
 public static boolean not(boolean b) {
   if (b)
     return false;
   else
     return true;
 }

In the definition of greaterThanOrEqualTo we are in effect negating the result of lessThan. Thus, when $a\not <b$ we return true, and when $a<b$ we return false. But when $a\not <b$ we know that $a\geq b$ ; this is a case where we must return true, and we do. Similarly, when $a<b$ we know is cannot be the case that $a\geq b$ , so we correctly return false.

Given these definitions, we can define equalTo and notEqualTo:

 // == - Returns true if a equals b
 public static boolean equalTo(int a, int b) {
   if (greaterThanOrEqualTo(a, b))
     return greaterThanOrEqualTo(b, a);
   else
     return false;
 }
 
 // != - Returns true if a does not equal b
 public static boolean notEqualTo(int a, int b) {
   return not(equalTo(a, b));
 }

We are now starting to think in ways that will let us build up operations in lambda calculus. Not only will we have to invent < and ==, we'll also need to invent numbers, booleans, all operations on numbers, all relational operations, linked-lists, if expressions, looping constructs and even a model for output and other side effects.

The Built-in Operations of Lambda Calculus

Here is the grammar of the complete lambda calculus language:

expr ::= "λ" id "." expr         abstraction, anonymous function definition
expr ::= expr expr               application, calling a function
expr ::= id                      variable use
expr ::= "(" expr ")"            grouping

We'll also introduce another notation to allow us to create short-hands:

definition ::= "let" id "=" expr

After such a "let" is made, the identifier should be expanded to be the expression on the right-hand side. We'll also include another form of parenthesis, which are no different than the curved parenthesis already defined:

expr ::= "[" expr "]"

Because the language doesn't have much punctuation lots of parenthesis are going to be used, so by allowing two different types of parenthesis readability is improved (because you can better visually match where a grouping starts and ends).

Finally, identifiers can be any sequence of characters provided they aren't already tokens that mean something else:

id ::=  $\langle$ any printable, non-whitespace characters except ()[]=.;, "let", and "λ" $\rangle$

So, unlike most programming languages, in lambda calculus "0", "+", "%" and "15" are all valid identifiers, just as "i", "x", "y1", "remainder" and "theReturnValue" are all identifiers in more common programming languages.

We will use ";" for line comments.

Abstraction: Defining a Function

We'll begin specifying lambda calculus rules by comparison to a hypothetical Java-like language. The abstraction rule,

expr ::= "λ" id "." expr

allows us to create new functions. The lambda "λ" symbol marks a new function is being defined, and the identifier that follows it is the name of its parameter. The expression after the dot "." is an expression that may refer to the parameter and any other variables in scope. Note that the functions do not have names in lambda calculus. If we want to name them we'll need to use the "let" form to just create a short-hand.

It seems that we can't define much of anything at all so far. The first function that might occur to you is the identity function:

λx. x

That is, for any x you supply this function, you'll get x back. Instead of constantly writing λx. x each time we want to use the identity function, we can create a short-hand

let identity = λx. x

and then refer to identity instead, which would be the same as referring to the syntactically heavier λx. x.

In Java, the same definition would look more like this:

public static Function identity(Function x) {
  return x;
}

Because the only values or types in lambda calculus are functions, both the argument and the return value are of a hypothetical "Function" type.

Application: Calling a Function

The application rule

expr ::= expr expr

allows us to apply (or to "call") functions. Suppose that "f" were a function and "a" was its argument. To apply f to a, we simply juxtapose them together:

f a

Because parenthesis just provide groupings, and because f and a are already terminal elements, all of the following are equivalent to the above:

f(a)
(f a)
((f) a)
((f) (a))
(f)a

Supposing that f was an instance of Function in our hypothetical Java-like language. We might write function application like this:

f.apply(a)

which would mean "apply f to a."

The β-reduction

The actual semantics of application is the substitution rule (also called the β-reduction rule). When f is a lambda function and a is some value, and they are juxtaposed, an application occurs. Suppose that fs parameter were named x and some expression expr used x. Then

f a

would be the same as

(λx. expr) a

An application states to replace all occurences of x with a. We would write this substitution as:

expr [a/x]

Given our identity function from before, we can apply it to elements. So,

identity a

is the same as

(λx. x) a

which is the same as

x [a/x]

which, after the substitution becomes

Also,

identity identity

this is the same as saying:

(λx. x) (λx. x)

which is the same as

x [(λx. x)/x]

which, after the substitution becomes

(λx. x)

or, that is to say, the identity function itself.

Precedence order

Left to right

Multiple Arguments and Lexical Scope

Currently it doesn't seem like lambda calculus can do much more than create the identity function and apply it to itself. In order get further in the language, we need to start thinking of the abstraction operation as a true operation in its own right. For example, instead of defining the identity function, let's define a function that creates the identity function:

λy. λx. x

Which should be interpreted as "given any y, return the identity function".

Previously in our lambda function bodies, we've only returned the parameter. The above example shows that we also have another choice: return the parameter of a function we're defined in. For example, what would the following mean?

λy. λx. y

This function should be interpreted as "given any y, returns a function that, given any x, always returns y." In other words, it creates a constant function, which we represent in mathematics as a horizontal line on a graph. (The identity function on a graph would instead be a line at a 45 degree angle, running through the origin.) This section shows how this new option can allow you to express rich concepts. In particular, this section shows how you can simulate functions that take multiple arguments.

One way you might think of simulating functions that take multiple arguments by starting from functions that only take a single argument is to combine all of the arguments into a unit (just as a complex number contains two floating-point numbers) and passing that unit. Such a technique can be described as a tuple technique: as long as you can pass an argument that is actually a collection of multiple values it's really no different than passing in those multiple values in the first place.

You could use the tuple technique, but lambda calculus doesn't have any tuple types in it: The only type is a function that takes in a single argument. But recall the two functions above, that use x and y. Because the function λy function creates the λx function the λx function can use either of x and y.

One way to think of the problem is to focus on a particular example. So, suppose you want to define a new function that takes in arguments a, b, and c and does some computation involving those values. You can think of this function as a machine. The limitation is that in lambda calculus these machines can only take in a single value. But there aren't any strict limitations on what can be returned. So, think first about a machine that can only take in a: because it doesn't know b or c it can't do the desired computation. But because it knows a, it can return a new machine that also knows a. Because that machine also takes in a parameter, we can let that be b. Thus, you now have a machine that knows a and b, but not c. So, this machine should in turn create a machine that know a and b and takes in a parameter c. Once that machine is given c it will be able to compute with all three values.

To make this concept clearer, suppose you want to create a machine that can do addition. That is, some kind of function that might read as, "given any n and m, returns the sum of n + m." We can achieve such an effect if we create a more primitive function first: create-adder. We won't yet implement create-adder, but we'll assume it exists with a specification like this:

; create-adder: given n, returns a function that adds n to its argument
let create-adder = λn. (details left out for the moment)

Assuming create-adder exists, it's trivial to define addition:

let + = λn.λm. ((create-adder n) m)

First, don't let the "+" trip you up. That's just a symbol like any other identifier. We could have just as well made the short-hand for this function be "plus" or "add" instead of "+". So, what is the "+" function itself?

The way to read "+" is: "given any n, return a function that, given any m, returns n + m." Assuming that 5 and 4 are defined to be values (we'll show how we can do this soon), "+" can be used to add them:

((+ 5) 4)

Here, first "+" is applied to 5, and that result is applied to 4. Because at a high-level "+" is thought of as taking two arguments, it can be easier to understand if we remove some of the parenthesis:

(+ 5 4)

which still means the same thing: apply "+" to 5, and then apply that result to 4 (and would still mean that in this case if we remove all of the parenthesis). This way of calling the function is also the same as a prefix notation.

But how exactly does the "+" function return "n + m"? The body is this:

((create-adder n) m)

Here, create-adder is passed n. By definition, it will return a function that, given any number, will return the sum of n plus that number. We then immediately apply that function to m, resulting in the sum of n + m.

This trick is known as Currying and is how we'll accomplish multiple arguments, even though the language itself only technically allows single argument functions.

Creating Richer Functions

Now that we can see some way to add multiple arguments to a language where functions only take a single argument we can explore other extensions. In order to create more programming constructs we'll not only need to make control-flow statements, we'll also need to make new values.

The only values we've fully constructed have been: (1) the identity function; (2) a function that returns the identity function; and (3) a function that returns a constant function.

Boolean Values and Conditionals

The first values we'll create are the booleans true and false. We could start by just defining true to be the identity function and then letting false be the function that returns the identity function, but then we might find it hard to make these definitions be useful.

Instead, we should ask first, before defining true and false, What do we want to do with true and false supposing that we had them?

One desirable property is that we could perform an if somehow. We could think of if as a three-argument function, that takes a boolean value, a "then" value, and then an "else" value:

if cond A B

Note that if is not a keyword, it is just symbols we're using. We want to define it such that when "cond" is true

if cond A B

reduces to

and when "cond" is false then

if cond A B

reduces to

To get to this point, we can just think of true as a function that has two parameters, a t parameter and an f parameter, and returns t:

let true = λt. λf. t

And we can think of false as a function that has two parameters, a t parameter and an f parameter, and returns f:

let false = λt. λf. f

Given that, it's simple to define if:

let if = λcond. λA. λB. cond A B

Here, the boolean itself (named cond) is doing all of the heavy lifting: The boolean is passed both A and B. If the boolean is true, it should return A, and if the boolean is false, it should return B. Given our careful definitions of true and false this is what will happen.

[TODO: Revise this: For the moment, we should assume some form of lazy evaluation so that the then and else values don't both get evaluated (which could cause an infinite loop)]

Logical Operations

Now that we have suitable definitions for true and false, and even an if construct, we can define other boolean operations quite simply. This recalls the previous exercise we did in Java with lessThan and the other relational functions:

not: logical not operation of a boolean value

let not = λb.
  if b
    false
    true

You may have noticed by now that the "if" is not actually needed, because we could have cut it out and passed the values on directly to the boolean:

let not = λb. b false true

It just a matter of style that we use if, as a way to improve readability.

and: logical and of two booleans if b is true, it returns c, otherwise it returns false, thus, it only returns true when b and c are both true

let and = λb. λc.
  if b
    c
    false

or: logical or operation: returns false only when both b and c are false

let or = λb. λc.
  if b
    true
    c

xor: logical exclusive-or operation: returns true when b and c are different, false otherwise

let xor = λb. λc.
  if b
    (not c)
    c

bool-eq: boolean equality: returns true if b has the same value as c

let bool-eq = λb. λc.
  if b
    c
    (not c)

note that beq could also be written as not (xor b c)

Integers

Church numbers -- For all n>=0: The n'th Church number takes in a "zero" value z and applies a given "successor" function s to that zero value n times. In algebraic terms, 0 = z (i.e., no applications of s to z are made) and 1 = s(z), 2 = s(s(z)), 3 = s(s(s(z))), ... and so on

let 0 = λs. λz. z
let 1 = λs. λz. s z

successor function: apply the successor to the given numeral one more time

let succ = λn.
  λs. λz.
    s (n s z)

is-zero?: return true if the number passed in equals zero. This is accomplished by having the zero function be "true" and the successor function be a function that always returns "false". Thus, true will only be the result when the successor is applied zero times.

let is-zero? = λn. n (λx. false) true

a cute version that uses Currying is: { n (and false) true } which might be easier to understand

addition: returns the sum of n and m

let + = λn. λm. n succ m

[TODO: I've also seen this longer form and I'm not sure why it's used:]

let + = λs. λz. m s (n s z)

multiplication: returns the product of n and m, done by applying the "add n to yourself" operation m times to 0.

let * = λn. λm. m (+ n) 0

natural number exponentiation: raise n to the m power

let pow = λn. λm. m (* n) 1

Lists

subtraction is trickier, so we need to define pairs first

let pair = λfirst. λsecond. λbool. bool first second

extract first element of a pair

let first = λpair. pair true

extract second element of a pair

let second = λpair. pair false

cons, car, cdr: to think of pairs as lists instead

let cons = pair
let car = first
let cdr = second

Subtraction

now, back to subtraction: the key idea is to start with an operation that takes in a pair of numbers (a, b) and returns (a+1, a)

let next-pair = λp. pair (succ (first p)) (first p)

now notice that if we start with (0, 0) as a zero, and next-pair as the successor function, applying the successor function n-times yields the following:

times applied   value of pair   value of pair in terms of n
-------------   -------------   ---------------------------
n =         0   (0, 0)          (n, n)
n =         1   (1, 0)          (n, n - 1)
n =         2   (2, 1)          (n, n - 1)
n =         3   (3, 2)          (n, n - 1)
n =         4   (4, 3)          (n, n - 1)

Note, then, that (n next-pair (pair 0 0)) yields the pair (n, n - 1) for n >= 1, and (0, 0) for n = 0. If we want the predecessor of n to be n - 1 for n >= 1, and 0 for n = 0, we just need to take the second element of this pair! That's exactly how we implement predecessor:

let pred = λn. second (n next-pair (pair 0 0))

subtraction: return $n-m$ ; but if that value would be negative, 0 is returned instead

let - = λn. λm. m pred n

Relational Operations and Division

now that we have subtraction, we can compare numbers:

is n greater than or equal to m?

let >= = λn. λm. is-zero? (- m n)

is n less than or equal to m?

let <= = λn. λm. >= m n

is n less than m?

let < = λn. λm. not (>= n m)

is n greater than m?

let > = λn. λm. not (>= m n)

equal: are n and m equal numbers?

let equal = λn. λm. and (>= n m) (>= m n)

not-equal: are n and m different numbers?

let not-equal = λn. λm. not (equal n m)

div: given a numerator n and a denominator d, returns the quotient

let / = λn. λd.
  (if (< n d)
     0          ; -- (n < d)
     (if (equal n d)
        1          ; -- (n == d)
        ; subtract d from n until the value is < d -- (n > d)
        (n
          [λn'.
            (if (< n' d)
               n'
               (- n' d))] n)))

mod: given a numerator n and a denominator d, returns the remainder

let mod = λn. λd. (- n (* (/ n d) d))

The Y-combinator

the fixed-point Y-combinator (call-by-value), for implementing recursion

let fix = λf.
  [λx. f (λy. x x y)] [λx. f (λy. x x y)]

Y: alias for fix

let Y = fix

Monads: Simulating Storage Locations and Output

Lambda Calculus Extended with Primitives

[TODO: Discuss this extension. Typically by extending lambda calculus we make models of programming language. An example is Feather-weight Java, which was used to prove that Java was a secure language (in theory; the model did not take into account implementations, which may have bugs that result in security holes).]