Standard ML Programming/Expressions

Tokens

A Standard ML program consists of a sequence of "tokens"; these may be thought of as the "words" of the language. Some of the most common token types are:

token type	examples
special constants	`2` , `~5.6` , `"string"` , `#"c"`
alphanumeric identifiers	`x`, `mod`, `'a`
symbolic identifiers	`+` , `-` , `*` , `/`
keywords	`val` , `=` , `(` , `)`

Arithmetic expressions

Arithmetic expressions are similar to those in many other languages:

3 + 4
3.0 / 4.0
(2 - 3) * 6

However, a few points bear note:

Unary negation is expressed using the tilde ~ rather than the hyphen -; the latter is used only for binary subtraction. For example, three minus negative-two is written 3 - ~2 or 3 - ~ 2.
Though the operators are "overloaded" to support multiple numeric types — for example, both 3 + 4 and 3.0 + 4.0 are valid (the former having type int, the latter type real) — there is no implicit promotion between types. Therefore, an expression such as 3 + 4.0 is not valid; one must write either 3.0 + 4.0, or real 3 + 4.0 (using the basis function real to convert 3 to 3.0).
Integer division is expressed using the special operator div rather than the solidus /; the latter is used only for real-number division. (And since there is no implicit promotion between types, an expression such as 3 / 4 is not valid; one must write either 3.0 / 4.0 or real 3 / real 4.) There is also a modulus operator mod; for example, seventeen divided by five is three-remainder-two, so 17 div 5 is 3 and 17 mod 5 is 2. (More generally, if q is a positive integer, then d = p div q and m = p mod q are integers such that p = d * q + m, m >= 0, and m < q. If q is a negative integer, then p = d * q + m, m <= 0, and m > q.)

Function calls

Once a function has been declared:

fun triple n = 3 * n

it is called simply by following the function-name with an argument:

val twelve = triple 4

In the general case, parentheses are not necessary, but they are frequently necessary for grouping. Also, as we saw in the chapter on types, tuples are constructed using parentheses, and it is not uncommon to construct a tuple as a function argument:

fun diff (a, b) = a - b
val six = diff (9, 3)

Function calls have very high precedence, higher than any infix operator; so, triple 4 div 5 means (3 * 4) div 5, which is 2, rather than 3 * (4 div 5), which is 0. Also, they are left-associative; f x y means (f x) y (where f takes x as its argument and returns a function that accepts y as its argument), not f (x y).

Infix function calls

A binary function — that is, a function whose parameter type is a 2-tuple type — can be turned into an infix operator:

fun minus (a, b) = a - b
val three = minus (5, 2)
infix minus
val seven = 4 minus ~3
val two = op minus (20, 18)

An infix operator can have any precedence level from 0 to 9, 0 (the default) being the lowest precedence, 9 being the highest. The Standard Basis provides these built-in infix specifications:

infix  7  * / div mod
infix  6  + - ^
infixr 5  :: @
infix  4  = <> > >= < <=
infix  3  := o
infix  0  before

Notice that in the third line, :: and @ are made infix using infixr rather than infix. This makes them right-associative rather than left-associative; whereas 3 - 4 - 5 means (3 - 4) - 5, 3 :: 4 :: nil means 3 :: (4 :: nil).

An identifier can actually be declared infix even before it refers to a specific function:

infix pow
fun x pow y = Math.pow (x, y)
val eight = 2.0 pow 3.0

Note that in this case, the infix notation is already used in declaring the function. Another way of doing this is by using op in the function declaration, which works regardless if the function has already been declared infix or not:

fun op pow (x, y) = Math.pow (x, y)

The preferred style is usually to not do this, but it can be useful if the declaration happens in a setting where it is not certain whether the function has been declared infix or not, or when use might be called on a file containing such a function declaration more than once, where the function is declared as infix after its definition. It can also be a way of signalling that the function might be declared as infix later on, possibly in another file, in which case it can be useful to ensure that the order of parsing the files does not matter.

Boolean and conditional expressions

Comparisons

As we saw in the chapter on types, the bool (boolean) type has two values, true and false. We also saw the built-in polymorphic equality operator =, of type ''a * ''a -> bool. Closely related is the inequality operator <>, also of type ''a * ''a -> bool, which returns true when = would return false, and vice versa.

The < (less than), > (greater than), <= (less than or equal to), and >= (greater than or equal to) operators are overloaded to be usable with a variety of numeric, character, and string types (but as with the arithmetic operators, both operands must have the same type).

Operations on booleans

Three main functions operate on boolean values:

The function not, of type bool -> bool, maps true to false and vice versa.
The infix operator andalso, of type bool * bool -> bool, maps (true, true) to true, and all other possibilities to false. This is know as a short circuit operator; if its first operand is false, then it will return false without evaluating its second operand.
The infix operator orelse, of type bool * bool -> bool, maps (false, false) to false, and all other possibilities to true. Like andalso it is a shortcutting operator, but in the opposite direction: if its first operand is true, then it will return true without even evaluating its second operand.

Conditional expressions

One major use of boolean values is in conditional expressions. An expression of this form:

if boolean_expression then expression_if_true else expression_if_false

evaluates to the result of expression_if_true if boolean_expression evaluates to true, and to the result of expression_if_false if boolean_expression evaluates to false. As with the shortcutting operators, the unneeded expression is not evaluated. This allows conditional expressions to be used in creating recursive functions:

fun factorial x = if x = 0 then 1 else x * (factorial (x - 1))

It also allows for conditional side effects:

if x = 0 then print "x = 0" else print "x <> 0"

Note that, since conditional expressions return a value, the "then" and "else" branches are both required to be present, and to have the same type, though they do not have to be particularly meaningful:

if x = 0 then print "x = 0" else ()

Case expressions and pattern-matching

Functions may be composed of one or more rules. A rule consists of its function-name, an argument pattern and its expression. When the function is called the argument values will be matched to the patterns in top down order. Functions using pattern-matching are very similar to case expressions. In fact you can transform any case construct into a function. For example this snippet

case compare(a,b) of
 GREATER => 1
 LESS => 2
 EQUAL => 3

is semantically equal to

fun case_example_function GREATER = 1
  | case_example_function LESS = 2
  | case_example_function EQUAL = 3;
case_example_function compare(a,b);

The first matching rule will get its expression evaluated and returned. That means if the patterns are overlapping (multiple patterns match a given argument value) one must keep in mind that only the first matching rule gets evaluated.

fun list_size (nil) = 0
 |  list_size (_::xs) = 1 + str_len xs;

The functions pattern is called exhaustive if there is a matching pattern for all legal argument values. The following example is non-exhaustive.

fun list_size ([a]) = 1
 |  list_size ([a,b]) = 2;

Any empty list or lists of size>2 will cause a Match-exception. To make it exhaustive one might add a few patterns.

fun list_size ([a]) = 1
 |  list_size ([a,b]) = 2
 |  list_size (nil) = 0
 |  list_size (_) = 0;

Lists of size>2 will return 0 which does not make a lot of sense but the functions pattern is now exhaustive.

Exceptions

Exceptions are used to abort evaluation. There are several built in exceptions that can be thrown using the raise keyword, and you can define your own using the exception keyword. Exceptions can have messages attached to them by writing of in the declaration and they are used much like datatype constructors. An exception can be caught using the handle keyword. Example:

exception Exc of string;
fun divide (_, 0) = raise Exc("Cannot divide by zero")
  | divide (num, den) = num div den
val zero_or_infinity = divide (0, 0)
  handle Exc msg => (print (msg ^ "\n"); 0)

This will print "Cannot divide by zero" and evaluate zero_or_infinity to 0 thanks to the handle clause.

Built in exceptions include Empty, Domain and Fail(msg).

Lambda expressions

A lambda expression is an expression that can be evaluated into a function without the function being bound to an identifier, a.k.a. an anonymous function, function constant or a function literal. Such a function can be defined in Standard ML using the fn keyword. This is particularly useful for higher order functions, i.e. functions that take other functions as arguments, such as the built in map and foldl functions. Instead of writing:

fun add_one x = x + 1
val incremented = map add_one [1, 2, 3]

You could simply write

val incremented = map (fn x => x + 1) [1, 2, 3]

Note that the => operator is used instead of =. The fn keyword can be used in place of fun, including pattern matching and even recursion (if the rec keyword is used):

val rec fact = fn 1 => 1 | n => if n > 1 then n * fact(n - 1) else raise Domain

Be aware that it is often considered better style to use the fun keyword.