Clojure Programming/Concepts

Concepts

Basics

Numbers

Types of Numbers

Clojure supports the following numeric types:

Integer
Floating Point
Ratio
Decimal

Numbers in Clojure are based on java.lang.Number. BigInteger and BigDecimal are supported and hence we have arbitrary precision numbers in Clojure.
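To see which Java class backs each kind of literal, we can ask the REPL directly. The following is a minimal sketch that assumes Clojure 1.3 or later (older releases report different classes for integers); the name numeric-samples is only an illustration.

```clojure
;; Inspect the concrete class behind each kind of numeric literal.
;; Assumes Clojure 1.3 or later; older releases report different integer classes.
(def numeric-samples
  {:integer        42
   :big-integer    42N
   :floating-point 3.14
   :ratio          22/7
   :decimal        3.14M})

;; Print each label with the class of its value.
(doseq [[label n] numeric-samples]
  (println label (class n)))

;; Every one of them is a java.lang.Number.
(every? #(instance? Number %) (vals numeric-samples))
;; ⇒ true
```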

The Ratio type is described on the Clojure site as follows:

Ratio

Represents a ratio between integers. Division of integers that can't be reduced to an integer yields a ratio, i.e. 22/7 = 22/7, rather than a floating point or truncated value.

Ratios allow a computation to be kept in exact numeric form. This can help avoid inaccuracies in long computations.

Here is a little experiment. Let's first try a computation of (1/3 * 3/1) as floating point. Later we try the same with Ratio.

 (def a (/ 1.0 3.0))
 (def b (/ 3.0 1.0))

 (* a b)
 ;; ⇒ 1.0

 (def c (* a a a a a a a a a a)) ; ⇒ #'user/c
 (def d (* b b b b b b b b b b)) ; ⇒ #'user/d

 (* c d)
 ;; ⇒ 0.9999999999999996

The result we want is 1, but the value of (* c d) above is 0.9999999999999996. This is because a is only an approximation of 1/3, and the rounding error compounds with each multiplication as we create c. You really don't want such calculations happening in your pay cheque :)

The same done with ratios below:

 (def a1 (/ 1 3))
 (def b1 (/ 3 1))

 (def c (* a1 a1 a1 a1 a1 a1 a1 a1 a1 a1))
 (def d (* b1 b1 b1 b1 b1 b1 b1 b1 b1 b1))

 (* c d)
 ;; ⇒ 1

The result is 1 as we hoped for.
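A few more operations on ratios may be useful here. This is a small sketch, not an exhaustive tour; the name r is just for illustration.

```clojure
;; Ratios are kept in lowest terms and stay exact until converted.
(def r (/ 22 7))

(numerator r)       ; ⇒ 22
(denominator r)     ; ⇒ 7
(/ 6 4)             ; ⇒ 3/2 (reduced automatically)
(/ 6 3)             ; ⇒ 2 (divisions that reduce fully yield an integer)

;; Convert to floating point only at the edge, e.g. for display.
(double r)          ; a double approximating 22/7

;; Mixing a ratio with a double makes the whole result a double.
(class (+ 1/3 0.5)) ; ⇒ java.lang.Double
```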

Number Entry Formats

Clojure supports the usual entry formats, as shown below:

 user=> 10       ; decimal
 10
 user=> 010      ; octal
 8
 user=> 0xff     ; hex
 255
 user=> 1.0e-2   ; double
 0.01
 user=> 1.0e2    ; double
 100.0

To make things easier, a radix based entry format is also supported in the form <radix>r<number>, where radix can be any natural number between 2 and 36.

 2r1111
 ;; ⇒ 15

These formats can be mixed freely within one expression.

 (+ 0x1 2r1 01)
 ;; ⇒ 3

Many bitwise operations are also supported by the Clojure API:

 (bit-and 2r1100 2r0100)
 ;; ⇒ 4

The available bitwise functions include:

(bit-and x y)
(bit-and-not x y)
(bit-clear x n)
(bit-flip x n)
(bit-not x)
(bit-or x y)
(bit-set x n)
(bit-shift-left x n)
(bit-shift-right x n)
(bit-test x n)
(bit-xor x y)

Check the Clojure API for the complete documentation.
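As a quick illustration of the functions listed above, here is a sketch that calls each of them once. Binary literals are used so the affected bits are easy to see.

```clojure
(bit-or 2r1100 2r0011)       ; ⇒ 15  (2r1111)
(bit-xor 2r1100 2r0110)      ; ⇒ 10  (2r1010)
(bit-and-not 2r1100 2r0100)  ; ⇒ 8   (2r1000)
(bit-shift-left 1 4)         ; ⇒ 16
(bit-shift-right 16 2)       ; ⇒ 4
(bit-set 0 3)                ; ⇒ 8   (sets bit 3)
(bit-clear 2r1111 0)         ; ⇒ 14  (clears bit 0)
(bit-flip 2r1010 1)          ; ⇒ 8   (bit 1 was set, so it is cleared)
(bit-test 2r0100 2)          ; ⇒ true
(bit-not 0)                  ; ⇒ -1  (two's complement)
```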

Converting Integers to Strings

One general purpose way to format any data for printing is to use a java.util.Formatter.

The predefined convenience function format makes using a Formatter easy (the hash in %#x displays the number as a hexadecimal number prefixed with 0x):

 
 (format "%#x" (bit-and 2r1100 2r0100))
 ;; ⇒ "0x4"

Converting integers to strings is even easier with java.lang.Integer. Note that since the methods are static, we must use the "/" syntax instead of ".method":

 
 (Integer/toBinaryString 10)
 ;; ⇒ "1010"
 (Integer/toHexString 10)
 ;; ⇒ "a"
 (Integer/toOctalString 10)
 ;; ⇒ "12"

Here is another way to specify the base of the string representation:

 
 (Integer/toString 10 2)
 ;; ⇒ "1010"

Where 10 is the number to be converted and 2 is the radix.

Note: In addition to the above syntax, which is used for accessing static fields or methods, . (dot) can be used. It is a special form used for accessing arbitrary (non-private) fields or methods in Java as explained in the Clojure Reference (Java Interop). For example:

 
 (. Integer toBinaryString 10)
 ;; ⇒ "1010"

For static accesses, the / syntax is preferred.
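A few more conversions in the same spirit, as a brief sketch. The format directives are those of java.util.Formatter; the particular numbers are arbitrary.

```clojure
(format "%05d" 42)         ; ⇒ "00042" (zero-padded to width 5)
(format "%x" 255)          ; ⇒ "ff"
(format "%o" 8)            ; ⇒ "10"
(Integer/toString 255 16)  ; ⇒ "ff"
(Integer/toString 35 36)   ; ⇒ "z" (36 is the largest supported radix)
(str 42)                   ; ⇒ "42"
```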

Converting Strings to Integers

For converting strings to integers, we can again use java.lang.Integer. This is shown below.

   user=> (Integer/parseInt "A" 16)      ; hex
   10
   user=> (Integer/parseInt "1010" 2)    ; bin
   10
   user=> (Integer/parseInt "10" 8)      ; oct
   8
   user=> (Integer/parseInt "8")         ; dec
   8

The above sections give an overview of the integer-to-string and string-to-integer formatting. There is a very rich set of well documented functions available in the Java libraries (too rich to document here). These functions can easily be used to meet varied needs.
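Integer/parseInt throws a NumberFormatException when the string is not a valid number, so real code often wraps it. The sketch below shows one possible way; parse-int-or is a hypothetical helper of our own, not part of Clojure or Java.

```clojure
(defn parse-int-or
  "Parse s as an integer in the given radix, returning default when s is
  not a valid number. A hypothetical helper, not a core function."
  [^String s radix default]
  (try
    (Integer/parseInt s (int radix))
    (catch NumberFormatException _
      default)))

(parse-int-or "ff" 16 nil)    ; ⇒ 255
(parse-int-or "1010" 2 nil)   ; ⇒ 10
(parse-int-or "12x" 10 :bad)  ; ⇒ :bad
```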

Structures

Structures in Clojure are a little different from those in languages like Java or C++. They are also different from structures in Common Lisp (even though we have a defstruct in Clojure).

In Clojure, structures are a special case of maps and are explained in the data structures section in the reference.

The idea is that multiple instances of a structure will need to access their field values using the field names, which are basically the map keys. This is fast and convenient, especially because Clojure automatically defines the keys as accessors for the structure instances.

Following are the important functions dealing with structures:

defstruct
create-struct
struct
struct-map

For the full API, refer to the data structures section in the Clojure reference.

Structures are created using defstruct, a macro that calls the function create-struct to build the structure and binds the result to the structure name supplied to defstruct.

The object returned by create-struct is what is called the structure basis. This is not a structure instance but contains information of what the structure instances should look like. New instances are created using struct or struct-map.

The structure field names of type keyword or symbols are automatically usable as functions to access fields of the structure. This is possible as structures are maps and this feature is supported by maps. This is not possible for other types of field names such as strings or numbers. It is quite common to use keywords for field names for structures due to the above reason. Also, Clojure optimises structures to share base key information. The following shows sample usage:

 (defstruct employee :name :id)
 (struct employee "Mr. X" 10)                           ; ⇒ {:name "Mr. X", :id 10}
 (struct-map employee :id 20 :name "Mr. Y")             ; ⇒ {:name "Mr. Y", :id 20}

 (def a (struct-map employee :id 20 :name "Mr. Y"))
 (def b (struct employee "Mr. X" 10))

 ;; :name and :id are accessors
 (:name a)                                              ; ⇒ "Mr. Y"
 (:id b)                                                ; ⇒ 10
 (b :id)                                                ; ⇒ 10
 (b :name)                                              ; ⇒ "Mr. X"

Clojure also supports the accessor function that can be used to get accessor functions for fields to allow easy access. This is important when field names are of types other than keyword or symbols. This is seen in the interaction below.

 (def e-str (struct employee "John" 123))
 e-str
 ;; ⇒ {:name "John", :id 123}

 ("name" e-str) ; ERROR: string not an accessor
 ;; ERROR ⇒
 ;; java.lang.ClassCastException: java.lang.String cannot be cast to clojure.lang.IFn
 ;; java.lang.ClassCastException: java.lang.String cannot be cast to clojure.lang.IFn
 ;;         at user.eval__2537.invoke(Unknown Source)
 ;;         at clojure.lang.Compiler.eval(Compiler.java:3847)
 ;;         at clojure.lang.Repl.main(Repl.java:75)

 (def e-name (accessor employee :name))  ; bind accessor to e-name
 (e-name e-str) ; use accessor
 ;; ⇒ "John"
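To make the case of non-keyword field names concrete, here is a small sketch of a structure whose fields are strings. The names point, p and point-x are made up for this illustration.

```clojure
;; A structure whose field names are strings rather than keywords.
(defstruct point "x" "y")
(def p (struct point 3 4))

p            ; ⇒ {"x" 3, "y" 4}

;; Strings are not functions, so ("x" p) would fail; use accessor instead.
(def point-x (accessor point "x"))
(point-x p)  ; ⇒ 3

;; The map itself is a function of its keys, and get works too.
(p "y")      ; ⇒ 4
(get p "y")  ; ⇒ 4
```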

As structures are maps, new fields can be added to structure instances using assoc. dissoc can be used to remove these instance specific keys. Note however that struct base keys cannot be removed.

 b
 ;; ⇒ {:name "Mr. X", :id 10}

 (def b1 (assoc b :function "engineer"))
 b1
 ;; ⇒ {:name "Mr. X", :id 10, :function "engineer"}

 (def b2 (dissoc b1 :function)) ; this works as :function is instance
 b2
 ;; ⇒ {:name "Mr. X", :id 10}

 (dissoc b2 :name)  ; this fails. base keys cannot be dissociated
 ;; ERROR ⇒ java.lang.Exception: Can't remove struct key

assoc can also be used to "update" a structure.

 a
 ;; ⇒ {:name "Mr. Y", :id 20}

 (assoc a :name "New Name")
 ;; ⇒ {:name "New Name", :id 20}

 a                   ; note that 'a' is immutable and did not change
 ;; ⇒ {:name "Mr. Y", :id 20}

 (def a1 (assoc a :name "Another New Name")) ; bind to a1
 a1
 ;; ⇒ {:name "Another New Name", :id 20}

Observe that, like other data structures in Clojure, structures are immutable, so simply doing assoc above does not change a. Hence we bind the new value to a1. While it is possible to rebind the new value back to a, this is not considered good style.

Exception Handling

Clojure supports Java based Exceptions. This may need some getting used to for Common Lisp users who are used to the Common Lisp Condition System.

Clojure does not support a condition system, and one is not expected anytime soon, as per this message. That said, the more common exception system adopted by Clojure is well suited to most programming needs.

If you are new to exception handling, the Java Tutorial on Exceptions is a good place to learn about them.

In Clojure, exceptions are handled using the following special forms:

(try expr* catch-clause* finally-clause?)
- catch-clause -> (catch classname name expr*)
- finally-clause -> (finally expr*)
(throw expr)

Two types of exceptions you may want to handle in Clojure are:

Clojure Exception: These are exceptions generated by Clojure or the underlying Java engine
User Defined Exception: These are exceptions which you might create for your applications

Clojure Exceptions

Below is a simple interaction at the REPL that throws an exception:

user=> (/ 1 0)
java.lang.ArithmeticException: Divide by zero
java.lang.ArithmeticException: Divide by zero
        at clojure.lang.Numbers.divide(Numbers.java:142)
        at user.eval__2127.invoke(Unknown Source)
        at clojure.lang.Compiler.eval(Compiler.java:3847)
        at clojure.lang.Repl.main(Repl.java:75)

In the above case we see a java.lang.ArithmeticException being thrown. This is a runtime exception which is thrown by the underlying JVM. The long message can sometimes be intimidating for new users but the trick is to simply look at the exception (java.lang.ArithmeticException: Divide by zero) and not bother with the rest of the trace.

Similar exceptions may be thrown by the compiler at the REPL.

user=> (def xx yy)
java.lang.Exception: Unable to resolve symbol: yy in this context
clojure.lang.Compiler$CompilerException: NO_SOURCE_FILE:4: Unable to resolve symbol: yy in this context
        at clojure.lang.Compiler.analyze(Compiler.java:3669)
        at clojure.lang.Compiler.access$200(Compiler.java:37)
        at clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:335)
        at clojure.lang.Compiler.analyzeSeq(Compiler.java:3814)
        at clojure.lang.Compiler.analyze(Compiler.java:3654)
        at clojure.lang.Compiler.analyze(Compiler.java:3627)
        at clojure.lang.Compiler.eval(Compiler.java:3851)
        at clojure.lang.Repl.main(Repl.java:75)

In the above case, the compiler does not find a binding for yy and hence throws the exception. If your program is correct (in this case, if yy has been defined first, e.g. with (def yy 10)), you won't see any compile time exceptions.

The following interaction shows how runtime exceptions like ArithmeticException can be handled.

user=> (try (/ 1 0)
            (catch Exception e (prn "in catch"))
            (finally (prn "in finally")))
"in catch"
"in finally"
nil

The syntax for the try block is (try expr* catch-clause* finally-clause?).

As can be seen, it's quite easy to handle exceptions in Clojure. One thing to note is that (catch Exception e ...) is a catch-all for exceptions, as Exception is the superclass of all exceptions (it does not catch Errors, which derive directly from Throwable). It is also possible to catch specific exceptions, which is generally a good idea.

In the example below, we specifically catch ArithmeticException.

user=> (try (/ 1 0) (catch ArithmeticException e (prn "in catch")) (finally (prn "in finally")))
"in catch"
"in finally"
nil

When we use some other exception type in the catch block, we find that the ArithmeticException is not caught and is seen by the REPL.

user=> (try (/ 1 0) (catch IllegalArgumentException e (prn "in catch")) (finally (prn "in finally")))
"in finally"
java.lang.ArithmeticException: Divide by zero
java.lang.ArithmeticException: Divide by zero
        at clojure.lang.Numbers.divide(Numbers.java:142)
        at user.eval__2138.invoke(Unknown Source)
        at clojure.lang.Compiler.eval(Compiler.java:3847)
        at clojure.lang.Repl.main(Repl.java:75)
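A try form is an expression, so the value of the body or of the matching catch clause becomes its result, and several catch clauses can be listed. The sketch below illustrates this; safe-div is a hypothetical function written only for this example.

```clojure
(defn safe-div
  "Divide a by b, returning a keyword instead of throwing.
  A hypothetical helper used only for illustration."
  [a b]
  (try
    (/ a b)
    (catch ArithmeticException e
      (println "arithmetic problem:" (.getMessage e))
      :division-error)
    (catch ClassCastException e
      :not-a-number)
    (finally
      ;; runs in every case, but does not affect the returned value
      (println "done dividing"))))

(safe-div 10 2)    ; ⇒ 5
(safe-div 1 0)     ; ⇒ :division-error
(safe-div "a" 2)   ; ⇒ :not-a-number
```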

User-Defined Exceptions

As mentioned previously, all exceptions in Clojure must be subclasses of java.lang.Exception (or, more generally, of java.lang.Throwable, which is the superclass of Exception). This means that when you want to define your own exceptions in Clojure, you need to derive them from Exception.

Don't worry, that's easier than it sounds :)

The Clojure API provides a function gen-and-load-class which can be used to extend java.lang.Exception for user-defined exceptions. gen-and-load-class generates and immediately loads the bytecode for the specified class.

Now, rather than talking too much, let's quickly look at code.

(gen-and-load-class 'user.UserException :extends Exception)

(defn user-exception-test []
  (try
    (throw (new user.UserException "msg: user exception was here!!"))
    (catch user.UserException e
      (prn "caught exception" e))
    (finally (prn "finally clause invoked!!!"))))

Here we are creating a new class 'user.UserException that extends java.lang.Exception. We create an instance of user.UserException using the special form (new Classname-symbol args*). This is then thrown.

Sometimes you may come across code like (user.UserException. "msg: user exception was here!!"). This is just another way to say new. Note the . (dot) after the user.UserException. This does exactly the same thing.

Here is the interaction:

user=> (load-file "except.clj")
#'user/user-exception-test

user=> (user-exception-test)
"caught exception" user.UserException: msg: user exception was here!!
"finally clause invoked!!!"
nil
user=>

So here we have both the catch and the finally clauses being invoked. That's all there is to it.

With Clojure's support for Java Interop, it is also possible for the user to create exceptions in Java and catch them in Clojure, but creating the exception in Clojure is typically more convenient.
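The gen-and-load-class function comes from early Clojure releases and may not be available in the version you are running. As an alternative sketch, assuming Clojure 1.4 or later, ex-info creates an exception that carries a map of data, which often removes the need for a custom class. The function check-age and the :type key are illustrative choices, not conventions required by Clojure.

```clojure
(defn check-age
  "Throw an exception carrying a data map when age is negative.
  A hypothetical function used only for illustration."
  [age]
  (if (neg? age)
    (throw (ex-info "age must be non-negative"
                    {:type :invalid-age, :age age}))
    age))

(try
  (check-age -5)
  (catch clojure.lang.ExceptionInfo e
    ;; ex-data returns the map that was attached by ex-info
    {:message (.getMessage e)
     :data    (ex-data e)}))
;; ⇒ {:message "age must be non-negative", :data {:type :invalid-age, :age -5}}
```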

Mutation Facilities

Employee Record Manipulation

Data structures and sequences in Clojure are immutable, as seen in the examples presented in the Structures section above (it is suggested that the reader go through that section first).

While immutable data has its advantages, any project of reasonable size will require the programmer to maintain some sort of state. Managing state in a language with immutable sequences and data structures is a frequent source of confusion for people used to programming languages that allow mutation of data.

A good essay on the Clojure approach is "Values and Change - Clojure's approach to Identity and State" (http://clojure.org/state), written by Rich Hickey.

It may be useful to watch the Clojure Concurrency screencast, as some of its concepts are used in this section, specifically refs and transactions.

In this section we create a simple employee record set and provide functions to:

Add an employee
Delete employee by name
Change employee role by name

The example is purposely kept simple as the intent is to show the state and mutation facilities rather than provide full functionality.

Let's dive into the code.

(require '[clojure.set :as set])   ; load clojure.set; use set/fn-name rather than clojure.set/fn-name

(defstruct employee
           :name :id :role) ; == (def employee (create-struct :name :id ..))

(def employee-records (ref #{}))

;;;===================================
;;; Private Functions: No Side-effects
;;;===================================

(defn- update-role [n r recs]
  ;; rebuild the set, replacing the role only in records whose name is n
  (set (map #(if (= (:name %) n) (assoc % :role r) %) recs)))

(defn- delete-by-name [n recs]
  (set/select #(not (= (:name %) n)) recs))

;;;=============================================
;;; Public Function: Update Ref employee-records
;;;=============================================
(defn update-employee-role
  "update the role for employee named n to the new role r"
  [n r]
  (dosync
    (ref-set employee-records (update-role n r @employee-records))))

(defn delete-employee-by-name
  "delete employee with name n"
  [n]
  (dosync
    (ref-set employee-records
             (delete-by-name n @employee-records))))

(defn add-employee
  "add new employee e to employee-records"
  [e]
  (dosync (commute employee-records conj e)))

;;;=========================
;;; initialize employee data
;;;=========================
(add-employee (struct employee "Jack" 0 :Engineer))
(add-employee (struct employee "Jill" 1 :Finance))
(add-employee (struct-map employee :name "Hill" :id 2 :role :Stand))

In the first few lines we define the employee structure. The interesting definition after that is employee-records.

(def employee-records (ref #{}))

In Clojure, refs allow mutation of a storage location within a transaction.

user=> (def x (ref [1 2 3]))
#'user/x
user=> x
clojure.lang.Ref@128594c
user=> @x
[1 2 3]
user=> (deref x)
[1 2 3]
user=>
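Reading a ref needs no transaction, but changing it does. The sketch below uses alter, a sibling of the ref-set and commute functions used in the listing, to show both the successful case and the error raised outside a transaction.

```clojure
(def x (ref [1 2 3]))

;; alter applies a function to the current value inside a transaction.
(dosync (alter x conj 4))
@x   ; ⇒ [1 2 3 4]

;; Changing a ref outside a transaction is an error.
(try
  (alter x conj 5)
  (catch IllegalStateException e
    (.getMessage e)))
;; ⇒ "No transaction running"
```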

Next we define private functions update-role and delete-by-name using defn- (note the minus '-' at the end). Observe that these are pure functions without any side-effects.

update-role takes the employee name n, the new role r and a table of employee records recs. As collections are immutable, this function returns a new table of records with the employee role updated appropriately. delete-by-name behaves in a similar manner, returning a new table of employees after deleting the relevant employee record.

For an explanation of the set API, see the Clojure API reference.

We still haven't looked at how state is maintained. This is done by the public functions in the listing update-employee-role, delete-employee-by-name and add-employee.

These functions delegate the job of record processing to the private functions. The important things to note are the use of the following functions:

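ref-set sets the value of a ref.
dosync is mandatory, as refs can only be updated in a transaction and dosync sets up the transaction.
commute updates the in-transaction value of a ref by applying a function to it; it suits operations, such as conj on a set, whose order of application does not matter.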

For a detailed explanation of these functions, see the refs section in the API reference.
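The three functions can be seen side by side in the following sketch, which also shows one transaction updating two refs together. The account refs and transfer! are hypothetical examples and are not part of the employee program.

```clojure
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Move amount from one ref to another in a single transaction.
  A hypothetical function used only for illustration."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)
[@checking @savings]   ; ⇒ [70 30]

;; ref-set replaces the value outright; commute applies a function whose
;; calls may be reordered, so it suits commutative updates such as +.
(dosync (ref-set checking 0))
(dosync (commute savings + 70))
[@checking @savings]   ; ⇒ [0 100]
```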

The add-employee function is quite trivial and hence is not broken up into private and public functions.

The source listing initializes the records with sample data towards the end.

Below is the interaction for this program.

user=> (load-file "employee.clj")
#{{:name "Jack", :id 0, :role :Engineer} {:name "Hill", :id 2, :role :Stand} {:name "Jill", :id 1, :role :Finance}}

user=> @employee-records
#{{:name "Jack", :id 0, :role :Engineer} {:name "Hill", :id 2, :role :Stand} {:name "Jill", :id 1, :role :Finance}}

user=> (add-employee (struct employee "James" 3 :Bond))
#{{:name "James", :id 3, :role :Bond} {:name "Jack", :id 0, :role :Engineer} {:name "Hill", :id 2, :role :Stand} {:name "Jill", :id 1, :role :Finance}}
user=> @employee-records
#{{:name "James", :id 3, :role :Bond} {:name "Jack", :id 0, :role :Engineer} {:name "Hill", :id 2, :role :Stand} {:name "Jill", :id 1, :role :Finance}}

user=> (update-employee-role "Jill" :Sr.Finance)
#{{:name "James", :id 3, :role :Bond} {:name "Jack", :id 0, :role :Engineer} {:name "Hill", :id 2, :role :Stand} {:name "Jill", :id 1, :role :Sr.Finance}}
user=> @employee-records
#{{:name "James", :id 3, :role :Bond} {:name "Jack", :id 0, :role :Engineer} {:name "Hill", :id 2, :role :Stand} {:name "Jill", :id 1, :role :Sr.Finance}}

user=> (delete-employee-by-name "Hill")
#{{:name "James", :id 3, :role :Bond} {:name "Jack", :id 0, :role :Engineer} {:name "Jill", :id 1, :role :Sr.Finance}}
user=> @employee-records
#{{:name "James", :id 3, :role :Bond} {:name "Jack", :id 0, :role :Engineer} {:name "Jill", :id 1, :role :Sr.Finance}}

Two things to note about the program:

Using refs and transactions makes the program inherently thread safe. If we want to extend the program for a multi-threaded environment (using Clojure agents), it will scale with minimal change.
Because the pure functionality is kept separate from the public functions that manage state, it is easier to ensure that the functionality is correct, as pure functions are easier to test.
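The thread safety claim can be tried out with a small experiment. This sketch uses futures rather than agents to run the updates on several threads; the names hits and workers are invented for the example.

```clojure
(def hits (ref 0))

;; Start 100 futures; each increments the ref in its own transaction.
(def workers
  (doall (for [_ (range 100)]
           (future (dosync (alter hits inc))))))

;; Wait for every future to finish before reading the result.
(doseq [w workers] @w)

@hits   ; ⇒ 100
```

No update is lost, because a transaction that conflicts with another is retried automatically. In a standalone script, calling (shutdown-agents) at the end lets the JVM exit promptly.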

Namespaces ^[1]

Overview

use require to load Clojure libraries
use refer to refer to functions in the current namespace
use use to load and refer all in one step
use import to refer to Java classes in the current namespace

Require

1. You can load the code for any Clojure library with (require 'libname). Note the quote, which is needed at the REPL. Try it with clojure.contrib.math:

     (require 'clojure.contrib.math)

2. Then print the directory of names available in the namespace

     (dir clojure.contrib.math)

3. Use lcm to calculate the least common multiple:

     (clojure.contrib.math/lcm 11 41)
     -> 451

4. Writing out the namespace prefix on every function call is a pain, so you can specify a shorter alias using :as:

     (require [clojure.contrib.math :as m])

5. Calling the shorter form is much easier:

     (m/lcm 120 1000)
     -> 3000

6. You can see all the loaded namespaces with

     (all-ns)
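The clojure.contrib libraries were distributed separately from Clojure itself and may not be on your classpath. The same steps can be tried with a namespace that ships with Clojure; the sketch below uses clojure.string (available since Clojure 1.2) purely as a stand-in.

```clojure
;; clojure.string is used here as a stand-in for any library namespace.
(require '[clojure.string :as str])

(str/upper-case "hello")          ; ⇒ "HELLO"
(str/join ", " ["a" "b" "c"])     ; ⇒ "a, b, c"

;; The fully qualified name still works once the namespace is loaded.
(clojure.string/reverse "abc")    ; ⇒ "cba"
```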

Refer and Use

1. It would be even easier to use a function with no namespace prefix at all. You can do this by referring the namespace, which maps its public names into the current namespace:

     (refer 'clojure.contrib.math)

2. Now you can call lcm directly:

     (lcm 16 30)
     -> 240

3. If you want to load and refer all in one step, call use:

     (use 'clojure.contrib.math)

4. Referring a library refers all of its names. This is often undesirable, because

it does not clearly document intent to readers
it brings in more names than you need, which can lead to name collisions

Instead, use the following style to specify only those names you want:

     (use '[clojure.contrib.math :only (lcm)])

The :only option is available on all the namespace management forms. (There is also an :exclude which works as you might expect.)

5. The variable *ns* always contains the current namespace, and you can see what names your current namespace refers to by calling

     (ns-refers *ns*)

6. The refers map is often pretty big. If you are only interested in one symbol, pass that symbol to the result of calling ns-refers:

     ((ns-refers *ns*) 'dir)
     -> #'clojure.contrib.ns-utils/dir

Import

1. Importing is like referring, but for Java classes instead of Clojure namespaces. Instead of

     (java.io.File. "woozle")

you can say

     (import java.io.File)
     (File. "woozle")

2. You can import multiple classes in a Java package with the form

     (import [package Class Class])

For example:

     (import [java.util Date Random])
     (Date. (long (.nextInt (Random.))))

3. Programmers new to Lisp are often put off by the "inside-out" reading of forms like the date creation above. Starting from the inside, you

get a new Random
get the next random integer
cast it to a long
pass the long to the Date constructor

You don't have to write inside-out code in Clojure. The -> macro takes its first form, and passes it as the first argument to its next form. The result then becomes the first argument of the next form, and so on. It is easier to read than to describe:

     (-> (Random.) (.nextInt) (long) (Date.))
     -> #<Date Sun Dec 21 12:47:20 EST 1969>
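To see that the threaded and the nested forms mean the same thing, compare them on inputs with predictable results. This is only a sketch; the values are arbitrary.

```clojure
;; Each pair computes the same value; the second form reads left to right.
(* (- 5 2) 10)                        ; ⇒ 30
(-> 5 (- 2) (* 10))                   ; ⇒ 30

(.toUpperCase (.trim "  hello  "))    ; ⇒ "HELLO"
(-> "  hello  " .trim .toUpperCase)   ; ⇒ "HELLO"
```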

Load and Reload

The REPL isn't for everything. For work you plan to keep, you will want to place your source code in a separate file. Here are the rules of thumb to remember when creating your own Clojure namespaces.

1. Clojure namespaces (a.k.a. libraries) are equivalent to Java packages.

2. Clojure respects Java naming conventions for directories and files, but Lisp naming conventions for namespace names. So a Clojure namespace com.my-app.utils would live in a path named com/my_app/utils.clj. Note especially the underscore/hyphen distinction.

3. Clojure files normally begin with a namespace declaration, e.g.

     (ns com.my-app.utils)

4. The syntax for import/use/refer/require presented in the previous sections is for REPL use. Namespace declarations allow similar forms—similar enough to aid memory, but also different enough to confuse. The following forms at the REPL:

     (use 'foo.bar)
     (require 'baz.quux)
     (import '[java.util Date Random])

would look like this in a source code file:

     (ns com.my-app.utils
       (:use foo.bar)
       (:require baz.quux)
       (:import [java.util Date Random]))

Symbols become keywords, and quoting is no longer required.
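Since foo.bar and baz.quux are placeholders, here is a sketch of a declaration that can actually be evaluated. It relies only on namespaces and classes that ship with Clojure and Java; the namespace example.utils and the function shout are made up for the illustration.

```clojure
(ns example.utils
  (:require [clojure.string :as str]
            [clojure.set :as set])
  (:import [java.util Date]))

(defn shout
  "Upper-case a sentence. A hypothetical function for this sketch."
  [sentence]
  (str/upper-case sentence))

(shout "hello")                    ; ⇒ "HELLO"
(set/union #{1 2} #{2 3})          ; ⇒ #{1 2 3}
(instance? Date (Date.))           ; ⇒ true
```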

5. At the time of this writing, the error messages for doing it wrong with namespaces are, well, opaque. Be careful.

Now let's try creating a source code file. We aren't going to bother with explicit compilation for now. Clojure will automatically (and quickly) compile source code files on the classpath. Instead, we can just add Clojure (.clj) files to the src directory.

1. Create a file named student/dialect.clj in the src directory, with the appropriate namespace declaration:

     (ns student.dialect)

2. Now, implement a simple canadianize function that takes a string and appends ", eh" to it:

     (defn canadianize [sentence] (str sentence ", eh"))

3. From your REPL, use the new namespace:

     (use 'student.dialect)

4. Now try it out.

     (canadianize "Hello, world.")
     -> "Hello, world., eh"

5. Oops! We need to trim the period off the end of the input. Fortunately, clojure.contrib.str-utils2 provides chop. Go back to student/dialect.clj and require clojure.contrib.str-utils2:

     (ns student.dialect (:require [clojure.contrib.str-utils2 :as s]))

6. Now, update canadianize to use chop (this version also adds a question mark):

     (defn canadianize [sentence] (str (s/chop sentence) ", eh?"))

7. If you simply retry calling canadianize from the REPL, you will not see your new change, because the code was already loaded. However, you can use namespace forms with :reload (or :reload-all) to reload a namespace (and its dependencies).

     (use :reload 'student.dialect)

8. Now you should see the new version of canadianize:

     (canadianize "Hello, world.")
     -> "Hello, world, eh?"
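If clojure.contrib.str-utils2 is not available in your setup, the same function can be sketched with clojure.string, which ships with Clojure 1.2 and later. Note that this variant removes a trailing period with a regular expression instead of chopping the last character, so input without a final period is left intact.

```clojure
(ns student.dialect
  (:require [clojure.string :as str]))

(defn canadianize
  "Drop a trailing period, if any, then append the tag question."
  [sentence]
  (str (str/replace sentence #"\.$" "") ", eh?"))

(canadianize "Hello, world.")   ; ⇒ "Hello, world, eh?"
(canadianize "Nice day")        ; ⇒ "Nice day, eh?"
```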

Functional Programming

Anonymous Functions

Clojure supports anonymous functions using fn or the shorter reader macro #(..). The #(..) form is convenient due to its conciseness but is somewhat limited, as #(..) forms cannot be nested.

Below are some examples using both forms:

user=> ((fn [x] (* x x)) 3)
9

user=> (map #(list %1 (inc %2)) [1 2 3] [1 2 3])
((1 2) (2 3) (3 4))

user=> (map (fn [x y] (list x (inc y))) [1 2 3] [1 2 3])
((1 2) (2 3) (3 4))

user=> (map #(list % (inc %)) [1 2 3])
((1 2) (2 3) (3 4))

user=> (map (fn [x] (list x (inc x))) [1 2 3])
((1 2) (2 3) (3 4))

user=> (#(apply str %&) "Hello")
"Hello"

user=> (#(apply str %&) "Hello" ", " "World!")
"Hello, World!"

Note that in #(..) form, %N is used for arguments (1 based) and %& for the rest argument. % is a synonym for %1.
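When nesting is needed, the usual way around the limitation is to write one of the functions with fn. A short sketch:

```clojure
;; #(...) cannot contain another #(...), so the inner function uses fn.
(map #(map (fn [y] (* % y)) [1 2]) [1 2 3])
;; ⇒ ((1 2) (2 4) (3 6))

;; fn can also be given a name so that it can call itself.
((fn fact [n] (if (zero? n) 1 (* n (fact (dec n))))) 5)
;; ⇒ 120
```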

Lazy Evaluation of Sequences

This section walks through some code to give a better feel for the lazy evaluation of sequences in Clojure and how it might be useful. We also measure memory and time to better understand what's happening.

Consider a scenario where we want to do a big-computation (1 second each) on records in a list with a billion items. Typically we may not need all the billion items processed (e.g. we may need only a filtered subset).

Let's define a little utility function free-mem to help us monitor memory usage and another function big-computation that takes 1 second to do its job.

(defn free-mem [] (.freeMemory (Runtime/getRuntime)))

(defn big-computation [x] (Thread/sleep 1000) (* 10 x))

In the functions above we use java.lang.Runtime and java.lang.Thread for getting free memory and supporting sleep.

We will also be using the built in function time to measure our performance.

Here is a simple usage at REPL:

user=> (defn free-mem [] (.freeMemory (Runtime/getRuntime)))
#'user/free-mem

user=> (defn big-computation [x] (Thread/sleep 1000) (* 10 x))
#'user/big-computation

user=> (time (big-computation 1))
"Elapsed time: 1000.339953 msecs"
10

Now we define a list of 1 billion numbers called nums.

user=> (time (def nums (range 1000000000)))
"Elapsed time: 0.166994 msecs"
#'user/nums

Note that it takes Clojure only 0.17 ms to create a list of 1 billion numbers. This is because the list is not really created. The user just has a promise from Clojure that the appropriate number from this list will be returned when asked for.

Now, let's say we want to apply big-computation to the items 10001 to 10005 from this list.

This is the code for it:

;; The comments below should be read in the numbered order
;; to better understand this code.

(time                              ; [7] time the transaction
  (def v                           ; [6] save vector as v
    (apply vector                  ; [5] turn the list into a vector
           (map big-computation    ; [4] process each item for 1 second
                (take 5            ; [3] take first 5 from filtered items
                      (filter      ; [2] keep items strictly between 10000 and 10010
                        (fn [x] (and (> x 10000) (< x 10010)))
                        nums)))))) ; [1] nums = 1 billion items

Putting this code at the REPL, this is what we get:

user=> (free-mem)
2598000
user=> (time (def v (apply vector (map big-computation (take 5 (filter (fn [x] (and (> x 10000) (< x 10010))) nums))))))
"Elapsed time: 5036.234311 msecs"
#'user/v
user=> (free-mem)
2728800

The comments in the code block indicate the working of this code. It took us ~5 seconds to execute this. Here are some points to note:

It did not take us 10000 seconds to filter out items 10001 to 10009 from the list
It did not take us 9 seconds to process all 9 filtered items; only the first 5 were ever computed
Overall, the computation took only about 5 seconds, which is basically the time for 5 calls to big-computation.
The amount of free memory is virtually the same even though we now have the promise of a billion records for processing. (It actually seems to have gone up a bit due to garbage collection)

Now if we access v it takes negligible time.

user=> (time (seq v))
"Elapsed time: 0.042045 msecs"
(100010 100020 100030 100040 100050)
user=>

Another point to note is that a lazy sequence does not mean that the computation is done every time; once the computation is done, it gets cached.

Try the following:

user=> (time (def comps (map big-computation nums)))
"Elapsed time: 0.113564 msecs"
#'user/comps

user=> (defn t5 [] (take 5 comps))
#'user/t5

user=> (time (doall (t5)))
"Elapsed time: 5010.49418 msecs"
(0 10 20 30 40)

user=> (time (doall (t5)))
"Elapsed time: 0.096104 msecs"
(0 10 20 30 40)

user=>

In the first step we map big-computation to a billion nums. Then we define a function t5 that takes 5 computations from comps. Observe that the first time t5 takes 5 seconds and after that it takes negligible time. This is because once the calculation is done, the results are cached for later use. Since the result of t5 is also lazy, doall is needed to force it to be eagerly evaluated before time returns to the REPL. Note that these timings come from an early Clojure release. Since Clojure 1.1, range returns a chunked sequence that map processes a chunk (typically 32 items) at a time, so on newer versions the first call may take noticeably longer than 5 seconds.

Lazy data structures can offer a significant advantage, provided the program is designed to leverage them. Designing a program for lazy sequences and infinite data structures is a paradigm shift: instead of eagerly doing the computation, as in languages like C and Java, the program gives a promise of a computation.

This section is based on this mail in the Clojure group.
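Laziness and caching can also be observed without waiting for sleeps, by counting how often the mapped function runs. The following sketch uses an atom purely as a counting instrument; calls, tracked and tracked-seq are names invented for this example. The source sequence is built with iterate, which is not chunked, so items are computed strictly one at a time.

```clojure
;; calls counts how often the mapping function actually runs.
;; The atom is only an instrument for this sketch.
(def calls (atom 0))

(defn tracked [x]
  (swap! calls inc)
  (* 10 x))

;; iterate yields an unchunked sequence, so items are computed one by one.
(def tracked-seq (map tracked (iterate inc 0)))
@calls                        ; ⇒ 0, nothing computed yet

(doall (take 5 tracked-seq))  ; ⇒ (0 10 20 30 40)
@calls                        ; ⇒ 5

(doall (take 5 tracked-seq))  ; served from the cache
@calls                        ; ⇒ 5, no recomputation
```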

Infinite Data Source

As Clojure supports lazy evaluation of sequences, it is possible to have infinite data sources in Clojure. The infinite sequence (0 1 2 3 4 5 ....) can be defined using (range) since Clojure 1.2:^[2]

 (def nums (range))         ; Old version (def nums (iterate inc 0))
 ;; ⇒ #'user/nums
 (take 5 nums)
 ;; ⇒ (0 1 2 3 4)
 (drop 5 (take 11 nums))
 ;; ⇒ (5 6 7 8 9 10)

Here we see two ways to create an infinite list of numbers starting from 0: (range) and (iterate inc 0). As Clojure supports lazy sequences, only the required items are generated and taken off the head of this list. If you were to type (range) or (iterate inc 0) directly at the prompt, the REPL would try to print the whole sequence, fetching the next number forever, and you would need to terminate the process.

(iterate f x) returns a lazy sequence in which each element is f applied to the previous one: x, (f x), (f (f x)), (f (f (f x))) and so on. (iterate inc 0) first gives 0, then (inc 0) => 1, then (inc (inc 0)) => 2 and so on.
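As a small illustration of the same idea with functions other than inc, here is a sketch; the names powers-of-two and fib-pairs are made up for this example.

```clojure
;; Each element is the previous one passed through the function.
(def powers-of-two (iterate (fn [x] (* 2 x)) 1))

(take 6 powers-of-two)
;; ⇒ (1 2 4 8 16 32)

;; The seed can be any value, for example a vector holding a Fibonacci pair.
(def fib-pairs (iterate (fn [[a b]] [b (+ a b)]) [0 1]))

(map first (take 8 fib-pairs))
;; ⇒ (0 1 1 2 3 5 8 13)
```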

(take n coll) returns a lazy sequence of the first n items of the collection; the collection itself is not modified. There are many variations on this theme:

(take n coll)
(take-nth n coll)
(take-last n coll)
(take-while pred coll)
(drop n coll)
(drop-while pred coll)

The reader is encouraged to look at the Clojure Sequence API for details.
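To get a feel for how these functions differ, the following sketch applies each of them to the same small vector (sample is just an illustrative name):

```clojure
(def sample [-2 -1 0 1 2 -3])

(take 2 sample)            ;; ⇒ (-2 -1)
(take-nth 2 sample)        ;; ⇒ (-2 0 2)      every 2nd item, starting with the first
(take-last 2 sample)       ;; ⇒ (2 -3)
(take-while neg? sample)   ;; ⇒ (-2 -1)       stops at the first non-negative item
(drop 2 sample)            ;; ⇒ (0 1 2 -3)
(drop-while neg? sample)   ;; ⇒ (0 1 2 -3)    the trailing -3 is kept
```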

List Comprehension

List comprehensions are constructs offered by a language that make it easy to create new lists from old ones. As simple as it sounds, it is a very powerful concept. Clojure has good support for list comprehensions.

Let's say we want a set of all x + 1 for all x divisible by 4 with x starting from 0.

Here is one way to do it in Clojure:

(def nums (iterate inc 0))
;; ⇒ #'user/nums
(def s (for [x nums :when (zero? (rem x 4))] (inc x)))
;; ⇒ #'user/s
(take 5 s)
;; ⇒ (1 5 9 13 17)

nums is the infinite list of numbers that we saw in the previous section. We need to (def s ...) the result because it is an infinite source of numbers. Evaluating the for expression directly at the prompt would make the REPL pull numbers out of this source indefinitely while trying to print them.

The key construct here is the for macro. The expression [x nums ... says that x comes out of nums one at a time. The next clause, :when (zero? (rem x 4)), says that x should be pulled out only if it meets this criterion. Once such an x is out, inc is applied to it. Binding all this to s gives us an infinite sequence. Hence the (take 5 s) and the expected result that we see.

Another way to achieve the same result is to use map and filter.

(def s (map inc (filter (fn [x] (zero? (rem x 4))) nums)))
;; ⇒ #'user/s
(take 5 s)
;; ⇒ (1 5 9 13 17)

Here we create a predicate (fn [x] (zero? (rem x 4))) and pull x's out of nums only if this predicate is satisfied. This is done by filter. Note that since filter is lazy, what it gives is only a promise of supplying the next number that satisfies the predicate. It does not (and, in this particular case, cannot) evaluate the entire list. Once we have this stream of x's, it is simply a matter of mapping inc over it: (map inc ...).

The choice between List Comprehension i.e. for and map/filter is largely a matter of user preference. There is no major advantage of one over the other.
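When more than one collection is involved, the two styles look more different, even though they still produce the same result. The following sketch (pairs and pairs-mf are hypothetical names) builds all pairs [x y] with x < y, once with for and once with mapcat, map and filter:

```clojure
;; All pairs [x y] with x < y, drawn from 0..3, using for.
(def pairs
  (for [x (range 4)
        y (range 4)
        :when (< x y)]
    [x y]))
;; ⇒ ([0 1] [0 2] [0 3] [1 2] [1 3] [2 3])

;; The same result with mapcat, map and filter.
(def pairs-mf
  (mapcat (fn [x]
            (map (fn [y] [x y])
                 (filter (fn [y] (< x y)) (range 4))))
          (range 4)))

(= pairs pairs-mf)
;; ⇒ true
```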

Lisp

Sequence Functions

(first coll)

Gets the first element of a sequence. Returns nil for an empty sequence or nil.

 (first (list 1 2 3 4))
 ;; ⇒ 1
 (first (list))
 ;; ⇒ nil
 (first nil)
 ;; ⇒ nil
 (map first [[1 2 3] "Test" (list 'hi 'bye)])
 ;; ⇒ (1 \T hi)
 (first (drop 3 (list 1 2 3 4)))
 ;; ⇒ 4

(rest coll)

Gets everything except the first element of a sequence. Returns an empty sequence, (), when given an empty sequence or nil. (Use next instead if you want nil in those cases.)

 (rest (list 1 2 3 4))
 ;; ⇒ (2 3 4)
 (rest (list))
 ;; ⇒ ()
 (rest nil)
 ;; ⇒ ()
 (map rest [[1 2 3] "Test" (list 'hi 'bye)])
 ;; ⇒ ((2 3) (\e \s \t) (bye))
 (rest (take 3 (list 1 2 3 4)))
 ;; ⇒ (2 3)
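The difference between rest and its sibling next matters mostly when the end of a sequence is used as a loop condition. A minimal sketch, where count-items is a hypothetical helper written only for this illustration:

```clojure
(rest [1])   ;; ⇒ ()
(next [1])   ;; ⇒ nil

;; Because next returns nil at the end, it can drive a loop directly.
(defn count-items [coll]
  (loop [s (seq coll)
         n 0]
    (if s
      (recur (next s) (inc n))
      n)))

(count-items [:a :b :c])   ;; ⇒ 3
(count-items [])           ;; ⇒ 0
```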

(map f colls*)

Applies f lazily to each item in the sequences, returning a lazy sequence of the return values of f.

Because the supplied function always returns true, these both return a sequence of true, repeated ten times.

 (map (fn [x] true) (range 10))
 ;; ⇒ (true true true true true true true true true true)
 (map (constantly true) (range 10)) 
 ;; ⇒ (true true true true true true true true true true)

These two functions both multiply their argument by 2, so (map ...) returns a sequence where every item in the original is doubled.

 (map (fn [x] (* 2 x)) (range 10))
 ;; ⇒ (0 2 4 6 8 10 12 14 16 18)
 (map (partial * 2) (range 10))
 ;; ⇒ (0 2 4 6 8 10 12 14 16 18)

(map ...) may take as many sequences as you supply to it (though it requires at least one sequence), but the function argument must accept as many arguments as there are sequences.

Thus, these two functions give the sequences multiplied together:

 (map (fn [a b] (* a b)) (range 10) (range 10))
 ;; ⇒ (0 1 4 9 16 25 36 49 64 81)
 (map * (range 10) (range 10))
 ;; ⇒ (0 1 4 9 16 25 36 49 64 81)

But the first one will only take two sequences as arguments, whereas the second one will take as many as are supplied.

 (map (fn [a b] (* a b)) (range 10) (range 10) (range 10))
 ;; ⇒ java.lang.IllegalArgumentException: Wrong number of args passed
 (map * (range 10) (range 10) (range 10))
 ;; ⇒ (0 1 8 27 64 125 216 343 512 729)

(map ...) stops as soon as it reaches the end of any supplied sequence. In all three of the cases below, (map ...) therefore stops at 5 items (the length of the shortest sequence), even though the second and third examples supply a sequence longer than 5 items (in the third example, the longer sequence is infinite).

Each of these takes a sequence made up solely of the number 2 and the sequence of numbers (0 1 2 3 4) and multiplies them together.

 (map * (repeat 5 2) (range 5))     ; (repeat n x) supersedes the deprecated (replicate n x)
 ;; ⇒ (0 2 4 6 8)
 (map * (repeat 10 2) (range 5))
 ;; ⇒ (0 2 4 6 8)
 (map * (repeat 2) (range 5))
 ;; ⇒ (0 2 4 6 8)

(every? pred coll)

Returns true if pred is true for every item in a sequence. False otherwise. pred, in this case, is a function taking a single argument and returning true or false.

As this function returns true always, (every? ...) evaluates to true. Note that these two functions say the same thing.

 (every? (fn [x] true) (range 10))
 ;; ⇒ true
 (every? (constantly true) (range 10))
 ;; ⇒ true

(pos? x) returns true when its argument is greater than zero. Since (range 10) gives the numbers from 0 to 9 and (range 1 10) gives the numbers from 1 to 9, (pos? x) returns false once for the first sequence and never for the second.

 (every? pos? (range 10))
 ;; ⇒ false
 (every? pos? (range 1 10))
 ;; ⇒ true

This function returns true when its argument is an even number. Since the range from 1 to 9 and the sequence (1 3 5 7 9) both contain odd numbers, (every? ...) returns false for them.

As the sequence (2 4 6 8) contains only even numbers, (every? ...) returns true.

 (every? (fn [x] (= 0 (rem x 2))) (range 1 10))
 ;; ⇒ false
 (every? (fn [x] (= 0 (rem x 2))) (range 1 10 2))
 ;; ⇒ false
 (every? (fn [x] (= 0 (rem x 2))) (range 2 10 2))
 ;; ⇒ true

If we needed to check for even numbers elsewhere, we could instead define a named function and pass that to (every? ...). Note that clojure.core already provides even?, so the example uses the name my-even? to avoid shadowing it.

 (defn my-even? [num] (= 0 (rem num 2)))
 ;; ⇒ #'user/my-even?
 (every? my-even? (range 1 10 2))
 ;; ⇒ false
 (every? my-even? (range 2 10 2))
 ;; ⇒ true

Complementary function: (not-every? pred coll)

Returns the complementary value to (every? pred coll): false if pred is true for all items in the sequence, true otherwise.

 (not-every? pos? (range 10))
 ;; ⇒ true
 (not-every? pos? (range 1 10))
 ;; ⇒ false
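One edge case worth knowing is the empty collection. Since there is no item for which pred could fail, every? is true and not-every? is false, as this short sketch shows:

```clojure
(every? pos? [])       ;; ⇒ true
(every? pos? nil)      ;; ⇒ true
(not-every? pos? [])   ;; ⇒ false
```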

Looping and Iterating

Three different ways to loop from 1 to 20, incrementing by 2 and printing the loop index each time (from a mailing list discussion):

 ;; Version 1
 (loop [i 1]
   (when (< i 20)
     (println i)
     (recur (+ 2 i))))
 
 ;; Version 2
 (dorun (for [i (range 1 20 2)]
          (println i)))
 
 ;; Version 3
 (doseq [i (range 1 20 2)]
   (println i))
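To convince yourself that the three versions really behave the same, their printed output can be captured with with-out-str and compared. This is only a sketch; the function names v1, v2 and v3 are introduced here for the comparison.

```clojure
(defn v1 []
  (loop [i 1]
    (when (< i 20)
      (println i)
      (recur (+ 2 i)))))

(defn v2 []
  (dorun (for [i (range 1 20 2)]
           (println i))))

(defn v3 []
  (doseq [i (range 1 20 2)]
    (println i)))

;; with-out-str returns everything printed inside it as a string.
(def out1 (with-out-str (v1)))
(def out2 (with-out-str (v2)))
(def out3 (with-out-str (v3)))

(= out1 out2 out3)
;; ⇒ true
```

Of the three, doseq states the intent most directly when the loop exists only for its side effects.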

Mutual Recursion

Mutual recursion is tricky but possible in Clojure. The body of a (defn ...) can refer only to the function itself or to names that already exist. However, Clojure allows a name to be declared first and given its definition later, in the following way:

 ;;; Mutual recursion example
 ;;; (these definitions shadow clojure.core's even? and odd?, so expect a warning)
 
 ;; Forward declaration
 (declare even?)
 
 ;; Define odd in terms of 0 or even
 (defn odd? [n]
   (if (zero? n)
       false
       (even? (dec n))))
 
 ;; Define even? in terms of 0 or odd
 (defn even? [n]
   (if (zero? n)
       true
       (odd? (dec n))))
 
 ;; Is 3 even or odd?
 (even? 3) 
 ;; ⇒ false

Mutual recursion is not possible between local functions defined with let, since each binding can see only the bindings before it; letfn exists for that case. To declare a set of private mutually recursive functions at the top level, you can use the above technique with defn- instead of defn, which generates private definitions.

However, one can emulate mutually recursive functions with loop and recur. The macro below relies on case, which has been part of clojure.core since Clojure 1.2, so no additional library is required.

(defmacro multi-loop
  [vars & clauses]
  (let [loop-var  (gensym "multiloop__")
        kickstart (first clauses)
        loop-vars (into [loop-var kickstart] vars)]
    `(loop ~loop-vars
       (case ~loop-var
          ~@clauses))))

(defn even?
  [n]
  (multi-loop [n n]
    :even (if (zero? n)
            true
            (recur :odd (dec n)))
    :odd  (if (zero? n)
            false
            (recur :even (dec n)))))
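Two other tools from clojure.core are commonly used for mutual recursion: letfn, for local functions that refer to each other, and trampoline, which avoids growing the stack. The sketch below is not part of the original example; all function names in it are hypothetical.

```clojure
;; letfn: local functions that may refer to each other.
(defn even-via-letfn? [n]
  (letfn [(ev? [k] (if (zero? k) true  (od? (dec k))))
          (od? [k] (if (zero? k) false (ev? (dec k))))]
    (ev? n)))

(even-via-letfn? 10)   ;; ⇒ true

;; trampoline: each step returns a no-argument function (a thunk)
;; instead of calling the other function directly, so the stack
;; does not grow with n.
(declare tramp-odd?)

(defn tramp-even? [n]
  (if (zero? n) true #(tramp-odd? (dec n))))

(defn tramp-odd? [n]
  (if (zero? n) false #(tramp-even? (dec n))))

(trampoline tramp-even? 100001)   ;; ⇒ false
```

The letfn version still uses one stack frame per call, so it suits small inputs; the trampoline version keeps calling the returned thunks until a non-function value comes back.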

Collection Abstractions

Concurrency

Macros

A nice walkthrough on how to write a macro can be found at http://blog.n01se.net/?p=33 by Chouser.

Macros are used to transform data structures at compile time. Let's develop a new do1 macro. The do special form of Clojure evaluates all the forms it contains for their side effects and returns the value of the last one. do1 should act similarly, but return the value of the first sub-form.

In the beginning one should first think about how the macro should be invoked.

(do1
  :x
  :y
  :z)

The return value should be :x. Then the next step is to think about how we would do this manually.

(let [x :x]
  :y
  :z
  x)

This first evaluates :x, then :y and :z. Finally the let evaluates to the result of evaluating :x. This can be turned into a macro using defmacro and `.

(defmacro do1
  [fform & rforms]
  `(let [x# ~fform]
     ~@rforms
     x#))

So what happens here? It is just a simple translation. We use let to create a temporary place to hold the result of our first form. Since we cannot simply pick some name (it might also be used in the user's code), we generate a new one with x#. The trailing # is a special notation inside a syntax-quoted form: it generates a fresh name that is guaranteed not to clash with names in the user's code. The ~ "unquotes" our first form, that is, ~fform is replaced by the first argument. Then ~@ is used to inject the remaining forms; the @ splices them in, basically removing one set of () from the following expression. Finally we refer again to the result of the first form with x#.

We can check the expansion of our macro with (macroexpand-1 '(do1 :x :y :z)).
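To see that do1 evaluates every form in order but returns the value of the first one, it helps to use forms with side effects. In this sketch the macro is repeated so that the snippet is self-contained; the atom events and the var result are names chosen only for the illustration.

```clojure
(defmacro do1
  [fform & rforms]
  `(let [x# ~fform]
     ~@rforms
     x#))

;; Records the order in which the sub-forms are evaluated.
(def events (atom []))

(def result
  (do1
    (do (swap! events conj :first) 1)
    (swap! events conj :second)
    (swap! events conj :third)))

result    ;; ⇒ 1
@events   ;; ⇒ [:first :second :third]
```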

Libraries

The lib package from clojure.contrib is now integrated into Clojure. It is easy to define libraries that can be loaded by other scripts. Suppose we have an awesome add1 function which we want to provide to other developers. So what do we need? First we settle on a namespace, e.g. example.ourlib. Now we have to create a file in the classpath with the filename "example/ourlib.clj". The contents are pretty straightforward.

(ns example.ourlib)

(defn add1
  [x]
  (+ x 1))

All we have to do now is to use the functionality of ns. Suppose we have another file, where we want to use our function. ns lets us specify our requirements in a lot of ways. The simplest is :require.

(ns example.otherns
  (:require example.ourlib))

(defn check-size
  [x]
  (if (too-small x)   ; too-small is assumed to be defined elsewhere
    (example.ourlib/add1 x)
    x))

But what if we need the add1 function several times? We would always have to type the namespace in front. We could add a (refer 'example.ourlib), but there is an easier way: just use :use instead of :require! :use loads the library as :require does and immediately refers to the namespace.

So now we have already two small libraries which are maybe used in a third program.

(ns example.thirdns
  (:require example.ourlib)
  (:require example.otherns))

Again we can save some typing here. Similar to import we can factor out the common prefix of our libraries' namespaces.

(ns example.thirdns
  (:require (example ourlib otherns)))

Of course ourlib contains 738 more functions, not only the one shown above. We don't really want :use here, because bringing in so many names risks conflicts, but we also don't want to type the full namespace all the time. So the first thing we do is employ an alias. But wait! You guessed it: ns helps us again.

(ns example.otherns
  (:require (example [ourlib :as ol])))

The :as takes care of the aliasing and now we can refer to our add1 function as ol/add1!

Up to now it is already quite nice. But if we think a bit about our source code organization, we might end up with the insight that 739 functions in one single file is maybe not the best idea to keep around. So we decide to do some refactoring. We create a file "example/ourlib/add1.clj" and put our functions there. We don't want the user to have to load many files instead of one, so we modify the "example/ourlib.clj" file to load any additional files as follows.

(ns example.ourlib
  (:load "ourlib/add1"
         "ourlib/otherfunc"
         "ourlib/morefuncs"))

So the user still loads the "public" example.ourlib lib, which takes care of loading the rest. (The :load implementation includes code to provide the ".clj" suffix for the files being loaded.)

For more information see the docstring of require - (doc require).
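The same ns options can be tried out with libraries that ship with Clojure, without creating any files first. The sketch below uses clojure.string and clojure.set in place of the example libraries; the namespace name example.demo is hypothetical, and the :refer option shown here (an alternative to :use that names exactly the symbols to bring in) is not covered in the text above.

```clojure
(ns example.demo
  (:require [clojure.string :as str]          ; alias, as with :as ol above
            [clojure.set :refer [union]]))    ; refer to a single name only

(str/upper-case "add1")
;; ⇒ "ADD1"

(= (union #{1 2} #{2 3}) #{1 2 3})
;; ⇒ true
```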

References

[1] http://github.com/relevance/labrepl
[2] range on ClojureDocs