Introduction to newLISP/Lists

Lists

edit

Lists are used everywhere in newLISP - LISP stands for list processing - so it's not surprising that there are many useful functions for working with lists. It's quite hard to organize them all into one logical descriptive narrative, but here's an introduction to most of them.

A good thing about newLISP is that many of the functions that work on lists also work on strings, so you'll meet many of these again in the next chapter, where they'll be applied to strings.

Building lists

edit

You can build a list and assign it to a symbol directly. Quote the list to stop it being evaluated immediately:

(set 'vowels '("a" "e" "i" "o" "u"))
;-> ("a" "e" "i" "o" "u") ; an unevaluated list

Lists are often created for you by other functions, but you can build your own too, using one of these functions:

  • list makes a new list from expressions
  • append glues lists together to form a new list
  • cons prepends an element to a list or makes a list
  • push inserts a new member in a list
  • dup duplicates an element

list, append, and cons

edit

Use list to build a list from a sequence of expressions:

(set 'rhyme (list 1 2 "buckle my shoe" 
    '(3 4) "knock" "on" "the" "door"))
; rhyme is now a list with 8 items"
;-> (1 2 "buckle my shoe" '(3 4) "knock" "on" "the" "door")

Notice that the (3 4) element is itself a list, which is nested inside the main list.

cons accepts exactly two expressions, and can do two jobs: insert the first element at the start of an existing list, or build a new two element list. In both cases it returns a new list. newLISP automatically chooses which job to do depending on whether the second element is a list or not.

(cons 1 2)                           ; makes a new list
;-> (1 2)

(cons 1 '(2 3))                      ; inserts an element at the start
;-> (1 2 3)

To glue two or more lists together, use append:

(set 'odd '(1 3 5 7) 'even '(2 4 6 8))
(append odd even)
;-> (1 3 5 7 2 4 6 8)

Notice the difference between list and append when you join two lists:

(set 'a '(a b c) 'b '(1 2 3))

(list a b)
;-> ((a b c) (1 2 3))                   ; list makes a list of lists

(append a b)
;-> (a b c 1 2 3)                       ; append makes a list

list preserves the source lists when making the new list, whereas append uses the elements of each source list to make a new list.

To remember this: List keeps the List-ness of the source lists, but aPPend Picks the elements out and Packs them all together again.

append can also assemble a bunch of strings into a new string.

push: pushing items onto lists

edit

push is a powerful command that you can use to make a new list or insert an element at any location in an existing list. Pushing an element at the front of the list moves everything one place to the right, whereas pushing an element at the end of the list just attaches it and makes a new last element. You can also insert an element anywhere in the middle of a list.

Despite its constructive nature, it's technically a destructive function, because it changes the target list permanently, so use it carefully. It returns the value of the element that was inserted, rather than the whole list.

(set 'vowels '("e" "i" "o" "u"))
(push (char 97) vowels)
; returns "a"
; vowels is now ("a" "e" "i" "o" "u")

When you refer to the location of an element in a list, you use zero-based numbering, which you would expect if you're an experienced programmer:

 

If you don't specify a location or index, push pushes the new element at the front. Use a third expression to specify the location or index for the new element. -1 means the last element of the list, 1 means the second element of the list counting from the front from 0, and so on:

(set 'vowels '("a" "e" "i" "o"))
(push "u" vowels -1)
;-> "u"
; vowels is now ("a" "e" "i" "o" "u")

(set 'evens '(2 6 10))
(push 8 evens -2)                       ; goes before the 10
(push 4 evens 1)                        ; goes after the 2

; evens is now (2 4 6 8 10)

If the symbol you supply as a list doesn't exist, push usefully creates it for you, so you don't have to declare it first.

(for (c 1 10)
 (push c number-list -1)                ; doesn't fail first time!
 (println number-list))
(1)
(1 2)
(1 2 3)
(1 2 3 4)
(1 2 3 4 5)
(1 2 3 4 5 6)
(1 2 3 4 5 6 7)
(1 2 3 4 5 6 7 8)
(1 2 3 4 5 6 7 8 9)
(1 2 3 4 5 6 7 8 9 10)


By the way, there are plenty of other ways of generating a list of unsorted numbers. You could also do a number of random swaps like this:

(set 'l (sequence 0 99))
(dotimes (n 100)
  (swap (l (rand 100)) (l (rand 100)))))

although it would be even easier to use randomize:

(randomize (sequence 1 99))
;-> (54 38 91 18 76 71 19 30 ...

(That's one of the nice things about newLISP - a more elegant solution is just a re-write away!)

push has an opposite, pop, which destructively removes an element from a list, returning the removed element. We'll meet pop and other list surgery functions later. See List surgery.

These two functions, like many other newLISP functions, work on strings as well as lists. See push and pop work on strings too.

dup: building lists of duplicate elements

edit

A useful function called dup lets you construct lists quickly by repeating elements a given number of times:

(dup 1 6)             ; duplicate 1 six times
;-> (1 1 1 1 1 1)

(dup '(1 2 3) 6)
;-> ((1 2 3) (1 2 3) (1 2 3) (1 2 3) (1 2 3) (1 2 3))

(dup x 6)
;-> (x x x x x x)

There's a trick to get dup to return a list of strings. Because dup can also be used to duplicate strings into a single longer string, you supply an extra true value at the end of the list, and newLISP creates a list of strings rather than a string of strings.

(dup "x" 6)           ; a string of strings
;-> "xxxxxx"

(dup "x" 6 true)      ; a list of strings
;-> ("x" "x" "x" "x" "x" "x")

Working with whole lists

edit

Once you've got a list, you can start to work on it. First, let's look at the functions that operate on the list as a unit. After that, I'll look at the functions that let you carry out list surgery - operations on individual list elements.

Using and processing lists

edit

dolist works through every item of a list:

(set 'vowels '("a" "e" "i" "o" "u"))
(dolist (v vowels)
  (println (apply upper-case (list v))))
A
E
I
O
U


In this example, apply expects a function and a list, and uses the elements of that list as arguments to the function. So it repeatedly applies the upper-case function to the loop variable's value in v. Since upper-case works on strings but apply expects a list, I've had to use the list function to convert the current value of v (a string) in each iteration to a list.

A better way to do this is to use map:

(map upper-case '("a" "e" "i" "o" "u"))
;-> ("A" "E" "I" "O" "U")

map applies the named function, upper-case in this example, to each item of the list in turn. The advantage of map is that it both traverses the list and applies the function to each item of the list in one pass. The result is a list, too, which might be more useful for subsequent processing.

There's more about dolist and apply elsewhere (see Working through a list, and Apply and map: applying functions to lists).

reverse

edit

reverse does what you would expect, and reverses the list. It's a destructive function, changing the list permanently.

(reverse '("A" "E" "I" "O" "U"))
;-> ("U" "O" "I" "E" "A")

sort and randomize

edit

In a way, randomize and sort are complementary, although sort changes the original list, whereas randomize returns a disordered copy of the original list. sort arranges the elements in the list into ascending order, organizing them by type as well as by value.

Here's an example: create a list of letters and a list of numbers, stick them together, shuffle the result, and then sort it again:

(for (c (char "a") (char "z"))
 (push (char c) alphabet -1))

(for (i 1 26) 
 (push i numbers -1))

(set 'data (append alphabet numbers))

;-> ("a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p"
; "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" 1 2 3 4 5 6 7 8 9 10 11 12
; 13 14 15 16 17 18 19 20 21 22 23 24 25 26)

(randomize data)

;-> ("l" "r" "f" "k" 17 10 "u" "e" 6 "j" 11 15 "s" 2 22 "d" "q" "b" 
; "m" 19 3 5 23 "v" "c" "w" 24 13 21 "a" 4 20 "i" "p" "n" "y" 14 "g" 
; 25 1 8 18 12 "o" "x" "t" 7 16 "z" 9 "h" 26)

(sort data)

;-> (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
; 25 26 "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
; "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z")

Compare data before it was randomized and after it was sorted. The sort command sorts the list by data type as well as by value: integers before strings, strings before lists, and so on.

The default sorting method is <, which arranges thet values so that each is less than the next.

To change the sort method, you can supply one of newLISP's built-in comparison functions, such as >. Adjacent objects are considered to be correctly sorted when the comparison function is true for each adjacent pair:

(for (c (char "a") (char "z")) 
  (push (char c) alphabet -1)) 

alphabet
;-> ("a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" 
; "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z")

(sort alphabet >)
;-> ("z" "y" "x" "w" "v" "u" "t" "s" "r" "q" "p" "o" "n" 
; "m" "l" "k" "j" "i" "h" "g" "f" "e" "d" "c" "b" "a")

You can supply a custom sort function. This is a function that takes two arguments and returns true if they're in the right order (ie if the first should come before the second), and false if they aren't. For example, suppose you want to sort a list of file names so that the shortest name appears first. Define a function that returns true if the first argument is shorter than the second, and then use sort with this custom sorting function:

(define (shorter? a b)        ; two arguments, a and b 
 (< (length a) (length b)))

(sort (directory) shorter?)
;->
("." ".." "var" "usr" "tmp" "etc" "dev" "bin" "sbin" "mach" ".vol" 
"Users" "cores" "System" "Volumes" "private" "Network" "Library" 
"mach.sym" ".Trashes" "Developer" "automount" ".DS_Store" 
"Desktop DF" "Desktop DB" "mach_kernel" "Applications" "System Folder" ...)


Advanced newLISPers often write a nameless function and supply it directly to the sort command:

(sort (directory) (fn (a b) (< (length a) (length b))))

This does the same job, but saves 25 characters or so. You can use either fn or lambda to define an inline or anonymous function.

unique

edit

unique returns a copy of a list with all duplicates removed:

(set 'data '( 1 1 2 2 2 2 2 2 2 3 2 4 4 4 4))
(unique data)
;-> (1 2 3 4)

There are also some useful functions for comparing lists. See Working with two or more lists.

flat

edit

flat is useful for dealing with nested lists, because it shows what they look like without a complicated hierarchical structure:

(set 'data '(0 (0 1 2) 1 (0 1) 0 1 (0 1 2) ((0 1) 0))) 
(length data)
;-> 8
(length (flat data))
;-> 15
(flat data)
;-> (0 0 1 2 1 0 1 0 1 0 1 2 0 1 0)

Fortunately, flat is non-destructive, so you can use it without worrying about losing the structure of nested lists:

data
;-> (0 (0 1 2) 1 (0 1) 0 1 (0 1 2) ((0 1) 0)) ; still nested

transpose

edit

transpose is designed to work with matrices (a special type of list: see Matrices). It also does something useful with ordinary nested lists. If you imagine a list of lists as a table, it flips rows and columns for you:

(set 'a-list 
 '(("a" 1) 
   ("b" 2) 
   ("c" 3)))

(transpose a-list)

;->
(("a" "b" "c")
 ( 1 2 3))

(set 'table 
'((A1 B1 C1 D1 E1 F1 G1 H1)
  (A2 B2 C2 D2 E2 F2 G2 H2)
  (A3 B3 C3 D3 E3 F3 G3 H3)))

(transpose table)

;->
((A1 A2 A3)
 (B1 B2 B3)
 (C1 C2 C3)
 (D1 D2 D3)
 (E1 E2 E3)
 (F1 F2 F3) 
 (G1 G2 G3) 
 (H1 H2 H3))

And here's a nice bit of newLISP sleight of hand:

(set 'table '((A 1) (B 2) (C 3) (D 4) (E 5)))
;-> ((A 1) (B 2) (C 3) (D 4) (E 5))

(set 'table (transpose (rotate (transpose table))))
;-> ((1 A) (2 B) (3 C) (4 D) (5 E))

and each sublist has been reversed. You could, of course, do this:

(set 'table (map (fn (i) (rotate i)) table))

which is shorter, but a bit slower.

explode

edit

The explode function lets you blow up a list:

(explode (sequence 1 10))
;-> ((1) (2) (3) (4) (5) (6) (7) (8) (9) (10))

You can specify the size of the pieces, too:

(explode (sequence 1 10) 2)
;-> ((1 2) (3 4) (5 6) (7 8) (9 10))

(explode (sequence 1 10) 3)
;-> ((1 2 3) (4 5 6) (7 8 9) (10))

(explode (sequence 1 10) 4)
;-> ((1 2 3 4) (5 6 7 8) (9 10))

List analysis: testing and searching

edit

Often you don't know what's in a list, and you want some forensic tools to find out more about what's inside it. newLISP has a good selection.

We've already met length, which finds the number of elements in a list.

The starts-with and ends-with functions test the start and ends of lists:

(for (c (char "a") (char "z")) 
 (push (char c) alphabet -1))

;-> alphabet is ("a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"
; "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z")

(starts-with alphabet "a")             ; list starts with item "a"?
;-> true

(starts-with (join alphabet) "abc")    ; convert to string and test
;-> true

(ends-with alphabet "z")               ; testing the list version
;-> true

These functions work equally well with strings, too (and they take regular expressions). See Testing and comparing strings.

How about contains? In fact there isn't one single newLISP function that does this job. Instead you have find, match, member, ref, filter, index, and count, among others. Which one you use depends on what sort of answer you want to the question Does this list contain this item?, and whether the list is a nested list or a flat one.

If you want a simple answer with a quick top-level-only search, use find. See find.

If you want the item and the rest of the list as well, use member. See member.

If you want the index number of the first occurrence, even if the list contains nested lists, you can use ref. See ref and ref-all.

If you want a new list containing all the elements that match your search element, use find-all. See find-all.

If you want to know whether or not the list contains a certain pattern of elements, use match. See matching patterns in lists.

You can find all list items that satisfy a function, either built-in or one of your own, with the filter, clean, and index functions. See Filtering lists: filter, clean, and index.

The exists and for-all functions check elements in a list to see if they pass a test.

If you want to find elements in a list purely to change them to something else, then don't bother to look for them first, just use replace. See Replacing information: replace. You can also find and replace list elements using the set-ref function. See Find and replace matching elements.

If you want to know how many occurrences there are of items in the list, use count. See Working with two or more lists.

Let's look at some examples of each of these.

find

edit

find looks through a list for an expression and returns an integer or nil. The integer is the index of the first occurrence of the search term in the list. It's possible for find to return 0 - if the list starts with the item, its index number is 0, but this isn't a problem - you can use this function in an if test, because 0 evaluates to true.

(set 'sign-of-four 
 (parse (read-file "/Users/me/Sherlock-Holmes/sign-of-four.txt")
 {\W} 0))

(if  (find "Moriarty" sign-of-four)        ; Moriarty anywhere?
  (println "Moriarty is mentioned")
  (println "No mention of Moriarty"))
No mention of Moriarty 


(if (find "Lestrade" sign-of-four)
  (println "Lestrade is mentioned") 
  (println "No mention of Lestrade"))
Lestrade is mentioned.
(find "Watson" sign-of-four)
;-> 477

Here I've parsed the text of Sir Arthur Conan Doyle's The Sign of Four (which you can download from Project Gutenberg), and tested whether the resulting list of strings contains various names. The integer returned is the first occurrence of the string element in the list.

find lets you use regular expressions, too, so you can find any string elements in a list that match a string pattern:

(set 'loc (find "(tea|cocaine|morphine|tobacco)" sign-of-four 0)) 
(if loc
  (println "The drug " (sign-of-four loc) " is mentioned.")
  (println "No trace of drugs"))
The drug cocaine is mentioned.

Here I'm looking for any traces of chemical indulgence in Sherlock Holmes' bohemian way of life: "(tea|cocaine|morphine|tobacco)" means any one of tea, cocaine, morphine, or tobacco.

This form of find lets you look for regular expression patterns in the string elements of a list. You'll meet regular expressions again when we explore strings. See Regular expressions.

(set 'word-list '("being" "believe" "ceiling" "conceit" "conceive" 
"deceive" "financier" "foreign" "neither" "receive" "science" 
"sufficient" "their" "vein" "weird"))

(find {(c)(ie)(?# i before e except after c...)} word-list 0)
;-> 6                                   ; the first one is "financier"

Here we're looking for any string elements in the word-list that match our pattern (a c followed by ie, the old and inaccurate spelling rule i before e except after c).

The regular expression pattern here (enclosed in braces, which are string delimiters, doing more or less the same job as quotation marks) is (c) followed by (ie). Then there's a comment, starting with (?#. Comments in regular expressions are useful when things get cryptic, as they often do.

find can also accept a comparison function. See Searching lists.

find finds only the first occurrence in a list. To find all occurrences, you could keep repeating find until it returns nil. The word list gets shorter each time through, and the found elements are added at the end of another list:

(set 'word-list '("scientist" "being" "believe" "ceiling" "conceit" 
"conceive" "deceive" "financier" "foreign" "neither" "receive" "science" 
"sufficient" "their" "vein" "weird"))

(while (set 'temp 
 (find {(c)(ie)(?# i before e except after c...)} word-list 0))
   (push (word-list temp) results -1)
   (set 'word-list ((+ temp 1) word-list)))

results
;-> ("scientist" "financier" "science" "sufficient")

But in this case it's much easier to use filter:

(filter (fn (w) (find {(c)(ie)} w 0)) word-list)
;-> ("scientist" "financier" "science" "sufficient")

- see Filtering lists: filter, clean, and index.

Alternatively, you can use ref-all (see ref and ref-all) to get a list of indices.

If you're not using regular expressions, you can use count, which, given two lists, looks through the second list and counts how many times each item in the first list occurs. Let's see how many times the main characters' names are mentioned:

(count '("Sherlock" "Holmes" "Watson" "Lestrade" "Moriarty" "Moran")
 sign-of-four)
;-> (34 135 24 1 0 0)

The list of results produced by count shows how many times each element in the first list occurs in the second list, so there are 34 mentions of Sherlock, 135 mentions of Holmes, 24 of Watson, and only one mention for poor old Inspector Lestrade in this story.

It's worth knowing that find examines the list only superficially. For example, if the list contains nested lists, you should use ref rather than find, because ref looks inside sublists:

(set 'maze 
 '((1 2)
   (1 2 3)
   (1 2 3 4)))

(find 4 maze)
;-> nil                           ; didn't look inside the lists

(ref 4 maze)
;-> (2 3)                         ; element 3 of element 2

member

edit

The member function returns the rest of the source list rather than index numbers or counts:

(set 's (sequence 1 100 7))             ; 7 times table?
;-> (1 8 15 22 29 36 43 50 57 64 71 78 85 92 99)

(member 78 s)
;-> (78 85 92 99)

matching patterns in lists

edit

There's a powerful and complicated function called match, which looks for patterns in lists. It accepts the wild card characters *, ?, and +, which you use to define a pattern of elements. + means one or more elements, * means zero or more elements, and ? means one element. For example, suppose you want to examine a list of random digits between 0 and 9 for patterns. First, generate a list of 10000 random numbers as source data:

(dotimes (c 10000) (push (rand 10) data))
;-> (7 9 3 8 0 2 4 8 3 ...)

Next, you decide that you want to find the following pattern:

 1 2 3

somewhere in the list, ie anything followed by 1 followed by a 2 followed by a 3 followed by anything. Call match like this:

(match '(* 1 2 3 *) data)

which looks odd, but it's just a pattern specification in a list, followed by the source data. The list pattern:

(* 1 2 3 *)

means any sequence of atoms or expressions (or nothing), followed by a 1, then a 2, then a 3, followed by any number of atoms or expressions or nothing. The answer returned by this match function is another list, consisting of two sublists, one corresponding to the first *, the other corresponding to the second:

((7 9 3 8  . . .  0 4 5)  (7 2 4 1 . . . 3 5 5 5))

and the pattern you were looking for first occurred in the gap between these lists (in fact it occurs half a dozen times later on in the list as well). match can also handle nested lists.

To find all occurrences of a pattern, not just the first, you can use match in a while loop. For example, to find and remove every 0 when it's followed by another 0, repeat the match on a new version of the list until it stops returning a non-nil value:

(set 'number-list '(2 4 0 0 4 5 4 0 3 6 2 3 0 0 2 0 0 3 3 4 2 0 0 2))

(while (set 'temp-list (match '(* 0 0 *) number-list))
  (println temp-list)
  (set 'number-list (apply append temp-list)))
((2 4) (4 5 4 0 3 6 2 3 0 0 2 0 0 3 3 4 2 0 0 2))
((2 4 4 5 4 0 3 6 2 3) (2 0 0 3 3 4 2 0 0 2))
((2 4 4 5 4 0 3 6 2 3 2) (3 3 4 2 0 0 2))
((2 4 4 5 4 0 3 6 2 3 2 3 3 4 2) (2))
> number-list
;-> (2 4 4 5 4 0 3 6 2 3 2 3 3 4 2 2)


You don't have to find elements first before replacing them: just use replace, which does the finding and replacing in one operation. And you can use match as a comparison function for searching lists. See Replacing information: replace, and Searching lists.

find-all

edit

find-all is a powerful function with a number of different forms, suitable for searching lists, association lists, and strings. For list searches, you supply four arguments: the search key, the list, an action expression, and the functor, which is the comparison function you want to use for matching the search key:

(set 'food '("bread" "cheese" "onion" "pickle" "lettuce"))
(find-all "onion" food (print $0 { }) >)
;-> bread cheese lettuce

Here, find-all is looking for the string "onion" in the list food. It's using the > function as a comparison function, so it will find anything that "onion" is greater than. For strings, 'greater than' means appearing later in the default ASCII sorting order, so that "cheese" is greater than "bread" but less than "onion". Notice that, unlike other functions that let you provide comparison functions (namely find, ref, ref-all, replace when used with lists, set-ref, set-ref-all, and sort), the comparison function must be supplied. With the < function, the result is a list of things that "onion" is less than:

(find-all "onion" food (print $0 { }) <)
;-> pickle

ref and ref-all

edit

The ref function returns the index of the first occurrence of an element in a list. It's particularly suited for use with nested lists, because, unlike find, it looks inside all the sublists, and returns the address of the first appearance of an element. As an example, suppose you've converted an XML file, such as your iTunes library, into a large nested list, using newLISP's built-in XML parser:

(xml-type-tags nil nil nil nil)         ; controls XML parsing
(set 'itunes-data 
 (xml-parse
   (read-file "/Users/me/Music/iTunes/iTunes Music Library.xml") 
   (+ 1 2 4 8 16)))

Now you can look for any expression in the data, which is in the form of an ordinary newLISP list:

(ref "Brian Eno" itunes-data)

and the returned list will be the location of the first occurrence of that string in the list:

(0 2 14 528 6 1)

- this is a set of index numbers which together define a kind of address. This example means: in list element 0, look for sublist element 2, then find sublist element 14 of that sublist, and so on, drilling down into the highly-nested XML-based data structure. See Working with XML.

ref-all does a similar job, and returns a list of addresses:

(ref-all "Brian Eno" itunes-data)
;-> ((0 2 14 528 6 1) (0 2 16 3186 6 1) (0 2 16 3226 6 1))

These functions can also accept a comparison function. See Searching lists.

Use these functions when you're searching for something in a nested list. If you want to replace it when you've found it, use the set-ref and set-ref-all functions. See Find and replace matching elements.

Filtering lists: filter, clean, and index

edit

Another way of finding things in lists is to filter the list. Like panning for gold, you can create a filter that keeps only the stuff you want, flushing the unwanted stuff away.

The functions filter and index have the same syntax, but filter returns the list elements, whereas index returns the index numbers (indices) of the wanted elements rather than the list elements themselves. (These functions don't work on nested lists.)

The filtering functions filter, clean, and index use another function for testing elements: the element appears in the results list according to whether it passes the test or not. You can either use a built-in function or define your own. Typically, newLISP functions that tests and return true or false (sometimes called predicate functions) have names ending with question marks:

NaN? array? atom? context? directory? empty? file? float? global? integer? lambda? legal? list? macro? nil? null? number? primitive? protected? quote? string? symbol? true? zero?

So, for example, an easy way to find integers in (and remove floating-point numbers from) a list is to use the integer? function with filter. Only integers pass through this filter:

(set 'data '(0 1 2 3 4.01 5 6 7 8 9.1 10))
(filter integer? data)
;-> (0 1 2 3 5 6 7 8 10)

filter has a complementary function called clean which removes elements that satisfy the test:

(set 'data '(0 1 2 3 4.01 5 6 7 8 9.1 10))
(clean integer? data)
;-> (4.01 9.1)

Think of clean as getting rid of dirt - it gets rid of anything that passes the test. Think of filter as panning for gold, keeping what passes the test.

This next filter finds all words in Conan Doyle's story The Empty House that contain the letters pp. The filter is a lambda expression (a temporary function without a name) that returns nil if the element doesn't contain pp. The list is a list of string elements generated by parse, which breaks up a string into a list of smaller strings according to a pattern.

(set 'empty-house-text 
 (parse 
   (read-file "/Users/me/Sherlock-Holmes/the-empty-house.txt")
   {,\s*|\s+} 0))

(filter (fn (s) (find "pp" s)) empty-house-text)

;->
("suppressed" "supply" "disappearance" "appealed" "appealed"
"supplemented" "appeared" "opposite" "Apparently" "Suppose"
"disappear" "happy" "appears" "gripped" "reappearance."
"gripped" "opposite" "slipped" "disappeared" "slipped"
"slipped" "unhappy" "appealed" "opportunities." "stopped"
"stepped" "opposite" "dropped" "appeared" "tapped"
"approached" "suppressed" "appeared" "snapped" "dropped"
"stepped" "dropped" "supposition" "opportunity" "appear"
"happy" "deal-topped" "slipper" "supplied" "appealing"
"appear")

You can also use filter or clean for tidying up lists before using them - removing empty strings produced by a parse operation, for example.

When would you use index rather than filter or clean? Well, use index when you later want to access the list elements by index number rather than their values: we'll meet functions for selecting list items by index in the next section. For example, whereas ref found the index of only the first occurrence, you could use index to return the index numbers of every occurrence of an element.

If you have a predicate function that looks for a string in which the letter c is followed by ie, you can use that function to search a list of matching strings:

(set 'word-list '("agencies" "being" "believe" "ceiling"
"conceit" "conceive" "deceive" "financier" "foreign"
"neither" "receive" "science" "sufficient" "their" "vein"
"weird"))

(define (i-before-e-after-c? wd)        ; a predicate function
 (find {(c)(ie)(?# i before e after c...)} wd 0))

(index i-before-e-after-c? word-list)
;-> (0 7 11 12)                         
; agencies, financier, science, sufficient

Remember that lists can contain nested lists, and that some functions won't look inside the sub-lists:

(set 'maze 
 '((1 2.1)
   (1 2 3)
   (1 2 3 4)))

(filter integer? maze)
;-> ()                                  ; I was sure it had integers...

(filter list? maze)                     
;-> ((1 2.1) (1 2 3) (1 2 3 4))         ; ah yes, they're sublists!

(filter integer? (flat maze))           
;-> (1 1 2 3 1 2 3 4)                   ; one way to do it...

Testing lists

edit

The exists and for-all functions check elements in a list to see if they pass a test.

exists returns either the first element in the list that passes the test, or nil if none of them do.

(exists string? '(1 2 3 4 5 6 "hello" 7))
;-> "hello"

(exists string? '(1 2 3 4 5 6 7))
;-> nil

for-all returns either true or nil. If every list element passes the test, it returns true.

(for-all number? '(1 2 3 4 5 6 7))
;-> true

(for-all number? '("zero" 2 3 4 5 6 7))
;-> nil

Searching lists

edit

find, ref, ref-all and replace look for items in lists. Usually, you use these functions to find items that equal what you're looking for. However, equality is just the default test: all these functions can accept an optional comparison function that's used instead of a test for equality. This means that you can look for list elements that satisfy any test.

The following example uses the < comparison function. find looks for the first element that compares favourably with n, ie the first element that n is less than. With a value of 1002, the first element that satisfies the test is 1003, the 3rd element of the list, and so the returned value is 3.

(set 's (sequence 1000 1020))
;-> (1000 1001 1002 1003 1004 1005 1006 1007 1008 100
; 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020)

(set 'n 1002)
; find the first something that n is less than:
(find n s <)

;-> 3, the index of 1003 in s

You can write your own comparison function:

(set 'a-list 
   '("elephant" "antelope" "giraffe" "dog" "cat" "lion" "shark" ))

(define (longer? x y)
  (> (length x) (length y)))

(find "tiger" a-list longer?)
;-> 3 ; "tiger" is longer than element 3, "dog"

The longer? function returns true if the first argument is longer than the second. So find, with this function as the comparison, finds the first element in the list that makes the comparison true. Because tiger is longer than dog, the function returns 3, the index of dog in the list.

You could supply an anonymous (or lambda) function as part of the find function, rather than write a separate function:

(find "tiger" a-list (fn (x y) (> (length x) (length y))))

If you want your code to be readable, you'll probably move longer or more complex comparators out into their own separate - and documented - functions.

You can also use comparison functions with ref, ref-all, and replace.

A comparison function can be any function that takes two values and returns true or false. For example, here's a function that returns true if y is greater than 6 and less than x. The search is therefore for an element of the data list that is both smaller than the searched-for number, which in this case is 15, and yet bigger than 6.

(set 'data '(31 23 -63 53 8 -6 -16 71 -124 29))

(define (my-func x y)
  (and (> x y) (> y 6)))

(find 15 data my-func)
;-> 4 ; there's an 8 at index location 4

Summary: compare and contrast

edit

To summarize these contains functions, here they are in action:

(set 'data 
   '("this" "is" "a" "list" "of" "strings" "not" "of" "integers"))

(find "of" data)                        ; equality is default test
;-> 4                                   ; index of first occurrence

(ref "of" data)                         ; where is "of"?
;-> (4)                                 ; returns a list of indexes

(ref-all "of" data)                     
;-> ((4) (7))                           ; list of address lists

(filter (fn (x) (= "of" x)) data)       ; keep every of
;-> ("of" "of")                         

(index (fn (x) (= "of" x)) data)        ; indexes of the of's
;-> (4 7)                               

(match (* "of" * "of" *) data)          ; three lists between the of's
;-> (("this" "is" "a" "list") ("strings" "not") ("integers"))

(member "of" data)                      ; and the rest
;-> ("of" "strings" "not" "of" "integers")

(count (list "of") data)                ; remember to use two lists
;-> (2)                                 ; returns list of counts

Selecting items from lists

edit

There are various functions for getting at the information stored in a list:

  • first gets the first element
  • rest gets all but the first element
  • last returns the last element
  • nth gets the nth element
  • select selects certain elements by index
  • slice extracts a sublist

The first and rest functions are more sensible names for the traditional car and cdr LISP functions, which were based on the names of old computer hardware registers.

Picking elements: nth, select, and slice

edit

nth gets the nth element of a list:

(set 'phrase '("the" "quick" "brown" "fox" "jumped" "over" "the" "lazy" "dog"))

(nth 1 phrase)
;-> "quick"

nth can also look inside nested lists, because it accepts more than one index number:

(set 'zoo
 '(("ape" 3)
   ("bat" 47)
   ("lion" 4)))
(nth '(2 1) zoo)                         ; item 2, then subitem 1
;-> 4

If you want to pick a group of elements out of a list, you'll find select useful. You can use it in two different forms. The first form lets you supply a sequence of loose index numbers:

(set 'phrase '("the" "quick" "brown" "fox" "jumped" "over" "the" "lazy" "dog"))

(select phrase 0 -2 3 4 -4 6 1 -1)
;-> ("the" "lazy" "fox" "jumped" "over" "the" "quick" "dog")

A positive number selects an element by counting forward from the beginning, and a negative number selects by counting backwards from the end:

   0      1       2      3       4      5      6      7     8
("the" "quick" "brown" "fox" "jumped" "over" "the" "lazy" "dog")
  -9     -8      -7     -6      -5     -4     -3     -2    -1


You can also supply a list of index numbers to select. For example, you can use the rand function to generate a list of 20 random numbers between 0 and 8, and then use this list to select elements from phrase at random:

(select phrase (rand 9 20))
;-> ("jumped" "lazy" "over" "brown" "jumped" "dog" "the" "dog" "dog"
; "quick" "the" "dog" "the" "dog" "the" "brown" "lazy" "lazy" "lazy" "quick")

Notice the duplicates. If you had written this instead:

(randomize phrase)

there would be no duplicates: (randomize phrase) shuffles elements without duplicating them.

slice lets you extract sections of a list. Supply it with the list, followed by one or two numbers. The first number is the start location. If you miss out the second number, the rest of the list is returned. The second number, if positive, is the number of elements to return.

(slice (explode "schwarzwalderkirschtorte") 7)
;-> ("w" "a" "l" "d" "e" "r" "k" "i" "r" "s" "c" "h" "t" "o" "r" "t" "e")

(slice (explode "schwarzwalderkirschtorte") 7 6)
;-> ("w" "a" "l" "d" "e" "r")

If negative, the second number specifies an element at the other end of the slice counting backwards from the end of the list, -1 being the final element:

(slice (explode "schwarzwalderkirschtorte") 19 -1)
;-> ("t" "o" "r" "t")

The cake knife reaches as far as - but doesn't include - the element you specify.

Implicit addressing

edit

newLISP provides a faster and more efficient way of selecting and slicing lists. Instead of using a function, you can use index numbers and lists together. This technique is called implicit addressing.

Select elements using implicit addressing

edit

As an alternative to using nth, put the list's symbol and an index number in a list, like this:

(set 'r '("the" "cat" "sat" "on" "the" "mat"))
(r 1)                                   ; element index 1 of r
;-> "cat"

(nth 1 r)                               ; the equivalent using nth
;-> "cat"

(r 0)
;-> "the"

(r -1)
;-> "mat"

If you have a nested list, you can supply a sequence of index numbers that identify the list in the hierarchy:

(set 'zoo 
 '(("ape" 3) 
   ("bat" 47) 
   ("lion" 4)))                            ; three sublists in a list

(zoo 2 1)
;-> 4
(nth '(2 1) zoo)                           ; the equivalent using nth
;-> 4

where the '(2 1) first finds element 2, ("lion" 4), then finds element 1 (the second one) of that sublist.

Selecting a slice using implicit addressing

edit

You can also use implicit addressing to get a slice of a list. This time, put one or two numbers to define the slice, before the list's symbol, inside a list:

(set 'alphabet '("a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" 
"l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"))

(13 alphabet)                 ; start at 13, get the rest
;-> ("n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z")

(slice alphabet 13)           ; equivalent using slice
;-> ("n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z")

(3 7 alphabet)                ; start at 3, get 7 elements
;-> ("d" "e" "f" "g" "h" "i" "j")

(slice alphabet 3 7)          ; equivalent using slice
;-> ("d" "e" "f" "g" "h" "i" "j")

Earlier, we parsed the iTunes XML library:

(xml-type-tags nil nil nil nil)
(silent 
 (set 'itunes-data
   (xml-parse 
    (read-file 
      "/Users/me/Music/iTunes/iTunes Music Library.xml")
   (+ 1 2 4 8 16))))

Let's access the inside of the resulting XML structure using the implicit addressing techniques:

(set 'eno (ref "Brian Eno" itunes-data))
;-> (0 2 14 528 6 1)                    ; address of Brian Eno

(0 4 eno)                               ; implicit slice
;-> (0 2 14 528)

(itunes-data (0 4 eno))
;->
(dict 
 (key "Track ID") 
 (int "305") 
 (key "Name") 
 (string "An Ending (Ascent)") 
 (key "Artist") 
 (string "Brian Eno") ; this was (0 2 14 528 6 1)
 (key "Album") 
 (string "Ambient Journeys") 
 (key "Genre") 
 (string "ambient, new age, electronica") 
 (key "Kind") 
 (string "Apple Lossless audio file") 
 (key "Size") 
 (int "21858166") 
; ... 
 )

How to remember the difference between the two types of implicit addressing? sLice numbers go in the Lead, sElect numbers go at the End.

List surgery

edit

Shortening lists

edit

To shorten a list, by removing elements from the front or back, use chop or pop. chop makes a copy and works from the end, pop changes the original and works from the front.

chop returns a new list by cutting the end off the list:

(set 'vowels '("a" "e" "i" "o" "u"))
(chop vowels)
;-> ("a" "e" "i" "o")

(println vowels)
("a" "e" "i" "o" "u")                   ; original unchanged

An optional third argument for chop specifies how many elements to remove:

(chop vowels 3)
;-> ("a" "e")

(println vowels)
("a" "e" "i" "o" "u") ; original unchanged

pop (the opposite of push) permanently removes the specified element from the list, and works with list indices rather than lengths:

(set 'vowels '("a" "e" "i" "o" "u"))

(pop vowels)                            ; defaults to 0-th element

(println vowels)
("e" "i" "o" "u")

(pop vowels -1)

(println vowels)
("e" "i" "o")

You can also use replace to remove items from lists.

Changing items in lists

edit

You can easily change elements in lists, using the following functions:

  • replace changes or removes elements
  • swap swaps two elements
  • setf sets the value of an element
  • set-ref searches a nested list and changes an element
  • set-ref-all searches for and changes every element in a nested list

These are destructive functions, just like push, pop, reverse, and sort, and change the original lists, so use them carefully.

Changing the nth element

edit

To set the nth element of a list (or array) to another value, use the versatile setf command:

(set 'data (sequence 100 110))
;-> (100 101 102 103 104 105 106 107 108 109 110)

(setf (data 5) 0)
;-> 0

data
;-> (100 101 102 103 104 0 106 107 108 109 110)

Notice how the setf function returns the value that has just been set, 0, rather than the changed list.

This example uses the faster implicit addressing. You could of course use nth to create a reference to the nth element first:

(set 'data (sequence 100 110))
;-> (100 101 102 103 104 105 106 107 108 109 110)

(setf (nth 5 data) 1)
;-> 1

data
;-> (100 101 102 103 104 1 106 107 108 109 110)

setf must be used on a list or array or element stored in a symbol. You can't pass raw data to it:

(setf (nth 5 (sequence 100 110)) 1)
;-> ERR: no symbol reference found

(setf (nth 5 (set 's (sequence 100 110))) 1)
; 'temporary' storage in symbol s
;-> 1
s
;-> (100 101 102 103 104 1 106 107 108 109 110)

Using it

edit

Sometimes when you use setf, you want to refer to the old value when setting the new value. To do this, use the system variable $it. During a setf expression, $it contains the old value. So to increase the value of the first element of a list by 1:

(set 'lst (sequence 0 9))
;-> (0 1 2 3 4 5 6 7 8 9)
(setf (lst 0) (+ $it 1))
;-> 1
lst
;-> (1 1 2 3 4 5 6 7 8 9)

You can do this with strings too. Here's how to 'increment' the first letter of a string:

(set 'str "cream")
;-> "cream"
(setf (str 0) (char (inc (char $it))))
;-> "d"
str
;-> "dream"

Replacing information: replace

edit

You can use replace to change or remove elements in lists. Specify the element to change and the list to search in, and also a replacement if there is one.

(set 'data (sequence 1 10)) 
(replace 5 data)                   ; no replacement specified
;-> (1 2 3 4 6 7 8 9 10)           ; the 5 has gone

(set 'data '(("a" 1) ("b" 2)))
(replace ("a" 1) data)             ; data is now (("b" 2))

Every matching item is deleted.

replace returns the changed list:

(set 'data (sequence 1 10)) 
(replace 5 data 0)                 ; replace 5 with 0
;-> (1 2 3 4 0 6 7 8 9 10)

The replacement can be a simple value, or any expression that returns a value.

(set 'data (sequence 1 10)) 
(replace 5 data (sequence 0 5))
;->(1 2 3 4 (0 1 2 3 4 5) 6 7 8 9 10)

replace updates a set of system variables $0, $1, $2, up to $15, and the special variable $it, with the matching data. For list replacements, only $0 and $it are used, and they hold the value of the found item, suitable for using in the replacement expression.

(replace 5 data (list (dup $0 2)))    ; $0 holds 5
;-> (1 2 3 4 ((5 5)) 6 7 8 9 10)

For more about system variables and their use with string replacements, see System variables.

If you don't supply a test function, = is used:

(set 'data (sequence 1 10)) 
(replace 5 data 0 =)
;-> (1 2 3 4 0 6 7 8 9 10)

(set 'data (sequence 1 10)) 
(replace 5 data 0)              ; = is assumed
;-> (1 2 3 4 0 6 7 8 9 10)

You can make replace find elements that pass a different test, other than equality. Supply the test function after the replacement value:

(set 'data (randomize (sequence 1 10)))
;-> (5 10 6 1 7 4 8 3 9 2)
(replace 5 data 0 <)  ; replace everything that 5 is less than
;-> (5 0 0 1 0 4 0 3 0 2)

The test can be any function that compares two values and returns a true or false value. This can be amazingly powerful. Suppose you have a list of names and their scores:

(set 'scores '(
   ("adrian" 234 27 342 23 0) 
   ("hermann" 92 0 239 47 134) 
   ("neville" 71 2 118 0) 
   ("eric" 10 14 58 12 )))

How easy is it to add up the numbers for all those people whose scores included a 0? Well, with the help of the match function, this easy:

(replace '(* 0 *) scores (list (first $0) (apply + (rest $0))) match)

(("adrian" 626) 
 ("hermann" 512) 
 ("neville" 191) 
 ("eric" 10 14 58 12))

Here, for each matching element, the replacement expression builds a list from the name and the sum of the scores. match is employed as a comparator function - only matching list elements are selected for totalization, so Eric's scores weren't totalled since he didn't manage to score 0.

See Changing substrings for more information about using replace on strings.

Modifying lists

edit

There are even more powerful ways of modifying the elements in lists. Meet set-ref and set-ref-all.

You can locate and modify elements using these functions, which are designed to work well with nested lists. (See also Working with XML for some applications.)

Find and replace matching elements

edit

The set-ref function lets you modify the first matching element in a list:

(set 'l '((aaa 100) (bbb 200)))
;-> ((aaa 100) (bbb 200))

To change that 200 to a 300, use set-ref like this:

(set-ref 200 l 300)               ; change the first 200 to 300
;-> ((aaa 100) (bbb 300))

Find and replace all matching elements: set-ref-all

edit

set-ref finds the first matching element in a nested list and changes it; set-ref-all can replace every matching element. Consider the following nested list that contains data on the planets:

 (("Mercury"
      (p-name "Mercury")
      (diameter 0.382)
      (mass 0.06)
      (radius 0.387)
      (period 0.241)
      (incline 7)
      (eccentricity 0.206)
      (rotation 58.6)
      (moons 0))
  ("Venus"
      (p-name "Venus")
      (diameter 0.949)
      (mass 0.82)
      (radius 0.72)
      (period 0.615)
      (incline 3.39)
      (eccentricity 0.0068)
      (rotation -243)
      (moons 0))
  ("Earth"
      (p-name "Earth")
      (diameter 1)
;      ...

How could you change every occurrence of that 'incline' symbol to be 'inclination'? It's easy using set-ref-all:

(set-ref-all 'incline planets 'inclination)   ; key - list - replacement

This returns the list with every 'incline changed to 'inclination.

As with replace, the default test for finding matching elements is equality. But you can supply a different comparison function. This is how you could examine the list of planets and change every entry where the moon's value is greater than 9 to say "lots" instead of the actual number.

(set-ref-all '(moons ?) planets (if (> (last $0) 9) "lots" (last $0)) match)

The replacement expression compares the number of moons (the last item of the result which is stored in $0) and evaluates to "lots" if it's greater than 9. The search term is formulated using match-friendly wildcard syntax, to match the choice of comparison function.

Swap

edit

The swap function can exchange two elements of a list, or the values of two symbols. This changes the original list:

(set 'fib '(1 2 1 3 5 8 13 21))
(swap (fib 1) (fib 2))                         ; list swap 
;-> (1 1 2 3 5 8 13 21)

fib
;-> (1 1 2 3 5 8 13 21)                 ; is 'destructive'

Usefully, swap can also swap the values of two symbols without you having to use an intermediate temporary variable.

(set 'x 1 'y 2)
(swap x y)
;-> 1
x
;-> 2
y
;-> 1

This parallel assignment can make life easier sometimes, such as in this slightly unusual iterative version of a function to find Fibonacci numbers:

(define (fibonacci n)
 (let (current 1 next 0)
  (dotimes (j n)
    (print current " ")
    (inc next current)
    (swap current next))))
  
(fibonacci 20)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765

Working with two or more lists

edit

If you have two lists, you might want to ask questions such as How many items are in both lists?, Which items are in only one of the lists?, How often do the items in this list occur in another list?, and so on. Here are some useful functions for answering these questions:

  • difference finds the set difference of two lists
  • intersect finds the intersection of two lists
  • count counts the number of times each element in one list occurs in a second list

For example, to see how many vowels there are in a sentence, put the known vowels in one list, and the sentence in another (first use explode to turn the sentence into a list of characters):

(count '("a" "e" "i" "o" "u") (explode "the quick brown fox jumped over the lazy dog"))
;-> (1 4 1 4 2)

or the slightly shorter:

(count (explode "aeiou") (explode "the quick brown fox jumped over the lazy dog"))
;-> (1 4 1 4 2)

The result, (1 4 1 4 2), means that there are 1 a, 4 e's, 1 i, 4 o's, and 2 u's in the sentence.

difference and intersect are functions that will remind you of those Venn diagrams you did at school (if you went to a school that taught them). In newLISP lists can represent sets.

difference returns a list of those elements in the first list that are not in the second list. For example, you could compare two directories on your system to find files that are in one but not the other. You can use the directory function for this.

(set 'd1 
 (directory "/Users/me/Library/Application Support/BBEdit"))
(set 'd2 
 (directory "/Users/me/Library/Application Support/TextWrangler"))

(difference d1 d2)
;-> ("AutoSaves" "Glossary" "HTML Templates" "Stationery" "Text Factories")

(difference d2 d1)
;-> ()

It's important which list you put first! There are five files or directories in directory d1 that aren't in directory d2, but there are no files or directories in d2 that aren't also in d1.

The intersect function finds the elements that are in both lists.

(intersect d2 d1)
;-> ("." ".." ".DS_Store" "Language Modules" "Menu Scripts" "Plug-Ins" "Read Me.txt" "Scripts" "Unix Support")

Both these functions can take an additional argument, which controls whether to keep or discard any duplicate items.

You could use the difference function to compare two revisions of a text file. Use parse (Parsing strings) to split the files into lines first:

(set 'd1 
 (parse (read-file "/Users/me/f1-(2006-05-29)-1.html") "\r" 0))

(set 'd2 
 (parse (read-file "/Users/me/f1-(2006-05-29)-6.html") "\r" 0))

(println (difference d1 d2))
(" <p class=\"body\">You could use this function to find" ...)

Association lists

edit

There are various techniques available in newLISP for storing information. One very easy and effective technique is to use a list of sublists, where the first element of each sublist is a key. This structure is known as an association list, but you could also think of it as a dictionary, since you look up information in the list by first looking for the key element.

You can also implement a dictionary using newLISP's contexts. See Introducing contexts.

You can make an association list using basic list functions. For example, you can supply a hand-crafted quoted list:

(set 'ascii-chart '(("a" 97) ("b" 98) ("c" 99) 
; ...
))

Or you could use functions like list and push to build the association list:

(for (c (char "a") (char "z"))
 (push (list (char c) c) ascii-chart -1))

ascii-chart
;-> (("a" 97) ("b" 98) ("c" 99) ... ("z" 122))

It's a list of sublists, and each sublist has the same format. The first element of a sublist is the key. The key can be a string, a number, or a symbol. You can have any number of data elements after the key.

Here's an association list that contains some data about the planets in the solar system:

(set 'sol-sys
 '(("Mercury" 0.382 0.06 0.387 0.241 7.00 0.206 58.6 0) 
   ("Venus" 0.949 0.82 0.72 0.615 3.39 0.0068 -243 0) 
   ("Earth" 1.00 1.00 1.00 1.00 0.00 0.0167 1.00 1) 
   ("Mars" 0.53 0.11 1.52 1.88 1.85 0.0934 1.03 2) 
   ("Jupiter" 11.2 318 5.20 11.86 1.31 0.0484 0.414 63) 
   ("Saturn" 9.41 95 9.54 29.46 2.48 0.0542 0.426 49) 
   ("Uranus" 3.98 14.6 19.22 84.01 0.77 0.0472 -0.718 27) 
   ("Neptune" 3.81 17.2 30.06 164.8 1.77 0.0086 0.671 13) 
   ("Pluto" 0.18 0.002 39.5 248.5 17.1 0.249 -6.5 3)
   )
   ; 0: Planet name 1: Equator diameter (earth) 2: Mass (earth) 
   ; 3: Orbital radius (AU) 4: Orbital period (years) 
   ; 5: Orbital Incline Angle 6: Orbital Eccentricity 
   ; 7: Rotation (days) 8: Moons
)

Each sublist starts with a string, the name of a planet, which is followed by data elements, numbers in this case. The planet name is the key. I've included some comments at the end, because I'm never going to remember that element 2 is the planet's mass, in Earth masses.

You could easily access this information using standard list-processing techniques, but newLISP offers some tailor-made functions that are designed to work specifically with these dictionary or association lists:

  • assoc finds the first occurrence of the keyword and return the sublist
  • lookup looks up the value of a keyword inside the sublist

Both assoc and lookup take the first element of the sublist, the key, and retrieve some data from the appropriate sublist. Here's assoc in action, returning the sublist:

(assoc "Uranus" sol-sys)
;-> ("Uranus" 3.98 14.6 19.22 84.01 0.77 0.0472 -0.718 27)

And here's lookup, which goes the extra mile and gets data out of an element of one of the sublists for you, or the final element if you don't specify one:

(lookup "Uranus" sol-sys)
;-> 27, moons - value of the final element of the sublist

(lookup "Uranus" sol-sys 2)
;-> 14.6, element 2 of the sublist is the planet's mass

This saves you having to use a combination of assoc and nth.

One problem that you might have when working with association lists with long sublists like this is that you can't remember what the index numbers represent. Here's one solution:

(constant 'orbital-radius 3)
(constant 'au 149598000)                 ; 1 au in km
(println "Neptune's orbital radius is " 
 (mul au (lookup "Neptune" sol-sys orbital-radius)) 
 " kilometres")
Neptune's orbital radius is 4496915880 kilometres

Here we've defined orbital-radius and au (astronomical unit) as constants, and you can use orbital-radius to refer to the right column of a sublist. This also makes the code easier to read. The constant function is like set, but the symbol you supply is protected against accidental change by another use of set. You can change the value of the symbol only by using the constant function again.

Having defined these constants, here's an expression that lists the different orbits of the planets, in kilometres:

(dolist (planet-data sol-sys)             ; go through list
 (set 'planet (first planet-data))        ; get name 
 (set 'orb-rad
  (lookup planet sol-sys orbital-radius)) ; get radius
 (println 
    (format "%-8s %12.2f %18.0f" 
     planet 
     orb-rad 
     (mul au orb-rad))))
Mercury          0.39           57894426
Venus            0.72          107710560
Earth            1.00          149598000
Mars             1.52          227388960
Jupiter          5.20          777909600
Saturn           9.54         1427164920
Uranus          19.22         2875273560
Neptune         30.06         4496915880
Pluto           39.50         5909121000


When you want to manipulate floating-point numbers, use the floating-point arithmetic operators add, sub, mul, div rather than +, -, *, and /, which work with (and convert values to) integers.

Replacing sublists in association lists

edit

To change values stored in an association list, use the assoc function as before, to find the matching sublist, then use setf on that sublist to change the value to a new sublist.

(setf (assoc "Jupiter" sol-sys) '("Jupiter" 11.2 318 5.20 11.86 1.31 0.0484 0.414 64))

Adding new items to association lists

edit

Association lists are ordinary lists, too, so you can use all the familiar newLISP techniques with them. Want to add a new 10th planet to our sol-sys list? Just use push:

(push '("Sedna" 0.093 0.00014 .0001 502 11500 0 20 0) sol-sys -1)

and check that it was added OK with:

(assoc "Sedna" sol-sys)
;-> ("Sedna" 0.093 0.00014 0.0001 502 11500 0 20 0)

You can use sort to sort the association list. (Remember though that sort changes lists permanently.) Here's a list of planets sorted by mass. Since you don't want to sort them by name, you use a custom sort (see sort and randomize) to compare the mass (index 2) values of each pair:

(constant 'mass 2)
(sort sol-sys (fn (x y) (> (x mass) (y mass))))

(println sol-sys)
("Jupiter" 11.2 318 5.2 11.86 1.31 0.0484 0.414 63) 
("Saturn" 9.41 95 9.54 29.46 2.48 0.0542 0.426 49) 
("Neptune" 3.81 17.2 30.06 164.8 1.77 0.0086 0.671 13) 
("Uranus" 3.98 14.6 19.22 84.01 0.77 0.0472 -0.718 27) 
("Earth" 1 1 1 1 0 0.0167 1 1) 
("Venus" 0.949 0.82 0.72 0.615 3.39 0.0068 -243 0) 
("Mars" 0.53 0.11 1.52 1.88 1.85 0.0934 1.03 2) 
("Mercury" 0.382 0.06 0.387 0.241 7 0.206 58.6 0) 
("Pluto" 0.18 0.002 39.5 248.5 17.1 0.249 -6.5 3)

You can also easily combine the data in the association list with other lists:

; restore to standard order - sort by orbit radius
(sort sol-sys (fn (x y) (< (x 3) (y 3))))   

; define Unicode symbols for planets
(set 'unicode-symbols 
  '(("Mercury" 0x263F )
    ("Venus" 0x2640 )
    ("Earth" 0x2641 )
    ("Mars" 0x2642 )
    ("Jupiter" 0x2643 )
    ("Saturn" 0x2644 ) 
    ("Uranus" 0x2645 )
    ("Neptune" 0x2646 )
    ("Pluto" 0x2647)))
(map 
 (fn (planet) 
 (println (char (lookup (first planet) unicode-symbols)) 
  "\t" 
 (first planet))) 
 sol-sys)
☿ (Unicode symbol for Mercury)
♀ (Unicode symbol for Venus)
♁ (Unicode symbol for Earth)
♂ (Unicode symbol for Mars)
♃ (Unicode symbol for Jupiter)
♄ (Unicode symbol for Saturn)
♅ (Unicode symbol for Uranus)
♆ (Unicode symbol for Neptune)
♇ (Unicode symbol for Pluto)

Here we've created a temporary inline function that map applies to each planet in sol-sys - lookup finds the planet name and retrieves the Unicode symbol for that planet from the unicode-symbols association list.

You can quickly remove an element from an association list with pop-assoc.

(pop-assoc (sol-sys "Pluto"))

This removes the Pluto element from the list.

newLISP offers powerful data storage facilities in the form of contexts, which you can use for building dictionaries, hash tables, objects, and so on. You can use association lists to build dictionaries, and work with the contents of dictionaries using association list functions. See Introducing contexts.

You can also use a database engine - see Using a SQLite database.

find-all and association lists

edit

Another form of find-all lets you search an association list for a sublist that matches a pattern. You can specify the pattern with wildcard characters. For example, here's an association list:

(set 'symphonies 
  '((Beethoven 9)
    (Haydn 104)
    (Mozart 41)
    (Mahler 10)
    (Wagner 1)
    (Schumann 4)
    (Shostakovich 15)
    (Bruckner 9)))

To find all the sublists that end with 9, use the match pattern (? 9), where the question mark matches any single item:

(find-all '(? 9) symphonies)
;-> ((Beethoven 9) (Bruckner 9))

(For more about match patterns - wild card searches for lists - see matching patterns in lists.)

You can also use this form with an additional action expression after the association list:

(find-all '(? 9) symphonies 
   (println (first $0) { wrote 9 symphonies.}))
Beethoven wrote 9 symphonies.
Bruckner wrote 9 symphonies.

Here, the action expression uses $0 to refer to each matched element in turn.