Introduction to newLISP/Working with numbers
Working with numbers
If you work with numbers, you'll be pleased to know that newLISP includes most of the basic functions that you'd expect to find, plus many more. This section is designed to help you make the best use of them and to avoid some of the minor pitfalls that you might encounter. As always, see the official documentation for full details.
Integers and floating-point numbers
newLISP handles two different types of number: the integer and the floating-point number. Integers are precise, whereas floating-point numbers (floats) are less precise. There are advantages and disadvantages to each. If you need to use very large integers, larger than 9 223 372 036 854 775 807, see the Bigger numbers section, which covers the differences between large (64-bit) integers and big integers (of unlimited size).
The arithmetic operators +, -, *, /, and % always return integer values. A common mistake is to forget this and use / and * without realising that they're carrying out integer arithmetic:
(/ 10 3)
;-> 3
This might not be what you were expecting!
Floating-point numbers keep only the 15 or 16 most significant digits (that is, the digits at the left of the number, with the highest place values).
The philosophy of a floating-point number is "that's close enough", rather than "that's the exact value".
Suppose you try to define a symbol PI to store the value of pi to 50 decimal places:
(constant 'PI 3.14159265358979323846264338327950288419716939937510)
;-> 3.141592654
(println PI)
3.141592654
It looks like newLISP has cut about 40 digits off the right hand side! In fact about 15 or 16 digits have been stored, and 35 of the less important digits have been discarded.
How does newLISP store this number? Let's look using the format function:
(format {%1.50f} PI)
;-> "3.14159265358979311599796346854418516159057617187500"
Now let's make a little script to compare both numbers as strings, so we don't have to hunt for the differences by eye:
(setq original-pi-str "3.14159265358979323846264338327950288419716939937510")
(setq pi (float original-pi-str))
(setq saved-pi-str (format {%1.50f} pi))
(println pi " -> saved pi (float)")
(println saved-pi-str " -> saved pi formatted")
(println original-pi-str " -> original pi")
(dotimes (i (length original-pi-str) (!= (original-pi-str i) (saved-pi-str i)))
(print (original-pi-str i)))
(println " -> original and saved versions are equal up to this")
3.141592654 -> saved pi (float)
3.14159265358979311599796346854418516159057617187500 -> saved pi formatted
3.14159265358979323846264338327950288419716939937510 -> original pi
3.141592653589793 -> original and saved versions are equal up to this
Notice how the value is accurate up to 9793, but then drifts away from the more precise string you originally supplied. The numbers after 9793 are typical of the way all computers store floating-point values - it isn't newLISP being creative with your data!
The largest float you can use seems to be - on my machine, at least - about 10^308. Only the first 15 or so digits are stored, though, so that's mostly zeroes, and you can't really add 1 to it.
Another example of the motto of a floating-point number: that's close enough!
The above comments are true for most computer languages, by the way, not just newLISP. Floating-point numbers are a compromise between convenience, speed, and accuracy.
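Because floats are only "close enough", comparing two of them for exact equality can be surprising. The following is a minimal sketch of a tolerance-based comparison; nearly-equal? is a hypothetical helper name (not a newLISP built-in), and the default tolerance of 1e-9 is an arbitrary assumption that you would tune for your own data.

```newlisp
; nearly-equal? is a hypothetical helper, not part of newLISP
; eps is an assumed default tolerance
(define (nearly-equal? a b (eps 1e-9))
  (< (abs (sub a b)) eps))

(= (add 0.1 0.2) 0.3)
; may well be nil: the two stored values can differ in their last bits

(nearly-equal? (add 0.1 0.2) 0.3)
;-> true

(= (add 1e308 1) 1e308)
;-> true, because the added 1 is far below the stored precision
```

A relative tolerance (scaled by the size of the operands) would be a better choice when the numbers involved are very large or very small.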
Integer and floating-point maths
When you're working with floating-point numbers, use the floating-point arithmetic operators add, sub, mul, div, and mod, rather than +, -, *, /, and %, their integer-only equivalents:
(mul PI 2)
;-> 6.283185307
and, to see the value that newLISP is storing (because the interpreter's default output resolution is 9 or 10 digits):
(format {%1.16f} (mul PI 2))
;-> "6.2831853071795862"
If you forget to use mul here, and use * instead, the numbers after the decimal point are thrown away:
(format {%1.16f} (* PI 2))
;-> "6.0000000000000000"
Here, pi was converted to 3 and then multiplied by 2.
You can re-define the familiar arithmetic operators so that they default to using floating-point routines rather than integer-only arithmetic:
; before
(+ 1.1 1.1)
;-> 2
(constant (global '+) add)
; after
(+ 1.1 1.1)
;-> 2.2
You could put these definitions in your init.lsp file to have them available for all newLISP work on your machine. The main problem you'll find is when sharing code with others, or using imported libraries. Their code might produce surprising results, or yours might!
Conversions: explicit and implicit
To convert strings into numbers, or numbers of one type into another, use the int and float functions.
The main use for these is to convert a string into a number - either an integer or a float. For example, you might be using a regular expression to extract a string of digits from a longer string:
(map int (find-all {\d+} {the answer is 42, not 41}))
;-> (42 41) ; a list of integers
(map float (find-all {\d+(\.\d+)?} {the value of pi is 3.14, not 1.618}))
;-> (3.14 1.618) ; a list of floats
A second argument passed to int specifies a default value which should be used if the conversion fails:
(int "x")
;-> nil
(int "x" 0)
;-> 0
int is a clever function that can also convert strings representing numbers in number bases other than 10 into numbers. For example, to convert a hexadecimal number in string form to a decimal number, make sure it is prefixed with 0x, and don't use letters beyond f:
(int (string "0x" "1F"))
;-> 31
(int (string "0x" "decaff"))
;-> 14600959
And you can convert strings containing octal numbers by prefixing them with just a 0:
(int "035")
;-> 29
Binary numbers can be converted by prefixing them with 0b:
(int "0b100100100101001001000000000000000000000010100100")
;-> 160881958715556
Even if you never use octal or hexadecimal, it's worth knowing about these conversions, because one day you might, either deliberately or accidentally, write this:
(int "08")
which evaluates to 0 rather than 8 - a failed octal-decimal conversion rather than the decimal 8 that you might have expected! For this reason, it's always a good idea to specify not only a default value but also a number base whenever you use int on string input:
(int "08" 0 10) ; default to 0 and assume base 10
;-> 8
If you're working with big integers (integers larger than 64-bit integers), use bigint rather than int. See Bigger numbers.
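The advice above (always give int a default value and a base when reading string input) can be wrapped up in a small function. This is only a sketch: parse-decimal and its fallback parameter are hypothetical names introduced for illustration.

```newlisp
; parse-decimal is a hypothetical helper, not a newLISP built-in
; fallback is returned when the string cannot be converted
(define (parse-decimal str (fallback 0))
  (int str fallback 10))

(parse-decimal "08")
;-> 8
(parse-decimal "042")
;-> 42
(parse-decimal "x")
;-> 0
(parse-decimal "x" -1)
;-> -1
```

Choosing a fallback that can't occur in valid input (such as -1 for a count) makes failed conversions easy to spot later.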
Invisible conversion and rounding
Some functions convert floating-point numbers to integers automatically. Since newLISP version 10.2.0 all operators made of letters of the alphabet produce floats and operators written with special characters produce integers.
So using ++ will convert your numbers to integers (truncating any fractional part), and using inc will convert your numbers to floats:
(setq an-integer 2)
;-> 2
(float? an-integer)
;-> nil
(inc an-integer)
;-> 3
(float? an-integer)
;-> true
(setq a-float (sqrt 2))
;-> 1.414213562
(integer? a-float)
;-> nil
(++ a-float)
;-> 2
(integer? a-float)
;-> true
To make inc and dec work on lists you need to access specific elements or use map to process all:
(setq numbers '(2 6 9 12))
;-> (2 6 9 12)
(inc (numbers 0))
;-> 3
numbers
;-> (3 6 9 12)
(map inc numbers)
;-> (4 7 10 13)
; but WATCH OUT!
(map (curry inc 3) numbers) ; this one doesn't produce what you expected
;-> (6 12 21 33)
; use this instead:
(map (curry + 3) numbers)
;-> (6 9 12 15)
Many newLISP functions automatically convert integer arguments into floating-point values. This usually isn't a problem. But it's possible to lose some precision if you pass extremely large integers to functions that convert to floating-point:
(format {%15.15f} (add 1 922337203685477580))
;-> "922337203685477632.000000000000000"
Because the add function converted the very large integer to a float, a small amount of precision was lost (amounting to about 52, in this case). Close enough? If not, think carefully about how you store and manipulate numbers.
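For comparison, here is the same sum done with the integer operator, which keeps every digit. This is just a side-by-side sketch of the two approaches, using the number from the example above.

```newlisp
; integer addition: no conversion, no loss
(+ 1 922337203685477580)
;-> 922337203685477581

; floating-point addition, converted back to an integer afterwards:
; the low-order digits have already been lost by then
(= (int (add 1 922337203685477580)) 922337203685477581)
;-> nil
```

So when every digit of a large whole number matters, stay with the integer operators (or big integers) from start to finish.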
Number testing
Sometimes you will want to test whether a number is an integer or a float:
(set 'PI 3.141592653589793)
;-> 3.141592654
(integer? PI)
;-> nil
(float? PI)
;-> true
(number? PI)
;-> true
(zero? PI)
;-> nil
With integer? and float?, you're testing whether the number is stored as an integer or float, not whether the number is mathematically an integer or a floating-point value. For example, this test returns nil, which might surprise you:
(integer? (div 30 3))
;-> nil
It's not that the answer isn't 10 (it is), but rather that the answer is a floating-point 10, not an integer 10, because the div function always returns a floating-point value.
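If what you really want to know is whether a value is mathematically a whole number, regardless of how it's stored, you can compare it with its own floor. The following is a sketch; whole-number? is a hypothetical name, not a built-in.

```newlisp
; whole-number? is a hypothetical helper, not part of newLISP
; true when n has no fractional part, whether stored as integer or float
(define (whole-number? n)
  (and (number? n) (= n (floor n))))

(whole-number? (div 30 3))
;-> true
(whole-number? 10)
;-> true
(whole-number? 3.5)
;-> nil
```

Bear in mind the usual floating-point caveat: a result that should be whole in theory might be stored as something very slightly different.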
Absolute signs, from floor to ceiling
It's worth knowing that the floor and ceil functions return floating-point numbers that contain integer values. For example, if you use floor to round pi down to the nearest integer, the result is 3, but it's stored as a float not as an integer:
(integer? (floor PI))
;-> nil
(floor PI)
;-> 3
(float? (ceil PI))
;-> true
The abs and sgn functions can also be used when testing and converting numbers. abs always returns a positive version of its argument, and sgn returns 1, 0, or -1, depending on whether the argument is positive, zero, or negative.
The round function rounds numbers to the nearest whole number, with floats remaining floats. You can also supply an optional additional value to round the number to a specific number of digits. Negative numbers round after the decimal point, positive numbers round before the decimal point.
(set 'n 1234.6789)
(for (i -6 6)
(println (format {%4d %12.5f} i (round n i))))
  -6   1234.67890
  -5   1234.67890
  -4   1234.67890
  -3   1234.67900
  -2   1234.68000
  -1   1234.70000
   0   1235.00000
   1   1230.00000
   2   1200.00000
   3   1000.00000
   4      0.00000
   5      0.00000
   6      0.00000
sgn has an alternative syntax that lets you evaluate up to three different expressions depending on whether the first argument is negative, zero, or positive.
(for (i -5 5)
(println i " is " (sgn i "below 0" "0" "above 0")))
-5 is below 0
-4 is below 0
-3 is below 0
-2 is below 0
-1 is below 0
0 is 0
1 is above 0
2 is above 0
3 is above 0
4 is above 0
5 is above 0
Number formatting
To convert numbers into strings, use the string and format functions:
(reverse (string PI))
;-> "456395141.3"
Both string and println use only the first 10 or so digits, even though more (up to 15 or 16) are stored internally.
Use format to output numbers with more control:
(format {%1.15f} PI)
;-> "3.141592653589793"
The format specification string uses the widely-adopted printf-style formatting. Remember too that you can use the results of the format function:
(string "the value of pi is " (format {%1.15f} PI))
;-> "the value of pi is 3.141592653589793"
The format function lets you output numbers as hexadecimal strings as well:
(format "%x" 65535)
;-> "ffff"
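Here are a few more printf-style specifications that often come in handy. The values are arbitrary examples chosen for illustration.

```newlisp
(format "%05d" 42)        ; pad an integer with leading zeros
;-> "00042"
(format "%.2f" 3.14159)   ; two digits after the decimal point
;-> "3.14"
(format "%8.3f" 3.14159)  ; total width 8, right-aligned
;-> "   3.142"
(format "%X" 255)         ; upper-case hexadecimal
;-> "FF"
(format "%d items at %.2f each" 3 9.5)  ; several values at once
;-> "3 items at 9.50 each"
```

Note that %d expects an integer and %f a float, so match the specification to the kind of number you're passing.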
Number utilities
Creating numbers
There are some useful functions that make creating numbers easy.
Sequences and series
sequence produces a list of numbers in an arithmetical sequence. Supply start and finish numbers (inclusive), and a step value:
(sequence 1 10 1.5)
;-> (1 2.5 4 5.5 7 8.5 10)
If you specify a step value, all the numbers are stored as floats, even if the results are integers, otherwise they're integers:
; with step value sequence gives floats
(sequence 1 10 2)
;-> (1 3 5 7 9)
(map float? (sequence 1 10 2))
;-> (true true true true true)
; without step value sequence gives integers
(sequence 1 5)
;-> (1 2 3 4 5)
(map float? (sequence 1 5))
;-> (nil nil nil nil nil)
series multiplies its first argument by its second argument a number of times. The number of repeats is specified by the third argument. This produces geometric sequences:
(series 1 2 20)
;-> (1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288)
Every number is stored as a float.
The second argument of series can also be a function. The function is applied to the first number, then to the result, then to that result, and so on.
(series 10 sqrt 20)
;-> (10 3.16227766 1.77827941 1.333521432 1.154781985 1.074607828 1.036632928
1.018151722 1.009035045 1.004507364 1.002251148 1.001124941 1.000562313
1.000281117 1.000140549 1.000070272 1.000035135 1.000017567 1.00000878
1.000004392)
The normal function returns a list of floating-point numbers with a specified mean and a standard deviation. For example, a list of 6 numbers with a mean of 10 and a standard deviation of 5 can be produced as follows:
(normal 10 5 6)
;-> (6.5234375 14.91210938 6.748046875 3.540039062 4.94140625 7.1484375)
Random numbers
rand creates a list of randomly chosen integers less than a number you supply:
(rand 7 20)
; 20 numbers between 0 and 6 (inclusive) or 7 (exclusive)
;-> (0 0 2 6 6 6 2 1 1 1 6 2 0 6 0 5 2 4 4 3)
Obviously (rand 1) generates a list of zeroes and isn't useful. (rand 0) doesn't do anything useful either, but it's been assigned the job of initializing the random number generator.
If you leave out the second number, it just generates a single random number in the range.
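Since rand counts from 0, you usually have to shift the results to get a range that starts somewhere else. As a sketch, here's a dice roller; roll-dice is a hypothetical name made up for this example.

```newlisp
; roll-dice is a hypothetical helper, not a newLISP built-in
; (rand 6 n) gives n integers from 0 to 5, so add 1 to each
(define (roll-dice n)
  (map (fn (x) (+ x 1)) (rand 6 n)))

(roll-dice 5)
; a list of 5 integers, each between 1 and 6 inclusive
```

The same shift-by-an-offset idea works for any range: generate the width of the range with rand, then add the lowest value you want.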
random generates a list of floating-point numbers multiplied by a scale factor, starting at the first argument:
(random 0 2 10)
; 10 numbers starting at 0 and scaled by 2
;-> (1.565273852e-05 0.2630755763 1.511210644 0.9173002638
;    1.065534475 0.4379183727 0.09408923243 1.357729434
;    1.358592812 1.869385792)
Randomness
Use seed to control the randomness of rand (integers), random (floats), randomize (shuffled lists), and amb (list elements chosen at random).
If you don't use seed, the same set of random numbers appears each time. This provides you with a predictable randomness - useful for debugging. When you want to simulate the randomness of the real world, seed the random number generator with a different value each time you run the script:
Without seed:
; today
(for (i 10 20)
(print (rand i) { }))
7 1 5 10 6 2 8 0 17 18 0
; tomorrow
(for (i 10 20)
(print (rand i) { }))
7 1 5 10 6 2 8 0 17 18 0 ; same as yesterday
With seed:
; today
(seed (date-value))
(for (i 10 20)
(print (rand i) { }))
2 10 3 10 1 11 8 13 6 4 0
; tomorrow
(seed (date-value))
(for (i 10 20)
(print (rand i) { }))
0 7 10 5 5 8 10 16 3 1 9
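The flip side of seeding is that you can reproduce a "random" run exactly by reusing a seed, which is handy when you want to replay a simulation or write a repeatable test. A small sketch, where 42 is an arbitrary seed value:

```newlisp
(seed 42)
(set 'first-run (rand 100 5))
(seed 42)                      ; same seed again
(set 'second-run (rand 100 5))
(= first-run second-run)
;-> true
```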
General number tools
min and max work as you would expect, though they always return floats. Like many of the arithmetic operators, you can supply more than one value:
(max 1 2 13.2 4 2 1 4 3 2 1 0.2)
;-> 13.2
(min -1 2 17 4 2 1 43 -20 1.1 0.2)
;-> -20
(float? (max 1 2 3))
;-> true
The comparison functions allow you to supply just a single argument. If you use them with numbers, newLISP helpfully assumes that you're comparing with 0. Remember that you're using prefix notation:
(set 'n 3)
(> n)
;-> true, assumes test for greater than 0
(< n)
;-> nil, assumes test for less than 0
(set 'n 0)
(>= n)
;-> true
The factor function finds the factors for an integer and returns them in a list. It's a useful way of testing a number to see if it's prime:
(factor 5)
;-> (5)
(factor 42)
;-> (2 3 7)
(define (prime? n)
(and
(set 'lst (factor n))
(= (length lst) 1)))
(for (i 0 30)
(if (prime? i) (println i)))
2 3 5 7 11 13 17 19 23 29
Or you could use it to test if a number is even:
(true? (find 2 (factor n)))
;-> true if n is even
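A more direct (and cheaper) way to test for evenness is to look at the remainder after dividing by 2. This is a sketch; is-even? is a hypothetical name. Unlike the factor-based test, it also handles 0 and negative numbers, which have no list of factors to search.

```newlisp
; is-even? is a hypothetical helper, not taken from the text above
(define (is-even? n)
  (= 0 (% n 2)))

(is-even? 42)
;-> true
(is-even? 7)
;-> nil
(is-even? 0)
;-> true
```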
gcd finds the largest integer that exactly divides two or more numbers:
(gcd 8 12 16)
;-> 4
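gcd also gives you an easy route to the least common multiple of two numbers. The function below is a sketch: lcm is a hypothetical name (it isn't a newLISP built-in), and it assumes both arguments are non-zero integers.

```newlisp
; lcm is a hypothetical helper; assumes a and b are non-zero integers
(define (lcm a b)
  (/ (* a b) (gcd a b)))

(lcm 8 12)
;-> 24
(lcm 4 6)
;-> 12
```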
Floating-point utilities
If omitted, the second argument to the pow function defaults to 2.
(pow 2) ; default is squared
;-> 4
(pow 2 2 2 2) ; (((2 squared) squared) squared)
;-> 256
(pow 2 8) ; 2 to the 8
;-> 256
(pow 2 3)
;-> 8
(pow 2 0.5) ; square root
;-> 1.414213562
You can also use sqrt to find square roots. To find cube and other roots, use pow:
(pow 8 (div 1 3)) ; 8 to the 1/3
;-> 2
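If you take roots often, the pow idiom can be wrapped in a function. A minimal sketch, where nth-root is a hypothetical name; note the use of div so that 1/n isn't truncated by integer division.

```newlisp
; nth-root is a hypothetical helper, not a newLISP built-in
(define (nth-root x n)
  (pow x (div 1 n)))

(nth-root 8 3)     ; cube root of 8
(nth-root 81 4)    ; fourth root of 81
```

The results are floats, so expect values that are very close to 2 and 3 rather than guaranteed exact integers.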
The exp function calculates e^x, where e is the mathematical constant 2.718281828, and x is the argument:
(exp 1)
;-> 2.718281828
The log function has two forms. If you omit the base, natural logarithms are used:
(log 3) ; natural (base e) logarithms
;-> 1.098612289
Or you can specify another base, such as 2 or 10:
(log 3 2)
;-> 1.584962501
(log 3 10) ; logarithm base 10
;-> 0.4771212547
Other mathematical functions available by default in newLISP are fft (fast Fourier transform), and ifft (inverse fast Fourier transform).
Trigonometry
All newLISP's trigonometry functions, sin, cos, tan, asin, acos, atan, atan2, and the hyperbolic functions sinh, cosh, and tanh, work in radians. If you prefer to work in degrees, you can define alternative versions as functions:
(constant 'PI 3.141592653589793)
(define (rad->deg r)
(mul r (div 180 PI)))
(define (deg->rad d)
(mul d (div PI 180)))
(define (sind _e)
(sin (deg->rad (eval _e))))
(define (cosd _e)
(cos (deg->rad (eval _e))))
(define (tand _e)
(tan (deg->rad (eval _e))))
(define (asind _e)
(rad->deg (asin (eval _e))))
(define (atan2d _e _f)
(rad->deg (atan2 (deg->rad (eval _e)) (deg->rad (eval _f)))))
and so on.
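Here's a quick check of degree-based functions in use. So that the snippet runs on its own, it repeats deg->rad and sind in a simplified form (without eval, which is an assumption that you'll always pass plain numbers) and adds a cosd in the same style.

```newlisp
; simplified, self-contained versions of the degree helpers
(define (deg->rad d)
  (mul d (div 3.141592653589793 180)))
(define (sind d)
  (sin (deg->rad d)))
(define (cosd d)
  (cos (deg->rad d)))

; results are floats, so round before displaying or comparing
(round (sind 30) -9)
;-> 0.5
(round (cosd 60) -9)
;-> 0.5
(round (sind 90) -9)
;-> 1
```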
When writing equations, one approach is to build them up from the end first. For example, to convert an equation like this:
alpha = atan2(sin(lamda) cos(epsilon) - tan(beta) sin(epsilon), cos(lamda))
build it up in stages, like this:
1 (tand beta)
2 (tand beta) (sind epsilon)
3 (mul (tand beta) (sind epsilon))
4 (sind lamda) (mul (tand beta) (sind epsilon))
5 (sind lamda) (cosd epsilon) (mul (tand beta) (sind epsilon))
6 (sub (mul (sind lamda) (cosd epsilon))
(mul (tand beta) (sind epsilon)))
7 (atan2d (sub (mul (sind lamda) (cosd epsilon)) (mul (tand beta)(sind epsilon)))
(cosd lamda))
8 (set 'alpha
and so on...
It's often useful to line up the various expressions in your text editor:
(set 'right-ascension
(atan2d
(sub
(mul
(sind lamda)
(cosd epsilon))
(mul
(tand beta)
(sind epsilon)))
(cosd lamda)))
If you have to convert a lot of mathematical expressions from infix to prefix notation, you might want to investigate the infix.lsp module (available from the newLISP website):
(load "/usr/share/newlisp/modules/infix.lsp")
(INFIX:xlate
  "(sin(lamda) * cos(epsilon)) - (tan(beta) * sin(epsilon))")
;->
(sub (mul (sin lamda) (cos epsilon)) (mul (tan beta) (sin epsilon)))
Arrays
newLISP provides multidimensional arrays. Arrays are very similar to lists, and you can use most of the functions that operate on lists on arrays too.
A large array can be faster than a list of similar size. The following code uses the time function to compare how fast arrays and lists work.
(for (size 200 1000)
  ; create an array
  (set 'arry (array size (randomize (sequence 0 size))))
  ; create a list
  (set 'lst (randomize (sequence 0 size)))
  ; repeat at least 100 times to get non-zero time!
  (set 'array-time
    (time (dotimes (x (/ size 2))
      (nth x arry)) 100))
  ; the list loop is repeated only 50 times, so the ratio
  ; printed below understates the difference
  (set 'list-time
    (time (dotimes (x (/ size 2))
      (nth x lst)) 50))
  (println "with " size " elements: array access: "
    array-time
    "; list access: "
    list-time
    " "
    (div list-time array-time)))
with 200 elements: array access: 1; list access: 1 1
with 201 elements: array access: 1; list access: 1 1
with 202 elements: array access: 1; list access: 1 1
with 203 elements: array access: 1; list access: 1 1
...
with 997 elements: array access: 7; list access: 16 2.285714286
with 998 elements: array access: 7; list access: 17 2.428571429
with 999 elements: array access: 7; list access: 17 2.428571429
with 1000 elements: array access: 7; list access: 17 2.428571429
The exact times will vary from machine to machine, but typically, with 200 elements, arrays and lists are comparable in speed. As the sizes of the list and array increase, the execution time of the nth accessor function increases. By the time the list and array contain 1000 elements each, the array is 2 to 3 times faster to access than the list.
To create an array, use the array function. You can make a new empty array, make a new one and fill it with default values, or make a new array that's an exact copy of an existing list.
(set 'table (array 10)) ; new empty array
(set 'lst (randomize (sequence 0 20))) ; new full list
(set 'arry (array (length lst) lst)) ; new array copy of a list
To make a new list that's a copy of an existing array, use the array-list function:
(set 'lst2 (array-list arry)) ; makes new list
To tell the difference between lists and arrays, you can use the list? and array? tests:
(array? arry)
;-> true
(list? lst)
;-> true
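The two tests are mutually exclusive, which is worth remembering when a function has to accept either kind of container. A short sketch with arbitrary example data:

```newlisp
(set 'lst '(1 2 3))
(set 'arry (array 3 lst))

(array? lst)
;-> nil
(list? arry)
;-> nil

; converting back gives a list equal to the original
(= (array-list arry) lst)
;-> true
```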
Functions available for arrays
The following general-purpose functions work equally well on arrays and lists: first, last, rest, mat, nth, setf, sort, append, and slice.
There are also some special functions for arrays and lists that provide matrix operations: invert, det, multiply, transpose. See Matrices.
Arrays can be multi-dimensional. For example, to create a 2 by 2 table, filled with 0s, use this:
(set 'arry (array 2 2 '(0)))
;-> ((0 0) (0 0))
The third argument to array supplies some initial values that newLISP will use to fill the array. newLISP uses the value as effectively as it can. So, for example, you can supply a more than sufficient initializing expression:
(set 'arry (array 2 2 (sequence 0 10)))
arry
;-> ((0 1) (2 3)) ; don't need all of them
or just provide a hint or two:
(set 'arry (array 2 2 (list 1 2)))
arry
;-> ((1 2) (1 2))
(set 'arry (array 2 2 '(42)))
arry
;-> ((42 42) (42 42))
This array initialization facility is cool, so I sometimes use it even when I'm creating lists:
(set 'maze (array-list (array 10 10 (randomize (sequence 0 10)))))
;-> ((9 4 0 2 10 6 7 1 8 5)
(3 9 4 0 2 10 6 7 1 8)
(5 3 9 4 0 2 10 6 7 1)
(8 5 3 9 4 0 2 10 6 7)
(1 8 5 3 9 4 0 2 10 6)
(7 1 8 5 3 9 4 0 2 10)
(6 7 1 8 5 3 9 4 0 2)
(10 6 7 1 8 5 3 9 4 0)
(2 10 6 7 1 8 5 3 9 4)
(0 2 10 6 7 1 8 5 3 9))
Getting and setting values
To get values from an array, use the nth function, which expects a list of indices for the dimensions of the array, followed by the name of the array:
(set 'size 10)
(set 'table (array size size (sequence 0 (pow size))))
(dotimes (row size)
(dotimes (column size)
(print (format {%3d} (nth (list row column) table))))
; end of row
(println))
  0  1  2  3  4  5  6  7  8  9
 10 11 12 13 14 15 16 17 18 19
 20 21 22 23 24 25 26 27 28 29
 30 31 32 33 34 35 36 37 38 39
 40 41 42 43 44 45 46 47 48 49
 50 51 52 53 54 55 56 57 58 59
 60 61 62 63 64 65 66 67 68 69
 70 71 72 73 74 75 76 77 78 79
 80 81 82 83 84 85 86 87 88 89
 90 91 92 93 94 95 96 97 98 99
(nth also works with lists and strings.)
As with lists, you can use implicit addressing to get values:
(set 'size 10)
(set 'table (array size size (sequence 0 (pow size))))
(table 3)
;-> (30 31 32 33 34 35 36 37 38 39) ; row 3 (0-based!)
(table 3 3) ; row 3 column 3 implicitly
;-> 33
To set values, use setf. The following code replaces every number that isn't prime with 0.
(set 'size 10)
(set 'table (array size size (sequence 0 (pow size))))
(dotimes (row size)
(dotimes (column size)
(if (not (= 1 (length (factor (nth (list row column) table)))))
(setf (table row column) 0))))
table
;-> ((0 0 2 3 0 5 0 7 0 0)
(0 11 0 13 0 0 0 17 0 19)
(0 0 0 23 0 0 0 0 0 29)
(0 31 0 0 0 0 0 37 0 0)
(0 41 0 43 0 0 0 47 0 0)
(0 0 0 53 0 0 0 0 0 59)
(0 61 0 0 0 0 0 67 0 0)
(0 71 0 73 0 0 0 0 0 79)
(0 0 0 83 0 0 0 0 0 89)
(0 0 0 0 0 0 0 97 0 0))
Instead of the implicit addressing (table row column), I could have written (setf (nth (list row column) table) 0). Implicit addressing is slightly faster, but using nth can make code easier to read sometimes.
Matrices
There are functions that treat an array or a list (with the correct structure) as a matrix.
- invert returns the inversion of a matrix
- det calculates the determinant
- multiply multiplies two matrices
- mat applies a function to two matrices or to a matrix and a number
- transpose returns the transposition of a matrix
transpose is also useful when used on nested lists (see Association lists).
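Here's a short sketch of these functions at work on a pair of 2 by 2 matrices stored as nested lists. The matrices A and B are arbitrary examples. The results of multiply, det, and mat are floats, even where they happen to be whole numbers.

```newlisp
(set 'A '((1 2) (3 4)))
(set 'B '((5 6) (7 8)))

(multiply A B)
;-> ((19 22) (43 50))
(transpose A)
;-> ((1 3) (2 4))
(mat + A B)        ; element-by-element addition
;-> ((6 8) (10 12))
(det A)            ; -2, give or take floating-point rounding
(multiply A (invert A))   ; very close to the identity matrix
```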
Statistics, financial, and modelling functions
newLISP has an extensive set of functions for financial and statistical analysis, and for simulation modelling.
Given a list of numbers, the stats function returns the number of values, the mean, average deviation from mean value, standard deviation (population estimate), variance (population estimate), skew of distribution, and kurtosis of distribution:
(set 'data (sequence 1 10))
;->(1 2 3 4 5 6 7 8 9 10)
(stats data)
(10 5.5 2.5 3.02765035409749 9.16666666666667 0 -1.56163636363636)
Here's a list of other functions built in:
- beta calculate the beta function
- betai calculate the incomplete beta function
- binomial calculate the binomial function
- corr calculate the Pearson product-moment correlation coefficient
- crit-chi2 calculate the Chi square for a given probability
- crit-f calculate the critical minimum F for a given confidence probability
- crit-t calculate the critical minimum Student's t for a given confidence probability
- crit-z calculate the critical normal distributed Z value of a given cumulated probability
- erf calculate the error function of a number
- gammai calculate the incomplete gamma function
- gammaln calculate the log gamma function
- kmeans-query calculate the Euclidean distances from the data vector to centroids
- kmeans-train perform Kmeans cluster analysis on matrix-data
- normal produce a list of normal distributed floating point numbers
- prob-chi2 calculate the cumulated probability of a Chi square
- prob-f find the probability of an observed statistic
- prob-t find the probability of normal distributed value
- prob-z calculate the cumulated probability of a Z value
- stats find statistical values of central tendency and distribution moments of values
- t-test use student's t-test to compare the mean value
Bayesian analysis
Statistical methods developed initially by Reverend Thomas Bayes in the 18th century have proved versatile and popular enough to enter the programming languages of today. In newLISP, two functions, bayes-train and bayes-query, work together to provide an easy way to calculate Bayesian probabilities for datasets.
Here's how to use the two functions to predict the likelihood that a short piece of text is written by one of two authors.
First, choose texts from the two authors, and generate datasets for each. I've chosen Oscar Wilde and Conan Doyle.
(set 'doyle-data
(parse (lower-case
(read-file "/Users/me/Documents/sign-of-four.txt")) {\W} 0))
(set 'wilde-data
(parse (lower-case
(read-file "/Users/me/Documents/dorian-grey.txt")) {\W} 0))
The bayes-train function can now scan these two data sets and store the word frequencies in a new context, which I'm calling Lexicon:
(bayes-train doyle-data wilde-data 'Lexicon)
This context now contains a list of words that occur in the lists, and the frequencies of each. For example:
Lexicon:_always
;-> (21 110)
That is, the word always appeared 21 times in Conan Doyle's text, and 110 times in Wilde's. Next, the Lexicon context can be saved in a file:
(save "/Users/me/Documents/lex.lsp" 'Lexicon)
and reloaded whenever necessary with:
(load "/Users/me/Documents/lex.lsp")
With training completed, you can use the bayes-query function to look up a list of words in a context, and return two numbers, the probabilities of the words belonging to the first or second set of words. Here are three queries. Remember that the first set was Doyle, the second was Wilde:
(set 'quote1
(bayes-query
(parse (lower-case
"the latest vegetable alkaloid" ) {\W} 0)
'Lexicon))
;-> (0.973352412 0.02664758802)
(set 'quote2
(bayes-query
(parse
(lower-case
"observations of threadbare morality to listen to" ) {\W} 0)
'Lexicon))
;-> (0.5 0.5)
(set 'quote3
(bayes-query
(parse
(lower-case
"after breakfast he flung himself down on a divan
and lit a cigarette" ){\W} 0)
'Lexicon))
;-> (0.01961482169 0.9803851783)
These numbers suggest that quote1 is probably (97% certain) from Conan Doyle, that quote2 is neither Doylean nor Wildean, and that quote3 is likely to be from Oscar Wilde.
Perhaps that was lucky, but it's a good result. The first quote is from Doyle's A Study in Scarlet, and the third is from Wilde's Lord Arthur Savile's Crime, both texts that were not included in the training process but - apparently - typical of the authors' vocabulary. The second quote is from Jane Austen, and the methods developed by the Reverend are unable to assign it to either of the authors.
Financial functions
newLISP offers the following financial functions:
- fv returns the future value of an investment
- irr returns the internal rate of return
- nper returns the number of periods for an investment
- npv returns the net present value of an investment
- pmt returns the payment for a loan
- pv returns the present value of an investment
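As an illustration, here's a sketch that uses pmt to work out the monthly payment on a loan, and checks it against the standard annuity formula. The loan figures are made-up examples, annuity-payment is a hypothetical helper name, and the argument order for pmt (interest rate per period, number of periods, principal) is my reading of the reference manual, so check it there before relying on it.

```newlisp
; example figures: borrow 100000 at 7% a year, repaid monthly over 20 years
(set 'rate (div 7 1200))     ; interest rate per month
(set 'periods 240)           ; 20 years of monthly payments
(set 'principal 100000)

(pmt rate periods principal)
; the payment per period, reported as a negative number (money paid out)

; annuity-payment is a hypothetical helper implementing the usual
; closed-form formula: pv * r / (1 - (1 + r)^-n)
(define (annuity-payment r n pv)
  (div (mul pv r) (sub 1 (pow (add 1 r) (sub 0 n)))))

(annuity-payment rate periods principal)
; the same amount, as a positive number
```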
Logic programming
The programming language Prolog made popular a type of logic programming called unification. newLISP provides a unify function that can carry out unification, by matching expressions.
(unify '(X Y) '((+ 1 2) (- (* 4 5))))
((X (+ 1 2)) (Y (- (* 4 5))))
When using unify, unbound variables start with an uppercase character to distinguish them from symbols.
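A few more cases may help to show the behaviour: a successful match returns a list of variable bindings, a failed match returns nil, and (as far as I can tell from the reference manual) bind can then turn the bindings into ordinary symbol assignments. The expressions here are arbitrary examples.

```newlisp
(unify '(A B) '(x y))      ; A and B are variables, x and y are atoms
;-> ((A x) (B y))

(unify 'abc 'xyz)          ; two different atoms can't be unified
;-> nil

(bind (unify '(A B) '(x y)))   ; assign the bindings to the symbols
A
;-> x
B
;-> y
```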
Bit operators
The bit operators treat numbers as if they consist of 1's and 0's. We'll use a utility function that prints out numbers in binary format using the bits function:
(define (binary n)
(if (< n 0)
; use string format for negative numbers
(println (format "%6d %064s" n (bits n)))
; else, use decimal format to be able to prefix with zeros
(println (format "%6d %064d" n (int (bits n))))))
This function prints out both the original number and a binary representation of it:
(binary 6)
;-> 6 0000000000000000000000000000000000000000000000000000000000000110
;-> " 6 0000000000000000000000000000000000000000000000000000000000000110"
The shift functions (<< and >>) move the bits to the left or right:
(binary (<< 6)) ; shift left
;-> 12 0000000000000000000000000000000000000000000000000000000000001100
;->" 12 0000000000000000000000000000000000000000000000000000000000001100"
(binary (>> 6)) ; shift right
;-> 3 0000000000000000000000000000000000000000000000000000000000000011
;->" 3 0000000000000000000000000000000000000000000000000000000000000011"
The following operators compare the bits of two or more numbers. Using 4 and 5 as examples:
(map binary '(5 4))
;-> 5 0000000000000000000000000000000000000000000000000000000000000101
;-> 4 0000000000000000000000000000000000000000000000000000000000000100
;-> (" 5 0000000000000000000000000000000000000000000000000000000000000101"
;-> " 4 0000000000000000000000000000000000000000000000000000000000000100")
(binary (^ 4 5)) ; exclusive or: 1 if only 1 of the two bits is 1
;-> 1 0000000000000000000000000000000000000000000000000000000000000001
;->" 1 0000000000000000000000000000000000000000000000000000000000000001"
(binary (| 4 5)) ; or: 1 if either or both bits are 1
;-> 5 0000000000000000000000000000000000000000000000000000000000000101
;->" 5 0000000000000000000000000000000000000000000000000000000000000101"
(binary (& 4 5)) ; and: 1 only if both are 1
;-> 4 0000000000000000000000000000000000000000000000000000000000000100
;->" 4 0000000000000000000000000000000000000000000000000000000000000100"
The negate or not function (~) reverses all the bits in a number, exchanging 1's and 0's:
(binary (~ 5)) ; not: 1 <-> 0
;-> -6 1111111111111111111111111111111111111111111111111111111111111010
;->" -6 1111111111111111111111111111111111111111111111111111111111111010"
The binary function shown above gets its digits from the bits function. A hand-written version could instead use the & function to test the last bit of the number to see if it's a 1, and the >> function to shift the number 1 bit to the right, ready for the next iteration.
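As a sketch of that test-and-shift technique, the following function builds a binary string one bit at a time. to-binary is a hypothetical name, and this version assumes a non-negative integer argument.

```newlisp
; to-binary is a hypothetical helper; assumes n is a non-negative integer
(define (to-binary n)
  (let (digits "")
    (if (= n 0) (setq digits "0"))
    (while (> n 0)
      ; (& n 1) is the last bit: put it in front of the digits so far
      (setq digits (string (& n 1) digits))
      ; drop the last bit, ready for the next iteration
      (setq n (>> n 1)))
    digits))

(to-binary 6)
;-> "110"
(to-binary 5)
;-> "101"
```

For everyday use the built-in bits function does the same job, but the loop shows how & and >> cooperate.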
One use for the OR operator (|) is when you want to combine regular expression options with the regex function.
crc32 calculates a 32 bit CRC (Cyclic Redundancy Check) for a string.
Bigger numbers
For most applications, integer calculations in newLISP involve whole numbers up to 9223372036854775807 or down to -9223372036854775808. These are the largest integers you can store using 64 bits. If you add 1 to the largest 64-bit integer, you'll 'roll over' (or wrap round) to the negative end of the range:
(set 'large-int 9223372036854775807)
(+ large-int 1)
;-> -9223372036854775808
But newLISP can handle much bigger integers than this, the so-called 'bignums' or 'big integers'.
(set 'number-of-atoms-in-the-universe 100000000000000000000000000000000000000000000000000000000000000000000000000000000)
;-> 100000000000000000000000000000000000000000000000000000000000000000000000000000000L
(++ number-of-atoms-in-the-universe)
;-> 100000000000000000000000000000000000000000000000000000000000000000000000000000001L
(length number-of-atoms-in-the-universe)
;-> 81
(float number-of-atoms-in-the-universe)
;->1e+80
Notice that newLISP indicates a big integer using a trailing "L". Usually, you can do calculations with big integers without any thought:
(* 100000000000000000000000000000000 100000000000000000000000000000)
;-> 10000000000000000000000000000000000000000000000000000000000000L
Here both operands are big integers, so the answer is automatically big as well.
However, you need to take more care when your calculations combine big integers with other types of number. The rule is that the first argument of a calculation determines whether to use big integers. Compare this loop:
(for (i 1 10) (println (+ 9223372036854775800 i)))
9223372036854775801
9223372036854775802
9223372036854775803
9223372036854775804
9223372036854775805
9223372036854775806
9223372036854775807
-9223372036854775808
-9223372036854775807
-9223372036854775806
;-> -9223372036854775806
with this:
(for (i 1 10) (println (+ 9223372036854775800L i))) ; notice the "L"
9223372036854775801L
9223372036854775802L
9223372036854775803L
9223372036854775804L
9223372036854775805L
9223372036854775806L
9223372036854775807L
9223372036854775808L
9223372036854775809L
9223372036854775810L
;-> 9223372036854775810L
In the first example, the first argument of the function was a large (64-bit) integer. So adding 1 to the largest possible 64-bit integer caused a roll-over - the calculation stayed in the large integer realm.
In the second example, the L appended to the first argument of the addition forced newLISP to switch to big integer operations even though both the operands were 64 bit integers. The size of the first argument determines the size of the result.
If you supply a literal big integer, you don't have to append the "L", since it's obvious that the number is a big integer:
(for (i 1 10) (println (+ 92233720368547758123421231455634 i)))
92233720368547758123421231455635L
92233720368547758123421231455636L
92233720368547758123421231455637L
92233720368547758123421231455638L
92233720368547758123421231455639L
92233720368547758123421231455640L
92233720368547758123421231455641L
92233720368547758123421231455642L
92233720368547758123421231455643L
92233720368547758123421231455644L
;-> 92233720368547758123421231455644L
There are other ways you can control the way newLISP converts between large and big integers. For example, you can convert something to a big integer using the bigint function:
(set 'bignum (bigint 9223372036854775807))
(* bignum bignum)
;-> 85070591730234615847396907784232501249L
(set 'atoms (bigint 1E+80))
;-> 100000000000000000000000000000000000000000000000000000000000000000000000000000000L
(++ atoms)
;-> 100000000000000000000000000000000000000000000000000000000000000000000000000000001L
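Factorials are a classic case where big integers earn their keep, because the results outgrow 64 bits very quickly. The following is a minimal sketch; factorial is a name chosen for this example, and it assumes n is an integer of 1 or more. Converting every factor with bigint guarantees that the first argument of the multiplication is big, so the whole calculation stays in big integer mode.

```newlisp
; factorial is an example name; assumes n is an integer >= 1
(define (factorial n)
  (apply * (map bigint (sequence 1 n))))

(factorial 25)
;-> 15511210043330985984000000L
```

Without the bigint conversion, the product would quietly roll over once it passed the 64-bit limit.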