Write Yourself a Scheme in 48 Hours/Creating IO Primitives

Write Yourself a Scheme in 48 Hours
← Defining Scheme Functions	Creating IO Primitives	Towards a Standard Library →

Our Scheme can't really communicate with the outside world yet, so it would be nice if we could give it some I/O functions. Also, it gets really tedious typing in functions every time we start the interpreter, so it would be nice to load files of code and execute them.

The first thing we'll need is a new constructor for LispVals. PrimitiveFuncs have a specific type signature that doesn't include the IO monad, so they can't perform any IO. We want a dedicated constructor for primitive functions that perform IO:

| IOFunc ([LispVal] -> IOThrowsError LispVal)

While we're at it, let's also define a constructor for the Scheme data type of a port. Most of our IO functions will take one of these to read from or write to:

| Port Handle

A Handle is basically the Haskell notion of a port: it's an opaque data type, returned from openFile and similar IO actions, that you can read and write to.

For completeness, we ought to provide showVal methods for the new data types:

showVal (Port _)   = "<IO port>"
showVal (IOFunc _) = "<IO primitive>"

This'll let the REPL function properly and not crash when you use a function that returns a port.

We also need to update apply, so that it can handle IOFuncs:

apply (IOFunc func) args = func args

We'll need to make some minor changes to our parser to support load. Since Scheme files usually contain several definitions, we need to add a parser that will support several expressions, separated by whitespace. And it also needs to handle errors. We can reuse much of the existing infrastructure by factoring our basic readExpr so that it takes the actual parser as a parameter:

readOrThrow :: Parser a -> String -> ThrowsError a
readOrThrow parser input = case parse parser "lisp" input of
    Left err  -> throwError $ Parser err
    Right val -> return val

readExpr = readOrThrow parseExpr
readExprList = readOrThrow (endBy parseExpr spaces)

Again, think of both readExpr and readExprList as specializations of the newly-renamed readOrThrow. We'll be using readExpr in our REPL to read single expressions; we'll be using readExprList from within load to read programs.

Next, we'll want a new list of IO primitives, structured just like the existing primitive list:

ioPrimitives :: [(String, [LispVal] -> IOThrowsError LispVal)]
ioPrimitives = [("apply", applyProc),
                ("open-input-file", makePort ReadMode),
                ("open-output-file", makePort WriteMode),
                ("close-input-port", closePort),
                ("close-output-port", closePort),
                ("read", readProc),
                ("write", writeProc),
                ("read-contents", readContents),
                ("read-all", readAll)]

The only difference here is in the type signature. Unfortunately, we can't use the existing primitive list because lists cannot contain elements of different types. We also need to change the definition of primitiveBindings to add our new primitives:

primitiveBindings :: IO Env
primitiveBindings = nullEnv >>= (flip bindVars $ map (makeFunc IOFunc) ioPrimitives
                                               ++ map (makeFunc PrimitiveFunc) primitives)
     where makeFunc constructor (var, func) = (var, constructor func)

We've generalized makeFunc to take a constructor argument, and now call it on the list of ioPrimitives in addition to the plain old primitives.

Now we start defining the actual functions. applyProc is a very thin wrapper around apply, responsible for destructuring the argument list into the form apply expects:

applyProc :: [LispVal] -> IOThrowsError LispVal
applyProc [func, List args] = apply func args
applyProc (func : args)     = apply func args

makePort wraps the Haskell function openFile, converting it to the right type and wrapping its return value in the Port constructor. It's intended to be partially-applied to the IOMode, ReadMode for open-input-file and WriteMode for open-output-file:

makePort :: IOMode -> [LispVal] -> IOThrowsError LispVal
makePort mode [String filename] = liftM Port $ liftIO $ openFile filename mode

closePort also wraps the equivalent Haskell procedure, this time hClose:

closePort :: [LispVal] -> IOThrowsError LispVal
closePort [Port port] = liftIO $ hClose port >> (return $ Bool True)
closePort _           = return $ Bool False

readProc (named to avoid a name conflict with the built-in read) wraps the Haskell hGetLine and then sends the result to parseExpr, to be turned into a LispVal suitable for Scheme:

readProc :: [LispVal] -> IOThrowsError LispVal
readProc []          = readProc [Port stdin]
readProc [Port port] = (liftIO $ hGetLine port) >>= liftThrows . readExpr

Notice how hGetLine port is of type IO String yet readExpr is of type String -> ThrowsError LispVal, so they both need to be converted (with liftIO and liftThrows, respectively) to the IOThrowsError monad. Only then can they be piped together with the monadic bind operator.

writeProc converts a LispVal to a string and then writes it out on the specified port:

writeProc :: [LispVal] -> IOThrowsError LispVal
writeProc [obj]            = writeProc [obj, Port stdout]
writeProc [obj, Port port] = liftIO $ hPrint port obj >> (return $ Bool True)

We don't have to explicitly call show on the object we're printing, because hPrint takes a value of type Show a. It's calling show for us automatically. This is why we bothered making LispVal an instance of Show; otherwise, we wouldn't be able to use this automatic conversion and would have to call showVal ourselves. Many other Haskell functions also take instances of Show, so if we'd extended this with other IO primitives, it could save us significant labor.

readContents reads the whole file into a string in memory. It's a thin wrapper around Haskell's readFile, again just lifting the IO action into an IOThrowsError action and wrapping it in a String constructor:

readContents :: [LispVal] -> IOThrowsError LispVal
readContents [String filename] = liftM String $ liftIO $ readFile filename

The helper function load doesn't do what Scheme's load does (we handle that later). Rather, it's responsible only for reading and parsing a file full of statements. It's used in two places: readAll (which returns a list of values) and load (which evaluates those values as Scheme expressions).

load :: String -> IOThrowsError [LispVal]
load filename = (liftIO $ readFile filename) >>= liftThrows . readExprList

readAll then just wraps that return value with the List constructor:

readAll :: [LispVal] -> IOThrowsError LispVal
readAll [String filename] = liftM List $ load filename

Implementing the actual Scheme load function is a little tricky, because load can introduce bindings into the local environment. Apply, however, doesn't take an environment argument, and so there's no way for a primitive function (or any function) to do this. We get around this by implementing load as a special form:

eval env (List [Atom "load", String filename]) = 
     load filename >>= liftM last . mapM (eval env)

Finally, we might as well change our runOne function so that instead of evaluating a single expression from the command line, it takes the name of a file to execute and runs that as a program. Additional command-line arguments will get bound into a list args within the Scheme program:

runOne :: [String] -> IO ()
runOne args = do
    env <- primitiveBindings >>= flip bindVars [("args", List $ map String $ drop 1 args)] 
    (runIOThrows $ liftM show $ eval env (List [Atom "load", String (args !! 0)])) 
        >>= hPutStrLn stderr

That's a little involved, so let's go through it step-by-step. The first line takes the original primitive bindings, passes that into bindVars, and then adds a variable named args that's bound to a List containing String versions of all but the first argument. (The first argument is the filename to execute.) Then, it creates a Scheme form load "arg1", just as if the user had typed it in, and evaluates it. The result is transformed to a string (remember, we have to do this before catching errors, because the error handler converts them to strings and the types must match) and then we run the whole IOThrowsError action. Then we print the result on stderr. (Traditional UNIX conventions hold that stdout should be used only for program output, with any error messages going to stderr. In this case, we'll also be printing the return value of the last statement in the program, which generally has no meaning to anything.)

Then we change main so it uses our new runOne function. Since we no longer need a third clause to handle the wrong number of command-line arguments, we can simplify it to an if statement:

main :: IO ()
main = do args <- getArgs
          if null args then runRepl else runOne $ args

Write Yourself a Scheme in 48 Hours
← Defining Scheme Functions	Creating IO Primitives	Towards a Standard Library →