Common Lisp/Advanced topics/Files and Directories

Testing whether a File Exists

Use the function PROBE-FILE, which returns a generalized boolean: either NIL if the file doesn't exist, or its truename (which might be different from the argument you supplied).

edi@bird:/tmp> ln -s /etc/passwd foo
edi@bird:/tmp> cmucl
; Loading #p"/home/edi/.cmucl-init".
CMU Common Lisp 18d-pre, level-1 built 2002-01-15 on maftia1, running on bird
Send questions to cmucl-help@cons.org. and bug reports to cmucl-imp@cons.org.
Loaded subsystems:
    Python native code compiler, target Intel x86
    CLOS based on PCL version:  September 16 92 PCL (f)
    Gray Streams Protocol Support
    CLX X Library MIT R5.02
* (probe-file "/etc/passwd")

#p"/etc/passwd"
* (probe-file "foo")

#p"/etc/passwd"
* (probe-file "bar")

NIL
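
In a program, the value returned by PROBE-FILE is typically used directly as a condition. The following is only a minimal sketch: the helper FILE-STATUS and the file name "no-such-file.txt" are made up for illustration.

```lisp
;; FILE-STATUS is a hypothetical helper, not part of the standard.
(defun file-status (name)
  "Return a short string saying whether the file NAME exists."
  (let ((truename (probe-file name)))   ; NIL, or the truename of the file
    (if truename
        (format nil "~A exists as ~A" name (namestring truename))
        (format nil "~A does not exist" name))))

(print (file-status "no-such-file.txt"))
```

Because the truename is returned rather than just T, the same call both tests for existence and resolves symbolic links.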

Opening a File

Common Lisp has OPEN and CLOSE functions which resemble the functions of the same name in other programming languages you're probably familiar with. However, it is almost always advisable to use the macro WITH-OPEN-FILE instead. Not only will this macro open the file for you and close it when you're done, it will also close it if your code leaves the body abnormally (such as by a use of THROW). A typical use of WITH-OPEN-FILE looks like this:

(with-open-file (str <file-spec>
                     :direction <direction>
                     :if-exists <if-exists>
                     :if-does-not-exist <if-does-not-exist>)
  <your code here>)
  • STR is a variable which will be bound to the stream that is created by opening the file.
  • <file-spec> is a pathname designator, typically a string or a pathname.
  • <direction> is usually :INPUT (meaning you want to read from the file), :OUTPUT (meaning you want to write to the file) or :IO (which is for reading and writing at the same time) - the default is :INPUT.
  • <if-exists> specifies what to do if you want to open a file for writing and a file with that name already exists - this option is ignored if you just want to read from the file. The default is :ERROR, which means that an error is signalled (if the version component of <file-spec> is :NEWEST, the default is :NEW-VERSION instead). Other useful options are :SUPERSEDE (meaning that the new file will replace the old one), NIL (the stream variable will be bound to NIL), and :RENAME (i.e. the old file is renamed).
  • <if-does-not-exist> specifies what to do if the file you want to open does not exist. It is one of :ERROR for signalling an error, :CREATE for creating an empty file, or NIL for binding the stream variable to NIL. The default depends on the other options: it is :ERROR if <direction> is :INPUT or <if-exists> is :OVERWRITE or :APPEND, and :CREATE if <direction> is :OUTPUT or :IO and <if-exists> is neither :OVERWRITE nor :APPEND. See the CLHS for details.

Note that there are a lot more options to WITH-OPEN-FILE. See the CLHS entry for OPEN for all the details. You'll find some examples of how to use WITH-OPEN-FILE below. Also note that you usually don't need to provide any keyword arguments if you just want to open an existing file for reading.
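
As a rough illustration of the options described above, the following sketch writes a file, reads it back, and then tries to open a file that does not exist. The file names "demo-open.txt" and "no-such-file.txt" and the variable *DEMO-FILE* are assumptions made up for this example.

```lisp
;; *DEMO-FILE* names a hypothetical scratch file in the current directory.
(defparameter *demo-file* "demo-open.txt")

;; Open for writing; replace the file if it already exists.
(with-open-file (out *demo-file*
                     :direction :output
                     :if-exists :supersede
                     :if-does-not-exist :create)
  (write-line "first line" out))

;; Open an existing file for reading; no keyword arguments are needed.
(with-open-file (in *demo-file*)
  (print (read-line in)))

;; With :IF-DOES-NOT-EXIST NIL the stream variable is bound to NIL
;; instead of an error being signalled, so the body has to check for it.
(with-open-file (in "no-such-file.txt" :if-does-not-exist nil)
  (print (if in (read-line in) :missing)))
```

The last form prints :MISSING as long as no file called "no-such-file.txt" exists in the current directory.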

Using Strings instead of Files

Reading a File one Line at a Time

READ-LINE will read one line from a stream (which defaults to standard input) the end of which is determined by either a newline character or the end of the file. It will return this line as a string without the trailing newline character. (Note that READ-LINE has a second return value which is true if there was no trailing newline, i.e. if the line was terminated by the end of the file.) READ-LINE will by default signal an error if the end of the file is reached. You can inhibit this by supplying NIL as the second argument. If you do this, READ-LINE will return NIL if it reaches the end of the file.

> (with-open-file (stream "/etc/passwd")
    (do ((line (read-line stream nil)
               (read-line stream nil)))
        ((null line))
      (print line)))

You can also supply a third argument which will be used instead of NIL to signal the end of the file:

> (with-open-file (stream "/etc/passwd")
    (loop for line = (read-line stream nil 'foo)
          until (eq line 'foo)
          do (print line)))
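
The same pattern can be used to collect lines instead of printing them. This is just a sketch; READ-LINES is a hypothetical helper, and a string stream stands in for a file so that the example is self-contained. The second form looks at the second return value of READ-LINE mentioned above.

```lisp
;; READ-LINES is a hypothetical helper, not part of the standard.
(defun read-lines (stream)
  "Return all remaining lines of STREAM as a list of strings."
  (loop for line = (read-line stream nil)
        while line
        collect line))

(with-input-from-string (s (format nil "one~%two~%three"))
  (print (read-lines s)))

;; The second return value is true when the line was ended by
;; end-of-file rather than by a newline character.
(with-input-from-string (s (format nil "one~%two"))
  (read-line s)                                   ; "one", ended by a newline
  (multiple-value-bind (line missing-newline-p) (read-line s)
    (print (list line (if missing-newline-p :no-newline :newline)))))
```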

Reading a File one Character at a Time

READ-CHAR is similar to READ-LINE, but it only reads one character as opposed to one line. Of course, newline characters aren't treated differently from other characters by this function.

* (with-open-file (stream "/etc/passwd")
    (do ((char (read-char stream nil)
               (read-char stream nil)))
        ((null char))
      (print char)))

Looking one Character ahead

You can 'look at' the next character of a stream without actually removing it from there - this is what the function PEEK-CHAR is for. It can be used for three different purposes depending on its first (optional) argument (the second one being the stream it reads from): If the first argument is NIL, PEEK-CHAR will just return the next character that's waiting on the stream:

* (with-input-from-string (stream "I'm not amused")
    (print (read-char stream))
    (print (peek-char nil stream))
    (print (read-char stream))
    (values))

#\I 
#\' 
#\'

If the first argument is T, PEEK-CHAR will skip whitespace characters, i.e. it will return the next non-whitespace character that's waiting on the stream. The whitespace characters will vanish from the stream as if they had been read by READ-CHAR:

* (with-input-from-string (stream "I'm 
                                   not amused")
    (print (read-char stream))
    (print (read-char stream))
    (print (read-char stream))
    (print (peek-char t stream))
    (print (read-char stream))
    (print (read-char stream))
    (values))
#\I 
#\' 
#\m 
#\n 
#\n 
#\o

If the first argument to PEEK-CHAR is a character, the function will skip all characters until that particular character is found:

* (with-input-from-string (stream "I'm not amused")
    (print (read-char stream))
    (print (peek-char #\a stream))
    (print (read-char stream))
    (print (read-char stream))
    (values))
#\I 
#\a 
#\a 
#\m 

Note that PEEK-CHAR has further optional arguments to control its behaviour on end-of-file similar to those for READ-LINE and READ-CHAR (and it will signal an error by default):

* (with-input-from-string (stream "I'm not amused")
    (print (read-char stream))
    (print (peek-char #\d stream))
    (print (read-char stream))
    (print (peek-char nil stream nil 'the-end))
    (values))
#\I 
#\d 
#\d 
THE-END

You can also put one character back onto the stream with the function UNREAD-CHAR. Use it when, after reading a character, you decide that you should have used PEEK-CHAR instead of READ-CHAR:

* (with-input-from-string (stream "I'm not amused")
    (let ((c (read-char stream)))
      (print c)
      (unread-char c stream)
      (print (read-char stream))
      (values)))
#\I 
#\I

Note that the front of a stream doesn't behave like a stack: You can only put back exactly one character onto the stream. Also, you must put back the same character that has been read previously, and you can't unread a character if none has been read before.
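
A common use of UNREAD-CHAR is a small tokenizer that reads one character too many and then puts it back. The following is a minimal sketch under that assumption; READ-DIGITS is a hypothetical helper written for this example.

```lisp
;; READ-DIGITS is a hypothetical helper, not part of the standard.
(defun read-digits (stream)
  "Read leading decimal digits from STREAM and return them as an integer,
or NIL if the stream does not start with a digit."
  (loop with value = nil
        for c = (read-char stream nil)
        while c
        do (let ((digit (digit-char-p c)))
             (cond (digit
                    (setf value (+ (* 10 (or value 0)) digit)))
                   (t
                    (unread-char c stream)   ; put the non-digit back
                    (return value))))
        finally (return value)))

(with-input-from-string (s "42abc")
  (print (read-digits s))   ; the integer 42
  (print (read-char s)))    ; #\a is still available on the stream
```

Only the most recently read character is ever put back, which stays within the restrictions described above.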

Random Access to a File

Use the function FILE-POSITION for random access to a file. If this function is used with one argument (a stream), it will return the current position within the stream. If it's used with two arguments (see below), it will actually change the file position in the stream.

* (with-input-from-string (stream "I'm not amused")
    (print (file-position stream))
    (print (read-char stream))
    (print (file-position stream))
    (file-position stream 4)
    (print (file-position stream))
    (print (read-char stream))
    (print (file-position stream))
    (values))
0 
#\I 
1 
4
#\n 
5
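
FILE-POSITION works the same way on a stream opened on a real file. The sketch below uses a binary file so that positions are plain byte offsets; the file name "demo-position.bin" and the variable *POSITION-DEMO-FILE* are assumptions made up for this example.

```lisp
;; *POSITION-DEMO-FILE* names a hypothetical scratch file.
(defparameter *position-demo-file* "demo-position.bin")

;; Write the bytes 0, 1, ..., 9 to the file.
(with-open-file (out *position-demo-file*
                     :direction :output
                     :element-type '(unsigned-byte 8)
                     :if-exists :supersede)
  (loop for i from 0 below 10
        do (write-byte i out)))

(with-open-file (in *position-demo-file* :element-type '(unsigned-byte 8))
  (file-position in 7)             ; jump to offset 7
  (print (read-byte in))           ; the byte stored at offset 7
  (print (file-position in))       ; reading advanced the position by one
  (file-position in :start)        ; :START and :END are also accepted
  (print (read-byte in)))          ; the first byte of the file
```

For character streams with multi-byte encodings, what a position means is implementation-dependent, so it is safest to pass FILE-POSITION only values it returned earlier for the same stream.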

Redirecting the Standard Output of your Program

You do it like this:

(let ((*standard-output* <some form generating a stream>))
  ...)

Because *STANDARD-OUTPUT* is a dynamic variable, all references to it during execution of the body of the LET form refer to the stream that you bound it to. After exiting the LET form, the old value of *STANDARD-OUTPUT* is restored, no matter if the exit was by normal execution, a RETURN-FROM leaving the whole function, an exception, or what-have-you. (This is, incidentally, why global variables lose much of their brokenness in Common Lisp compared to other languages: since they can be bound for the execution of a specific form without the risk of losing their former value after the form has finished, their use is quite safe; they act much like additional parameters that are passed to every function.)

If the output of the program should go to a file, you can do the following:

(with-open-file (*standard-output* "somefile.dat" :direction :output
                                   :if-exists :supersede)
  ...)

WITH-OPEN-FILE opens the file - creating it if necessary - binds *STANDARD-OUTPUT*, executes its body, closes the file, and restores *STANDARD-OUTPUT* to its former value. It doesn't get more comfortable than this!
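
The same rebinding trick can be used to capture output in a string, since WITH-OUTPUT-TO-STRING also accepts *STANDARD-OUTPUT* as its stream variable. This is a small sketch; GREET and *CAPTURED* are hypothetical names introduced for the example.

```lisp
;; GREET is a hypothetical function that knows nothing about redirection.
(defun greet ()
  ;; Writes to whatever *STANDARD-OUTPUT* is currently bound to.
  (format t "Hello, world!"))

;; Inside the body, *STANDARD-OUTPUT* is bound to a string output stream;
;; the form returns everything that was written to it.
(defparameter *captured*
  (with-output-to-string (*standard-output*)
    (greet)))

(print *captured*)
```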

Faithful Output with Character Streams

By faithful output I mean that characters with codes between 0 and 255 will be written out as is. It means that I can PRINC each of the characters (CODE-CHAR 0) through (CODE-CHAR 255) to a stream and expect 8-bit bytes to be written out, which is not obvious in the times of Unicode and 16- or 32-bit character representations. It does not require that the characters ä, ß, or þ have their CHAR-CODE in the range 0..255; the implementation is free to use any code. But it does require, among other things, that no #\Newline to CRLF translation takes place.

Common Lisp has a long tradition of distinguishing character from byte (binary) I/O, e.g. READ-BYTE and READ-CHAR are in the standard. Some implementations let both functions be called interchangeably. Others allow either one or the other. (The simple stream proposal defines the notion of a bivalent stream where both are possible.)

Varying element types are useful, as some protocols rely on the ability to send 8-bit output on a channel. E.g. with HTTP, the header is normally ASCII and ought to use CRLF as line terminators, whereas the body can have the MIME type application/octet-stream, where CRLF translation would destroy the data. (This is how the Netscape browser on MS-Windows destroys data sent by incorrectly configured web servers which declare unknown files as having MIME type text/plain, the default in most Apache configurations.)

What follows is a list of implementation-dependent choices and behaviours, and some code to experiment with.

  • CLISP: On CLISP, faithful output is possible using
      :external-format
      (ext:make-encoding :charset 'charset:iso-8859-1 :line-terminator :unix)

You can also use (SETF (STREAM-ELEMENT-TYPE F) '(UNSIGNED-BYTE 8)), where the ability to SETF is a CLISP-specific extension. Using :EXTERNAL-FORMAT :UNIX will cause portability problems, since the default character set on MS-Windows is CHARSET:CP1252. CHARSET:CP1252 doesn't allow output of e.g. (CODE-CHAR #x81):

      ;*** - Character #\u0080 cannot be represented in the character set CHARSET:CP1252

Characters with code > 127 cannot be represented in ASCII:

      ;*** - Character #\u0080 cannot be represented in the character set CHARSET:ASCII
  • CMUCL: :EXTERNAL-FORMAT :DEFAULT (untested). No Unicode support, so probably no problems.
  • AllegroCL: #+(AND ALLEGRO UNIX) :DEFAULT (untested). This seems to be enough on UNIX, but would not work on the MS-Windows port of AllegroCL.
  • LispWorks: :EXTERNAL-FORMAT '(:LATIN-1 :EOL-STYLE :LF) (confirmed by Marc Battyani)

Here's some sample code to play with:

(defvar *unicode-test-file* "faithtest-out.txt")
 
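(defun generate-256 (&key (filename *unicode-test-file*)
                          #+CLISP (charset 'charset:iso-8859-1)
                          external-format)
  ;; Fall back to a CLISP encoding object on CLISP and to :DEFAULT elsewhere,
  ;; so that OPEN never receives NIL as its :EXTERNAL-FORMAT.
  (let ((e (or external-format
               #+CLISP (ext:make-encoding :charset charset :line-terminator :unix)
               #-CLISP :default)))
    (describe e)
    (with-open-file (f filename :direction :output
                                :if-exists :supersede ; allow repeated runs
                                :external-format e)
      ;; Write one string containing the characters with codes 0..255.
      (write-sequence
       (loop with s = (make-string 256)
             for i from 0 to 255
             do (setf (char s i) (code-char i))
             finally (return s))
       f)
      (file-position f))))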
 
;(generate-256 :external-format :default)
;#+CLISP (generate-256 :external-format :unix)
;#+CLISP (generate-256 :external-format 'charset:ascii)
;(generate-256)
 
(defun check-256 (&optional (filename *unicode-test-file*))
  (with-open-file (f filename :direction :input
		     :element-type '(unsigned-byte 8))
    (loop for i from 0
	  for c = (read-byte f nil nil)
	  while c
	  unless (= c i)
	  do (format t "~&Position ~D found ~D(#x~X)." i c c)
	  when (and (= i 33) (= c 32))
	  do (let ((c (read-byte f)))
	       (format t "~&Resync back 1 byte ~D(#x~X) - cause CRLF?." c c) ))
    (file-length f)))
 
#| CLISP
(check-256 *unicode-test-file*)
(progn (generate-256 :external-format :unix) (check-256))
; uses UTF-8 -> 385 bytes
 
(progn (generate-256 :charset 'charset:iso-8859-1) (check-256))
 
(progn (generate-256 :external-format :default) (check-256))
; uses UTF-8 + CRLF(on MS-Windows) -> 387 bytes
 
(progn (generate-256 :external-format
  (ext:make-encoding :charset 'charset:iso-8859-1 :line-terminator :mac)) (check-256))
(progn (generate-256 :external-format
  (ext:make-encoding :charset 'charset:iso-8859-1 :line-terminator :dos)) (check-256))
|#

Fast Bulk I/O

If you need to copy a lot of data and the source and destination are both streams (of the same element type), it's very fast to use READ-SEQUENCE and WRITE-SEQUENCE:

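(let ((buf (make-array 4096 :element-type (stream-element-type input-stream))))
  ;; READ-SEQUENCE returns the index of the first element of BUF
  ;; that was not updated, i.e. the number of elements read.
  (loop for pos = (read-sequence buf input-stream)
        while (plusp pos)
        do (write-sequence buf output-stream :end pos)))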
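
Wrapped into functions, this loop gives a simple file copy. The following is only a sketch under a few assumptions: COPY-STREAM and COPY-FILE are hypothetical names, the files are opened as binary streams of 8-bit bytes, and the file names "demo-copy-src.bin" and "demo-copy-dst.bin" are made up for the example.

```lisp
;; COPY-STREAM and COPY-FILE are hypothetical helpers, not part of the standard.
(defun copy-stream (input-stream output-stream &optional (buffer-size 4096))
  "Copy everything from INPUT-STREAM to OUTPUT-STREAM in chunks."
  (let ((buf (make-array buffer-size
                         :element-type (stream-element-type input-stream))))
    (loop for pos = (read-sequence buf input-stream)
          while (plusp pos)
          do (write-sequence buf output-stream :end pos))))

(defun copy-file (from to)
  "Copy the file FROM to the file TO byte by byte, replacing TO if it exists."
  (with-open-file (in from :element-type '(unsigned-byte 8))
    (with-open-file (out to :direction :output
                            :element-type '(unsigned-byte 8)
                            :if-exists :supersede)
      (copy-stream in out))))

;; Create a 10000-byte source file, then copy it.
(with-open-file (out "demo-copy-src.bin"
                     :direction :output
                     :element-type '(unsigned-byte 8)
                     :if-exists :supersede)
  (dotimes (i 10000)
    (write-byte (mod i 256) out)))

(copy-file "demo-copy-src.bin" "demo-copy-dst.bin")
```

The source file is deliberately not a multiple of the buffer size, so the last chunk exercises the :END argument of WRITE-SEQUENCE.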


Creating a file

text file

This function creates a text file and writes the string content to it. [1]

; definition of function
(defun writeToFile (name content) ; name and content are both strings
  (with-open-file
      (stream name                  ; STREAM is bound to a stream opened on the file NAME
              :direction :output
              :if-exists :supersede ; replace an existing file; :OVERWRITE would keep old trailing data
              :if-does-not-exist :create) ; create the file if it does not already exist
    (write-string content stream))  ; write CONTENT verbatim; FORMAT would interpret any ~ in it
  name) ; return the name of the file for further processing

; execution of function
(writeToFile "a.txt" "hello word")

XPM file

; definition of function
(defun write_To_XPM_File (file)
  ; local variables, which exist only within the body of the LET form
  (let ((iWidth 4)     ; the pixmap width
        (iHeight 4)    ; the pixmap height
        (NoColors 2)   ; the number of colors
        (cpp 1))       ; the number of characters per pixel
    (with-open-file
        (st file
            :direction :output
            :if-exists :supersede ; replace an existing file completely
            :if-does-not-exist :create)
      ; write the XPM file header to the file
      (format st "/* XPM */~%static char *a[] = {~%")
      (format st "\"~a ~a ~a ~a\",~%" iWidth iHeight NoColors cpp)
      ; write the color table to the file
      (format st "\"* c #000000\",~%") ; black
      (format st "\". c #ffffff\",~%") ; white
      ; write the strings containing each line of pixel colors
      (format st "\".**.\",~%")
      (format st "\".**.\",~%")
      (format st "\".**.\",~%")
      (format st "\".**.\",~%")
      ; write the end of the file
      (format st "};~%"))   ; ~% is a newline
    ; report success; T denotes standard output
    (format t "file ~a was created~%" file)))

; execution of function
(write_To_XPM_File "c.xpm")

References

  1. Richard Socher : Write To File In Lisp
Last modified on 9 March 2013, at 17:29