A Beginner's Python Tutorial/File I/O

Last lesson we learned how to load external code into our program. Without any introduction (like what I usually have), let's delve into file input and output with normal text files, and later the saving and restoring of instances of classes. (Wow, our lingo power has improved greatly!)

Opening a File


To open a text file you use, well, the open() function. Seems sensible. You pass certain parameters to open() to tell it in which way the file should be opened - 'r' for read only, 'w' for writing only (if there is an old file, it will be written over), 'a' for appending (adding things on to the end of the file) and 'r+' for both reading and writing. But less talk, let's open a file for reading (you can do this in Python's IDLE mode). Open a normal text file. We will then print out what we read inside the file:

Code Example 1 - Opening a file
openfile = open('pathtofile', 'r')
openfile.read()

That was interesting. You'll notice a lot of '\n' symbols. These represent newlines (where you pressed enter to start a new line). The text is completely unformatted, but if you were to pass the output of openfile.read() to print (by typing print(openfile.read())) it would be nicely formatted.
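Here is a rough sketch of the difference, in case you want to see it without hunting for a file of your own. The temporary file, its name (sample.txt) and its two lines of text are all made up for this example, and print is written with brackets so that it also runs on Python 3:

```python
import os
import tempfile

# Made-up sample file, created here so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
setup = open(path, 'w')
setup.write('first line\nsecond line\n')
setup.close()

# open the file for reading and read everything in it
openfile = open(path, 'r')
contents = openfile.read()
openfile.close()

# repr() shows the raw string, '\n' symbols and all
print(repr(contents))

# print shows the same text with real line breaks
print(contents)
```

The first print shows what the interactive prompt shows when you type openfile.read() on its own; the second shows the formatted version.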

Seek and You Shall Find


Did you try typing in print(openfile.read())? Did it fail? It likely printed nothing at all, and the reason is that the 'cursor' has changed its place. Cursor? What cursor? Well, a cursor that you really cannot see, but still a cursor. This invisible cursor tells the read function (and many other I/O functions) where to start from. To set where the cursor is, you use the seek() function. It is used in the form seek(offset, whence).

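whence is optional, and determines where to seek from. If whence is 0, the bytes/letters are counted from the beginning. If it is 1, the bytes are counted from the current cursor position. If it is 2, then the bytes are counted from the end of the file. If nothing is put there, 0 is assumed. (In Python 3, a file opened in text mode only accepts a whence of 1 or 2 when the offset is 0; open the file in binary mode, 'rb', if you want to seek relative to the cursor or the end.)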

offset describes how far from whence the cursor moves. For example:

  • openfile.seek(45,0) would move the cursor to 45 bytes/letters after the beginning of the file.
  • openfile.seek(10,1) would move the cursor to 10 bytes/letters after the current cursor position.
  • openfile.seek(-77,2) would move the cursor to 77 bytes/letters before the end of the file (notice the - before the 77).

Try it out now. Use openfile.seek() to go to any spot in the file and then try typing print(openfile.read()). It will print from the spot you seeked to. But realise that openfile.read() moves the cursor to the end of the file - you will have to seek again.
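A minimal sketch of the three whence values follows. The file name and its contents are invented for the example, and the file is opened in binary mode ('rb') on the assumption that you are using Python 3, where text-mode files refuse non-zero offsets for whence 1 and 2. It also passes a number to read() to limit how many bytes are read, which this lesson has not covered:

```python
import os
import tempfile

# Made-up 20-byte file so that the positions are easy to follow
path = os.path.join(tempfile.mkdtemp(), 'seekdemo.txt')
setup = open(path, 'wb')
setup.write(b'0123456789abcdefghij')
setup.close()

# binary mode, so every whence value accepts any offset
openfile = open(path, 'rb')

# whence 0: count from the beginning of the file
openfile.seek(5, 0)
from_start = openfile.read(3)     # reads 3 bytes, cursor ends up at 8

# whence 1: count from the current cursor position (8 + 2 = 10)
openfile.seek(2, 1)
from_current = openfile.read(3)   # reads 3 bytes, cursor ends up at 13

# whence 2: count from the end of the file (20 - 4 = 16)
openfile.seek(-4, 2)
from_end = openfile.read()        # reads everything up to the end

# the cursor is now at the end, so there is nothing left to read
after_read = openfile.read()
openfile.close()

print(from_start)
print(from_current)
print(from_end)
print(after_read)
```

In binary mode the results are bytes rather than ordinary strings, which is why Python 3 prints them with a leading b.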

Other I/O Functions


There are many other functions that help you with dealing with files. They have many uses that empower you to do more, and make the things you can do easier. Let's have a look at tell(), readline(), readlines(), write() and close().

tell() returns where the cursor is in the file. It has no parameters, just type it in (like what the example below will show). This is infinitely useful, for knowing what you are referring to, where it is, and simple control of the cursor. To use it, type fileobjectname.tell() - where fileobjectname is the name of the file object you created when you opened the file (in openfile = open('pathtofile', 'r') the file object name is openfile).
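As a small sketch of tell() in action, the example below checks the cursor before reading, after reading, and after seeking back. The file name and contents are placeholders, and binary mode is used here only so that the numbers are plain byte counts:

```python
import os
import tempfile

# Made-up 12-byte file: 'hello world' plus a newline
path = os.path.join(tempfile.mkdtemp(), 'telldemo.txt')
setup = open(path, 'wb')
setup.write(b'hello world\n')
setup.close()

openfile = open(path, 'rb')

# a freshly opened file starts with the cursor at the beginning
start = openfile.tell()

# reading everything pushes the cursor to the end of the file
openfile.read()
end = openfile.tell()

# seeking back to the beginning resets it
openfile.seek(0)
rewound = openfile.tell()

openfile.close()

print(start)
print(end)
print(rewound)
```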

readline() reads from where the cursor is till the end of the line. Remember that the end of the line isn't the edge of your screen - the line ends when you press enter to create a new line. This is useful for things like reading a log of events, or going through something progressively to process it. There are no parameters you have to pass to readline(), though you can optionally tell it the maximum number of bytes/letters to read by putting a number in the brackets. Use it with fileobjectname.readline().
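The following sketch shows readline() with and without the optional number. The three-line file is invented for the example:

```python
import os
import tempfile

# Made-up file with three short lines
path = os.path.join(tempfile.mkdtemp(), 'lines.txt')
setup = open(path, 'w')
setup.write('alpha\nbeta\ngamma\n')
setup.close()

openfile = open(path, 'r')

# a whole line, including the '\n' that ends it
first = openfile.readline()

# at most 2 letters of the next line
partial = openfile.readline(2)

# the rest of that same line, from the cursor to the newline
rest = openfile.readline()

# the last line of the file
third = openfile.readline()

# at the end of the file, readline() returns an empty string
at_end = openfile.readline()

openfile.close()

print(repr(first))
print(repr(partial))
print(repr(rest))
print(repr(third))
print(repr(at_end))
```

Notice that an empty string means the end of the file, whereas an empty line in the middle of the file comes back as '\n'.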

readlines() is much like readline(), however readlines() reads all the lines from the cursor onwards, and returns a list, with each list element holding a line of text. Use it with fileobjectname.readlines(). For example, if you had the text file:

Code Example 2 - example text file
Line 1

Line 3
Line 4

Line 6

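The returned list from readlines() would be:

Table 1 - resulting list from readlines
Index  Value
0      'Line 1\n'
1      '\n'
2      'Line 3\n'
3      'Line 4\n'
4      '\n'
5      'Line 6'

Each element keeps the '\n' that ended its line, so an empty line comes back as '\n'. The last element has one only if the file itself ends with a newline.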
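To check this for yourself, here is a sketch that writes the example text file and reads it back with readlines(). It assumes the file does not end with a newline, and the file name is made up. Note that readlines() keeps the '\n' on the end of each element, so the sketch also strips those off with rstrip('\n') to get the bare text of each line:

```python
import os
import tempfile

# The example text file from above, with no newline after 'Line 6'
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
setup = open(path, 'w')
setup.write('Line 1\n\nLine 3\nLine 4\n\nLine 6')
setup.close()

openfile = open(path, 'r')
lines = openfile.readlines()
openfile.close()

# remove the trailing newline from each element
stripped = []
for line in lines:
    stripped.append(line.rstrip('\n'))

print(lines)
print(stripped)
```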

The write() function writes to the file. How did you guess? It writes from where the cursor is, and overwrites text in front of it - like in MS Word, where you press 'insert' and it writes over the top of old text. To utilise this most purposeful function, put a string between the brackets to write e.g. fileobjectname.write('this is a string').
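A quick sketch of that overwriting behaviour: the file is opened with 'r+' so that it can be both read and written, the cursor is moved with seek(), and write() replaces the letters that follow. The file name and text are invented for the example:

```python
import os
import tempfile

# Made-up file holding the text 'Hello world'
path = os.path.join(tempfile.mkdtemp(), 'writedemo.txt')
setup = open(path, 'w')
setup.write('Hello world')
setup.close()

# 'r+' allows both reading and writing without wiping the file
openfile = open(path, 'r+')

# move the cursor to just before 'world' and write over it
openfile.seek(6)
openfile.write('there')

# go back to the beginning to read the result
openfile.seek(0)
result = openfile.read()
openfile.close()

print(result)
```

The five letters of 'there' land on top of the five letters of 'world'; nothing is pushed along to make room.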

close(), you may figure, closes the file so that you can no longer read or write to it until you reopen it again. Simple enough. To use, you would write fileobjectname.close(). Simple!
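One way to see whether a file object is still open is its closed attribute, which this lesson does not otherwise cover. The sketch below is only an illustration, with a made-up file name:

```python
import os
import tempfile

# Made-up file to open and close
path = os.path.join(tempfile.mkdtemp(), 'closedemo.txt')
setup = open(path, 'w')
setup.write('some text\n')
setup.close()

openfile = open(path, 'r')
before = openfile.closed      # False while the file is open

openfile.close()
after = openfile.closed       # True once it has been closed

# to use the file again, open it again
openfile = open(path, 'r')
text = openfile.read()
openfile.close()

print(before)
print(after)
print(text)
```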

In Python IDLE mode, open up a test file (or create a new one...) and play around with these functions. You can do some simple (and very inconvenient) text editing.
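If you are not sure where to start, here is one possible session to copy. It revisits the 'w', 'a' and 'r' modes from the start of the lesson; the file name notes.txt and its contents are only placeholders:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# 'w' creates the file (or wipes an old one) and writes to it
testfile = open(path, 'w')
testfile.write('one\n')
testfile.close()

# 'a' adds on to the end of what is already there
testfile = open(path, 'a')
testfile.write('two\n')
testfile.close()

testfile = open(path, 'r')
after_append = testfile.read()
testfile.close()

# opening with 'w' again writes over the old file
testfile = open(path, 'w')
testfile.write('three\n')
testfile.close()

testfile = open(path, 'r')
after_rewrite = testfile.read()
testfile.close()

print(repr(after_append))
print(repr(after_rewrite))
```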

Pickles, in Python, are objects saved to a file. An object in this case could be a variable, an instance of a class, or a list, dictionary, or tuple. Other things can also be pickled, but with limits. The object can then be restored, or unpickled, later on. In other words, you are 'saving' your objects.

So how do we pickle? With the dump() function, which is inside the pickle module - so at the beginning of your program you will have to write import pickle. Simple enough? Then open an empty file, and use pickle.dump() to drop the object into that file. Let's try that:

Code Example 3 - pickletest.py
### pickletest.py

# import the pickle module
import pickle

# let's create something to be pickled
# How about a list?
picklelist = ['one',2,'three','four',5,'can you count?']

# now create a file
# replace filename with the file you want to create
file = open('filename', 'wb')

# now let's pickle picklelist
pickle.dump(picklelist, file)

# close the file, and your pickling is complete
file.close()

First we open 'filename' for writing binary data ('wb') - open('filename', 'wb')

Then we dump our list into the binary file.

The code to do this is laid out like pickle.dump(object_to_pickle, file_object) where:

  • object_to_pickle is the object you want to pickle (i.e. save it to file)
  • file_object is the file object you want to write to (in this case, the file object is 'file')

After you close the file, open it in Notepad and look at what you see. Along with some other gobbledygook, you will see bits of the list we created.

Now to re-open, or unpickle, your file. To do this, we use pickle.load():

Code Example 4 - unpickletest.py
### unpickletest.py
### unpickle file

# import the pickle module
import pickle

# now open a file for reading
# replace filename with the path to the file you created in pickletest.py
unpicklefile = open('filename', 'rb')

# now load the list that we pickled into a new object
unpickledlist = pickle.load(unpicklefile)

# close the file, just for safety
unpicklefile.close()

# Try out using the list
for item in unpickledlist:
  print(item)

Nifty, eh?

Of course, the limitation above is that we can only put one object into a file. We could get around this by putting lots of picklable objects in a list or dictionary, and then pickling that list or dictionary. This is the quickest and easiest way, but you can do some pretty advanced stuff if you have advanced knowledge of pickle.
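As a sketch of the dictionary approach, the example below bundles three unrelated objects under made-up keys, pickles the bundle, and loads it back. The object names, their values and the file name are all invented for illustration:

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'bundle.pkl')

# three unrelated objects we would like to save together
scores = [10, 20, 30]
name = 'Ada'
settings = ('easy', True)

# put them all in one dictionary, then pickle that dictionary
bundle = {'scores': scores, 'name': name, 'settings': settings}
file = open(path, 'wb')
pickle.dump(bundle, file)
file.close()

# one load brings the whole dictionary back
file = open(path, 'rb')
restored = pickle.load(file)
file.close()

print(restored['name'])
print(restored['scores'])
print(restored['settings'])
```

The keys make it easy to pull out just the object you want after unpickling.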

Or we can just call pickle.dump(list, file) more than once on the same file:[1]

# pickle multiple lists
pickle.dump(list_1, file)
pickle.dump(list_2, file) 

and then call pickle.load(file) the same number of times; the objects come back in the order they were dumped:

# unpickle multiple lists
list_read_1 = pickle.load(file)
list_read_2 = pickle.load(file) 
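Put together, the two snippets above might look like the sketch below. The names list_1, list_2, list_read_1 and list_read_2 are taken from the snippets; the list contents and the file name are made up so that the example runs on its own:

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'twolists.pkl')

list_1 = ['one', 2, 'three']
list_2 = [4, 'five', 6]

# pickle multiple lists into the same file, one after the other
file = open(path, 'wb')
pickle.dump(list_1, file)
pickle.dump(list_2, file)
file.close()

# unpickle them again; each load picks up where the last one stopped
file = open(path, 'rb')
list_read_1 = pickle.load(file)
list_read_2 = pickle.load(file)
file.close()

print(list_read_1)
print(list_read_2)
```

You need to know how many objects went in: calling pickle.load() once more after the last object raises an EOFError.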

We won't cover complex examples here.