How to Think Like a Computer Scientist: Learning with Python 2nd Edition/Dictionaries

Dictionaries

All of the compound data types we have studied in detail so far (strings, lists, and tuples) are sequence types, which use integers as indices to access the values they contain.

Dictionaries are a different kind of compound type. They are Python's built-in mapping type. They map keys, which can be any immutable type, to values, which can be any type, just like the values of a list or tuple.

As an example, we will create a dictionary to translate English words into Spanish. For this dictionary, the keys are strings.

One way to create a dictionary is to start with the empty dictionary and add key-value pairs. The empty dictionary is denoted {}:

The first assignment creates a dictionary named eng2sp; the other assignments add new key-value pairs to the dictionary. We can print the current value of the dictionary in the usual way:
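Both steps, building the dictionary by assignment and printing it, can be sketched as follows. The particular word pairs are illustrative assumptions; only the pair 'two': 'dos' is confirmed by the surrounding text.

```python
# Start with the empty dictionary, then add key-value pairs one at a time.
eng2sp = {}
eng2sp['one'] = 'uno'   # assumed sample entry
eng2sp['two'] = 'dos'

# Printing shows the pairs separated by commas, each written as key: value.
print(eng2sp)
```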

The key-value pairs of the dictionary are separated by commas. Each pair contains a key and a value separated by a colon.

The order of the pairs may not be what you expected. Python uses complex algorithms to determine where the key-value pairs are stored in a dictionary. For our purposes we can think of this ordering as unpredictable.

Another way to create a dictionary is to provide a list of key-value pairs using the same syntax as the previous output:

It doesn't matter in what order we write the pairs. The values in a dictionary are accessed with keys, not with indices, so there is no need to care about ordering.

Here is how we use a key to look up the corresponding value:

The key 'two' yields the value 'dos'.
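A minimal sketch of the literal form and a lookup, assuming three number words as sample entries (only 'two': 'dos' comes from the text):

```python
# Create the dictionary in one step from a list of key: value pairs.
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

# Look up a value by its key rather than by position.
print(eng2sp['two'])   # prints dos
```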

Dictionary operations

The del statement removes a key-value pair from a dictionary. For example, the following dictionary contains the names of various fruits and the number of each fruit in stock:

If someone buys all of the pears, we can remove the entry from the dictionary:

Or if we're expecting more pears soon, we might just change the value associated with pears:

The len function also works on dictionaries; it returns the number of key-value pairs:
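The three operations described in this section can be sketched together. The fruit names other than pears, and all of the stock counts, are made-up values for illustration.

```python
# Hypothetical stock levels; only 'pears' is named in the text.
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

# Someone buys all of the pears: remove the entry entirely.
del inventory['pears']
print(len(inventory))   # prints 3

# Alternatively, keep the entry and record that none are in stock.
inventory['pears'] = 0
print(len(inventory))   # prints 4
```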

Dictionary methods

Dictionaries have a number of useful built-in methods.

The keys method takes a dictionary and returns a list of its keys.

As we saw earlier with strings and lists, dictionary methods use dot notation, which specifies the name of the method to the right of the dot and the name of the object on which to apply the method immediately to the left of the dot. The parentheses indicate that this method takes no parameters.

A method call is called an invocation; in this case, we would say that we are invoking the keys method on the object eng2sp. As we will see in a few chapters when we talk about object oriented programming, the object on which a method is invoked is actually the first argument to the method.

The values method is similar; it returns a list of the values in the dictionary:

The items method returns both, in the form of a list of tuples, one for each key-value pair:

The has_key method takes a key as an argument and returns True if the key appears in the dictionary and False otherwise:

This method can be very useful, since looking up a nonexistent key in a dictionary causes a runtime error:
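The sketch below exercises these methods on an assumed eng2sp dictionary. Two choices here are adaptations rather than the text's own listing: results are wrapped in sorted() so the output does not depend on storage order, and the membership test uses the in operator, because has_key exists only in Python 2 while in works in both Python 2 and Python 3.

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}   # assumed sample data

print(sorted(eng2sp.keys()))     # ['one', 'three', 'two']
print(sorted(eng2sp.values()))   # ['dos', 'tres', 'uno']
print(sorted(eng2sp.items()))    # [('one', 'uno'), ('three', 'tres'), ('two', 'dos')]

# Membership test on keys (the job has_key does in Python 2).
print('one' in eng2sp)    # True
print('deux' in eng2sp)   # False

# Looking up a key that is not present raises KeyError at run time.
try:
    eng2sp['dog']
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)      # True
```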

Aliasing and copying

Because dictionaries are mutable, you need to be aware of aliasing. Whenever two variables refer to the same object, changes to one affect the other.

If you want to modify a dictionary and keep a copy of the original, use the copy method. For example, opposites is a dictionary that contains pairs of opposites:

alias and opposites refer to the same object; copy refers to a fresh copy of the same dictionary. If we modify alias, opposites is also changed:

If we modify copy, opposites is unchanged:
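A short sketch of the difference between an alias and a copy. The variable names follow the text; the particular word pairs are assumptions.

```python
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}   # assumed pairs

alias = opposites          # a second name for the same dictionary object
copy = opposites.copy()    # a new dictionary with the same pairs

alias['right'] = 'left'
print(opposites['right'])  # prints left: the change shows through the alias

copy['right'] = 'privilege'
print(opposites['right'])  # still prints left: the copy is independent
```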

Sparse matrices

We previously used a list of lists to represent a matrix. That is a good choice for a matrix with mostly nonzero values, but consider a sparse matrix like this one:

[Figure: sparse matrix]

The list representation contains a lot of zeros:

An alternative is to use a dictionary. For the keys, we can use tuples that contain the row and column numbers. Here is the dictionary representation of the same matrix:

We only need three key-value pairs, one for each nonzero element of the matrix. Each key is a tuple, and each value is an integer.

To access an element of the matrix, we could use the [] operator:

Notice that the syntax for the dictionary representation is not the same as the syntax for the nested list representation. Instead of two integer indices, we use one index, which is a tuple of integers.

There is one problem. If we specify an element that is zero, we get an error, because there is no entry in the dictionary with that key:

The get method solves this problem:

The first argument is the key; the second argument is the value get should return if the key is not in the dictionary:
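As a sketch, assume a 5x5 matrix whose only nonzero entries are the three shown below (the positions and values are illustrative, not taken from the figure):

```python
# Keys are (row, column) tuples; only nonzero elements are stored.
matrix = {(0, 3): 1, (2, 1): 2, (4, 3): 3}

# Direct lookup works for stored elements...
print(matrix[0, 3])             # prints 1

# ...while get supplies a default for elements that are not stored.
print(matrix.get((0, 3), 0))    # prints 1
print(matrix.get((1, 3), 0))    # prints 0
```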

get definitely improves the semantics of accessing a sparse matrix. Shame about the syntax.

Hints

If you played around with the fibonacci function from the last chapter, you might have noticed that the bigger the argument you provide, the longer the function takes to run. Furthermore, the run time increases very quickly. On one of our machines, fibonacci(20) finishes instantly, fibonacci(30) takes about a second, and fibonacci(40) takes roughly forever.

To understand why, consider this call graph for fibonacci with n = 4:

[Figure: fibonacci tree]

A call graph shows a set of function frames, with lines connecting each frame to the frames of the functions it calls. At the top of the graph, fibonacci with n = 4 calls fibonacci with n = 3 and n = 2. In turn, fibonacci with n = 3 calls fibonacci with n = 2 and n = 1. And so on.

Count how many times fibonacci(0) and fibonacci(1) are called. This is an inefficient solution to the problem, and it gets far worse as the argument gets bigger.

A good solution is to keep track of values that have already been computed by storing them in a dictionary. A previously computed value that is stored for later use is called a hint. Here is an implementation of fibonacci using hints:

The dictionary named previous keeps track of the Fibonacci numbers we already know. We start with only two pairs: 0 maps to 1; and 1 maps to 1.

Whenever fibonacci is called, it checks the dictionary to determine if it contains the result. If it's there, the function can return immediately without making any more recursive calls. If not, it has to compute the new value. The new value is added to the dictionary before the function returns.
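One way this could look, following the description above. The name new_value is an assumption, and the membership test is written with the in operator so that the sketch runs on both Python 2 and Python 3.

```python
# Hints: Fibonacci numbers we already know (0 maps to 1, and 1 maps to 1).
previous = {0: 1, 1: 1}

def fibonacci(n):
    # Return immediately if the result has already been computed.
    if n in previous:
        return previous[n]
    # Otherwise compute it, store it as a hint, and return it.
    new_value = fibonacci(n - 1) + fibonacci(n - 2)
    previous[n] = new_value
    return new_value

print(fibonacci(10))   # prints 89
```

Because every value is computed only once, the number of calls grows in proportion to n rather than exponentially.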

Using this version of fibonacci, our machines can compute fibonacci(100) in an eyeblink.

The L at the end of the number indicates that it is a long integer.

Long integers

Python provides a type called long that can handle any size integer (limited only by the amount of memory you have on your computer).

There are three ways to create a long value. The first one is to compute an arithmetic expression too large to fit inside an int. We already saw this in the fibonacci(100) example above. Another way is to write an integer with a capital L at the end of your number:

The third is to call long with the value to be converted as an argument. long, just like int and float, can convert ints, floats, and even strings of digits to long integers:

Counting letters

In Chapter 7, we wrote a function that counted the number of occurrences of a letter in a string. A more general version of this problem is to form a histogram of the letters in the string, that is, how many times each letter appears.

Such a histogram might be useful for compressing a text file. Because different letters appear with different frequencies, we can compress a file by using shorter codes for common letters and longer codes for letters that appear less frequently.

Dictionaries provide an elegant way to generate a histogram:

We start with an empty dictionary. For each letter in the string, we find the current count (possibly zero) and increment it. At the end, the dictionary contains pairs of letters and their frequencies.
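A minimal sketch of this loop, using "Mississippi" as an assumed sample string. The get method supplies the "possibly zero" current count.

```python
letter_counts = {}
for letter in "Mississippi":
    # Current count (0 if the letter has not been seen yet), plus one.
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

print(letter_counts)
```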

It might be more appealing to display the histogram in alphabetical order. We can do that with the items and sort methods:
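A sketch of that step, starting from an assumed histogram of the string "Mississippi". The list() call is only needed in Python 3, where items returns a view rather than a list; it is harmless in Python 2.

```python
letter_counts = {'M': 1, 'i': 4, 's': 4, 'p': 2}   # assumed sample histogram

letter_items = list(letter_counts.items())   # list of (letter, count) tuples
letter_items.sort()                          # tuples sort by letter first

print(letter_items)   # [('M', 1), ('i', 4), ('p', 2), ('s', 4)]
```

Note that uppercase letters sort before lowercase ones, which is why 'M' comes first.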

Case Study: Robots

The game

In this case study we will write a version of the classic console-based game, robots.

Robots is a turn-based game in which the protagonist, you, are trying to stay alive while being chased by stupid, but relentless robots. Each robot moves one square toward you each time you move. If they catch you, you are dead, but if they collide they die, leaving a pile of dead robot junk in their wake. If other robots collide with the piles of junk, they die.

The basic strategy is to position yourself so that the robots collide with each other and with piles of junk as they move toward you. To make the game playable, you are also given the ability to teleport to another location on the screen (three times safely and randomly thereafter), so that you don't just get forced into a corner and lose every time.

Setting up the world, the player, and the main loop

Let's start with a program that places the player on the screen and has a function to move her around in response to keys pressed:

Programs like this one that involve interacting with the user through events such as key presses and mouse clicks are called event-driven programs.

The main event loop at this stage is simply:

The event handling is done inside the move_player function. update_when('key_pressed') waits until a key has been pressed before moving to the next statement. The multi-way branching statement then handles all the keys relevant to game play.

Pressing the escape key causes move_player to return True, making not finished false, thus exiting the main loop and ending the game. The 4, 7, 8, 9, 6, 3, 2, and 1 keys all cause the player to move in the appropriate direction, if she isn't blocked by the edge of a window.

Adding a robot

Now let's add a single robot that heads toward the player each time the player moves.

Add the following place_robot function between place_player and move_player:

Add move_robot immediately after move_player:

We need to pass both the robot and the player to this function so that it can compare their locations and move the robot toward the player.
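The comparison logic alone can be sketched without any graphics calls. This sketch assumes that both the robot and the player are dictionaries with 'x' and 'y' keys holding grid coordinates; redrawing the robot's shape on screen is left out.

```python
def move_robot(robot, player):
    """Move the robot one square toward the player along each axis."""
    if robot['x'] < player['x']:
        robot['x'] += 1
    elif robot['x'] > player['x']:
        robot['x'] -= 1

    if robot['y'] < player['y']:
        robot['y'] += 1
    elif robot['y'] > player['y']:
        robot['y'] -= 1

robot = {'x': 2, 'y': 9}    # assumed starting positions
player = {'x': 5, 'y': 7}
move_robot(robot, player)
print(robot['x'], robot['y'])   # prints 3 8
```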

Now add the line robot = place_robot() in the main body of the program immediately after the line player = place_player(), and add the move_robot(robot, player) call inside the main loop immediately after finished = move_player(player).

Checking for Collisions

We now have a robot that moves relentlessly toward our player, but once it catches her it just follows her around wherever she goes. What we want to happen is for the game to end as soon as the player is caught. The following function will determine if that has happened:
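A minimal sketch of such a test, assuming the robot and the player are dictionaries that store their grid position under 'x' and 'y':

```python
def collided(robot, player):
    """Return True when the robot and the player occupy the same square."""
    return player['x'] == robot['x'] and player['y'] == robot['y']

print(collided({'x': 4, 'y': 4}, {'x': 4, 'y': 4}))   # True
print(collided({'x': 4, 'y': 4}, {'x': 4, 'y': 5}))   # False
```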

Place this new function immediately below the move_player function. Now let's modify play_game to check for collisions:

We rename the variable finished to defeated, which is now set to the result of collided. The main loop runs as long as defeated is false. Pressing the escape key still ends the program, since we check for quit and break out of the main loop if it is true. Finally, we check for defeated immediately after the main loop and display an appropriate message if it is true.

Adding more robots

There are several things we could do next:

  • give the player the ability to teleport to another location to escape pursuit.
  • provide safe placement of the player so that it never starts on top of a robot.
  • add more robots.

Adding the ability to teleport to a random location is the easiest task, and it has been left to you to complete as an exercise.

How we provide safe placement of the player will depend on how we represent multiple robots, so it makes sense to tackle adding more robots first.

To add a second robot, we could just create another variable named something like robot2 with another call to place_robot. The problem with this approach is that we will soon want lots of robots, and giving them all their own names will be cumbersome. A more elegant solution is to place all the robots in a list:

Now instead of calling place_robot in play_game, call place_robots, which returns a single list containing all the robots:

With more than one robot placed, we have to handle moving each one of them. We have already solved the problem of moving a single robot, however, so traversing the list and moving each one in turn does the trick:

Add move_robots immediately after move_robot, and change play_game to call move_robots instead of move_robot.

We now need to check each robot to see if it has collided with the player:
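A sketch of this check, assuming robots is a list of dictionaries with 'x' and 'y' keys and that collided compares two positions as shown in the helper below:

```python
def collided(robot, player):
    """Return True when the robot and the player occupy the same square."""
    return player['x'] == robot['x'] and player['y'] == robot['y']

def check_collisions(robots, player):
    """Return True as soon as any robot has caught the player."""
    for robot in robots:
        if collided(robot, player):
            return True
    return False

robots = [{'x': 1, 'y': 1}, {'x': 6, 'y': 2}]   # assumed positions
print(check_collisions(robots, {'x': 6, 'y': 2}))   # True
print(check_collisions(robots, {'x': 0, 'y': 0}))   # False
```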

Add check_collisions immediately after collided and change the line in play_game that sets defeated to call check_collisions instead of collided.

Finally, we need to loop over robots to remove each one in turn if defeated becomes true. Adding this has been left as an exercise.

Winning the game

The biggest problem left in our game is that there is no way to win. The robots are both relentless and indestructible. With careful maneuvering and a bit of luck teleporting, we can reach the point where it appears there is only one robot chasing the player (all the robots will actually just be on top of each other). This moving pile of robots will continue chasing our hapless player until it catches her, either by a bad move on our part or a teleport that lands the player directly on the robots.

When two robots collide they are supposed to die, leaving behind a pile of junk. A robot (or the player) is also supposed to die when it collides with a pile of junk. The logic for doing this is quite tricky. After the player and each of the robots have moved, we need to:

  1. Check whether the player has collided with a robot or a pile of junk. If so, set defeated to true and break out of the game loop.
  2. Check each robot in the robots list to see if it has collided with a pile of junk. If it has, discard the robot (remove it from the robots list).
  3. Check each of the remaining robots to see if they have collided with another robot. If they have, discard all the robots that have collided and place a pile of junk at the locations they occupied.
  4. Check if any robots remain. If not, end the game and mark the player the winner.

Let's take on each of these tasks in turn.

Adding junk

Most of this work will take place inside our check_collisions function. Let's start by modifying collided, changing the names of the parameters to reflect its more general use:

We now introduce a new empty list named junk immediately after the call to place_robots:

and modify check_collisions to incorporate the new list:

Be sure to modify the call to check_collisions (currently defeated = check_collisions(robots, player)) to include junk as a new argument.

Again, we need to fix the logic after if defeated: to remove the new junk from the screen before displaying the They got you! message:

Since at this point junk is always an empty list, we haven't changed the behavior of our program. To test whether our new logic is actually working, we could introduce a single junk pile and run our player into it, at which point the game should remove all items from the screen and display the ending message.

It will be helpful to modify our program temporarily to change the random placement of robots and player to predetermined locations for testing. We plan to use solid boxes to represent junk piles. We observe that placing a robot is very similar to placing a junk pile, and modify place_robot to do both:

Notice that x and y are now parameters, along with a new parameter that we will use to set filled to true for piles of junk.

Our program is now broken, since the call in place_robots to place_robot does not pass arguments for x and y. Fixing this and setting up the program for testing is left to you as an exercise.

Removing robots that hit junk

To remove robots that collide with piles of junk, we add a nested loop to check_collisions between each robot and each pile of junk. Our first attempt at this does not work:

Running this new code with the program as set up in exercise 11, we find a bug. It appears that the robots continue to pass through the pile of junk as before.

Actually, the bug is more subtle. Since we have two robots on top of each other, when the collision of the first one is detected and that robot is removed, we move the second robot into the first position in the list and it is missed by the next iteration. It is generally dangerous to modify a list while you are iterating over it. Doing so can introduce a host of difficult-to-find errors into your program.

The solution in this case is to loop over the robots list backwards, so that when we remove a robot from the list all the robots whose list indexes change as a result are robots we have already evaluated.

As usual, Python provides an elegant way to do this. The built-in function reversed provides for backward iteration over a sequence. Replacing:

with:

will make our program work the way we intended.
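The difference between the two traversals can be demonstrated without graphics. In this sketch the function names, the 'name' key, and the positions are hypothetical, and robots and junk piles are assumed to be dictionaries with 'x' and 'y' keys.

```python
def collided(thing1, thing2):
    """Return True when both things occupy the same square."""
    return thing1['x'] == thing2['x'] and thing1['y'] == thing2['y']

def remove_crashed_forward(robots, junk):
    # Buggy: removing from the list we are looping over skips the next robot.
    for robot in robots:
        for pile in junk:
            if collided(robot, pile):
                robots.remove(robot)
                break

def remove_crashed_backward(robots, junk):
    # Safe: only robots we have already visited change position.
    for robot in reversed(robots):
        for pile in junk:
            if collided(robot, pile):
                robots.remove(robot)
                break

def two_stacked_robots():
    # Two robots on the same square, labelled so we can tell them apart.
    return [{'name': 'A', 'x': 1, 'y': 1}, {'name': 'B', 'x': 1, 'y': 1}]

junk = [{'x': 1, 'y': 1}]

forward = two_stacked_robots()
remove_crashed_forward(forward, junk)
print(len(forward))    # prints 1: robot B was skipped and survives

backward = two_stacked_robots()
remove_crashed_backward(backward, junk)
print(len(backward))   # prints 0: both robots were removed
```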

Turning robots into junk and enabling the player to win

We now want to check each robot to see if it has collided with any other robots. We will remove all robots that have collided, leaving a single pile of junk in their wake. If we reach a state where there are no more robots, the player wins.

Once again we have to be careful not to introduce bugs related to removing things from a list over which we are iterating.

Here is the plan:

  1. Check each robot in robots (an outer loop, traversing forward).
  2. Compare it with every robot that follows it (an inner loop, traversing backward).
  3. If the two robots have collided, add a piece of junk at their location, mark the first robot as junk, and remove the second one.
  4. Once all robots have been checked for collisions, traverse the robots list once again in reverse, removing all robots marked as junk.
  5. Check to see if any robots remain. If not, declare the player the winner.

Adding the following to check_collisions will accomplish most of what we need to do:

We make use of the enumerate function we saw in Chapter 9 to get both the index and value of each robot as we traverse forward. Then a reverse traversal of the slice of the remaining robots, reversed(robots[index+1:]), sets up the collision check.

Whenever two robots collide, our plan calls for adding a piece of junk at that location, marking the first robot for later removal (we still need it to compare with the other robots), and immediately removing the second one. The body of the if collided(robot1, robot2): conditional is designed to do just that, but if you look carefully at the line:

you should notice a problem. robot1['junk'] will result in a runtime error (a KeyError) whenever it is read from a robot that has no such entry, since our robot dictionary does not yet contain a 'junk' key. To fix this we modify place_robot to accommodate the new key:

It is not at all unusual for data structures to change as program development proceeds. Stepwise refinement of both program data and logic is a normal part of the structured programming process.

After robot1 is marked as junk, we add a pile of junk to the junk list at the same location with junk.append(place_robot(robot1['x'], robot1['y'], True)), and then remove robot2 from the game by first removing its shape from the graphics window and then removing it from the robots list.

The next loop traverses backward over the robots list removing all the robots previously marked as junk. Since the player wins when all the robots die, and the robot list will be empty when it no longer contains live robots, we can simply check whether robots is empty to determine whether or not the player has won.
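The two loops described above can be sketched without the graphics calls. This is an illustration under stated assumptions, not the chapter's own listing: the function name crash_robots is hypothetical, robots are plain dictionaries with 'x', 'y', and 'junk' keys, a junk pile is recorded as a bare position instead of being created by place_robot, and the sketch adds only one pile per square.

```python
def collided(thing1, thing2):
    """Return True when both things occupy the same square."""
    return thing1['x'] == thing2['x'] and thing1['y'] == thing2['y']

def crash_robots(robots, junk):
    # Outer loop forward, inner loop backward over the robots that follow.
    for index, robot1 in enumerate(robots):
        for robot2 in reversed(robots[index + 1:]):
            if collided(robot1, robot2):
                if not robot1['junk']:
                    # Mark the first robot and leave a pile where it stands.
                    robot1['junk'] = True
                    junk.append({'x': robot1['x'], 'y': robot1['y']})
                # The second robot is removed right away.
                robots.remove(robot2)

    # Second pass, backward: drop every robot that was marked as junk.
    for robot in reversed(robots):
        if robot['junk']:
            robots.remove(robot)

robots = [
    {'x': 1, 'y': 1, 'junk': False},
    {'x': 1, 'y': 1, 'junk': False},
    {'x': 5, 'y': 5, 'junk': False},
]
junk = []
crash_robots(robots, junk)
print(len(robots), len(junk))   # prints 1 1

# The player has won once the list of robots is empty.
print(not robots)               # prints False: one robot is still alive
```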

This can be done in check_collisions immediately after we finish checking robot collisions and removing dead robots by adding:

Hmmm... What should we return? In its current state, check_collisions is a Boolean function that returns true when the player has collided with something and lost the game, and false when the player has not lost and the game should continue. That is why the variable in the play_game function that catches the return value is called defeated.

Now we have three possible states:

  1. robots is not empty and the player has not collided with anything -- the game is still in play
  2. the player has collided with something -- the robots win
  3. the player has not collided with anything and robots is empty -- the player wins

In order to handle this with as few changes as possible to our present program, we will take advantage of the way that Python permits sequence types to live double lives as Boolean values. We will return an empty string -- which is false -- when game play should continue, and either "robots_win" or "player_wins" to handle the other two cases. check_collisions should now look like this:
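The return convention on its own, separated from collision detection, might be sketched like this. The helper name game_result and its player_hit parameter are hypothetical; only the three return values come from the text.

```python
def game_result(player_hit, robots):
    """Return '' while play continues, else 'robots_win' or 'player_wins'."""
    if player_hit:
        return "robots_win"
    if not robots:            # an empty list is false
        return "player_wins"
    return ""                 # an empty string is false: keep playing

winner = game_result(False, [{'x': 3, 'y': 3}])
print(bool(winner))                 # False, so the game loop would continue

print(game_result(True, []))        # robots_win
print(game_result(False, []))       # player_wins
```

Because a non-empty string counts as true, a loop written as "while not winner" keeps running only in the first state.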

A few corresponding changes need to be made to play_game to use the new return values. These are left as an exercise.

Glossary

Exercises

  1. Write a program that reads in a string on the command line and returns a table of the letters of the alphabet, in alphabetical order, which occur in the string together with the number of times each letter occurs. Case should be ignored. A sample run of the program would look like this:

    $ python letter_counts.py "ThiS is String with Upper and lower case Letters."
    a  2
    c  1
    d  1
    e  5
    g  1
    h  2
    i  4
    l  2
    n  2
    o  1
    p  2
    r  4
    s  5
    t  5
    u  1
    w  2
    $
  2. Give the Python interpreter's response to each of the following from a continuous interpreter session:

    #.
    #.
    #.
    #.
    #.
    #.
    #.

    Be sure you understand why you get each result. Then apply what you have learned to fill in the body of the function below:

    Your solution should pass the doctests.
  3. Write a program called alice_words.py that creates a text file named alice_words.txt containing an alphabetical listing of all the words found in alice_in_wonderland.txt together with the number of times each word occurs. The first 10 lines of your output file should look something like this:

    Word              Count
    =======================
    a                 631
    a-piece           1
    abide             1
    able              1
    about             94
    above             3
    absence           1
    absurd            2
    How many times does the word alice occur in the book?
  4. What is the longest word in Alice in Wonderland? How many characters does it have?
  5. Copy the code from the Setting up the world, the player, and the main loop section into a file named robots.py and run it. You should be able to move the player around the screen using the numeric keypad and to quit the program by pressing the escape key.
  6. Laptops usually have smaller keyboards than desktop computers, without a separate numeric keypad. Modify the robots program so that it uses 'a', 'q', 'w', 'e', 'd', 'c', 'x', and 'z' instead of '4', '7', '8', '9', '6', '3', '2', and '1' so that it will work on a typical laptop keyboard.
  7. Add all the code from the Adding a robot section in the places indicated. Make sure the program works and that you now have a robot following around your player.
  8. Add all the code from the Checking for Collisions section in the places indicated. Verify that the program ends when the robot catches the player after displaying a They got you! message for 3 seconds.
  9. Modify the move_player function to add the ability for the player to jump to a random location whenever the 0 key is pressed. (hint: place_player already has the logic needed to place the player in a random location. Just add another conditional branch to move_player that uses this logic when key_pressed('0') is true.) Test the program to verify that your player can now teleport to a random place on the screen to get out of trouble.
  10. Make all the changes to your program indicated in Adding more robots. Be sure to loop over the robots list, removing each robot in turn, after defeated becomes true. Test your program to verify that there are now two robots chasing your player. Let a robot catch you to test whether you have correctly handled removing all the robots. Change the argument from 2 to 4 in robots = place_robots(2) and confirm that you have 4 robots.
  11. Make the changes to your program indicated in Adding junk. Fix place_robots by moving the random generation of values for x and y to the appropriate location and passing these values as arguments in the call to place_robot. Now we are ready to make temporary modifications to our program to remove the randomness so we can control it for testing. We can start by placing a pile of junk in the center of our game board. Change:

    to:

    Run the program and confirm that there is a black box in the center of the board. Now change place_player so that it looks like this:

    Finally, temporarily comment out the random generation of x and y values in place_robots and the creation of numbots robots. Replace this logic with code to create two robots in fixed locations:

    When you start your program now, it should look like this:

    [Figure: robots 1]

    When you run this program and either stay still (by pressing the 5 repeatedly) or move away from the pile of junk, you can confirm that the robots move through it unharmed. When you move into the junk pile, on the other hand, you die.
  12. Make the following modifications to play_game to integrate with the changes made in Turning robots into junk and enabling the player to win:

    1. Rename defeated to winner and initialize it to the empty string instead of False.