Introduction to Game Theory/Prisoner's Dilemma

Let's start by jumping right in and looking at a game. The Prisoner's Dilemma, often just called PD for short, is a classic example used to demonstrate game theory. It is usually explained through the story below, but the game itself is not limited to this situation: its underlying dynamics can be used to describe all sorts of phenomena.

The Story


Two men, Andy and Bob, were arrested after an armed robbery. The police had enough evidence to convict the two for the theft of the get-away car, but not enough to convict them for the actual armed robbery. However, if the police could get a confession from either of the two men they could conceivably convict them both for the armed robbery.

The police locked the two men in two separate rooms and gave them each the same offer:

If Andy confessed and Bob stayed silent, then Andy would go scot-free and Bob would be charged for the robbery and get 10 years in jail. Of course, this worked the other way around as well. If Bob confessed and Andy stayed silent, Andy would receive the 10 years.
If Andy confessed and Bob confessed as well, then they would both receive 7 years in jail.
If both Andy and Bob stayed silent, then they would both receive 2 years in prison for the theft of the get-away car.

The two prisoners are left to make their decision without any way to contact each other. The question is: what did each person choose?



The solution that occurs every time this game is played (assuming each acts in his own best interests) is that both Andy and Bob will choose to confess, resulting in a sentence of 7 years each. This answer seems to be counter-intuitive, doesn't it? Why would both players choose to confess, an option that is clearly inferior to both of them staying silent and getting 2 years each? Not only this, but in terms of total years in prison, this is the worst possible outcome!



The reason that both players choose to confess is easy to explain. Let's talk about Andy (whatever holds for Andy will hold for Bob as well, because they are both in identical situations).

The following explanation assumes that Andy and Bob cannot communicate their choices to each other, directly or indirectly.

Andy faces the following matrix of possible outcomes:

If he confesses:

Minimum Jail Term: 0 Years (If Bob remains silent)
Maximum Jail Term: 7 Years (If Bob confesses)

If he remains silent:

Minimum Jail Term: 2 Years (If Bob remains silent)
Maximum Jail Term: 10 Years (If Bob confesses)

Table format (Andy's jail term in years):

                    | Bob remains silent | Bob confesses
Andy confesses      | 0                  | 7
Andy remains silent | 2                  | 10

The expected payoff of a strategy is the average amount of benefit that it will provide. If Andy treats Bob's two choices as equally likely, confessing gives an expected jail time of (0 + 7) / 2 = 3.5 years, while staying silent gives (2 + 10) / 2 = 6 years. Therefore, from a rational perspective, Andy should choose to confess rather than remain silent.
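The 3.5 and 6 year figures can be reproduced with a short calculation. The sketch below is only an illustration: the names ANDY_YEARS and expected_years are made up for this example, and the default probability of 0.5 for Bob confessing is an assumption implied by the averages above rather than something stated in the story.

```python
# Andy's sentence in years, indexed by (Andy's choice, Bob's choice).
# The numbers come from the story; the names are illustrative only.
ANDY_YEARS = {
    ("confess", "silent"): 0,
    ("confess", "confess"): 7,
    ("silent", "silent"): 2,
    ("silent", "confess"): 10,
}


def expected_years(andy_choice, p_bob_confesses=0.5):
    """Expected jail time for Andy if Bob confesses with the given probability.

    The default of 0.5 is an assumption: Andy has no information about
    Bob, so he treats both of Bob's choices as equally likely.
    """
    return (
        p_bob_confesses * ANDY_YEARS[(andy_choice, "confess")]
        + (1 - p_bob_confesses) * ANDY_YEARS[(andy_choice, "silent")]
    )


print(expected_years("confess"))  # 3.5
print(expected_years("silent"))   # 6.0
```

Passing a different value for p_bob_confesses shows how the averages move if Andy believes Bob is more or less likely to confess.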

Also, it doesn't matter what Bob does: Andy is always better off confessing. If Bob confesses, Andy gets 7 years for confessing or 10 for silence, and if Bob is silent, Andy gets 0 years for confessing or 2 for silence. Unfortunately for Andy, the same holds true for Bob, who is also always better off confessing. This means that if both agents do what is in their own best interests, they will each spend 7 years in prison! This demonstrates that in many games the "best" solution, the one where the total utility of the outcome is highest, is not the one which will ultimately occur.
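The argument above can be checked mechanically. The following is a minimal sketch, assuming the sentences from the story; the names YEARS, andy_best_reply, bob_best_reply, stable and total_years are hypothetical and chosen only for this example. It finds each prisoner's best reply to every choice the other might make, lists the outcomes in which neither prisoner could do better by changing only his own choice, and adds up the total years for each outcome.

```python
from itertools import product

CHOICES = ("confess", "silent")

# Sentences in years as (Andy, Bob), indexed by (Andy's choice, Bob's choice).
YEARS = {
    ("confess", "confess"): (7, 7),
    ("confess", "silent"): (0, 10),
    ("silent", "confess"): (10, 0),
    ("silent", "silent"): (2, 2),
}


def andy_best_reply(bob_choice):
    """Andy's choice that minimises his own sentence, given Bob's choice."""
    return min(CHOICES, key=lambda a: YEARS[(a, bob_choice)][0])


def bob_best_reply(andy_choice):
    """Bob's choice that minimises his own sentence, given Andy's choice."""
    return min(CHOICES, key=lambda b: YEARS[(andy_choice, b)][1])


# Outcomes where each prisoner's choice is a best reply to the other's.
stable = [
    (a, b)
    for a, b in product(CHOICES, CHOICES)
    if andy_best_reply(b) == a and bob_best_reply(a) == b
]

# Combined years in prison for each outcome.
total_years = {outcome: sum(years) for outcome, years in YEARS.items()}

print(andy_best_reply("confess"), andy_best_reply("silent"))  # confess confess
print(stable)  # [('confess', 'confess')]
print(total_years[("confess", "confess")], total_years[("silent", "silent")])  # 14 4
```

Confessing is the best reply to both of Bob's choices, so the only stable outcome is mutual confession, even though it carries the largest combined sentence (14 years) and mutual silence the smallest (4 years).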
