Cooperation in the Prisoner's dilemma

As a game, the Prisoner's dilemma is deceptively simple: there are just two players, each choosing, once, either to co-operate or to defect. But the question of whether co-operation can be ensured (for mutual benefit) is a fascinating one. The scenario is as follows:

Two prisoners are independently interrogated by the police for a crime of which they are guilty, and which each may either confess to or deny. The authorities lack firm evidence to convict either prisoner of this crime, so if neither confesses each receives only a minor jail sentence for previous misdemeanours. If both prisoners confess, then there is no doubt over their guilt and each receives the standard sentence. However, as an incentive to confess, the prisoners are told that whoever confesses whilst the other denies will walk free for their honesty, whilst the other is made an example of and receives a still harsher sentence.

In Bimatrix form, one version is

                                                 Player 2
                                          Defect       Co-operate
                                     +--------------+--------------+
                                     |              |              |
       Player 1       Defect         |    1 , 1     |    4 , 0     |
                                     |              |              |
                                     +--------------+--------------+
                                     |              |              |
                      Co-operate     |    0 , 4     |    3 , 3     |
                                     |              |              |
                                     +--------------+--------------+

Here row one denotes a confession of guilt by Player 1 (in the standard terminology of the Prisoner's dilemma, to defect) and row two denotes a denial of guilt by Player 1 (referred to as the strategy of co-operation, in the sense that they co-operate with the other prisoner rather than with the authorities). Similarly, the first column represents defection by Player 2, and the second co-operation by Player 2.
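
As a quick computational check, here is a minimal Python sketch of the bimatrix above together with a brute-force search for Nash equilibria; the dictionary layout, strategy labels and the is_nash helper are illustrative choices of mine rather than anything fixed by the game.

    # The bimatrix above, indexed by (Player 1 move, Player 2 move), with a
    # brute-force check for Nash equilibria.
    STRATEGIES = ("defect", "co-operate")

    PAYOFF = {
        ("defect",     "defect"):     (1, 1),
        ("defect",     "co-operate"): (4, 0),
        ("co-operate", "defect"):     (0, 4),
        ("co-operate", "co-operate"): (3, 3),
    }

    def is_nash(row, col):
        # A pair is a Nash equilibrium if neither player gains by deviating unilaterally.
        p1, p2 = PAYOFF[(row, col)]
        return (all(PAYOFF[(r, col)][0] <= p1 for r in STRATEGIES) and
                all(PAYOFF[(row, c)][1] <= p2 for c in STRATEGIES))

    for r in STRATEGIES:
        for c in STRATEGIES:
            if is_nash(r, c):
                print(r, c)    # prints only: defect defect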

The Prisoner’s dilemma is at the heart of many problems in social science. For instance, it can be recast in economic terms as follows. Two companies may between them control the supply of a particular good, each able to supply at either a high or a low level. Were both to restrict supply by producing at the low level (mutual co-operation), the price could be kept artificially high. There is insufficient demand in the market to justify high supply from both companies (mutual defection), so if this occurs each receives a smaller profit. However, restriction of supply by just one company is enough to maintain the higher price. Hence if one defects (high supply at the high price) and the other co-operates (the same high price, but for a low supply), the defecting company gains a significant market-share (and profit) advantage over the co-operating one, and so each company is motivated to defect.

As shown in the discussion of dominated strategies, a non-cooperative analysis of the dilemma implies that the rational play is mutual defection, since striving for the greater payoff of mutual co-operation incurs the risk of receiving no payoff at all. Nonetheless, there is considerable experimental evidence of people choosing to co-operate more often than this game-theoretic analysis would predict. One experiment of this kind [1], carried out with university students, found a defection rate of under 40% among non-economics majors. Economics majors defected 60% of the time in the standard game, but when given the opportunity to make (non-binding) deals with the other participants before play, both groups dropped to a defection rate of around 30%. Thus the question arises of why co-operative behaviour can emerge from a system where it is not assumed a priori by the players. If game theory is to be used to explain the behaviour of individuals, these discrepancies between theory and practice must be resolved. Several interpretations are possible.

One way to resolve this problem is to argue that mutual defection is the correct response to the wrong problem: that is, the payoff matrix as presented fails to capture the true utility to each player. Appropriate modifications then yield a game in which mutual co-operation is a Nash equilibrium, thus explaining the experimental observations.

Typically, such arguments appeal to the notion of utility and introduce an additional cost to defecting when the other player co-operates. Mechanisms ranging from guilt and inequality aversion to simple fear of repercussions from external authorities (be they the state or other criminals) have been advanced to explain this psychological difference between the objective payoff and the subjective payoff experienced by the defecting player.
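
As a rough sketch of how such a modification can change the picture, suppose a hypothetical guilt cost g is subtracted from a player's payoff whenever they defect against a co-operator; the value of g and where it is applied are assumptions for illustration only, not a claim about how utilities actually look.

    # Illustrative only: deduct a hypothetical guilt cost g from the temptation
    # payoff of 4 (defecting against a co-operator), then re-check the equilibria.
    def adjusted_payoff(g):
        return {
            ("defect",     "defect"):     (1, 1),
            ("defect",     "co-operate"): (4 - g, 0),
            ("co-operate", "defect"):     (0, 4 - g),
            ("co-operate", "co-operate"): (3, 3),
        }

    def nash_pairs(payoff):
        strategies = ("defect", "co-operate")
        return [(r, c) for r in strategies for c in strategies
                if all(payoff[(s, c)][0] <= payoff[(r, c)][0] for s in strategies)
                and all(payoff[(r, s)][1] <= payoff[(r, c)][1] for s in strategies)]

    print(nash_pairs(adjusted_payoff(0)))   # [('defect', 'defect')]
    print(nash_pairs(adjusted_payoff(2)))   # a guilt cost above 1 makes mutual co-operation
                                            # an equilibrium too (mutual defection remains one)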

A more interesting approach is to preserve the payoff matrix, but to reconsider the relative merits of co-operation. The aversion to co-operating stems from the possibility of receiving the sucker payoff of 0 when you co-operate and the other player defects. Obviously, for this to occur one must be playing against an opponent who opts to defect. Earlier, rational behaviour was considered in terms of the average expected outcome over a large number of plays. Relative to mutual defection, your payoff increases by 2 under mutual co-operation but decreases by only 1 when you are suckered by a defector; so if you expect the probability of the other player co-operating to be high enough, it may seem preferable to co-operate as well, if only for purely selfish reasons.
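
To make the comparison concrete, the following sketch computes the expected payoff of each move against an opponent assumed to co-operate with probability p; the notation p and the function name are mine, introduced purely for illustration.

    # Expected payoff of a move against an opponent who co-operates with probability p.
    def expected(move, p):
        vs_cooperate = {"defect": 4, "co-operate": 3}
        vs_defect    = {"defect": 1, "co-operate": 0}
        return p * vs_cooperate[move] + (1 - p) * vs_defect[move]

    for p in (0.2, 0.5, 0.9):
        print(p, expected("co-operate", p), expected("defect", p))
    # Co-operating yields 3p, which exceeds the mutual-defection payoff of 1 once p > 1/3;
    # but against the same opponent, defecting still yields 1 + 3p, i.e. 1 more in every case.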

We may attempt to formalise this as follows. Introduce a meta-game in which a player can assign themselves to one of two categories: category D, whose members always defect; or category C, whose strategy is to defect against players from category D but to co-operate with other members of category C. We can then ask whether it is rational to opt for category C or category D. If the proportion of the population in category C is p, then a player from category D expects a payoff of 1, whereas one from category C expects 3×p + 1×(1 - p) = 1 + 2p ≥ 1; category C is preferable. This is unsurprising: if one can perfectly identify other co-operators, it pays to co-operate with them. But this analysis has performed a sleight of hand: if one can entirely trust other players to co-operate, then there was never any dilemma, since a co-operative game emerges. Without that trust, there is a category superior to C. Consider a category C', whose members pretend to be of category C, but then defect anyway. This ensures a payoff of 1 against other defectors, whether from D or C', and gains a payoff of 4 rather than 3 from members of category C, who must now factor in the possibility of a sucker payoff. In line with the assumptions of non-cooperative game theory, the ability to label oneself as belonging to a particular category is of no use without a guarantee of honesty.
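
A small calculation, taken from the viewpoint of a single player choosing a category when a proportion p of the population genuinely belongs to category C (and the rest defect), illustrates why C' spoils the picture; the bookkeeping below is an illustrative sketch of the argument above, not part of the original analysis.

    # Expected payoff of each category choice for one player, given a fraction p
    # of genuine category-C members in the population.
    def category_payoff(category, p):
        if category == "D":
            return 1.0                    # mutual defection against everyone
        if category == "C":
            return 3 * p + 1 * (1 - p)    # co-operate with C, defect against the rest
        if category == "C'":
            return 4 * p + 1 * (1 - p)    # exploit C's trust, mutual defection otherwise

    for p in (0.0, 0.5, 1.0):
        print(p, [category_payoff(cat, p) for cat in ("D", "C", "C'")])
    # For every p > 0, C' strictly beats C: labelling yourself a co-operator is
    # worthless without a guarantee of honesty.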

Certainly, then, in the one-shot game there is no avoiding the rationality of defection. An explanation of observed behaviour must therefore involve further factors, such as the desire to establish trust for future benefit over a series of games. For this, we turn to the iterated prisoner's dilemma.


[1] R. Frank, T. Gilovich and D. Regan (1993), ‘Does studying economics inhibit cooperation?’, Journal of Economic Perspectives, Vol. 7, No. 2 (Spring).


Part of A survey of game theory; see the project homenode for details and links to the print version.