### Cooperation in the Prisoner's dilemma

As a game, the Prisoner's dilemma is deceptively simple: there are just two players, each choosing, once, whether to cooperate or defect. But the question of whether cooperation can be ensured (for mutual benefit) is a fascinating one. The scenario is as follows:

*Two prisoners are independently interrogated by the
police for a crime they are guilty of, which they may individually either confess to or deny. The
authorities lack firm evidence to convict either prisoner of this crime, so each can only receive a
minor jail sentence for previous misdemeanours if neither confesses. If both prisoners confess to the
crime, then there is no doubt over their guilt and each receives the standard sentence. However, as
an incentive to confess, the prisoners are told that if they confess whilst the other denies, then they
will walk free for their honesty whilst the other is made an example of and receives a still harsher
sentence.*

In bimatrix form, one version is:

| | Player 2: Defect | Player 2: Co-operate |
|---|---|---|
| **Player 1: Defect** | 1 , 1 | 4 , 0 |
| **Player 1: Co-operate** | 0 , 4 | 3 , 3 |

Here row one denotes a confession of guilt by Player 1 (in the standard terminology of the
prisoner’s dilemma, to defect) and row two denotes a denial of guilt by Player 1 (referred to as
the strategy of co-operation, in the sense that they co-operate with the other prisoner rather than
the authorities). Similarly, the first column represents defection by Player 2, and the second
co-operation by Player 2. In each cell, the first number is the payoff to Player 1 and the second the payoff to Player 2.
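
As a minimal sketch, the matrix above can be encoded directly and checked for strict dominance: for each player, does defecting pay more than co-operating whatever the other does? The names `PAYOFFS`, `row_payoff`, `col_payoff` and `defect_strictly_dominates` are illustrative choices, not part of the original text.

```python
# Payoffs from the bimatrix above, indexed [Player 1 move][Player 2 move].
# Each entry is (Player 1 payoff, Player 2 payoff).
DEFECT, COOPERATE = 0, 1

PAYOFFS = [
    [(1, 1), (4, 0)],  # Player 1 defects
    [(0, 4), (3, 3)],  # Player 1 co-operates
]


def row_payoff(row_move, col_move):
    """Payoff to Player 1 (the row player)."""
    return PAYOFFS[row_move][col_move][0]


def col_payoff(row_move, col_move):
    """Payoff to Player 2 (the column player)."""
    return PAYOFFS[row_move][col_move][1]


def defect_strictly_dominates():
    """True if defecting beats co-operating for both players, whatever the other does."""
    moves = (DEFECT, COOPERATE)
    row_ok = all(row_payoff(DEFECT, c) > row_payoff(COOPERATE, c) for c in moves)
    col_ok = all(col_payoff(r, DEFECT) > col_payoff(r, COOPERATE) for r in moves)
    return row_ok and col_ok


if __name__ == "__main__":
    print(defect_strictly_dominates())
```

With these payoffs the check succeeds for both players (1 > 0 against a defector, 4 > 3 against a co-operator), which is the sense in which each player is individually drawn towards defection.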

The Prisoner’s dilemma is at the heart of many problems in social science. For instance, it can
be recast in terms of economics as follows. Two companies may between them control the supply
of a particular good, and are able to supply at either a high or low level. Were both to restrict
supply by producing at a low level (mutual co-operation), then the price can be kept artificially
high. There is insufficient demand in the market to justify high supply (mutual defection) by both
companies, and thus if this occurs each will receive a smaller profit. However, restriction of supply
by a single company will maintain the higher price point. Hence if one defects (high supply at high
price) and the other co-operates (high price, but for a low supply) then the defecting company gains
a significant market-share (and profit) advantage over the co-operating one, and so each company
is motivated to defect.

In dominated strategy, it was shown that a non-cooperative analysis of the dilemma implies that the rational play is mutual defection, since striving for the greater payoff of mutual cooperation incurs the risk of receiving no payoff at all. Nonetheless, there is considerable experimental evidence that people choose
to co-operate more than this game-theoretic analysis would predict. In one experiment of this kind^{1} with university students, the defection
rate of non-economics majors was under 40%. Whilst economics majors defected 60% of the time
in the standard game, when given the opportunity to make (non-binding) deals with the other
participants before play, both categories dropped to a defection rate of around 30%. Thus the question arises as to how co-operative behaviour can emerge from a system where it is not *a priori* assumed by the players. If one seeks to use game theory to explain the behaviour of individuals, these discrepancies between theory and practice must be resolved. Several interpretations are possible.

One way to resolve this problem is to argue that the mutual defection strategy is the correct
response to the incorrect problem: that is, the payoff matrix presented fails to capture
the utility to each player. Appropriate modifications will then yield a problem where mutual co-operation
is the Nash equilibrium, thus explaining experimental observations.

Typically, such arguments relate to the notion of utility and introduce
an additional cost to defection when the other player co-operates. Mechanisms ranging from guilt
and inequality aversion to simple fear of repercussions from external authorities (be it the state or other
criminals) have been advanced to explain this psychological difference between the objective payoff
and subjective payoff to the defecting player.
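
To make this concrete, here is a rough sketch of one hypothetical modification: a fixed `guilt` cost subtracted from the payoff of 4 earned by defecting against a co-operator. The parameter `guilt`, its value, and the function names are assumptions made for illustration; the text does not specify any particular adjustment.

```python
from itertools import product

MOVES = ("D", "C")  # defect, co-operate


def payoffs_with_guilt(guilt):
    """Payoff table from the text, with `guilt` subtracted from a lone defector's 4.

    Keys are (Player 1 move, Player 2 move); values are (Player 1, Player 2) payoffs.
    """
    return {
        ("D", "D"): (1, 1),
        ("D", "C"): (4 - guilt, 0),
        ("C", "D"): (0, 4 - guilt),
        ("C", "C"): (3, 3),
    }


def pure_nash_equilibria(table):
    """Return the move pairs where neither player gains by deviating unilaterally."""
    equilibria = []
    for r, c in product(MOVES, MOVES):
        row_best = all(table[(r, c)][0] >= table[(alt, c)][0] for alt in MOVES)
        col_best = all(table[(r, c)][1] >= table[(r, alt)][1] for alt in MOVES)
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(payoffs_with_guilt(0)))  # original game
    print(pure_nash_equilibria(payoffs_with_guilt(2)))  # hypothetical guilt cost of 2
```

Under this particular assumption, any guilt cost greater than 1 makes mutual co-operation a Nash equilibrium, but mutual defection remains one as well, since a co-operator facing a defector still receives 0. Making co-operation the only equilibrium would require a different or stronger modification.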

A more interesting modification is to preserve the payoff matrix, but consider the relative merits
of co-operation. The aversion to co-operation is due to the possibility of receiving a sucker payoff
of 0 when the other player defects and you co-operate. Obviously for this to occur, one must play
against an opponent who opts to defect. Earlier, rational behaviour was considered in terms of
average expected outcome over a large number of plays. Since, relative to the mutual-defection payoff of 1, your payoff increases by 2 for mutual
co-operation but only decreases by 1 when suckered by a defector, if you expect the probability of
another player co-operating to be high enough, it may seem preferable to also co-operate, if only
for purely selfish reasons.
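
As a rough check on this heuristic, write q (a symbol introduced here for illustration) for the probability that the other player co-operates. Measured against the mutual-defection payoff of 1, co-operating gains 2 with probability q and loses 1 with probability 1 - q, so the heuristic favours co-operation when:

```latex
2q - (1 - q) > 0 \iff q > \tfrac{1}{3}
```

This comparison treats mutual defection as the only alternative. It leaves out the payoff of 4 available by defecting against a co-operator.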

We may attempt to formalise this as follows. We introduce a meta-game, whereby a player can
assign themselves to one of two categories: category D, members of which always defect; or category
C, where the strategy is to defect against players from category D, but to co-operate with others
from category C. We can then consider whether it is rational to opt for category C or D.
If the proportion of the population in category C is p, then a player from category D expects
a payoff of 1 whereas one from category C expects 3×p + 1×(1 - p) = 1 + 2p ≥ 1, so category
C is preferable. This is unsurprising: if one can perfectly identify other co-operators, it pays to
co-operate with them. But this analysis has performed a sleight-of-hand: if one can entirely trust
other players to co-operate, then there was never any dilemma, since a co-operative game emerges.
Without that trust, there is a category superior to C: consider a category C', whose members
pretend to be of category C, then defect anyway. This ensures a return of 1 against other defectors,
or other players from C', but gains a payoff of 4 rather than 3 from members of category C, who
now must factor in the potential of a sucker payoff. In line with the assumptions of non-cooperative
game theory, the ability to label oneself as being of a particular category is of no use without a
guarantee of honesty.
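
The meta-game argument can be sketched numerically. The function below computes the expected payoff of each category against a randomly drawn opponent; the parameter names `p_c` and `p_fake` (the population shares of categories C and C') are assumptions introduced for illustration, with the remainder of the population in category D.

```python
def expected_payoffs(p_c, p_fake):
    """Expected payoff per category against a randomly drawn opponent.

    p_c    -- share of genuine co-operators (category C)
    p_fake -- share of impostors (category C'); the rest are category D
    """
    if p_c < 0 or p_fake < 0 or p_c + p_fake > 1 + 1e-12:
        raise ValueError("shares must be non-negative and sum to at most 1")
    p_d = max(0.0, 1.0 - p_c - p_fake)
    return {
        # D defects against everyone, and everyone defects against D.
        "D": 1.0,
        # C earns 3 from other Cs, 0 when suckered by C', and 1 against D.
        "C": 3 * p_c + 0 * p_fake + 1 * p_d,
        # C' earns 4 from trusting Cs and 1 against everyone else.
        "C'": 4 * p_c + 1 * (p_fake + p_d),
    }


if __name__ == "__main__":
    print(expected_payoffs(0.5, 0.0))   # no impostors: C earns 1 + 2p
    print(expected_payoffs(0.5, 0.25))  # impostors present
```

With no impostors, category C recovers the 1 + 2p of the text. In this sketch category C' earns 1 + 3 × `p_c`, which is never less than the payoff to C and is strictly greater whenever anyone in the population is in C or C'.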

Certainly, then, in the one-shot game there is no avoiding the rationality of defection. An explanation for observed behaviour must therefore involve further factors, such as the desire to establish trust for future benefit over a series of games. For this, we consider the iterated prisoner's dilemma.

^{1} R. Frank, T. Gilovich and D. Regan (1993) ‘Does studying economics inhibit cooperation?’
*Journal of Economic Perspectives* Vol 7 Issue 2 (Spring).

Part of *A survey of game theory*; see the project homenode for details and links to the print version.