Saturday, February 23, 2013

Prisoners and Drink

In recent times, I’ve had occasion to spend time with people who drink, sometimes quite heavily, and also with people who gave up for an extended period before returning to their old drinking habits. It does people like me no harm to watch others' lives imperceptibly unravelling until they reach a place where the social stitching has disintegrated so badly that they no longer function properly. As an exercise in self-therapy, then, I found myself considering a few ideas that I hadn’t pursued for quite some time. Drinking is a game one plays with oneself, so a degree of understanding can be gained from a look at the mathematical theory of games.

The ‘Prisoner’s Dilemma’ is, I think, the classic example of non-cooperative game theory. Two men are arrested, but the police do not have enough evidence for a conviction. The police separate the two men and offer both the same deal: if one testifies against his partner (defects/betrays) and the other remains silent (cooperates with/assists his partner), the betrayer goes free and the one who remains silent gets a one-year sentence. If both remain silent, both are sentenced to only one month in jail on a minor charge. If each 'rats out' the other, each receives a three-month sentence. Each prisoner must choose either to betray or to remain silent; the decision of each is kept secret from his partner. What should they do?

If it is assumed that each prisoner is concerned only with lessening his own time in jail, the game becomes a non-zero-sum game in which each player may either assist or betray the other. The interesting feature of this problem is that the optimal decision for each is to betray the other, even though both would be better off if they both cooperated.
The game in normal form is shown below:

| | Prisoner B stays silent (cooperates) | Prisoner B betrays (defects) |
|---|---|---|
| Prisoner A stays silent (cooperates) | Each serves 1 month | Prisoner A: 12 months; Prisoner B: goes free |
| Prisoner A betrays (defects) | Prisoner A: goes free; Prisoner B: 12 months | Each serves 3 months |

Here, regardless of what the other decides, each prisoner gets a higher pay-off by betraying the other. Take Prisoner A: if Prisoner B stays silent, A goes free by betraying instead of serving one month; if Prisoner B betrays, A serves three months by betraying instead of twelve. Whatever B chooses, then, A is better off 'ratting him out' (defecting) than staying silent (cooperating), so A should logically betray B. The game is symmetric, so Prisoner B should act the same way. Since both rationally decide to defect, each receives a lower reward than if both were to stay quiet. Traditional game theory thus leaves both players worse off than if each had chosen to lessen the sentence of his accomplice at the risk of spending more time in jail himself.
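For anyone who would rather check this than take my word for it, here is a minimal Python sketch of the reasoning above. It records the sentences from the payoffs above as months in jail and asks which choice gives Prisoner A the shortest sentence for each choice B might make. The names `SENTENCES` and `best_reply` are my own inventions for illustration, and I have taken "goes free" to mean 0 months.

```python
# Months in jail for (Prisoner A, Prisoner B), keyed by (A's choice, B's choice).
# Figures come from the payoffs in the post; "goes free" is assumed to be 0 months.
SENTENCES = {
    ("silent", "silent"): (1, 1),
    ("silent", "betray"): (12, 0),
    ("betray", "silent"): (0, 12),
    ("betray", "betray"): (3, 3),
}

CHOICES = ("silent", "betray")


def best_reply(b_choice):
    """Return the choice that gives Prisoner A the shortest sentence
    when Prisoner B's choice is held fixed."""
    return min(CHOICES, key=lambda a_choice: SENTENCES[(a_choice, b_choice)][0])


for b_choice in CHOICES:
    print(f"If B plays {b_choice!r}, A does best to play {best_reply(b_choice)!r}")
```

Because the game is symmetric, the same check applies to Prisoner B with the roles swapped.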
The structure of the traditional Prisoners’ Dilemma can be analysed by removing its original prisoner setting. Suppose that the two players are represented by colours, red and blue, and that each player chooses to either "Cooperate" or "Defect".
If both players play "Cooperate" they both get the payoff A. If Blue plays "Defect" while Red plays "Cooperate" then Blue gets B while Red gets C. Symmetrically, if Blue plays "Cooperate" while Red plays "Defect" then Blue gets payoff C while Red gets payoff B. If both players play "Defect" they both get the payoff D.
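In terms of general point values, the canonical PD payoff matrix is as follows (each cell gives Red's payoff, then Blue's):

| | Blue cooperates | Blue defects |
|---|---|---|
| Red cooperates | A, A | C, B |
| Red defects | B, C | D, D |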

To be a prisoner's dilemma, the following must be true:
B > A > D > C
The fact that A>D implies that the "Both Cooperate" outcome is better than the "Both Defect" outcome, while B>A and D>C imply that "Defect" is the dominant strategy for both agents.
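The condition is simple enough to test mechanically. The following is a small sketch, not anything standard: the function name `is_prisoners_dilemma` is hypothetical, and turning the jail sentences into points by negating the months (so that less time in jail means a higher payoff) is my own assumption.

```python
def is_prisoners_dilemma(a, b, c, d):
    """Check the ordering B > A > D > C, where
    a = payoff when both cooperate,
    b = payoff for defecting against a cooperator,
    c = payoff for cooperating against a defector,
    d = payoff when both defect."""
    return b > a > d > c


# The prisoners' sentences, negated so that a shorter sentence is a higher payoff
# (an assumption made for this illustration).
print(is_prisoners_dilemma(a=-1, b=0, c=-12, d=-3))
```

With those values the ordering holds, since going free beats one month, which beats three months, which beats twelve.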
If two players play the prisoners' dilemma more than once in succession, remember their opponent's previous actions and change their strategy accordingly, the game is called the iterated prisoners' dilemma.

But enough of all this. Addiction can be seen as a game and can be cast as an intertemporal prisoner's dilemma (PD) between the present and future selves of the addict. In this case, defecting means relapsing, and it is easy to see that not defecting both today and in the future is by far the best outcome, and that defecting both today and in the future is the worst outcome. The case where one abstains today but relapses in the future is clearly a bad outcome: in some sense the discipline and self-sacrifice involved in abstaining today have been "wasted", because the future relapse means that the addict is right back where he started and will have to start over (which is quite demoralising, and makes starting over more difficult). The final case, where one engages in the addictive behaviour today while abstaining "tomorrow", will be familiar to anyone who has struggled with an addiction. The problem here is that (as in other PDs) there is an obvious benefit to defecting "today", but tomorrow one will face the same PD, and the same obvious benefit will be present then, ultimately leading to an endless string of defections.

One trick, or ‘learned behaviour’ as my psych friends would put it, is to make a rule never to defect today: it can always be postponed until tomorrow. And tomorrow never comes.

1 comment:

  1. My head hurts. I absolutely positively despised these exercises when assigned at any point during my education. Years later, I am not surprised to recognise that familiar sense of hatred as I began reading your blog and clued in that we were going to discuss The Prisoner's Dilemma. I considered skipping all the middle bit and reading the concluding paragraph, but I thought I might miss some erudite little gem.
    I could have read the first and last paragraph and been totally impressed. Tomorrow does not ever come and this truth is the recovering addict's best friend.

