Newcomb's paradox
Apr. 15th, 2009 11:11 pm

Links
Newcomb's paradox on Wikipedia
Newcomb's paradox on Overcoming Bias blog
Newcomb's paradox on Scott Aaronson's blog and lectures
I first came across it via Overcoming Bias, and discussed it with a few people, but recently saw it again in one of the transcriptions of Scott Aaronson's philosophy/quantum/computing lectures.
Newcomb's paradox
In brief, Newcomb's paradox goes like this. Suppose you're a professor, and a grad student (or, in some versions, a superintelligent alien) comes to you and demonstrates this experiment. She chooses a volunteer, examines them, then takes two boxes, puts £1000 in box A and either £1000000 or nothing in box B (see below for how she decides). She brings the boxes into the room, explains the set-up to the volunteer, and says they're allowed either to take the mystery box B alone (in which case they get either a lot or nothing) or to take both boxes (in which case they get at least £1000).
She even lets them see the £1000000 beforehand so they know it exists, and lets them peek into box A to show the money really is in it, though the contents of box B remain a secret until afterwards.
| Choice | In box A | In box B | Total obtained |
| B only | £1000 | £0 | £0 |
| Both | £1000 | £0 | £1000 |
| B only | £1000 | £1000000 | £1000000 |
| Both | £1000 | £1000000 | £1001000 |
"What's the catch," the volunteer asks. "Ah," begins the experimenter. "I have previously examined you, and worked out which choice you're going to make. If you were going to choose both boxes, I put nothing in box B. Only if you were going to take box B only, did I choose to put £1000000 in it.
"Hm", says the volunteer. "What do I do?"
A few caveats
"What if the volunteer would change their mind when they discovered the reasoning, or is going to choose based on a coin toss?" "Then I didn't accept them as a volunteer."
"How do I know it works?" You can't be sure, but she performs the experiment lots of times and is always right, so you are convinced. (Some examples ask you to presume as part of the conditions that she can, or take it on trust, but I think "having seen it work" makes it most convincing and concrete.)
"Ah, but I don't care about £1000, and certainly not if I've got £1000000, so I don't care," Well, ask what you would do if the numbers were a bit different. Can you pretend there's no combination where you'd risk something to get the little one, yet risk more to get the big one?
"How did they know what they'd pick when this experiment was performed the first time?" It doesn't really matter, just assume that you have seen it working with pretty-perfect prediction.
What would you do? An enumeration of the two obvious arguments.
11: "Why should you take both boxes?" Duh! Because whatever's in either of the boxes, you get all of it. And if that means I fail to get the million, then it's already too late to change that, isn't it?
22: "Why should you take box B?" Duh! Because you've just seen 50 people do the experiment, and all the ones who took both got £1000 and all the ones who took B got £1000000. Follow what works, even if you can't justify it with maths.
That's why it's a paradox: squint long enough and both answers seem perfectly reasonable.
I know this seems a little convoluted, but I've tried to make it comprehensible, if terse, even to people without much training in (or regard for) philosophy, like me in general, and hopefully to get it to the point where at least asking the question makes sense.
Wait: if we use Bayesian reasoning, I bet the arguments will instantly become transparent and non-controversial. Right?
11: As above. Look at the table, and enumerate the possibilities. Whatever is in box B, choosing both boxes always gives the bigger payoff.
22: Ah, no, you're cheating. Based on the previous evidence, you must assume from the start that you are on row 2 or row 3 of the table. After that, the choice is easy: row 3 gives more money. (See below for more "so, there's a 2/3 chance I'm in this universe..." type reasoning.)
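To make that disagreement concrete, here's a minimal sketch of the expected-value calculation behind argument 22. It's my own illustration rather than anything in the original problem: it simply assumes her track record can be read as a probability p that she has guessed your choice correctly.

```python
# Minimal sketch (illustrative assumption): treat the predictor's track
# record as a probability p that she correctly predicted your choice.
def expected_payoffs(p, small=1_000, big=1_000_000):
    # Take only box B: you get the big prize only if she correctly
    # predicted "B only" and therefore filled the box.
    one_box = p * big
    # Take both boxes: you always get the small prize, plus the big one
    # only if she wrongly predicted "B only".
    two_boxes = small + (1 - p) * big
    return one_box, two_boxes

for p in (0.5, 0.6, 0.94, 1.0):
    one, two = expected_payoffs(p)
    print(f"p = {p}: only B -> £{one:,.0f}, both -> £{two:,.0f}")
```

On these figures, taking only box B comes out ahead as soon as p creeps above about 0.5005. The two-boxer's objection, of course, is that this calculation quietly assumes your choice still makes a difference to what is already sitting in the box, which is exactly the point in dispute.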
Can we put this on a more rational footing? How does she predict what's going to happen?
In fact, there are several ways.
1. You can do it if you postulate time travel, or determinism plus a copy-and-teleport machine, but those aren't very realistic things to postulate, whether or not they would be physically possible.
2. A super-intelligent alien scans your brain and models it in a computer.
3. They give you a short-term-memory-impairing drug and try the experiment out several times beforehand, so you remain the same person with the same experiences but have no memory of the trial runs.
4. They discover that, 94% of the time, men choose one way and women choose the other. (But the experiment is double-blind, run by a technician who doesn't know the expected results; the grad student peeks at the data, tells you there is a 94% correlation, but not which way round it is, then invites you to participate yourself.)
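As an aside, even the imperfect correlation in method 4 is enough to drive the expected-value argument above, if (and this is my own reading, not stated in the problem) you take the 94% figure as the chance she gets your particular choice right:

```python
# Back-of-envelope version of the earlier sketch with the method-4 figure,
# assuming 94% is read as the predictor's accuracy on your choice.
p = 0.94
print(p * 1_000_000)                 # only box B:  expected £940,000
print(1_000 + (1 - p) * 1_000_000)   # both boxes:  expected £61,000
```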
Further arguments
33: Aha! In method 3, you don't know which you are: one of the trial runs, or the final experiment. The only consistent strategy that gets the big money is to assume you're more likely to be a trial run, and hence take only box B.
44: Aha! If so, then (*invokes Greg Egan*) the same reasoning applies to method 2. That suggests you don't know whether you're you, or the simulation of you!
55: Nope, not so. What about method 4? Surely you can't claim that your consciousness might be either (a) you or (b) "the statistical correlation between gender and box choice"?
Which leaves us back where we started. (But remind me to come back to the "am I equally likely to be me, or some other human, or a simulation of a human?" question.)
Free will
"What does it have to do with free will?" Well, the experiment is completely (sort-of) practical to do. In theory. And so you'd think it should be also actually possible to choose which to take. And yet it doesn't seem to be, and the answer seems to depend maybe on whether you believe in something you can call "free will".
In fact, people divide between "take A and B", "take B only", and "the problem is stupid, I won't consider it". In general, I think the last answer is too often overlooked. In this case, though, if I'd seen the experiment work out like that, I'd agree to take only box B, even if I couldn't explain the mathematics behind it. However, I also definitely feel I should be able to justify one choice or the other.
Informally, most people seem eventually to take only box B, but I don't know how significant that is.
Apocrypha
Links to prisoner's dilemma, links to doomsday paradox, etc, etc.
no subject
Date: 2009-04-16 10:10 am (UTC)
My previous favourite answer was out of Hofstadter's article on the subject, which was that one-boxing gives you a choice between two desirable outcomes – either you get the big cheque, or you get to catch the predictor out in a mistake and show that it was fallible after all. Several people in the comments here have reinvented that one, which is encouraging (Hofstadter asked the question of lots of his friends and was surprised nobody used that reasoning); the bet-half-a-million idea is a particularly nice tweak on the same thing.
But to some extent all of those answers are sort-of-cheating, in that they're dodging the question. The answers that say "proving the predictor wrong is also desirable" or "I start by making side bets" are essentially adjusting the outcome grid into something they can work with more easily, and hence avoiding the real question of "yes, but what would you do if the outcome grid weren't adjusted?". Same goes for answers like "depends how rich I was already" (if I considered £1000 a negligible sum then I might be more inclined to grandstand on the chance of the big cheque by one-boxing, whereas if I was at serious risk of starvation or eviction and £1000 was enough to save me then that might well bias me in favour of minimising my risk by two-boxing). I think that to answer in the spirit of the problem, one has to assume that what's in the boxes is not a thousand and a potential million pounds, but a thousand and a potential million units of pure linearly-valued utility.
As far as my actual personal answer goes, I incline towards "problem is stupid". I'm generally not a fan of using unrealistic hypothetical situations as a means of self-discovery, because (a) they tend to have the problem that your armchair speculation about what you'd do doesn't match what you'd actually do when the emotional stress was real rather than imagined, and (b) they also tend not to tell you what you really wanted to know, in that what they really tell you is which of your decision-making rules of thumb broke down first in the face of being stretched beyond its warranty limit, rather than which one is in some meaningful sense more fundamental to your nature.
So my position for the moment is that my brain's decision-making methodology is currently not equipped to cope with situations involving an almost-perfect Newcomb-predicting entity, and since there is neither such a thing nor a realistic prospect of such a thing I have not yet had reason to consider this an important flaw in that methodology. If one comes along, I may reconsider – but, as you suggest in your post, the right response will probably depend a lot on what I know about how the predictor works.
(In the even more unlikely situation that such a predictor existed but all I knew about it was that it had a track record established by unimpeachable empirical research, I suppose I'd have to consider all the possible mechanisms by which it might work and think about which was most likely to be the real answer. Which is one of those "pull a prior out of your arse" situations, and no matter how much you dress it up with Bayesian-sounding language the reality is that one just has to take a flying guess and be prepared to accept it as the fortunes of war if one guesses wrong.)
no subject
Date: 2009-04-16 10:38 am (UTC)
Thank you! It jumped right out at me -- I think I said something like "my God, my brain has been eaten by Egan, I can no longer look at anything without hypothesising infinite copies of myself existing in quantum noise..."
It jumped out when I saw the Wikipedia suggestion of the short-term-memory-impairing drug. I was enchanted to see that the experiment might actually be possible, and that in the stm-i drug example there was a clear, unambiguous answer.
From there I went straight to the "is a simulation me" question of #44.
However, #55 shows that either (a) the experiment is meaningless when the method doesn't work, or (b) the "which one am I" argument of #33 fails, because it ought to apply equally well in case #55, but doesn't.
But to some extent all of those answers are sort-of-cheating, in that they're dodging the question.
Yes, exactly. Overcoming Bias has a theme on that, saying "Is that REALLY the reason you chose that answer, or are you rationalising? If the problem were changed to maximally disadvantage that answer, would it still ring true?" That is, invite people to suppose that the £1000 and £1000000 were things they really couldn't do without.
not to tell you what you really wanted to know, in that what they really tell you is which of your decision-making rules of thumb broke down first in the face of being stretched beyond its warranty limit, rather than which one is in some meaningful sense more fundamental to your nature.
Yeah. Particularly with moral questions: you often discover that between two unacceptable choices, neither is acceptable, which isn't really news once you've accepted your moral rules are rules-of-thumb.
But it does seem worth poking your guidelines and seeing what holds up where -- it still seems to me like there OUGHT to be an answer to Newcomb, and it throws light on all sorts of "OK, suppose I'm equally likely to be any human who has X..." questions, which come up elsewhere too.