When EAs look at the history of nuclear weapons, their reactions tend to fall into two camps.

The first camp (which I am inclined towards) is “Man, what a total mess. There were so many near misses, and the people involved did clearly terrible and risky things, like setting up the Dead Hand system, and whatever else. I guess that humans probably can’t be trusted to handle extremely dangerous technology.”

The other camp says “No nuclear weapons have been used in war or detonated accidentally since 1945. This is the optimal outcome, so I guess this is evidence that humanity is good at handling dangerous technology.”

This mostly comes up because people from the other camp tend to give numbers in the 1-10% range for the probability of AI x-risk, while people from my camp tend to give numbers more like 40-80%. I think both camps are roughly equally represented among people who work on x-risk prevention, though the optimists have recently been doing a much more thorough job of arguing for their AI x-risk probabilities than the pessimists have.

When I talk to people from the other camp, I often have a conversation that goes like this:

Me: Okay, but what about all these crazy stories from The Doomsday Machine about extreme recklessness and risk?

Them: I don’t trust stories. It’s really hard to know what the actual situation was. The Doomsday Machine is just one book written by an activist who probably isn’t that reliable (e.g. see his massively exaggerated statements about how dangerous nuclear winter is). There will always be people telling you that something was a disaster. I prefer to look at unambiguous and unbiased evidence. In this particular case, the unbiased, unambiguous questions that we could have bet on in 1945 are things like “How many nuclear weapons will be fired in anger in the next fifty years? How many people will die from nuclear weapons? How many buildings will be destroyed?” And the answer to all of these is zero. Surely you agree that you would have lost money if you’d bet on these with me in 1945?

Me: I agree I would have lost money on that bet. But I still feel that my overall worldview of “people will do wild and reckless things” loses fewer Bayes points than yours does. If we’d bet not just on outcomes but on questions like “will someone build a doomsday machine” or “will countries take X measure to reduce the probability of accidental nuclear war”, I would have won money off you on almost all of those. My worldview would have won most of the bets.

Them: Except for the only bet that is unambiguously connected to the thing we actually care about.

Me: Yeah, but I don’t know if I care about that? Like, maybe I would have assigned 30% to “no nuclear weapons will be fired”, but it’s not that bad to have something you called 30% likely happen. Whereas I suspect you would have assigned numbers like 5% to a bunch of reckless things that I would have assigned 30% to, which is a much more egregious mistake.
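(An aside on the scoring here: a minimal sketch of the log-score arithmetic behind “a much more egregious mistake”, using the 30% and 5% figures from the dialogue and assuming, purely for illustration, that the other camp put something like 90% on the headline outcome.)

```python
import math

def bayes_points(p_assigned):
    """Log score, in bits, for the probability assigned to an event that happened."""
    return math.log2(p_assigned)

# Headline outcome ("no nuclear weapons fired in anger"), which happened:
print(bayes_points(0.30))  # my 30%             -> about -1.7 bits
print(bayes_points(0.90))  # their (assumed) 90% -> about -0.15 bits

# One reckless thing that did happen (e.g. a doomsday machine getting built):
print(bayes_points(0.30))  # my 30%   -> about -1.7 bits
print(bayes_points(0.05))  # their 5% -> about -4.3 bits, per such event
```

On these assumed numbers, losing the single headline bet costs me roughly 1.6 bits relative to them, while each reckless-thing bet they lose costs them roughly 2.6 bits relative to me, so a handful of those swamps the one headline bet.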

Them: If you read actual writers at the time, like Bertrand Russell, they seem to have assigned very small probabilities to the outcome that actually happened; I think you’re being a bit overly generous about how well you would have done.

Me: Fair.

Them: I feel like your worldview suggests that way more bad things should have happened as a result of coordination failures than have actually happened. Like, I don’t think there are really examples of very bad things happening as a result of coordination failures.

Me: …what? What about climate change or state bioweapons programs or the response to covid?

Them: Climate change isn’t very important, it’s only going to make the world a few percent worse off.

Me: I agree, but firstly I don’t think politicians know that, and secondly they’re still doing much less than would be optimal.

Them: I think we’d do better on problems with actual big stakes.

Me: I don’t see any reason to believe that this is true. It doesn’t seem that we did noticeably better on nuclear weapons than on lower-stakes coordination problems.

Them: I think state bioweapons programs are another example of something where nothing very bad has happened.

Me: What if covid turns out to have been accidentally released from a bioweapons lab?

Them: That will be an update for me.

Me: Why would that be an update? We already know that state bioweapons programs have killed thousands of people through accidental releases, that there’s no particular reason they couldn’t cause much worse disasters, and that international regulation has failed to control them.

Them: [inaudible. I don’t know how to rephrase the thing that people say at this point in the conversation.]

Me: Do you have any criticisms of me that you want to finish up with?

Them: Yeah. I think you’re overly focused on looking at the worst examples of coordination failures, rather than trying to get a balanced sense of our overall strengths and weaknesses. I also think you’re overly focused on stories where things sound like they should have gone terribly, and you’re updating insufficiently on the fact that for some reason, it always seems to go okay in the end; I think that you should update towards the possibility that you’re just really confused about how dangerous things are.

I feel very confused here.

