I’ve recently been spending some time thinking about the rationality mistakes I’ve made in the past. Here’s an interesting one: I think I have historically been too hasty to go from “other people seem very wrong on this topic” to “I am right on this topic”.
Throughout my life, I’ve often thought that other people had beliefs that were really repugnant and stupid. Now that I am older and wiser, I still think I was correct to think that these ideas were repugnant and stupid. Overall, I was probably slightly insufficiently dismissive of things like the opinions of apparent domain experts, and the opinions of smart-seeming people whose arguments I couldn’t really follow. I also overrated conventional wisdom about factual claims about how the world works, though I underrated conventional wisdom about how to behave.
Examples of ideas where I thought the conventional wisdom was really dumb:
- I thought that animal farming was a massive moral catastrophe, and I thought it was a sign of terrible moral failure that almost everyone around me didn’t care about this and wasn’t interested when I brought it up.
- I thought that AI safety was a big deal, and I thought the arguments against it were all pretty stupid. (Nowadays the conventional wisdom has a much higher opinion of AI safety; I’m talking about 2010-2014.)
- I thought that people have terrible taste in economic policy, and that they mostly vote for good-sounding stuff that stops sounding good if you think about it properly for even a minute.
- I was horrified by people proudly buying products that said “Made in Australia” on them; I didn’t understand how that wasn’t obviously racist, and I thought that we should make it much easier for anyone who wants to come live in Australia to do so. (This one has become much less controversial since Trump inadvertently convinced liberals that they should be in favor of immigration liberalization.)
- In the spirit of loudly acknowledging past mistakes: I thought, and still think, that a lot of people’s arguments about why it’s good to call the police on bike thieves were dumb. See eg many of the arguments people made in response to a post of mine about this (which, in fairness, was a really dumb post, IMO).
I think I was right about other people being wrong. However, I think that my actual opinions on these topics were pretty confused and wrong, much more than I thought at the time. Here’s how my opinions have updated on each of the topics above:
- I have updated against the simple view of hedonic utilitarianism under which it’s plausible that simple control systems can suffer. A few years ago, I was seriously worried that the future would contain much more factory farming and therefore end up net negative; I now think that I overrated this fear, because (among other arguments) almost no-one actually endorses torturing animals; we just do it out of expediency, and in the limit of better technology our weak preferences will override our expediency.
- My understanding of AI safety was “eventually someone will build a recursively self-improving singleton sovereign AGI, and we need to figure out how to build it such that it has an off switch and implements some good value function instead of something bad.” I think this picture was massively oversimplified. On the strategic side, I didn’t think about the possibilities of slower takeoffs or powerful technologies without recursive self-improvement; on the technical safety side, I didn’t understand that it’s hard to even build a paperclip maximizer, and that a lot of our effort might go into figuring out how to do that.
- Other people have terrible taste in economic policy, but I think that I was at the time overconfident in various libertarianish ideas that I’m now less enthusiastic about. Also, I no longer think it’s a slam dunk that society is better off as it becomes wealthier, because of considerations related to the far future, animals, and whether more money makes us happier.
- I think that immigration liberalization is more dangerous than I used to think, because rich societies seem to generate massive positive externalities for the rest of the world and it seems possible that a sudden influx of less educated people with (in my opinion) worse political opinions might be killing the goose that lays the golden eggs.
- Re bike thieves: I think that even though utilitarianism is good and stuff, it’s extremely costly to tolerate thievery, because then you have to do all these negative-sum things like buying bike locks. Also, it seems like we’re generally better off if people help enforce laws.
In all of these cases, my arguments against other people’s views were much higher quality than my actual beliefs. More concerningly, I was much better at spotting the holes in other people’s arguments than at spotting the holes in my own.
There’s also a general factor here of me being overconfident in the details of ideas that had some ring of truth to them. Like, the importance of AGI safety seemed really obvious to me, and I think that my sense of obviousness has historically been pretty good at spotting arguments that later stand up to intense scrutiny. But I was massively overconfident in my particular story for how AGI would go down. I should have been more disjunctive: I should have said “It sure seems like something like this ought to happen, and it seems like step three could happen in any of these four possible ways, and I don’t know which of them will be true, and maybe it will actually be another one, but I feel pretty convinced that there’s some way it will happen”.
Here are some other ideas which I continue to endorse which had that ring of truth to them, but whose details I’ve been similarly overconfident about. (Some of these are pretty obscure.)
- The simulation hypothesis
- UDASSA
- The malignancy of the universal prior
- The mathematical universe hypothesis
- Humans have weird, complex biases related to categories like race and gender, and we should be careful about this in our thinking. (Nowadays this idea is super widespread, so it feels weird to put it in the same list as these other crazy ideas. But when I first seriously encountered it in my first year of college, it felt like a new and interesting idea, in the same category as many of the cognitive biases I heard about on LessWrong.)
And here are ideas which had this ring of truth to them that I no longer endorse:
- We should fill the universe with hedonium.
- The future might be net negative, because humans so far have caused great suffering with their technological progress and there’s no reason to imagine that this will change. Futurists are biased against this argument because they personally don’t want to die and have a strong selfish desire for human civilization to persist.
- Because of Landauer’s limit, civilizations have an incentive to aestivate. (This one is wrong because it involves a misunderstanding of thermodynamics; the argument is sketched just below.)
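For readers who haven’t run into it, here is roughly the argument I had in mind, sketched from Landauer’s bound (my paraphrase, not the full case made in the aestivation literature):

```latex
% Landauer's bound: erasing one bit of information at temperature T
% dissipates at least
\[
  E_{\min} = k_B T \ln 2 .
\]
% The cosmic background temperature falls over time, so a fixed energy
% budget E buys roughly
\[
  N \approx \frac{E}{k_B T \ln 2}
\]
% bit-erasures, which grows as T falls; hence, the argument went,
% civilizations should wait ("aestivate") until the universe is colder
% before doing their computing.
```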
My bias towards thinking my own beliefs are more reasonable than they are would be disastrous if it prevented me from changing my mind in response to good new arguments. Luckily, I don’t think that I am particularly biased in that direction, for two reasons. Firstly, when I’m talking to someone who thinks I’m wrong, for whatever reason I usually take them pretty seriously and I have a small crisis of faith that prompts me to go off and reexamine my beliefs a bunch. Secondly, I think that most of the time that people present an argument which later changes my mind, my initial reaction is confusion rather than dismissiveness.
As an example of the first: Once upon a time I told someone I respected that they shouldn’t eat animal products, because of the vast suffering caused by animal farming. He looked over scornfully and told me that it was pretty rich for me to say that, given that I use Apple products—hadn’t I heard about the abusive Apple factory conditions and how they have nets to prevent people killing themselves by jumping off the tops of the factories? I felt terrified that I’d been committing some grave moral sin, and then went off to my room to research the topic for an hour or two. I eventually became convinced that the net effect of buying Apple products on human welfare is probably very slightly positive but small enough to not worry about, and also it didn’t seem to me that there’s a strong deontological argument against doing it.
(I went back and told the guy about the result of me looking into it. He said he didn’t feel interested in the topic anymore and didn’t want to talk about it. I said “wow, man, I feel pretty annoyed by that; you gave me a moral criticism and I took it real seriously; I think it’s bad form to not spend at least a couple minutes hearing about what I found.” Someone else who was in the room, who was very enthusiastic about social justice, came over and berated me for trying to violate someone else’s preferences about not talking about something. I learned something that day about how useful it is to take moral criticism seriously when it’s from people who don’t seem to be very directed by their morals.)
Other examples: When I first ran across charismatic people who were in favor of deontological values and social justicey beliefs, I took those ideas really seriously and mulled them over a lot. A few weeks ago, someone gave me some unexpectedly harsh criticism about my personal manner and several aspects of how I approach my work; I updated initially quite far in the direction of their criticism, only to update 70% of the way back towards my initial views after I spent ten more hours thinking and talking to people about it.
Examples of the second: When I met people whose view of AI safety didn’t match my own naive view, I felt confused and took them seriously (including when they were expressing a bunch of skepticism of MIRI). When Howie Lempel told me he thought the criminal justice system was really racist, I was surprised and quickly updated my opinion to “I am confused about this”, rather than dismissing him.
I can’t think of cases where I initially thought an argument was really stupid but then it ended up convincing either me or a majority of people who I think of as my epistemic peers and superiors (eg people who I think have generally good judgement at EA orgs).
However, I can think of cases where I initially felt that an argument was dumb, but lots of my epistemic peers think it’s at least sort of reasonable. I am concerned by this and I’m trying to combat it. For example, the following arguments are on my current list of things that I’m worried I’m undervaluing because they initially seem implausible to me, and that I plan to eventually look into more carefully: Drexler’s Comprehensive AI Services; AI safety via ambitious value learning; and arguments that powerful AI won’t lead to a singleton.
Please let me know if you have examples along these lines where I seemed dumber than I’m presenting here.
Here’s another perspective on why my approach might be a problem. I think that people are often pretty bad at expressing why they believe things, and in particular they don’t usually say “I don’t know why I believe this, but I believe it anyway.” So if I dismiss arguments that suck, I might be dismissing useful knowledge that other people have gained through experience.
I think I’ve made mistakes along these lines in the past. For example, I used to have a much lower opinion of professionalism than I now do. And there are a couple of serious personal mistakes I’ve made where I looked around for the best arguments against doing something weird I wanted to do, and all of those arguments sucked, and then I decided to do the weird thing, and then it was a bad idea.
Katja Grace calls this mistake “breaking Chesterton’s fence in the presence of bull”.
This would suggest the heuristic “Take received wisdom into account, even if, when you ask people where it comes from, they point to a source that seems extremely unreliable”.
I think this heuristic is alright but shouldn’t be an overriding consideration. The ideas that evolve through the experience of social groups are valuable because they’re somewhat selected for truth and importance. But the selection process for these ideas is extremely simple and dumb.
I’d expect that in most cases where something is bad, there is a legible argument for why we shouldn’t do it (where I’m including arguments from empirical evidence as legible arguments). I’d prefer to just learn all of the few things that society implicitly knows, rather than giving up every time it disagrees with me.
Maybe this is me being arrogant again, but I feel like the mistake I made with the bike-stealing thing wasn’t refusing to bow to social authority; it was not trying hard enough to think carefully about the economics of the situation. My inside view is that if I now try to think about economics, I don’t need to incorporate that much outside-view-style discounting of my own arguments.
I have the big advantage of being around people who are really good at articulating the actual reasons why things are bad. Possibly the number one strength of the rationalist community is creating and disseminating good explicit models of things that are widely but implicitly understood (eg variants of Goodhart’s law, Moloch, Chesterton’s fence, the unilateralist’s curse, “toxoplasma of rage”). If I were in any other community, I worry that I’d make posts like the one about the bike, and no-one would be able to articulate why I was wrong in a convincing way. So I don’t necessarily endorse other people taking the strategy I take.
I am not aware of that many cases where I believed something really stupid because all the common arguments against it seemed really dumb to me. If I knew of more such cases, I’d be more worried about this.
Claire Zabel says, in response to all this:
I’d say you’re too quick to buy a whole new story if it has the ring of truth, and too quick to ask others (and probably yourself) to either refute on the spot, or accept, a complex and important new story about something about the world, and leave too little room to say “this seems sketchy but I can’t articulate how” or “I want to think about it for a while” or “I’d like to hear the critics’ counterarguments” or “even though none of the above has yielded fruit, I’m still not confident about this thing”.
This seems plausible. I spend a bunch of time trying to explain why I’m worried about AI risk to people who don’t know much about the topic. This requires covering quite a lot of ground; perhaps I should try harder to explicitly say “by the way, I know I’m telling you a lot of crazy stuff; you should take as long as it takes to evaluate all of this on your own; my goal here is just to explain what I believe; you should use me as a datapoint about one place that human beliefs sometimes go after thinking about the subject.”
I feel like my intuitive sense of whether someone else’s argument is roughly legit is pretty good, and I plan to keep trusting it when it tells me that someone is being dumb. But I am trying not to make the jump from “I think that this argument is roughly right” to “I think that all of the steps in this fleshed-out version of that argument are roughly right”. Please let me know if you think I’m making that particular mistake.