Here are three science facts that I think are extremely underrated (by which I mean they’re relatively badly understood, even by my unusually science-literate friends). Most feel fairly obvious in hindsight.
I am not a qualified expert (or an unqualified expert) on any of these topics, this is all the result of me reading science for fun and then talking to friends and tutors about it. I’ve talked to several experts about all of these, so I’m pretty sure I have the general picture right, but I probably have a few of the details wrong, especially for the biology ones. I didn’t check this post by anyone before posting it.
The quantum mechanical behavior of electrons does most of the work towards explaining most of the features of the world around us.
QM of electrons results in basically all of chemistry, in materials being solids or liquids or gases, and hard or soft or squishy or springy, and red or green or transparent or shiny. Life on Earth works in ways that are intimately related to quantum mechanics of electrons, with other areas of physics appearing as minor details (eg the Newtonian mechanics that do an excellent job of describing how nuclei move around and interact, or the nuclear forces which cause nuclei to be stable but which don’t have much of an active role in everyday life) or as backdrops (eg the Sun being large and hot for a long time, which heats up the chemistry for long enough to evolve life).
In my experience, physicists are also weirdly unknowledgeable about this fact. Physics PhD students who I talk to often have basic misconceptions in this area, for example when they believe that solid objects can’t pass through each other because their electrons repel each other (this is false because this repulsive force is countered by the attraction the electrons feel to the nuclei in the other object; in fact the net electric force resulting from atom being near is attractive rather than repulsive, this is what a London dispersion force is).
I am not sure why this is so underrated. I think it’s partially because physics undergrad degrees don’t have compulsory courses in quantum chemistry and so people aren’t forced to confront it. Also, some of this physics is bizarrely recent. For example, it turns out that objects are solid because of the Pauli exclusion principle; this was first proven in 1966 in “Stability of Matter I and II” by Dyson and Lenard, which implies in the introduction that this explanation for stability was novel. [EDIT: Actually, based on the introduction to this paper, I now think that this explanation of stability was older, and the novel results there were just the proofs that the Pauli exclusion principle is necessary for stability.]. I feel like 1966 is shockingly recent for an answer to such a basic question about how the world works.
I think that it would be possible to write a much more efficient explanation of this than currently exists, for an audience of people who are comfortable with math and who want to know the concepts but don’t need to be able to compute efficiently with them. But that explanation doesn’t exist. I learned this by struggling through a smattering of different sources; one particularly helpful source was the chapters of Thijssen’s “Computational Physics” on the Hartree-Fock method.
Also, I still don’t understand how exactly you calculate the visual properties of a material from the QM of the electrons; if you think you know the whole story there, I’d love to hear it from you sometime.
Biological life involves complicated machinery to allow cells to do different things in different circumstances
This is totally obvious, but for some reason I found it really fascinating when I first learned about it.
Here are some things that cells can do:
- Make a protein only if some chemical is or isn’t present in the cell. And various more complicated versions of this; for example, the most famous example of this kind of gene regulation is that E. coli produces an enzyme to digest lactose only if there is lactose present and there isn’t glucose present.
- Do something with a time delay–eg, do it once a day, or do it after twenty minutes.
- Split into two cells, in such a way that both daughter cells have the same behavior. Eg when your liver is growing, the cells that grow there need to turn into liver cells and not skin cells or hair follicles or whatever.
The big “a-ha” moment for me here was the following. The basic story of cells is that genes get read and turned into proteins, which are then used to do stuff; a key part of the stuff that proteins do is affect how much various genes are producing proteins. This kind of system, where some genes produce proteins which affect how much other genes are expressed, is called a gene regulatory network.
In hindsight it is totally obvious that cells have to have some kind of mechanism for doing this stuff, but I hadn’t thought about how powerful it has to be.
If you want to learn about this, you should read chapter 8 of the excellent book “Essential Cell Biology”, which is one of my favorite textbooks ever; if you want more details, I recommend “Molecular Biology of the Cell”, which is by the same authors but is much longer.
Learning about this affects my intuitions about how complex evolved systems work, which is relevant for thinking about powerful ML systems. I somewhat confidently expect ML systems to look more like gene regulatory networks over time, eg with more emphasis placed on learning a mapping from internal state to internal state.
Biological life is heavily optimized for evolvability
When I was a kid, I was told that evolution is like shooting a gun at your car in the hope that it would make the car go faster–it almost never makes the car go faster, but sometimes it does, and evolution got to shoot at the car a lot of times.
This is extremely inaccurate. Shooting at a car is vanishingly unlikely to make your car go faster, and will not efficiently search the space of possible cars. In contrast, biological evolution is set up to maximize the probability that mutations are helpful.
Here are some mechanisms for this:
- Genes mutate in ways that are much more helpful than random insertions/deletions/edits for generating novel useful results. For example, gene duplication is somewhat common, in which you copy a gene twice, which allows the two copies to evolve independently.
- Genes are separated by long strands of noncoding (“junk”) DNA, and also have long sequences of noncoding DNA inside them. This is helpful because it means that if you randomly copy one section of a chromosome into another, the large noncoding sections mean that you have a pretty high probability of not cutting from or pasting into the middle of a part of a gene which is coding for a protein. (Analogously, if you wanted to make new sentences by taking two strands of paper with text on them, and randomly cutting a section from one and pasting it into another, you’re a lot less likely to cut in the middle of a word if there is a lot of spacing between the words.)
- Even better, some of the individual sections which make up genes seem to correspond to “modules” which can be assembled together to make proteins which are likely to be structurally stable and which might be useful. This might have been an important component of how multicellular life evolved. (see László Patthy, “Genome evolution and the evolution of exon-shuffling — a review”. I paid Tegan McCaslin to research this for a few hours for me earlier this year; if you want I can share you on the notes she made.)
- Evolution works well with all the stuff about gene expression and gene regulatory networks I mentioned in the previous section. In particular, small mutations to the promoter sequence of a gene (the region of the gene near the transcription start site which can produce small changes in the amount that the gene is expressed. This makes it easy for evolution to mess around with the relative strengths of different processes.
- Genes can be read multiple different ways to produce different proteins–genes often have sections which are only sometimes included in the RNA produced from the gene, and sometimes genes contain several different sections of which only one is included in the RNA. This process is called alternative splicing. Alternative splicing is used in 95% of human genes which contain multiple sections of coding DNA. This allows evolution to efficiently experiment with variations on proteins without having to destroy the ability to produce the normal version of the protein.
- In bacteria, there are lots of mechanisms for gene transfer between unrelated cells: among other things, they make little tubes and send pieces of DNA back and forth to each other. This allows unrelated species of bacteria to transmit eg antibiotic resistance to each other. (Also, at least some bacteria keep a blacklist of DNA segments which are known to occur in viruses and then remove those sequences from their genome if they see them. This is insanely elaborate and cool.)
Basically all the facts here are discussed in chapter 9 of “Essential Cell Biology”.
All of this makes me feel it’s more likely that we’ll make powerful ML systems by finding clever ways to search more efficiently through the space of network parameters and architectures and so on.