A bolt of excitement ran through the field of cardiology in the early 1980s when anti-arrhythmia drugs burst onto the scene. Researchers knew that heart-attack victims with steady heartbeats had the best odds of survival, so a medication that could tamp down irregularities seemed like a no-brainer. The drugs became the standard of care for heart-attack patients and were soon smoothing out heartbeats in intensive care wards across the United States.
But in the early 1990s, cardiologists realized that the drugs were also doing something else: killing about 56,000 heart-attack patients a year. Yes, hearts were beating more regularly on the drugs than off, but their owners were, on average, a third more likely to die. Cardiologists had been so focused on immediately measurable arrhythmias that they had overlooked the longer-term but far more important variable of death.
The fundamental error here is summed up in an old joke scientists love to tell. Late at night, a police officer finds a drunk man crawling around on his hands and knees under a streetlight. The drunk man tells the officer he’s looking for his wallet. When the officer asks if he’s sure this is where he dropped the wallet, the man replies that he thinks he more likely dropped it across the street. “Then why are you looking over here?” the befuddled officer asks. “Because the light’s better here,” explains the drunk man.
That fellow is in good company. Many, and possibly most, scientists spend their careers looking for answers where the light is better rather than where the truth is more likely to lie. They don’t always have much choice. It is often extremely difficult or even impossible to cleanly measure what is really important, so scientists instead cleanly measure what they can, hoping it turns out to be relevant. After all, we expect scientists to quantify their observations precisely. As Lord Kelvin put it more than a century ago, “When you can measure what you are speaking about, and express it in numbers, you know something about it.”
There is just one little problem. While these surrogate measurements yield clean numbers, they frequently throw off the results, sometimes dramatically so. This “streetlight effect,” as I call it in my new book, Wrong (Little, Brown), turns up in every field of science, filling research journals with experiments and studies that directly contradict previously published work. It is a tradition that was already well established back in 1915 when an important experiment led by a rather prominent young physicist named Albert Einstein was published. To pin down the ratio of an electron’s magnetic moment to its angular momentum (the gyromagnetic ratio), Einstein had to infer what the electrons in an iron bar were up to based on a minuscule rotation their activity caused the bar to make. His answer was off by a factor of two, as corrected by more careful, but similarly inferential, experiments three years later. (What a loser!)
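For the technically inclined, the quantity at stake fits on one line; the notation below is the standard modern textbook form, not Einstein’s own (g is the dimensionless g-factor, μ the magnetic moment, L the angular momentum, e and m_e the electron’s charge and mass):

\[
  \gamma \;=\; \frac{\mu}{L} \;=\; g\,\frac{e}{2m_e}
\]

A purely orbital, classical picture predicts g = 1, which is roughly what the 1915 measurement reported; later experiments converged on g ≈ 2, the value now credited to electron spin. That gap is the factor of two.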
Physicists have a good excuse for huddling under the streetlight when they are pushing at the limits of human understanding. But the effect also vexes medical research, where you might think great patient data is there for the tabulating. The story of the anti-arrhythmia drugs only hints at the extent of the problem. In 2005, John Ioannidis of the University of Ioannina in Greece examined the 45 most widely cited clinical studies published in the top medical journals since 1990 and found that about one-third of them had later been contradicted or shown to claim substantially stronger effects than follow-up research could confirm. If one were to look at all medical studies, it would be more like two-thirds, he says. And for some kinds of leading-edge studies, like those linking a disease to a specific gene, wrongness infects 90 percent or more.
We should fully expect scientific theories to frequently butt heads and to wind up being disproved sometimes as researchers grope their way toward the truth. That is the scientific process: Generate ideas, test them, discard the flimsy, repeat. In fact, testing ideas is supposed to be the core competence of most scientists. But if tests of the exact same idea routinely generate differing, even opposite, results, then what are we humble nonscientists supposed to believe?
I have spent the past three years examining why expert pronouncements so often turn out to be exaggerated, misleading, or flat-out wrong. There are several very good reasons why that happens, and one of them is that scientists are not as good at making trustworthy measurements as we give them credit for. It’s not that they are mostly incompetents and cheats. Well, some of them are: In several confidential surveys spanning different fields, anywhere from 10 to 50 percent of scientists have confessed to perpetrating or being aware of some sort of research misbehavior. And numerous studies have highlighted remarkably lax supervision of research assistants and technicians. A bigger obstacle to reliable research, though, is that scientists often simply cannot get at the things they need to measure.
Examples of how the streetlight effect sends studies off track are ubiquitous. In many cases it is painfully obvious that scientists are stuck with surrogate measures in place of what they really want to quantify. After decades of dueling studies about whether it was an asteroid or volcanic eruptions that did in the dinosaurs, it is apparent that the mineral-deposit evidence is indirect and open to interpretation, even if the scientists advancing the various claims sound pretty sure of themselves. Astronomers enlist surrogate measures all the time, since there is no way to stick thermometers in stars or to unreel tape measures to other galaxies. Likewise, economists cannot track the individual behaviors of billions of consumers and investors, so they rely on economic indicators and extracts of data.
How reliable are the results? In 1992 a now-classic study by researchers at Harvard and the National Bureau of Economic Research examined papers from a range of economics journals and determined that approximately none of them had conclusively proved anything one way or the other. Given that dismal assessment—and given the great influence of economists on financial institutions and regulation—it’s a wonder the global economic infrastructure is not in far worse shape. (Of course, scientific findings that point out the problems with scientific findings are fair game for reanalysis too.)
By far the most familiar and vexing consequences of the streetlight effect show up in those ever-shifting medical findings. Take this straightforward and critical question: Can vitamin D supplements lower the risk of breast, colon, and other cancers? Yes, by as much as 75 percent, several well-publicized studies have concluded over the past decade. No, not at all, several other equally well-publicized studies have concluded. In 2008 alone, around 380 published research articles addressed the link between vitamin D and cancer in one way or another. The ocean of data on the topic is vast, swelling, and teeming with sharp contradictions.