Health statistics may be hazardous to our mental health. Inundated by numbers purporting to predict everything from our likelihood of dying from cancer to our chances of contracting AIDS, we respond with a curious range of reactions that rarely reflect the true nature of the alleged risk. We ignore real dangers while reacting emotionally to phantoms; we blithely accept dubious conclusions while disbelieving sensible ones; or we simply (or not so simply) misinterpret the numbers. (The National Council on Unchallengeable Statistics reports that 88.47 percent of us have one of these five reactions 5.61 times a day, leading to 452,888,988,750 cases of dyscalculia recorded in this country annually.)
Part of the problem lies in our psychological inability to confront either numbers or health hazards with anything approaching objectivity. Another aspect of the problem is mathematical; it stems from our ignorance about the oddities of statistical analysis itself. A third is factual: if we don't know how the statistics in question were obtained, then we can't possibly know what they really mean.
The psychological obstacles to rational understanding of statistics are the most familiar. Consider the near panic that ensued last year when a guest on a national talk show blamed his wife's recent death from brain cancer on her use of a cellular telephone--a case that in some ways serves as a paradigm for many recent health scares. The man alleged that there was a causal connection between his wife's frequent use of their cellular phone and her subsequent brain cancer. He sued (the case is still pending), and the concomitant media delirium created fear, confusion, and a decline in the stock prices of the companies that manufacture cellular phones.
The power of a dramatic anecdote in this instance clouded the commonsense distinction between the incidence of some condition and the absolute number of cases. Brain cancer is a rare disease; it strikes roughly 7 people out of 100,000 a year. Yet because the U.S. population is so large, that rate still amounts to some 17,500 new sufferers annually. Oddly, the real statistical relationship between brain cancer and cellular phones seems to argue that these devices actually inhibit the formation of brain tumors. The argument: There are an estimated 10 million users of cellular phones in this country. Multiplying 10 million by 7/100,000, we determine that approximately 700 brain tumors should be expected annually among users of these devices. Since only a handful have come to public attention, we should conclude that cellular phones might even effectively ward off brain tumors. Absurd, to be sure, but no more so than the reasoning behind the original hysteria.
Clearly, the appeal of some statistics has little to do with the validity of the numbers themselves. We show a psychological preference for believing and remembering statistics that are nice round numbers--especially when they are multiples of 10. These numbers become a part of the statistical folklore even though in some cases they have no known origin; even when they do, few people (if any) can explain precisely what they mean. It has been maintained for years, for example, that we each use only about 10 percent of our brain capacity; that the condom failure rate is 10 percent; and until just last year, that 10 percent of Americans are homosexual. Such statistics are partly artifacts, I suspect, of our decimal system; in a base 12 system, we'd no doubt show a similar affinity for statistics that were multiples of 8.333 percent.
This psychomathematical stew gets even more murky when we throw in large numbers or unfamiliar elements. Drugs are a scourge, no doubt, but we tend to ignore the biggest killers--tobacco (400,000 deaths annually) and alcohol (90,000 annually)--and become alarmed by more exotic substances such as cocaine and heroin, even though abuse of all illicit drugs combined results in about 20,000 deaths a year. Similarly, many people fear nuclear power plants, yet the prosaic problem of lead in old paint and pipes has caused far more harm. Likewise, California biochemist Bruce Ames has estimated that we ingest 10,000 times as much natural pesticide (hydrazines in mushrooms, aflatoxins in peanuts) as we do man-made residues, yet no one rides around with a bumper sticker reading NO PEANUTS.
Misunderstandings about the mathematics of statistics create a different set of confusions. Consider Simpson's paradox, an easily made arithmetic error that has a number of real-world consequences. The error consists in concluding that if one averages several sets of numbers, and then averages these averages, the number one gets is the average of all the numbers. This seeming paradox applies to percentages as well. Thus if a study indicates that the health of 36 percent of one ethnic group (call them greens) and 45 percent of another ethnic group (call them reds) improves from some treatment, and a second study indicates that the health of 60 percent of greens and 65 percent of reds improves, it can't be concluded that a higher percentage of reds improves. The first study might, for example, have included 100 greens and 1,000 reds, while in the second study these numbers might have been reversed. Since newspapers frequently bombard us with percentages of ethnic groups that are vulnerable to certain maladies, Simpson's paradox threatens to cause a veritable epidemic of dyscalculia.
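For readers who like to check such claims by machine, here is a minimal sketch of the greens-and-reds example. The group sizes are the hypothetical ones suggested above (100 greens and 1,000 reds in the first study, reversed in the second); the function names are mine, not part of any study.

```python
from fractions import Fraction

# Hypothetical group sizes from the example above:
# study 1 has 100 greens and 1,000 reds; study 2 reverses them.
# Each entry is (number of participants, fraction who improved).
studies = [
    {"greens": (100, Fraction(36, 100)), "reds": (1000, Fraction(45, 100))},
    {"greens": (1000, Fraction(60, 100)), "reds": (100, Fraction(65, 100))},
]

def average_of_rates(group):
    """The tempting shortcut: average the two percentages directly."""
    return sum(s[group][1] for s in studies) / len(studies)

def pooled_rate(group):
    """Improvement rate over all participants of one group, both studies combined."""
    improved = sum(s[group][0] * s[group][1] for s in studies)
    total = sum(s[group][0] for s in studies)
    return improved / total

for group in ("greens", "reds"):
    print(group,
          "average of percentages:", float(average_of_rates(group)),
          "pooled:", round(float(pooled_rate(group)), 3))
```

Under these assumed sizes the shortcut favors the reds (55 percent against 48 percent), while the pooled figures favor the greens (636 of 1,100, about 58 percent, against 515 of 1,100, about 47 percent).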
A similar error might creep into a story on, say, welfare reform. HALF OF WELFARE RECIPIENTS ARE LONG TERM, the headline might read, based on the following data: Mr. Green, let's say, was on welfare for years. In January he picked up his usual public assistance check. So did Mrs. Blue. By February, Blue was off the welfare rolls, but Ms. Orange received assistance for the first time. In March, Orange got a job and stopped receiving payments, but Mr. Purple signed up for assistance. Poor Mr. Green, meanwhile, continued to receive benefits throughout the year. If one examines the record for any given month, one finds that 50 percent of the people--Mr. Green and one other person--were chronic welfare recipients. Yet only one-thirteenth of the people who received welfare that year remained on the rolls for long.
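The welfare example can be tallied the same way. The sketch below assumes the pattern described above continues all year: Mr. Green every month, plus a different short-term recipient each month. The placeholder names and the rule that "long term" means "on the rolls all 12 months" are illustrative assumptions.

```python
from collections import Counter

MONTHS = 12

# Hypothetical roster: Green is on the rolls every month, joined by a
# different one-month recipient each time (Blue, Orange, Purple, ...).
rolls = [{"Green", f"short_term_{m}"} for m in range(1, MONTHS + 1)]

# Count how many months each person appears on the rolls.
months_on_rolls = Counter(name for roll in rolls for name in roll)

# Assumption for this sketch: "long term" means on the rolls all year.
long_term = {name for name, n in months_on_rolls.items() if n == MONTHS}

# Snapshot view: share of each month's roll that is long term.
monthly_share = [len(roll & long_term) / len(roll) for roll in rolls]

# Whole-year view: share of everyone who ever received benefits.
yearly_share = len(long_term) / len(months_on_rolls)

print("share in any one month:", monthly_share[0])
print("people on the rolls during the year:", len(months_on_rolls))
print("share over the whole year:", round(yearly_share, 3))
```

Every monthly snapshot shows one half, yet only 1 of the 13 people counted over the year is a long-term recipient. Both figures are correct; they answer different questions.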
The mathematical notion of conditional probability also has a way of tripping people up. The probability that someone speaks English given that he or she is a U.S. citizen is, let's assume, 95 percent. The conditional probability that someone is a U.S. citizen given that he or she speaks English is much less, say 20 percent. Misunderstanding conditional probability can lead us to draw inaccurate conclusions about critical health-care issues. Consider, for example, this situation: you've taken a test for dread disease D (perhaps dyscalculia), and your doctor has solemnly advised you that you've tested positive. How despondent should you be?
To see that cautious optimism may be appropriate, suppose there is a test for disease D that is 99 percent accurate in the following sense: if someone has D, the test will be positive 99 percent of the time, and if one doesn't have it, the test will be negative 99 percent of the time. (For simplicity I assume the same percentage holds for both positive and negative tests.) Suppose further that .1 percent--one out of every 1,000 people--actually has this rare disease.
Let's assume now that 100,000 tests for D are administered. Of these, how many are positive? On average, 100 of these 100,000 people (.1 percent) will have D, and so, since 99 percent of the 100 will test positive, we will have, on average, 99 true positive tests. Of the 99,900 healthy people, 1 percent will test positive, yielding 999 false positives. Thus, of the total of 1,098 positive tests (999 + 99), most (999) are false positives; therefore the conditional probability that you have D given that you tested positive is 99/1,098, or a bit over 9 percent, and this for a test that was assumed to be 99 percent accurate! To reiterate, the conditional probability that you test positive given that you have D is 99 percent, yet only about 9 percent of those with positive tests will have D.
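The same count can be done in a few lines of code. This is only a sketch of the arithmetic for disease D under the assumptions already stated (prevalence of .1 percent, a test that is right 99 percent of the time for both the sick and the healthy, 100,000 people tested); the function and parameter names are my own.

```python
from fractions import Fraction

def count_positives(tested, prevalence, sensitivity, specificity):
    """Return (true positives, false positives, P(disease | positive test))."""
    sick = tested * prevalence
    healthy = tested - sick
    true_pos = sick * sensitivity            # sick people correctly flagged
    false_pos = healthy * (1 - specificity)  # healthy people wrongly flagged
    return true_pos, false_pos, true_pos / (true_pos + false_pos)

true_pos, false_pos, chance_sick = count_positives(
    tested=100_000,
    prevalence=Fraction(1, 1000),     # .1 percent have the disease
    sensitivity=Fraction(99, 100),    # positive 99 percent of the time if sick
    specificity=Fraction(99, 100),    # negative 99 percent of the time if healthy
)

print("true positives:", true_pos)
print("false positives:", false_pos)
print("chance of having D given a positive test:", round(float(chance_sick), 4))
```

Raising the prevalence argument shows how quickly the picture changes: the rarer the disease, the more the false positives among the healthy majority swamp the true positives.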
The whole panoply of statistical tests, estimates, and procedures contains nuances seemingly tailor-made (or perhaps mathematician-made) to confuse the unwary. Determining, for example, whether clusters of a particular disease constitute evidence of something seriously awry--or merely a coincidental clumping--is not easy. You may notice, for instance, that a lot of people in your neighborhood seem to be getting brain cancer. But random distributions are not homogeneous. That is, a perfectly even distribution of brain cancer victims across all 50 states would be highly unlikely--far more unlikely than the chance concentrations that occur here and there. (A more familiar example is tossing a coin. Even though the result of the toss--heads or tails--is completely random, you would not expect to get a perfect string of head, then tail, then head, then tail. You would expect strings of all heads or all tails--sometimes even long ones.)
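Skeptics can watch the clumping happen. The following sketch simulates fair coin tosses and reports the longest unbroken streak of identical outcomes; the number of tosses and the random seed are arbitrary choices of mine, and a different seed will give a different streak.

```python
import random
from itertools import groupby

def longest_run(tosses):
    """Length of the longest unbroken streak of identical outcomes."""
    return max((len(list(streak)) for _, streak in groupby(tosses)), default=0)

def simulate(n_tosses, seed):
    """Toss a fair coin n_tosses times using a seeded generator."""
    rng = random.Random(seed)
    return ["H" if rng.random() < 0.5 else "T" for _ in range(n_tosses)]

tosses = simulate(200, seed=0)
print("".join(tosses[:40]))
print("longest streak in 200 tosses:", longest_run(tosses))
```

A perfectly alternating sequence would have a longest streak of 1. Run the simulation with several seeds and streaks of five, six, or more turn up routinely, with no cause behind them but chance.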
Moreover, most people don't realize that what's critical about a random sample is its absolute size, not its percentage of the population. Although it seems counterintuitive, a random sample of 500 people taken from the entire U.S. population of 250 million is generally far more predictive than a random sample of 50 people out of a population of 2,500.
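One way to make this concrete is to compare the standard error of an estimated proportion in the two cases. The sketch below uses the textbook formula for simple random sampling without replacement, including the finite population correction; that formula, and the worst-case assumption that the true proportion is 50 percent, are my additions for illustration.

```python
import math

def standard_error(sample_size, population_size, p=0.5):
    """Standard error of a sample proportion, sampling without replacement.

    p is the assumed true proportion; 0.5 is the worst case.
    """
    base = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction: matters only when the sample is a
    # sizable fraction of the population.
    correction = math.sqrt((population_size - sample_size) / (population_size - 1))
    return base * correction

national = standard_error(500, 250_000_000)
local = standard_error(50, 2_500)

print("500 out of 250 million:", round(national, 4))
print("50 out of 2,500:", round(local, 4))
```

Under these assumptions the national sample's error is roughly 2 percentage points and the small-town sample's roughly 7, even though the latter covers 2 percent of its population and the former a vanishing sliver.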
More elementary, but also widespread, is the confusion of correlation and causation. Any study correlating shoe size with intelligence, for example, would show that children with larger feet reason better than those with smaller feet. But there is no causal connection here. Children with bigger feet reason better because they're older. Funny, perhaps, but consider a newspaper article announcing a link between bottled water and healthier babies. Readers would clearly be invited to infer a causal connection. Without further evidence, however, this invitation should be refused; affluent parents are more likely both to drink bottled water and to have healthy children (since they have the stability and wherewithal to buy good food, clothing, shelter, and amenities). Making a practice of questioning correlations when reading about "links" between this practice and that condition is good statistical hygiene.
Often the meaning of statistics gets muddied by a simple lack of information about how the numbers were obtained. The 10 percent condom failure rate, originally cited in a Planned Parenthood study, is one example. It appears that it resulted from asking couples what their primary method of birth control was and whether it ever failed them. Of couples using condoms, approximately one in ten said yes. A statistic was born, even though there seem to be no other studies to back up the figure and no reasonable way to interpret it. Condom leakage rates, for example, are known to be exceedingly low. (Based on its own and other investigations, Consumer Reports concluded, "in principle, latex condoms can be close to 100 percent effective.") As for effectiveness as a means of contraception, however, the data vary significantly with the respondent's age and marital status, categories that are surely independent of condom leakage rates. The problem seems to lie with the users, not the device. Likewise, if the question is prevention of sexually transmitted diseases, the numbers again depend on how carefully the condoms are used, but such figures are very difficult to estimate (except, perhaps, for voyeurs). There is considerable circumstantial evidence, however; for example, prostitutes in Nevada who always use condoms contract almost no sexually transmitted diseases.
In the same vein, recent media attention has focused on studies purporting to show that women smokers are at a higher risk of lung cancer than men smokers are. But the way the comparative risks were calculated, at least according to some researchers, makes the findings suspect. The studies, these critics point out, looked at male and female lung cancer cases separately. For each group, and for equal-size control groups of men or women without lung cancer, they asked, "What is the risk of getting the disease?" And they found that smokers in the female group made up a higher percentage of the lung cancer patients than did smokers in the male group. But the ratio for females might have been higher simply because more nonsmoking men get lung cancer from other causes--exposure to carcinogens in the workplace, for example. The starting baseline is different, so the comparisons are questionable at best.
Another case of misplaced comparisons appeared as an exchange of letters in the New York Times over an issue concerning the health of the body politic: the question of whether more blacks or whites voted on the basis of race in the recent New York City mayoral election. The first writer argued that since 95 percent of blacks voted for (black) Mayor David Dinkins, and only 75 percent of whites voted for (white) candidate (and victor) Rudolph Giuliani, the black vote was more racially motivated than the white. The second writer pointed out that this failed to take into account the preference of most black voters for any Democratic candidate, of which Dinkins was one. Assuming that 80 percent of blacks usually vote for Democrats, and only 50 percent of whites vote for Republicans, then only 15 percent of blacks voted for Dinkins based on his race, but as many as 25 percent of whites voted for Giuliani based on race alone. There are other interpretations as well.
Failure to put statistics in context makes it all but impossible to evaluate personal risk with a clear eye. For example, we often hear that 1 in 8 women will develop breast cancer. This figure is misleading for several reasons (not the least of which is that people often misread it as a mortality risk instead of a lifetime incidence risk; the mortality risk is 1 in 28). But most significant is that the incidence of breast cancer, like that of most cancers, rises with age; the risk of a woman's developing breast cancer by age 50 is 1 in 52, but by age 85 it is 1 in 9. And by age 95 it's 1 in 8. According to the National Cancer Institute, the typical 40-year-old has about a 1.5 percent chance of developing the disease before age 50 and a 3.8 percent chance of developing it before 60. The typical 20-year-old, in contrast, has a .04 percent chance of developing the disease before age 30, and a .5 percent chance of developing it before age 40. The lifetime risk has risen in the last 20 years, but that is probably the result of two factors: more frequent screenings have led to the early detection of more cases, and, since women are dying less frequently from other causes, they are living to ages at which the risk of getting breast cancer is higher.
In fact, mortality rates for most diseases vary enormously depending on the age group you choose to look at. Most people are familiar with the statistic that the two leading killers of Americans are heart disease and cancer. This is true overall. But if you look at, say, a population of people in their twenties, the leading killers are car accidents, homicide, suicide, drownings, poisonings, and fires. And because the victims are younger, their deaths result in more lost years of potential life (calculated somewhat arbitrarily as the years before the age of 65). So even though the number of dead is smaller, the number of lost years of life is greater.
Furthermore, there is a natural tendency to discount quantities that come due in the future, whether they involve incidence of death and disease or money due on a mortgage. The idea of suffering 20 years from now is considerably easier to bear than the more imminent sacrifices that may be necessary to prevent its occurrence; only by such a dis-calculation can we conclude that the inconvenience of wearing condoms is too much to pay for a life.
One final note: implausibly precise statistics are often bogus (as were, of course, those mentioned in the "study" cited at the beginning of this article). Consider a precise number well known to generations of parents and doctors: the normal human body temperature of 98.6 degrees Fahrenheit. Recent investigations involving millions of measurements have revealed that this number is too high; normal human body temperature actually varies around an average of 98.2 degrees. The fault, however, was not with the original measurements. A range of measurements between 36.2 and 37.5 degrees Celsius was averaged and sensibly rounded to the nearest degree: 37 degrees Celsius. When this temperature was converted to Fahrenheit, the rounding was forgotten and 98.6 was taken to be accurate to the nearest tenth of a degree. Had the original range of temperatures been translated one for one, the equivalent Fahrenheit temperatures would have ranged from 97.2 to 99.5 degrees. Apparently dyscalculia can even cause fevers and chills. Inoculate yourself.
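The conversion is easy to verify. This short sketch uses the standard Celsius-to-Fahrenheit formula and the figures quoted above; the function name is mine.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# The rounded average, converted as if it were exact.
print("37 C ->", round(c_to_f(37), 1), "F")

# The original range of measurements, converted one for one.
low_f, high_f = c_to_f(36.2), c_to_f(37.5)
print("36.2 to 37.5 C ->", round(low_f, 1), "to", round(high_f, 1), "F")
```

A figure rounded to the nearest Celsius degree is good only to within about half a degree Celsius, or nine-tenths of a degree Fahrenheit, in either direction, so the ".6" in 98.6 carries no real information.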