Error in America: Of Antidepressants and Statistics

“It is true that you may fool all of the people some of the time; you may even fool some of the people all of the time; but you can’t fool all the people all the time.” – President Abraham Lincoln

“There are more false claims made in the medical literature than anybody appreciates. There’s no question about that.” – biostatistician Steven Goodman of the Johns Hopkins University School of Public Health

Through two stories and their examples, this article draws three morals: (1) Honesty in research and efficient allocation of resources make for good public policy and respect citizens’ autonomy in their decisions to accept or decline care. (2) We should never underestimate the profit motive’s ability to interfere with what is in the best interests of the American people. (3) Keep an open mind and don’t be so quick to believe what you read, even if it is in JAMA or The New England Journal.

America, America, America.

One day we will stop letting these problems take place. From Pfizer to Goldman Sachs, voodoo economists to global warming deniers, the Tea Party to Glenn Beck, America has no shortage of charlatans, snake oil salesmen, and the honest-but-misled. I have a couple of errors I’d like to talk about, both related to medicine and research, but I could just as easily talk about constitutional law or the fallacy of trickle-down theory. Simply put, there’s a wide variety of problems in our country. That said, let’s jump right in.

Researchers at the Rand Corp. in 2002 surveyed close to 700 adults who had received a prescription for an antidepressant. Of those who reported receiving the medication for depression, just 20% tested positive when screened for the disease. Fewer than 30% of those receiving the medication had any depressive symptoms at all.

Our first story is the history of antidepressant medications. Depression is among the most common problems seen in primary-care medicine and soon will be the second leading cause of disability in this country. And here’s where the error begins. Studies with positive results on the efficacy of antidepressants were selectively published. In fact, nearly all the studies that showed a benefit were published, while almost none of the studies that showed these drugs to be ineffective were published as such. Think about that for a second. If a person or company conducts studies and then cherry-picks the results, the data can be made to show absolutely anything they want. And there’s a good chance that’s exactly what happened.
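To see how cherry-picking inflates results, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the true effect of 0.3, the 200 trials, the 50 patients per arm, and the function names. It is not a reconstruction of any real trial data. Every simulated trial tests the same modestly effective drug, but only trials that clear a rough significance bar get "published."

```python
import random
import statistics

def simulate_trials(n_trials=200, n_per_arm=50, true_effect=0.3, seed=0):
    """Return the observed drug-minus-placebo difference for each simulated trial."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    effects = []
    for _ in range(n_trials):
        # Outcomes are in standard-deviation units (hypothetical scale).
        drug = [rng.gauss(true_effect, 1.0) for _ in range(n_per_arm)]
        placebo = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        effects.append(statistics.mean(drug) - statistics.mean(placebo))
    return effects

def publish_only_positive(effects, n_per_arm=50):
    """Keep only trials whose effect clears a rough z > 1.96 bar."""
    # Standard error of a difference of two means with unit variance per arm.
    standard_error = (2.0 / n_per_arm) ** 0.5
    return [e for e in effects if e / standard_error > 1.96]

all_effects = simulate_trials()
published = publish_only_positive(all_effects)
mean_all = statistics.mean(all_effects)
mean_published = statistics.mean(published)

print(f"trials run: {len(all_effects)}, trials published: {len(published)}")
print(f"mean effect, all trials:       {mean_all:.2f}")
print(f"mean effect, published trials: {mean_published:.2f}")
```

Because the "published" set is drawn only from the top of the distribution, its average effect has to sit above the average of all the trials, even though nobody falsified a single number.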

According to the published literature, it appears that 94% of the trials conducted were positive. By contrast, the FDA analysis showed that 51% were positive (1). This bias in the literature leads to a finding that antidepressants work better than they actually do and BINGO! The companies are able to sell more product and make more profit, all while fooling doctors and patients into thinking the drugs are more effective than they actually are. In fact, as a result of selective reporting, the published literature conveyed an effect size nearly one third larger than the effect size derived from the FDA data. The danger here is that while these drugs will work for some patients, a patient who knew the real effect size and could weigh it against the potential side effects (some of which are quite significant) might have opted out of treatment and chosen other safer, potentially cheaper, and potentially more effective methods of dealing with depression. Withholding the true picture of the product interferes with the patient’s personal liberty and right to make decisions about their own care. Additionally, some studies like the one by Rand show that only one in five patients taking antidepressant medication for depression actually screens positive for the disease, and fewer than 30% have any depressive symptoms at all. Many of the rest may be receiving unnecessary medication, enduring unnecessary cost, and being subjected to unnecessary and potentially serious side effects.

Morals of the Story: Honesty in research and efficient allocation of resources make for good public policy and respect citizens’ autonomy in their decisions to accept or decline care. We should never underestimate the profit motive’s ability to interfere with what is in the best interests of the American people.

Our second story concerns statistics. Let’s start with an example:

A study compared three different antacids for relief of gastric pain. The dependent variable was the pH of gastric contents after treatment. Twenty subjects, 10 men and 10 women, were in each group. A t test showed that C was better than A (p < 0.05) and better than B (p < 0.001), but there was no difference between A and B (p > 0.05).

Awesome! Sounds like a good study, huh? We should rush out and buy antacid C! Well, hold your horses there, John Wayne, and let’s look at what these researchers did wrong. First off, they used the wrong statistical test. With three groups, running a separate t test on each pair raises the chance that at least one comparison comes up “significant” by luck alone, so an ANOVA would be the more appropriate test. Additionally, sex should be used as a second factor to look for differences between men and women as well as interactions. Finally, we have no idea of the magnitude of the effect. The means and standard deviations should always be reported so that we can make inferences as to the effect’s magnitude.
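For the curious, here is a rough sketch of both points in plain Python. The function names and the pH readings are hypothetical, invented only for illustration. The first function shows how the false-positive risk grows with the number of comparisons, under the simplifying assumption that the comparisons are independent (pairwise tests on shared groups are not exactly independent, so treat it as an approximation). The second computes the F statistic for a one-way ANOVA; it does not handle sex as a second factor, which would need a two-way design.

```python
import statistics

def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_comparisons

def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual readings around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Three pairwise t tests at alpha = 0.05 (A vs B, A vs C, B vs C).
print(f"familywise error rate: {familywise_error_rate(0.05, 3):.3f}")

# Made-up gastric pH readings for antacids A, B, and C (not real data).
ph_a = [3.1, 3.4, 2.9, 3.3, 3.0]
ph_b = [3.0, 3.2, 3.1, 2.8, 3.3]
ph_c = [3.9, 4.1, 3.7, 4.0, 3.8]
f_stat, df_b, df_w = one_way_anova_f([ph_a, ph_b, ph_c])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic would then be compared against the F distribution with those degrees of freedom to get a p value, a step a statistics package handles and this sketch leaves out.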

Let’s look at another example:

To test if people get mental disorders from their children, researchers measured the following in 50 kids: (1) the cleanliness of the bedroom, (2) the number of times “Aw, must I?” is said in a 1-hour block, and (3) the interval between when the kid should be home and when the kid actually arrives. In the mothers, the researchers measured (1) the amount of Prozac taken each day, (2) the number of times “You’re driving me crazy” is said in a 1-hour block, and (3) the amount of time spent watching game shows… Significant improvement in the model was noted when a covariance term between cleanliness and lateness arriving home was added. In total, 20 parameters were used in the model.

Catch the problems with this one? For one, unless there’s a theoretical justification for adding the covariance term, it shouldn’t be there. You can’t just start adding terms because they help the model. Second, since there are 6 variables, there can be a maximum of 21 parameters ((6 x 7) / 2). With 20 parameters, this model sits right at that limit, and when the number of parameters is near the limit, the sample size needs to be closer to 200, not 50, to get reproducible results. Subtle errors but errors nonetheless.
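The arithmetic behind that ceiling is easy to check. The sketch below uses hypothetical function names: it counts the distinct variances and covariances among the measured variables, which is where the (6 x 7) / 2 comes from, and then shows how thin 50 subjects are spread across 20 parameters compared with 200 subjects.

```python
def max_free_parameters(n_observed_variables):
    """Distinct variances and covariances among the observed variables."""
    return n_observed_variables * (n_observed_variables + 1) // 2

def subjects_per_parameter(sample_size, n_parameters):
    """How many subjects support each estimated parameter."""
    return sample_size / n_parameters

# Six measured variables: three in the kids, three in the mothers.
print(max_free_parameters(6))          # ceiling on parameters
print(subjects_per_parameter(50, 20))   # the study as described
print(subjects_per_parameter(200, 20))  # the sample size suggested above
```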

It’s science’s dirtiest secret: Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing. Over the years, hundreds of published papers have warned that science’s love affair with statistics has spawned countless illegitimate findings. In fact, “if you believe what you read in the scientific literature, you shouldn’t believe what you read in the scientific literature.”

That said, no one in their right mind would deny that science has made important and lasting contributions or that it will continue to do so. However, the fact remains that any single scientific study alone is quite likely to be incorrect, largely due to misunderstanding and misuse of statistics. As Dr. Goodman puts it, “A lot of scientists don’t understand statistics and they don’t understand statistics because the statistics don’t make sense.”

The solution to this problem will be time-consuming and likely expensive. First off, many researchers may need additional training in statistics, or we’ll need more statisticians to review articles and block publication of dubious results. One way would be to mandate review by trained statisticians through federal legislation, perhaps via a government-run center (“The Federal Centers for Scientific Credibility” or FCSC) that oversees research before publication. The implications of this solution’s interference with private industry are obvious and frightening.

Another solution would be the development of sophisticated statistics software designed to detect and flag errors. This would of course require R&D funding, likely at the government level, and would necessitate the reporting of results in a standardized format for automated analysis. This solution would be difficult and only as good as the programming. It would require months to years of testing with simultaneous human review to find bugs and correct errors in the code.

It’s important to remember that while not all articles are bad, there is evidence that some of them reach conclusions based on subtle errors in analysis.

Moral of the Story: Keep an open mind and don’t be so quick to believe what you read, even if it is in JAMA or The New England Journal.

For further reading, I recommend the following:

(1) Turner EH et al. 2008. Selective publication of antidepressant trials and its influence on apparent efficacy. New England Journal of Medicine. 358: 252-260.

(2) Odds Are, It’s Wrong

(3) Any credible textbook on statistics

