Study says authors exaggerate their findings in paper abstracts, and that's a problem when readers take them at face value.
We’ve all been told not to judge a book by its cover. But we shouldn’t be judging academic studies by their abstracts, either, according to a new paper in BMJ Evidence-Based Medicine. The study -- which found exaggerated claims in more than half of paper abstracts analyzed -- pertains to psychology and psychiatry research. It notes that “spin” is troublesome in those fields because it can impact clinical care decisions. But the authors say that this kind of exaggeration happens in other fields, too.
“Researchers are encouraged to conduct studies and report findings according to the highest ethical standards,” the paper says, meaning “reporting results completely, in accordance with a protocol that outlines primary and secondary endpoints and prespecified subgroups and statistical analyses.”
Yet authors are free to choose “how to report or interpret study results.” And in an abstract, in particular, they may include “only the results they want to highlight or the conclusions they wish to draw.”
In a word: spin.
Because randomized controlled trials often inform how patients are treated, the researchers used PubMed to find these kinds of studies. Their sample included trials published from 2012 to 2017 in well-regarded psychology and psychiatry journals: JAMA Psychiatry, American Journal of Psychiatry, Journal of Child Psychology and Psychiatry, Psychological Medicine, British Journal of Psychiatry and Journal of the American Academy of Child and Adolescent Psychiatry.
Crucially, they analyzed only trials with results that were not statistically significant, and therefore were susceptible to spin -- 116 in all.
Evidence of spin included focusing only on statistically significant results, interpreting nonsignificant results as showing equivalence, using favorable rhetoric to describe nonsignificant results, and declaring that an intervention was beneficial despite statistically nonsignificant findings.
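To make those criteria concrete, here is a purely illustrative sketch of how one might flag candidate spin phrases in an abstract. The phrase list and function are invented for this example; the study itself relied on human judgment (the paper calls looking for spin inherently subjective), not keyword matching.

```python
# Illustrative only: a naive keyword scan for phrasing that often signals spin
# in abstracts of trials with nonsignificant primary outcomes. The actual study
# used human raters applying predefined spin criteria, not automated matching.

CANDIDATE_SPIN_PHRASES = [
    "trend toward significance",   # hypothetical examples of favorable
    "did not reach significance",  # rhetoric around nonsignificant results
    "numerically superior",
    "well tolerated and effective",
]

def flag_possible_spin(abstract: str) -> list[str]:
    """Return any candidate phrases found; a human must still judge context."""
    text = abstract.lower()
    return [p for p in CANDIDATE_SPIN_PHRASES if p in text]

if __name__ == "__main__":
    example = ("The primary outcome did not reach significance, "
               "but showed a trend toward significance favoring treatment.")
    print(flag_possible_spin(example))
    # ['trend toward significance', 'did not reach significance']
```

Even a hit on every phrase would prove nothing by itself; context decides whether a sentence is spin, which is exactly why the paper treats this as subjective work.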
How often did articles’ abstracts exaggerate the actual findings? More than half the time -- 56 percent, or about 65 of the 116 trials. Spin appeared in 2 percent of titles, 21 percent of abstract results sections and 49 percent of abstract conclusion sections. Fifteen percent of abstracts had spin in both their results and conclusion sections.
Spin was more common in studies that compared a proposed treatment with typical care or placebo than in other kinds of studies. But industry funding was not associated with a greater likelihood of exaggeration: just 10 of the 65 spun trials received any industry funding.
The study notes several limitations, including that looking for spin is inherently subjective work. But it says it’s important to guard against spin because researchers have an ethical obligation to report their results honestly and clearly, and because spinning an abstract “may mislead physicians who are attempting to draw conclusions about a treatment for patients.” Citing prior research, it says that a majority of the time physicians read only an article’s abstract rather than the entire article, and that many editorial decisions are based on the abstract alone. Positive results are also more likely to be published in the first place, the paper notes, citing one study that found 15 percent of peer reviewers asked authors to spin their manuscripts.
What’s to be done? Journal editors may consider inviting reviewers to comment on the presence of spin, the article suggests.
Several journals already use reporting guidelines to “ensure accurate and transparent reporting of clinical trial results,” the paper says, adding that “the use of such guidelines improves trial reporting.” And while the Consolidated Standards of Reporting Trials (CONSORT) guidelines for abstracts don’t contain language discouraging spin, it says, “research reporting could be improved by discouraging spin in abstracts.”
Lead author Sam Jellison, a medical student at Oklahoma State University, underscored that his paper is not the first to explore academic spin. Yet making more readers “aware of what spin is might be the first and largest step to take to fight this problem,” he said. Jellison said that the existing literature suggests spin is not unique to psychology and psychiatry, and that those fields are actually “middle of the road” in terms of prevalence.
Philip Cohen, a professor of sociology at the University of Maryland at College Park who blogs about research, pointed out that reviewers already look at abstracts as part of their process, so in addition to the journal editor, “reviewers should be able to see if the abstract is overstating the findings.”
Still, a common way that sociologists inflate research findings is to mention results that are not statistically significant while downplaying the lack of significance, attributing it to a small sample or using phrases such as “does not reach statistical significance,” he said, “as if the effect is just trying but can't quite get there.”
Beyond questions of spin, Cohen said, there is surely a problem with “people only publishing, or journals only accepting, dramatic findings.” So the greatest source of exaggeration is probably in what gets published at all, with null findings or those that contradict existing positive results never seeing the light of day -- what Cohen noted has been called the “file drawer” problem.
While psychology isn't alone in the spin room, the field has had its share of data integrity and public perception problems. A landmark study in 2015, for example, found that most psychology studies don’t yield reproducible results.
Brian Nosek, a professor of psychology at the University of Virginia and lead author on the reproducibility study, said that spin involves two “connected problems,” neither of which is easy to solve. Authors are “incentivized to present their findings in the best possible light for publishability and impact, and readers often don't read the paper.”
As an author, he said, “even if I want to avoid spin,” it’s “entirely reasonable for me to try to make the narrative of my title and abstract as engaging as possible so that people will read the paper.” And at the same time, it’s “very difficult to capture the complexity of almost any research finding in a phrase or short abstract.” It’s really a “skill” to present “complex findings briefly without losing accuracy.”
As a reader, Nosek continued, “even if I want to make the best possible decisions based on research evidence, I don't have time to read and evaluate everything deeply." In some cases, he said, "I need to be able to trust that the information conveyed briefly is accurate and actionable.”
Ultimately, when “decisions are important, we should have higher expectations of readers to gather the information necessary to make good decisions,” he said. “But we need to recognize pragmatic realities and develop better tools for readers to calibrate the confidence in the claims they see in brief, and provide cues prompting them to dig more deeply when the evidence is uncertain.”
It’s also “in our collective interest to provide authors more training in communicating their findings in abstracts and press releases,” Nosek added.