Statistical significance and scientific importance are distinct, equally valuable aspects of communicating the significance of statistics in scientific research
Communicating the “significance” of statistics in scientific research is often hampered because the everyday usage of “significant” is very different from the technical meaning of that term. When most people read “significant,” they interpret it to mean “big” or “important.” When statisticians say “significant,” they intend it to mean that the estimated effect size is unlikely to have arisen by chance.
Although both aspects of “significance” are key parts of communicating scientific research, those definitions clearly are not interchangeable! Unfortunately, this confusion between statistical significance and scientific importance has been so widespread (even among scientists!) that statisticians have been making recommendations for years on ways to differentiate between them when communicating scientific results (Amrhein et al., 2019; McShane et al., 2019). Despite those efforts, the two usages continue to be conflated by scientists and laypeople alike.
Also contributing to the misunderstanding is that “statistics” has two different meanings: In everyday use, it refers to numeric facts. In technical usage, it refers to using numeric estimates based on randomly selected samples (subsets of a population) to draw inferences about the population.
What is “statistical significance”?
When statisticians say a result is “statistically significant,” they mean that a statistical test found evidence of an effect based on sample data. The word “statistically” is intended to emphasize that “significant” means “unlikely to have arisen by chance” as a result of which cases happened to be drawn into the sample (Witmer, 2019).
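To make “unlikely to have arisen by chance” concrete, here is a minimal sketch of an exact permutation test, one of several tests that could be used for this purpose. The scores and the function name `permutation_p_value` are hypothetical, invented for illustration. The code asks: if group labels were arbitrary, in what share of all possible relabelings would the difference in means be at least as large as the one observed?

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    as_extreme = 0
    total = 0
    # Enumerate every way to split the pooled values into groups of these sizes.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        relabeled_a = [pooled[i] for i in idx]
        relabeled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(mean(relabeled_a) - mean(relabeled_b))
        # Small tolerance guards against floating-point rounding in the comparison.
        if diff >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total


# Hypothetical outcome scores, invented for illustration only.
comparison = [12, 14, 15, 16, 18]
treatment = [20, 21, 23, 24, 26]

p_value = permutation_p_value(comparison, treatment)
print(f"p = {p_value:.4f}")
```

In this made-up example only 2 of the 252 possible splits produce a gap as large as the observed one, so the p-value is small. That says the gap is hard to attribute to chance; it says nothing yet about whether the gap is large enough to matter.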
What is “scientific significance”?
The Oxford English Dictionary (2024) defines “significant” as “sufficiently great or important to be worthy of attention; noteworthy; consequential, influential.” What qualifies a statistic to meet those criteria? Even those definitions differ from one another, so let’s parse the distinct meanings before considering how to communicate them effectively.
First, to be “worthy of attention” or “noteworthy,” a result must be big enough to matter for the topic at hand, such as being clinically or educationally meaningful. A result in the opposite of the expected direction also merits attention.
Second, to be “consequential” or “influential,” a result must be one that can be used to inform a decision or the design of an intervention. To judge whether a study’s results can be applied in those ways, readers need to know several different things about that study, including whether
- The findings can be generalized;
- The pattern could be explained by other factors;
- The results can be interpreted as cause-and-effect;
- The presumed cause can be changed.
Thus, the “significance” of statistics in science goes far beyond statistical significance.
Improving communication of “significance”
How can scientific writers clearly communicate the “significance” of their statistics? First, describe scientific importance before statistical significance, to ensure that those other, often overlooked, aspects are considered (Miller, 2023).
Second, always accompany “significance/significant” with a modifying term – “statistical” or “scientific.” Even better, replace “significant” with other words or phrases that convey the specific aspect of “importance” being described.
Communicating scientific importance
A thorough explanation of a study’s scientific importance will touch on each of the following dimensions, worded to convey the specific topic:
- Convey the size of the effect. If participants in a pilot job training program earned 20% more than similar people who did not undertake the program, that would be a meaningful increase. If they earned only 2% more, we shouldn’t bother replicating that training program.
- Express the direction of the effect. Modify terms like “correlated” or “associated” with words such as “positive” or “inverse” to convey direction. Conveying direction is especially pertinent if the pattern was the opposite of the predicted (or desired!) effect, such as if a new drug actually worsened survival compared to older medications.
- Report the “W’s” (when, where, who) for the study sample, and discuss whether the findings could be generalized to the population or to other places or groups. If a clinical trial of a new medication only studied people ages 25 to 49, we should hesitate to infer that the drug would have the same benefits and side effects among older people.
- List other factors that might explain the observed association – a phenomenon known as confounding. If researchers allowed people to self-select into the training program, their higher earnings could be due to greater determination or better job networks among participants than non-participants.
- Discuss whether it is plausible to infer causality based on that study. If the study compared how participants’ and non-participants’ earnings changed across the study period, that is stronger evidence that the program caused the improvement than if they just compared participants’ to non-participants’ earnings at the end.
- Discuss whether the apparent “cause” can be changed (and that change maintained) – a crucial determinant of whether we can expect the observed outcomes to be sustainable. If the training program succeeded in raising earnings because participants had employment support only during the study period, the observed earnings bump might fade once those supports are no longer there.
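Two of the dimensions above, size/direction and the before-and-after comparison, reduce to simple arithmetic. The sketch below assumes hypothetical mean earnings for the job training example; all numbers and function names are invented for illustration and do not come from any real study.

```python
def percent_difference(value, reference):
    """Relative difference of value versus reference, in percent."""
    return 100.0 * (value - reference) / reference


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change among participants minus change among non-participants."""
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical mean annual earnings in dollars, invented for illustration only.
participants = {"before": 30_000, "after": 36_000}
non_participants = {"before": 32_000, "after": 33_000}

# Size and direction of the gap at the end of the study period.
end_gap = percent_difference(participants["after"], non_participants["after"])
direction = "more" if end_gap > 0 else "less"
print(f"At follow-up, participants earned {abs(end_gap):.1f}% {direction} "
      "than non-participants.")

# Comparing changes over time rather than end-of-study levels.
did = difference_in_differences(
    participants["before"], participants["after"],
    non_participants["before"], non_participants["after"],
)
print(f"Participants' earnings rose ${did:,} more than non-participants' did.")
```

Comparing changes removes differences between the groups that were already present at the start, although it cannot rule out confounders that change over the study period.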
Communicating statistical significance
Having covered those elements of scientific importance, convey statistical significance, remembering that statistical significance does not override lack of scientific importance. With a large sample, a 1% increase in earnings could be statistically significant but would be too small to matter.
Conversely, remember that a result can be important even if it wasn’t statistically significant. If the earnings improvement wasn’t statistically significant, state that explicitly and discuss possible reasons, distinguishing a small sample size from a trivial effect size.
Before writing about statistical significance, identify the audience (Miller, 2015).
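The interplay between sample size and effect size can be illustrated with a rough calculation. This sketch uses a two-sample z-test (normal approximation) on summary statistics, assuming equal group sizes and a common standard deviation; the earnings figures, standard deviation, and sample sizes are hypothetical, and a t-test would be more appropriate for the small-sample case.

```python
import math


def two_sample_z_test(mean_1, mean_2, sd, n_per_group):
    """Return (z, two-sided p-value) for a difference in means.

    Assumes equal group sizes, a common standard deviation, and a
    normal approximation to the sampling distribution.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = (mean_1 - mean_2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical scenarios, invented for illustration only.
# Large sample, 1% earnings difference ($30,300 vs. $30,000).
z_large, p_large = two_sample_z_test(30_300, 30_000, sd=10_000, n_per_group=50_000)

# Small sample, 20% earnings difference ($36,000 vs. $30,000).
z_small, p_small = two_sample_z_test(36_000, 30_000, sd=10_000, n_per_group=8)

print(f"1% difference, n=50,000 per group: z = {z_large:.2f}, p = {p_large:.2g}")
print(f"20% difference, n=8 per group:     z = {z_small:.2f}, p = {p_small:.2g}")
```

Under these assumptions the tiny difference clears conventional thresholds for statistical significance while the large difference does not, which is why the effect size and the sample size both need to be reported alongside the test result.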
For non-scientific audiences, paraphrase statistical significance using wording such as
- “The chance of observing a survival difference at least this large in our study if there were no real difference between the effectiveness of the old and new medications was less than one in a thousand.” [statistically significant at p<.001]
- “The difference between old and new medications could easily have occurred by chance alone.” [not statistically significant]
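As a rough illustration of how such paraphrases relate to a p-value, the hypothetical helper below converts a p-value into “one in N” wording. The function name, the sentence templates, and the 0.05 cut-off are assumptions made for this sketch, not a standard; real wording should be tailored to the topic and audience.

```python
def paraphrase_p_value(p_value, alpha=0.05):
    """Turn a p-value into plain-language wording (hypothetical helper)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be greater than 0 and at most 1")
    if p_value < alpha:
        one_in = round(1.0 / p_value)
        return (
            "If there were no real difference, a difference at least this "
            f"large would be expected roughly 1 time in {one_in:,}."
        )
    return "A difference this size could easily have occurred by chance alone."


print(paraphrase_p_value(0.0004))
print(paraphrase_p_value(0.40))
```

P-values just above the cut-off deserve more nuanced wording than this two-way split provides.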
For audiences trained in statistics,
- If you intend the statistical meaning, write “statistically significant”, not just “significant.”
- Alternatively, statisticians suggest replacing “statistically significant” with “statistically discernible” to differentiate it from the colloquial use of “significant” (Witmer, 2019).
Following those principles can “significantly” improve communication of statistics in science.
Jane E. Miller
References
- Amrhein, Valentin, Sander Greenland, and Blakeley McShane. 2019. “Retire Statistical Significance.” Nature, 567, 305–307. https://doi.org/10.1038/d41586-019-00857-9
- McShane, Blakeley B., David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett. 2019. “Abandon Statistical Significance.” The American Statistician, 73:sup1, 235–245. https://doi.org/10.1080/00031305.2018.1527253
- Miller, J.E. 2023. “Beyond Statistical Significance: A Holistic View of What Makes a Research Finding ‘Important.’” Numeracy. 16(1): Article 6. https://doi.org/10.5038/1936-4660.16.1.1428
- ______ 2015. The Chicago Guide to Writing about Numbers, 2nd Edition. University of Chicago Press.
- Oxford English Dictionary. 2024. https://www.oed.com/
- Witmer, J. 2019. “Editorial.” Journal of Statistics Education, 27(3), 136–137. https://doi.org/10.1080/10691898.2019.1702415