THE EDITOR'S CORNER

Lies, Damned Lies, and Statistics

ROBERT G. KEIM, DDS, EdD, PhD

One of my favorite quotations comes from Mark Twain's Own Autobiography: The Chapters from the North American Review. As the American humorist put it with his customary wit, "Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: 'There are three kinds of lies: lies, damned lies, and statistics.'" While this oft-quoted jibe has more recently been shown to postdate Disraeli, the point certainly applies to scientific research: statistics can mislead the casual reader, whether unintentionally or intentionally on the investigator's part.

I've mentioned before in this column that, in addition to my work in orthodontics and orthodontic education, I am a professor of statistics and research methodology at the University of Southern California. In that capacity, I am often perturbed by the abuse of statistics in orthodontic research. Most authors take for granted that the validity of probability-based inferential statistics is practically sacrosanct and that statistical analysis is a requirement for valid scientific inquiry. In fact, Standard 6.1 of the Commission on Dental Accreditation's Accreditation Standards for Advanced Specialty Education Programs in Orthodontics and Dentofacial Orthopedics states: "Students/residents must initiate and complete a research project to include critical review of the literature, development of a hypothesis and the design, statistical analysis [emphasis mine] and interpretation of data."

The implication is that a paper or research project is not "real" unless it includes a statistical analysis. Nothing could be further from the truth. In the greater scope of science, there are two broad areas of inquiry: quantitative and qualitative research. Quantitative research, which includes statistics, involves taking a sample that is assumed to be representative of the overall population. The validity of this assumption is rarely questioned or tested. Measurements of the variables under study in the sample are made, statistical tests are conducted on the data from these measurements, and conclusions are drawn about the nature of the population in general. These conclusions are supposed to influence the thinking of the entire profession about the issue in question. Most of the time, however, a simple qualitative assessment would have provided an equally legitimate decision.

Don't get me wrong here. I am not condemning the use of statistical inference in clinical decision making. What I am condemning is the abuse of that process--lies, damned lies, and statistics. In the February 2011 issue of the European Journal of Orthodontics, Polychronopoulou, Pandis, and Eliades make a strong argument against inappropriate analyses, directly addressing one of my pet peeves about orthodontic research: the overreliance on "p" values to determine "statistical significance" while ignoring confidence intervals and effect sizes.

Most practicing orthodontists have had only a cursory background in statistics from their graduate programs, focused on indications for the available statistical tests--t-tests, analyses of variance (ANOVAs), correlation coefficients (r), chi-squares, and occasionally multiple regressions; how to perform them on statistical software packages such as SPSS, SAS, or Stata; and how to interpret the resulting outputs. Basically, students are given a cookbook approach to analyzing the data gathered for their theses or required research projects, but no insight into the theoretical framework of practical statistics. The archaic definitions of p < .05 as statistically significant, p < .01 as highly significant, and p < .001 as very highly significant are still used all too frequently. Think about it for a minute. If we conduct an analysis on a data set and get p = .049, we conclude that there is a "significant difference." If we get p = .051, there is "no significant difference." Is there really that much difference between .049 and .051?
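To put numbers on that question, here is a minimal sketch in Python. It assumes a two-sided test based on the normal distribution, which the text does not specify, and the helper name z_for_two_sided_p is purely illustrative. It converts each p value back into the test statistic that would have produced it.

```python
from statistics import NormalDist

def z_for_two_sided_p(p):
    """Absolute z statistic whose two-sided normal p-value equals p."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)

z_049 = z_for_two_sided_p(0.049)  # the "significant" result
z_051 = z_for_two_sided_p(0.051)  # the "not significant" result

print(f"p = .049 -> |z| = {z_049:.3f}")
print(f"p = .051 -> |z| = {z_051:.3f}")
print(f"gap in |z| = {z_049 - z_051:.3f}")
```

Under these assumptions the two statistics come out at roughly 1.97 and 1.95, on either side of the familiar 1.96 cutoff. The evidence in the two cases is nearly identical; only the label changes.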

The word significant does not mean important or meaningful in a statistical context, as it does in everyday speech. A result is considered "statistically significant" if it is unlikely to have occurred by random chance alone. The ability to detect such significance is, among other factors, a function of the sample size. For instance, a study that included millions of participants might be able to establish with great confidence that residents of one city were more intelligent than residents of another city by 1/20 of an IQ point. That result would be statistically significant, but the difference is small enough to be utterly unimportant. Many researchers insist that tests of significance should always be accompanied by effect-size statistics, which estimate the size and thus the practical importance of the difference.
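The interplay between sample size, significance, and effect size can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an analysis of real data: it assumes a known standard deviation of 15 IQ points (the conventional scaling, not given in the text), equal group sizes, and a two-sample z-test. The function name two_sample_z and the group sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

SD = 15.0    # assumed population SD of IQ scores (conventional scaling)
DIFF = 0.05  # observed difference in means: 1/20 of an IQ point

def two_sample_z(diff, sd, n_per_group):
    """Return (z, two-sided p, 95% CI) for a mean difference with known SD."""
    se = sd * sqrt(2.0 / n_per_group)           # standard error of the difference
    z = diff / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p-value
    half = NormalDist().inv_cdf(0.975) * se     # half-width of the 95% CI
    return z, p, (diff - half, diff + half)

# Standardized effect size (Cohen's d); it does not depend on sample size.
cohens_d = DIFF / SD

for n in (10_000, 100_000, 1_000_000, 10_000_000):
    z, p, (lo, hi) = two_sample_z(DIFF, SD, n)
    print(f"n/group={n:>10,}  z={z:5.2f}  p={p:.4f}  "
          f"95% CI=({lo:+.3f}, {hi:+.3f})  d={cohens_d:.4f}")
```

Under these assumptions, the same 0.05-point difference crosses the p < .05 line only once each group reaches roughly a million participants, and the confidence interval narrows accordingly. Cohen's d stays at about 0.003 throughout, far below the 0.2 conventionally described as a "small" effect. The p value reflects how much data was collected; the effect size reflects whether the difference matters.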

Qualitative research, commonly used in psychology and the social sciences, emphasizes real, direct experience as opposed to mathematical tests of hypotheses (which can be correctly understood as educated guesses). Case studies and case reports are examples of qualitative research methods. Although the sample size may be quite small, the goal is to achieve a deep and broad understanding of what is really going on, rather than generalizing from the study sample to the general population in an inferential manner. When correctly applied, qualitative methods produce information about only the particular cases studied, but the conclusions may be used to formulate hypotheses that are then testable by appropriate quantitative methods.

The use of qualitative research has increased remarkably in both frequency and respectability outside the social sciences in recent years. Many experts in the clinical sciences now argue that qualitative research is generally more valid than statistically based quantitative research. Qualitative research emphasizes pragmatic, practical, and clinically applicable decision making. Likewise, JCO has always been the "journal of practical orthodontics"--in fact, that was the original name of this magazine--placing a strong emphasis on practical, case-based papers with direct application in the busy orthodontic practice. In essence, then, we have always relied strongly on qualitative research.

Clinicians should not consider quantitative and qualitative research to be polar opposites or mutually exclusive. Rather, these methods should be thought of as a continuum, with the practical "truth" lying somewhere in the middle. JCO will continue to accept papers for publication that utilize quantitative, statistically based methods when applied appropriately--but we will always be cognizant of Twain's admonition about lies, damned lies, and statistics.

RGK

Journal of Clinical Orthodontics
