The Significance of Significance
Traffic deaths on interstate highways have apparently declined since speed limits were raised from 55 miles per hour to 65. The clear implication of the articles reporting this statistic has been that speed does not kill. That may well be the case, but only a few of the reports mentioned that mandated seatbelt use and law-enforcement crackdowns on drunk drivers may have contributed to the decrease, and none listed any other possible contributing factors. Surely the increasing number of vehicles on the road built to stricter safety standards, with such important features as power steering, air bags, and anti-lock brakes, must figure in any explanation of improved traffic fatality statistics. Without accounting for all these variables, is it possible to conclude that faster is better?
Similarly, you may have read an item in your newspaper about the discovery that a great many entries in a well-known publisher's sweepstakes had simply been thrown into the bushes, and that a large majority of these were entries that involved no purchase of the company's publications. The inference drawn was that the company had favored paying customers in awarding prizes, even though its rules stated that winners would be selected at random, with no purchase required. The inference does not stand up under simple logic, however, because at least some of the discarded entries did contain orders for publications. Nor does it stand up under a basic statistical test. Apparently, no one asked whether the percentage of non-orders in the discarded batch matched the percentage of non-orders in the entries that had been used in selecting the winners. Would that all statistical applications were that simple.
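The comparison that apparently went unmade is easy to state concretely. Here is a minimal sketch of a two-proportion z-test on entirely invented counts; the entry numbers are hypothetical, and only the method, comparing the share of no-purchase entries among the discarded entries with the share among the entries used in the drawing, is the point.

```python
# A two-proportion z-test on hypothetical counts: is the share of
# no-purchase entries among the discarded entries the same as the
# share among the entries actually used to select the winners?
from math import sqrt, erf

def two_proportion_z(x_a, n_a, x_b, n_b):
    """Two-sided z-test for the equality of two proportions."""
    p_pool = (x_a + x_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (x_a / n_a - x_b / n_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return z, p_value

# Invented numbers: 800 of 1,000 discarded entries involved no purchase,
# versus 700 of 1,000 entries in the pool used for the drawing.
z, p = two_proportion_z(800, 1000, 700, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")  # a tiny p would suggest the pools differ
```

If the two percentages were statistically indistinguishable, the discarded batch would look like a random sample of all entries, and the charge of favoritism would lose its force.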
Most of us have only a vague knowledge of, or interest in, statistics. When a published paper points out that its results were validated by this test or that, we tend to be impressed by the testing without knowing what it really means or whether the test was a proper one for the data at hand. A claim of statistical validity should not merely be accepted at face value.
Sometimes the seeming authority of a statistical evaluation masks the possibility that the research was poorly conceived, and that the statistical proof of the significance of its results is a mirage. This can be seen in research projects that focus on one variable without considering the existence and effects of others.
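To make that mirage concrete, here is a toy simulation under invented assumptions (the ages, coefficients, and sample sizes are all hypothetical). The treatment has no true effect, yet a naive comparison of group means produces a large apparent "effect" because an unconsidered variable differs between the groups.

```python
# Toy simulation of a significance mirage: the treatment has zero true
# effect, but an unconsidered variable (here, patient age) differs
# between the groups, so the group means differ anyway.
import random

random.seed(1)

def outcome(age):
    # True model: the outcome depends only on age, never on treatment.
    return 50 + 0.5 * age + random.gauss(0, 2)

# Unbalanced recruitment: treated patients happen to be older.
treated = [outcome(random.uniform(60, 80)) for _ in range(100)]
control = [outcome(random.uniform(20, 40)) for _ in range(100)]

diff = sum(treated) / len(treated) - sum(control) / len(control)
print(f"apparent 'treatment effect': {diff:.1f}")  # ~20 units, all due to age
```

Run any standard test on these two groups and it will pronounce the difference highly significant; the significance is real, but it belongs to the unmeasured variable, not to the treatment.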
If a single independent variable has not been isolated, the best one can say is that, under the conditions that existed in the experiment, these were the results. They may or may not be repeatable, and they may or may not be valid in the mouths treated in your office; in the end, each practice conducts its own informal clinical testing to find the products and procedures with which it is most comfortable and successful.
That does not mean that all research that fails to account for every possible variable is a waste of time. But a reader must apply some thought to the basis of the research: for example, by noting how many cases are in the sample and whether the sample is homogeneous; by weighing the meaning of wide ranges and large standard deviations; and by considering whether one can make the leap from in vitro to in vivo. None of these points requires an in-depth knowledge of statistics. Perhaps the most demanding task for a reader, however, is to think of the variables that were not considered in a research problem and to fathom what the presented results really signify.
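As one illustration of those checks, here is a short sketch with invented numbers: two hypothetical data sets with identical means but very different spreads, showing why a reported mean should always be read alongside its range and standard deviation.

```python
# Two invented data sets with the same mean but very different spreads:
# the mean alone would make them look interchangeable.
from statistics import mean, stdev

tight = [4.8, 5.0, 5.1, 4.9, 5.2, 5.0]
wide = [1.0, 9.0, 5.0, 2.0, 8.0, 5.0]

for name, data in (("tight", tight), ("wide", wide)):
    print(f"{name}: mean={mean(data):.2f}, sd={stdev(data):.2f}, "
          f"range={min(data)}-{max(data)}")
# Both means are 5.0, but the wide group's standard deviation dwarfs the
# tight group's; a reported mean alone would hide how differently the
# individual cases behaved.
```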
Unfortunately, the results of clinical practice often cannot be subjected to statistical analysis at all. We cannot, for example, treat the same patient twice by two different methods to compare their efficacy, and even if we could, it would not be a valid experiment: too many variables are involved. This casts suspicion on studies in which identical twins are treated by two different methods, even more so on the studies of fraternal twins that one sometimes sees, and on studies in which one side of the mouth is treated one way and the other side another. One can almost believe whatever one wants about the results of such studies. Thus, some reports on the results of clinical procedures do not satisfy the most rigorous standards of research. They have to be scrutinized in light of one's own experience and submitted to a common-sense test.
Whether dealing with findings that have no statistical basis at all or with research containing considerable statistical analysis, each practitioner must look for validity in the initial setup of the study, in the procedures that were used, and in the conclusions that were drawn. Any study should be chewed on a while, rather than swallowed whole.