15 June 1999 | Volume 130 Issue 12 | Pages 995-1004
An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain "error rates," without consideration of information from outside the experiment. This statistical approach, the key components of which are P values and hypothesis tests, is widely perceived as a mathematically coherent approach to inference. There is little appreciation in the medical community that the methodology is an amalgam of incompatible elements, whose utility for scientific inference has been the subject of intense debate among statisticians for almost 70 years. This article introduces some of the key elements of that debate and traces the appeal and adverse impact of this methodology to the P value fallacy, the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result. This argument is made as a prelude to the suggestion that another measure of evidence should be used: the Bayes factor, which properly separates issues of long-run behavior from evidential strength and allows the integration of background knowledge with statistical findings.
Author and Article Information
From Johns Hopkins University School of Medicine, Baltimore, Maryland.
Requests for Reprints: Steven Goodman, MD, PhD, Johns Hopkins University, 550 North Broadway, Suite 409, Baltimore, MD 21205; e-mail, sgoodman{at}jhu.edu.
ACADEMIA AND CLINIC
Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy