Annals
Established in 1927 by the American College of Physicians

ACADEMIA AND CLINIC

SYSTEMATIC REVIEW SERIES

Series Editors: Cynthia Mulrow, MD, MSc and Deborah Cook, MD, MSc

Quantitative Synthesis in Systematic Reviews

Joseph Lau, MD; John P.A. Ioannidis, MD; and Christopher H. Schmid, PhD

1 November 1997 | Volume 127 Issue 9 | Pages 820-826

The final common pathway for most systematic reviews is a statistical summary of the data, or meta-analysis. The complex methods used in meta-analyses should always be complemented by clinical acumen and common sense in designing the protocol of a systematic review, deciding which data can be combined, and determining whether data should be combined. Both continuous and binary data can be pooled. Most meta-analyses summarize data from randomized trials, but other applications, such as the evaluation of diagnostic test performance and observational studies, have also been developed. The statistical methods of meta-analysis aim at evaluating the diversity (heterogeneity) among the results of different studies, exploring and explaining observed heterogeneity, and estimating a common pooled effect with increased precision. Fixed-effects models assume that an intervention has a single true effect, whereas random-effects models assume that an effect may vary across studies. Meta-regression analyses, by using each study rather than each patient as a unit of observation, can help to evaluate the effect of individual variables on the magnitude of an observed effect and thus may sometimes explain why study results differ. It is also important to assess the robustness of conclusions through sensitivity analyses and a formal evaluation of potential sources of bias, including publication bias and the effect of the quality of the studies on the observed effect.


A quantitative systematic review, or meta-analysis, uses statistical methods to combine the results of multiple studies. Meta-analyses have been done for systematic reviews of therapeutic trials, diagnostic test evaluations, and epidemiologic studies. Although the statistical methods involved may at first appear to be mathematically complex, their purpose is simple: They are trying to answer four basic questions. Are the results of the different studies similar? To the extent that they are similar, what is the best overall estimate? How precise and robust is this estimate? Finally, can dissimilarities be explained? This article provides some guidance in understanding the key technical aspects of the quantitative approach to these questions. We have avoided using equations and statistical notations; interested readers will find implementations of the described methods in the listed references. We focus here on the quantitative synthesis of reports of randomized, controlled, therapeutic trials because far more meta-analyses on therapeutic studies than on other types of studies have been published.

For practical reasons, we present a stepwise description of the tasks that are performed when statistical methods are used to combine data. These tasks are 1) deciding whether to combine data and defining what to combine, 2) evaluating the statistical heterogeneity of the data, 3) estimating a common effect, 4) exploring and explaining heterogeneity, 5) assessing the potential for bias, and 6) presenting the results.


Deciding Whether To Combine Data and Defining What To Combine

By the time one performs a quantitative synthesis, certain decisions should already have been made about the formulation of the question and the selection of included studies. These topics were discussed in two previous articles in this series [1, 2]. Statistical tests cannot compensate for lack of common sense, clinical acumen, and biological plausibility in the design of the protocol of a meta-analysis. Thus, a reader of a systematic review should always address these issues before evaluating the statistical methods that have been used and the results that have been generated. Combining poor-quality data, overly biased data, or data that do not make sense can easily produce unreliable results.

The data to be combined in a meta-analysis are usually either binary or continuous. Binary data involve a yes/no categorization (for example, death or survival). Continuous data take a range of values (for example, change in diastolic blood pressure after antihypertensive treatment, measured in mm Hg).

When one is comparing groups of patients, binary data can be summarized by using several measures of treatment effect that were discussed earlier in this series [3]. These measures include the risk ratio; the odds ratio; the risk difference; and, when study duration is important, the incidence rate. Another useful clinical measure, the number needed to treat (NNT), is derived from the inverse of the risk difference [3]. Treatment effect measures, such as the risk ratio and the odds ratio, provide an estimate of the relative efficacy of an intervention, whereas the risk difference describes the intervention's absolute benefit. The various measures of treatment effect offer complementary information, and all should be examined [4].
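As a minimal sketch of how these measures relate to one another, the following Python function computes them from the event counts of a single two-arm trial. The function name, argument names, and example counts are illustrative assumptions and do not come from the article.

```python
def binary_effect_measures(events_trt, n_trt, events_ctl, n_ctl):
    """Return common treatment-effect measures for one two-arm trial.

    All names here are hypothetical; the counts are whatever the trial reports.
    """
    risk_trt = events_trt / n_trt          # event risk in the treatment group
    risk_ctl = events_ctl / n_ctl          # event risk in the control group
    risk_ratio = risk_trt / risk_ctl       # relative measure
    odds_ratio = (risk_trt / (1 - risk_trt)) / (risk_ctl / (1 - risk_ctl))
    risk_difference = risk_trt - risk_ctl  # absolute measure
    # NNT is the inverse of the absolute risk difference;
    # it is treated as infinite when the two risks are equal.
    if risk_difference == 0:
        nnt = float("inf")
    else:
        nnt = 1 / abs(risk_difference)
    return {
        "risk_ratio": risk_ratio,
        "odds_ratio": odds_ratio,
        "risk_difference": risk_difference,
        "nnt": nnt,
    }


# Hypothetical trial: 10 of 100 events with treatment, 20 of 100 with control.
example = binary_effect_measures(10, 100, 20, 100)
```

With these made-up counts the risk ratio is 0.5 while the risk difference is only 0.1, which illustrates why relative and absolute measures are best read together.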

Continuous data can be summarized by the raw mean difference between the treatment and control groups when the treatment effect is measured on the same scale (for example, diastolic blood pressure in mm Hg), by the standardized mean difference when different scales are used to measure the same treatment effect (for example, different pain scales being combined), or by the correlation coefficients between two continuous variables [5]. The standardized mean difference, also called the effect size, is obtained by dividing the difference between the mean in the treatment group and the mean in the control group by the SD in the control group.
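The standardized mean difference as defined above can be sketched in a few lines of Python. The function name and the two pain-scale examples are hypothetical; the formula follows the definition in the text (division by the control-group SD).

```python
def standardized_mean_difference(mean_trt, mean_ctl, sd_ctl):
    """Difference in means divided by the SD of the control group."""
    if sd_ctl <= 0:
        raise ValueError("control-group SD must be positive")
    return (mean_trt - mean_ctl) / sd_ctl


# Two hypothetical trials that measure pain on different scales.
smd_ten_point_scale = standardized_mean_difference(3.0, 5.0, 2.0)
smd_hundred_point_scale = standardized_mean_difference(30.0, 50.0, 20.0)
```

Both invented trials yield the same effect size even though their raw mean differences (2 points and 20 points) are not comparable, which is the reason for standardizing before pooling.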


Evaluating the Statistical Heterogeneity of the Data

This step is intended to answer the question, Are the results of the different studies similar (homogeneous)? It is important to answer this question before combining any data. To do this, one must calculate the magnitude of the statistical diversity (heterogeneity) of the treatment effect that exists among the different sets of data.

Statistical diversity can be thought of as attributable to one or both of two causes. First, study results can differ because of random sampling error. Even if the true effect is the same in each study, the results of different studies would be expected to vary randomly around the true common fixed effect. This diversity is called the within-study variance. Second, each study may have been drawn from a different population, depending on the particular patients chosen and the interventions and conditions unique to the study. Therefore, even if each study enrolled a large patient sample, the treatment effect would be expected to differ. These differences, called random effects, describe the between-study variation with regard to an overall mean of the effects of all of the studies that could be undertaken.

The test most commonly used to assess the statistical significance of between-study heterogeneity is based on the chi-square distribution [6]. It provides a measure of the sum of the squared differences between the results observed and the results expected in each study, under the assumption that each study estimates the same common treatment effect. A large total deviation indicates that a single common treatment effect is unlikely. Any pooled estimate calculated must account for the between-study heterogeneity. In practice, this test has low sensitivity for detecting heterogeneity, and it has been suggested that a liberal significance level, such as 0.1, should be used [6].
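One common form of this chi-square test weights each squared deviation by the inverse of the study's variance. The sketch below assumes that form, takes study effects (for example, log risk ratios) and their within-study variances as inputs, and uses a small helper for the chi-square tail probability so that only the standard library is needed. All function names are illustrative assumptions.

```python
import math


def chi2_survival(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                 # Q(1, z) = exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))      # Q(1/2, z) = erfc(sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def heterogeneity_test(effects, variances):
    """Return (Q, degrees of freedom, P value) for between-study heterogeneity."""
    if len(effects) < 2:
        raise ValueError("at least two studies are needed")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the common (pooled) effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return q, df, chi2_survival(q, df)
```

Under the liberal threshold mentioned above, a P value below 0.1 would be read as evidence of heterogeneity.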


Estimating a Common Effect

The questions that this step tries to answer are 1) to the extent that data are similar, what is their best common point estimate of a therapeutic effect, and 2) how precise is this estimate? The mathematical process in this step generally involves combining (pooling) the results of different studies into an overall estimate. Compared with the results of individual studies, pooled results can increase statistical power and lead to more precise estimates of treatment effect.

Each study is given a weight according to the precision of its results. The rationale is that studies with narrow CIs should be weighted more heavily than studies with greater uncertainty. The precision is generally expressed by the inverse of the variance of the estimate of each study. The variance has two components: the variance of the individual study and the variance between different studies. When the between-study variance is found to be or assumed to be zero, each study is simply weighted by the inverse of its own variance, which is a function of the study size and the number of events in the study. This approach characterizes a fixed-effects model, as exemplified by the Mantel-Haenszel method [7, 8] or the Peto method [9] for dichotomous data. The Peto method has been particularly popular in the past. It has the advantage of simple calculation; however, although it is appropriate in most cases, it may introduce large biases if the data are unbalanced [10, 11]. On the other hand, random-effects models also add the between-study variance to the within-study variance of each individual study when the pooled mean of the random effects is calculated. The random-effects model most commonly used for dichotomous data is the DerSimonian and Laird estimate of the between-study variance [12]. Fixed- and random-effects models for continuous data have also been described [13]. Pooled results are generally reported as a point estimate and CI, typically a 95% CI.
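The weighting scheme can be sketched as follows. This example uses generic inverse-variance pooling rather than the Mantel-Haenszel or Peto methods, and it estimates the between-study variance with the usual method-of-moments form of the DerSimonian and Laird estimator. Function names and the two-study example are assumptions made for illustration.

```python
import math


def pool_inverse_variance(effects, variances, tau2=0.0):
    """Pool study effects with weights 1 / (within-study variance + tau2).

    tau2 = 0 gives a fixed-effects estimate; tau2 > 0 gives a random-effects one.
    Returns (estimate, standard error, 95% CI as a tuple).
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return estimate, se, (estimate - 1.96 * se, estimate + 1.96 * se)


def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of the between-study variance."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    if c <= 0:                      # for example, a single study
        return 0.0
    return max(0.0, (q - (len(effects) - 1)) / c)


# Hypothetical data: two studies with effects 1.0 and 3.0, each with variance 1.0.
effects, variances = [1.0, 3.0], [1.0, 1.0]
fixed_result = pool_inverse_variance(effects, variances)
tau2 = dersimonian_laird_tau2(effects, variances)
random_result = pool_inverse_variance(effects, variances, tau2)
```

In this invented example both models give the same point estimate, but the random-effects standard error is larger because the between-study variance is added to each study's own variance.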

Other quantitative techniques for combining data, such as the Confidence Profile Method [14], use Bayesian methods to calculate posterior probability distributions for effects of interest. Bayesian statistics are based on the principle that each observation or set of observations should be viewed in conjunction with a prior probability describing the prior knowledge about the phenomenon of interest [15]. The new observations alter this prior probability to generate a posterior probability. Traditional meta-analysis assumes that nothing is known about the magnitude of the treatment effect before randomized trials are performed. In Bayesian terms, the prior probability distribution is noninformative. Bayesian approaches may also allow the incorporation of indirect evidence in generating prior distributions [14] and may be particularly helpful in situations in which few data from randomized studies exist [16]. Bayesian analyses may also be used to account for the uncertainty introduced by estimating the between-study variance in the random-effects model, leading to more appropriate estimates and predictions of treatment efficacy [17].
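The way a prior is altered by new observations can be illustrated with the simplest conjugate case, in which both the prior and the new estimate are treated as normally distributed. This is a simplifying assumption chosen for illustration; it is not the Confidence Profile Method, and the function name and numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, estimate, estimate_var):
    """Combine a normal prior with a normally distributed estimate.

    Returns (posterior mean, posterior variance).
    """
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / estimate_var
    post_var = 1.0 / (prior_precision + data_precision)
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * estimate)
    return post_mean, post_var


# An informative prior pulls the estimate toward the prior mean ...
informative = normal_posterior(0.0, 1.0, 2.0, 1.0)
# ... whereas a very diffuse (nearly noninformative) prior leaves it almost unchanged.
diffuse = normal_posterior(0.0, 1e12, 2.0, 1.0)
```

The second call mirrors the situation described above: with a noninformative prior, the posterior is driven almost entirely by the observed data.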


Exploring and Explaining Heterogeneity

The next important issue is whether the common estimate obtained in the previous step is robust. Sensitivity analyses determine whether the common estimate is influenced by changes in the assumptions and in the protocol for combining the data.

A comparison of the results of fixed- and random-effects models is one such sensitivity analysis [18]. Generally, the random-effects model produces wider CIs than does the fixed-effects model, and the level of statistical significance may therefore be different depending on the model used. The pooled point estimate per se is less likely to be affected, although exceptions are possible [19].

Other sensitivity analyses may include the examination of the residuals and the chi-square components [13] and assessment of the effect of deleting each study in turn. Statistically significant results that depend on a single study may require further exploration.
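Deleting each study in turn can be sketched as a simple loop. The example below assumes fixed-effects inverse-variance pooling for the remaining studies; the function name and the three-study data are hypothetical.

```python
import math


def leave_one_out(effects, variances):
    """Re-pool the data with each study omitted in turn.

    Returns a list of (omitted index, pooled estimate, standard error).
    """
    if len(effects) < 2:
        raise ValueError("at least two studies are needed")
    results = []
    for omit in range(len(effects)):
        kept = [(y, v) for i, (y, v) in enumerate(zip(effects, variances))
                if i != omit]
        weights = [1.0 / v for _, v in kept]
        total = sum(weights)
        estimate = sum(w * y for w, (y, _) in zip(weights, kept)) / total
        results.append((omit, estimate, math.sqrt(1.0 / total)))
    return results


# Hypothetical effects from three equally precise studies.
sensitivity = leave_one_out([1.0, 3.0, 5.0], [1.0, 1.0, 1.0])
```

A pooled estimate that shifts markedly, or loses statistical significance, when one particular study is omitted points to the kind of single-study dependence that warrants further exploration.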

Cumulative Meta-Analysis

Cumulative meta-analysis is another approach for assessing the impact of each study [20]; it is the opposite of the stepwise deletion. In cumulative meta-analysis, studies are sequentially pooled by adding one study at a time in a prespecified order [21]. One possible order is according to the dates these studies were conducted or published. Cumulative meta-analysis can help determine whether the pooled estimate has been robust over time and can also determine the point in time when statistical significance was reached for a pooled result. When the order is the year of publication, cumulative meta-analysis can be seen as a form of Bayesian inference. The prior probability (the prior belief) is generated by the pooled results of all prior studies, and the posterior probability is derived by adding the results of the new study to the results of the others [21].
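A cumulative meta-analysis ordered by year can be sketched by updating running sums as each study is added. The example assumes fixed-effects inverse-variance pooling; the tuple layout, function name, and data are illustrative assumptions.

```python
import math


def cumulative_meta_analysis(studies):
    """Pool studies one at a time in order of year.

    studies: iterable of (year, effect, variance) tuples.
    Returns a list of (year, pooled estimate, lower 95% limit, upper 95% limit).
    """
    results = []
    sum_w = 0.0
    sum_wy = 0.0
    for year, effect, variance in sorted(studies, key=lambda s: s[0]):
        weight = 1.0 / variance
        sum_w += weight
        sum_wy += weight * effect
        estimate = sum_wy / sum_w
        se = math.sqrt(1.0 / sum_w)
        results.append((year, estimate, estimate - 1.96 * se, estimate + 1.96 * se))
    return results


# Hypothetical studies given out of chronological order.
history = cumulative_meta_analysis(
    [(1990, 1.0, 1.0), (1985, 3.0, 1.0), (1995, 5.0, 1.0)]
)
```

Each row of the result is the pooled estimate as it would have stood after that year's study, so the first row whose interval excludes the null value marks when statistical significance was reached.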

Meta-Regression

Further sensitivity analyses are generally dictated by the nature and the specifics of the question that the meta-analysis tries to answer and by the possible reasons that can be identified to explain heterogeneity. One such computational procedure, commonly referred to as meta-regression, involves the statistical assessment of whether specific factors (covariates) influence the magnitude of the point estimate of the treatment effect across studies [22]. Meta-regression results are generally reported as slope coefficients with CIs. The covariates of interest may describe study or patient characteristics. These characteristics may be common for all patients in each study (for example, the specific route of administration of the experimental drug used in each study) or they may be average values representative of the studied cohort (such as the mean age of the patients). Averages of covariates measured at the patient level require cautious interpretation because the aggregate values may not adequately represent important minorities of patients [23-25].

Some covariates are ubiquitous, such as study sample size, study result variance, and control rate of events (the percentage of patients with an event of interest in the control group). Other covariates may be problem-specific. Often, information on covariates may not be uniformly collected or reported across all studies, and analyses involving these covariates may therefore not be useful. A variety of statistical methods, including weighted least-squares, logistic regression, and hierarchical models, can be used for meta-regression analyses [22, 26-28].
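A weighted least-squares meta-regression on a single study-level covariate can be sketched as below. The sketch weights each study by the inverse of its variance and computes the slope's standard error under the assumption that the within-study variances are known and that no residual between-study variance remains; hierarchical models relax that assumption. The function name and the mean-age example are hypothetical.

```python
import math


def meta_regression(covariate, effects, variances):
    """Weighted least-squares fit of study effect on one study-level covariate.

    Returns (slope, intercept, 95% CI for the slope).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / sum_w
    y_bar = sum(w * y for w, y in zip(weights, effects)) / sum_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    if sxx == 0:
        raise ValueError("the covariate must vary across studies")
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)   # valid only under the stated assumptions
    return slope, intercept, (slope - 1.96 * se_slope, slope + 1.96 * se_slope)


# Hypothetical studies: mean patient age versus observed treatment effect.
slope, intercept, slope_ci = meta_regression(
    [50.0, 60.0, 70.0], [0.1, 0.2, 0.3], [1.0, 1.0, 1.0]
)
```

Each study contributes one observation, so the slope describes how the study-level effect changes with the study-level covariate and is subject to the aggregate-data caveat noted above.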

Figure 1 summarizes the three previous steps, which define the core of a meta-analysis. It shows that estimating, deciding whether to ignore, incorporating, exploring, and explaining heterogeneity are the key aims of the quantitative methods for synthesizing data from different studies.



 
Figure 1. Methodologic choices and their implications in dealing with heterogeneous data in a meta-analysis.

 

Subgroup Analysis

Subgroup analyses may be useful for addressing particular questions when data for different subgroups of patients are available from each study [29]. Combining specific subgroup data across studies follows the principles described above and may provide further insight into heterogeneity. Subgroup analyses in the retrospective setting of most meta-analyses are post hoc exercises and should be interpreted with caution, lest they turn into "fishing expeditions." An especially pernicious approach occurs when the data are divided into multiple subgroups on the basis of combinations of characteristics (such as age and dose) and differential treatment effects are claimed within very small subdivisions. Such interactions among subgroups are unlikely to describe the truth when derived from aggregated data.

Lack of uniform reporting of the data necessary for subgroup analyses across trials poses an additional problem. Thus, subgroup analyses should best be used as hypothesis-generating tools [22], although important observations may sometimes be made [30].


Assessing the Potential for Bias

The assessment of potential bias should always be part of a meta-analysis. Some issues relevant to this were discussed earlier in this series [1, 2]. Two major sources of bias for meta-analysis are the failure to find all of the studies performed in the clinical domain and the uncertain reliability of poor-quality studies.

Publication Bias

Studies with negative results are more likely to remain unpublished because investigators or the peer reviewers and editors are not enthusiastic about publishing "negative" information [31-33]. The chances of not being published are probably greater if the negative study is small and nonrandomized [34]. Some studies may be impossible to retrieve and include in a meta-analysis despite a thorough search of potential databases. Publication bias is difficult to eliminate, but some statistical procedures may be helpful in detecting its presence. An inverted funnel plot [35] is sometimes used to visually explore the possibility that publication bias is present (Figure 2). This method uses a scatterplot of studies that relates the magnitude of the treatment effect to the weight of the study. An inverted, funnel-shaped, symmetrical appearance of dots suggests that no study has been left out, whereas an asymmetrical appearance suggests the presence of publication bias. Formal computational approaches to test for, assess the extent of, and correct publication bias have also been described [36-39].
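The visual reading of a funnel plot can be mimicked, very crudely, by counting how many small studies fall on each side of the pooled estimate. The sketch below is only an informal tally and not one of the formal tests cited above; the function name, the cutoff for a "small" study, and the data are all hypothetical.

```python
def small_study_tally(effects, sample_sizes, pooled_effect, small_cutoff):
    """Count small studies on each side of the pooled estimate.

    Returns (number below the pooled effect, number above the pooled effect),
    counting only studies with sample size <= small_cutoff.
    """
    below = 0
    above = 0
    for effect, size in zip(effects, sample_sizes):
        if size > small_cutoff:
            continue                    # medium and large studies are ignored
        if effect < pooled_effect:
            below += 1
        elif effect > pooled_effect:
            above += 1
    return below, above


# Hypothetical risk ratios and sample sizes; pooled risk ratio taken as 0.8.
risk_ratios = [0.5, 0.6, 0.7, 1.1, 0.8, 0.82]
sizes = [40, 60, 90, 80, 5000, 12000]
tally = small_study_tally(risk_ratios, sizes, pooled_effect=0.8, small_cutoff=100)
```

A marked imbalance among the small studies, as in this invented example, is the pattern that an asymmetrical funnel plot displays and that would prompt a formal assessment of publication bias.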



 
Figure 2. An inverted funnel plot to detect publication bias. This example used data from a meta-analysis of intravenous streptokinase for acute myocardial infarction [20]. The risk ratio for the mortality reduction in each study is plotted against the weight of the study, represented by the sample size. A symmetric triangle is fitted around the pooled estimate (arrow) so that it encompasses most of the studies. If small "negative" trials with large variance have been left unpublished, the plot will be asymmetrical: Small published studies will show very large estimates of the treatment effect compared with larger studies that have more conservative results. A symmetrical plot provides reassuring evidence that the treatment effect is similar in studies of small and large variance, whereas an asymmetrical plot suggests possible publication bias. The plot shown here reveals that there are fewer small studies (involving 10 to 100 participants) with risk ratios greater than 0.8 than there are small studies with risk ratios less than 0.8, whereas the numbers of medium and large studies are fairly symmetrical. These results suggest that some small studies with negative findings were not published. Outlier studies may also be readily identified by using this plot.

 

Quality

Study quality was discussed in detail earlier in this series [2]. Investigators have proposed incorporating quality scores into meta-analyses on the basis of checklists of study design components [40-43]. To date, no scale has been proven to correlate consistently with treatment efficacy [44]. Beyond the generic features of study design and conduct, general quality-scoring systems may have to be supplemented or replaced with more problem-specific quality items for each particular meta-analysis [45]. Empirical investigations have shown that studies of worse quality may overestimate treatment effects because they inadequately conceal treatment allocation and use inadequate blinding [46].


Presenting the Results

The results of meta-analyses are typically presented in a graphic form (Figure 3) that shows the point estimates and their CIs. This presentation aims to convey an impression of the results of the individual studies, to convey the extent of heterogeneity, and to report the pooled estimate. Meta-regression analyses may be depicted by plots in which the value of the covariate of interest is shown on the horizontal axis and the magnitude of the treatment effect is shown on the vertical axis [48]. It is also important to report results from sensitivity analyses on key issues and comparison between fixed-effects and random-effects methods of pooling data when their results are different. When appropriate, reporting the NNT helps translate the results into a more clinically meaningful metric [3].



 
Figure 3. Standard meta-analysis and cumulative meta-analysis. Left. A standard meta-analysis plot of the risk ratios for progression to AIDS or death in a comparison of early therapy with zidovudine (treatment group) or deferred therapy with zidovudine (control group) [47]. The point estimates for the risk ratio of each study and the pooled point estimate are shown by the points, and the horizontal lines show the CIs, typically 95% CIs. N is the number of patients in the study. The studies are ordered according to year of publication. As a standard convention, a risk ratio of less than 1 denotes a reduction in the number of events in the treated compared with the control group. Right. The results of a cumulative meta-analysis of the same data. N is the number of patients in the clinical trials. The points and lines represent the point estimates and the 95% CIs of the pooled results after the inclusion of each additional study in the calculations. The CIs typically narrow with the addition of more studies unless substantial heterogeneity exists.

 


Other Types of Data and Methods

Meta-Analysis of Diagnostic Tests

An important application of meta-analysis is the combination of sensitivity and specificity data of diagnostic tests across different studies [49]. Using weighted linear regression to generate a summary receiver-operating characteristic (ROC) curve has been proposed as a way to avoid the underestimation of test performance that results when the correlation between sensitivity and specificity is ignored [50]. An ROC curve is a plot of the percentage of true-positive results (the sensitivity of the test) against the percentage of false-positive results (1 - specificity) and thus represents the tradeoff between these two test characteristics.
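One way such a regression is often set up is to regress the difference of the logits of the true-positive and false-positive rates on their sum and then back-transform the fitted line into a curve. The sketch below assumes that formulation, weights each study by the inverse of the approximate variance of its log odds ratio, and applies no continuity correction (so every cell count must be positive). It is not necessarily the exact procedure of the cited reference, and all names and counts are hypothetical.

```python
import math


def fit_summary_roc(studies):
    """Weighted regression of D = logit(TPR) - logit(FPR) on S = logit(TPR) + logit(FPR).

    studies: iterable of (TP, FN, FP, TN) counts, all positive.
    Returns (intercept a, slope b).
    """
    rows = []
    for tp, fn, fp, tn in studies:
        if min(tp, fn, fp, tn) <= 0:
            raise ValueError("all cell counts must be positive in this sketch")
        logit_tpr = math.log(tp / fn)
        logit_fpr = math.log(fp / tn)
        weight = 1.0 / (1.0 / tp + 1.0 / fn + 1.0 / fp + 1.0 / tn)
        rows.append((logit_tpr + logit_fpr, logit_tpr - logit_fpr, weight))
    sum_w = sum(w for _, _, w in rows)
    s_bar = sum(w * s for s, _, w in rows) / sum_w
    d_bar = sum(w * d for _, d, w in rows) / sum_w
    sss = sum(w * (s - s_bar) ** 2 for s, _, w in rows)
    if sss == 0:
        raise ValueError("studies must differ in S to fit a slope")
    b = sum(w * (s - s_bar) * (d - d_bar) for s, d, w in rows) / sss
    return d_bar - b * s_bar, b


def summary_tpr(fpr, a, b):
    """Sensitivity on the summary curve at a given false-positive rate."""
    logit_fpr = math.log(fpr / (1.0 - fpr))
    logit_tpr = (a + (1.0 + b) * logit_fpr) / (1.0 - b)
    return 1.0 / (1.0 + math.exp(-logit_tpr))


# Hypothetical studies that share one diagnostic odds ratio (16) at different thresholds.
a, b = fit_summary_roc([(80, 20, 20, 80), (80, 10, 30, 60), (60, 30, 10, 80)])
```

Because the three invented studies differ only in their positivity threshold, the fitted slope is essentially zero and the curve passes through each study's own sensitivity and false-positive rate.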

Meta-Analysis of Other Nonrandomized, Uncontrolled Data

Uncontrolled cohort data can also be combined by using meta-analytic techniques. The principles are the same as those described for randomized data. However, greater care is needed in the conduct of the analysis and interpretation of the results when nonrandomized and uncontrolled data are used because these data are more likely to be biased. Of particular interest is the synthesis of dose-response data across different studies that investigate the effect of increasing values of a potential etiologic factor on an outcome of interest (for example, exposure to environmental tobacco smoke and the occurrence of lung cancer) [17, 51-53].

Meta-Analysis of Individual Patient Data

Most meta-analyses are based on group data as reported in the literature, but researchers occasionally make the effort to collect the detailed outcomes and risk factor data for the individual patients involved in each of several studies. These data can be used in survival analyses and multivariate regression analyses. Meta-analysis of individual patient data is more expensive and time-consuming than meta-analysis of grouped data, and it requires the coordination of large teams of investigators and a robust protocol [54]. Nevertheless, if possible, meta-analysis of individual patient data may represent the highest step in the hierarchy of evidence [55].


Conclusions

By quantitatively summarizing a collection of data from clinical studies, meta-analysis provides collective results of a kind that no individual study can offer. Meta-analysis is a relatively new discipline in clinical medicine. As might be expected, discrepancies between the results of large trials and meta-analyses of smaller trials [19], as well as differences in the results of meta-analyses addressing the same topic, do occur [56, 57]. New quantitative methods are being developed, and investigators have addressed these important issues [58-60]. Synthesis of the data from many individual studies requires sound, rigorous, quantitative methods, and the results of such syntheses should be interpreted with appropriate caution; meta-analysis is not a "magic" solution to the problem of scientific evidence and cannot replace clinical reasoning [61]. In addition, reliable meta-analysis requires consistent, high-quality reporting of the primary data from individual studies; the need for such reporting cannot be overstated [62, 63]. (Table 1)


 
Table 1. Key Points To Remember

 


Glossary

Bayesian inference: A statistical discipline that addresses how a prior estimate should be modified in the light of knowledge gained from new studies.

Cumulative meta-analysis: A method whereby the combined point estimate of an effect is sequentially computed by adding one study at a time in a prespecified order.

Fixed-effects model: A model that assumes that all studies are studying the same true effect and that variability is due to random error only.

Heterogeneity: The diversity that exists between studies. It may be due to identifiable factors or statistical factors, or both, especially the component that cannot be explained by random error.

Meta-regression: A regression analysis in which individual sets of data (studies) are used as the unit of observation.

Random-effects model: A model that assumes that the true effect differs among studies and therefore must be represented by a distribution of values instead of a single value.

Receiver-operating characteristic curve: A plot of the characteristics of a diagnostic test. It depicts the tradeoff between the sensitivity and the specificity of the test.


Author and Article Information

From New England Medical Center and Tufts University School of Medicine, Boston, Massachusetts.
For definitions of terms used, see Glossary at end of text.
Acknowledgments: The authors thank Drs. Andrew Oxman and Larry V. Hedges for their reviews of and valuable comments on the manuscript and thank the clinical reviewer, Norman J. Wilder.
Grant Support: In part by grants R01 HS07782 and R01 HS 08532 from the Agency for Health Care Policy and Research (Drs. Lau and Schmid) and grant T32 AI07389 from the National Institutes of Health (Dr. Ioannidis).
Current Author Addresses: Drs. Lau and Schmid: Division of Clinical Care Research, New England Medical Center, 750 Washington Street, Box 63, Boston, MA 02111.
Dr. Ioannidis: Therapeutics Research Program, Division of AIDS, National Institute of Allergy and Infectious Diseases, Solar Building, Room 2C15, Bethesda, MD 20892.


References

1. Counsell C. Formulating questions and locating primary studies for inclusion in systematic reviews. Ann Intern Med. 1997; 127:380-7.

2. Meade MO, Richardson WS. Selecting and appraising studies for a systematic review. Ann Intern Med. 1997; 127:531-7.

3. McQuay HJ, Moore RA. Using numerical results from systematic reviews in clinical practice. Ann Intern Med. 1997; 126:712-20.

4. Sinclair JC, Bracken MB. Clinically useful measures of effect in binary analyses of randomized trials. J Clin Epidemiol. 1994; 47:881-90.

5. Cooper H, Hedges LV. The Handbook of Research Synthesis. New York: Russell Sage Foundation; 1994.

6. Fleiss JL. Statistical Methods for Rates and Proportions. 2d ed. New York: J Wiley; 1981:161-5.

7. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959; 22:719-48.

8. Laird NM, Mosteller F. Some statistical methods for combining experimental results. Int J Technol Assess Health Care. 1990; 6:5-30.

9. Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Prog Cardiovasc Dis. 1985; 27:335-71.

10. Greenland S, Salvan A. Bias in the one-step method for pooling study results. Stat Med. 1990; 9:247-52.

11. Fleiss JL. The statistical basis of meta-analysis. Stat Methods Med Res. 1993; 2:121-45.

12. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986; 7:177-88.

13. Hedges LV, Olkin I. Statistical Methods for Meta-Analysis. Orlando: Academic Pr; 1985.

14. Eddy DM, Hasselblad V, Schacter RD. Meta-Analysis by the Confidence Profile Method: The Statistical Synthesis of Evidence. New York: Academic Pr; 1991.

15. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. London: Chapman & Hall; 1995:148-54.

16. Lilford RJ, Thornton JG, Braunholtz D. Clinical trials and rare diseases: a way out of a conundrum. BMJ. 1995; 311:1621-5.

17. Dumouchel W. Meta-analysis for dose-response models. Stat Med. 1995; 14:679-85.

18. Berlin JA, Laird NM, Sacks HS, Chalmers TC. A comparison of statistical methods for combining event rates from clinical trials. Stat Med. 1989; 8:141-51.

19. Borzak S, Ridker PM. Discordance between meta-analyses and large-scale randomized, controlled trials. Examples from the management of acute myocardial infarction. Ann Intern Med. 1995; 123:873-7.

20. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med. 1992; 327:248-54.

21. Lau J, Schmid CH, Chalmers TC. Cumulative meta-analysis of clinical trials builds evidence for exemplary medical care. J Clin Epidemiol. 1995; 48:45-57.

22. Berlin JA, Antman EM. Advantages and limitations of metaanalytic regressions of clinical trials data. Online J Curr Clin Trials. 4 June 1994: Doc. No. 134.

23. Morgenstern H. Uses of ecologic analysis in epidemiologic research. Am J Public Health. 1982; 72:1336-44.

24. Langbein LI, Lichtman AJ. Ecological Inference. Beverly Hills, CA: Sage; 1978. (Sage University Paper Series on Quantitative Applications in the Social Sciences. Series no. 07-010.).

25. Greenland S, Robins J. Invited commentary: ecologic studies-biases, misconceptions, and counterexamples. Am J Epidemiol. 1994; 139:747-60.

26. McIntosh M. The population risk as an explanatory variable in research synthesis of clinical trials. Stat Med. 1996; 15:1713-28.

27. Morris CN, Normand SL. Hierarchical models for combining information and for meta-analyses. In: Bernardo JM, Berger JO, Dawid AP, Smith AF. Bayesian Statistics 4. New York: Oxford Univ Pr; 1992.

28. Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to random-effects meta-analysis: a comparative study. Stat Med. 1995; 14:2685-99.

29. Oxman AD, Guyatt GH. A consumer's guide to subgroup analyses. Ann Intern Med. 1992; 116:78-84.

30. Michels KB, Rosner BA. Data trawling: to fish or not to fish. Lancet. 1996; 348:1152-3.

31. Dickersin K, Chan S, Chalmers TC, Sacks HS, Smith H Jr. Publication bias and clinical trials. Control Clin Trials. 1987; 8:343-53.

32. Dickersin K. The existence of publication bias and risk factors for its occurrence. JAMA. 1990; 263:1385-9.

33. Begg CB. Publication bias. In: Cooper H, Hedges L, eds. The Handbook of Research Synthesis. New York: Russell Sage Foundation; 1994.

34. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet. 1991; 337:867-72.

35. Light RJ, Pillemer DB. Summing up: the science of reviewing research. Cambridge, MA: Harvard Univ Pr; 1984.

36. Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics. 1994; 50:1088-101.

37. Dear KB, Begg CB. An approach for assessing publication bias prior to performing a meta-analysis. Statistical Science. 1992; 7:237-45.

38. Hedges LV. Modeling publication selection effects in random effects models in meta-analysis. Statistical Science. 1992; 7:246-55.

39. Vevea JL, Hedges LV. A general linear model for estimating effect size in the presence of publication bias. Psychometrika. 1995; 60:419-35.

40. Chalmers TC, Smith H Jr, Blackburn B, Silverman B, Schroeder B, Reitman D, et al. A method for assessing the quality of a randomized control trial. Control Clin Trials. 1981; 2:31-49.

41. Mulrow CD, Linn WD, Gaul MK, Pugh JA. Assessing quality of a diagnostic test evaluation. J Gen Intern Med. 1989; 4:288-95.

42. Detsky AS, Naylor CD, O'Rourke K, McGeer AJ, L'Abbe KA. Incorporating variations in the quality of individual randomized trials into meta-analysis. J Clin Epidemiol. 1992; 45:255-65.

43. Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists. Control Clin Trials. 1995; 16:62-73.

44. Emerson JD, Burdick E, Hoaglin DC, Mosteller F, Chalmers TC. An empirical study of the possible relation of treatment differences to quality scores in controlled randomized clinical trials. Control Clin Trials. 1990; 11:339-52.

45. Greenland S. Invited commentary: a critical look at some popular meta-analytic methods. Am J Epidemiol. 1994; 140:290-6.

46. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995; 273:408-12.

47. Ioannidis JP, Cappelleri JC, Lau J, Skolnik PR, Melville B, Chalmers TC, et al. Early or deferred zidovudine therapy in HIV-infected patients without an AIDS-defining illness. Ann Intern Med. 1995; 122:856-66.

48. Holme I. Relation of coronary heart disease incidence and total mortality to plasma cholesterol reduction in randomised trials: use of meta-analysis. Br Heart J. 1993; 69(1 Suppl):S42-7.

49. Irwig L, Tosteson AN, Gatsonis C, Lau J, Colditz G, Chalmers TC, et al. Guidelines for meta-analyses evaluating diagnostic tests. Ann Intern Med. 1994; 120:667-76.

50. Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Stat Med. 1993; 12:1293-316.

51. Tweedie RL, Mengersen KL. Meta-analytic approaches to dose-response relationships, with application in studies of lung cancer and exposure to environmental tobacco smoke. Stat Med. 1995; 14:545-69.

52. Greenland S, Longnecker MP. Methods for trend estimation from summarized dose-response data, with applications to meta-analysis. Am J Epidemiol. 1992; 135:1301-9.

53. Smith SJ, Caudill SP, Steinberg KK, Thacker SB. On combining dose-response data from epidemiological studies by meta-analysis. Stat Med. 1995; 14:531-44.

54. Stewart LA, Clarke MJ. Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane Working Group. Stat Med. 1995; 14:2057-79.

55. Olkin I. Statistical and theoretical considerations in meta-analysis. J Clin Epidemiol. 1995; 48:133-46.

56. Cook DJ, Witt LG, Cook RJ, Guyatt GH. Stress ulcer prophylaxis in the critically ill: a meta-analysis. Am J Med. 1991; 91:519-27.

57. Tryba M. Prophylaxis of stress ulcer bleeding. A meta-analysis. J Clin Gastroenterol. 1991; 13(Suppl 2):S44-55.

58. Villar J, Carroli G, Belizan JM. Predictive ability of meta-analyses of randomised controlled trials. Lancet. 1995; 345:772-6.

59. Cappelleri JC, Ioannidis JP, Schmid CH, de Ferranti SD, Aubert M, Chalmers TC, et al. Large trials vs meta-analysis of smaller trials: how do their results compare? JAMA. 1996; 276:1332-8.

60. Cook DJ, Reeve BK, Guyatt GH, Heyland DK, Griffith LE, Buckingham L, et al. Stress ulcer prophylaxis in critically ill patients. Resolving discordant meta-analyses. JAMA. 1996; 275:308-14.

61. Ioannidis JP, Lau J. On meta-analyses of meta-analyses [Letter]. Lancet. 1996; 348:756.

62. Altman DG. Better reporting of randomised controlled trials: the CONSORT statement. BMJ. 1996; 313:570-1.

63. Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, et al. Improving the quality of reporting of randomized controlled trials. The CONSORT statement. JAMA. 1996; 276:637-9.

