Identifying Adverse Events Caused by Medical Care: Degree of Physician Agreement in a Retrospective Chart Review

  1. A. Russell Localio, JD, MPH, MS;
  2. Susan L. Weaver, MS;
  3. J. Richard Landis, PhD;
  4. Ann G. Lawthers, ScD;
  5. Troyen A. Brennan, MD, JD;
  6. Liesi Hebert, ScD; and
  7. Tonya J. Sharp, MS
  1. From Pennsylvania State University College of Medicine, Hershey, Pennsylvania; Harvard School of Public Health, Boston, Massachusetts; and Rush University and Rush-Presbyterian-St. Luke's Medical Center, Chicago, Illinois. Grant Support: In part by grant R01 HS07067-01 from the Agency for Health Care Policy and Research. Requests for Reprints: A. Russell Localio, JD, MPH, MS, Center for Biostatistics and Epidemiology, Pennsylvania State University College of Medicine, PO Box 850, Hershey, PA 17033-0850. Current Author Addresses: Mr. Localio and Dr. Landis: Center for Biostatistics and Epidemiology, Pennsylvania State University College of Medicine, PO Box 850, Hershey, PA 17033-0850.

    Abstract

    Objective: To 1) assess the degree of agreement among physicians on the cause of previously flagged adverse outcomes and 2) relate the findings to systems of quality assurance and performance assessment and proposals for no-fault compensation for medical injuries.

    Design: Observational study of 7533 pairs of “structured implicit” reviews (subjective opinions based on guidelines) of medical records done by 127 physicians working independently.

    Setting: Random sample of 51 inpatient facilities in New York State.

    Patients: Random sample of inpatient medical records from the selected facilities.

    Measurements: 1) Number of agreed-upon adverse events compared with the number of cases of extreme disagreement and 2) internally and indirectly standardized rates at which physician reviewers found adverse events (injuries to patients caused at least in part by medical management).

    Results: In 12.9% of cases (971 of 7533), the two physicians in a pair had extreme disagreement about the occurrence of an adverse event. These cases outnumbered those in which both reviewers found an adverse event (10%; n = 757). Agreement was highest for wound infections and lowest for adverse events attributed to failure to diagnose or lack of therapy. The amount of experience the physicians had in reviewing records tended to increase the level of agreement. Even after standardization to the results of the entire sample, individual physicians' rates of finding at least slight evidence of an adverse event varied widely (range, 9.9% to 43.7%) (P < 0.001).

    Conclusions: Structured implicit reviews produced disagreement on the causes of adverse patient outcomes. If systems of quality assurance, performance audits, or no-fault patient compensation are to succeed, methods for overcoming the common tendency toward disagreement among experts must be developed.

    Retrospective case review has long been a mainstay of peer review. It supports scientific studies and medical audits as well as assessments of the appropriateness, effectiveness, and quality of health care provided by physicians, hospitals, or regions. As part of quality assurance, hospitals and clinics regularly use formal and informal case review. Insurers and managed care organizations rely on case review when making decisions about coverage. All forms of case review depend heavily on expert opinion.

    Case review also underlies current and proposed systems of compensating patients for injuries caused by medical care. Under the current litigation system, the patient must prove, with the support of expert medical opinion, that medical care contributed to the injury (causation) and fell below the standards of practice in the community (negligence). Under proposed “no-fault” alternatives to litigation, entitlement to compensation and liability for payment might also depend on an expert's opinion as to whether the patient's outcome was caused by medical care rather than by a preexisting disease or condition [1-4].

    Critics have identified several problems with case review. First, experts cannot form a consensus about which outcomes are adverse. Second, medical technology changes rapidly and creates uncertainty about the appropriateness and effectiveness of practices. Third, administrative or transaction costs in making individualized determinations of causation might be high [5-8]. The American College of Physicians [9] and others [10] have called for further demonstration projects.

    Building on previous research on the reliability of clinical judgments, we used a large sample of physician reviews of medical records to estimate the degree of agreement on the cause of adverse patient outcomes. We also discuss the implications of the results for quality assurance, performance assessment, and proposals for no-fault patient compensation.

    Methods

    Cases were obtained from the Medical Practice Study, a project designed to estimate the rate of adverse events occurring among inpatients in a random sample of 31 429 medical records from 51 health care facilities in New York State. We defined an adverse event as an injury that 1) was caused at least in part by medical management and 2) required or prolonged hospitalization or led to disability after discharge. The injury could result from a provider's action or inaction in either inpatient or outpatient settings or from a drug or medical device. The medical management did not have to be substandard or inappropriate; the injury could follow an unexpected complication. Adverse outcomes caused solely by underlying disease or by the intended consequences of treatment were not considered to be adverse events. For example, an injury to the recurrent laryngeal nerve during partial thyroidectomy (an unplanned and unintended but recognized complication) would be considered an adverse event, but the intentional destruction of the same nerve in a radical thyroid resection for cancer would not. A broken experimental balloon that led to an embolus and stroke during cardiac catheterization would, as a complication of treatment, also be an adverse event, especially if the patient's risk was unknown. This result would apply even in a study approved by a Human Subjects Committee.

    Other aspects of the Medical Practice Study and the general methods have been widely reported [11-15]. The following methods are relevant to our report.

    Record Review

    Records were reviewed in two stages. In stage 1 (which is not the subject of this report), nurses and medical records administrators used a single review per case to screen the entire sample of records for the presence of 1 or more of 18 explicit criteria (Figure 2). These criteria were based primarily on previous research [16] and were revised by the physician investigators of the Medical Practice Study. Although explicit, the criteria were broad and open to interpretation. The nurses and records administrators received an extensive manual, which contained detailed examples of the criteria, and 2 hours of focused classroom training from team leaders chosen for this project. To increase the efficiency and accuracy of screening, the nurses and records administrators used preprinted forms generated by the project management team. Nurses were instructed to refer any questionable cases for stage 2 review. Questions of a more general nature were referred to supervisors and then to the project office for consistent responses. The estimated negative predictive value of the screening was 99.5% [17].

    Figure 2. Screening Criteria Implemented at Stage 1 Review by Nurses and Medical Records Administrators.

    Judgments on adverse events by pairs of physician-reviewers and rate of agreement on occurrence of adverse events compared with extreme disagreement. If a = cases of extreme disagreement (one reviewer scored the outcome as 0 [no possible adverse event] and the other scored the case as 4, 5, or 6) and b = cases for which both reviewers found adverse events (both scored the case as 4, 5, or 6), then the reported rate of agreement = b/(a + b). Bars represent exact binomial 95% CIs. Numbers in parentheses are the population-weighted estimates of the number of cases in New York State in 1984 that are represented by the sampled cases reported in this figure.

    In stage 2, each record that had or may have had at least one criterion present was further analyzed by two physicians who worked independently. Physicians were recruited primarily from New York State through a network of personal contacts of the study investigators. The physicians could not review records at the hospitals in which they practiced. Most were board certified in surgery (23%) or internal medicine (68%); the remaining were certified in obstetrics and gynecology, family practice, pediatrics, urology, or emergency medicine. Eighty-five percent were male. Most physicians were in the early stage of their careers: Fifty-five percent had received board certification within the 10 years before the study began. All physicians had telephone access to a panel of experts.

    A separate manual and a structured abstraction form guided the stage 2 review. As described previously [17], both were revised repeatedly after extensive pilot testing. The 65-page manual included explicit instructions on several types of adverse events. According to the manual, for example, all surgical wound infections were “almost invariably” adverse events, as were all falls and all drug reactions that prolonged hospitalization or caused disability. A 14-page abstraction form first asked the physician reviewer to assess whether an adverse event might have occurred. If the physician found “no possible adverse event,” the review was stopped and the case received a score of 0. If an adverse event might have occurred, the reviewer considered a list of factors on the cause of the injury and rated his or her confidence about the occurrence of an adverse event on an ordinal scale of 1 to 6 (Figure 1). For a confidence score of 2 (“slight to modest evidence” of an adverse event) or greater, the reviewer indicated the type of event (fall, drug reaction, wound infection, error of omission, or failure to diagnose), the number of additional days of hospitalization (if applicable), and the degree of disability over and above the underlying disease. Finally, the reviewers considered whether the adverse event resulted from negligence. Within this structure, however, the physician could exercise discretion in judging the cause of the injury, hospitalization, or disability (a “structured implicit” review). All physician reviewers identified themselves by number, with the understanding that their confidential opinions would not be used for quality assurance, peer review, or litigation. Copies of the abstraction booklet are available from the authors.

    Figure 1.

    Our report focuses on the two independent expert opinions obtained during stage 2 review as to whether an adverse outcome identified during stage 1 had been caused at least in part by medical management. Results of each assessment of causation were linked to the patient's computerized discharge data summary to identify the patient's age, diagnosis, and discharge status.

    Statistical Analysis

    Agreement between Reviewers

    We calculated a rate of agreement between the two physician reviewers in each pair on adverse events using a statistic described by Grant [18] for assessing agreement on abnormal tracings from electronic fetal monitoring. In our application, the numerator of this statistic was the number of cases in which both reviewers assessed their confidence in an adverse event as “more likely than not” or greater. This assessment corresponded to a score of 4, 5, or 6. The denominator was the sum of the numerator and the number of cases of extreme disagreement, for which one reviewer scored the case as 4, 5, or 6 and the other physician found “no possible adverse event” (a score of 0). The statistic therefore compared the number of cases with agreed-upon adverse events with the number of clear disagreements.
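
    The statistic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' SAS code; the function names `classify_pair` and `agreement_rate` are hypothetical, and the only inputs assumed are the two confidence scores (0 to 6) recorded for each case.

```python
POSITIVE_SCORES = {4, 5, 6}  # "more likely than not" or greater


def classify_pair(score_1: int, score_2: int) -> str:
    """Classify one pair of reviews (hypothetical helper, not from the paper)."""
    first_positive = score_1 in POSITIVE_SCORES
    second_positive = score_2 in POSITIVE_SCORES
    if first_positive and second_positive:
        return "both_positive"
    if (first_positive and score_2 == 0) or (second_positive and score_1 == 0):
        return "extreme_disagreement"
    return "other"  # ignored by the statistic


def agreement_rate(both_positive: int, extreme_disagreement: int) -> float:
    """Agreed-upon adverse events divided by agreed-upon plus extreme disagreements."""
    informative = both_positive + extreme_disagreement
    if informative == 0:
        raise ValueError("no agreed-upon events or extreme disagreements")
    return both_positive / informative


def agreement_rate_from_pairs(pairs) -> float:
    """Compute the rate from an iterable of (score_1, score_2) tuples."""
    labels = [classify_pair(a, b) for a, b in pairs]
    return agreement_rate(labels.count("both_positive"),
                          labels.count("extreme_disagreement"))
```

    Pairs in which both reviewers scored the case as 0, or in which either score fell between 1 and 3 without meeting the definitions above, drop out of both numerator and denominator, which is why the statistic does not depend on the number of clearly normal cases.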

    This statistic does not include cases for which both physicians agreed that no adverse event had occurred. It recognizes that agreement about whether a patient's condition is normal (no adverse event) is usually greater than agreement about whether a patient has disease or an abnormal condition [19-22]. The statistic is also not affected by the number of clearly normal cases in the samples of cases for review. In our study, the number of cases clearly without adverse events at stage 2 was influenced by the coarseness of the previous screening process. The stage 1 reviewers were cautioned to avoid false-negative determinations if they were in doubt, so that adverse events would not be overlooked.

    This statistic also facilitated comparisons of rates of agreement across subsets of such adverse events as drug reactions, which are defined only if one or both physicians found and described the event. Clearly normal cases (for which both physicians found no possible adverse event) could not be categorized in this manner. In addition, the statistic permitted comparisons between our data and those of similar studies that had different designs and prevalences of abnormal findings (adverse events, preventable deaths, drug reactions, or quality problems). For all rates, we calculated exact binomial CIs [23], which are slightly wider than those calculated using the normal approximation [24].
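
    As an illustration of the interval estimates mentioned above, the following Python sketch computes an exact (Clopper-Pearson) binomial CI by bisection and, for comparison, the normal-approximation (Wald) interval. It is a minimal sketch under stated assumptions: the function names are hypothetical, the standard library alone is used, and it treats the counts as a simple unweighted binomial sample, so it does not reproduce the sampling-weight adjustments applied in the study.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect_decreasing(f, iterations: int = 60) -> float:
    """Root of a function that decreases from positive to negative on (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(x: int, n: int, level: float = 0.95):
    """Clopper-Pearson interval for x successes in n trials."""
    half_alpha = (1.0 - level) / 2.0
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: half_alpha - (1.0 - binom_cdf(x - 1, n, p)))
    # Upper limit: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - half_alpha)
    return lower, upper


def wald_ci(x: int, n: int, z: float = 1.959964):
    """Normal-approximation interval, shown only for comparison."""
    p = x / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width
```

    Calling `exact_binomial_ci(757, 1728)` and `wald_ci(757, 1728)` on the unweighted counts reported in the Results gives two intervals around 757/1728, with the exact interval the slightly wider of the two.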

    We also computed the commonly used weighted κ statistic [25]; we recognized, however, that this does not distinguish among different patterns of agreement [26] and might be unsuitable for comparisons across studies [27]. Weighted κ statistics vary with the prevalence of the result being scored [28], depend on the choice of weight, and are sensitive to the study design (that is, to the number of clearly normal persons in the sample [29]).
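
    For readers unfamiliar with the weighted κ statistic, a minimal Python sketch follows. It assumes ordered categories and uses linear or quadratic disagreement weights; the report does not state which weights were applied, so the weighting options and the function name `weighted_kappa` are assumptions made purely for illustration.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weighting="linear"):
    """Cohen's weighted kappa for two raters scoring the same cases.

    `categories` lists the ordered score values (for example, 0 through 6).
    Weights are disagreement weights: 0 on the diagonal, rising with distance.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions for each (rater A, rater B) score combination.
    joint = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        joint[index[a]][index[b]] += 1.0 / n
    row = [sum(joint[i]) for i in range(k)]
    col = [sum(joint[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        distance = abs(i - j) / (k - 1)
        return distance if weighting == "linear" else distance ** 2

    observed = sum(weight(i, j) * joint[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(weight(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return 1.0 - observed / expected
```

    Because the expected disagreement is built from the marginal proportions `row` and `col`, the same pattern of paired scores can yield different κ values in samples with different prevalences, which is the dependence on prevalence and study design noted above.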

    Reviewer Calibration

    We define calibration as a reviewer's underlying propensity for finding an adverse event [30, 31]. When the cases being reviewed have varying degrees of evidence about the cause of injury, reviewers with different calibrations will necessarily disagree about some cases. Only reviewers with common calibrations have a chance of perfect agreement, and even then they might disagree on individual cases.

    The unit of analysis was the review rather than the case. Each physician's rate of finding adverse events was standardized [32] by using a regression model to account for differences across cases. We selected physician confidence scores of 0 or 1 (no possible adverse event or little or no evidence of management-related causation of an adverse event) to reflect the absence of an adverse event and scores of 2 through 6 to indicate the occurrence of an adverse event. The use of a single cut-point allowed for a relatively simple binary-outcome regression model of adverse events. We used this particular cut-point because the reviewers had to describe only adverse events that had a score of 2 or more; some reviewers appear to have used scores of 0 and 1 as equivalents. Cases scored as 2 or 3, reflecting that the reviewer believed an adverse event had occurred but that he or she had a low level of confidence in this opinion, were lumped with cases for which opinions on causation were stronger (scores of 4, 5, and 6). Other cut-points were then used to assess the sensitivity to this choice.

    Candidate regression covariates selected on the basis of our previous knowledge on predictors of adverse events [11, 14] included hospital (or, alternatively, hospital ownership, teaching status, and location), patient age, survival at hospital discharge, length of stay in the hospital, diagnosis group, race, reimbursement source, and the median income of the ZIP code of the patient's residence. Initial modeling with logistic regression resulted in an estimate for each reviewer of the expected number of adverse events, adjusted for the mix of cases. We then applied empirical Bayes methods [33] through mixed-effects logistic regression to compensate for two problems: 1) the natural tendency for physicians who had done fewer reviews to have more variable estimates than would physicians who had done many reviews and 2) the implicit multiple comparisons of repeated testing of each physician against the entire group [34, 35]. The mixed model included the predictors of an adverse event as fixed effects and the physician reviewers as random effects. All calculations were done using the SAS statistical package (SAS Institute, Cary, North Carolina), and empirical Bayes methods were done using a specialized program within the SAS package [36].
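
    The standardization step can be illustrated with a short Python sketch of indirect standardization. It assumes that a case-mix model has already supplied, for every review, an expected probability of an adverse-event finding; the tuple layout and the function name `standardized_rates` are hypothetical. The sketch deliberately omits the empirical Bayes (mixed-effects) step, so reviewers with few reviews are not shrunk toward the overall rate as they were in the analysis described above.

```python
from collections import defaultdict


def standardized_rates(reviews, cut_point: int = 2):
    """Indirectly standardized rate of adverse-event findings per reviewer.

    `reviews` is an iterable of (reviewer_id, score, expected_probability),
    where expected_probability comes from a separately fitted case-mix model.
    A score at or above `cut_point` counts as a finding of an adverse event.
    """
    observed = defaultdict(int)
    expected = defaultdict(float)
    total_findings = 0
    total_reviews = 0
    for reviewer, score, probability in reviews:
        finding = 1 if score >= cut_point else 0
        observed[reviewer] += finding
        expected[reviewer] += probability
        total_findings += finding
        total_reviews += 1
    if total_reviews == 0:
        raise ValueError("no reviews supplied")

    overall_rate = total_findings / total_reviews
    rates = {}
    for reviewer in observed:
        if expected[reviewer] == 0.0:
            raise ValueError(f"expected count is zero for reviewer {reviewer!r}")
        # Observed-to-expected ratio scaled by the rate in the whole sample.
        rates[reviewer] = overall_rate * observed[reviewer] / expected[reviewer]
    return rates
```

    Passing a different `cut_point` (for example, 4) repeats the calculation under another definition of a finding, which mirrors the sensitivity check on the choice of cut-point.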

    Results

    Agreement between Reviewers

    Working independently, 127 paired physicians assessed 7533 inpatient admissions for the occurrence of an adverse event (Table 1). A total of 15 066 reviews were done. On average, physicians found that an event was at least “more likely than not” to have been caused by medical management in 18% of reviews (2764 of 15 066). There were more cases of extreme disagreement on the occurrence of an adverse event than there were cases for which both reviewers found an adverse event. The paired physicians strongly disagreed in 12.9% of cases (971 of 7533): One physician found that the event was at least “more likely than not” to have been caused by medical management, and the other found no possible adverse event. In 10% of cases (n = 757), the two physicians agreed on the occurrence of an adverse event; both scores indicated a confidence level of at least “more likely than not” that management caused an adverse event (Figure 1). The rate of agreement on the presence of an adverse event for all cases was 0.44 (757/[757 + 971]). After adjustment for the sampling weights, this rate was 0.42 (95% CI, 0.39 to 0.44). For comparison purposes, this sample of reviews has a chance rate of agreement of 0.11, which represents the chance number of cases for which both reviewers found adverse events, divided by the sum of chance agreements on adverse events plus the chance cases of extreme disagreement.
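
    One way to read the chance rate of agreement is sketched below in Python. The sketch assumes that the two reviews of a case are statistically independent and that both reviewers share the same marginal proportions of positive scores (4, 5, or 6) and of zero scores; the report does not give the marginal proportion of zero scores or the exact computation, so the function name and the example inputs are illustrative assumptions rather than study values.

```python
def chance_agreement_rate(p_positive: float, p_zero: float) -> float:
    """Chance-expected rate of agreement under independent reviews.

    p_positive: marginal proportion of reviews scored 4, 5, or 6.
    p_zero:     marginal proportion of reviews scored 0.
    """
    if not (0.0 < p_positive <= 1.0) or not (0.0 <= p_zero <= 1.0):
        raise ValueError("proportions must lie in (0, 1] and [0, 1]")
    if p_positive + p_zero > 1.0:
        raise ValueError("proportions cannot sum to more than 1")
    # Both reviewers positive by chance.
    chance_both_positive = p_positive ** 2
    # One positive and the other zero, in either order.
    chance_extreme_disagreement = 2.0 * p_positive * p_zero
    return chance_both_positive / (chance_both_positive
                                   + chance_extreme_disagreement)
```

    Under these assumptions the expression simplifies to p_positive / (p_positive + 2 × p_zero), so the chance rate falls as reviews scored 0 become more common relative to reviews scored 4, 5, or 6.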

    Table 1. Accounting of Cases Sampled and Subjected to Duplicate Physician Review for Adverse Events

    Type of Adverse Event

    Overall, the rates of agreement differed significantly among the types of cases specified by the reviewers (P < 0.001) (Figure 2). Wound infections showed the highest rate (0.62 [CI, 0.56 to 0.69]). Drug reactions and falls had markedly lower rates of agreement (0.48 [CI, 0.42 to 0.54] and 0.37 [CI, 0.17 to 0.61], respectively). However, the small number of falls makes the rate for falls unstable, as its wide CI shows. Rates of agreement were lowest when reviewers concluded that the event was caused by a failure to diagnose (0.32 [CI, 0.26 to 0.38]) or the omission of therapy (0.24 [CI, 0.18 to 0.31]).

    Reviewer Experience

    We found greater rates of agreement within pairs of physicians who had reviewed many cases on this project before the date of their review of the case of interest. For example, when both reviewers had previously reviewed 200 cases (n = 27 physicians), the rate of agreement was 0.62 (205 of 332) (CI, 0.56 to 0.67). This figure, however, was heavily influenced by 2 of these 27 physicians. Exclusion of these 2 physicians' 494 joint reviews reduced the rate of agreement among the most experienced physicians to 0.50 (113 of 227) (CI, 0.43 to 0.56). The κ statistic was 0.57 (CI, 0.50 to 0.63). The reduced rate of agreement (0.50) remained significantly higher than that for cases assessed by less experienced reviewers (0.37 [CI, 0.35 to 0.40]; P < 0.001).

    These results, however, were not always consistent. Of the 7533 pairs of reviews, 237 were completed by pairs of the seven “senior” physicians who helped to design this study, the abstracting forms, and the guidelines for the review. All seven were thoroughly familiar with the processes and goals of the study. For this subsample of cases, the rate of agreement was similar to that for all cases: 0.41 (15 of 37) (CI, 0.25 to 0.66). The κ statistic was 0.50 (CI, 0.34 to 0.66).

    Reviewer Calibration

    Individual reviewers showed markedly different calibrations (that is, the rates at which they found adverse events). The regression model used to standardize cases and permit comparison among reviewers included the following predictors of adverse events: the facility from which the medical records were obtained, the patient's age, whether the patient died in the hospital, the length of the hospital stay, and four groupings of diagnosis-related groups. Each predictor was statistically significant (P < 0.001).

    The observed and expected numbers of adverse events across the 127 physician reviewers were markedly different when compared using a chi-square test (P < 0.001) or computer simulations (P < 0.001). This large variation in reviewer-specific adjusted rates persisted regardless of the cut-point used to delineate a finding of an adverse event along the ordered scale of confidence in the reviewer's finding. An empirical Bayes analysis (Table 2) identified 18 of the 127 reviewers as statistical “outliers”; the standardized rates at which these 18 physicians found adverse events were significantly higher or lower than average (on the basis of a critical P value of 0.05) and ranged from 9.9% to 43.7%. Several of these outlier physicians were experienced reviewers. Four of the 10 low outliers (who had low rates of finding adverse events) and 2 of the 8 high outliers each completed more than 200 reviews. A third high outlier was one of the senior physicians who helped to design and guide the study. Thus, experience did not eliminate the variation in the propensity to find adverse events.

    Table 2. Rates of the Finding of Adverse Events by “Outlier” Physician Reviewers*

    Discussion

    Our results corroborate previous findings that assessments based on medical records, especially when implicit and not guided by objective criteria, produce disagreement among physicians on the appropriateness and quality of care [22, 37-43], the cause of injuries at birth [44], and the preventability of death [45-47]. Several aspects of our study shed further light on these issues. First, we used a measure of agreement that is flexible and informative. Second, our large sample of cases included different types of cases, thus facilitating a comparison of the rate of disagreement by the type of adverse event. Third, the large number of reviewers supported an estimate of the degree of variation in physicians' propensity to find adverse events. Finally, our sample was population-based and thus permitted an estimate of the statewide volume of adverse events that might be in dispute.

    By applying a simple statistic to data shown in a common format (Figure 1), we could compare the number of cases for which both reviewers agreed on the occurrence of an adverse event with the number of extreme disagreements. Readers might define adverse events (in terms of confidence score) or extreme disagreement differently than we have. For that reason, we advise that raw data rather than arbitrary or common statistics of agreement be reported.

    Rates of agreement varied across subsets of cases. Agreement was greatest for wound infections, which were covered by specific guidelines and should be clearly associated with the site and time of surgery. Falls should also be relatively easy to identify, except in cases in which patients have not complied with orders for bed rest or medications. Our finding of the lower rate of agreement with drug reactions is consistent with findings of earlier studies [48-50] that relied heavily on unguided expert opinion on a question of “inescapable difficulties and complexities” [51]. As a recent report suggests [52], adverse drug events might now be described with greater agreement than our study and others have shown. Reproducibility and validity of judgments on drug reactions can be improved with the use of diagnostic tests [53], focused algorithms [54, 55], or consensus conferences [56].

    In contrast, adverse events caused by omitted therapy or failure to diagnose a treatable disease or condition are, by nature, more complex. In these cases, the underlying disease rather than the medical intervention caused the adverse outcome; the reviewer must conclude that appropriate therapy or timely diagnosis could have prevented or cured the disease or caused it to enter remission. Similar findings have been seen in studies on preventable deaths, in which trauma clearly causes the death and the aim of medical care is timely, appropriate intervention. Clinical experiments designed specifically to test whether physicians agree on which deaths are preventable have found case review to be unreliable [47]; independent reviewers often arrive at opposite conclusions [46, 57] in part because they differ in their prognoses of critically ill patients [58]. In our study, the statewide population estimate of cases of extreme disagreement on whether failure to diagnose or omission of therapy caused injuries was 28 800. This represents 32% of all cases of marked disagreement and is almost half the population estimate of the annual number of cases for which reviewers agreed that an adverse event had occurred (60 400) (Figure 2). The U.S. legal system has long recognized the difficulty of determining causation in these cases [59].

    The large number of reviewers in our study allowed us to assess variation in physicians' propensity to find adverse events, that is, differences in calibrations [60]. A lack of reviewer experience or understanding cannot explain this degree of variation, nor can the deficiencies in the medical record (all reviewers worked with the same materials). The explanation must lie in differences in previous expectations, in the diagnostic criteria being applied, or in the inability of some reviewers to avoid the hindsight bias of knowing that an unfavorable outcome has already occurred [61, 62]. In a similar retrospective analysis of hospital records done in New York State 20 years ago, Richardson [63] reported that “it would appear that some judges were consistently lenient in their judgments of care quality, whereas others, with the same apparent consistency, appeared to be strict.” Rates of agreement cannot be high unless the expert reviewers are uniformly calibrated.

    The design of a study can limit its generalizability. In particular, the sample of records that the physician reviewers examined depended on the stage 1 screening done by nurses. Previous reports indicate that the number of adverse events missed at stage 1 was low, but our design could have generated a biased sample for review. In addition, although our study describes reviewer variation in a particular clinical exercise and reports possible reasons for differences in levels of agreement, it was not designed (as are some studies [64, 65]) to elicit the sources of variation. Although both physicians in a pair based their opinions on the same screening criterion flagged previously by a nurse (Figure 2), we could not determine whether the reviewers disagreed because one physician could not locate the relevant supporting information in a disorganized medical record or whether both physicians found the same facts but had different opinions about their implications [66, 67]. Adding more information (for example, from medical records established before hospitalization or autopsy records) might improve reliability [47]. One study suggests, however, that reviewers will remain confident about their disparate opinions despite limitations in the record [45].

    Our study also could not measure the effect of discussion and building consensus among reviewers on levels of agreement. Two studies [47, 68] found only marginal improvement in agreement when the team approach was formally tested. Finally, our reviewers were primarily general internists or surgeons rather than specialists in the problems before them. Although the reviewers had telephone access to specialists, few made use of it. Further research should focus on whether specialists agree more often than do generalists.

    Our analysis of physician calibration on adverse events has a shortcoming common to statistical models. These models often cannot completely adjust for differences in patients' illnesses. Because the odds of an adverse event vary with illness, an underspecified regression model might overstate the true degree of variation among reviewers in their propensity for finding adverse events. However, the empirical Bayes methods we used tended to compensate for incomplete adjustment.

    Systems of performance review, quality assurance, or patient compensation based on case review continue to face the problem of “observer variability” among physicians in various tasks [69, 70]. Especially difficult are judgments on the cause of adverse outcomes, because the physician must often subjectively assess what has not occurred: the patient's outcome with different treatment [71]. For that reason, a previous study called for an end to subjective judgment on preventable trauma-related deaths [68]. With sufficiently large samples of adverse events, the level of agreement reported in our study and other studies [72, 73] might support comparisons of large groups of patients. However, that rate of agreement falls short of the rate considered necessary for making decisions about quality and accountability in individual cases [22, 74]. As some have proposed [75-77], the challenge is to design objective criteria, algorithms, or guidelines that will increase agreement on the cause of suboptimal outcomes. The manner in which disagreements are resolved can influence the number of adverse outcomes attributed to medical management. For example, a decision rule requiring a unanimous panel opinion on the presence of management-related causation might protect health care providers from unjustified censure or liability [46], but at the expense of a patient seeking compensation or coverage for injuries. Areas that must still be studied are questions on the optimal number, training, and qualifications of reviewers; the need for tiers of reviews; and the scope and quality of supporting medical records and documentation.

    Table 3

    Ms. Weaver: New England Research Institutes, 9 Galen Street, Watertown, MA 02172.

    Drs. Lawthers and Brennan: Department of Health Policy and Management, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115.

    Dr. Hebert: Rush Institute on Aging and Rush Alzheimer's Disease Center, Rush University and Rush-Presbyterian-St. Luke's Medical Center, 1645 West Jackson Boulevard, Suite 675, Chicago, IL 60612.

    Ms. Sharp: Department of Biostatistics, CB 7400, University of North Carolina, Chapel Hill, NC 27599-7400.

    References

    1. 1.
    2. 2.
    3. 3.
    4. 4.
    5. 5.
    6. 6.
    7. 7.
    8. 8.
    9. 9.
    10. 10.
    11. 11.
    12. 12.
    13. 13.
    14. 14.
    15. 15.
    16. 16.
    17. 17.
    18. 18.
    19. 19.
    20. 20.
    21. 21.
    22. 22.
    23. 23.
    24. 24.
    25. 25.
    26. 26.
    27. 27.
    28. 28.
    29. 29.
    30. 30.
    31. 31.
    32. 32.
    33. 33.
    34. 34.
    35. 35.
    36. 36.
    37. 37.
    38. 38.
    39. 39.
    40. 40.
    41. 41.
    42. 42.
    43. 43.
    44. 44.
    45. 45.
    46. 46.
    47. 47.
    48. 48.
    49. 49.
    50. 50.
    51. 51.
    52. 52.
    53. 53.
    54. 54.
    55. 55.
    56. 56.
    57. 57.
    58. 58.
    59. 59.
    60. 60.
    61. 61.
    62. 62.
    63. 63.
    64. 64.
    65. 65.
    66. 66.
    67. 67.
    68. 68.
    69. 69.
    70. 70.
    71. 71.
    72. 72.
    73. 73.
    74. 74.
    75. 75.
    76. 76.
    77. 77.