Discordance of Databases Designed for Claims Payment versus Clinical Information Systems: Implications for Outcomes Research

  1. James G. Jollis, MD;
  2. Marek Ancukiewicz, PhD;
  3. Elizabeth R. DeLong, PhD;
  4. David B. Pryor, MD;
  5. Lawrence H. Muhlbaier, PhD; and
  6. Daniel B. Mark, MD, MPH
  From Duke University Medical Center, Durham, North Carolina.

  Requests for Reprints: James G. Jollis, MD, Box 3254, Duke University Medical Center, Durham, NC 27710.

  Grant Support: By grants HS-06503 and HS-05635 from the Agency for Health Care Policy and Research; grant HL-17670 from the National Heart, Lung, and Blood Institute; and a grant from the Robert Wood Johnson Foundation. Dr. Jollis was an American College of Cardiology Merck Research Fellow during the course of this research.

  Acknowledgments: The authors thank Patrick S. Romano, MD, MPH, and Leslie L. Roos, PhD, who designed the ICD-9-CM mapping system used in the Patient Outcomes Research Team for ischemic heart disease; the other members of the Patient Outcomes Research Team for ischemic heart disease for their comments on earlier versions of this work; and Lloyd Hedgpeth and his staff in the Duke Medical Center Information System for providing the insurance claims data for this study.

    Abstract

    Objective: To determine the suitability of insurance claims information for use in clinical outcomes research in ischemic heart disease.

    Design: Concordance study of two databases.

    Setting: Tertiary care referral center.

    Patients: A total of 12 937 consecutive patients hospitalized for cardiac catheterization for suspected ischemic heart disease between July 1985 and May 1990.

    Interventions: Two-by-two tables were used to compute overall and kappa measures of agreement comparing clinical versus claims data for 12 important predictors of prognosis in patients with ischemic heart disease.

    Measurements: Kappa statistics (agreement adjusted for chance agreement) were used to quantify agreement rates.

    Results: Agreement rates between the clinical and claims databases ranged from 0.83 for the diagnosis of diabetes to 0.09 for the diagnosis of unstable angina (kappa values). Claims data failed to identify more than one half of the patients with prognostically important conditions, including mitral insufficiency, congestive heart failure, peripheral vascular disease, old myocardial infarction, hyperlipidemia, cerebrovascular disease, tobacco use, angina, and unstable angina, when compared with the clinical information system.

    Conclusions: Our results suggest that insurance claims data lack important diagnostic and prognostic information when compared with concurrently collected clinical data in the study of ischemic heart disease. Thus, insurance claims data are not as useful as clinical data for identifying clinically relevant patient groups and for adjusting for risk in outcome studies, such as analyses of hospital mortality.

    Insurance claims data are being used increasingly to study clinical outcomes and quality of care [1-4]. Each year, hospital-specific mortality rates, adjusted by clinically modified International Classification of Diseases (ICD-9-CM) codes from Medicare bills, are released by the Health Care Financing Administration (HCFA) [1-3]. Using ICD-9-CM data to adjust for illness severity, Williams and colleagues [4] found threefold differences in surgeon-specific mortality in Philadelphia. Many of the Patient Outcomes Research Teams (PORTs), supported by the Agency for Health Care Policy and Research, are using ICD-9-CM coded Medicare discharge abstracts to examine the process of medical care, including physician- and hospital-specific performance [5-7].

    The potential advantages of using insurance claims data sets for clinical research have been described in many previous publications [8]. They include 1) large samples of geographically dispersed patients; 2) longitudinal records; 3) data already collected and available; and 4) defined sampling frames. The question remains: Are data collected to obtain insurance reimbursement a valid proxy for data collected for clinical care and research purposes? Such validity is essential to identify clinically relevant populations and to adjust for illness severity and differences in outcomes [9].

    Six reabstracting studies have attempted to answer this question with respect to analysis of patients discharged after acute myocardial infarction [10-15]. These studies selected patients with the ICD-9-CM code 410, the code for acute myocardial infarction. By examining medical records, they found that clinical criteria for an acute myocardial infarction were met in 43% to 87% of records where the code was used at discharge. Errors resulted when the physician listed the acute myocardial infarction incorrectly, when a myocardial infarction occurred in a previous admission, or when myocardial infarction was ruled out (if it was the admitting diagnosis).

    A substantial limitation of five of these studies was that they selected patients based on claims data. Thus, the groups selected for review were only those patients with an ICD-9-CM code for myocardial infarction. Using this design, it was only possible to obtain estimates of disagreement in one direction; patients who had a condition coded in the clinical data set, but not in the claims data set, could not be examined.

    A second limitation of the previous studies is that their comparison gold standard was based on retrospective review of information recorded in the discharge summary or medical chart. Medical record data are limited by the unstructured way in which they are collected. Inaccuracies in these sources cannot be identified in such a study, and it is possible that in some disagreements with ICD-9-CM codes, the medical record is incorrect.

    Our study examined the suitability of billing data compared with clinical data (prospectively collected for cardiology research and patient care) for use in clinical outcomes research. The descriptors for coronary artery disease that we examined were those listed as important determinants of prognosis by an expert panel from the American College of Cardiology [16].

    Methods

    Insurance Claims Data

    The administrative or insurance claims information comprised all discharge abstracts from Duke University Medical Center between July 1985 and May 1990 containing any procedure code for coronary arteriography. All discharged patients, regardless of insurance status or age, were routinely classified by ICD-9-CM codes recorded by trained medical record technicians based on the attending physician's listed discharge diagnoses, the discharge summary, and selected information from the progress notes and from the test result sections of the hospital chart [17]. These records contained up to 30 diagnostic codes and 9 procedure codes.

    After the technician had assembled the ICD-9-CM codes, the discharge abstract and the chart were returned to the attending physician for final approval by signature; ICD-9-CM codes were not generated for patients having outpatient cardiac catheterization unless they were subsequently admitted for further evaluation or treatment. The records for the subgroup of Medicare patients in this study were sent by Duke Hospital to the North Carolina Medicare intermediary and, thus, reflect the Duke Hospital data contained in the Health Care Financing Administration data sets.

    Clinical Database Data

    The clinical information consisted of important diagnostic and prognostic information about coronary artery disease routinely collected on standardized data forms by the cardiology fellow doing the cardiac catheterization for suspected ischemic heart disease. Information collected included details from the patient history, physical examination, laboratory studies, and cardiac catheterization, as previously described [18]. Each new fellow entering the catheterization laboratory was given a 3-hour training session on variable definitions and use of the data forms and was given an operations manual covering these details. In addition, all data were reviewed for accuracy by the attending angiographer associated with the case; additional consistency, range check, and other quality control measures were done during the data entry process by trained research technicians. This information was stored in the Duke Databank for Cardiovascular Disease, a completely separate and independent system from the hospital administrative records described above.

    Records Matching and Variable Definitions

    Records from the administrative and clinical files were matched by unique, patient hospital identification numbers and hospitalization dates. Only the first matching clinical record for each patient was included in the analysis.
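    The matching step described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the field names (`patient_id`, `admit`, `discharge`, `cath_date`) are hypothetical, and the rule that a clinical record matches a claims record when the catheterization date falls within the hospitalization dates is an assumption about how "hospitalization dates" were used.

```python
from datetime import date

def match_first_records(claims, clinical):
    """Pair each patient's earliest matching clinical record with the
    claims record whose hospitalization dates contain the catheterization."""
    # Index claims discharge abstracts by patient identifier.
    claims_by_patient = {}
    for rec in claims:
        claims_by_patient.setdefault(rec["patient_id"], []).append(rec)

    matched = {}
    # Walk clinical records in date order so the first match per patient wins.
    for cath in sorted(clinical, key=lambda r: r["cath_date"]):
        pid = cath["patient_id"]
        if pid in matched:
            continue  # only the first matching clinical record is analyzed
        for stay in claims_by_patient.get(pid, []):
            if stay["admit"] <= cath["cath_date"] <= stay["discharge"]:
                matched[pid] = (cath, stay)
                break
    return matched

# Hypothetical example data (not from the study).
claims = [
    {"patient_id": "A1", "admit": date(1986, 3, 1), "discharge": date(1986, 3, 9)},
    {"patient_id": "A1", "admit": date(1988, 6, 2), "discharge": date(1988, 6, 5)},
    {"patient_id": "B2", "admit": date(1987, 1, 10), "discharge": date(1987, 1, 12)},
]
clinical = [
    {"patient_id": "A1", "cath_date": date(1988, 6, 3)},
    {"patient_id": "A1", "cath_date": date(1986, 3, 2)},
    {"patient_id": "C3", "cath_date": date(1989, 4, 4)},  # no claims record on file
]
pairs = match_first_records(claims, clinical)
```

    In this toy input, only patient A1 appears in both files, and the 1986 catheterization is retained because it is the earliest clinical record with a matching hospitalization.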

    Twelve clinical variables were mapped to ICD-9-CM codes according to an algorithm developed by the Patient Outcomes Research Team for chronic ischemic heart disease (Table 1) (Romano PS, Roos LL. Unpublished observations). The variables studied were selected if they met two criteria: 1) They were considered to be determinants of prognosis for coronary artery disease according to an expert panel from the American College of Cardiology; 2) they could be mapped to diagnoses contained in the ICD-9-CM coding system [16, 17]. The definitions of the clinically identified conditions appear in the Appendix.

    Table 1. International Classification of Diseases-9-CM and Clinical Detail Map
    Appendix. Glossary of Terms
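
    To make the mapping idea concrete, the sketch below flags conditions from a list of diagnosis codes. It is only a partial illustration: it contains just the ICD-9-CM codes that this article names in its text, the full map is the one in Table 1, and the matching rule (a three-digit category also covers its subcodes, whereas a fully specified code must match exactly) is an assumption rather than the Patient Outcomes Research Team algorithm. The function and variable names are hypothetical.

```python
# Partial, illustrative map using only codes named in this article.
CONDITION_CODES = {
    "acute myocardial infarction": ["410"],
    "old myocardial infarction": ["412"],
    "angina": ["413"],
    "congestive heart failure": ["398.91", "402.01", "402.11", "402.91", "428"],
}

def _matches(code, pattern):
    # Assumption: a category such as "410" also covers subcodes like "410.1";
    # a fully specified code such as "402.01" must match exactly.
    if "." in pattern:
        return code == pattern
    return code == pattern or code.startswith(pattern + ".")

def conditions_from_claims(diagnosis_codes):
    """Return the set of mapped conditions present in one discharge abstract."""
    found = set()
    for condition, patterns in CONDITION_CODES.items():
        if any(_matches(c, p) for c in diagnosis_codes for p in patterns):
            found.add(condition)
    return found

# Hypothetical abstract: two mapped codes and one unmapped code.
example = conditions_from_claims(["410.11", "250.00", "428.0"])
```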

    Data Analysis

    Based on the clinical condition and the ICD-9-CM map described above, two-by-two tables were constructed to assess the agreement between the data sources. For the claims data, a condition was considered to be absent if it was not coded. For the clinical data, patients with missing data were excluded from the analysis for the specific missing condition. Kappa statistics were generated for each condition to measure agreement while controlling for chance agreement [19]. Confidence intervals and test statistics for proportions were calculated by the normal approximation. For the diagnoses of acute myocardial infarction, congestive heart failure, angina, and unstable angina, we reviewed a random sample of 15 clinical-positive and claims-negative charts as well as 15 claims-positive and clinical-negative charts for each diagnosis to illustrate the major reasons for disagreement. In addition to the comparisons made in the overall data sets, subsets defined by age, fiscal year, and sex were compared to determine if the coding accuracy varied according to these factors.
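
    The two-by-two tabulation and the kappa statistic can be sketched as below. This is a minimal illustration of the standard kappa formula (observed agreement minus chance-expected agreement, divided by one minus chance-expected agreement), with the two rules stated above: an uncoded claims condition counts as absent, and patients with missing clinical data are excluded. The function names and the example counts are hypothetical.

```python
def two_by_two(pairs):
    """Tabulate (clinical, claims) pairs for one condition.

    clinical: True, False, or None (None = missing, excluded).
    claims:   True if coded, False if not coded (treated as absent).
    Returns (a, b, c, d): both positive, clinical only, claims only, both negative.
    """
    a = b = c = d = 0
    for clinical, claims in pairs:
        if clinical is None:
            continue  # missing clinical data: excluded for this condition
        if clinical and claims:
            a += 1
        elif clinical:
            b += 1
        elif claims:
            c += 1
        else:
            d += 1
    return a, b, c, d

def kappa(a, b, c, d):
    """Agreement adjusted for chance agreement (undefined if expected == 1)."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts: observed agreement 0.85, chance agreement 0.50.
example_kappa = kappa(40, 10, 5, 45)
```

    With these made-up counts, observed agreement is 0.85 and chance agreement is 0.50, so kappa is 0.70.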

    Results

    The study group consisted of 12 937 consecutive patients having inpatient cardiac catheterization between July 1985 and May 1990. Although each record represented the first cardiac catheterization in the claims records, from the perspective of the clinical records, 89% involved the first catheterization, 8% involved the second catheterization, and the remaining 3% involved the third or subsequent catheterization. The patients had a mean age of 58.8 years, 34% were women, and the racial composition was 88% white, 10% black, and 2% other. At cardiac catheterization, the mean left ventricular ejection fraction was 52%. The distribution of the number of diseased major epicardial vessels (zero, one, two, or three) was 23%, 26%, 23%, and 28%, respectively. Overall, the study group characteristics were similar to those of other large angiographic registries except for the greater proportion of women and the higher mean age [20, 21].

    Measures of Agreement

    Specific measures of agreement between clinical database and ICD-9-CM variables are listed in Table 2 in descending order of kappa value (the agreement rate adjusted for chance agreement). Kappas ranged from 0.83 for diabetes mellitus to 0.09 for unstable angina. Of the 12 conditions, only 3 (diabetes, acute myocardial infarction, and hypertension) were identified by the claims data more than 50% of the time that they were identified by the clinical data.

    Table 2. Comparison of Agreement by Condition Ranked by Kappa Value

    In the clinical data set, two conditions, congestive heart failure and mitral regurgitation, were graded according to severity. With increasing severity levels, the claims data were more likely to identify the presence of these conditions. Claims data identified 31% of clinically identified congestive heart failure that was New York Heart Association class I and II and identified 45% of class III and IV heart failure (P < 0.0001) [22]. Similarly, claims data identified 40% of grades I and II mitral regurgitation and identified 69% of grades III and IV mitral regurgitation (P < 0.0001).

    When all diagnoses were considered together, the overall agreement of ICD-9-CM codes with clinical data was 0.75 (99% CI, 0.75 to 0.76). The proportion of conditions in the clinical data set identified by claims data was 0.39 (99% CI, 0.38 to 0.39) (Table 3). Stratified by fiscal year of admission, the claims data were more likely to identify clinical conditions over time, from 0.33 (99% CI, 0.32 to 0.34) in fiscal year 1985 to 1986 to 0.46 (99% CI, 0.45 to 0.48) in fiscal year 1989 to 1990. Stratified by age, the claims records for patients older than 64 years were more likely to identify clinical conditions than were records for patients ages 64 or younger.
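
    A proportion with a 99% confidence interval by the normal approximation, the method named in the Data Analysis section, can be sketched as follows. The counts in the example are hypothetical and are not the study's denominators; the function name and the rounded quantile constant are illustrative choices.

```python
import math

Z_99 = 2.5758  # two-sided 99% quantile of the standard normal distribution

def proportion_ci(successes, n, z=Z_99):
    """Proportion with a normal-approximation confidence interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clip to [0, 1]; the approximation can overshoot near the boundaries.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical: 390 of 1000 clinically identified conditions also coded in claims.
p, lower, upper = proportion_ci(390, 1000)
```

    The interval narrows with the square root of the sample size, which is consistent with the narrow intervals reported above for analyses that pool all conditions and patients.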

    Table 3. Proportion of Clinically Identified Conditions Identified by Claims Data

    In the clinical data set, information was missing for fewer than 1% of patients for all conditions except mitral insufficiency (607 missing, 4.7%) and hyperlipidemia (308 missing, 2.4%). Mitral insufficiency was based on ventriculography as defined in the Appendix. Patients missing this information did not have ventriculography because of renal insufficiency, cardiogenic shock, or hemodynamic instability. The principal reason for the missing hyperlipidemia information was that a fasting cholesterol level was pending at the time of cardiac catheterization and the previous lipid status was unknown. To evaluate the effects of these missing data on our results in a sensitivity analysis, we assumed that all of the missing clinical data for mitral insufficiency and hyperlipidemia were concordant with the claims data. This produced no substantial change in overall agreement. Kappa values for mitral insufficiency and hyperlipidemia increased by 0.02 and 0.01, respectively, and the proportion of clinical data identified by claims data increased by 0.01 for both diagnoses.
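
    The sensitivity analysis for missing data can be sketched as follows, assuming that "concordant with the claims data" means each patient with a missing clinical value is assigned the value recorded in the claims data. Under that assumption, missing patients with the condition coded are added to the both-positive cell and the remainder to the both-negative cell. All counts below are hypothetical.

```python
def kappa(a, b, c, d):
    """Kappa from a two-by-two table (a, d = agreeing cells)."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)

def kappa_assuming_concordant(a, b, c, d, missing_coded, missing_not_coded):
    """Recompute kappa after imputing missing clinical values from claims."""
    # Imputed patients agree by construction, so only cells a and d grow.
    return kappa(a + missing_coded, b, c, d + missing_not_coded)

# Hypothetical table plus 50 patients with missing clinical data.
base = kappa(40, 60, 10, 890)
adjusted = kappa_assuming_concordant(40, 60, 10, 890, missing_coded=5, missing_not_coded=45)
```

    Because every imputed patient agrees, this assumption can only favor agreement, so it bounds how much the missing data could have lowered the reported kappa values.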

    Diagnoses Not Identified by the Data

    For the diagnoses of acute myocardial infarction, congestive heart failure, angina, and unstable angina, 15 678 cases occurred in which the clinical conditions were not identified by the claims data and 1276 cases occurred in which the claims diagnoses were not identified by the clinical data. We surveyed a stratified random sample of 120 charts to determine the reasons for disagreement between the two sources.
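
    The stratified sample can be sketched as below: 15 charts are drawn at random from each of the 8 strata (4 diagnoses by 2 directions of disagreement), giving 120 charts. The chart identifiers, stratum sizes, and seed are hypothetical; the study's actual sampling procedure is not described beyond the stratum sizes.

```python
import random

DIAGNOSES = [
    "acute myocardial infarction",
    "congestive heart failure",
    "angina",
    "unstable angina",
]
DIRECTIONS = [
    "clinical-positive/claims-negative",
    "claims-positive/clinical-negative",
]

def stratified_sample(strata, per_stratum=15, seed=0):
    """Draw a fixed number of charts, without replacement, from each stratum."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return {key: rng.sample(chart_ids, per_stratum) for key, chart_ids in strata.items()}

# Hypothetical chart identifiers: 200 discordant charts per stratum.
strata = {
    (dx, direction): [f"{i}-{j}-{n}" for n in range(200)]
    for i, dx in enumerate(DIAGNOSES)
    for j, direction in enumerate(DIRECTIONS)
}
sample = stratified_sample(strata)
total = sum(len(ids) for ids in sample.values())
```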

    For the 60 patients whose clinical conditions were not identified by the claims data set, reasons for disagreement suggested by chart review included the following: 1) the physician omitting or not specifically stating the condition in the discharge data or discharge diagnoses (n = 32); 2) the information arising after cardiac catheterization changing the diagnosis (n = 15); and 3) the medical record technician errors of coding acute myocardial infarction as old myocardial infarction (n = 5). Two charts contradicted the clinical diagnosis, both concerning congestive heart failure. For six charts, the medical record did not contain sufficient information to determine the correct source.

    For the 60 patients whose claims conditions were not identified by the clinical data, the reasons for disagreement suggested by chart review included the following: 1) the discharging physician listing an incorrect diagnosis in the discharge summary or discharge diagnoses (n = 16); 2) events occurring after catheterization (n = 10) (myocardial infarction, n = 4; congestive heart failure, n = 4; unstable angina, n = 2); 3) catheterization results changing the diagnosis (n = 4); and 4) medical record technician errors of coding diagnoses without support in the chart (n = 7) (acute myocardial infarction, n = 6; and unstable angina, n = 1). Six charts supported the claims data (angina, n = 1; congestive heart failure, n = 5). For 17 charts, the medical record did not contain sufficient information to determine the correct source.

    Discussion

    The most important finding of our study is that for prognostically important conditions (including mitral insufficiency, congestive heart failure, peripheral vascular disease, old myocardial infarction, hyperlipidemia, cerebrovascular disease, tobacco use, angina, and unstable angina), claims data failed to identify more than one half of those patients identified as having the condition by a clinical information system. Claims data also lack two of the most important prognostic variables for coronary disease, the left ventricular ejection fraction and the number of diseased vessels [16]. This deficiency of clinical detail raises serious concern about using claims data to identify patients with specific diagnoses, to adjust for appropriate risk factors, and to identify outcomes such as congestive heart failure or acute myocardial infarction and suggests the continuing need for prospectively collected clinical data in outcomes research.

    Under-estimating Comorbidity

    For example, the hospital-specific mortality information released annually by the Health Care Financing Administration uses ICD-9-CM-coded diagnoses in the models that estimate predicted mortality [1-3]. The chronic cardiovascular disease comorbidities in their explanatory model for mortality include conditions examined in our study: ICD-9-CM code 412 for old myocardial infarction; code 413 for angina; and code 428 for heart failure. If claims data miss 64% to 71% of patients with these conditions, as our study suggests, comorbidity may be severely under-estimated. Such under-estimation of comorbid illness would particularly affect mortality estimates of hospitals that take care of more severely ill patients, markedly under-estimating their predicted mortality.

    In the Health Care Financing Administration analysis, patients are also classified for presentation based on ICD-9-CM codes examined in our study including 410 (acute myocardial infarction) and 398.91, 402.01, 402.11, 402.91, and 428, which are all codes for congestive heart failure [1-3]. Our study suggests that the analysis presented by specific diagnostic categories for acute myocardial infarction and congestive heart failure potentially omits 24% and 61% of patients who should fall into such categories, respectively. If the omission of patients were systematically biased, the findings of the Health Care Financing Administration analysis would be skewed. Possible sources of such bias include the omission of patients with a larger number of diagnoses, due to truncation of the Medicare data at 5 diagnoses, and the tendency not to code chronic or comorbid conditions for patients who die. Previous analyses of claims data by Jencks and Iezzoni and their colleagues [23, 24] showed protective prognostic effects of diabetes, hypertension, and previous myocardial infarction. Because this contradicted the available clinical data as well as clinical intuition, they felt that this finding was due to undercoding of chronic conditions for more severely ill patients. However, this hypothesis could not be verified because neither concurrent clinical data collection nor chart re-abstraction was done in these studies.
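
    The effect of truncating a discharge abstract at five diagnoses can be illustrated with a small sketch. It assumes that truncation keeps the first five listed codes and that the codes of interest are the three-digit categories named in this article; the placeholder codes "P1" through "P4" and the example abstract are hypothetical.

```python
def truncate_diagnoses(codes, limit=5):
    """Keep only the first `limit` diagnosis codes (assumed truncation rule)."""
    return codes[:limit]

def lost_by_truncation(codes, codes_of_interest, limit=5):
    """List codes of interest that are present but fall beyond the limit."""
    kept = set(truncate_diagnoses(codes, limit))
    return [c for c in codes if c in codes_of_interest and c not in kept]

# Hypothetical abstract with seven diagnoses; "P1".."P4" are placeholders
# for other conditions listed ahead of the chronic comorbidities.
abstract = ["410", "P1", "P2", "P3", "P4", "428", "412"]
comorbidity_codes = {"410", "412", "413", "428"}
lost = lost_by_truncation(abstract, comorbidity_codes)
```

    In this made-up abstract, heart failure (428) and old myocardial infarction (412) sit in the sixth and seventh positions and are dropped, which is the kind of loss that would be concentrated among patients with many diagnoses.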

    Our study analyzed the utility of insurance claims data as a substitute for clinical data to examine prognosis. We assumed that claims data were sufficiently valid to accomplish administrative goals; conversely, clinical data could be viewed as the standard only for the specific purpose of studying clinical outcomes. The different purposes of the two data sources best explain their disagreement. For example, previous myocardial infarctions may lead to increased risk in patients with coronary artery disease and are, therefore, recorded in the clinical data set. To determine reimbursement, claims data are limited to the most important diagnoses pertaining to the use of health resources as evident at the time of discharge. The guidelines for ICD-9-CM coding approved by the Health Care Financing Administration, the American Hospital Association, the American Medical Record Association, and the National Center for Health Statistics state, "Code 412 (old myocardial infarction) is never used with a code in the 410 to 414 series (acute myocardial infarction, angina, coronary atherosclerosis, and other acute and subacute forms of ischemic heart disease)" [25]. Thus, a large number of old myocardial infarctions that are recorded in the clinical data are not recorded by claims data. To improve insurance claims data for use in examining outcomes and quality of care, coding guidelines need to be modified to address the goal of characterizing illness severity in addition to the goal of determining the reasons for health resource use.

    Claims Data Omit Infarctions

    Our study is the largest to date that identified acute myocardial infarctions that were omitted by claims data. Twenty-four percent of clinically identified acute myocardial infarctions (1227 of 5032) were not coded by ICD-9-CM codes. Most previous work examining the validity of the acute myocardial infarction diagnosis only examined claims records that contained an ICD-9-CM code for acute myocardial infarction and, thus, were not able to detect acute infarctions that had not been coded [11-14]. The only previous study that selected charts regardless of the presence of an ICD-9-CM code for acute myocardial infarction was done by Fisher and colleagues [15]. Using medical record technician re-abstraction as the gold standard, Fisher found that 10% of 271 acute myocardial infarctions were not coded by Medicare insurance abstracts. Because Fisher's study was restricted to medical record technicians reviewing the same records available to the original claims coding technicians, their ability to identify additional myocardial infarctions was limited, in contrast to the physicians in the Duke clinical system who could use cardiac enzyme levels and serial electrocardiogram results to identify infarctions.
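
    The arithmetic behind the 24% figure can be checked directly from the counts given above. The variable names are illustrative; the two counts are the ones reported in this section.

```python
clinical_ami = 5032   # clinically identified acute myocardial infarctions
not_coded = 1227      # of these, not coded in the claims data

coded = clinical_ami - not_coded             # infarctions the claims data did identify
proportion_missed = not_coded / clinical_ami
proportion_identified = coded / clinical_ami

print(f"missed: {proportion_missed:.1%}, identified: {proportion_identified:.1%}")
```

    The complement, roughly three quarters identified, is consistent with acute myocardial infarction being one of the three conditions that claims data identified more than 50% of the time.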

    Despite the overall lack of agreement between clinical and claims data, subgroup analysis suggests that claims data became more likely to identify clinical conditions over time. Between fiscal years 1985 to 1986 and 1989 to 1990, the overall proportion of clinical conditions identified by claims data increased by 13 percentage points, from 0.33 to 0.46 (P < 0.001), a small but statistically significant improvement. This improvement is likely to continue if accurate discharge information remains a requirement for reimbursement.

    The improved identification of conditions for persons older than 64 years, the group eligible for Medicare, may be due to two factors. Because reimbursement for Medicare is based specifically on ICD-9-CM coded diagnoses, the financial incentive to specify illness may have made identification of such illness more likely. In addition, more severe levels of illness in the elderly may have led to increased coding of any illness.

    The problem of excess coding of conditions in the claims data has also become evident in our analysis. Of 60 instances of conditions coded in the claims data but not in the clinical data, at least 23 were coded in error. For an additional 17 patients, the correct source could not be determined from the charts, and in 10 instances the conditions were actually outcomes rather than comorbidities.

    The only independent audit of Duke Hospital abstracting is done by the North Carolina Medicare intermediary, for which limited data are available from 1986 to 1990. The intermediary reabstracted a random sample of 3% of Duke Hospital Medicare insurance claims, and if more than 5% (at least 6 cases) contained errors in coding or billing, the hospital was placed on intensified review the following quarter. For the 4 years for which data were still available, Duke Hospital was under intensified review for 7 of 16 quarters. The errors that were counted included diagnosis and procedure coding errors and other technical errors, such as submitting diagnoses that were not reimbursed by Medicare. Neither the North Carolina Medicare intermediary nor the Health Care Financing Administration generates data about past medical record coding performance across institutions.

    Limitations of Study

    Although our study is limited by its use of data from a single tertiary care institution, our results are comparable to those of previous population-based claims validity studies [11, 14, 15, 26]. The positive predictive value of an ICD-9-CM code for acute myocardial infarction in our study was 91% (acute myocardial infarction identified by the claims data also present in the clinical data) compared with the 80% to 92% range of the previous population representative studies. Compared with the claims data from the Health Care Financing Administration that are truncated after five diagnosis codes, the claims data in our study, containing up to 30 diagnostic codes per patient, are more likely to identify clinical conditions. The overall agreement of 0.75 in our study is similar to the aggregate results of the National Diagnosis Related Groups Validation studies (from 1985 and 1977) of 0.78 and 0.73, respectively [26, 27].

    A limitation of our study is that the clinical data collected at the time of cardiac catheterization were not specifically designed to match the ICD-9-CM classification system. Therefore, some of the differences between the clinical and claims data sets are due to differences in definitions of the various clinical states at the time of application. Although the clinical descriptors in the prospective data had explicit definitions, the ICD-9-CM system definitions were less precise for most conditions.

    Also, the clinical data were somewhat disadvantaged, because they were collected at an earlier point relative to the hospitalization than were the claims data. The mean length of hospitalization for patients in this study was 8.6 days (25th to 75th percentile, 3 to 11 days), whereas the mean time to cardiac catheterization was 1.3 days (25th to 75th percentile, 0 to 1 days). The clinical data were collected before cardiac catheterization, whereas the claims data were collected after discharge. Despite the advantage of having a longer period of hospitalization from which to detect a diagnosis, the claims data did not identify a large number of diagnoses present in the clinical data. In addition, in 10 of 60 patients with conditions coded in the claims data but not in the clinical data, the claims data were identifying an outcome or complication, rather than a comorbidity.

    Our study represents a cross-sectional comparison of claims and clinical data because we only examined the first matching record for each patient. In many claims data sets, information may exist from a number of health care encounters for a given patient. By combining claims information from more than one encounter, a longitudinal claims record can be constructed that may better identify clinical conditions. For patients who had multiple hospital encounters, physicians and medical record coders would have additional information to specify illness.
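
    A longitudinal claims record of this kind can be sketched as the union of conditions coded across a patient's encounters. The sketch below is an assumption-laden illustration, not a method used in this study: the field names are hypothetical, and restricting to encounters that began on or before an index date is a design choice added here so that conditions arising later are not counted as comorbidities.

```python
from datetime import date

def conditions_as_of(encounters, index_date):
    """Union of conditions coded in encounters starting on or before index_date."""
    found = set()
    for enc in encounters:
        if enc["admit"] <= index_date:
            found |= set(enc["conditions"])
    return found

# Hypothetical encounters for one patient.
encounters = [
    {"admit": date(1986, 2, 1), "conditions": ["diabetes"]},
    {"admit": date(1987, 9, 15), "conditions": ["old myocardial infarction"]},
    {"admit": date(1988, 5, 3), "conditions": ["angina"]},           # index admission
    {"admit": date(1989, 1, 20), "conditions": ["congestive heart failure"]},
]
history = conditions_as_of(encounters, index_date=date(1988, 5, 3))
```

    Here the index admission alone would record only angina, whereas the combined record also carries diabetes and old myocardial infarction; the 1989 heart failure admission is excluded because it follows the index date.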

    Conclusion

    The failure of insurance claims data to identify more than one half of those patients having prognostically important conditions raises serious concerns about their suitability for use in studies of prognosis and outcome with coronary artery disease, such as claims-based mortality analyses [1-3]. Claims data, such as the Medicare Provider Analysis and Review file, can be used to characterize variations in process of care and outcome without addressing the underlying causes related to illness severity. Our study did not address the adequacy of ICD-9-CM codes for procedures, only for diagnoses. Because large reimbursement incentives exist to code procedures properly, analyses organized by procedure coding would be expected to provide more precise characterization of particular patient groups than those organized by diagnosis codes. Future studies should examine methods to improve the reliability of claims data, such as constructing clear, usable definitions for each condition, tying claims data collection more directly to the process of clinical care, and modifying coding guidelines to support a secondary goal of accurate clinical characterization of patients.

    Presented in part at the 41st Annual Scientific Session of the American College of Cardiology on 13 April 1992.

    References

    1. 1.
    2. 2.
    3. 3.
    4. 4.
    5. 5.
    6. 6.
    7. 7.
    8. 8.
    9. 9.
    10. 10.
    11. 11.
    12. 12.
    13. 13.
    14. 14.
    15. 15.
    16. 16.
    17. 17.
    18. 18.
    19. 19.
    20. 20.
    21. 21.
    22. 22.
    23. 23.
    24. 24.
    25. 25.
    26. 26.
    27. 27.