Linking Medicare and National Survey Data

  1. Lee A. Lillard, PhD; and
  2. Melissa M. Farmer, MS
  1. From the RAND Center for the Study of Aging, Santa Monica, California. Note: This article is one of a series of articles comprising an Annals of Internal Medicine supplement entitled “Measuring Quality, Outcomes, and Cost of Care Using Large Databases: The Sixth Regenstrief Conference.” Acknowledgment: The authors thank Jeannette Rogowski of RAND for work done on linking Medicare data with PSID data. Grant Support: In part by grants P20-AG12815 and PO1-AG08291 from the National Institute on Aging. Requests for Reprints: Lee A. Lillard, PhD, RAND Center for the Study of Aging, 1700 Main Street, PO Box 2138, Santa Monica, CA 90407-2138. Current Author Addresses: Dr. Lillard and Ms. Farmer: RAND Center for the Study of Aging, 1700 Main Street, PO Box 2138, Santa Monica, CA 90407-2138.

    Abstract

    Administrative records from the Medicare Program of the Health Care Financing Administration provide a valuable source of information for research on medical and public policy issues. This administrative database contains information on utilization of covered medical services, diagnoses, episodes of illness, and Medicare-covered costs of health care. Combining such data with information from national surveys on health status, demographics, and socioeconomic attributes substantially expands the scope of potential research questions that can be addressed. This article discusses the benefits and difficulties of linking Medicare administrative data with survey data and provides brief summaries of five national surveys of elderly U.S. citizens. These surveys can be valuable resources for examining the health status and life experiences of the Medicare population.

    A valuable new source of data for interdisciplinary research on aging has become increasingly available: linking representative data from national surveys on individual persons with summary data from administrative records on Medicare claims and the National Death Index (NDI). This article discusses databases that presently can be linked, describes information that is contained in those databases, and offers the advantages and limitations of using cross-linked data.

    Studies of health care in the elderly population have focused on medical histories, current health status, and health care utilization. A single source of information that includes measures in all of these domains has been difficult to find. Surveys can provide substantial information on various measures of health status, including death, perception of health, health behaviors, and functional status. Longitudinal survey research allows changes in health status to be examined over time. Administrative records, such as Medicare administrative files, provide detailed summaries of health care utilization and costs for elderly U.S. citizens. This wealth of information can be used to evaluate Medicare policy development and for clinical and epidemiologic research.

    Researchers from many fields have used either surveys or administrative records alone, thereby limiting their analyses to the particular sets of outcomes and covariates available from those data. As a wider range of data becomes available (for example, when Medicare and survey data are linked), researchers can explore the interrelationships among health status, chronic health conditions, functional limitations, economic and family resources, health insurance status, medical care utilization, and costs. Future interdisciplinary research will be encouraged by the availability of these new data resources.

    The main purpose of this article is to encourage researchers who are interested in studying the health of elderly persons to move beyond using surveys or administrative records alone. By linking Medicare administrative data with survey data, researchers can improve understanding of the health status and utilization rates of the Medicare population. The archive of linked databases has grown substantially over the past few years. In health services research, mortality information is often linked to survey data. The NDI [34], which includes the causes of death, has been an important resource that allows researchers to link interview information with data on deaths. Data links between the NDI, the Longitudinal Study of Aging (LSOA) [1], and the Panel Study of Income Dynamics (PSID) [2] already exist. In addition, the planning strategy for the Asset and Health Dynamics of the Oldest-Old (AHEAD) [3] has included cross-links to NDI data and social security records on earnings and benefits. When the health status of elderly persons is examined, linkage of survey data with Medicare administrative data is valuable. Because most U.S. citizens older than 65 years of age are covered by Medicare, Medicare records provide nearly complete information on actual health care utilization.

    Why Link Medicare Administrative Data with Survey Data?

    Administrative records from the Medicare Program of the Health Care Financing Administration (HCFA) provide a valuable source of information for health services research and public policies related to health. The Medicare database is designed to track utilization and costs of covered services. This database contains detailed data about providers, diagnoses, and services for eligible Medicare beneficiaries from 1984 or the date of the beneficiary's initial eligibility to the present.

    Although Medicare administrative records can be an excellent source of information on hospitals, other providers of health care services, and regional health care patterns, they have limitations when used for research on beneficiaries. First, because not all health care services are covered by Medicare, information on the cost of uncovered services is missing from the records. Uncovered services include dental care, prescription drugs, eyeglasses, and elective or cosmetic procedures. Therefore, Medicare claims records do not provide a complete picture of a beneficiary's utilization of health care services and their costs.

    Second, although Medicare records contain detailed information on Medicare-covered costs, they do not provide information on other costs, including beneficiary out-of-pocket expenses (which are affected by the annual deductible) and costs covered by the beneficiary's insurers. For example, nursing home costs that are not covered by Medicare are not included in administrative records. Medicare covers a limited amount of nursing home care (primarily in skilled nursing facilities) for a limited time and with lifetime limits for each beneficiary. Therefore, information on the important transition from covered recovery care to longer-term health care and whether the transition was even initiated is missing.

    Although the Medicare database contains information on diagnoses, procedures, and some chronic conditions, it does not provide a complete picture of health. Surveys can contribute a more comprehensive view that includes chronic conditions, mental health, physical functions, activities of daily living, and general health status. Numerous health surveys span the entire life of survey participants. Today, attention is being focused on U.S. citizens who are 65 years of age and older. This article highlights some of the national panel surveys that are important for research on the elderly population and that have been or will be linked to Medicare data records.

    Table 1 includes summaries of some national survey databases that have been linked with the Medicare database. The objectives of such national surveys as AHEAD [3], the New Beneficiary Survey (NBS) and New Beneficiary Follow-up (NBF) [4], and LSOA [1] are to examine changes in the health and life experiences of elderly persons. These surveys can provide a broader picture of health in older persons than is possible with only utilization and cost data from Medicare records. Surveys by PSID [2] and AHEAD [3] offer even more detailed information on demographics, the economic wellbeing of respondents, and health behavior measures. One of the objectives of the National Long-Term Care Survey (NLTCS) [5] and of AHEAD is to capture out-of-pocket expenses for health care services. Both the NLTCS and AHEAD also provide extensive information on caregivers and the health care and support available to the elderly population.

    Table 1. Summary of National Survey Data Linked with Medicare Administrative Data

    Whereas Medicare data focus on the beneficiary, national survey data focus on the individual respondent, household, or both. For example, Medicare records are not well suited to matching spouses, but national surveys can often match spouses. After the link has been made between survey and Medicare data, matched spouses in the survey can be identified in Medicare records on health care utilization and costs.

    As with Medicare data, survey data have limitations. It is difficult to uncover rates of health care utilization and especially difficult to assess the true cost of health care services from a household survey. Medicare data enhance the ability of researchers to study various behaviors measured in the surveys by providing measures of chronic conditions, surgeries and other procedures that can affect activities of daily living, the level of medical intervention, and costs. Furthermore, survey data can provide information needed to complete the health status portfolio of beneficiaries, including data on uncovered services, out-of-pocket expenses, supplemental insurance coverage (Medigap or employer provided), and transitions to institutions. Medicare data, in turn, supply the utilization and cost information that surveys cannot capture reliably. Therefore, the link between national surveys, especially panel surveys, and Medicare records is clearly beneficial. The scope of research questions that can be addressed is expanded considerably by such a link. The additional information allows analyses of the interactions and relations among health status, insurance coverage, utilization of health care services, and costs. The value of survey data from households with elderly persons is enhanced when such data are linked with Medicare records. Research questions can now encompass a broader definition of health status; health care utilization and services; and social, economic, and behavioral experiences.

    Because databases that link national survey data with Medicare data have only recently become available, they are not yet widely represented in the literature. However, a few researchers have shown that using these linked databases can provide valuable information in several areas of research on the health of elderly persons. Much of the previous research has focused on the association between health status and rates of health care utilization. Links between LSOA and Medicare data allowed examination of functional transitions; subsequent hospital use and costs [6]; the risk factors involved with hospitalization for pneumonia; and the frequency, length of stay, cost, and mortality rates associated with pneumonia [7]. In addition, these data have been used to explore individual characteristics associated with consistency and volume of hospital utilization [8]. The linkage of LSOA, Medicare, and NDI data suggested that some of the variations in total real hospital charges were explained by sociodemographic characteristics and various measures of health status (including disease history, cause of death, and the proportion of charges incurred during the final year of life) [9]. Research done by using NBS and Medicare records has examined the effectiveness of self-reported health and functional status as indicators of future utilization rates and death of retired workers [10] and explored the probability of death and inpatient care for disabled workers [11].

    Linked data have contributed to research on health service providers, such as nursing homes and home health agencies. Risks for short- and long-term nursing home admissions have been estimated over time [12], and home health agencies were compared in terms of patient health status and rates of health care utilization [13]. The Medicare system itself has been the subject of research. Manton and colleagues [14] used linked data to examine the effects of the introduction of Medicare's Prospective Payment System on the use of health care services, giving special attention to posthospital care. Through linkage of survey data with Medicare records, impaired persons who may have needed health care but did not receive Medicare services were identified. Finally, Lillard and Rogowski [15] used PSID data to find that having private insurance increased respondents' Medicare expenditures for part B services but not for part A services. These studies have shown how linking Medicare and survey data has added to the health care industry's knowledge of the health status of Medicare beneficiaries and health care services within the Medicare system.

    Linking Medicare data with survey data is clearly advantageous, but the process of linking Medicare records to survey respondents can be tedious and imperfect. The HCFA has developed the Claims and Utilization Data system to retrieve Medicare data that can be linked to other data. In this vast network of files, three sets of files are most useful for preparing beneficiary summaries that can be linked to respondents in national surveys (Table 2). Since 1984, data have been compiled by using two different methods. The first is the historical Medicare Automated Data Retrieval System, which contains all claims from 1984 to 1991. Starting in 1991, another system, the National Claims History (NCH) Beneficiary Program Liability (BPL), was used to compile Medicare data. The NCH-BPL system is used with Standard Analytic Files to obtain claims linked by Health Insurance Claim numbers.

    Table 2. Medicare Administrative Claims Data Useful for Linking to Survey Data

    Confidentiality and privacy are important issues for research, especially when administrative records are being used. Although survey and administrative data are linked by using personal identification numbers, once the data are linked, personal identifiers are excluded from the data set used for research. In their place, aggregated measures of utilization and costs for a given calendar year are created. The Medicare-constructed variables differ from one study to the next, but they are similar in nature. Examples of annual aggregate variables that have been created for the PSID survey are shown in Table 3. These measures allow identification of the utilization and costs of health care services while preserving the confidentiality of survey respondents and Medicare beneficiaries.
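    The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (`hic`, `type`, `days`, `paid`), the claim numbers, and the respondent IDs are all hypothetical and do not reproduce HCFA file formats or the variables in Table 3. The sketch shows the general idea of replacing the personal identifier with a study ID and keeping only calendar-year totals.

```python
from collections import defaultdict

# Hypothetical claim records; field names are illustrative, not HCFA's.
claims = [
    {"hic": "A001", "year": 1990, "type": "inpatient", "days": 5, "paid": 4200.0},
    {"hic": "A001", "year": 1990, "type": "outpatient", "days": 0, "paid": 150.0},
    {"hic": "A001", "year": 1991, "type": "inpatient", "days": 2, "paid": 1800.0},
    {"hic": "B002", "year": 1990, "type": "outpatient", "days": 0, "paid": 90.0},
    {"hic": "C003", "year": 1990, "type": "inpatient", "days": 3, "paid": 2500.0},
]

# Hypothetical crosswalk from claim number to survey respondent ID,
# built only for respondents who consented to linkage.
hic_to_respondent = {"A001": 17, "B002": 42}


def annual_aggregates(claims, hic_to_respondent):
    """Return {(respondent_id, year): totals}; claim numbers are not kept."""
    totals = defaultdict(lambda: {
        "inpatient_stays": 0,
        "inpatient_days": 0,
        "outpatient_claims": 0,
        "total_paid": 0.0,
    })
    for claim in claims:
        respondent_id = hic_to_respondent.get(claim["hic"])
        if respondent_id is None:
            continue  # no consent or no match: the claim is not linked
        row = totals[(respondent_id, claim["year"])]
        if claim["type"] == "inpatient":
            row["inpatient_stays"] += 1
            row["inpatient_days"] += claim["days"]
        elif claim["type"] == "outpatient":
            row["outpatient_claims"] += 1
        row["total_paid"] += claim["paid"]
    return dict(totals)


summary = annual_aggregates(claims, hic_to_respondent)
```

    In this sketch the research file is keyed by respondent ID and year only, so the claim number never leaves the linkage step, and claims for persons without a crosswalk entry are dropped rather than linked.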

    Table 3. Annual Aggregate Variables Often Created from Medicare Administrative Data*

    Difficulties in Linking Medicare and Survey Data

    Before Medicare and survey data can be linked, consent must be obtained from survey respondents. About 80% of eligible respondents give permission to link their Medicare files with survey data, but this rate varies with the survey sample. Some samples are drawn from social security and Medicare files (that is, NBS and NBF data and NLTCS data), which greatly facilitates the linkage process. Surveys with other sampling frameworks (for example, LSOA and AHEAD) must be designed to accommodate data linkage. These surveys tend to have fairly high linkage rates. Such surveys as PSID, which seek permission to link retrospectively through a mail-in supplement survey, tend to have lower rates of consent to linkage.

    After consent has been obtained, the investigator must obtain legitimate Medicare numbers from respondents. Several errors can occur when respondents supply Medicare numbers. Numbers can be transcribed incorrectly or miscoded during data entry. In addition, a respondent may have used several different claim numbers over time. Different health insurance claim numbers may exist for a given beneficiary who submitted claims under his or her own social security number (or Railroad Retirement Board beneficiary number), the social security number of a current spouse, or the social security number of one or more former spouses (after the death of a spouse or a divorce following more than 10 years of marriage). Obtaining all claim numbers used by that beneficiary is necessary to determine the full claims history over a specified period. Valid claim numbers can be acquired by cross-referencing claim numbers in Medicare administrative records.
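    One way to picture the cross-referencing step is as a search over pairs of claim numbers known to belong to the same person. The following Python sketch assumes a hypothetical cross-reference list of such pairs; the claim numbers are made up, and the actual Medicare cross-reference files are not described in this article and may be organized differently.

```python
def all_claim_numbers(reported, cross_reference):
    """Collect every claim number connected to the one the respondent reported.

    `cross_reference` is a hypothetical list of (number_a, number_b) pairs,
    each pair meaning both numbers were used by the same beneficiary.
    """
    neighbors = {}
    for a, b in cross_reference:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    found = {reported}
    stack = [reported]
    while stack:
        current = stack.pop()
        for other in neighbors.get(current, ()):
            if other not in found:
                found.add(other)
                stack.append(other)
    return found


# Made-up numbers: own account, current spouse's account, former spouse's account.
cross_reference = [
    ("OWN-1", "SPOUSE-1"),
    ("SPOUSE-1", "EXSPOUSE-1"),
    ("OTHER-8", "OTHER-9"),  # belongs to a different beneficiary
]

numbers = all_claim_numbers("OWN-1", cross_reference)
```

    Following links transitively matters here: a number used under a former spouse's account may be tied to the reported number only through an intermediate one, and missing it would truncate the claims history.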

    Assuming that all valid claim numbers have been obtained, all claims records for the beneficiary must be retrieved from the system and summarized appropriately for the particular research project. For example, a study of utilization and costs would require summarizing files into annual utilization and costs for inpatient and directly related services. Medicare files are enormous, and retrieving all records for a single beneficiary (who may have several claim numbers) is a substantial task.

    A problem that may become more difficult over time is missing information as a result of the increasing number of Medicare beneficiaries who receive at least some health care services in a managed care setting. Detailed data on utilization, diagnoses, and costs are not available because the Medicare program pays providers a negotiated fee per beneficiary rather than the actual cost of claims. At a minimum, the investigator must identify beneficiaries who are under managed care so that their data can be interpreted appropriately.
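    A minimal sketch of the identification step follows, assuming the investigator has some enrollment source that gives the number of months each respondent spent in managed care in a given year. That source, the field names, and the rule used here (any managed care month marks the year's claims as incomplete) are assumptions for illustration, not a procedure taken from the Medicare files.

```python
def flag_managed_care(person_years, managed_care_months):
    """Attach a managed care indicator to each annual summary row.

    `person_years` is a list of dicts with `respondent_id` and `year`;
    `managed_care_months` maps (respondent_id, year) to months enrolled.
    Both structures are hypothetical.
    """
    flagged = []
    for row in person_years:
        key = (row["respondent_id"], row["year"])
        months = managed_care_months.get(key, 0)
        flagged.append({
            **row,
            "managed_care_months": months,
            # Claims may understate utilization in any year with enrollment.
            "claims_complete": months == 0,
        })
    return flagged


person_years = [
    {"respondent_id": 17, "year": 1990, "total_paid": 4350.0},
    {"respondent_id": 17, "year": 1991, "total_paid": 0.0},
    {"respondent_id": 42, "year": 1990, "total_paid": 90.0},
]
managed_care_months = {(17, 1991): 12}

flagged = flag_managed_care(person_years, managed_care_months)
```

    With the flag in place, a year showing zero Medicare payments can be told apart from a year in which the claims were simply not recorded, and the analyst can exclude or model the flagged years separately.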

    Conclusions

    Research objectives often require data on both health status and health care utilization, as well as socioeconomic characteristics. Linking administrative and survey data can provide a rich database for such research. We encourage researchers to take advantage of the information that is rapidly becoming available. We also encourage researchers who are planning surveys to plan for linkage to Medicare and other administrative databases. These linked databases can provide a wealth of information on the health of the elderly population in the United States, including health status, health care utilization, socioeconomic status, environmental and behavioral experiences, and availability of institutional services.

    References

    1. 1.
    2. 2.
    3. 3.
    4. 4.
    5. 5.
    6. 6.
    7. 7.
    8. 8.
    9. 9.
    10. 10.
    11. 11.
    12. 12.
    13. 13.
    14. 14.
    15. 15.