MEPS HC-199: 2017 Medical Conditions

August 2019

Agency for Healthcare Research and Quality
Center for Financing, Access, and Cost Trends
5600 Fishers Lane
Rockville, MD 20857
(301) 427-1406


The MEPS instrument design changed beginning in Spring of 2018, affecting Panel 23 Round 1, Panel 22 Round 3, and Panel 21 Round 5. For the Full-Year 2017 PUFs, the Panel 22 Round 3 and Panel 21 Round 5 data were transformed to the degree possible to conform to the previous design. Data users should be aware of possible impacts on the data and especially trend analysis for these data years due to the design transition.

Table of Contents

A. Data Use Agreement
B. Background
1.0 Household Component
2.0 Medical Provider Component
3.0 Survey Management and Data Collection
C. Technical and Programming Information
1.0 General Information
2.0 Data File Information
2.1 Codebook Structure
2.2 Reserved Codes
2.3 Codebook Format
2.4 Variable Naming
2.5 File Contents
2.5.1 Identifier Variables (DUID-CONDRN)
2.5.2 Medical Condition Variables (AGEDIAG-ICD10CDX)
2.5.2.1 Priority Conditions and Injuries
2.5.2.2 Age Priority Condition Began
2.5.2.3 Follow-up Questions for Injuries and Priority Conditions
2.5.2.4 Sources for Conditions on the MEPS Conditions File
2.5.2.5 Treatment of Data from Rounds Not Occurring in 2017
2.5.2.6 Rounds in Which Conditions Were Reported/Selected (CRND1 – CRND5)
2.5.2.7 Diagnosis and Condition Codes
2.5.2.8 Clinical Classification Codes
2.5.3 Utilization Variables (OBNUM – RXNUM)
3.0 Survey Sample Information
3.1 Overview
3.2 Details on Person Weight Construction
3.2.1 MEPS Panel 21 Weight Development Process
3.2.2 MEPS Panel 22 Weight Development Process
3.2.3 The Final Weight for 2017
3.2.4 Coverage
3.3 Using MEPS Data for Trend Analysis
4.0 Merging/Linking MEPS Data Files
4.1 National Health Interview Survey (NHIS)
4.2 Longitudinal Analysis
References
Appendix 1: Variable-Source Crosswalk
Appendix 2: Condition Code Frequencies
Appendix 3: List of Conditions Asked in Priority Conditions Enumeration Section

A. Data Use Agreement

Individual identifiers have been removed from the micro-data contained in these files. Nevertheless, under sections 308 (d) and 903 (c) of the Public Health Service Act (42 U.S.C. 242m and 42 U.S.C. 299 a-1), data collected by the Agency for Healthcare Research and Quality (AHRQ) and/or the National Center for Health Statistics (NCHS) may not be used for any purpose other than for the purpose for which they were supplied; any effort to determine the identity of any reported cases is prohibited by law.

Therefore, in accordance with the above-referenced Federal Statute, it is understood that:

  1. No one is to use the data in this data set in any way except for statistical reporting and analysis; and

  2. If the identity of any person or establishment should be discovered inadvertently, then (a) no use will be made of this knowledge, (b) the Director, Office of Management, AHRQ will be advised of this incident, (c) the information that would identify any individual or establishment will be safeguarded or destroyed, as requested by AHRQ, and (d) no one else will be informed of the discovered identity; and

  3. No one will attempt to link this data set with individually identifiable records from any data sets other than the Medical Expenditure Panel Survey or the National Health Interview Survey. Furthermore, linkage of the Medical Expenditure Panel Survey and the National Health Interview Survey may not occur outside the AHRQ Data Center, NCHS Research Data Center (RDC) or the U.S. Census RDC network.

By using these data you signify your agreement to comply with the above stated statutorily based requirements with the knowledge that deliberately making a false statement in any matter within the jurisdiction of any department or agency of the Federal Government violates Title 18 part 1 Chapter 47 Section 1001 and is punishable by a fine of up to $10,000 or up to 5 years in prison.

The Agency for Healthcare Research and Quality requests that users cite AHRQ and the Medical Expenditure Panel Survey as the data source in any publications or research based upon these data.

B. Background

1.0 Household Component

The Medical Expenditure Panel Survey (MEPS) provides nationally representative estimates of health care use, expenditures, sources of payment, and health insurance coverage for the U.S. civilian noninstitutionalized population. The MEPS Household Component (HC) also provides estimates of respondents’ health status, demographic and socio-economic characteristics, employment, access to care, and satisfaction with health care. Estimates can be produced for individuals, families, and selected population subgroups. The panel design of the survey, which includes 5 Rounds of interviews covering 2 full calendar years, provides data for examining person level changes in selected variables such as expenditures, health insurance coverage, and health status. Using computer assisted personal interviewing (CAPI) technology, information about each household member is collected, and the survey builds on this information from interview to interview. All data for a sampled household are reported by a single household respondent.

The MEPS-HC was initiated in 1996. Each year a new panel of sample households is selected. Because the data collected are comparable to those from earlier medical expenditure surveys conducted in 1977 and 1987, it is possible to analyze long-term trends. Each annual MEPS-HC sample size is about 15,000 households. Data can be analyzed at either the person or event level. Data must be weighted to produce national estimates.

The set of households selected for each panel of the MEPS HC is a subsample of households participating in the previous year’s National Health Interview Survey (NHIS) conducted by the National Center for Health Statistics. The NHIS sampling frame provides a nationally representative sample of the U.S. civilian noninstitutionalized population and reflects an oversample of Blacks and Hispanics. In 2006, the NHIS implemented a new sample design, which included Asian persons in addition to households with Black and Hispanic persons in the oversampling of minority populations. NHIS introduced a new sample design in 2016 that discontinued oversampling of these minority groups. The linkage of the MEPS to the previous year’s NHIS provides additional data for longitudinal analytic purposes.

2.0 Medical Provider Component

Upon completion of the household CAPI interview, and after obtaining permission from the household survey respondents, a sample of medical providers is contacted by telephone to obtain information that household respondents cannot accurately provide. This part of the MEPS is called the Medical Provider Component (MPC), and it collects information on dates of visits, diagnosis and procedure codes, charges, and payments. The Pharmacy Component (PC), a subcomponent of the MPC, does not collect charges or diagnosis and procedure codes but does collect drug detail information, including National Drug Code (NDC) and medicine name, as well as date filled and sources and amounts of payment. The MPC is not designed to yield national estimates. It is primarily used as an imputation source to supplement or replace household-reported expenditure information.

3.0 Survey Management and Data Collection

MEPS HC and MPC data are collected under the authority of the Public Health Service Act. Data are collected under contract with Westat, Inc. (MEPS HC) and Research Triangle Institute (MEPS MPC). Data sets and summary statistics are edited and published in accordance with the confidentiality provisions of the Public Health Service Act and the Privacy Act. The National Center for Health Statistics (NCHS) provides consultation and technical assistance.

As soon as data collection and editing are completed, the MEPS survey data are released to the public in staged releases of summary reports, micro data files, and tables via the MEPS website. Selected data can be analyzed through MEPSnet, an on-line interactive tool designed to give data users the capability to statistically analyze MEPS data in a menu-driven environment.

Additional information on MEPS is available from the MEPS project manager or the MEPS public use data manager at the Center for Financing, Access, and Cost Trends, Agency for Healthcare Research and Quality, 5600 Fishers Lane, Rockville, MD 20857 (301-427-1406).

C. Technical and Programming Information

1.0 General Information

This documentation describes the data contained in MEPS Public Use Release HC-199, which is one in a series of public use data files to be released from the 2017 Medical Expenditure Panel Survey Household Component (MEPS HC). Released in ASCII (with related SAS, SPSS, and Stata programming statements and data user information) and SAS formats, this public use file provides information on household-reported medical conditions collected on a nationally representative sample of the civilian noninstitutionalized population of the United States for calendar year 2017 MEPS HC. The file contains 25 variables and has a logical record length of 81 with an additional 2-byte carriage return/line feed at the end of each record.

This documentation offers a brief overview of the types and levels of data provided and the content and structure of the files. It contains the following sections:

  • Data File Information
  • Survey Sample Information
  • Merging/Linking MEPS Data Files
  • Appendices
    • Variable-Source Crosswalk
    • Detailed ICD-10-CM Condition Code Frequencies
    • List of Conditions Asked in Priority Conditions Enumeration Section

A codebook of all the variables included in the 2017 Medical Conditions File is provided in an accompanying file.

For more information on the MEPS sample design, see Chowdhury et al. (2019). A copy of the survey instrument used to collect the information on this file is available on the MEPS website.

2.0 Data File Information

This file contains 112,630 records. Each record represents one current medical condition reported for a household survey member who resides in an eligible responding household and who has a positive person or family weight. A condition is defined as current if it is linked to an event or a condition the person reported as experiencing during 2017 (i.e., a condition selected in the Condition Enumeration (CE) section). Starting with Panel 21 Round 5 and Panel 22 Round 3, the CE section is no longer asked. Conditions in the Priority Condition Enumeration (PE) section are asked in the context of “has person ever been told by a doctor or other health care professional that they have (condition)?” except joint pain and chronic bronchitis, which ask only about the last 12 months. Persons with a response of Yes (1) to a priority condition question for whom the condition is not current as defined above will not have a record for that condition in this file.

Records meeting one of the following criteria are included on the file:

In Panel 22:

  • Round 1 and Round 2 records that are linked to a 2017 event or a condition the person is currently experiencing (i.e., a condition selected in the CE section);
  • Round 3 conditions that were linked to a 2017 event.

In Panel 21:

  • Round 1 and Round 2 condition records that are linked to a 2017 event or a condition the person is currently experiencing in 2017 (i.e., a condition selected in the CE section);
  • Round 3 and Round 4 records that are linked to a 2017 event or a condition the person is currently experiencing (i.e., a condition selected in the CE section);
  • Round 5 condition records linked to a 2017 event.
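
As a rough illustration, the inclusion criteria listed above can be written as a small predicate. This is only a sketch: the function name and its arguments are hypothetical, the two flags are assumed to have been derived already (they are not variables on the file), and the additional requirement of a positive person or family weight is not modeled.

```python
# Rounds fielded under the redesigned instrument, in which the CE section
# was no longer asked (Panel 22 Round 3 and Panel 21 Round 5).
REDESIGNED_ROUNDS = {(22, 3), (21, 5)}

# Rounds that can contribute records to the 2017 file, by panel.
VALID_ROUNDS = {21: range(1, 6), 22: range(1, 4)}


def is_included(panel, rnd, linked_to_2017_event, selected_in_ce):
    """Return True if a condition record meets the stated inclusion criteria.

    Hypothetical helper: `linked_to_2017_event` and `selected_in_ce` are
    assumed to be precomputed booleans.
    """
    if panel not in VALID_ROUNDS or rnd not in VALID_ROUNDS[panel]:
        raise ValueError(f"unexpected panel/round: {panel}/{rnd}")
    if linked_to_2017_event:
        return True
    if (panel, rnd) in REDESIGNED_ROUNDS:
        # No CE section in these rounds, so only event linkage qualifies.
        return False
    return bool(selected_in_ce)
```

Under these assumptions, a Panel 22 Round 3 condition with no linked 2017 event is excluded even if it would previously have qualified through the CE section.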

For most variables on the file, the codebook provides both weighted and unweighted frequencies. The exceptions to this are weight variables and variance estimation variables. Only unweighted frequencies of these variables are included in the accompanying codebook file. See the Weights Variables list in Appendix 1, Variable-Source Crosswalk.

Person-level data (e.g., demographic or health insurance characteristics) from the 2017 MEPS full-year consolidated file (HC-201) can be merged to the records in this file using DUPERSID (see Section 4.0 for details). Since each record represents a single condition reported by a household respondent, some household members may have multiple medical conditions and thus will be represented by multiple records on this file. Other household members may have had no reported medical conditions and thus will have no records on this file. Still other household members may have had a reported medical condition that did not meet the criteria above and thus will have no records on this file. Data from this file also can be merged to 2017 MEPS Event Files (HC-197A, and HC-197D through HC-197H) by using the link files provided in HC-197I. (See HC-197I documentation for details.)
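
The person-level merge described above is a many-to-one join on DUPERSID. The sketch below uses plain Python dictionaries and made-up records; the person-level variable names AGE and SEX, and all values shown, are placeholders for illustration rather than actual HC-201 content.

```python
from collections import Counter

# Hypothetical person-level records keyed by DUPERSID (stand-in for HC-201).
persons = {
    "10001101": {"AGE": 52, "SEX": 2},
    "10001102": {"AGE": 17, "SEX": 1},
}

# Hypothetical condition records (stand-in for this file): one row per condition.
conditions = [
    {"DUPERSID": "10001101", "CONDN": 11, "ICD10CDX": "J02"},
    {"DUPERSID": "10001101", "CONDN": 12, "ICD10CDX": "F31"},
]

# Attach person-level variables to every condition record for that person.
merged = [
    {**cond, **persons[cond["DUPERSID"]]}
    for cond in conditions
    if cond["DUPERSID"] in persons
]

# A person may appear several times, or not at all, on the conditions file.
records_per_person = Counter(cond["DUPERSID"] for cond in conditions)
persons_without_conditions = [
    pid for pid in persons if records_per_person[pid] == 0
]
```

Because the join runs from conditions to persons, person-level values are repeated on each of that person's condition records, and persons with no qualifying conditions do not appear in the merged result.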

2.1 Codebook Structure

The codebook and data file list variables in the following order:

  • Unique person identifiers
  • Unique condition identifiers
  • Medical condition variables
  • Utilization variables
  • Weight and variance estimation variables

Note that the person identifier is unique within this data year.

2.2 Reserved Codes

The following reserved code values are used:

Value                 Definition
-1 INAPPLICABLE       Question was not asked due to skip pattern
-7 REFUSED            Question was asked and respondent refused to answer question
-8 DK                 Question was asked and respondent did not know answer
-9 NOT ASCERTAINED    Interviewer did not record the data
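
Analysts typically need to separate reserved codes from substantive values before computing statistics. The following is a minimal sketch for numeric variables; the helper name and the sample values are hypothetical and are not taken from the file.

```python
# Reserved codes from the table above, mapped to their labels.
RESERVED_CODES = {
    -1: "INAPPLICABLE",
    -7: "REFUSED",
    -8: "DK",
    -9: "NOT ASCERTAINED",
}


def split_reserved(value):
    """Return (value, None) for real data, or (None, label) for a reserved code."""
    if value in RESERVED_CODES:
        return None, RESERVED_CODES[value]
    return value, None


# Hypothetical AGEDIAG values, for illustration only.
ages = [34, -1, 85, -8]
valid_ages = [v for v, label in map(split_reserved, ages) if label is None]
```

Keeping the label alongside the missing value makes it possible to tabulate why data are missing rather than discarding that information.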

2.3 Codebook Format

This codebook describes an ASCII data set and provides the following programming identifiers for each variable:

Identifier     Description
Name           Variable name (maximum of 8 characters)
Description    Variable descriptor (maximum 40 characters)
Format         Number of bytes
Type           Type of data: numeric (indicated by NUM) or character (indicated by CHAR)
Start          Beginning column position of variable in record
End            Ending column position of variable in record
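
The Start and End identifiers are what a program needs to read the fixed-width ASCII file. The sketch below shows the general idea, assuming 1-based, inclusive column positions. The layout, including the types and positions shown, is hypothetical; actual values must be taken from the codebook or the supplied programming statements.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    kind: str   # "NUM" or "CHAR", as in the codebook's Type column
    start: int  # 1-based beginning column position
    end: int    # 1-based ending column position (inclusive)


# HYPOTHETICAL layout for illustration; not the real HC-199 positions.
LAYOUT = [
    Field("DUID", "NUM", 1, 5),
    Field("PID", "NUM", 6, 8),
    Field("DUPERSID", "CHAR", 9, 16),
]


def parse_record(line, layout):
    """Slice one fixed-width record into a dict of typed values."""
    record = {}
    for field in layout:
        # Convert 1-based inclusive positions to a Python slice.
        raw = line[field.start - 1:field.end].strip()
        if field.kind == "NUM":
            record[field.name] = float(raw) if "." in raw else int(raw)
        else:
            record[field.name] = raw
    return record


sample_line = "1234510112345101"
parsed = parse_record(sample_line, LAYOUT)
```

Character variables are kept as strings so that leading zeroes in identifiers are not lost.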

2.4 Variable Naming

In general, variable names reflect the content of the variable, with an 8-character limitation. Edited variables end in an “X” and are so noted in the variable label. (CONDIDX, which is an encrypted identifier variable, also ends in an “X”.)

Variables contained in this delivery were derived either from the questionnaire itself or from the CAPI. The source of each variable is identified in Appendix 1, Variable-Source Crosswalk. Sources for each variable are indicated in one of three ways: (1) variables derived from CAPI or assigned in sampling are so indicated; (2) variables collected at one or more specific questions have those numbers and questionnaire sections indicated in the “SOURCE” column; and (3) variables constructed from multiple questions using complex algorithms are labeled “Constructed” in the “SOURCE” column.

2.5 File Contents

2.5.1 Identifier Variables (DUID-CONDRN)

The definitions of Dwelling Units (DUs) in the MEPS HC are generally consistent with the definitions employed for the National Health Interview Survey (NHIS). The dwelling unit ID (DUID) is a 5-digit random number assigned after the case was sampled for MEPS. The person number (PID) uniquely identifies each person within the dwelling unit.

The variable DUPERSID uniquely identifies each person represented on the file and is the combination of the variables DUID and PID.

CONDN is the condition number and uniquely identifies each condition reported for an individual. The range on this file for CONDN is 11-581 and the range of total records for any one person on the file is 1-54.

The variable CONDIDX uniquely identifies each condition (i.e., each record on the file) and is the combination of DUPERSID and CONDN. CONDIDX is always a length of 12 with DUPERSID (8) and CONDN (4) combined. For CONDIDX, the condition number is padded with leading zeroes to ensure consistent length.
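
The identifier construction described above can be sketched as simple string concatenation. The function names are hypothetical, and the 3-digit width for PID is an inference from the stated lengths (a 5-digit DUID within an 8-character DUPERSID), not something the text states directly.

```python
def make_dupersid(duid: int, pid: int) -> str:
    """Combine DUID (5 digits) and PID (assumed 3 digits) into DUPERSID."""
    return f"{duid:05d}{pid:03d}"


def make_condidx(dupersid: str, condn: int) -> str:
    """Append CONDN, padded with leading zeroes to 4 digits, to DUPERSID."""
    return f"{dupersid}{condn:04d}"


def split_condidx(condidx: str):
    """Recover (DUPERSID, CONDN) from a 12-character CONDIDX."""
    return condidx[:8], int(condidx[8:])


# Hypothetical identifiers, for illustration only.
dupersid = make_dupersid(12345, 101)
condidx = make_condidx(dupersid, 11)
```

Treating these identifiers as strings rather than numbers preserves the zero padding that keeps CONDIDX at a fixed length of 12.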

PANEL is a constructed variable used to specify the panel number for the interview in which the condition was reported. PANEL will indicate either Panel 21 or Panel 22.

CONDRN indicates the round in which the condition was first reported. For a small number of cases, conditions that actually began in an earlier round were not reported by respondents until subsequent rounds of data collection. During file construction, editing was performed for these cases in order to reconcile the round in which a condition began and the round in which the condition was first reported.

2.5.2 Medical Condition Variables (AGEDIAG-ICD10CDX)

This file contains variables describing medical conditions reported by respondents in several sections of the MEPS questionnaire, including the Condition Enumeration section, and all questionnaire sections collecting information about health provider visits and/or prescription medications (see Variable-Source Crosswalk in Appendix 1 for details).

2.5.2.1 Priority Conditions and Injuries

Certain conditions were a priori designated as “priority conditions” due to their prevalence, expense, or relevance to policy. Some of these are long-term, life-threatening conditions, such as cancer, diabetes, emphysema, high cholesterol, hypertension, ischemic heart disease, and stroke. Others are chronic manageable conditions, including arthritis and asthma. The only mental health condition on the priority conditions list is attention deficit hyperactivity disorder/attention deficit disorder.

When a condition was first mentioned, respondents were asked whether it was due to an accident or injury (INJURY=1). Only non-priority conditions (i.e., conditions reported in a section other than PE) are eligible to be injuries. The interviewer is prevented from selecting priority conditions as injuries.

2.5.2.2 Age Priority Condition Began

The age of diagnosis (AGEDIAG) was collected for all priority conditions, except joint pain. For confidentiality reasons, AGEDIAG is set to Inapplicable (-1) for cancer conditions.

To ensure confidentiality, age of diagnosis was top-coded to 85. This corresponds with the age top-coding in person-level PUFs.
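
A minimal sketch of this kind of top-coding, assuming ages are whole numbers and that negative values are reserved codes that must pass through unchanged. The helper top_code_age is hypothetical and is not part of any MEPS program.

```python
TOP_CODE = 85  # ages above this value are reported as 85


def top_code_age(age: int) -> int:
    """Cap a reported age at TOP_CODE, leaving reserved codes untouched."""
    if age < 0:
        # Reserved codes such as -1 (Inapplicable) are not ages.
        return age
    return min(age, TOP_CODE)


# Hypothetical ages of diagnosis, for illustration only.
top_coded = [top_code_age(a) for a in [40, 85, 91, -1]]
```

A value of 85 on the file therefore means "85 or older", which matters when computing means or durations from AGEDIAG.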

2.5.2.3 Follow-up Questions for Injuries and Priority Conditions

When a respondent reported that a condition resulted from an accident or injury (INJURY=1), the respondent was asked, during the round in which the injury was first reported, whether the accident/injury occurred at work (ACCDNWRK). This question was not asked about persons aged 15 and younger; for those persons, ACCDNWRK is coded Inapplicable (-1).

2.5.2.4 Sources for Conditions on the MEPS Conditions File

The records on this file correspond with medical condition records collected by CAPI and stored on a person’s MEPS conditions roster. Conditions can be added to the MEPS conditions roster in several ways. A condition can be reported in the Priority Condition Enumeration (PE) section in which persons are asked if they have been diagnosed with specific conditions. The condition can be identified as the reason reported by the household respondent for a particular medical event (hospital stay, outpatient visit, emergency room visit, home health episode, prescribed medication purchase, or medical provider visit). Some condition information is collected in the Medical Provider Component of MEPS. However, since it is not available for everyone in the sample, it is not used to supplement, replace, or verify household-reported condition data.

Finally, prior to Panel 21 Round 5 and Panel 22 Round 3, the condition may be reported by the household-level respondent as a condition “bothering” the person during the reference period (see question CE03). Conditions reported in the PE section that are not current are not included on this file.

2.5.2.5 Treatment of Data from Rounds Not Occurring in 2017

Prior to the 2008 file, priority conditions reported during Rounds 1 and 2 of the second year panel were included on the file even if the conditions were not related to an event or reported as a serious condition occurring in the second year of the panel. Beginning in 2008, priority conditions are included on the file only if they are current conditions. A current condition is defined as a condition linked to an event or a condition the person is currently experiencing (i.e., a condition selected in the Condition Enumeration (CE) section). However, starting in Panel 21 Round 5 and Panel 22 Round 3, a current condition is defined only as a condition linked to a 2017 event. Conditions from Rounds 1 and 2 that are not included in the 2017 file may be available in the 2016 Medical Conditions File if the person had a positive person or family weight in 2016. For 2017, 66 conditions from Panel 21 Rounds 1 and 2 are included on the 2017 Medical Conditions File for persons who did not appear on the previous year’s file.

Note: Priority conditions are generally chronic conditions. Even though a person may not have reported an event in 2017 due to the condition, or reported generally experiencing the condition in 2017, analysts should consider that the person is probably still experiencing the condition. If a Panel 21 person reported a priority condition in Round 1 or 2 and did not have an event for the condition in Round 3, 4, or 5, the condition will not be included on the 2017 Medical Conditions File.

2.5.2.6 Rounds in Which Conditions Were Reported/Selected (CRND1 – CRND5)

A set of constructed variables indicates the round in which the condition was first reported (CONDRN), and the subsequent round(s) in which the condition was selected (CRND1 – CRND5). The condition may be reported or selected when the person reports an event that occurred due to the condition, or the condition was reported in the CE section but is not linked to any events. For example, consider a condition for which CRND1 = 0, CRND2 = 1, and CRND3 = 1. For non-priority conditions, this sequence of indicators on a condition record implies that the condition was not present during Round 1 (CRND1 = 0), was first mentioned during Round 2 (CRND2 = 1, CONDRN = 2), and was selected again during Round 3 (CRND3 = 1). For priority conditions, this sequence of indicators implies that the condition was reported in the PE section in Round 1 (CONDRN = 1) but was not connected with an event in that round (CRND1 = 0), and the condition was not selected in the CE section as a current condition until Rounds 2 and 3 (CRND2 = 1, CRND3 = 1). Because priority conditions are asked in the context of “has person ever been told by a doctor or other health care professional that they have (condition)?” except joint pain and chronic bronchitis, which ask only about the last 12 months, a priority condition might not be selected in the round in which it was first reported.
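
The example above can be traced with a short sketch. It assumes the indicators are coded 0/1 and that CRND4 and CRND5 are 0 for the record in question (the example in the text does not specify them); the function names are hypothetical.

```python
def selected_rounds(crnd):
    """Return the round numbers whose CRND indicator equals 1.

    `crnd` is a list of the five indicators CRND1 through CRND5.
    """
    return [i + 1 for i, flag in enumerate(crnd) if flag == 1]


def summarize(condrn, crnd):
    """Relate the round first reported (CONDRN) to the rounds selected."""
    rounds = selected_rounds(crnd)
    return {
        "first_reported": condrn,
        "selected_in": rounds,
        "selected_when_first_reported": condrn in rounds,
    }


indicators = [0, 1, 1, 0, 0]  # CRND1=0, CRND2=1, CRND3=1; CRND4/5 assumed 0

# Non-priority condition: first mentioned in Round 2.
non_priority = summarize(2, indicators)

# Priority condition: reported in PE in Round 1, selected only in Rounds 2 and 3.
priority = summarize(1, indicators)
```

The same indicator pattern therefore reads differently depending on CONDRN: only for the priority condition does the first-reported round precede the first selected round.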

2.5.2.7 Diagnosis and Condition Codes

The medical conditions and procedures reported by the Household Component respondent were recorded by the interviewer as verbatim text. Beginning FY16, ICD-9-CM codes (ICD9CODX) are no longer used and medical conditions now are coded to ICD-10-CM codes (ICD10CDX). Also beginning in FY16, condition names are no longer coded to procedure codes, and ICD9PROX has been dropped from the file.

Professional coders followed specific guidelines in coding missing values to the ICD-10-CM diagnosis condition variable. ICD10CDX was coded -9 (Not Ascertained) where the verbatim text fell into one of three categories: (1) the text indicated that the condition was unknown (e.g., DK); (2) the text indicated the condition could not be diagnosed by a doctor (e.g., doctor doesn’t know); or (3) the specified condition was not codeable. If the text indicated a procedure and the condition associated with the procedure could be discerned from the text, the condition itself is coded. For example, “cataract surgery” is coded as the condition “other cataract” (ICD10CDX is set to code “H26”). If the condition could not be discerned (e.g. “outpatient surgery”), ICD10CDX is set to -9.

Through FY15, the text strings were coded by professional coders to fully-specified ICD-9-CM codes, including medical condition and V codes (see Health Care Financing Administration, 1980). Condition names were coded to ICD-9-CM diagnosis codes (ICD9CODX), and to ICD-9-CM procedure codes (ICD9PROX) when applicable (the condition name indicated a procedure/surgery). Through FY15, ICD9CODX also was coded -9 if the specified condition was not codeable and a procedure could not be discerned from the text; if the verbatim text strictly denoted a procedure and not a condition, ICD9CODX was coded -1.

In order to preserve confidentiality, all of the conditions provided on this file have been coded to 3-digit diagnosis code categories rather than the fully-specified ICD-10-CM code. For example, the ICD10CDX value of J02 “Acute pharyngitis” includes the fully-specified subclassifications J020 and J029; the value F31 “Bipolar disorder” includes the fully-specified subclassifications F3110 through F319. Table 1 in Appendix 2 provides unweighted and weighted frequencies for all ICD-10-CM condition code values reported on the file. Approximately 2 percent of the ICD-10-CM codes on this file were edited further by collapsing two or more 3-digit codes into one 3-digit code. This includes clinically rare conditions that were recoded to broader codes by clinicians. A condition is determined to be clinically rare if it appears on the National Institutes of Health’s list of rare diseases.
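
The collapse from fully-specified codes to 3-digit categories amounts to keeping the first three characters of the code. The sketch below illustrates only that step, using the examples given above. The function name is hypothetical, and the further clinician-driven collapsing of rare conditions is not reproduced.

```python
def to_category(code: str) -> str:
    """Reduce a fully-specified ICD-10-CM code to its 3-character category."""
    cleaned = code.replace(".", "").strip().upper()
    if len(cleaned) < 3:
        raise ValueError(f"code too short to categorize: {code!r}")
    return cleaned[:3]


# Fully-specified codes mentioned in the text, with and without the decimal point.
full_codes = ["J020", "J029", "F3110", "F319", "J02.9"]
categories = [to_category(c) for c in full_codes]

# Distinct fully-specified codes can share one category.
distinct_categories = sorted(set(categories))
```

Because several fully-specified codes map to one category, two different conditions for the same person can carry the same ICD10CDX value.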

For confidentiality purposes, approximately 5% of ICD-10-CM codes were recoded to -9 (Not Ascertained) for conditions where the frequency was less than 20 for the total unweighted population in the file or less than 200,000 for the weighted population. Additional factors used to determine recoding include age and gender.
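
The two frequency thresholds can be illustrated as follows. This is a simplified sketch, not the production procedure: it ignores the additional age and gender factors, assumes each record carries a single weight, and uses placeholder code labels and weights.

```python
from collections import defaultdict

MIN_UNWEIGHTED = 20      # minimum unweighted frequency
MIN_WEIGHTED = 200_000   # minimum weighted frequency


def suppress_rare_codes(records):
    """Recode rare condition codes to "-9".

    `records` is a list of (code, weight) pairs. A code is treated as rare
    if its unweighted count or its weighted total falls below the threshold.
    """
    counts = defaultdict(int)
    weighted = defaultdict(float)
    for code, weight in records:
        counts[code] += 1
        weighted[code] += weight
    rare = {
        code for code in counts
        if counts[code] < MIN_UNWEIGHTED or weighted[code] < MIN_WEIGHTED
    }
    return ["-9" if code in rare else code for code, _ in records]


# Placeholder data: "AAA" passes both tests, "BBB" has too few records,
# and "CCC" has too small a weighted total.
records = (
    [("AAA", 10_000.0)] * 25
    + [("BBB", 50_000.0)] * 5
    + [("CCC", 1_000.0)] * 30
)
recoded = suppress_rare_codes(records)
```

Either threshold alone is enough to trigger recoding, which is why "BBB" is suppressed despite a weighted total of 250,000.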

In a small number of cases, diagnosis and condition codes were recoded to -9 (Not Ascertained) if they denoted a pregnancy for a person younger than 16 or older than 44. Less than one-tenth of 1 percent of records were recoded in this manner on the 2017 Medical Conditions File. The person’s age was determined by linking the 2017 Medical Conditions File to the 2016 and 2017 Person-Level Use PUFs. If the person’s age is under 16 or over 44 in the round in which the condition was reported, the appropriate condition code was recoded to -9 (Not Ascertained).

Users should note that because of the design of the survey, most deliveries (i.e., births) are coded as pregnancies. For more accurate estimates for deliveries, analysts should use RSNINHOS “Reason Entered Hospital” found on the Hospital Inpatient Stays Public Use File (HC-197D).

Each year, a few conditions on the final file may fall below the confidentiality threshold. This is due to the multistage file development process. The confidentiality recoding is performed on the preliminary version of the Conditions file each year. This preliminary version is used in the development of other event PUFs and, in turn, these event PUFs are used in the development of the final conditions file. During this process, some records from the preliminary file are dropped because only records that are relevant to the current data year are reflected in the final Conditions PUF.

Conditions file data can be merged with the 2017 MEPS Event Files. Because the conditions have been coded to 3-digit diagnosis code categories rather than the fully-specified ICD-10-CM code, it is possible for there to be duplicate ICD-10-CM condition codes linked to a single medical event when different fully-specified conditions are coded to the same 3-digit code. For information on merging data on this file with the 2017 MEPS Event Files (HC-197A, and HC-197D through HC-197H) refer to the link files provided in HC-197I, and see HC-197I documentation for details.

Conditions were reported in sections of the HC questionnaire (see Variable-Source Crosswalk in Appendix 1). Labels for all values of ICD10CDX, as shown in Table 1 of Appendix 2, are provided in the SAS programming statements included in this release (see the H199SU.TXT file).

2.5.2.8 Clinical Classification Codes

The 2016 Medical Conditions public use file (PUF) was the first MEPS public use file to provide ICD-10-CM codes. Because the new condition classification system was adopted only recently, the mapping of ICD-10-CM codes to CCS codes is still under review, and a final mapping is not available at the time of this file release. Users can visit the Healthcare Cost and Utilization Project (HCUP) website for more information.

2.5.3 Utilization Variables (OBNUM – RXNUM)

The variables OBNUM, OPNUM, HHNUM, IPNUM, ERNUM, and RXNUM indicate the total number of 2017 events that can be linked to each condition record on the current file, i.e., office-based, outpatient, home health, inpatient hospital stays, emergency room visits, and prescribed medicines, respectively.

These counts of events were derived from Expenditure Event Public Use Files (HC-197G, HC-197F, HC-197H, HC-197D, HC-197E, and HC-197A). Events associated with conditions include all utilization that occurred between January 1, 2017 and December 31, 2017.

Because persons can be seen for more than one condition per visit, these frequencies will not match the person-level or event-level utilization counts. For example, if a person had one inpatient hospital stay and was treated for a fractured hip, a fractured shoulder, and a concussion, each of these conditions has a unique record in this file and IPNUM=1 for each record. Summing IPNUM across these records would give three inpatient hospital stays, when in fact there was only one stay during which three conditions were treated. These variables are instead useful for condition-specific counts, such as the number of inpatient hospital stays for head injuries or for hip fractures.
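
The inpatient example above can be reproduced with a few made-up records. The condition-to-event pairs stand in for a link file; the event identifier name EVNTIDX and all identifier values are assumptions for illustration.

```python
# Three hypothetical condition records for one person, all treated in one stay.
conditions = [
    {"CONDIDX": "100011010011", "label": "fractured hip", "IPNUM": 1},
    {"CONDIDX": "100011010012", "label": "fractured shoulder", "IPNUM": 1},
    {"CONDIDX": "100011010013", "label": "concussion", "IPNUM": 1},
]

# Hypothetical condition-event links as (CONDIDX, EVNTIDX) pairs.
links = [
    ("100011010011", "100011010201"),
    ("100011010012", "100011010201"),
    ("100011010013", "100011010201"),
]

# Summing IPNUM counts the same stay once per condition.
naive_total = sum(cond["IPNUM"] for cond in conditions)

# Counting distinct events gives the actual number of stays.
distinct_stays = len({event_id for _, event_id in links})
```

Event-level totals should therefore be computed from the event files or from distinct linked events, not by summing the utilization variables on this file.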

3.0 Survey Sample Information

3.1 Overview

There is a single full year person-level weight (PERWT17F) assigned to each record for each key, in-scope person who responded to MEPS for the full period of time that he or she was in-scope during 2017. A key person was either a member of a responding NHIS household at the time of the interview or joined a family associated with such a household after being out-of-scope at the time of the NHIS (the latter circumstance includes newborns as well as those returning from military service, an institution, or residence in a foreign country). A person is in-scope whenever he or she is a member of the civilian noninstitutionalized portion of the U.S. population.

3.2 Details on Person Weight Construction

The person-level weight PERWT17F was developed in several stages. First, person-level weights for Panel 21 and Panel 22 were created separately. The weighting process for each panel included adjustments for nonresponse over time and calibration to independent population totals. The calibration was initially accomplished separately for each panel by raking the corresponding sample weights to Current Population Survey (CPS) population estimates based on six variables. The six variables used in the establishment of the initial person-level control figures were: educational attainment of the reference person (no degree, high school/GED no college, some college, bachelor’s or a higher degree); census region (Northeast, Midwest, South, West); MSA status (MSA, non-MSA); race/ethnicity (Hispanic; Black, non-Hispanic; Asian, non-Hispanic; and other); sex; and age.

A 2017 composite weight was then formed by multiplying each weight from Panel 21 by the factor .500 and each weight from Panel 22 by the factor .500. Using such factors to form composite weights serves to limit the variance of estimates obtained from pooling the two samples. The resulting composite weight was raked to the same set of CPS-based control totals.

Then, when the poverty status information (derived from the MEPS income variables) became available, another raking was undertaken, using dimensions reflecting poverty status in addition to the previously mentioned six variables. Control totals were established using poverty status (five categories: below poverty, from 100 to 125 percent of poverty, from 125 to 200 percent of poverty, from 200 to 400 percent of poverty, at least 400 percent of poverty) as well as the six variables previously used in the weight calibration. Thus, the raking for the final weight reflected poverty status in addition to those six variables.
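
The compositing and raking steps can be illustrated with a toy example. The sketch below applies the .500 factors and then rakes over only two dimensions (sex and region) by iterative proportional fitting. The four sample records, their weights, and the control totals are invented, and the function names are hypothetical; the actual process uses more dimensions and CPS-based controls.

```python
PANEL_FACTORS = {21: 0.5, 22: 0.5}  # composite factors stated in the text


def composite_weights(records, factors):
    """Scale each record's panel-specific weight by its panel's factor."""
    return [rec["weight"] * factors[rec["panel"]] for rec in records]


def rake(records, weights, controls, max_iter=100, tol=1e-10):
    """Adjust weights so weighted totals match each dimension's controls.

    `controls` maps a dimension name to {category: target total}.
    """
    w = list(weights)
    for _ in range(max_iter):
        largest_change = 0.0
        for dim, targets in controls.items():
            # Current weighted total in each category of this dimension.
            totals = {}
            for rec, wi in zip(records, w):
                totals[rec[dim]] = totals.get(rec[dim], 0.0) + wi
            # Scale every record so its category hits the target.
            for i, rec in enumerate(records):
                factor = targets[rec[dim]] / totals[rec[dim]]
                w[i] *= factor
                largest_change = max(largest_change, abs(factor - 1.0))
        if largest_change < tol:
            break
    return w


# Invented sample records and control totals, for illustration only.
sample = [
    {"panel": 21, "sex": "M", "region": "NE", "weight": 1000.0},
    {"panel": 21, "sex": "F", "region": "S", "weight": 1200.0},
    {"panel": 22, "sex": "M", "region": "S", "weight": 900.0},
    {"panel": 22, "sex": "F", "region": "NE", "weight": 1100.0},
]
controls = {
    "sex": {"M": 1000.0, "F": 1100.0},
    "region": {"NE": 1050.0, "S": 1050.0},
}

composite = composite_weights(sample, PANEL_FACTORS)
raked = rake(sample, composite, controls)
```

Raking adjusts one dimension at a time and repeats until all margins agree with their controls, so the control totals for the different dimensions must sum to the same population total.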

3.2.1 MEPS Panel 21 Weight Development Process

The person-level weight for an individual in MEPS Panel 21 was developed using the 2016 full year weight as a “base” weight for each survey participant present in 2016. For key, in-scope members who joined an RU some time in 2017 after being out-of-scope in 2016, the initially assigned person-level weight was the corresponding 2016 family weight. The weighting process included an adjustment for person-level nonresponse over Rounds 4 and 5 as well as raking to population control figures for December 2017 for key, responding persons in-scope on December 31, 2017. These control figures were derived by scaling back the population distribution obtained from the March 2018 CPS to reflect the December 31, 2017 estimated population total (estimated based on Census projections for January 1, 2018). Variables used for person-level raking included: educational attainment of the reference person (no degree, high school/GED no college, some college, bachelor’s or a higher degree); census region (Northeast, Midwest, South, West); MSA status (MSA, non-MSA); race/ethnicity (Hispanic; Black, non-Hispanic; Asian, non-Hispanic; and other); sex; and age. The final weight for key, responding persons who were not in-scope on December 31, 2017 but were in-scope earlier in the year was the person weight after the nonresponse adjustment.

Note that the 2016 full-year weight that was used as the base weight for Panel 21 was derived using the MEPS Round 1 weight and adjusting it further for nonresponse over the remaining data collection rounds in 2016 and raking to the December 2016 population control figures.

3.2.2 MEPS Panel 22 Weight Development Process

The person-level weight for an individual in MEPS Panel 22 was developed using the 2017 MEPS Round 1 person-level weight as a “base” weight. For key, in-scope members who joined an RU after Round 1, the Round 1 family weight served as a “base” weight. The weighting process included an adjustment for nonresponse over the remaining data collection rounds in 2017 as well as raking to the same population control figures for December 2017 used for the MEPS Panel 21 weights for key, responding persons in-scope on December 31, 2017. The same six variables employed for Panel 21 raking (educational attainment of the reference person, census region, MSA status, race/ethnicity, sex, and age) were used for Panel 22 raking. Again, the final weight for key, responding persons who were not in-scope on December 31, 2017 but were in-scope earlier in the year was the person weight after the nonresponse adjustment.

Note that the MEPS Round 1 weights for Panel 22 incorporated the following components: the original household probability of selection for the NHIS and for the NHIS subsample reserved for MEPS and adjustment for NHIS nonresponse, the probability of selection for MEPS from NHIS, an adjustment for nonresponse at the dwelling unit level for Round 1, and poststratification to U.S. civilian noninstitutionalized population estimates at the family and person level obtained from the corresponding March CPS databases.
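
The nonresponse adjustments mentioned in Sections 3.2.1 and 3.2.2 are not specified in detail here. As a rough illustration only, the Python sketch below assumes a weighting-class adjustment, a common approach in which respondents' base weights are inflated so that they also represent nonrespondents in the same adjustment cell. The cells, field names, and weights are hypothetical.

```python
from collections import defaultdict

def adjust_for_nonresponse(persons):
    """Return respondents only, with weights inflated to cover nonrespondents
    in the same adjustment cell (weighting-class adjustment, assumed here)."""
    eligible = defaultdict(float)    # sum of base weights, all eligible persons
    responding = defaultdict(float)  # sum of base weights, respondents only
    for p in persons:
        eligible[p["cell"]] += p["base_weight"]
        if p["responded"]:
            responding[p["cell"]] += p["base_weight"]
    adjusted = []
    for p in persons:
        if p["responded"]:
            factor = eligible[p["cell"]] / responding[p["cell"]]
            adjusted.append({**p, "adj_weight": p["base_weight"] * factor})
    return adjusted

# Hypothetical records: one nonrespondent (B) in the "age<65" cell.
persons = [
    {"id": "A", "cell": "age<65", "base_weight": 1000.0, "responded": True},
    {"id": "B", "cell": "age<65", "base_weight": 1500.0, "responded": False},
    {"id": "C", "cell": "age<65", "base_weight": 500.0, "responded": True},
    {"id": "D", "cell": "age65+", "base_weight": 800.0, "responded": True},
]
adjusted = adjust_for_nonresponse(persons)
```

In this toy case the adjustment factor for the "age<65" cell is 2, so the respondents' adjusted weights sum to the same total as the base weights of all eligible persons.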

3.2.3 The Final Weight for 2017

The final raking of those in-scope at the end of the year has been described above. In addition, the composite weights of two groups of persons who were out-of-scope on December 31, 2017 were poststratified. Specifically, the weights of those who were in-scope some time during the year, out-of-scope on December 31, and entered a nursing home during the year were adjusted to compensate for expected undercoverage for this subpopulation. The weights of persons who died while in-scope during 2017 were poststratified to corresponding estimates derived using data obtained from the Medicare Current Beneficiary Survey (MCBS) and Vital Statistics information provided by the National Center for Health Statistics (NCHS). Separate decedent control totals were developed for the “65 and older” and “under 65” civilian noninstitutionalized populations.

Overall, the weighted population estimate for the civilian noninstitutionalized population for December 31, 2017 is 321,529,965 (PERWT17F>0 and INSC1231=1). The sum of the person-level weights across all persons assigned a positive person-level weight is 324,779,909.
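
The two totals quoted above correspond to two different filters on the person-level weight. The Python sketch below shows the distinction on a toy person-level extract. It assumes the extract carries INSC1231 (a variable not listed among the Conditions File variables in Appendix 1), and the records and weights are invented, so the results do not reproduce the published figures.

```python
def weighted_population(persons, in_scope_dec31_only=False):
    """Sum PERWT17F over persons with a positive weight, optionally restricted
    to those in-scope on December 31, 2017 (INSC1231 = 1)."""
    total = 0.0
    for p in persons:
        if p["PERWT17F"] <= 0:
            continue
        if in_scope_dec31_only and p["INSC1231"] != 1:
            continue
        total += p["PERWT17F"]
    return total

# Hypothetical person-level records (one row per person, not per condition).
persons = [
    {"PERWT17F": 12000.0, "INSC1231": 1},
    {"PERWT17F": 8000.0, "INSC1231": 2},   # not in-scope on December 31 (any value other than 1)
    {"PERWT17F": 0.0, "INSC1231": 1},      # zero weight: excluded from both totals
    {"PERWT17F": 15000.0, "INSC1231": 1},
]
dec31_population = weighted_population(persons, in_scope_dec31_only=True)
ever_in_scope_population = weighted_population(persons)
```

Because the Conditions File holds one record per condition, summing PERWT17F directly over condition records would count a person once for every condition reported; totals of this kind should be computed from one record per person.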

3.2.4 Coverage

The target population for MEPS in this file is the 2017 U.S. civilian noninstitutionalized population. However, the MEPS sampled households are a subsample of the NHIS households interviewed in 2015 (Panel 21) and 2016 (Panel 22). New households created after the NHIS interviews for the respective panels and consisting exclusively of persons who entered the target population after 2015 (Panel 21) or after 2016 (Panel 22) are not covered by MEPS. Neither are previously out-of-scope persons who join an existing household but are unrelated to the current household residents. Persons not covered by a given MEPS panel thus include some members of the following groups: immigrants, persons leaving the military, U.S. citizens returning from residence in another country, and persons leaving institutions. The set of uncovered persons constitutes only a small segment of the MEPS target population.

3.3 Using MEPS Data for Trend Analysis

MEPS began in 1996, and the utility of the survey for analyzing health care trends expands with each additional year of data; however, there are a variety of methodological and statistical considerations when examining trends over time using MEPS. Examining changes over longer periods of time can provide a more complete picture of underlying trends. In particular, large shifts in survey estimates over short periods of time (e.g. from one year to the next) that are statistically significant should be interpreted with caution unless they are attributable to known factors such as changes in public policy, economic conditions, or survey methodology.

In 2013, MEPS survey operations introduced an effort to obtain more complete information about health care utilization from MEPS respondents, with full implementation in 2014. This effort resulted in improved data quality and a reduction in underreporting in the second half of 2013 and throughout 2014. Respondents tended to report more visits by sample members, especially non-physician visits, and the new approach appeared particularly effective among subgroups with relatively large numbers of visits, such as the elderly, Medicare beneficiaries, and people with multiple chronic conditions, disabilities, or poor health. Reported spending on visits also tended to increase, especially for such subgroups.

Changes to the MEPS survey instrument should also be considered when analyzing trends. Thus, the note on the title page of this document is repeated here:

The MEPS instrument design changed beginning in Spring of 2018, affecting Panel 23 Round 1, Panel 22 Round 3, and Panel 21 Round 5. For the Full-Year 2017 PUFs, the Panel 22 Round 3 and Panel 21 Round 5 data were transformed to the degree possible to conform to the previous design. Data users should be aware of possible impacts on the data and especially trend analysis for these data years due to the design transition.

As always, it is recommended that data users review relevant sections of the documentation for descriptions of these types of changes before undertaking trend analyses.

Analysts may also wish to consider using statistical techniques to smooth or stabilize analyses of trends using MEPS data, such as comparing pooled time periods (e.g., 1996-97 versus 2011-12), working with moving averages, or using modeling techniques with several consecutive years of MEPS data to test the fit of specified patterns over time.
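
As a simple illustration of two of these smoothing approaches, the Python sketch below averages annual point estimates over pooled periods and over a moving window. The annual values are invented, and the function names are hypothetical. A full analysis would work from pooled microdata with appropriate weights and variance estimation rather than from point estimates alone.

```python
def pooled_mean(series, years):
    """Simple average of the annual estimates for the listed years."""
    return sum(series[y] for y in years) / len(years)

def moving_average(series, window):
    """Average each run of `window` consecutive annual estimates,
    keyed by the last year in the run (assumes no gaps between years)."""
    years = sorted(series)
    out = {}
    for i in range(window - 1, len(years)):
        span = years[i - window + 1 : i + 1]
        out[span[-1]] = sum(series[y] for y in span) / window
    return out

# Hypothetical annual estimates (e.g., percent of persons reporting a condition).
estimates = {2013: 10.0, 2014: 12.0, 2015: 11.0, 2016: 13.0, 2017: 12.0}

early = pooled_mean(estimates, [2013, 2014])   # pooled 2013-14
late = pooled_mean(estimates, [2016, 2017])    # pooled 2016-17
smoothed = moving_average(estimates, window=3)
```

The year-to-year swings in the toy series (up 2, down 1, up 2, down 1) are dampened in both the pooled comparison and the three-year moving average.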

Finally, statistical significance tests should be conducted to assess the likelihood that observed trends are attributable to sampling variation rather than to real change. In addition, researchers should be aware of the impact of multiple comparisons on Type I error. Without appropriate allowance for multiple comparisons, undertaking numerous statistical significance tests of trends increases the likelihood of concluding that a change has taken place when one has not.
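
This document does not prescribe a particular multiple-comparison method. As one possible allowance, the Python sketch below applies the Holm step-down procedure to a set of p-values from separate trend tests. The p-values are invented and the function name is hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure. Returns a list of booleans, in the original
    order of p_values, indicating which null hypotheses are rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject

# Hypothetical p-values from four separate tests of year-to-year change.
p_values = [0.001, 0.04, 0.03, 0.20]
unadjusted = [p <= 0.05 for p in p_values]
adjusted = holm_reject(p_values)
```

In this toy case three of the four tests fall below 0.05 when taken one at a time, but only the first remains significant once the number of comparisons is taken into account.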

4.0 Merging/Linking MEPS Data Files

Data from the current file can be used alone or in conjunction with other files. Merging characteristics of interest from person-level files expands the scope of potential estimates. See HC-197I for instructions on merging the Conditions File to the Medical Event Files. Person-level characteristics can be merged to this Conditions File using the following procedure:

  1. Sort the person-level file by person identifier, DUPERSID. Keep only DUPERSID and the variables to be merged onto the Conditions File.

  2. Sort the Conditions File by person identifier, DUPERSID.

  3. Merge both files by DUPERSID, and output all records in the Conditions File.

If PERS contains the person-level variables and COND is the Conditions File, the following SAS code adds the person-level variables to each person’s records in the Conditions File:

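/* Keep only the merge key and the person-level variables to be added. */
PROC SORT DATA=PERS(KEEP=DUPERSID AGE SEX EDUCYR HIDEG)
          OUT=PERSX;
  BY DUPERSID;
RUN;

/* Sort the Conditions File by the same key. */
PROC SORT DATA=COND;
  BY DUPERSID;
RUN;

/* Output every condition record (IF A); persons with no conditions are dropped. */
DATA COND;
  MERGE COND(IN=A) PERSX(IN=B);
  BY DUPERSID;
  IF A;
RUN;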
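
For users working outside SAS, the same merge logic can be sketched in Python using only the standard library. This is an illustrative equivalent, not an official MEPS program: the function name `add_person_vars` and the toy records and identifiers are hypothetical. A dictionary lookup replaces the sort-and-merge steps, so no sorting is needed.

```python
def add_person_vars(cond_rows, pers_rows, keep):
    """Attach the person-level variables in `keep` to every condition record,
    matching on DUPERSID. All condition records are output; condition records
    with no matching person receive None for the added variables."""
    # Index the person-level file by DUPERSID, keeping only the requested variables.
    pers_by_id = {p["DUPERSID"]: {k: p[k] for k in keep} for p in pers_rows}
    missing = {k: None for k in keep}
    merged = []
    for c in cond_rows:
        merged.append({**c, **pers_by_id.get(c["DUPERSID"], missing)})
    return merged

# Hypothetical toy files (identifiers shortened for readability).
cond = [
    {"DUPERSID": "P1", "CONDIDX": "P1-001"},
    {"DUPERSID": "P1", "CONDIDX": "P1-002"},
    {"DUPERSID": "P3", "CONDIDX": "P3-001"},
]
pers = [
    {"DUPERSID": "P1", "AGE": 45, "SEX": 2, "REGION": 3},
    {"DUPERSID": "P2", "AGE": 30, "SEX": 1, "REGION": 1},
]
merged = add_person_vars(cond, pers, keep=["AGE", "SEX"])
```

As with the `IF A` statement in the SAS code, the person with no condition records (P2) does not appear in the output, and both condition records for P1 carry the same person-level values.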

4.1 National Health Interview Survey (NHIS)

Data from this file can be used alone or in conjunction with other files for different analytic purposes. Each MEPS panel can also be linked back to the previous years’ National Health Interview Survey public use data files. For information on MEPS/NHIS link files please see the AHRQ website.

4.2 Longitudinal Analysis

Panel-specific longitudinal files are available for downloading in the data section of the MEPS website. For each panel, the longitudinal file comprises MEPS survey data obtained in Rounds 1 through 5 of the panel and can be used to analyze changes over a two-year period. Variables in the file pertaining to survey administration, demographics, employment, health status, disability days, quality of care, patient satisfaction, health insurance, and medical care use and expenditures were obtained from the MEPS full-year Consolidated files from the two years covered by that panel.

For more details or to download the data files, please see Longitudinal Data Files at the AHRQ website.

References

Chowdhury, S.R., Machlin, S.R., Gwet, K.L. Sample Designs of the Medical Expenditure Panel Survey Household Component, 1996–2006 and 2007–2016. Methodology Report #33. January 2019. Agency for Healthcare Research and Quality, Rockville, MD.

Cox, B. and Iachan, R. (1987). A Comparison of Household and Provider Reports of Medical Conditions. Journal of the American Statistical Association 82(400): 1013-18.

Edwards, W. S., Winn, D. M., Kurlantzick, V., et al. Evaluation of National Health Interview Survey Diagnostic Reporting. National Center for Health Statistics, Vital Health Stat 2(120). 1994.

Health Care Financing Administration (1980). International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM). Vol. 1. (Department of Health and Human Services Pub. No (PHS) 80-1260). Department of Health and Human Services: U.S. Public Health Services.

Johnson, Ayah E., and Sanchez, Maria Elena. (1993), “Household and Medical Reports on Medical Conditions: National Medical Expenditure Survey.” Journal of Economic and Social Measurement, 19, 199-223.

Appendix 1: Variable-Source Crosswalk

FOR MEPS HC-199: 2017 MEDICAL CONDITIONS

UNIQUE IDENTIFIER VARIABLES

Variable | Label | Source [1]
DUID | Dwelling Unit ID | Assigned In Sampling
PID | Person Number | Assigned In Sampling
DUPERSID | Person ID (DUID + PID) | Assigned In Sampling
CONDN | Condition Number | CAPI Derived
CONDIDX | Condition ID | CAPI Derived
PANEL | Panel Number | Constructed
CONDRN | Condition Round Number | CAPI Derived

MEDICAL CONDITION VARIABLES

Variable | Label | Source [1]
AGEDIAG | Age When Diagnosed | PE section
CRND1 | Has Condition Information In Round 1 | Constructed
CRND2 | Has Condition Information In Round 2 | Constructed
CRND3 | Has Condition Information In Round 3 | Constructed
CRND4 | Has Condition Information In Round 4 | Constructed
CRND5 | Has Condition Information In Round 5 | Constructed
INJURY | Was Condition Due To Accident/Injury | CN01A
ACCDNWRK | Did Accident Occur At Work | CN07
ICD10CDX | ICD-10-CM Code For Condition - Edited | CE05, HS04, ER04, OP09, MV09, HH05, PM09 (Edited)

UTILIZATION VARIABLES

Variable | Label | Source [1]
HHNUM | # Home Health Events Assoc. w/ Condition | Constructed
IPNUM | # Inpatient Events Assoc. w/ Condition | Constructed
OPNUM | # Outpatient Events Assoc. w/ Condition | Constructed
OBNUM | # Office-Based Events Assoc. w/ Condition | Constructed
ERNUM | # ER Events Assoc. w/ Condition | Constructed
RXNUM | # Prescribed Medicines Assoc. w/ Cond. | Constructed

WEIGHTS AND VARIANCE ESTIMATION VARIABLES

Variable | Label | Source [1]
PERWT17F | Expenditure File Person Weight, 2017 | Constructed
VARSTR | Variance Estimation Stratum, 2017 | Constructed
VARPSU | Variance Estimation PSU, 2017 | Constructed

[1] See the Household Component section under Survey Questionnaires on the MEPS home page for information on the MEPS HC questionnaire sections shown in the Source column (e.g., CN, PE).

Appendix 2: Condition Code Frequencies

Appendix 3

LIST OF CONDITIONS ASKED IN PRIORITY CONDITIONS ENUMERATION SECTION

  • Angina/Angina Pectoris
  • Arthritis
  • Asthma
  • Attention Deficit Hyperactivity Disorder (ADHD)/Attention Deficit Disorder (ADD)
  • Cancer/Malignancy
  • Chronic Bronchitis
  • Coronary Heart Disease
  • Diabetes/Sugar Diabetes
  • Emphysema
  • Heart Attack/Myocardial Infarction (MI)
  • High Cholesterol
  • Hypertension/High Blood Pressure
  • Joint Pain
  • Other Heart Disease (not coronary heart disease, angina, or heart attack)
  • Stroke/Transient Ischemic Attack (TIA)/Mini-stroke
