MEPS HC-154: 2012 Medical Conditions

September 2014

Agency for Healthcare Research and Quality
Center for Financing, Access, and Cost Trends
540 Gaither Road
Rockville, MD 20850
(301) 427-1406


Table of Contents

A. Data Use Agreement
B. Background
1.0 Household Component
2.0 Medical Provider Component
3.0 Survey Management and Data Collection
C. Technical and Programming Information
1.0 General Information
2.0 Data File Information
2.1 Codebook Structure
2.2 Reserved Codes
2.3 Codebook Format
2.4 Variable Naming
2.5 File Contents
2.5.1 Identifier Variables (DUID-CONDRN)
2.5.2 Medical Condition Variables (AGEDIAG-CCCODEX)
2.5.2.1 Priority Conditions and Injuries
2.5.2.2 Age Priority Condition Began/Date Accident Occurred
2.5.2.3 Follow-up Questions for Injuries and Priority Conditions
2.5.2.4 Sources for Conditions on the MEPS Conditions File
2.5.2.5 Treatment of Data from Rounds Not Occurring in 2012
2.5.2.6 Rounds in Which Conditions Were Reported/Selected (CRND1-CRND5)
2.5.2.7 Disability Flag Variables
2.5.2.8 Diagnosis, Condition, and Procedure Codes
2.5.2.9 Clinical Classification Codes
2.5.3 Utilization Variables (OBNUM-RXNUM)
3.0 Survey Sample Information
3.1 Overview
3.2 Details on Person Weight Construction
3.2.1 MEPS Panel 16 Weight Development Process
3.2.2 MEPS Panel 17 Weight Development Process
3.2.3 The Final Weight for 2012
3.2.4 Coverage
3.3 Using MEPS Data for Trend Analysis
4.0 Merging/Linking MEPS Data Files
4.1 National Health Interview Survey (NHIS)
4.2 Longitudinal Analysis
References
Appendix 1: Variable-Source Crosswalk
Appendix 2: Condition, Procedure, and Clinical Classification Code Frequencies
Appendix 3: Clinical Classification Code to ICD-9-CM Code Crosswalk
Appendix 4: List of Invalid ICD-9-CM Codes
Appendix 5: List of Conditions Asked in Priority Conditions Enumeration Section


A. Data Use Agreement

Individual identifiers have been removed from the micro-data contained in these files. Nevertheless, under sections 308 (d) and 903 (c) of the Public Health Service Act (42 U.S.C. 242m and 42 U.S.C. 299 a-1), data collected by the Agency for Healthcare Research and Quality (AHRQ) and/or the National Center for Health Statistics (NCHS) may not be used for any purpose other than for the purpose for which they were supplied; any effort to determine the identity of any reported cases is prohibited by law.

Therefore, in accordance with the above-referenced Federal statute, it is understood that:

  1. No one is to use the data in this data set in any way except for statistical reporting and analysis; and

  2. If the identity of any person or establishment should be discovered inadvertently, then (a) no use will be made of this knowledge, (b) the Director, Office of Management, AHRQ will be advised of this incident, (c) the information that would identify any individual or establishment will be safeguarded or destroyed, as requested by AHRQ, and (d) no one else will be informed of the discovered identity; and

  3. No one will attempt to link this data set with individually identifiable records from any data sets other than the Medical Expenditure Panel Survey or the National Health Interview Survey.

By using these data, you signify your agreement to comply with the above-stated statutorily based requirements, with the knowledge that deliberately making a false statement in any matter within the jurisdiction of any department or agency of the Federal Government violates Title 18, Part 1, Chapter 47, Section 1001 and is punishable by a fine of up to $10,000 or up to 5 years in prison.

The Agency for Healthcare Research and Quality requests that users cite AHRQ and the Medical Expenditure Panel Survey as the data source in any publications or research based upon these data.

B. Background

1.0 Household Component

The Medical Expenditure Panel Survey (MEPS) provides nationally representative estimates of health care use, expenditures, sources of payment, and health insurance coverage for the U.S. civilian noninstitutionalized population. The MEPS Household Component (HC) also provides estimates of respondents’ health status, demographic and socio-economic characteristics, employment, access to care, and satisfaction with health care. Estimates can be produced for individuals, families, and selected population subgroups. The panel design of the survey, which includes 5 Rounds of interviews covering 2 full calendar years, provides data for examining person level changes in selected variables such as expenditures, health insurance coverage, and health status. Using computer assisted personal interviewing (CAPI) technology, information about each household member is collected, and the survey builds on this information from interview to interview. All data for a sampled household are reported by a single household respondent.

The MEPS-HC was initiated in 1996. Each year a new panel of sample households is selected. Because the data collected are comparable to those from earlier medical expenditure surveys conducted in 1977 and 1987, it is possible to analyze long-term trends. Each annual MEPS-HC sample size is about 15,000 households. Data can be analyzed at either the person or event level. Data must be weighted to produce national estimates.

The set of households selected for each panel of the MEPS HC is a subsample of households participating in the previous year’s National Health Interview Survey (NHIS) conducted by the National Center for Health Statistics. The NHIS sampling frame provides a nationally representative sample of the U.S. civilian noninstitutionalized population and reflects an oversample of Blacks and Hispanics. In 2006, the NHIS implemented a new sample design, which included Asian persons in addition to households with Black and Hispanic persons in the oversampling of minority populations. MEPS further oversamples additional policy relevant sub-groups such as low income households. The linkage of the MEPS to the previous year’s NHIS provides additional data for longitudinal analytic purposes.

2.0 Medical Provider Component

Upon completion of the household CAPI interview and obtaining permission from the household survey respondents, a sample of medical providers is contacted by telephone to obtain information that household respondents cannot accurately provide. This part of the MEPS is called the Medical Provider Component (MPC), and information is collected on dates of visit, diagnosis and procedure codes, charges, and payments. The Pharmacy Component (PC), a subcomponent of the MPC, does not collect charges or diagnosis and procedure codes but does collect drug detail information, including National Drug Code (NDC) and medicine name, as well as date filled and sources and amounts of payment. The MPC is not designed to yield national estimates. It is primarily used as an imputation source to supplement/replace household reported expenditure information.

3.0 Survey Management and Data Collection

MEPS HC and MPC data are collected under the authority of the Public Health Service Act. Data are collected under contract with Westat, Inc. (MEPS HC) and Research Triangle Institute (MEPS MPC). Data sets and summary statistics are edited and published in accordance with the confidentiality provisions of the Public Health Service Act and the Privacy Act. The National Center for Health Statistics (NCHS) provides consultation and technical assistance.

As soon as data collection and editing are completed, the MEPS survey data are released to the public in staged releases of summary reports, micro data files, and tables via the MEPS Web site: meps.ahrq.gov. Selected data can be analyzed through MEPSnet, an on-line interactive tool designed to give data users the capability to statistically analyze MEPS data in a menu-driven environment.

Additional information on MEPS is available from the MEPS project manager or the MEPS public use data manager at the Center for Financing, Access, and Cost Trends, Agency for Healthcare Research and Quality, 540 Gaither Road, Rockville, MD 20850 (301-427-1406).

C. Technical and Programming Information

1.0 General Information

This documentation describes the data contained in MEPS Public Use Release HC-154, which is one in a series of public use data files to be released from the 2012 Medical Expenditure Panel Survey Household Component (MEPS HC). Released in ASCII (with related SAS, SPSS, and Stata programming statements and data user information) and SAS formats, this public use file provides information on household-reported medical conditions collected in the MEPS HC from a nationally representative sample of the civilian noninstitutionalized population of the United States for calendar year 2012. The file contains 35 variables and has a logical record length of 102 with an additional 2-byte carriage return/line feed at the end of each record.

This documentation offers a brief overview of the types and levels of data provided and the content and structure of the files. It contains the following sections:

  • Data File Information
  • Survey Sample Information
  • Merging/Linking MEPS Data Files
  • Appendices:
    • Variable-Source Crosswalk
    • Detailed ICD-9-CM Condition, Procedure, and Clinical Classification Code Frequencies
    • Clinical Classification Code to ICD-9-CM Code Crosswalk
    • List of Invalid ICD-9-CM Codes
    • List of Conditions Asked in Priority Conditions Enumeration Section

A codebook of all the variables included in the 2012 Medical Conditions File is provided in an accompanying file.

For more information on MEPS survey design, see T. Ezzati-Rice, et al., 1998-2007 and S. Cohen, 1996. A copy of the survey instrument used to collect the information on this file is available on the MEPS Web site: meps.ahrq.gov.

2.0 Data File Information

This file contains 118,850 records. Each record represents one medical condition reported for a household survey member who resides in an eligible responding household and who has a positive person or family weight.
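As a quick integrity check after download, the stated record count and record length imply an expected size for the ASCII file. The sketch below is only arithmetic on the figures given in this documentation; it assumes that every record, including the last, ends with the 2-byte carriage return/line feed, and the function name is illustrative rather than part of any MEPS tool.

```python
# Figures stated in this documentation for HC-154.
RECORD_COUNT = 118_850        # records on the file
LOGICAL_RECORD_LENGTH = 102   # data bytes per record
LINE_TERMINATOR_BYTES = 2     # carriage return + line feed


def expected_ascii_size(records: int = RECORD_COUNT) -> int:
    """Return the expected ASCII file size in bytes.

    Assumes every record (including the last) carries the 2-byte terminator.
    """
    return records * (LOGICAL_RECORD_LENGTH + LINE_TERMINATOR_BYTES)


if __name__ == "__main__":
    print(expected_ascii_size())
```

Comparing this figure with the size of the downloaded file on disk can flag a truncated transfer or a line-ending conversion before any parsing is attempted.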

Conditions created in the Priority Condition Enumeration (PE) section were asked in the context of “has person ever been told by a doctor or other health care professional that they have (condition)?” The exceptions are joint pain and chronic bronchitis, which are asked about only for the last 12 months. If the response is Yes (1), a condition record is generated, but the record is included in this file only if the condition is current. A condition is defined as current if it is linked to an event or disability day or if it is a condition the person is currently experiencing (i.e., a condition selected in the Condition Enumeration (CE) section).

Records meeting one of the following criteria are included on the file:

In Panel 17:

  • Round 1 and Round 2 records that are current conditions. A current condition is defined as a condition linked to a 2012 event or disability day, or a condition the person is currently experiencing (i.e., a condition selected in the CE section);
  • Round 3 conditions that were linked to a 2012 event;
  • Round 3 conditions that were due to an accident or injury and began before 2013;
  • Round 3 priority condition records that are current and either the age of diagnosis is less than or equal to the person’s age as of 12/31/2012 or the age of diagnosis is refused, don’t know, or not ascertained; or
  • Round 3 conditions where 50 percent or more of person’s reference period occurred in 2012.

In Panel 16:

  • Round 3, Round 4, and Round 5 records that are current conditions. A current condition is defined as a condition linked to a 2012 event or disability day or a condition the person is currently experiencing (i.e., a condition selected in the CE section); or
  • Round 1 and Round 2 condition records that are linked to a 2012 event or disability day, or a condition the person is currently experiencing in 2012 (i.e., a condition selected in the CE section).
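The Panel 17 criteria listed above can be read as a single decision rule. The following is a minimal sketch of that reading, not the file construction program itself. Every field name in the `Cond` class is hypothetical (these working flags are not variables on the public use file), and the fraction of the reference period falling in 2012 is assumed to be available as a number between 0 and 1.

```python
from dataclasses import dataclass
from typing import Optional

# Reserved codes for refused, don't know, and not ascertained.
MISSING_CODES = {-7, -8, -9}


@dataclass
class Cond:
    """Hypothetical working record; none of these names are PUF variables."""
    round_reported: int
    linked_2012_event: bool = False
    linked_2012_disability_day: bool = False
    selected_in_ce: bool = False
    is_injury: bool = False
    year_began: Optional[int] = None
    is_priority: bool = False
    age_diag: Optional[int] = None
    age_end_2012: Optional[int] = None
    share_of_ref_period_in_2012: float = 0.0


def is_current(c: Cond) -> bool:
    """Current: linked to an event or disability day, or selected in CE."""
    return c.linked_2012_event or c.linked_2012_disability_day or c.selected_in_ce


def panel17_included(c: Cond) -> bool:
    """Apply the Panel 17 inclusion criteria as listed in this section."""
    if c.round_reported in (1, 2):
        return is_current(c)
    if c.round_reported != 3:
        return False
    if c.linked_2012_event:
        return True
    if c.is_injury and c.year_began is not None and c.year_began < 2013:
        return True
    if c.is_priority and is_current(c):
        if c.age_diag in MISSING_CODES:
            return True
        if (c.age_diag is not None and c.age_end_2012 is not None
                and c.age_diag <= c.age_end_2012):
            return True
    # Last criterion: 50 percent or more of the reference period is in 2012.
    return c.share_of_ref_period_in_2012 >= 0.5
```

For example, `panel17_included(Cond(round_reported=3, is_injury=True, year_began=2012))` is true under this reading, while the same record with `year_began=2013` and no other qualifying attribute is not.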

For most variables on the file, the codebook provides both weighted and unweighted frequencies. The exceptions to this are weight variables and variance estimation variables. Only unweighted frequencies of these variables are included in the accompanying codebook file. See the Weights Variables list in Appendix 1, Variable-Source Crosswalk.

Data from this file can be merged with 2012 MEPS person-level data to append person-level characteristics such as demographic or health insurance characteristics to each record by using DUPERSID (see Section 4.0 for details). Since each record represents a single condition reported by a household respondent, some household members may have multiple medical conditions and thus will be represented by multiple records on this file. Other household members may have had no reported medical conditions and thus will have no records on this file. Still other household members may have had a reported medical condition that did not meet the criteria above and thus will have no records on this file. Data from this file also can be merged to 2012 MEPS Event Files (HC-152A through HC-152H) by using the link files provided in HC-152I. (See HC-152I documentation for details.)
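The one-to-many relationship between persons and condition records can be illustrated with a small sketch. It assumes both files have already been read into lists of dictionaries keyed by variable name; the DUPERSID values and the person-level column `EXAMPLE_ATTR` are invented for illustration and do not come from any MEPS file.

```python
def attach_person_data(conditions, persons):
    """Append person-level fields to each condition record via DUPERSID.

    Condition fields win if a name appears in both records.
    """
    person_by_id = {p["DUPERSID"]: p for p in persons}
    merged = []
    for cond in conditions:
        person = person_by_id.get(cond["DUPERSID"], {})
        merged.append({**person, **cond})
    return merged


def condition_counts(conditions, persons):
    """Count condition records per person; persons with none get 0."""
    counts = {p["DUPERSID"]: 0 for p in persons}
    for cond in conditions:
        counts[cond["DUPERSID"]] = counts.get(cond["DUPERSID"], 0) + 1
    return counts


if __name__ == "__main__":
    # Invented identifiers and values, for illustration only.
    persons = [
        {"DUPERSID": "10001101", "EXAMPLE_ATTR": "a"},
        {"DUPERSID": "10001102", "EXAMPLE_ATTR": "b"},
    ]
    conditions = [
        {"DUPERSID": "10001101", "CONDN": 11},
        {"DUPERSID": "10001101", "CONDN": 12},
    ]
    print(attach_person_data(conditions, persons))
    print(condition_counts(conditions, persons))
```

The second person in the example has no condition records, which mirrors the case described above of household members who do not appear on this file.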

2.1 Codebook Structure

The codebook and data file list variables in the following order:

Unique person identifiers
Unique condition identifiers
Medical condition variables
Utilization variables
Weight and variance estimation variables

Note that the person identifier is unique within this data year.

2.2 Reserved Codes

The following reserved code values are used:

Value   Definition
-1      INAPPLICABLE       Question was not asked due to skip pattern
-7      REFUSED            Question was asked and respondent refused to answer question
-8      DK                 Question was asked and respondent did not know answer
-9      NOT ASCERTAINED    Interviewer did not record the data
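Because the reserved codes are negative numbers, they can distort means or counts if they are treated as data. A minimal sketch of separating them from substantive values is shown below; the function name is illustrative, and the sketch assumes the variable has already been read as an integer.

```python
# Reserved codes and labels as listed in this section.
RESERVED_CODES = {
    -1: "INAPPLICABLE",
    -7: "REFUSED",
    -8: "DK",
    -9: "NOT ASCERTAINED",
}


def split_reserved(value):
    """Return (value, None) for data or (None, label) for a reserved code."""
    if value in RESERVED_CODES:
        return None, RESERVED_CODES[value]
    return value, None


if __name__ == "__main__":
    for raw in (45, -1, -8):
        print(raw, split_reserved(raw))
```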

2.3 Codebook Format

This codebook describes an ASCII data set and provides the following programming identifiers for each variable:

Identifier    Description
Name          Variable name (maximum of 8 characters)
Description   Variable descriptor (maximum 40 characters)
Format        Number of bytes
Type          Type of data: numeric (indicated by NUM) or character (indicated by CHAR)
Start         Beginning column position of variable in record
End           Ending column position of variable in record
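The Start and End identifiers are 1-based, inclusive column positions, so reading a variable from an ASCII record amounts to a string slice. The sketch below shows the idea; the layout and the sample record are hypothetical and are not the actual HC-154 column positions, which should be taken from the codebook.

```python
def extract_field(record: str, start: int, end: int, var_type: str):
    """Slice one variable from a fixed-width record.

    start and end are 1-based and inclusive, as in the codebook.
    NUM fields are converted to int or float; CHAR fields stay as text.
    """
    raw = record[start - 1:end]
    if var_type == "NUM":
        text = raw.strip()
        if not text:
            return None
        return float(text) if "." in text else int(text)
    return raw


# Hypothetical layout for illustration: name -> (Start, End, Type).
EXAMPLE_LAYOUT = {
    "DUID": (1, 5, "NUM"),
    "PID": (6, 8, "NUM"),
    "DUPERSID": (9, 16, "CHAR"),
}


def parse_record(record: str, layout=EXAMPLE_LAYOUT) -> dict:
    """Apply a layout to a single record line."""
    return {name: extract_field(record, s, e, t) for name, (s, e, t) in layout.items()}


if __name__ == "__main__":
    print(parse_record("1000110110001101"))
```

Keeping character variables as text preserves leading zeros in identifiers, which matters when the identifier is later used as a merge key.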

2.4 Variable Naming

In general, variable names reflect the content of the variable, with an 8-character limitation. Edited variables end in an “X” and are so noted in the variable label. (CONDIDX, which is an encrypted identifier variable, also ends in an “X”.)

Variables contained in this delivery were derived either from the questionnaire itself or from the CAPI. The source of each variable is identified in Appendix 1 “Variable-Source Crosswalk.” Sources for each variable are indicated in one of three ways: (1) variables derived from CAPI or assigned in sampling are so indicated; (2) variables collected at one or more specific questions have those numbers and questionnaire sections indicated in the “SOURCE” column; and (3) variables constructed from multiple questions using complex algorithms are labeled “Constructed” in the “SOURCE” column.

2.5 File Contents

2.5.1 Identifier Variables (DUID-CONDRN)

The definitions of Dwelling Units (DUs) in the MEPS HC are generally consistent with the definitions employed for the National Health Interview Survey (NHIS). The dwelling unit ID (DUID) is a 5-digit random number assigned after the case was sampled for MEPS. The person number (PID) uniquely identifies each person within the dwelling unit.

The variable DUPERSID uniquely identifies each person represented on the file and is the combination of the variables DUID and PID.

CONDN is the condition number and uniquely identifies each condition reported for an individual. The range on this file for CONDN is 11-441 and the range of total records for any one person on the file is 1-42.

The variable CONDIDX uniquely identifies each condition (i.e., each record on the file) and is the combination of DUPERSID and CONDN. CONDIDX always has a length of 12: DUPERSID (8 characters) followed by CONDN (4 characters, padded with leading zeros where needed).
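The stated layout of CONDIDX (8 characters of DUPERSID followed by a 4-character, zero-padded CONDN) can be sketched as below. The identifier value is invented, and the helper names are illustrative; this only demonstrates the layout and is not the procedure used to produce the identifiers on the file.

```python
def make_condidx(dupersid: str, condn: int) -> str:
    """Combine DUPERSID (8 characters) with CONDN zero-padded to 4 digits."""
    if len(dupersid) != 8:
        raise ValueError("DUPERSID is expected to be 8 characters long")
    if not 0 <= condn <= 9999:
        raise ValueError("CONDN must fit in 4 digits")
    return f"{dupersid}{condn:04d}"


def split_condidx(condidx: str):
    """Recover (DUPERSID, CONDN) from a 12-character CONDIDX."""
    if len(condidx) != 12:
        raise ValueError("CONDIDX is expected to be 12 characters long")
    return condidx[:8], int(condidx[8:])


if __name__ == "__main__":
    # Invented DUPERSID for illustration.
    print(make_condidx("10001101", 11))
```

Treating DUPERSID and CONDIDX as character strings rather than numbers avoids losing the padding zeros.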

PANEL is a constructed variable used to specify the panel number for the interview in which the condition was reported. PANEL will indicate either Panel 16 or Panel 17.

CONDRN indicates the round in which the condition was first reported. For a small number of cases, conditions that actually began in an earlier round were not reported by respondents until subsequent rounds of data collection. During file construction, editing was performed for these cases in order to reconcile the round in which a condition began and the round in which the condition was first reported.

2.5.2 Medical Condition Variables (AGEDIAG-CCCODEX)

This file contains variables describing medical conditions reported by respondents in several sections of the MEPS questionnaire, including the Condition Enumeration section, all questionnaire sections collecting information about health provider visits, prescription medications, and disability days (see Variable-Source Crosswalk in Appendix 1 for details).

2.5.2.1 Priority Conditions and Injuries

Certain conditions were a priori designated as “priority conditions” due to their prevalence, expense, or relevance to policy. Some of these are long-term, life-threatening conditions, such as cancer, diabetes, emphysema, high cholesterol, hypertension, ischemic heart disease, and stroke. Others are chronic manageable conditions, including arthritis and asthma. The only mental health condition on the priority conditions list is attention deficit hyperactivity disorder/attention deficit disorder.

When a condition was first mentioned, respondents were asked whether it was due to an accident or injury (INJURY=1). Only non-priority conditions (i.e., conditions reported in a section other than PE) are eligible to be injuries. The interviewer is prevented from selecting priority conditions as injuries.

2.5.2.2 Age Priority Condition Began/Date Accident Occurred

The age of diagnosis (AGEDIAG) was collected for all priority conditions, except joint pain. The day, month, and year an accident or injury occurred (ACCDENTD, ACCDENTM, and ACCDENTY) were collected only for conditions that were reported as due to accident or injury. If the respondent did not know the accident year, or refused to provide it, or if the year was not ascertained (ACCDENTY in (-7, -8, -9)), a follow-up question gathered whether the accident occurred before or after January 1 of the reference year (ACCDNJAN). If the respondent replied that the accident occurred after January 1 of the reference year (ACCDNJAN = 2), then the reference year was used to set the accident year and ACCDNJAN was reset to Inapplicable (-1).
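The edit applied to ACCDENTY and ACCDNJAN can be summarized in a few lines. This is a sketch of the rule as described in this section, with an illustrative function name; the only ACCDNJAN value given a meaning here is 2 (after January 1 of the reference year), and any other value is simply passed through unchanged.

```python
MISSING_YEAR_CODES = (-7, -8, -9)  # refused, don't know, not ascertained
INAPPLICABLE = -1


def resolve_accident_year(accdenty: int, accdnjan: int, reference_year: int):
    """Return the edited (ACCDENTY, ACCDNJAN) pair.

    If the year is missing and the follow-up says the accident occurred
    after January 1 of the reference year (ACCDNJAN = 2), the reference
    year becomes the accident year and ACCDNJAN is reset to -1.
    """
    if accdenty in MISSING_YEAR_CODES and accdnjan == 2:
        return reference_year, INAPPLICABLE
    return accdenty, accdnjan


if __name__ == "__main__":
    print(resolve_accident_year(-8, 2, 2012))
```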

To ensure confidentiality, the accident year was bottom-coded to 1927 and age of diagnosis was top-coded to 85. This corresponds with the date of birth bottom-coding and age top-coding in person-level PUFs.
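Bottom-coding and top-coding amount to clamping a value at a floor or a ceiling. The sketch below illustrates this with the limits stated above. It assumes reserved codes (negative values) are left untouched, which is an assumption of this example rather than a documented processing step.

```python
ACCIDENT_YEAR_FLOOR = 1927  # accident years earlier than this are reported as 1927
AGE_DIAG_CAP = 85           # ages of diagnosis above this are reported as 85


def bottom_code_year(year: int, floor: int = ACCIDENT_YEAR_FLOOR) -> int:
    """Raise years below the floor to the floor; pass reserved codes through."""
    return year if year < 0 else max(year, floor)


def top_code_age(age: int, cap: int = AGE_DIAG_CAP) -> int:
    """Lower ages above the cap to the cap; pass reserved codes through."""
    return age if age < 0 else min(age, cap)


if __name__ == "__main__":
    print(bottom_code_year(1920), top_code_age(91))
```

For analysis, this means a value of 1927 or 85 should be read as "1927 or earlier" and "85 or older," respectively.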

2.5.2.3 Follow-up Questions for Injuries and Priority Conditions

When a respondent reported that a condition resulted from an accident or injury (INJURY=1), respondents were asked during the round in which the injury was first reported whether the accident/injury occurred at work (ACCDNWRK). This question was not asked about persons aged 15 and younger; the condition had ACCDNWRK coded to inapplicable (-1) for those persons.

For cancer conditions collected in the PE section, a follow-up question was asked when the cancer was first reported to determine whether the cancer was in remission/under control (REMISSN).

2.5.2.4 Sources for Conditions on the MEPS Conditions File

The records on this file correspond with medical condition records collected by CAPI and stored on a person’s MEPS conditions roster. Conditions can be added to the MEPS conditions roster in several ways. A condition can be reported in the Priority Condition Enumeration (PE) section in which persons are asked if they have been diagnosed with specific conditions. The condition can be identified as the reason reported by the household respondent for a particular medical event (hospital stay, outpatient visit, emergency room visit, home health episode, prescribed medication purchase, or medical provider visit). The condition may be reported as the reason for one or more episodes of disability days. Finally, the condition may be reported by the household-level respondent as a condition “bothering” the person during the reference period (see question CE03). Conditions reported in the PE section that are not current are not included on this file.

2.5.2.5 Treatment of Data from Rounds Not Occurring in 2012

Prior to the 2008 file, priority conditions reported during Rounds 1 and 2 of the second year panel were included on the file even if the conditions were not related to an event or disability day or reported as a serious condition occurring in the second year of the panel. Beginning in 2008, priority conditions are included on the file only if they are current conditions. A current condition is defined as a condition linked to an event or disability day or a condition the person is currently experiencing (i.e., a condition selected in the Condition Enumeration (CE) section). Conditions from Rounds 1 and 2 that are not included in the 2012 file are available in the 2011 Medical Conditions File. Note that, for some Rounds 1 and 2 records, data may not be available on the previous year’s file. This situation can occur when a person does not have a positive person or family weight in the first year but is assigned a positive weight in the subsequent year. The situation can also occur if the condition is a priority condition for which no events or disability days were reported in the first year but are reported in the second year. For 2012, 92 conditions from Panel 16 Rounds 1 and 2 are included on the 2012 Medical Conditions File for persons who did not appear on the previous year’s file.

Note: Priority conditions are generally chronic conditions. Even though a person may not have reported an event or disability day in 2012 due to the condition, or reported generally experiencing the condition in 2012, analysts should consider that the person is probably still experiencing the condition. If a Panel 16 person reported a priority condition in Round 1 or 2 and did not have an event or disability day for the condition in Round 3, 4, or 5, the condition will not be included on the 2012 Medical Conditions File.

2.5.2.6 Rounds in Which Conditions Were Reported/Selected (CRND1-CRND5)

A set of constructed variables indicates the round in which the condition was first reported (CONDRN), and the subsequent round(s) in which the condition was selected (CRND1-CRND5). The condition may be reported or selected when the person reports an event or disability day that occurred due to the condition, or the condition may be selected as a serious condition that is not linked to any events or disability days. For example, consider a condition for which CRND1 = 0, CRND2 = 1, and CRND3 = 1. For non-priority conditions, this sequence of CRND indicators on a condition record implies that the condition was not present during Round 1 (CRND1 = 0), was first mentioned during Round 2, and was selected during Round 3. For priority conditions, it is necessary to look at CONDRN rather than CRND# to determine in which round the condition was first reported. In addition to the scenario above, this sequence of CRND indicators may imply for priority conditions that the condition was reported in the PE section in Round 1 but was not connected with an event or disability day, and not selected in the CE section as a current condition until Rounds 2 and 3.
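The distinction between CONDRN and the CRND indicators can be made concrete with a short sketch. It assumes a condition record held as a dictionary keyed by variable name and treats a value of 1 as "selected in that round," following the example above; the sample record is invented.

```python
def rounds_selected(record: dict) -> list:
    """List the rounds (1 to 5) whose CRND indicator equals 1."""
    return [n for n in range(1, 6) if record.get(f"CRND{n}") == 1]


def first_reported_round(record: dict) -> int:
    """Use CONDRN, not the CRND indicators, for the round first reported."""
    return record["CONDRN"]


if __name__ == "__main__":
    # Invented priority-condition record: reported in the PE section in
    # Round 1, but not selected as current until Rounds 2 and 3.
    rec = {"CONDRN": 1, "CRND1": 0, "CRND2": 1, "CRND3": 1}
    print(first_reported_round(rec), rounds_selected(rec))
```

In the invented record, the first selected round (2) differs from the round first reported (1), which is the priority-condition scenario described in the text.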

2.5.2.7 Disability Flag Variables

This file contains three flag variables indicating whether a condition is associated with a missed work day (MISSWORK), a missed school day (MISSSCHL), or a day spent in bed (INBEDFLG). Due to the MEPS instrument design, there is no link indicating the specific number of disability days associated with a particular medical condition.

2.5.2.8 Diagnosis, Condition, and Procedure Codes

The medical conditions and procedures reported by the Household Component respondent were recorded by the interviewer as verbatim text, which was then coded by professional coders to fully-specified ICD-9-CM codes, including medical condition and V codes (see Health Care Financing Administration, 1980). Although codes were verified and error rates did not exceed 2.5 percent for any coder, analysts should not presume this level of precision in the data; the ability of household respondents to report condition data that can be coded accurately should not be assumed (see Cox and Iachan, 1987; Edwards, et al, 1994; and Johnson and Sanchez, 1993). Some condition information is collected in the Medical Provider Component of MEPS. However, since it is not available for everyone in the sample, it is not used to supplement, replace, or verify household-reported condition data.

Data analysts should also use caution when working with the procedure codes on this file. Procedure codes are gathered in the same manner as the conditions data, i.e., reports by household respondents. The survey does not prompt respondents for procedures, so procedures are under-reported. In addition, the ability of household respondents to accurately report procedures should not be assumed. Analysts should not use available data on procedures to make estimates of frequencies of specific procedures or to extrapolate to national estimates.

Professional coders followed specific guidelines in coding missing values to the ICD-9-CM diagnosis condition and procedure variables. The ICD-9-CM diagnosis condition variable (ICD9CODX) was coded -9 where the verbatim text fell into one of three categories: (1) the text indicated that the condition was unknown (e.g., DK); (2) the text indicated the condition could not be diagnosed by a doctor (e.g., doctor doesn’t know); or (3) the specified condition was not codeable and a procedure could not be discerned from the text. ICD9CODX was coded -1 where the verbatim text strictly denoted a procedure and not a condition. The ICD-9-CM procedure variable (ICD9PROX) was coded -9 where the verbatim text strictly denoted a procedure, but the text was not specific enough to assign a procedure code. ICD9PROX was set to -1 where the text strictly specified a condition and not a procedure.
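The meaning of -9 and -1 differs between ICD9CODX and ICD9PROX, so it may help to decode each variable separately. The sketch below restates the coding rules above as lookups; it assumes the codes have been read as character strings (for example "-9" or "034"), and the descriptive wording returned is this example's paraphrase, not an official label.

```python
def describe_icd9codx(value: str) -> str:
    """Interpret a value of the condition code variable ICD9CODX."""
    if value == "-9":
        return "condition unknown, undiagnosable, or not codeable"
    if value == "-1":
        return "text denoted a procedure, not a condition"
    return "condition code"


def describe_icd9prox(value: str) -> str:
    """Interpret a value of the procedure code variable ICD9PROX."""
    if value == "-9":
        return "procedure text not specific enough to code"
    if value == "-1":
        return "text specified a condition, not a procedure"
    return "procedure code"


if __name__ == "__main__":
    print(describe_icd9codx("-1"), "|", describe_icd9prox("-9"))
```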

In order to preserve confidentiality, nearly all of the diagnosis condition codes provided on this file have been collapsed from fully-specified codes to 3-digit code categories. Table 1 in Appendix 2 provides unweighted and weighted frequencies for all ICD-9-CM condition code values reported on the file. In this table, values that reflect this collapsing have an asterisk in the label indicating that the 3-digit category includes all the subclassifications within that category. For example, the ICD9CODX value of 034 “Strep Throat/Scarlet Fev *” includes the fully-specified subclassifications 034.0 and 034.1; the value 296 “Affective Psychoses*” includes the fully-specified subclassifications 296.0 through 296.99. Less than 1 percent of the records on this file were edited further by collapsing two or more 3-digit codes into one 3-digit code.

Similarly, most of the procedure codes were collapsed from fully-specified codes to 2-digit category codes. Table 2 in Appendix 2 provides unweighted and weighted frequencies for ICD9PROX, and this type of collapsing is identified by an asterisk in the variable label. For example, the ICD9PROX value of 81 “Joint Repair*” includes subclassifications 81.0 through 81.99. Less than 1 percent of records were further edited to combine two or more 2-digit categories.
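The collapsing described above amounts to keeping only the category portion of a fully-specified code, that is, the part before the decimal point. A minimal sketch follows; it assumes codes are written with a decimal point, as in the examples above, and it does not reproduce the further manual edits that combined two or more categories.

```python
def collapse_code(full_code: str) -> str:
    """Keep the category part of a code (the text before the decimal point)."""
    return full_code.split(".")[0]


if __name__ == "__main__":
    # Two distinct fully-specified condition codes collapse to one category.
    print([collapse_code(c) for c in ("034.0", "034.1")])
    # A procedure subclassification collapses to its 2-digit category.
    print(collapse_code("81.99"))
```

Because distinct fully-specified codes can share a category, the collapsed values are not unique, so duplicate codes for the same person or event are to be expected.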

Note that, for conditions related to certain medical events, the ICD-9-CM codes on this file are also released in the Prescribed Medicines, Emergency Room Visits, Office-based Medical Provider Visits, Outpatient Department Visits, and Inpatient Hospital Stays Event Files. Because the ICD-9-CM codes have been collapsed, it is possible for there to be duplicate ICD-9-CM condition or procedure codes linked to a single medical event when different fully-specified codes are collapsed into the same code. For information on merging data on this file with the 2012 MEPS Event Files (HC-152A through HC-152H) refer to the link files provided in HC-152I, and see HC-152I documentation for details.

Each year certain ICD-9-CM codes are ‘retired’ from use. Beginning in 2012, these codes are removed from the ‘history’ table (Appendix 3) prior to condition coding processing and listed separately for reference (see List of Invalid ICD-9-CM Codes in Appendix 4).

In a small number of cases, diagnosis, condition, and procedure codes were further recoded to -9 if they denoted a pregnancy for a person younger than 16 or older than 44. There were 12 records recoded in this manner on the 2012 Medical Conditions File. The person’s age was determined by linking the 2012 Medical Conditions File to the 2011 and 2012 Person-Level Use PUFs. If the person’s age is under 16 or over 44 in the round in which the condition or procedure was reported, the appropriate condition or procedure code was recoded to -9.
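The age-based pregnancy edit can be expressed as a simple rule. In the sketch below, whether a code denotes a pregnancy is passed in as a flag because the list of pregnancy codes is not given in this section; the code value used in the example is illustrative only.

```python
def recode_pregnancy(code: str, denotes_pregnancy: bool, age_in_round: int) -> str:
    """Recode a pregnancy code to -9 for persons younger than 16 or older than 44."""
    if denotes_pregnancy and (age_in_round < 16 or age_in_round > 44):
        return "-9"
    return code


if __name__ == "__main__":
    # "V22" is used purely as an illustrative pregnancy-related category.
    print(recode_pregnancy("V22", True, 15), recode_pregnancy("V22", True, 30))
```

Ages 16 and 44 themselves are left unchanged, since the rule applies only to persons under 16 or over 44.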

Users should note that because of the design of the survey, most deliveries (i.e., births) are coded as pregnancies. For more accurate estimates for deliveries, analysts should use RSNINHOS “Reason Entered Hospital” found on the Hospital Inpatient Stays Public Use File (HC-152D).

Conditions and procedures were reported in the same sections of the HC questionnaire (see Variable-Source Crosswalk in Appendix 1). Labels for all values of the variables ICD9CODX and ICD9PROX, as shown in Tables 1 and 2, are provided in the SAS programming statements included in this release (see the H154SU.TXT file).

2.5.2.9 Clinical Classification Codes

ICD-9-CM condition codes have been aggregated into clinically meaningful categories that group similar conditions (CCCODEX). CCCODEX was generated using Clinical Classification Software (formerly known as Clinical Classifications for Health Care Policy Research (CCHPR)), which aggregates conditions and V-codes into mutually exclusive categories, most of which are clinically homogeneous (Elixhauser, et al, 2000). Appendix 3 lists the ICD-9-CM codes that have been aggregated for each clinical classification category.

The reported ICD-9-CM condition code values were mapped to the appropriate clinical classification category prior to being collapsed to 3-digit ICD-9-CM condition codes. The result is that every record which has an ICD-9-CM diagnosis code also has a clinical classification code.

Beginning with the FY12 Conditions file, for confidentiality purposes, ICD-9-CM codes are recoded to broader codes by clinicians for conditions that occur fewer than 20 times within a year’s conditions file and for clinically rare conditions. A condition is deemed clinically rare if it appears on the National Institutes of Health’s list of rare diseases. Each year, a few conditions on the final file fall below the confidentiality threshold. This is due to the multistage file development process. The confidentiality recoding is performed on the preliminary version of the Conditions file each year. This preliminary version is used in the development of other event PUFs and, in turn, these event PUFs are used in the development of the final conditions file. During this process, some records from the preliminary file are dropped because only records that are relevant to the current data year are reflected in the final Conditions PUF.
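Analysts who want to see which codes on a given file fall below the frequency threshold of 20 can tabulate unweighted counts. The sketch below shows one way to do this for a list of code values; it covers only the frequency criterion, not the clinical-rarity criterion, and the sample codes are invented.

```python
from collections import Counter

CONFIDENTIALITY_THRESHOLD = 20  # codes occurring fewer times than this are flagged


def codes_below_threshold(codes, threshold: int = CONFIDENTIALITY_THRESHOLD):
    """Return, in sorted order, the codes that occur fewer than `threshold` times."""
    counts = Counter(codes)
    return sorted(code for code, n in counts.items() if n < threshold)


if __name__ == "__main__":
    # Invented frequencies for illustration.
    sample = ["401"] * 25 + ["250"] * 20 + ["034"] * 3
    print(codes_below_threshold(sample))
```

As the text above explains, a few codes on the final file may still appear fewer than 20 times because records are dropped after the recoding step.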

CCS codes are assigned to the original fully-specified ICD-9-CM codes. When the original ICD-9-CM codes undergo recoding, no changes are made to the assigned CCS codes.

As with ICD9CODX and ICD9PROX, professional coders followed specific guidelines in setting CCCODEX to a missing value. CCCODEX was coded -9 where the verbatim text fell into one of three categories: (1) the text indicated that the condition was unknown (e.g., DK); (2) the text indicated the condition could not be diagnosed by a doctor (e.g., doctor doesn’t know); or (3) the specified condition was not codeable and a procedure could not be discerned from the text. CCCODEX was coded -1 where the verbatim text strictly denotes a procedure and not a condition.

A small number (less than 1 percent) of clinical classification codes have been edited for confidentiality purposes. Table 3 in Appendix 2 provides weighted and unweighted frequencies for CCCODEX. Labels for all values of the variable CCCODEX, as shown in Table 3, are provided in the SAS programming statements included in this release (see the H154SU.TXT file).

In a small number of cases, clinical classification codes were further recoded to -9 if they denoted a pregnancy for a person younger than 16 or older than 44. There were 12 records recoded in this manner on the 2012 Medical Conditions File. The person’s age was determined by linking the 2012 Medical Conditions File to the 2011 and 2012 Person-Level Use PUFs. If the person’s age was under 16 or over 44 in the round in which the condition was reported, the corresponding clinical classification code was recoded to -9.
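The age-based recode described above can be expressed as a small rule. The sketch below is illustrative only: the function name is hypothetical, and the set of pregnancy-related clinical classification codes is passed in as a parameter because this section does not list which categories were treated as denoting a pregnancy (see Appendix 3 for the category definitions).

```python
NOT_ASCERTAINED = -9  # value the text says these records were recoded to


def recode_pregnancy(cccodex, age, pregnancy_codes, min_age=16, max_age=44):
    """Return -9 for a pregnancy category reported outside ages 16 to 44."""
    if cccodex in pregnancy_codes and (age < min_age or age > max_age):
        return NOT_ASCERTAINED
    return cccodex


# Hypothetical set of pregnancy-related categories, for illustration only.
example_pregnancy_codes = {"196"}

print(recode_pregnancy("196", 15, example_pregnancy_codes))  # recoded
print(recode_pregnancy("196", 30, example_pregnancy_codes))  # unchanged
```

Ages 16 and 44 themselves are left unchanged, matching the "younger than 16 or older than 44" wording.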

Note that prior to 2004, the range for the variable CCCODEX was 001 through 260. In 2004, revisions to the coding of mental disorders were implemented: the codes 650 through 663 replaced 065 through 075. Beginning in 2007, the mental disorders codes were reorganized again. Alcohol and substance abuse disorders were broken into separate categories, and miscellaneous mental disorders were renumbered.

Analysts should use the clinical classification codes listed in the Conditions PUF document (HC-154) and the Appendix to the Event Files document (HC-152I) when analyzing MEPS conditions data. Although there is a list of clinical classification codes and labels on the Healthcare Cost and Utilization Project (HCUP) Website, if updates to these codes and/or labels are made on the HCUP Website after the release of the 2012 MEPS PUFs, these updates will not be reflected in the 2012 MEPS data.

2.5.3 Utilization Variables (OBNUM-RXNUM)

The variables OBNUM, OPNUM, HHNUM, IPNUM, ERNUM, and RXNUM indicate the total number of 2012 events that can be linked to each condition record on the current file, i.e., office-based, outpatient, home health, inpatient hospital stays, emergency room visits, and prescribed medicines, respectively.

These counts of events were derived from Expenditure Event Public Use Files (HC-152G, HC-152F, HC-152H, HC-152D, HC-152E, and HC-152A). Events associated with conditions include all utilization that occurred between January 1, 2012 and December 31, 2012.

Because a person can be seen for more than one condition per visit, these frequencies will not match person-level or event-level utilization counts. For example, if a person had one inpatient hospital stay and was treated for a fractured hip, a fractured shoulder, and a concussion, each of these conditions has its own record in this file, with IPNUM=1 on each record. Summing IPNUM across these three records gives three inpatient hospital stays, when in fact there was one stay during which three conditions were treated. These variables are instead useful for determining the number of events associated with a particular condition, such as the number of inpatient hospital stays for head injuries or for hip fractures.
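The double-counting pitfall in the example above can be reproduced in a few lines. This is a sketch with made-up records; the stay_id field is a hypothetical event identifier used only for illustration and is not a variable on the Conditions File.

```python
# Hypothetical condition records for one person with a single inpatient stay.
records = [
    {"condition": "fractured hip", "IPNUM": 1, "stay_id": "stay-1"},
    {"condition": "fractured shoulder", "IPNUM": 1, "stay_id": "stay-1"},
    {"condition": "concussion", "IPNUM": 1, "stay_id": "stay-1"},
]

# Summing the per-condition counts counts the same stay three times.
summed_ipnum = sum(r["IPNUM"] for r in records)

# Counting distinct events gives the true number of stays.
distinct_stays = len({r["stay_id"] for r in records})

print(summed_ipnum, distinct_stays)
```

Person-level or event-level utilization totals should therefore be taken from the person-level or event files rather than from sums over condition records.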

3.0 Survey Sample Information

3.1 Overview

There is a single full year person-level weight (PERWT12F) assigned to each record for each key, in-scope person who responded to MEPS for the full period of time that he or she was in-scope during 2012. A key person was either a member of a responding NHIS household at the time of the interview or joined a family associated with such a household after being out-of-scope at the time of the NHIS (the latter circumstance includes newborns as well as those returning from military service, an institution, or residence in a foreign country). A person is in-scope whenever he or she is a member of the civilian noninstitutionalized portion of the U.S. population.

There has been an important change in the MEPS sample design that is worth noting. A new NHIS sample design was implemented in 2006 with a new sample of PSUs and segments, independent of the sample design used from 1995-2005. To the extent that the new NHIS design provides better coverage of the civilian noninstitutionalized U.S. population in general and specific subgroups in particular, differences between estimates based on the old and new designs could arise in both the NHIS and MEPS due to such improved coverage rather than actual changes in the characteristics of the target population.

3.2 Details on Person Weight Construction

The person-level weight PERWT12F was developed in several stages. First, person-level weights for Panel 16 and Panel 17 were created separately. The weighting process for each panel included adjustments for nonresponse over time and calibration to independent population totals. The calibration was initially accomplished separately for each panel by raking the corresponding sample weights to Current Population Survey (CPS) population estimates based on five variables. The five variables used in the establishment of the initial person-level control figures were: census region (Northeast, Midwest, South, West); MSA status (MSA, non-MSA); race/ethnicity (Hispanic; Black, non-Hispanic; Asian, non-Hispanic; and other); sex; and age.

A 2012 composite weight was then formed by multiplying each weight from Panel 16 by the factor 0.49 and each weight from Panel 17 by the factor 0.51. The choice of factors reflected the relative sample sizes of the two panels, helping to limit the variance of estimates obtained from pooling the two samples. The composite weight was raked to the same set of CPS-based control totals.

When the poverty status information derived from income variables became available, a final raking was undertaken on the previously established weight variable. Control totals were established using poverty status (five categories: below poverty, from 100 to 125 percent of poverty, from 125 to 200 percent of poverty, from 200 to 400 percent of poverty, at least 400 percent of poverty), the other five variables previously used in the weight calibration, as well as age categories cross-classified with categories associated with numbers of office-based visits and age categories cross-classified with categories reflecting the number of prescribed medicines purchased.
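The compositing step can be illustrated with a short sketch. The factors 0.49 and 0.51 come from the text; the function name and the example weights are hypothetical.

```python
# Compositing factors stated in the text, keyed by panel number.
PANEL_FACTORS = {16: 0.49, 17: 0.51}


def composite_weight(panel, panel_weight):
    """Scale a panel-specific person weight by that panel's factor."""
    return PANEL_FACTORS[panel] * panel_weight


# Hypothetical panel-specific weights for two sample persons.
w16 = composite_weight(16, 10000.0)
w17 = composite_weight(17, 10000.0)
print(w16, w17)
```

Because the two factors sum to 1, if each panel's weights separately sum to the same population total, the composite weights pooled across both panels sum to that total as well.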

3.2.1 MEPS Panel 16 Weight Development Process

The person-level weight for MEPS Panel 16 was developed using the 2011 full year weight for an individual as a “base” weight for survey participants present in 2011. For key, in-scope members who joined an RU some time in 2012 after being out-of-scope in 2011, the initially assigned person-level weight was the corresponding 2011 family weight. The weighting process included an adjustment for person-level nonresponse over Rounds 4 and 5 as well as raking to population control figures for December 2012 for key, responding persons in-scope on December 31, 2012. These control figures were derived by scaling back the population distribution obtained from the March 2013 CPS to reflect the December 31, 2012 estimated population total (estimated based on Census projections for January 1, 2013). Variables used for person-level raking included: census region (Northeast, Midwest, South, West); MSA status (MSA, non-MSA); race/ethnicity (Hispanic; Black, non-Hispanic; Asian, non-Hispanic; and other); sex; and age. The final weight for key, responding persons who were not in-scope on December 31, 2012 but were in-scope earlier in the year was the person weight after the nonresponse adjustment.
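Raking adjusts weights one dimension at a time, repeating until the weighted totals agree with the control totals on every dimension. The sketch below is a generic two-variable illustration of that idea, not the production MEPS procedure: the function name, the fixed iteration count, the toy records, and the control totals are all assumptions made for the example.

```python
def rake(records, controls, n_iter=50):
    """Iteratively scale weights so each variable's weighted totals match
    its control totals.

    records:  list of dicts with a 'weight' key and one key per variable
    controls: {variable: {category: control_total}}
    """
    weights = [r["weight"] for r in records]
    for _ in range(n_iter):
        for var, totals in controls.items():
            # Current weighted total for each category of this variable.
            current = {}
            for r, w in zip(records, weights):
                current[r[var]] = current.get(r[var], 0.0) + w
            # Scale each weight by control total / current total.
            weights = [w * totals[r[var]] / current[r[var]]
                       for r, w in zip(records, weights)]
    return weights


# Toy example with hypothetical control totals.
records = [
    {"region": "N", "sex": "F", "weight": 1.0},
    {"region": "N", "sex": "M", "weight": 1.0},
    {"region": "S", "sex": "F", "weight": 1.0},
    {"region": "S", "sex": "M", "weight": 1.0},
]
controls = {
    "region": {"N": 60.0, "S": 40.0},
    "sex": {"F": 55.0, "M": 45.0},
}
raked = rake(records, controls)
print([round(w, 3) for w in raked])
```

A production implementation would typically stop on a convergence criterion rather than a fixed number of passes and would handle empty or very small cells explicitly.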

3.2.2 MEPS Panel 17 Weight Development Process

The person-level weight for MEPS Panel 17 was developed using the 2012 MEPS Round 1 person-level weight as a “base” weight. For key, in-scope members who joined an RU after Round 1, the Round 1 family weight served as a “base” weight. The weighting process included an adjustment for nonresponse over the remaining data collection rounds in 2012 as well as raking to the same population control figures for December 2012 used for the MEPS Panel 16 weights for key, responding persons in-scope on December 31, 2012. The same five variables employed for Panel 16 raking (census region, MSA status, race/ethnicity, sex, and age) were used for Panel 17 raking. Again, the final weight for key, responding persons who were not in-scope on December 31, 2012 but were in-scope earlier in the year was the person weight after the nonresponse adjustment.

Note that the MEPS Round 1 weights incorporated the following components: the original household probability of selection for the NHIS; ratio-adjustment to NHIS-based national population estimates at the household (occupied dwelling unit) level; adjustment for nonresponse at the dwelling unit level for Round 1; and poststratification to figures at the family and person level obtained from the March CPS database of the corresponding year (i.e., 2011 for Panel 16 and 2012 for Panel 17).

3.2.3 The Final Weight for 2012

The final raking of those in-scope at the end of the year has been described above. In addition, the composite weights of two groups of persons who were out-of-scope on December 31, 2012 were poststratified. Specifically, the weights of those who were in-scope some time during the year, out-of-scope on December 31, and entered a nursing home during the year were poststratified to a corresponding control total obtained from the 1996 MEPS Nursing Home Component. The weights of persons who died while in-scope during 2012 were poststratified to corresponding estimates derived using data obtained from the Medicare Current Beneficiary Survey (MCBS) and Vital Statistics information provided by the National Center for Health Statistics (NCHS). Separate decedent control totals were developed for the “65 and older” and “under 65” civilian noninstitutionalized populations.
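Poststratification to a single control total amounts to scaling every weight in the group by a common factor. The following is a minimal sketch of that arithmetic; the function name, the weights, and the control total are hypothetical and do not reproduce the actual nursing home or decedent controls.

```python
def poststratify(weights, control_total):
    """Scale a group's weights so that they sum to the control total."""
    factor = control_total / sum(weights)
    return [w * factor for w in weights]


# Hypothetical group of three persons and a hypothetical control total.
group_weights = [1200.0, 800.0, 2000.0]
adjusted = poststratify(group_weights, 5000.0)
print(adjusted)
```

The relative sizes of the weights within the group are preserved; only their total changes.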

In developing the final person-level weight for 2012 (PERWT12F), two raking dimensions were added. One reflected the MEPS 2009-2011 estimated average annual distribution of office-based visits by age (under 65, 65 and over) while the other reflected the MEPS 2009-2011 estimated average distribution of prescription medicine purchases, also by the same age groups. These additional adjustments were included to better reflect benchmark trends for these two measures of health care utilization.

For each category of the additional two raking dimensions, the tables below show the ratio of the weighted estimate of persons that resulted from including the additional raking dimension to the weighted estimate of persons without the additional dimension.

Ratio of Adjusted to Unadjusted Weights for Office-based Raking Dimension

Number of Office-based Visits    Under 65 (AGE12X < 65)    65 or Older (AGE12X >= 65)
0                                0.87188                   0.95404
1-5                              1.03549                   0.94513
6-10                             1.12561                   0.99076
> 10                             1.16699                   1.09270

Ratio of Adjusted to Unadjusted Weights for Prescribed Medicine Raking Dimension

Number of Prescribed Medicine Purchases    Under 65 (AGE12X < 65)    65 or Older (AGE12X >= 65)
0                                          0.91674                   0.89169
> 0                                        1.07082                   1.01080
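Each ratio in the tables above is a weighted person count computed with the additional raking dimension divided by the same count computed without it. A sketch of that calculation for one category follows; the function name and the weights are hypothetical and are not taken from the file.

```python
def adjustment_ratio(unadjusted, adjusted):
    """Ratio of the weighted person estimate with the extra raking
    dimension to the estimate without it, for one category."""
    return sum(adjusted) / sum(unadjusted)


# Hypothetical weights for persons in one visit-count-by-age category.
unadjusted = [100.0, 200.0, 300.0]
adjusted = [90.0, 180.0, 270.0]
ratio = adjustment_ratio(unadjusted, adjusted)
print(round(ratio, 5))
```

A ratio below 1 means the additional dimension reduced the weighted number of persons in that category, and a ratio above 1 means it increased it.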

Overall, the weighted population estimate for the civilian noninstitutionalized population for December 31, 2012 is 309,875,841 (PERWT12F>0 and INSC1231=1). The sum of the person-level weights across all persons assigned a positive person-level weight is 313,489,853.
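The two totals quoted above correspond to two different filters on person-level records. The sketch below assumes person-level records that carry PERWT12F and INSC1231 (one record per person, not one per condition); the function name and the sample records are hypothetical.

```python
def population_estimates(persons):
    """Return (estimate for persons in scope on December 31,
    sum of weights over all persons with a positive weight)."""
    positive = [p for p in persons if p["PERWT12F"] > 0]
    total_all = sum(p["PERWT12F"] for p in positive)
    total_dec31 = sum(p["PERWT12F"] for p in positive if p["INSC1231"] == 1)
    return total_dec31, total_all


# Hypothetical person-level records; any INSC1231 value other than 1 is
# treated here as not in scope on December 31.
persons = [
    {"PERWT12F": 1000.0, "INSC1231": 1},
    {"PERWT12F": 2500.5, "INSC1231": 1},
    {"PERWT12F": 800.0, "INSC1231": 2},
    {"PERWT12F": 0.0, "INSC1231": 1},
]
dec31_total, all_total = population_estimates(persons)
print(dec31_total, all_total)
```

Applying the same sums directly to the Conditions File would count a person's weight once per condition record, so the person-level file is the appropriate input.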

3.2.4 Coverage

The target population for MEPS in this file is the 2012 U.S. civilian noninstitutionalized population. However, the MEPS sampled households are a subsample of the NHIS households interviewed in 2010 (Panel 16) and 2011 (Panel 17). New households created after the NHIS interviews for the respective panels and consisting exclusively of persons who entered the target population after 2010 (Panel 16) or after 2011 (Panel 17) are not covered by MEPS. Neither are previously out-of-scope persons who join an existing household but are unrelated to the current household residents. Persons not covered by a given MEPS panel thus include some members of the following groups: immigrants, persons leaving the military, U.S. citizens returning from residence in another country, and persons leaving institutions. The set of uncovered persons constitutes only a small segment of the MEPS target population.

3.3 Using MEPS Data for Trend Analysis

MEPS began in 1996, and the utility of the survey for analyzing health care trends expands with each additional year of data; however, it is important to consider a variety of factors when examining trends over time using MEPS. Statistical significance tests should be conducted to assess the likelihood that observed trends are not attributable to sampling variation. The length of time being analyzed should also be considered. In particular, large shifts in survey estimates over short periods of time (e.g., from one year to the next) that are statistically significant should be interpreted with caution unless they are attributable to known factors such as changes in public policy, economic conditions, or MEPS survey methodology. Looking at changes over longer periods of time can provide a more complete picture of underlying trends.
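One common way to test whether two annual estimates differ is a z statistic built from the estimates and their standard errors. The sketch below is a simplification under a stated assumption: it treats the two estimates as independent, which is not strictly true for adjacent MEPS years because they share a panel. The function name and the numbers are hypothetical, and the standard errors themselves would come from design-based software using VARSTR and VARPSU.

```python
import math


def z_statistic(est1, se1, est2, se2):
    """z statistic for the difference between two estimates, assuming the
    two estimates are independent (an assumption of this sketch)."""
    return (est2 - est1) / math.sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical estimates and standard errors for two years.
z = z_statistic(10.0, 3.0, 17.0, 4.0)
print(round(z, 2))
```

An absolute value above roughly 1.96 corresponds to significance at the 0.05 level for a two-sided test under a normal approximation.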

Analysts may wish to consider using techniques to evaluate, smooth, or stabilize analyses of trends using MEPS data, such as comparing pooled time periods (e.g., 2008-2009 versus 2011-2012), working with moving averages, or using modeling techniques with several consecutive years of MEPS data to test the fit of specified patterns over time. However, there are issues with pooling, as well as with comparing conditions data gathered prior to 2007 with data collected in 2007 and beyond. Improved methods for collecting priority conditions data (a Priority Conditions Enumeration section and priority conditions automatically flagged by CAPI) were implemented for many of the conditions beginning in 2007.
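As one illustration of the smoothing techniques mentioned above, a simple moving average replaces each annual estimate with the mean of a short run of consecutive years. The function name, window length, and annual values below are hypothetical.

```python
def moving_average(values, window=3):
    """Return the simple moving average over each run of `window`
    consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical annual estimates for five consecutive years.
annual = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = moving_average(annual, window=3)
print(smoothed)
```

Given the 2007 change in how priority conditions were collected, a window that spans 2006 and 2007 would mix data gathered under the two methods.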

Finally, researchers should be aware of the impact of multiple comparisons on Type I error. Without making appropriate allowance for multiple comparisons, undertaking numerous statistical significance tests of trends increases the likelihood of concluding that a change has taken place when one has not.
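The effect of multiple comparisons can be quantified with a short sketch. The Bonferroni correction shown here is one widely used allowance, chosen for illustration; this document does not prescribe a particular method. The function names and p-values are hypothetical, and the family-wise error calculation assumes independent tests.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests,
    each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that remain significant after a Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# With 10 tests at the 0.05 level, the chance of a false positive is large.
fwer = familywise_error_rate(0.05, 10)
print(round(fwer, 3))

# Hypothetical p-values from five trend tests.
flags = bonferroni_significant([0.001, 0.02, 0.04, 0.3, 0.6])
print(flags)
```

Bonferroni is conservative; less conservative procedures exist, and the choice among them is left to the analyst.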

4.0 Merging/Linking MEPS Data Files

Data from the current file can be used alone or in conjunction with other files. Merging characteristics of interest from person-level files expands the scope of potential estimates. See HC-152I for instructions on merging the Conditions File to the Medical Event Files. Person-level characteristics can be merged to this Conditions File using the following procedure:

  1. Sort the person-level file by person identifier, DUPERSID. Keep only DUPERSID and the variables to be merged onto the Conditions File.

  2. Sort the Conditions File by person identifier, DUPERSID.

  3. Merge both files by DUPERSID, and output all records in the Conditions File.

If PERS contains the person-level variables and COND is the Conditions File, the following code can be used to add person-level variables to each person’s condition records on the Conditions File.

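/* Sort the person-level file, keeping only DUPERSID and the variables to merge. */
PROC SORT DATA=PERS(KEEP=DUPERSID AGE SEX EDUCLEVL EDULEV EDRECODE)
          OUT=PERSX;
  BY DUPERSID;
RUN;

/* Sort the Conditions File by the same person identifier. */
PROC SORT DATA=COND;
  BY DUPERSID;
RUN;

/* Merge by DUPERSID. IF A keeps every Conditions File record
   and drops persons who have no condition records. */
DATA COND;
  MERGE COND(IN=A) PERSX(IN=B);
  BY DUPERSID;
  IF A;
RUN;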
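For analysts not working in SAS, the same merge logic can be sketched in Python using only the standard library. This is an illustrative equivalent rather than part of the release: the function name and the sample records (including the code and age values) are hypothetical, and DUPERSID is the only variable name taken from the file.

```python
def merge_person_vars(cond_rows, person_rows, keep):
    """Attach selected person-level variables to every condition record,
    keeping all condition records (the role of IF A in the SAS step)."""
    lookup = {p["DUPERSID"]: {k: p.get(k) for k in keep} for p in person_rows}
    missing = {k: None for k in keep}
    merged = []
    for c in cond_rows:
        row = dict(c)
        # Persons absent from the person file get None for each variable.
        row.update(lookup.get(c["DUPERSID"], missing))
        merged.append(row)
    return merged


# Hypothetical records: person A1 has two conditions, B2 has no person
# record, and C3 has no condition records.
cond = [
    {"DUPERSID": "A1", "CCCODEX": "098"},
    {"DUPERSID": "A1", "CCCODEX": "128"},
    {"DUPERSID": "B2", "CCCODEX": "049"},
]
pers = [
    {"DUPERSID": "A1", "AGE": 40, "SEX": 1},
    {"DUPERSID": "C3", "AGE": 9, "SEX": 2},
]
merged = merge_person_vars(cond, pers, keep=["AGE", "SEX"])
print(len(merged))
```

The result has one row per condition record, so person-level values repeat across a person's conditions and persons with no conditions do not appear.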

4.1 National Health Interview Survey (NHIS)

Data from this file can be used alone or in conjunction with other files for different analytic purposes. Each MEPS panel can also be linked back to the previous year’s National Health Interview Survey public use data files. For information on obtaining MEPS/NHIS link files, please see meps.ahrq.gov/data_stats/more_info_download_data_files.jsp.

4.2 Longitudinal Analysis

Panel-specific longitudinal files are available for downloading in the data section of the MEPS Web site. For each panel, the longitudinal file comprises MEPS survey data obtained in Rounds 1 through 5 of the panel and can be used to analyze changes over a two-year period. Variables in the file pertaining to survey administration, demographics, employment, health status, disability days, quality of care, patient satisfaction, health insurance, and medical care use and expenditures were obtained from the MEPS full-year Consolidated files from the two years covered by that panel.

References

Cohen, S. B. (1996). The Redesign of the Medical Expenditure Panel Survey: A Component of the DHHS Survey Integration Plan. Proceedings of the COPAFS Seminar on Statistical Methodology in the Public Service.

Cox, B. and Iachan, R. (1987). A Comparison of Household and Provider Reports of Medical Conditions. Journal of the American Statistical Association 82(400): 1013-18.

Edwards, W. S., Winn, D. M., Kurlantzick, V., et al. (1994). Evaluation of National Health Interview Survey Diagnostic Reporting. National Center for Health Statistics, Vital Health Stat 2(120).

Elixhauser, A., Steiner, C. A., Whittington, C. A., and McCarthy, E. (2000). Clinical Classifications for Health Policy Research: Hospital Inpatient Statistics, 1995. Healthcare Cost and Utilization Project, HCUP-3 Research Note. Rockville, MD: Agency for Healthcare Research and Quality. AHCPR Pub. No. 98-0049.

Ezzati-Rice, T. M., Rohde, F., and Greenblatt, J. (2008). Sample Design of the Medical Expenditure Panel Survey Household Component, 1998-2007. Methodology Report No. 22, March 2008. Agency for Healthcare Research and Quality, Rockville, MD.

Health Care Financing Administration (1980). International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM). Vol. 1. (Department of Health and Human Services Pub. No. (PHS) 80-1260). Department of Health and Human Services: U.S. Public Health Service.

Johnson, A. E., and Sanchez, M. E. (1993). Household and Medical Reports on Medical Conditions: National Medical Expenditure Survey. Journal of Economic and Social Measurement 19: 199-223.

Appendix 1. Variable-Source Crosswalk

Unique Identifier Variables

Variable  Label                                       Source [1]
DUID      Dwelling Unit ID                            Assigned In Sampling
PID       Person Number                               Assigned In Sampling
DUPERSID  Person ID (DUID + PID)                      Assigned In Sampling
CONDN     Condition Number                            CAPI Derived
CONDIDX   Condition ID                                CAPI Derived
PANEL     Panel Number                                Constructed
CONDRN    Condition Round Number                      CAPI Derived

Medical Condition Variables

Variable  Label                                       Source [1]
AGEDIAG   Age When Diagnosed                          PE section
REMISSN   Is Cancer in Remission/Under Control        PE25
CRND1     Has Condition Information In Round 1        Constructed
CRND2     Has Condition Information In Round 2        Constructed
CRND3     Has Condition Information In Round 3        Constructed
CRND4     Has Condition Information In Round 4        Constructed
CRND5     Has Condition Information In Round 5        Constructed
INJURY    Was Condition Due To Accident/Injury        CN02
ACCDENTD  Date Of Accident -- Day                     CN06
ACCDENTM  Date Of Accident -- Month                   CN06
ACCDENTY  Date Of Accident -- Year                    CN06
ACCDNJAN  Accident/Injury Occur Before/After Jan 1    CN06A
ACCDNWRK  Did Accident Occur At Work                  CN07
MISSWORK  Flag Associated With Missed Work Days       DD03
MISSSCHL  Flag Associated With Missed School Days     DD06
INBEDFLG  Flag Associated With Bed Days               DD09
ICD9CODX  ICD-9-CM Code For Condition - Edited        CE05, HS04, ER04, OP09, MV09, HH05, PM09 (Edited)
ICD9PROX  ICD-9-CM Code For Procedure - Edited        CE05, HS04, ER04, OP09, MV09, HH05, PM09 (Edited)
CCCODEX   Clinical Classification Code - Edited       Constructed/Edited

Utilization Variables

Variable  Label                                       Source [1]
HHNUM     # Home Health Events Assoc. w/ Condition    Constructed
IPNUM     # Inpatient Events Assoc. w/ Condition      Constructed
OPNUM     # Outpatient Events Assoc. w/ Condition     Constructed
OBNUM     # Office-Based Events Assoc. w/ Condition   Constructed
ERNUM     # ER Events Assoc. w/ Condition             Constructed
RXNUM     # Prescribed Medicines Assoc. w/ Cond.      Constructed

Weights and Variance Estimation Variables

Variable  Label                                       Source [1]
PERWT12F  Expenditure File Person Weight, 2012        Constructed
VARSTR    Variance Estimation Stratum, 2012           Constructed
VARPSU    Variance Estimation PSU, 2012               Constructed

[1] See the Household Component section under Survey Questionnaires on the MEPS home page for information on the MEPS HC questionnaire sections shown in the Source column (e.g., CN, DD).

Appendix 2. Condition, Procedure and Clinical Classification Code Frequencies

Appendix 3. Clinical Classification Code to ICD-9-CM Code Crosswalk

Appendix 4. List of Invalid ICD-9-CM Codes

 
Diagnosis Code  Description
041.4           Escherichia coli [E. coli] infection in conditions classified elsewhere and of unspecified site
173.0           Other malignant neoplasm of skin of lip
173.1           Other malignant neoplasm of skin of eyelid, including canthus
173.2           Other malignant neoplasm of skin of ear and external auditory canal
173.3           Other malignant neoplasm of skin of other and unspecified parts of face
173.4           Other malignant neoplasm of scalp and skin of neck
173.5           Other malignant neoplasm of skin of trunk, except scrotum
173.6           Other malignant neoplasm of skin of upper limb, including shoulder
173.7           Other malignant neoplasm of skin of lower limb, including hip
173.8           Other malignant neoplasm of other specified sites of skin
173.9           Other malignant neoplasm of skin, site unspecified
284.1*          Pancytopenia
286.5           Hemorrhagic disorder due to intrinsic circulating anticoagulants
310.8           Other specified nonpsychotic mental disorders following organic brain damage
425.1*          Hypertrophic obstructive cardiomyopathy
444.0           Embolism and thrombosis of abdominal aorta
512.8*          Other spontaneous pneumothorax
516.3           Idiopathic fibrosing alveolitis
518.5*          Pulmonary insufficiency following trauma and surgery
596.8           Other specified disorders of bladder
631             Other abnormal product of conception
718.60*         Unspecified intrapelvic protrusion of acetabulum, site unspecified
747.3           Anomalies of pulmonary artery
793.1*          Nonspecific (abnormal) findings on radiological and other examination of lung field
795.5*          Nonspecific reaction to tuberculin skin test without active tuberculosis
997.4**         Digestive system complications
998.0*          Postoperative shock
999.4           Anaphylactic shock due to serum
999.5**         Other serum reaction
V12.2           Personal history of endocrine, metabolic, and immunity disorders
V13.8           Personal history of other specified diseases
V19.1           Family history of other eye disorders
V40.3*          Other behavioral problems

Procedure Code  Description
02.2*           Ventriculostomy

Notes:
* These codes were discussed at the March 9-10, 2011 ICD-9-CM Coordination and Maintenance Committee meeting and were not finalized in time to include in the FY 2012 IPPS/LTCH PPS proposed rule. They were deleted on October 1, 2011.
** The code title has changed from the proposed rule.

Appendix 5. List of Conditions Asked in Priority Conditions Enumeration Section

  • Angina/Angina Pectoris
  • Arthritis
  • Asthma
  • Attention Deficit Hyperactivity Disorder (ADHD)/Attention Deficit Disorder (ADD)
  • Cancer/Malignancy
  • Chronic Bronchitis
  • Coronary Heart Disease
  • Diabetes/Sugar Diabetes
  • Emphysema
  • Heart Attack/Myocardial Infarction (MI)
  • High Cholesterol
  • Hypertension/High Blood Pressure
  • Joint Pain
  • Other Heart Disease (not coronary heart disease, angina, or heart attack)
  • Stroke/Transient Ischemic Attack (TIA)/Mini-stroke