A. Data Use Agreement
B. Background
B.1 Household Component
B.2 Medical Provider Component
B.3 Survey Management and Data Collection
C. Technical and Programming Information
C.1 General Information
C.2 Data File Information
C.2.1 Variables
C.2.1.1 Variables from Annual Full Year Consolidated Files
C.2.1.2 Constructed Variables for Selection of Group
C.2.1.3 Estimation Variables

Individual identifiers have been removed from the micro data contained in these files. Nevertheless, under sections 308 (d) and 903 (c) of the Public Health Service Act (42 U.S.C. 242m and 42 U.S.C. 299 a-1), data collected by the Agency for Healthcare Research and Quality (AHRQ) and/or the National Center for Health Statistics (NCHS) may not be used for any purpose other than for the purpose for which they were supplied; any effort to determine the identity of any reported cases is prohibited by law.

Therefore, in accordance with the above referenced Federal Statute, it is understood that:

  1. No one is to use the data in this data set in any way except for statistical reporting and analysis; and

  2. If the identity of any person or establishment should be discovered inadvertently, then (a) no use will be made of this knowledge, (b) the Director, Office of Management, AHRQ will be advised of this incident, (c) the information that would identify any individual or establishment will be safeguarded or destroyed, as requested by AHRQ, and (d) no one else will be informed of the discovered identity; and

  3. No one will attempt to link this data set with individually identifiable records from any data sets other than the Medical Expenditure Panel Survey or the National Health Interview Survey.

By using these data you signify your agreement to comply with the above stated statutorily based requirements with the knowledge that deliberately making a false statement in any matter within the jurisdiction of any department or agency of the Federal Government violates Title 18 part 1 Chapter 47 Section 1001 and is punishable by a fine of up to $10,000 or up to five years in prison.

The Agency for Healthcare Research and Quality requests that users cite AHRQ and the Medical Expenditure Panel Survey as the data source in any publications or research based upon these data.

The Medical Expenditure Panel Survey (MEPS) provides nationally representative estimates of health care use, expenditures, sources of payment, and health insurance coverage for the U.S. civilian non-institutionalized population. The MEPS Household Component (HC) also provides estimates of respondents' health status, demographic and socioeconomic characteristics, employment, access to care, and satisfaction with health care. Estimates can be produced for individuals, families, and selected population subgroups. The panel design of the survey, which includes five rounds of interviews covering two full calendar years, provides data for examining person-level changes in selected variables such as expenditures, health insurance coverage, and health status. Information about each household member is collected using computer-assisted personal interviewing (CAPI) technology, and the survey builds on this information from interview to interview. All data for a sampled household are reported by a single household respondent.

The MEPS HC was initiated in 1996. Each year a new panel of sample households is selected. Because the data collected are comparable to those from earlier medical expenditure surveys conducted in 1977 and 1987, it is possible to analyze long-term trends. Each annual MEPS HC sample consists of about 15,000 households. Data can be analyzed at either the person or event level. Data must be weighted to produce national estimates.

The set of households selected for each panel of the MEPS HC is a subsample of households participating in the previous year's National Health Interview Survey (NHIS) conducted by the National Center for Health Statistics. The NHIS sampling frame provides a nationally representative sample of the U.S. civilian non-institutionalized population and reflects an oversample of blacks and Hispanics.

B.2 Medical Provider Component

After the household CAPI interview is completed and permission is obtained from the household survey respondents, a sample of medical providers is contacted by telephone to obtain information that household respondents cannot accurately provide. This part of the MEPS is called the Medical Provider Component (MPC), and information is collected on dates of visit, diagnosis and procedure codes, charges, and payments. The Pharmacy Component (PC), a subcomponent of the MPC, does not collect charges or diagnosis and procedure codes but does collect drug detail information, including National Drug Code (NDC) and medicine name, as well as date filled and sources and amounts of payment. The MPC is not designed to yield national estimates. It is primarily used as an imputation source to supplement or replace household-reported expenditure information.

B.3 Survey Management and Data Collection

MEPS HC and MPC data are collected under the authority of the Public Health Service Act. Data are collected under contract with Westat, Inc. Data sets and summary statistics are edited and published in accordance with the confidentiality provisions of the Public Health Service Act and the Privacy Act. The National Center for Health Statistics (NCHS) provides consultation and technical assistance.

As soon as data collection and editing are completed, the MEPS survey data are released to the public in staged releases of summary reports, micro data files, and tables via the MEPS web site. Selected data can be analyzed through MEPSnet, an online interactive tool that lets data users statistically analyze MEPS data in a menu-driven environment.

Additional information on MEPS is available from the MEPS project manager or the MEPS public use data manager at the Center for Financing, Access, and Cost Trends, Agency for Healthcare Research and Quality, 540 Gaither Road, Rockville, MD 20850 (301-427-1406).

For MEPS Panels 1–8, the original longitudinal weight files that were released contained a limited number of variables that could be merged with data from two consecutive full year consolidated files to create a longitudinal file for analysis. Beginning with Panel 9, AHRQ replaced the longitudinal weight files with more complete and analytically useful panel-specific files that contain the variables from the consolidated full year files. AHRQ has revisited Panels 1–8 to replace the original longitudinal weight files with expanded versions.

This documentation describes the Panel 1 longitudinal data file from the Medical Expenditure Panel Survey Household Component (MEPS HC). Released as an ASCII file (with related SAS, STATA, and SPSS programming statements and data use information) and a SAS transport dataset, this public use file provides information collected on a nationally representative sample of the civilian non-institutionalized population of the United States for the two-year period 1996–97. The file contains 2,472 variables and has a logical record length of 6,233, with an additional 2-byte carriage return/line feed at the end of each record.
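
The record layout described above can be checked before any parsing is attempted. The following is a minimal sketch, assuming the ASCII file has been read into memory as bytes; the function name `split_records` is hypothetical and is not part of the MEPS programming statements. It relies only on the stated logical record length (6,233) and the 2-byte carriage return/line feed.

```python
RECORD_LENGTH = 6233  # logical record length stated in this documentation


def split_records(raw: bytes, record_length: int = RECORD_LENGTH) -> list[bytes]:
    """Split raw file bytes into fixed-length records and verify CR/LF terminators.

    Hypothetical helper for illustration only.
    """
    stride = record_length + 2  # data bytes plus the 2-byte CR/LF
    if len(raw) % stride != 0:
        raise ValueError(f"file size {len(raw)} is not a multiple of {stride}")
    records = []
    for start in range(0, len(raw), stride):
        chunk = raw[start:start + stride]
        if chunk[-2:] != b"\r\n":
            raise ValueError(f"record at byte {start} lacks a CR/LF terminator")
        records.append(chunk[:-2])  # keep only the data bytes
    return records
```

A file that passes this check can then be sliced by column position using the positions given in the codebook or the supplied programming statements.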

This file consists of MEPS survey data obtained in Rounds 1–5 of MEPS Panel 1 and can be used to analyze changes over a two-year period. Variables in the file pertaining to survey administration, demographics, employment, health status, disability days, quality of care, patient satisfaction, health insurance, and medical care use and expenditures were obtained from the MEPS 1996 and 1997 Full Year Consolidated Files (HC-12 and HC-20, respectively).

The following documentation offers a brief overview of the contents and structure of the files and programming information. A codebook of all the variables included in the Panel 1 data file is provided in a separate file (H23CB.PDF). A database of all MEPS products released to date and a variable locator indicating the major MEPS data items on public use files that have been released to date can be found on the MEPS web site.

C.2 Data File Information

This public use file contains records for 19,859 persons in Panel 1 who were respondents for the period they were inscope for the survey (i.e., members of the civilian non-institutionalized population) during the two-year period. Data are available for all five rounds for 92.2% of the cases (18,300). The remaining 7.8% (1,559 persons) do not have data for one or more rounds but participated in the survey for their full period of eligibility. These persons include those who were born, died, were in the military or an institution, or left the country during the two-year period. In contrast, persons in the panel who did not participate in the survey for the entire period they were inscope are not included in this file. The analytic weight variable (LONGWT) has been adjusted for nonresponse and attrition and should be used to produce national estimates for the two-year period. The codebook provides both weighted and unweighted frequencies for each variable in the data file.

Each MEPS panel can be linked back to the previous year's National Health Interview Survey public use data files. For information on obtaining MEPS/NHIS link files please see

C.2.1 Variables

C.2.1.1 Variables from Annual Full Year Consolidated Files

Most variables on this file were obtained from the MEPS 1996 and 1997 Full Year Consolidated Files (HC-12 and HC-20, respectively). However, names for time-dependent variables from these files were modified in order to: 1) eliminate duplicate variable names for data reflecting different time periods during the panel, and 2) standardize variable names to facilitate pooling of multiple MEPS panels for analysis (see footnote 1). Generally, annual variables with a suffix of “96” and “97” are renamed with a suffix of “Y1” and “Y2”, respectively. Variables with a suffix of “31”, “42”, and “53” are renamed with a suffix denoting the round in which the data were collected (i.e., “1”, “2”, or “3” for variables originating from Rounds 1–3 on the 1996 full year file and “3”, “4”, or “5” for variables originating from Rounds 3–5 on the 1997 full year file; see footnote 2). Table 1 below provides a crosswalk summarizing the scheme used for renaming variables from the annual files. This crosswalk must be used in conjunction with documentation for the 1996 and 1997 full year consolidated files to obtain a full description of variables on this file.

Table 1: Crosswalk of Variable Names between the Full Year Consolidated Files and the Longitudinal File

| Type of Variable | Full Year Consolidated File Variable Name Suffix | Longitudinal File Variable Name Suffix | Examples |
|---|---|---|---|
| Constant (i.e., not round or year specific) | No suffixes | No suffixes | DOBMM=DOBMM |
| Annual, family related variables | YR | Y1 or YR1 | FAMIDYR=FAMIDYR1 (1996 file) |
| | YR | Y2 or YR2 | FAMIDYR=FAMIDYR2 (1997 file) |
| Annual CPS family identifiers | No suffix | Y1 | CPSFAMID=CPSFAMY1 (1996) |
| Annual, inscope variables | No suffix | YR1 | All variables: |
| | No suffix | YR2 | All variables: |
| 12/31 status variables | 1231 in 1996 file | Y1 | All variables: FAMS1231=FAMSY1; FCSZ1231=FCSZY1; INSC1231=INSCY1 (1996 file) |
| | 1231 in 1997 file | Y2 | All variables: FAMS1231=FAMSY2; FCSZ1231=FCSZY2; INSC1231=INSCY2 (1997 file) |
| Annual | 96, 96X, 96F, or 96C | Y1, Y1X, Y1F, or Y1C | Example: TOTEXP96=TOTEXPY1 (1996 file) |
| | 97, 97X, 97F, or 97C | Y2, Y2X, Y2F, or Y2C | Example: TOTEXP97=TOTEXPY2 (1997 file) |
| Annual | No suffixes (see footnote 3) | Y1 | All variables: KEYNESS=KEYNESY1; EVRWRK=EVRWRKY1 (1996 file) |
| | No suffixes (see footnote 3) | Y2 | All variables: KEYNESS=KEYNESY2; EVRWRK=EVRWRKY2 (1997 file) |
| Monthly | 2 character month + 96 | 2 character month + Y1 | Example: PRIJA96=PRIJAY1 (1996 file) |
| | 2 character month + 97 | 2 character month + Y2 | Example: PRIJA97=PRIJAY2 (1997 file) |
| Round Specific | 31 or 31X in 1996 file | 1 or 1X for 1996 | RTHLTH31=RTHLTH1 (1996 file) |
| | 42 or 42X in 1996 file | 2 or 2X for 1996 | RTHLTH42=RTHLTH2 (1996 file) |
| | 53 or 53X in 1996 file | 3 or 3X for 1996 | RTHLTH53=RTHLTH3 (1996 file if YEARIND=2) |
| | 31 or 31X in 1997 file | 3 or 3X for 1997 | RTHLTH31=RTHLTH3 (1997 file if YEARIND=1 or 3) |
| | 42 or 42X in 1997 file | 4 or 4X for 1997 | RTHLTH42=RTHLTH4 (1997 file) |
| | 53 or 53X in 1997 file | 5 or 5X for 1997 | RTHLTH53=RTHLTH5 (1997 file) |
| Job Change | 3142 | 12 for 1996 | All cases: CHGJ3142=CHGJ12; YCHJ3142=YCHJ12 (1996 file) |
| | 4253 | 23 for 1996 | All cases: CHGJ4253=CHGJ23; YCHJ4253=YCHJ23 (1996 file) |
| | 3142 | 34 for 1997 | All cases: CHGJ3142=CHGJ34; YCHJ3142=YCHJ34 (1997 file) |
| | 4253 | 45 for 1997 | All cases: CHGJ4253=CHGJ45; YCHJ4253=YCHJ45 (1997 file) |
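
The suffix rules in Table 1 can be sketched in code for analysts who need to map variable names from the annual files to this file. The example below is an illustration only: the function `rename_suffix` and the two lookup tables are hypothetical names, and only the suffix-based rows of the crosswalk (12/31 status, annual, monthly, round specific, and job change) are covered. Variables that carry no suffix on the annual files (for example KEYNESS, EVRWRK, or CPSFAMID) and the family identifiers need an explicit list and are returned unchanged here. Because the function works on suffixes alone, a constant variable whose name happens to end in one of these suffixes would be renamed incorrectly, so it should be applied only to time-dependent variables.

```python
import re

# Round numbers assigned to round-specific suffixes, by full year file.
ROUND_MAP = {
    1996: {"31": "1", "42": "2", "53": "3"},
    1997: {"31": "3", "42": "4", "53": "5"},
}
# Round pairs assigned to job change suffixes, by full year file.
JOB_CHANGE_MAP = {
    1996: {"3142": "12", "4253": "23"},
    1997: {"3142": "34", "4253": "45"},
}


def rename_suffix(name: str, file_year: int) -> str:
    """Apply the suffix-based crosswalk rules to one annual-file variable name."""
    if file_year not in ROUND_MAP:
        raise ValueError("file_year must be 1996 or 1997 for Panel 1")
    year_index = "1" if file_year == 1996 else "2"
    yy = str(file_year)[2:]  # "96" or "97"

    # Job change variables: 3142 / 4253 become round pairs.
    for old, new in JOB_CHANGE_MAP[file_year].items():
        if name.endswith(old):
            return name[: -len(old)] + new
    # 12/31 status variables: 1231 becomes Y1 or Y2.
    if name.endswith("1231"):
        return name[:-4] + "Y" + year_index
    # Annual and monthly variables: 96, 96X, 96F, 96C (or the 97 forms).
    match = re.fullmatch(rf"(.+?){yy}([XFC]?)", name)
    if match:
        return f"{match.group(1)}Y{year_index}{match.group(2)}"
    # Round-specific variables: 31, 42, 53, with an optional X.
    match = re.fullmatch(r"(.+?)(31|42|53)(X?)", name)
    if match:
        return match.group(1) + ROUND_MAP[file_year][match.group(2)] + match.group(3)
    return name  # no suffix rule applies
```

For example, `rename_suffix("RTHLTH53", 1997)` follows the round-specific rule for the 1997 file, and `rename_suffix("DOBMM", 1996)` leaves the constant variable name unchanged.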

C.2.1.2 Constructed Variables for Selection of Group

The following eight variables were constructed and included on the file to facilitate the selection of appropriate cases for various analyses. Table 2 below contains descriptive statistics for these variables.

| Variable Name | Description |
|---|---|
| YEARIND | 1 = both years, 2 = in 1996 only, and 3 = in 1997 only |
| ALL5RDS | Inscope and data collected in all 5 rounds (0 = no, 1 = yes) |
| DIED | Died during the two-year survey period (0 = no, 1 = yes) |
| INST | Institutionalized for some time during the two-year survey period (0 = no, 1 = yes) |
| MILITARY | Active duty military for some time during the two-year survey period (0 = no, 1 = yes) |
| ENTRSRVY | Entered survey after beginning of panel (mainly births; also includes persons who had no initial chance of selection who moved into a MEPS sample household) (0 = no, 1 = yes) |
| LEFTUS | Moved out of the country after beginning of panel (0 = no, 1 = yes) |
| OTHER | Not identified in any of the above analytic groups (0 = no, 1 = yes) |

Table 2: Frequencies and Percentages for Constructed Variables (N = 19,859)

| Variable | Number of Records | Percentage of Records |
|---|---|---|
| YEARIND = 1 (i.e., person in both years) | 19,289 | 97.9 |
| ALL5RDS = 1 (yes) | 18,300 | 92.2 |
| DIED = 1 (yes) | 243 | 1.2 |
| INST = 1 (yes) | 114 | 0.6 |
| MILITARY = 1 (yes) | 40 | 0.2 |
| ENTRSRVY = 1 (yes) | 1,019 | 5.1 |
| LEFTUS = 1 (yes) | 82 | 0.4 |
| OTHER = 1 (yes) | 100 | 0.5 |

Following are examples of situations where these variables would be useful in selecting records for analysis:

  • Analysts interested in working only with persons who were inscope and had data for all five rounds of the panel should subset to cases where ALL5RDS = 1.
  • If a researcher wanted to include persons who were inscope and had data for all five rounds of the panel as well as those in the survey at the beginning of the panel who subsequently died, then they would include cases where ALL5RDS = 1 or (ENTRSRVY = 0 and DIED = 1).
  • If a researcher wanted to include persons who were inscope and had data for all five rounds of the panel as well as those who died in the second year of the panel, then they would include cases where ALL5RDS = 1 or (DIED = 1 and YEARIND = 1).
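
The three selection rules above can be written as simple predicates. The following is a minimal sketch, assuming each person record is available as a dictionary keyed by the constructed variable names; the function names and the five sample records are made up for illustration and do not come from the data file.

```python
def all_five_rounds(person: dict) -> bool:
    """Inscope with data for all five rounds."""
    return person["ALL5RDS"] == 1


def all_rounds_or_died_after_start(person: dict) -> bool:
    """All five rounds, or in the survey from the start and subsequently died."""
    return person["ALL5RDS"] == 1 or (person["ENTRSRVY"] == 0 and person["DIED"] == 1)


def all_rounds_or_died_in_year_two(person: dict) -> bool:
    """All five rounds, or died in the second year (present in both years)."""
    return person["ALL5RDS"] == 1 or (person["DIED"] == 1 and person["YEARIND"] == 1)


# Illustrative records only; the values are invented, not taken from the file.
persons = [
    {"ID": "A", "ALL5RDS": 1, "DIED": 0, "ENTRSRVY": 0, "YEARIND": 1},
    {"ID": "B", "ALL5RDS": 0, "DIED": 1, "ENTRSRVY": 0, "YEARIND": 2},
    {"ID": "C", "ALL5RDS": 0, "DIED": 1, "ENTRSRVY": 0, "YEARIND": 1},
    {"ID": "D", "ALL5RDS": 0, "DIED": 1, "ENTRSRVY": 1, "YEARIND": 1},
    {"ID": "E", "ALL5RDS": 0, "DIED": 0, "ENTRSRVY": 1, "YEARIND": 3},
]


def select_ids(rule) -> list[str]:
    """Return the IDs of the sample persons that satisfy a selection rule."""
    return [p["ID"] for p in persons if rule(p)]
```

Person B (died in 1996) is picked up by the second rule but not the third, while person D (entered after the panel began and died in 1997) is picked up by the third rule but not the second.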

C.2.1.3 Estimation Variables

Longitudinal Estimations for Panel 1
The file contains a weight variable (LONGWT) and variance estimation variables (VARSTR, VARPSU) that should be applied when producing national estimates for longitudinal analyses. For example, LONGWT applied to the 18,300 cases where ALL5RDS = 1 produces a weighted population estimate of 253.0 million. This represents an estimate of the number of persons in the civilian non-institutionalized population for the entire two-year period from 1996–97. To obtain estimates of variability (such as the standard error of sample estimates or corresponding confidence intervals) for estimates based on MEPS survey data, one needs to take into account the complex sample design of MEPS by specifying the estimation variables, including stratum of sample selection (VARSTR), primary sampling unit (VARPSU), and longitudinal weight (LONGWT).
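
To illustrate how the three estimation variables work together, the sketch below computes a weighted total and its standard error. This documentation does not prescribe a particular estimator; the example assumes the common with-replacement approach in which weighted PSU totals are compared within each stratum, and the function name `weighted_total_with_se` is hypothetical. In practice, survey procedures in standard statistical packages perform this calculation once the stratum, PSU, and weight variables are specified.

```python
from collections import defaultdict
from math import sqrt


def weighted_total_with_se(rows, value_key):
    """Estimate a weighted total and its standard error.

    rows: iterable of dicts holding LONGWT, VARSTR, VARPSU, and value_key.
    Assumes PSUs are treated as sampled with replacement within strata.
    """
    # Accumulate the weighted total for each PSU within each stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        psu_totals[row["VARSTR"]][row["VARPSU"]] += row["LONGWT"] * row[value_key]

    total = 0.0
    variance = 0.0
    for stratum, psus in psu_totals.items():
        n_psus = len(psus)
        if n_psus < 2:
            raise ValueError(f"stratum {stratum} has fewer than two PSUs")
        stratum_total = sum(psus.values())
        total += stratum_total
        mean_psu_total = stratum_total / n_psus
        # Between-PSU variability within the stratum.
        variance += n_psus / (n_psus - 1) * sum(
            (t - mean_psu_total) ** 2 for t in psus.values()
        )
    return total, sqrt(variance)
```

Setting the analysis value to 1 for every record gives a population count of the kind quoted above. For a subgroup, a common practice is to keep all records in the calculation and set the value to 0 outside the subgroup, so that the full design structure is retained.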

Pooled Estimations
When analyzing subpopulations and/or low-prevalence events, it may be desirable to pool together more than one panel of MEPS HC data to yield sample sizes large enough to generate reliable estimates. If only data from Panels 7 and beyond are being pooled, then simply use the strata and PSU variables (VARSTR, VARPSU) provided on the longitudinal files for pooled estimation. However, because Panels 1–6 MEPS longitudinal weight files were released with panel-specific variance structures, it is necessary to obtain the set of appropriate variance estimation variables from the HC-036 Pooled Estimation File when pooling involves these panels.

1 A variable named PANEL is also included to facilitate pooling across panels.  This variable is simply the panel number and is therefore constant across all records within a longitudinal file.

2 While round 3 values were obtained for most observations from the 1997 Full Year Consolidated File, they were obtained from the 1996 Full Year Consolidated File for sample persons where YEARIND = 2 (i.e., in 1996 only).

3 To maintain the 8-character naming convention, some variable names had the last character or two dropped in the renaming process.

