MEPS HC-225
Panel 24 Longitudinal Data File

September 2022

Due to the COVID-19 pandemic, 2020 data collection moved primarily to phone rather than in-person. This posed a challenge in Panel 25 Round 1, which is difficult to start via phone, resulting in a low response rate. To balance this and increase the number of completes to be comparable to previous years, Panels 23 and 24 were extended to nine rounds of data collection. Phone data collection and the challenges of the pandemic present concerns about data quality. Please take this into consideration when comparing to or pooling with previous years.

Agency for Healthcare Research and Quality
Center for Financing, Access, and Cost Trends
5600 Fishers Lane
Rockville, MD 20857
(301) 427-1406


Table of Contents

A. Data Use Agreement
B. Background
1.0 Household Component
2.0 Medical Provider Component
3.0 Survey Management and Data Collection
C. Technical and Programming Information
1.0 General Information
2.0 Data File Information
2.1 Variables
2.1.1 Variables from Annual Full-Year Consolidated Files
2.1.2 Constructed Variables for Selection of Group
2.1.3 Estimation Variables

A. Data Use Agreement

Individual identifiers have been removed from the micro-data contained in these files. Nevertheless, under sections 308 (d) and 903 (c) of the Public Health Service Act (42 U.S.C. 242m and 42 U.S.C. 299 a-1), data collected by the Agency for Healthcare Research and Quality (AHRQ) and/or the National Center for Health Statistics (NCHS) may not be used for any purpose other than for the purpose for which they were supplied; any effort to determine the identity of any reported cases is prohibited by law.

Therefore, in accordance with the above-referenced Federal statute, it is understood that:

  1. No one is to use the data in this data set in any way except for statistical reporting and analysis; and

  2. If the identity of any person or establishment should be discovered inadvertently, then (a) no use will be made of this knowledge, (b) the Director, Office of Management, AHRQ will be advised of this incident, (c) the information that would identify any individual or establishment will be safeguarded or destroyed, as requested by AHRQ, and (d) no one else will be informed of the discovered identity; and

  3. No one will attempt to link this data set with individually identifiable records from any data sets other than the Medical Expenditure Panel Survey or the National Health Interview Survey. Furthermore, linkage of the Medical Expenditure Panel Survey and the National Health Interview Survey may not occur outside the AHRQ Data Center, NCHS Research Data Center (RDC) or the U.S. Census RDC network.

By using these data you signify your agreement to comply with the above stated statutorily based requirements with the knowledge that deliberately making a false statement in any matter within the jurisdiction of any department or agency of the Federal Government violates Title 18 part 1 Chapter 47 Section 1001 and is punishable by a fine of up to $10,000 or up to 5 years in prison.

The Agency for Healthcare Research and Quality requests that users cite AHRQ and the Medical Expenditure Panel Survey as the data source in any publications or research based upon these data.



B. Background

1.0 Household Component

The Medical Expenditure Panel Survey (MEPS) provides nationally representative estimates of health care use, expenditures, sources of payment, and health insurance coverage for the U.S. civilian noninstitutionalized population. The MEPS Household Component (HC) also provides estimates of respondents' health status, demographic and socio-economic characteristics, employment, access to care, and satisfaction with health care. Estimates can be produced for individuals, families, and selected population subgroups. The panel design of the survey typically includes five rounds of interviews covering two full calendar years. In 2020, in order to increase the number of completed interviews, the panel design was extended to include seven rounds of interviews covering three full calendar years. The panel design of MEPS provides data for examining person-level changes in selected variables such as expenditures, health insurance coverage, and health status. Using computer-assisted personal interviewing (CAPI) technology, information about each household member is collected, and the survey builds on this information from interview to interview. All data for a sampled household are reported by a single household respondent.

The MEPS-HC was initiated in 1996. Each year a new panel of sample households is selected. Because the data collected are comparable to those from earlier medical expenditure surveys conducted in 1977 and 1987, it is possible to analyze long-term trends. Each annual MEPS HC sample size is about 15,000 households. Data can be analyzed at either the person or event level. Data must be weighted to produce national estimates.

The set of households selected for each panel of the MEPS HC is a subsample of households participating in the previous year's National Health Interview Survey (NHIS) conducted by the National Center for Health Statistics. The NHIS sampling frame provides a nationally representative sample of the U.S. civilian noninstitutionalized population. In 2006, the NHIS implemented a new sample design, which included Asian persons in addition to households with Black and Hispanic persons in the oversampling of minority populations. NHIS introduced a new sample design in 2016 that discontinued oversampling of these minority groups.



2.0 Medical Provider Component

Upon completion of the household CAPI interview and after obtaining permission from the household survey respondents, a sample of medical providers is contacted by telephone to obtain information that household respondents cannot accurately provide. This part of the MEPS is called the Medical Provider Component (MPC), and it collects information on dates of visits, diagnosis and procedure codes, charges, and payments. The Pharmacy Component (PC), a subcomponent of the MPC, does not collect charges or diagnosis and procedure codes but does collect drug detail information, including National Drug Code (NDC) and medicine name, as well as amounts of payment. The MPC is not designed to yield national estimates. It is primarily used as an imputation source to supplement or replace household-reported expenditure information.



3.0 Survey Management and Data Collection

MEPS HC and MPC data are collected under the authority of the Public Health Service Act. Data are collected under contract with Westat, Inc. (MEPS HC) and Research Triangle Institute (MEPS MPC). Data sets and summary statistics are edited and published in accordance with the confidentiality provisions of the Public Health Service Act and the Privacy Act. The National Center for Health Statistics (NCHS) provides consultation and technical assistance.

As soon as data collection and editing are completed, the MEPS survey data are released to the public in staged releases of summary reports, micro data files, and tables via the MEPS website.

Additional information on MEPS is available from the MEPS project manager or the MEPS public use data manager at the Center for Financing, Access, and Cost Trends, Agency for Healthcare Research and Quality, 5600 Fishers Lane, Rockville, MD 20857 (301-427-1406).



C. Technical and Programming Information

1.0 General Information

This documentation describes the Panel 24 longitudinal data file from the Medical Expenditure Panel Survey Household Component (MEPS-HC). Released as an ASCII file (with related SAS, STATA, SPSS, and R programming statements and data user information), a SAS data set, a SAS transport dataset, a STATA dataset, and an Excel file, this public use file provides information collected on a nationally representative sample of the civilian noninstitutionalized population of the United States for the two-year period 2019-2020. The file contains 2,761 variables and has a logical record length of 7,842 with an additional 2-byte carriage return/line feed at the end of each record.

This file consists of MEPS survey data obtained in Rounds 1-5 of MEPS Panel 24 and can be used to analyze changes over a two-year period. Variables in the file pertaining to survey administration, demographics, employment, health status, disability days, quality of care, patient satisfaction, health insurance, and medical care use and expenditures were obtained from the MEPS 2019 and 2020 Full-Year Consolidated Files (HC-216 and HC-224, respectively).

The following documentation offers a brief overview of the contents and structure of the files and programming information. A codebook of all the variables included in the Panel 24 data file is provided in a separate file (h225cb.pdf). A database of all MEPS products released to date and a variable locator indicating the major MEPS data items on public use files that have been released to date can be found on the MEPS website.



2.0 Data File Information

This public use file contains records for 9,797 persons in Panel 24 who were respondents for the period they were in-scope for the survey (i.e., a member of the civilian noninstitutionalized population) during the two-year period. Only persons with positive person-level weights (PERWT19F or PERWT20F) are included in the longitudinal PUF data. Data are available for all five rounds for 93.1% of the cases (9,120). The remaining 6.9% (677 persons) do not have data for one or more rounds but were in-scope for all rounds in which they participated in the survey. These persons are those who were born, died, were in the military or an institution, or left the country during the two-year period. In contrast, persons in the panel who participated in the survey for only part of the period they were in-scope are not included in this file. To compensate for this attrition, adjustments were made in the construction of the panel weight variable included in this file (LONGWT). The codebook provides both weighted and unweighted frequencies for each variable on the data file. The LONGWT variable should be used to produce national estimates for the two-year period.



2.1 Variables

2.1.1 Variables from Annual Full-Year Consolidated Files

Most variables on this file were obtained from the MEPS 2019 and 2020 Full-Year Consolidated Files (HC-216 and HC-224, respectively). However, names for time-dependent variables from these files were modified in order to: 1) eliminate duplicate variable names for data reflecting different time periods during the panel, and 2) standardize variable names to facilitate pooling of multiple MEPS panels for analysis.[1] Generally, annual variables with a suffix of "19" and "20" are renamed with a suffix of "Y1" and "Y2", respectively. Variables with a suffix of "31", "42", and "53" are renamed with a suffix denoting the round in which the data were collected (i.e., "1", "2", or "3" for variables originating from Rounds 1-3 on the 2019 full-year file and "3", "4", or "5" for variables originating from Rounds 3-5 on the 2020 full-year file).[2] Table 1 below provides the crosswalk summarizing the scheme used for renaming variables from the annual files. It is necessary to use this crosswalk in conjunction with documentation for the 2019 and 2020 full-year consolidated files to obtain a full description of variables on this file.


[1] A variable named PANEL is also included to facilitate pooling across panels. This variable is simply the panel number and is therefore constant across all records within a longitudinal file. The ten-character variable DUPERSID uniquely identifies each person represented on the file and is the combination of the variables DUID (PANEL + Dwelling Unit ID) and PID (Person Number).

[2] While Round 3 values were obtained for most observations from the 2020 Full Year Consolidated File, they were obtained from the 2019 Full Year Consolidated File for sample persons where YEARIND=2 (i.e., in 2019 only).

Table 1. Crosswalk of Variable Names between the Full-Year Consolidated Files and the Longitudinal File

Each entry below gives, in order: the type of variable, the Full-Year Consolidated File variable name suffix, the Longitudinal File variable name suffix, and specific cases or examples.
Constant (i.e., not round or year specific)
  Full-Year Consolidated File suffix: No suffixes
  Longitudinal File suffix: No suffixes
  All variables:
    BORNUSA=BORNUSA
    DOBMM=DOBMM
    DOBYY=DOBYY
    DUID=DUID
    PID=PID
    DUPERSID=DUPERSID
    EDUCYR=EDUCYR
    HIDEG=HIDEG
    HISPANX=HISPANX
    HISPNCAT=HISPNCAT
    HWELLSPK=HWELLSPK
    INTVLANG=INTVLANG
    OTHLGSPK=OTHLGSPK
    PANEL=PANEL
    RACEAX=RACEAX
    RACEBX=RACEBX
    RACEWX=RACEWX
    RACEV1X=RACEV1X
    RACEV2X=RACEV2X
    RACETHX=RACETHX
    SEX=SEX
    VARPSU=VARPSU
    VARSTR=VARSTR
    WHTLGSPK=WHTLGSPK
    YRSINUS=YRSINUS
Annual, family related variables
  Full-Year Consolidated File suffix: YR
  Longitudinal File suffix: Y1 or YR1 (2019 file); Y2 or YR2 (2020 file)
  All variables:
    FAMIDYR=FAMIDYR1 (2019 file)
    FAMRFPYR=FAMRFPY1 (2019 file)
    FAMSZEYR=FAMSZYR1 (2019 file)
    FAMIDYR=FAMIDYR2 (2020 file)
    FAMRFPYR=FAMRFPY2 (2020 file)
    FAMSZEYR=FAMSZYR2 (2020 file)
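Annual, CPS family identifiers
  Full-Year Consolidated File suffix: No suffix
  Longitudinal File suffix: Y1 (2019 file); Y2 (2020 file)
  All variables:
    CPSFAMID=CPSFAMY1 (2019 file)
    CPSFAMID=CPSFAMY2 (2020 file)
Annual, health insurance eligibility units
  Full-Year Consolidated File suffix: No suffix
  Longitudinal File suffix: Y1 (2019 file); Y2 (2020 file)
  All variables:
    HIEUIDX=HIEUIDY1 (2019 file)
    HIEUIDX=HIEUIDY2 (2020 file)
Annual, inscope variables
  Full-Year Consolidated File suffix: No suffixes
  Longitudinal File suffix: YR1 (2019 file); YR2 (2020 file)
  All variables:
    INSCOPE=INSCPYR1 (2019 file)
    INSCOPE=INSCPYR2 (2020 file)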
12/31 status variables
  Full-Year Consolidated File suffix: 1231 in 2019 file; 1231 in 2020 file
  Longitudinal File suffix: Y1 (2019 file); Y2 (2020 file)
  All variables:
    FAMS1231=FAMSY1 (2019 file)
    FCRP1231=FCRPY1 (2019 file)
    FCSZ1231=FCSZY1 (2019 file)
    FMRS1231=FMRSY1 (2019 file)
    INSC1231=INSCY1 (2019 file)
    FAMS1231=FAMSY2 (2020 file)
    FCRP1231=FCRPY2 (2020 file)
    FCSZ1231=FCSZY2 (2020 file)
    FMRS1231=FMRSY2 (2020 file)
    INSC1231=INSCY2 (2020 file)
Annual
  Full-Year Consolidated File suffix: 19, 19X, 19F, or 19C; 20, 20X, 20F, or 20C
  Longitudinal File suffix: Y1, Y1X, Y1F, or Y1C; Y2, Y2X, Y2F, or Y2C
  Examples:
    TOTEXP19=TOTEXPY1
    AGE19X=AGEY1X
    TOTEXP20=TOTEXPY2
    AGE20X=AGEY2X
Variables for health insurance prior to January 1, 2019 (data collected in Round 1 only)
  Full-Year Consolidated File suffix: No suffixes
  Longitudinal File suffix: No suffixes
  All variables:
    PREVCOVR=PREVCOVR
    MORECOVR=MORECOVR
Annual
  Full-Year Consolidated File suffix: No suffixes [3]
  Longitudinal File suffix: Y1 (2019 file); Y2 (2020 file)
  Examples:
    KEYNESS=KEYNESY1 (2019 file)
    SAQELIG=SAQELIY1 (2019 file)
    EVRWRK=EVRWRKY1 (2019 file)
    EVRETIRE=EVRETIY1 (2019 file)
    AGELAST=AGELSTY1 (2019 file)
    DIABDX_M18=DIABDXY1_M18 (2019 file)
    KEYNESS=KEYNESY2 (2020 file)
    SAQELIG=SAQELIY2 (2020 file)
    EVRWRK=EVRWRKY2 (2020 file)
    EVRETIRE=EVRETIY2 (2020 file)
    AGELAST=AGELSTY2 (2020 file)
    DIABDX_M18=DIABDXY2_M18 (2020 file)
Monthly
  Full-Year Consolidated File suffix: 2-character month + 19; 2-character month + 20
  Longitudinal File suffix: 2-character month + Y1; 2-character month + Y2
  Example:
    PRIJA19=PRIJAY1 (2019 file)
    PRIJA20=PRIJAY2 (2020 file)
Round Specific
  Full-Year Consolidated File suffix = Longitudinal File suffix:
    31, 31X, or 31H in 2019 file = 1, 1X, or 1H
    42, 42X, or 42H in 2019 file = 2, 2X, or 2H
    53, 53X, or 53H in 2019 file = 3, 3X, or 3H
    31_M18 in 2019 file = 1_M18
    42_M18 in 2019 file = 2_M18
    31, 31X, or 31H in 2020 file = 3, 3X, or 3H
    42, 42X, or 42H in 2020 file = 4, 4X, or 4H
    53, 53X, or 53H in 2020 file = 5, 5X, or 5H
    31_M18 in 2020 file = 3_M18
    42_M18 in 2020 file = 4_M18
  Example:
    RTHLTH31=RTHLTH1 (2019 file)
    RTHLTH42=RTHLTH2 (2019 file)
    RTHLTH53=RTHLTH3 (2019 file if YEARIND=2)
    JTPAIN31_M18=JTPAIN1_M18 (2019 file)
    PROVTY42_M18=PROVTY2_M18 (2019 file)
    RTHLTH31=RTHLTH3 (2020 file if YEARIND=1 or 3)
    RTHLTH42=RTHLTH4 (2020 file)
    RTHLTH53=RTHLTH5 (2020 file)
    JTPAIN31_M18=JTPAIN3_M18 (2020 file)
    PROVTY42_M18=PROVTY4_M18 (2020 file)
Diabetes preventive care
  Full-Year Consolidated File suffix: 1853, 1953, and 2053 in 2019 file; 1953, 2053, and 2153 in 2020 file
  Longitudinal File suffix:
    2019 file: Y0R3 for 2018, Y1R3 for 2019, Y2R3 for 2020
    2020 file: Y1R5 for 2019, Y2R5 for 2020, Y3R5 for 2021
  Example:
    DSEB1853=DSEBY0R3 (2019 file)
    DSEY1853=DSEYY0R3 (2019 file)
    DSEY1953=DSEYY1R3 (2019 file)
    DSEY2053=DSEYY2R3 (2019 file)
    DSEB1953=DSEBY1R5 (2020 file)
    DSEY1953=DSEYY1R5 (2020 file)
    DSEY2053=DSEYY2R5 (2020 file)
    DSEY2153=DSEYY3R5 (2020 file)
Job Change
  Full-Year Consolidated File suffix: 3142 or 4253
  Longitudinal File suffix: 12 or 23 for 2019; 34 or 45 for 2020
  All cases:
    CHGJ3142=CHGJ12 (2019 file)
    CHGJ4253=CHGJ23 (2019 file)
    YCHJ3142=YCHJ12 (2019 file)
    YCHJ4253=YCHJ23 (2019 file)
    CHGJ3142=CHGJ34 (2020 file)
    CHGJ4253=CHGJ45 (2020 file)
    YCHJ3142=YCHJ34 (2020 file)
    YCHJ4253=YCHJ45 (2020 file)
Cancer / Cancer in remission [4]
  Full-Year Consolidated File suffix: No suffixes [5]
  Longitudinal File suffix: Y1 for 2019; Y2 for 2020
  Example:
    CALUNG=CALUNGY1 (2019 file)
    CALUNG=CALUNGY2 (2020 file)
Age of Diagnosis
  Full-Year Consolidated File suffix: No suffixes [5]
  Longitudinal File suffix: Y1 for 2019; Y2 for 2020
  Example:
    CHDAGED=CHDAGY1 (2019 file)
    CHOLAGED=CHOLAGY1 (2019 file)
    CHDAGED=CHDAGY2 (2020 file)
    CHOLAGED=CHOLAGY2 (2020 file)


[3] To maintain a previously-implemented 8-character naming convention, some variable names had the last character or two dropped in the renaming process. A few variables have names longer than 8 characters because they were modified in 2018 and tagged with an '_M18' suffix. These variables were altered in the same fashion they would have been without the _M18 suffix, and the _M18 suffix was retained.

[4] Starting in 2010, variables were added to indicate whether each reported cancer was in remission.

[5] To maintain a previously implemented 8-character naming convention, some variable names had the last character or two dropped in the renaming process.
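
The regular parts of the Table 1 renaming scheme can be sketched in code. The following Python function is an illustration only, not part of the MEPS release: the function name and the two lookup tables are assumptions made for this example. It covers the 12/31 status, job change, round-specific, annual, and monthly suffixes, and keeps an _M18 tag in place. Variables that have no suffix on the full-year file (some are renamed with a truncated stem, others are unchanged) return None, and diabetes preventive care names raise an error, because both need an explicit lookup in Table 1.

```python
import re

# Round suffixes on each full-year file and the panel round they denote (Table 1).
ROUND_MAP = {
    2019: {"31": "1", "42": "2", "53": "3"},
    2020: {"31": "3", "42": "4", "53": "5"},
}
# Two-digit year suffix on each full-year file and its longitudinal replacement.
YEAR_MAP = {2019: ("19", "Y1"), 2020: ("20", "Y2")}


def to_longitudinal_name(name, file_year):
    """Return the longitudinal-file name for a full-year-file variable,
    or None when the name carries no suffix this sketch recognizes."""
    rounds = ROUND_MAP[file_year]
    yy, y_new = YEAR_MAP[file_year]

    # Set the _M18 tag aside, rename the rest, then restore the tag.
    tag = "_M18" if name.endswith("_M18") else ""
    base = name[: -len(tag)] if tag else name

    # 12/31 status variables: FAMS1231 -> FAMSY1 or FAMSY2.
    if base.endswith("1231"):
        return base[:-4] + y_new + tag

    # Job change variables: CHGJ3142 -> CHGJ12 (2019) or CHGJ34 (2020).
    m = re.fullmatch(r"(.+)(3142|4253)", base)
    if m:
        pair = m.group(2)
        return m.group(1) + rounds[pair[:2]] + rounds[pair[2:]] + tag

    # Diabetes preventive care names (two-digit year + 53) are not handled here.
    if re.fullmatch(r".+\d{2}53", base):
        raise ValueError(f"{name}: look this name up in Table 1")

    # Round-specific variables: RTHLTH31 -> RTHLTH1 (2019) or RTHLTH3 (2020).
    m = re.fullmatch(r"(.+)(31|42|53)([XH]?)", base)
    if m:
        return m.group(1) + rounds[m.group(2)] + m.group(3) + tag

    # Annual and monthly variables: TOTEXP19 -> TOTEXPY1, PRIJA20 -> PRIJAY2.
    m = re.fullmatch(rf"(.+){yy}([XFC]?)", base)
    if m:
        return m.group(1) + y_new + m.group(2) + tag

    return None


print(to_longitudinal_name("AGE20X", 2020))
print(to_longitudinal_name("JTPAIN31_M18", 2019))
```

A None result does not mean the variable keeps its name; it only means the crosswalk has to be consulted directly for that variable.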



2.1.2 Constructed Variables for Selection of Group

The following eight variables were constructed and included on the file to facilitate the selection of appropriate cases for various analyses. Table 2 below contains descriptive statistics for these variables.

YEARIND    1=both years, 2=in 2019 only, and 3=in 2020 only
ALL5RDS    In scope and data collected in all 5 rounds (0=no, 1=yes)
DIED       Died during the two-year survey period (0=no, 1=yes)
INST       Institutionalized for some time during the two-year survey period (0=no, 1=yes)
MILITARY   Active duty military for some time during the two-year survey period (0=no, 1=yes)
ENTRSRVY   Entered survey after beginning of panel (mainly births; also includes persons who had no initial chance of selection who moved into a MEPS sample household) (0=no, 1=yes)
LEFTUS     Moved out of the country after beginning of panel (0=no, 1=yes)
OTHER      Not identified in any of the above analytic groups (0=no, 1=yes)

Table 2. Frequencies and Percentage for Constructed Variables


Variable                                  Number of Records   Percentage of Records (N=9,797)
YEARIND=1 (i.e., person in both years)    9,569               97.7
ALL5RDS=1 (yes)                           9,120               93.1
DIED=1 (yes)                              211                 2.2
INST=1 (yes)                              29                  0.3
MILITARY=1 (yes)                          27                  0.3
ENTRSRVY=1 (yes)                          362                 3.7
LEFTUS=1 (yes)                            25                  0.3
OTHER=1 (yes)                             34                  0.4


Following are examples of situations where these variables would be useful in selecting records for analysis:

  • Analysts interested in working only with persons who were in-scope and had data for all five rounds of the panel should subset to cases where ALL5RDS=1.
  • If a researcher wanted to include persons who were in-scope and had data for all five rounds of the panel as well as those in the survey at the beginning of the panel who subsequently died, then they would include cases where ALL5RDS=1 or (ENTRSRVY=0 and DIED=1).
  • If a researcher wanted to include persons who were in-scope and had data for all five rounds of the panel as well as those who died in the second year of the panel, then they would include cases where ALL5RDS=1 or (DIED=1 and YEARIND=1).
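
The three selection rules above can be written as simple record filters. The Python sketch below is illustrative only: it assumes each person record has already been read into a dictionary keyed by variable name, and the function name, rule labels, and sample records are hypothetical.

```python
def select_cases(records, rule):
    """Keep the records that satisfy one of the three example selection rules."""
    rules = {
        # In scope with data for all five rounds.
        "all_rounds": lambda r: r["ALL5RDS"] == 1,
        # All five rounds, or in the survey from the start and later died.
        "all_rounds_or_initial_died": lambda r: r["ALL5RDS"] == 1
        or (r["ENTRSRVY"] == 0 and r["DIED"] == 1),
        # All five rounds, or died in the second year of the panel.
        "all_rounds_or_died_year2": lambda r: r["ALL5RDS"] == 1
        or (r["DIED"] == 1 and r["YEARIND"] == 1),
    }
    keep = rules[rule]
    return [r for r in records if keep(r)]


# Hypothetical person records; identifiers and values are made up.
people = [
    {"DUPERSID": "A", "ALL5RDS": 1, "DIED": 0, "ENTRSRVY": 0, "YEARIND": 1},
    {"DUPERSID": "B", "ALL5RDS": 0, "DIED": 1, "ENTRSRVY": 0, "YEARIND": 2},
    {"DUPERSID": "C", "ALL5RDS": 0, "DIED": 1, "ENTRSRVY": 0, "YEARIND": 1},
    {"DUPERSID": "D", "ALL5RDS": 0, "DIED": 1, "ENTRSRVY": 1, "YEARIND": 1},
    {"DUPERSID": "E", "ALL5RDS": 0, "DIED": 0, "ENTRSRVY": 1, "YEARIND": 3},
]

for rule in ("all_rounds", "all_rounds_or_initial_died", "all_rounds_or_died_year2"):
    print(rule, [r["DUPERSID"] for r in select_cases(people, rule)])
```

With these sample records the second rule keeps persons A, B, and C, while the third keeps A, C, and D, which shows that the two rules select different groups of decedents.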



2.1.3 Estimation Variables

Longitudinal Estimations for Panel 24

The file contains a weight variable (LONGWT) and variance estimation variables (VARSTR, VARPSU) that should be applied when producing national estimates for longitudinal analyses. For example, LONGWT applied to the 9,120 cases where ALL5RDS=1 produces a weighted population estimate of 307.0 million. This represents an estimate of the number of persons in the civilian noninstitutionalized population for the entire two-year period from 2019-2020. To obtain estimates of variability (such as the standard error of sample estimates or corresponding confidence intervals) for estimates based on MEPS survey data, one needs to take into account the complex sample design of MEPS by specifying the estimation variables including stratum of sample selection (VARSTR), primary sampling unit (VARPSU) and longitudinal weight (LONGWT).
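
As an illustration of how the three estimation variables enter a calculation, the Python sketch below computes a weighted mean and a standard error. The choice of Taylor-series linearization with PSUs treated as sampled with replacement within strata is an assumption of this example; this documentation states only that VARSTR, VARPSU, and LONGWT must be specified. The function name and the sample records are hypothetical, and survey software should be preferred for production estimates.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(records, var, weight="LONGWT",
                          stratum="VARSTR", psu="VARPSU"):
    """Weighted mean of `var` and a linearized standard error."""
    total_weight = sum(r[weight] for r in records)
    mean = sum(r[weight] * r[var] for r in records) / total_weight

    # Sum the linearized residuals w * (y - mean) / W within each PSU.
    psu_scores = defaultdict(float)
    for r in records:
        key = (r[stratum], r[psu])
        psu_scores[key] += r[weight] * (r[var] - mean) / total_weight

    # Group the PSU scores by stratum.
    by_stratum = defaultdict(list)
    for (h, _), score in psu_scores.items():
        by_stratum[h].append(score)

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for scores in by_stratum.values():
        n = len(scores)
        if n < 2:
            raise ValueError("each stratum needs at least two PSUs")
        centre = sum(scores) / n
        variance += n / (n - 1) * sum((s - centre) ** 2 for s in scores)
    return mean, sqrt(variance)


# Hypothetical records; real values come from the data file.
sample = [
    {"VARSTR": 1, "VARPSU": 1, "LONGWT": 2.0, "TOTEXPY2": 10.0},
    {"VARSTR": 1, "VARPSU": 2, "LONGWT": 2.0, "TOTEXPY2": 20.0},
    {"VARSTR": 2, "VARPSU": 1, "LONGWT": 1.0, "TOTEXPY2": 30.0},
    {"VARSTR": 2, "VARPSU": 2, "LONGWT": 1.0, "TOTEXPY2": 30.0},
]
mean, se = weighted_mean_with_se(sample, "TOTEXPY2")
population = sum(r["LONGWT"] for r in sample)  # weighted population estimate
print(mean, se, population)
```

The sum of LONGWT over the selected records is the weighted population estimate, the same quantity as the 307.0 million figure quoted above for cases where ALL5RDS=1.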


This longitudinal file also contains a longitudinal SAQ weight variable (LSAQWT). This weight variable should be used to perform longitudinal analyses involving any variables from the self-administered questionnaire (SAQ), which was administered to persons age 18 and older in both Rounds 2 and 4 of the survey. The variable SAQRDS24 can be used to identify which persons have SAQ data for both versus only one of the two rounds. Table 3 below provides the estimated population size (i.e., the sum of LSAQWT values) for cases with only one round of SAQ data (i.e., SAQRDS24=0) and for cases with both rounds of SAQ data (i.e., SAQRDS24=1). The estimated population size for analyses based on the 4,737 cases with SAQ data for both rounds (i.e., SAQRDS24=1) is 229.29 million.


Table 3. Number of Respondents and Estimated Population Size for SAQ Analyses


Value of SAQRDS24   Description                            Number of Respondents (Unweighted)   Estimated Population Size (Weighted by LSAQWT)
0                   Persons with one round of SAQ data     5,060                                23,602,504
1                   Persons with both rounds of SAQ data   4,737                                229,285,551
Total               All SAQ respondents                    9,797                                252,888,055


Pooled Estimations

When analyzing subpopulations and/or low prevalence events, it may be desirable to pool together more than one panel of MEPS-HC data to yield sample sizes large enough to generate reliable estimates. If only data from Panels 7 and beyond are being pooled, then simply use the strata and PSU variables (VARSTR, VARPSU)[6] provided on the longitudinal files for pooled estimation. However, because Panels 1-6 MEPS longitudinal weight files were released with panel-specific variance structures, it is necessary to obtain the set of appropriate variance estimation variables from the HC-036 Pooled Estimation File when pooling involves these panels.
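
Before longitudinal files are stacked for pooled estimation, the strata and PSU variables need a common name (see footnote 6). The Python sketch below shows one way to do this for records held as dictionaries; the function names and sample values are hypothetical. It covers only the renaming step. Panels 1-6 are rejected because their variance variables must come from the HC-036 Pooled Estimation File, and any adjustment of weights for pooled analyses is outside this sketch.

```python
def harmonize_design_vars(record, panel):
    """Return a copy of one person record with VARSTR, VARPSU, and PANEL set."""
    if panel <= 6:
        raise ValueError(
            "Panels 1-6 need variance variables from the HC-036 Pooled Estimation File"
        )
    out = dict(record)
    if panel in (7, 8):
        # Panels 7 and 8 carry panel-specific names, e.g. VARSTRP7 and VARPSUP7.
        out["VARSTR"] = out.pop(f"VARSTRP{panel}")
        out["VARPSU"] = out.pop(f"VARPSUP{panel}")
    out["PANEL"] = panel
    return out


def pool_panels(files):
    """Stack records from several panels; `files` maps panel number to records."""
    pooled = []
    for panel, records in files.items():
        pooled.extend(harmonize_design_vars(r, panel) for r in records)
    return pooled


# Hypothetical one-record files for Panels 8 and 24.
panel8 = [{"DUPERSID": "x", "VARSTRP8": 101, "VARPSUP8": 2, "LONGWT": 1500.0}]
panel24 = [{"DUPERSID": "y", "VARSTR": 2001, "VARPSU": 1, "LONGWT": 900.0}]
pooled = pool_panels({8: panel8, 24: panel24})
print(pooled)
```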

Notable Changes from Previous Longitudinal Files

The reference period for Panel 24 Round 5 is longer than the Round 5 reference period for previous panels. The extension of Panel 24 to nine rounds led to the fielding of Round 5 as a cross-year round where respondents were asked to provide information about the reference period between the Round 4 interview in 2020 and the Round 5 interview date in 2021. Thus, the Round 5 reference period generally ended in 2021.[7] In previous panels, the Round 5 reference period ended on or before December 31 during respondents' second year of participation (for example, the Panel 23 Round 5 reference period generally ended on December 31, 2019).

The extension of Round 5 for Panel 24 may affect certain survey administration variables (e.g., PSTATS5), demographic variables (e.g., AGE5X), person-level condition variables, disability days indicator variables, access to care variables, employment variables, and health insurance variables with names ending in 5 or 5X. Variables in these categories may include transitions, events, or information that occurred in 2021. Users should be aware of this distinction when pooling and/or comparing data across panels. Income and tax filing variables, other health insurance variables, and utilization, expenditures, and sources of payment variables are not affected by the Round 5 extension and include data from 2020 only, consistent with previous longitudinal files. Please refer to the documentation for HC-224, the 2020 Full Year Consolidated File, for additional details about variable construction.





[6] Note that variable names for strata and PSU are VARSTR and VARPSU, respectively, in longitudinal files for panel 9 and beyond. These variables were named differently in the longitudinal files for panel 7 (VARSTRP7, VARPSUP7) and panel 8 (VARSTRP8, VARPSUP8) and need to be standardized when pooling with subsequent panels.

[7] The end date of the reference period for a person is prior to the date of the interview if the person was deceased during the round, left the reporting unit (RU), was institutionalized prior to that round's interview, or left the RU to join the military.


