MEPS HC-222 2020 Medical Conditions
October 2022
Due to the COVID-19 pandemic, changes were made to the
2020 MEPS data collection that analysts should keep in mind when doing trend
analysis and pooling years of data. 1) The MEPS moved primarily to a phone
rather than in-person survey. 2) Panels 23 and 24 were extended to nine rounds
(four years) of data collection as opposed to the historical five rounds (two
years). Because of the unforeseeable nature of the pandemic, data collection for
2020 included Round 5 interviews for Panel 23 that were fielded under the
assumption that that interview would be the panel’s last interview. Researchers
using variables related to the first interview of the calendar year should read
the documentation for their specific variables to understand the sources of the
values for Panel 23.
Agency for Healthcare Research and Quality
Center for Financing, Access, and Cost Trends
5600 Fishers Lane
Rockville, MD 20857
(301) 427-1406
Table of Contents
A. Data Use Agreement
B. Background
1.0 Household Component
2.0 Medical Provider Component
3.0 Survey Management and Data Collection
C. Technical and Programming Information
1.0 General Information
2.0 Data File Information
2.1 Codebook Structure
2.2 Reserved Codes
2.3 Codebook Format
2.4 Variable Naming
2.5 File Contents
2.5.1 Identifier Variables (DUID-CONDRN)
2.5.2 Medical Condition Variables (AGEDIAG-ICD10CDX)
2.5.3 Utilization Variables (OBNUM - RXNUM)
3.0 Survey Sample Information
3.1 Discussion of Pandemic Effects on Quality of 2020 MEPS Data
3.1.1 Summary
3.1.2 Overview
3.1.3 The Impact of the Pandemic on some Major Federal Surveys
3.1.4 Modifications to the MEPS-HC 2020 Sample Design and Implementation Effort in Response to the Pandemic
3.1.5 Data Quality Issues for MEPS FY 2020
3.1.6 Discussion and Guidance
3.2 Sample Weight (PERWT20F)
3.3 Details on Person Weight Construction
3.3.1 MEPS Panel 23 Weight Development Process
3.3.2 MEPS Panel 24 Weight Development Process
3.3.3 MEPS Panel 25 Weight Development Process
3.3.4 The Final Weight for 2020
3.4 Coverage
3.5 Using MEPS Data for Trend Analysis
4.0 Merging/Linking MEPS Data Files
4.1 National Health Interview Survey (NHIS)
4.2 Longitudinal Analysis
References
D. Variable-Source Crosswalk
Appendix 1: ICD10CDX and CCSR Condition Code Frequencies
Appendix 2: List of Conditions Asked in Priority Conditions Enumeration Section
Individual identifiers have been removed from the micro-data contained in these files. Nevertheless, under sections 308 (d) and 903 (c) of the Public Health Service Act (42 U.S.C. 242m and 42 U.S.C. 299 a-1), data collected by the Agency for Healthcare Research and Quality (AHRQ) and/or the National Center for Health Statistics (NCHS) may not be used for any purpose other than for the purpose for which they were supplied; any effort to determine the identity of any reported cases is prohibited by law.
Therefore in accordance with the above referenced Federal Statute, it is understood that:
- No one is to use the data in this data set in any way except for statistical reporting and analysis; and
- If the identity of any person or establishment should be discovered inadvertently, then (a) no use will be made of this knowledge, (b) the Director, Office of Management, AHRQ will be advised of this incident, (c) the information that would identify any individual or establishment will be safeguarded or destroyed, as requested by AHRQ, and (d) no one else will be informed of the discovered identity; and
- No one will attempt to link this data set with individually identifiable records from any data sets other than the Medical Expenditure Panel Survey or the National Health Interview Survey. Furthermore, linkage of the Medical Expenditure Panel Survey and the National Health Interview Survey may not occur outside the AHRQ Data Center, NCHS Research Data Center (RDC) or the U.S. Census RDC network.
By using these data you signify your agreement to comply with the above stated statutorily based requirements with the knowledge that deliberately making a false statement in any matter within the jurisdiction of any department or agency of the Federal Government violates Title 18 part 1 Chapter 47 Section 1001 and is punishable by a fine of up to $10,000 or up to 5 years in prison.
The Agency for Healthcare Research and Quality requests that users cite AHRQ and the Medical Expenditure Panel Survey as the data source in any publications or research based upon these data.
The Medical Expenditure Panel Survey (MEPS) provides nationally representative estimates of health care use, expenditures, sources of payment, and health insurance coverage for the U.S. civilian noninstitutionalized population. The MEPS Household Component (HC) also provides estimates of respondents’ health status, demographic and socio-economic characteristics, employment, access to care, and satisfaction with health care. Estimates can be produced for individuals, families, and selected population subgroups. The panel design of the survey, which includes 5 Rounds of interviews covering 2 full calendar years and two additional rounds in 2020 covering a third year to compensate for the smaller number of completed interviews in Panel 25, provides data for examining person level changes in selected variables such as expenditures, health insurance coverage, and health status. Using computer assisted personal interviewing (CAPI) technology, information about each household member is collected, and the survey builds on this information from interview to interview. All data for a sampled household are reported by a single household respondent.
The MEPS HC was initiated in 1996. Each year a new panel of sample households is selected. Because the data collected are comparable to those from earlier medical expenditure surveys conducted in 1977 and 1987, it is possible to analyze long-term trends. Each annual MEPS HC sample size is about 15,000 households. Data can be analyzed at either the person or event level. Data must be weighted to produce national estimates.
The set of households selected for each panel of the MEPS HC is a subsample of households participating in the previous year’s National Health Interview Survey (NHIS) conducted by the National Center for Health Statistics. The NHIS sampling frame provides a nationally representative sample of the U.S. civilian noninstitutionalized population. In 2006, the NHIS implemented a new sample design, which included Asian persons in addition to households with Black and Hispanic persons in the oversampling of minority populations. NHIS introduced a new sample design in 2016 that discontinued oversampling of these minority groups.
Upon completion of the household CAPI interview and obtaining permission from the household survey respondents, a sample of medical providers is contacted by telephone to obtain information that household respondents cannot accurately provide. This part of the MEPS is called the Medical Provider Component (MPC), and information is collected on dates of visits, diagnosis and procedure codes, charges, and payments. The Pharmacy Component (PC), a subcomponent of the MPC, does not collect charges or diagnosis and procedure codes but does collect drug detail information, including National Drug Code (NDC) and medicine name, as well as amounts of payment. The MPC is not designed to yield national estimates. It is primarily used as an imputation source to supplement or replace household-reported expenditure information.
MEPS HC and MPC data are collected under the authority of the Public Health Service Act. Data are collected under contract with Westat, Inc. (MEPS HC) and Research Triangle Institute (MEPS MPC). Data sets and summary statistics are edited and published in accordance with the confidentiality provisions of the Public Health Service Act and the Privacy Act. The National Center for Health Statistics (NCHS) provides consultation and technical assistance.
As soon as data collection and editing are completed, the MEPS survey data are released to the public in staged releases of micro data files and tables via the MEPS website.
Additional information on MEPS is available from the MEPS project manager or the MEPS public use data manager at the Center for Financing, Access, and Cost Trends, Agency for Healthcare Research and Quality, 5600 Fishers Lane, Rockville, MD 20857 (301-427-1406).
This documentation describes the data contained in MEPS Public Use Release HC-222, which is one in a series of public use data files to be released from the 2020 Medical Expenditure Panel Survey Household Component (MEPS HC). Released as an ASCII file (with related SAS, SPSS, R, and Stata programming statements and data user information), and a SAS data set, SAS transport file, Stata data set, and Excel file, this public use file provides information on household-reported medical conditions collected on a nationally representative sample of the civilian noninstitutionalized population of the United States for calendar year 2020 MEPS HC. The file contains 30 variables and has a logical record length of 111 with an additional 2-byte carriage return/line feed at the end of each record.
This documentation offers a brief overview of the types and levels of data provided and the content and structure of the files. It contains the following sections:
- Data File Information
- Survey Sample Information
- Merging/Linking MEPS Data Files
- Variable-Source Crosswalk
- Appendices
- ICD10CDX and CCSR Condition Code Frequencies
- List of Conditions Asked in Priority Conditions Enumeration Section
A codebook of all the variables included in the 2020 Medical Conditions File is provided in an accompanying file.
For more information on the MEPS sample design, see Chowdhury et al. (2019). A copy of the survey instrument used to collect the information on this file is available on the MEPS website.
This file contains 80,802 records. Each record represents one current medical condition reported for a household survey member who resides in an eligible responding household and who has a positive person or family weight. A condition is defined as current if it is linked to an event during 2020. Conditions in the Priority Condition Enumeration (PE) section are asked in the context of “has person ever been told by a doctor or other health care professional that they have (condition)?” except joint pain and chronic bronchitis, which ask only about the last 12 months. Persons with a response of Yes (1) to a priority condition question for whom the condition is not current as defined above will not have a record for that condition in this file.
Full year (FY) 2020 is the first data year to include three panels of data; Panel 23 was extended to include Rounds 6 and 7.
For most variables on the file, the codebook provides both weighted and unweighted frequencies. The exceptions to this are weight variables and variance estimation variables. Only unweighted frequencies of these variables are included in the accompanying codebook file. See the Weights Variables list in Section D, Variable-Source Crosswalk.
Person-level data (e.g., demographic or health insurance characteristics) from the 2020 MEPS full-year consolidated file (HC-224) can be merged to the records in this file using DUPERSID (see Section 4.0 for details). Since each record represents a single condition reported by a household respondent, some household members may have multiple medical conditions and thus will be represented by multiple records on this file. Other household members may have had no reported medical conditions and thus will have no records on this file. Still other household members may have had a reported medical condition that did not meet the criteria above and thus will have no records on this file. Data from this file also can be merged to 2020 MEPS Event Files (HC-220A, and HC-220D through HC-220H) by using the link files provided in HC-220I. (See HC-220I documentation for details.)
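The many-to-one relationship described above can be sketched in a few lines of Python. This is a minimal illustration only, not AHRQ code: the rows are invented, AGE_ILLUSTRATIVE is a hypothetical stand-in for any person-level variable from HC-224, and the function name attach_person_data is an assumption introduced here.

```python
# Toy rows standing in for HC-222 (conditions) and HC-224 (persons).
# All values are invented; AGE_ILLUSTRATIVE is a hypothetical column name.
conditions = [
    {"DUPERSID": "2320001101", "CONDIDX": "2320001101001", "ICD10CDX": "E11"},
    {"DUPERSID": "2320001101", "CONDIDX": "2320001101002", "ICD10CDX": "I10"},
    {"DUPERSID": "2420002102", "CONDIDX": "2420002102001", "ICD10CDX": "J45"},
]
persons = [
    {"DUPERSID": "2320001101", "AGE_ILLUSTRATIVE": 61},
    {"DUPERSID": "2420002102", "AGE_ILLUSTRATIVE": 34},
    {"DUPERSID": "2520003101", "AGE_ILLUSTRATIVE": 47},  # no condition records
]


def attach_person_data(conditions, persons):
    """Many-to-one join: copy person-level fields onto each condition record."""
    by_person = {p["DUPERSID"]: p for p in persons}
    merged = []
    for cond in conditions:
        # A person missing from the person file contributes no extra fields.
        person = by_person.get(cond["DUPERSID"], {})
        merged.append({**person, **cond})
    return merged


merged = attach_person_data(conditions, persons)

# Persons with no record on the conditions file do not appear in the result.
persons_without_conditions = sorted(
    {p["DUPERSID"] for p in persons} - {c["DUPERSID"] for c in conditions}
)
```

The first person appears twice in the merged result (once per condition), and the third person does not appear at all, so person-level estimates should not be computed by counting rows of the merged data.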
The codebook and data file list variables in the following order:
- Unique person identifiers
- Unique condition identifiers
- Medical condition variables
- Utilization variables
- Weight and variance estimation variables
Note that the person identifier is unique within this data year.
The following reserved code values are used:
Value | Definition
-1 INAPPLICABLE | Question was not asked due to skip pattern
-7 REFUSED | Question was asked and respondent refused to answer question
-8 DK | Question was asked and respondent did not know answer or the information could not be ascertained
-15 CANNOT BE COMPUTED | Value cannot be derived from data
The value -15 (CANNOT BE COMPUTED) is assigned to MEPS constructed variables in cases where there is not enough information from the MEPS instrument to calculate the constructed variables. “Not enough information” is often the result of skip patterns in the data or from missing information resulting from MEPS responses of -7 (REFUSED) or -8 (DK). Note that reserved code -8 includes cases where the information from the question was “not ascertained” or where the respondent chose “don’t know”.
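Because the reserved codes are negative numbers, they can distort means or counts if left in place. The following is a minimal sketch of one way to separate them from analytic values before analysis. The function name split_reserved is an assumption introduced here, and the handling of text values assumes that character variables such as ICD10CDX store reserved codes as strings like "-15".

```python
# Reserved codes as listed in the table above.
RESERVED_CODES = {
    -1: "INAPPLICABLE",
    -7: "REFUSED",
    -8: "DK",
    -15: "CANNOT BE COMPUTED",
}


def split_reserved(value):
    """Return (analytic_value, reserved_label) for one raw value.

    A reserved code yields (None, label); any other value yields (value, None).
    """
    key = value
    if isinstance(value, str):
        # Assumption: character variables hold reserved codes as text.
        try:
            key = int(value.strip())
        except ValueError:
            return value, None  # ordinary text such as "E11"
    if key in RESERVED_CODES:
        return None, RESERVED_CODES[key]
    return value, None
```

For example, split_reserved(-8) returns (None, "DK"), while split_reserved("E11") returns ("E11", None).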
This codebook describes an ASCII data set (although the data are also being provided in an Excel file, a Stata data set, a SAS data set, and a SAS transport file), and provides the following programming identifiers for each variable:
Variable Programming Identifiers
Identifier | Description
Name | Variable name
Description | Variable descriptor
Format | Number of bytes
Type | Type of data: numeric (indicated by NUM) or character (indicated by CHAR)
Start | Beginning column position of variable in record
End | Ending column position of variable in record
In general, variable names reflect the content of the variable, with an 8-character limitation. Edited variables end in an “X” and are so noted in the variable label. (CONDIDX, which is an encrypted identifier variable, also ends in an “X”.)
As variable collection, universe, or categories are altered, the variable name will be appended with “_Myy” to indicate in which year the alterations took place. Details about these alterations can be found throughout this document.
Variables contained in this delivery were derived either from the questionnaire itself or from the CAPI. The source of each variable is identified in Section D, Variable-Source Crosswalk. Sources for each variable are indicated in one of three ways: (1) variables derived from CAPI or assigned in sampling are so indicated; (2) variables collected at one or more specific questions have those numbers and questionnaire sections indicated in the “SOURCE” column; and (3) variables constructed from multiple questions using complex algorithms are labeled “Constructed” in the “SOURCE” column.
The definitions of Dwelling Units (DUs) in the MEPS HC are generally consistent with the definitions employed for the National Health Interview Survey (NHIS). The Dwelling Unit ID (DUID) is a seven-digit ID number consisting of a 2-digit panel number followed by a five-digit random number assigned after the case was sampled for MEPS. A three-digit person number (PID) uniquely identifies each person within the DU. The variable DUPERSID is the combination of the variables DUID and PID. IDs begin with a 2-digit panel number.
CONDN is the condition number and uniquely identifies each condition reported for an individual. The range on this file for CONDN is 1-65. A CONDN beginning with “9” reflects a condition that was added during the editing process.
The variable CONDIDX uniquely identifies each condition (i.e., each record on the file) and is the combination of DUPERSID and CONDN. CONDIDX has a length of 13 with DUPERSID (10) and CONDN (3) combined.
PANEL is a constructed variable used to specify the panel number for the interview in which the condition was reported. PANEL will indicate Panel 23, Panel 24, or Panel 25. The panel number is included as the first two digits of the DUID and DUPERSID.
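The layout of the identifiers described above (2-digit panel, 5-digit dwelling unit number, 3-digit PID, 3-digit CONDN) can be illustrated with a short Python sketch. It assumes that CONDIDX is held as a 13-character, zero-padded text value. The function name parse_condidx and the sample value are invented for illustration.

```python
def parse_condidx(condidx):
    """Split a 13-character CONDIDX into its documented components."""
    if len(condidx) != 13:
        raise ValueError(f"expected 13 characters, got {len(condidx)}")
    dupersid, condn = condidx[:10], condidx[10:]  # DUPERSID (10) + CONDN (3)
    duid, pid = dupersid[:7], dupersid[7:]        # DUID (7) + PID (3)
    return {
        "PANEL": int(duid[:2]),  # first two digits of DUID and DUPERSID
        "DUID": duid,
        "PID": pid,
        "DUPERSID": dupersid,
        "CONDN": int(condn),
    }


example = parse_condidx("2320001101001")  # invented identifier
```

Here the example resolves to Panel 23, DUID "2320001", PID "101", and CONDN 1.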
CONDRN indicates the round in which the condition was first reported. For a small number of cases, conditions that actually began in an earlier round were not reported by respondents until subsequent rounds of data collection. During file construction, editing was performed for these cases in order to reconcile the round in which a condition began and the round in which the condition was first reported.
This file contains variables describing medical conditions reported by respondents in several sections of the MEPS questionnaire, and all questionnaire sections collecting information about health provider visits and/or prescription medications (see Variable-Source Crosswalk in Section D for details).
Priority Conditions and Injuries
Certain conditions were a priori designated as “priority conditions” due to their prevalence, expense, or relevance to policy. Some of these are long-term, life-threatening conditions, such as cancer, diabetes, emphysema, high cholesterol, hypertension, ischemic heart disease, and stroke. Others are chronic, manageable conditions, including arthritis and asthma. The only mental health condition on the priority conditions list is attention deficit hyperactivity disorder/attention deficit disorder. See Appendix 2 for a full list of the priority conditions.
When a condition was first mentioned, respondents were asked whether it was due to an accident or injury (INJURY=1). Only non-priority conditions (i.e., conditions reported in a section other than PE) are eligible to be injuries. The interviewer is prevented from selecting priority conditions as injuries.
Age Priority Condition Began
The age of diagnosis (AGEDIAG) was collected for all priority conditions, except joint pain. For confidentiality reasons, AGEDIAG is set to Inapplicable (-1) for cancer conditions.
To ensure confidentiality, age of diagnosis was top-coded to 85. This corresponds with the age top-coding in person-level PUFs.
Follow-up Questions for Injuries
When a respondent reported that a condition resulted from an accident or injury (INJURY=1), respondents were asked during the round in which the injury was first reported whether the accident/injury occurred at work (ACCDNWRK). This question was not asked about persons aged 15 and younger; the condition had ACCDNWRK coded to inapplicable (-1) for those persons.
Sources for Conditions on the MEPS Conditions File
The records on this file correspond with medical condition records collected by CAPI and stored on a person’s MEPS conditions roster. Conditions can be added to the MEPS conditions roster in several ways. A condition can be reported in the Priority Condition Enumeration (PE) section in which persons are asked if they have been diagnosed with specific conditions. The condition can be identified as the reason reported by the household respondent for a particular medical event (hospital stay, outpatient visit, emergency room visit, home health episode, prescribed medication purchase, or medical provider visit). Some condition information is collected in the Medical Provider Component of MEPS. However, since it is not available for everyone in the sample, it is not used to supplement, replace, or verify household-reported condition data. Conditions reported in the PE section that are not current are not included on this file.
Treatment of Data from Rounds Not Occurring in 2020
Prior to the 2008 file, priority conditions reported during Rounds 1 and 2 of the second year panel were included on the file even if the conditions were not related to an event or reported as a serious condition occurring in the second year of the panel. Beginning in 2008, priority conditions are included on the file only if they are also current conditions. From 2008-2017, a current condition was defined as a condition linked to an event or a condition the person was currently experiencing (i.e., a condition selected in the Condition Enumeration (CE) section). However, starting in Panel 21 Round 5 and Panel 22 Round 3, a current condition is defined only as a condition linked to a current year event. Conditions from Panel 23 Rounds 3, 4, and 5 as well as Panel 24 Rounds 1, 2, and 3 that are not included in the 2020 file may be available in the 2019 Medical Conditions File if the person had a positive person or family weight in 2019.
Note: Priority conditions are generally chronic conditions. Even though a person may not have reported an event in 2020 due to the condition, analysts should consider that the person may still be experiencing the condition. If a Panel 24 person reported a priority condition in Round 1 or 2 and did not have an event for the condition in Round 3, 4, or 5, the condition will not be included on the 2020 Medical Conditions File. Similarly, if a Panel 23 person reported a priority condition in Round 1, 2, 3, 4, or 5 and did not have an event for the condition in Round 6 or 7, the condition will not be included on the 2020 Medical Conditions File.
Rounds in Which Conditions Were Reported/Selected (CRND1 - CRND7)
A set of constructed variables indicates the round in which the condition was first reported (CONDRN), and the subsequent round(s) in which the condition was selected (CRND1 - CRND7). The condition may be reported or selected when the person reports an event that occurred due to the condition. For example, consider a condition for which CRND1 = 0, CRND2 = 1, and CRND3 = 1. For non-priority conditions (conditions not asked in the PE section), this sequence of indicators on a condition record implies that the condition was not present during Round 1 (CRND1 = 0), was first mentioned during Round 2 (CRND2 = 1, CONDRN = 2), and was selected again during Round 3 (CRND3 = 1). For priority conditions, this sequence of indicators implies that the condition was reported in the PE section in Round 1 (CONDRN = 1, CRND1 = 0) but was not connected with an event until Rounds 2 and 3 (CRND2 = 1, CRND3 = 1). Because priority conditions are asked in the context of “has person ever been told by a doctor or other health care professional that they have (condition)?” except joint pain and chronic bronchitis, which ask only about the last 12 months, a priority condition might not be selected in the round in which it was first reported. For Panel 23 records, a condition is current if there is an event linked to a condition in Round 6 or the 2020 portion of Round 7. For Panel 24 records, a condition is current if there is an event linked to a condition in the 2020 portion of Rounds 3 or 5, or in Round 4. For Panel 25 records, a condition is current if there is an event linked to a condition in Rounds 1 or 2, or in the 2020 portion of Round 3.
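The worked example in this section (CRND1 = 0, CRND2 = 1, CRND3 = 1) can be expressed as a small Python sketch. This is an illustrative reading of the indicators, not AHRQ code. The function name describe_rounds and the output keys are assumptions introduced here.

```python
def describe_rounds(condrn, crnd_flags):
    """Summarize CONDRN and the CRND1-CRND7 indicators for one condition.

    crnd_flags maps a round number to its indicator; only a value of 1
    counts as "selected", so reserved codes such as -1 are ignored.
    """
    selected = sorted(rnd for rnd, flag in crnd_flags.items() if flag == 1)
    return {
        "first_reported": condrn,
        "selected_rounds": selected,
        # True when the condition was reported (e.g., in the PE section)
        # in a round before it was first selected for an event.
        "reported_before_first_selection": bool(selected) and condrn < selected[0],
    }


flags = {1: 0, 2: 1, 3: 1}

non_priority = describe_rounds(condrn=2, crnd_flags=flags)
priority = describe_rounds(condrn=1, crnd_flags=flags)
```

With the same indicators, the non-priority record (CONDRN = 2) is first reported in the round in which it is first selected, while the priority record (CONDRN = 1) was reported one round before its first linked event.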
Diagnosis Codes
Medical conditions reported by the Household Component respondent were recorded by the interviewer using a condition pick-list with ICD-10-CM codes already assigned to conditions in the list. Reported conditions not in the pick-list were recorded as verbatim text and then were coded to ICD-10-CM codes (ICD10CDX) by professional coders.
Coders followed specific guidelines in coding missing values to the ICD-10-CM diagnosis condition variable when a verbatim text string could not be matched to an ICD-10-CM code through the pick-list. ICD10CDX was coded -15 (Cannot be Computed) where the verbatim text fell into one of three categories: (1) the text indicated that the condition was unknown (e.g., DK); (2) the text indicated the condition could not be diagnosed by a doctor (e.g., doctor doesn’t know); or (3) the specified condition was not codable. If the text indicated a procedure and the condition associated with the procedure could be discerned from the text, the condition itself is coded. For example, “cataract surgery” is coded as the condition “other cataract” (ICD10CDX is set to code “H26”). If the condition could not be discerned (e.g. “outpatient surgery”), ICD10CDX is set to -15.
In order to preserve confidentiality, all of the conditions provided on this file have been collapsed to 3-digit diagnosis code categories rather than the fully-specified ICD-10-CM code. For example, the ICD10CDX value of J02 “Acute pharyngitis” includes the fully-specified subclassifications J020 and J029; the value F31 “Bipolar disorder” includes the fully-specified subclassifications F3110 through F319. Table 1 in Appendix 1 provides unweighted and weighted frequencies for all ICD-10-CM condition code values reported on the file. Less than 1 percent of the ICD-10-CM codes on this file were edited further by collapsing two or more 3-digit codes into one 3-digit code. This includes clinically rare conditions that were recoded to broader codes by clinicians. A condition is determined to be clinically rare if it appears on the National Institutes of Health’s list of rare diseases.
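The collapsing described above amounts to keeping the first three characters of a fully-specified code. The sketch below illustrates this with the examples from the text. It is not the procedure used to build the file (the public use file already contains collapsed values), and the function name collapse_icd10cm is an assumption introduced here.

```python
from collections import defaultdict


def collapse_icd10cm(code):
    """Return the 3-character category of an ICD-10-CM code (dots optional)."""
    cleaned = code.replace(".", "").strip().upper()
    if len(cleaned) < 3:
        raise ValueError(f"not an ICD-10-CM code: {code!r}")
    return cleaned[:3]


# Fully-specified codes taken from the examples in the text.
full_codes = ["J020", "J029", "F3110", "F319"]

by_category = defaultdict(list)
for full in full_codes:
    by_category[collapse_icd10cm(full)].append(full)
```

Distinct fully-specified codes end up under a single category (J020 and J029 both under J02), which is why one event can be linked to more than one condition record with the same ICD10CDX value.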
For confidentiality purposes, approximately 4% of ICD-10-CM codes were recoded to -15 (Cannot be Computed) for conditions where the frequency was less than 20 for the total unweighted population in the file or less than 200,000 for the weighted population. Additional factors used to determine recoding include age and gender.
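The two frequency thresholds can be read as a simple rule, sketched below in Python. This is a simplified illustration under stated assumptions: it applies only the unweighted (fewer than 20) and weighted (less than 200,000) thresholds and does not model the additional factors such as age and gender. The codes "AAA", "BBB", and "CCC", the weights, and the function name suppress_rare_codes are all invented.

```python
from collections import defaultdict

CANNOT_BE_COMPUTED = "-15"


def suppress_rare_codes(records, min_unweighted=20, min_weighted=200_000):
    """Recode to -15 any code below either frequency threshold.

    records is a list of (code, person_weight) pairs, one per condition record.
    """
    count = defaultdict(int)
    weighted = defaultdict(float)
    for code, weight in records:
        count[code] += 1
        weighted[code] += weight
    return [
        CANNOT_BE_COMPUTED
        if count[code] < min_unweighted or weighted[code] < min_weighted
        else code
        for code, _ in records
    ]


records = (
    [("AAA", 10_000.0)] * 25     # 25 records, weighted 250,000: kept
    + [("BBB", 1_000.0)] * 25    # 25 records, weighted 25,000: recoded
    + [("CCC", 100_000.0)] * 5   # 5 records, weighted 500,000: recoded
)
recoded = suppress_rare_codes(records)
```

A code is retained only when it passes both thresholds, so "BBB" fails on the weighted count and "CCC" fails on the unweighted count.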
In a small number of cases, diagnosis and condition codes were recoded to -15 (Cannot be Computed) if they denoted a pregnancy for a person younger than 16 or older than 44. Less than one-tenth of 1 percent of records were recoded in this manner on the 2020 Medical Conditions file. The person’s age was determined by linking the 2020 Medical Conditions file to the 2019 and 2020 Population Characteristics File. If the person’s age is under 16 or over 44 in the round in which the condition was reported, the appropriate condition code was recoded to -15 (Cannot be Computed).
Users should note that because of the design of the survey, most deliveries (i.e., births) are coded as pregnancies. For more accurate estimates for deliveries, analysts should use RSNINHOS “Reason Entered Hospital” found on the Hospital Inpatient Stays Public Use File (HC-220D).
Each year, a few conditions on the final file may fall below the confidentiality threshold. This is due to the multistage file development process. The confidentiality recoding is performed on the preliminary version of the Conditions file each year. This preliminary version is used in the development of other event PUFs and, in turn, these event PUFs are used in the development of the final Conditions file. During this process, some records from the preliminary file are dropped because only records that are relevant to the current data year are reflected in the final Conditions PUF.
Conditions file data can be merged with the 2020 MEPS Event Files using the 2020 MEPS Condition-Event Linking file (HC-220I). Because the conditions have been collapsed to 3-digit diagnosis code categories rather than the fully-specified ICD-10-CM code, it is possible for there to be duplicate ICD-10-CM condition codes linked to a single medical event when different fully-specified conditions are coded to the same 3-digit code.
Conditions were reported in several sections of the HC questionnaire (see Variable-Source Crosswalk in Section D). Labels for all values of ICD10CDX, as shown in Table 1 of Appendix 1, are provided in the SAS programming statements included in this release (see the H222SU.TXT file).
Clinical Classification Software Refined
Clinical Classification Software Refined (CCSR) categories are used alongside ICD-10-CM diagnosis codes to group medical conditions into clinically meaningful categories. Although ICD-10-CM diagnosis codes can map to multiple CCSR codes, for the purposes of this PUF, one ICD-10-CM diagnosis code may map to up to three CCSR categories (CCSR1X, CCSR2X, CCSR3X) using the v2021.3 release of the CCSR for ICD-10-CM diagnoses. The CCSR categories on this PUF are listed in alphabetical order and do not indicate a primary and secondary diagnosis. For more information on CCSR, visit the user guide for CCSR.
For confidentiality purposes, less than 2% of the CCSR categories were collapsed into a broader code for the appropriate body system where the frequency was less than 20 for the total unweighted population in the file or less than 200,000 for the weighted population. For example, BLD001 (Nutritional Anemia), may be recoded to BLD000 (Disease of Blood and Disorders Involving Immune Mechanism), thus revealing only the body system. Less than 1% of CCSR codes were recoded to -15 (Cannot be Computed) based on frequencies of ICD10CDX and CCSR pairs.
Table 2 in Appendix 1 provides unweighted and weighted frequencies for CCSR combinations reported on the file.
The variables OBNUM, OPNUM, HHNUM, IPNUM and ERNUM indicate the total number of 2020 events that can be linked to each condition record on the current file, i.e., office-based, outpatient, home health, inpatient hospital stays and emergency room visits. Note that the HHNUM variable includes all home health types, including informal care. The variable RXNUM is a count of distinct prescribed medicines, i.e., not refills, in a round linked to a given condition.
These counts of events were derived from Expenditure Event Public Use Files (HC-220G, HC-220F, HC-220H, HC-220D, HC-220E, and HC-220A). Events associated with conditions include all utilization that occurred between January 1, 2020 and December 31, 2020.
Because persons can be seen for more than one condition per visit, these frequencies will not match the person- or event-level utilization counts. For example, if a person had one inpatient hospital stay and was treated for a fractured hip, a fractured shoulder, and a concussion, each of these conditions has a unique record in this file and IPNUM=1 for each record. By summing IPNUM for these records, the total inpatient hospital stays would be three when actually there was only one inpatient hospital stay for that person and three conditions were treated. These variables are useful for determining the number of inpatient hospital stays associated with a particular condition.
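The hospital stay example above can be reproduced with a short Python sketch. The condition identifiers and the event label "stay-A" are invented; in practice the condition-to-event links come from the HC-220I link file, whose layout is not reproduced here.

```python
from collections import Counter

# One inpatient stay linked to three condition records for the same person.
# Each pair is (condition identifier, event label); all values are invented.
links = [
    ("2320001101001", "stay-A"),  # fractured hip
    ("2320001101002", "stay-A"),  # fractured shoulder
    ("2320001101003", "stay-A"),  # concussion
]

# IPNUM-style count: inpatient stays linked to each condition record.
ipnum = Counter(condition for condition, _ in links)

sum_of_ipnum = sum(ipnum.values())                  # counts the stay once per condition
distinct_stays = len({event for _, event in links})  # counts the stay once
```

Summing the per-condition counts gives three, while the number of distinct stays is one. Person- or event-level utilization totals should therefore be taken from the event files rather than from sums of IPNUM.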
Data collection for in-person sample surveys in 2020 presented real challenges after the onset of the COVID-19 pandemic at a national level in mid-March of that year. After major modifications to the standard MEPS study design, it was possible to collect data safely, but there were naturally concerns about the quality of the data after such modifications. Some issues related to data quality were identified and are discussed below. As with most in-person surveys conducted in 2020, researchers are counseled to take care in the interpretation of 2020 estimates including the comparison of such estimates with those of other years.
The onset of the COVID-19 pandemic in 2020 had a major impact on the MEPS Household Component (MEPS-HC) as it did for most major federal surveys and, of course, American life generally. The following discussion describes 1) the general impact of the pandemic on three major federal surveys (the effects on two of which also affect MEPS); 2) modifications to the MEPS sample design and field operations in 2020 due to the pandemic; and 3) potential data quality issues in the FY 2020 MEPS data related to the COVID-19 pandemic.
Many important federal surveys were collecting data when much of the nation shut down in the face of the pandemic in March 2020. Among them were the Current Population Survey (CPS), the American Community Survey (ACS), and the National Health Interview Survey (NHIS). The ACS and the NHIS field new samples each year. The CPS includes rotating panels, meaning some of the sampled households fielded had participated in prior years while others were fresh. Two of these surveys have important roles in MEPS. Estimates of CPS subgroups serve as benchmarks for the MEPS weighting process (referred to below as “raking control totals”) while households fielded for Round 1 of MEPS in each year are selected as a subsample of the NHIS responding households from the prior year.
Because data collection in 2020 occurred under such unusual circumstances, all three of these surveys have reported bias concerns. (In fact, the ACS decided not to release a standard database for 2020 due to the uncertain quality of the data, while the CPS and the NHIS released data but included reports discussing concerns about bias.) All three surveys have reported evidence of nonresponse bias, specifically, that households in higher socio-economic levels were relatively more likely to respond and the sample weighting was unable to fully compensate for this. As a result, analysts have been cautioned about the accuracy of survey estimates and the ability to compare resulting estimates with estimates obtained in the years prior to the pandemic.
The quality of CPS data is of particular importance to Full Year 2020 MEPS PUFs as CPS estimates serve as the control totals for the raking component of the MEPS weighting process. These control totals are based on the following demographic variables: age, sex, race/ethnicity, region, MSA status, educational attainment, and poverty status. The CPS estimates used in the development of the FY 2020 MEPS PUF weights that were based on the variables age, sex, race/ethnicity, region, and MSA status were evaluated by the Census Bureau and determined to be of high quality. However, similar evaluations of the corresponding CPS estimates associated with educational attainment and poverty status found that these estimates suffered from bias.
A set of references discussing the fielding of these three surveys during the pandemic and resulting bias concerns can be found in the References section of this document.
For the MEPS-HC, face-to-face interviewing ceased due to the COVID-19 pandemic on March 17, 2020. At that time, there were two MEPS panels in the field for which 2020 data were being collected: Round 1 of Panel 25 and Round 3 of Panel 24. The sampled households for Panel 25 were being contacted and asked to participate in MEPS for the first time while those from Panel 24 had already participated in MEPS for two rounds. A third MEPS panel was also in the field in early 2020, Round 5 of Panel 23, collecting data for the last portion of 2019.
In developing a plan for how best to resume MEPS data collection, the primary issues were how to do so safely for both sampled household members and interviewers and the potential impact on data quality. Telephone data collection, although not the preferred method of data collection in general for MEPS-HC, was the natural option because it did not require in-person contact with respondents and could be implemented relatively quickly. The impact of changing to telephone on both response rates and data quality was expected to be larger for Panel 25 Round 1, whose respondents were new to MEPS (e.g., they had no experience with reporting health care events in the recent past). At the time in-person interviewing stopped in mid-March 2020, completion rates for Panels 23 and 24 were substantially higher than those for Panel 25.
AHRQ decided to field Panel 23 for at least one more year, asking Panel 23 respondents if they would be open to further participation in MEPS in newly added Rounds 6 and 7. Extending Panel 23 was meant to both offset the decrease in the number of cases in the FY 2020 data related to lower expected sample yields for Panel 25 and to improve data quality by retaining a set of participants who were familiar with MEPS. These decisions required major changes in survey operations, including adding a fall Panel 23 Round 6 interview covering all 2020 events from January 1, 2020 to the date of the interview.
Numerous analyses were conducted to examine potential impacts on data quality and to gain a more complete understanding of these issues. Zuvekas and Kashihara (2021) discuss some of these analyses and provide additional background information on how the MEPS study design was modified in 2020 in response to the pandemic. Three sources of potential bias that were identified are noted here: the long recall period for Round 6 of Panel 23; switching from in-person to telephone interviewing which likely had a larger impact on Panel 25; and the impact of CPS bias on the MEPS weights. Each is considered in turn.
Comparisons of health care utilization data for Panel 24 and Panel 23 indicated that the extended reference period for Panel 23 Round 6 may have resulted in recall issues for respondents. Round 6 was initially fielded in the late summer and early fall of 2020, and because the Round 5 reference period ended on December 31, 2019, the recall period for health care events and related information extended back to January 1, 2020, much longer than for typical MEPS rounds. For Panel 23 Round 6 respondents, events of a less salient nature, such as dental visits and office-based physician visits, occurring in early 2020 were underreported. Underreporting was confirmed through both an examination of differential utilization across 2020 for Panel 23 respondents and statistical comparisons of Panel 23 and Panel 24 event estimates. Adjustments were made to the sample weights for Panel 23 to help address this concern. Details on these adjustments can be found in Section 3.3.1.
Comparisons of Panel 25 with Panel 24 health care utilization data found statistically significant differences in estimates for several event types, with the Panel 25 estimates generally being higher. The same comparisons between first- and second-year panels in MEPS in recent years showed relatively few such differences, with no differences at all in 2019.
Finally, AHRQ decided to calibrate, via raking, the FY 2020 Consolidated PUF weights to control totals reflecting CPS 2021 poverty status data. As discussed earlier, bias was identified by the Census Bureau in the 2020 and 2021 CPS income data and correlates. Nevertheless, the Census Bureau decided to use its standard sample weighting approach for both the 2020 and 2021 CPS ASEC data sets while recognizing some deficiencies in the nonresponse adjustment approach for the two years as a result of data collection during the pandemic. Similarly, MEPS has used poverty status based on the CPS estimates for calibration for many years and continued to do so for the 2020 Full Year Consolidated PUF as it was decided that the advantages of doing so outweighed the disadvantages.
The additional procedures for developing person-level and family-level final weights for the 2020 Consolidated MEPS data were designed to correct for potential biases in the data due to changes in data collection and response bias. However, evaluations of MEPS data quality in 2020 - corroborated in analyses of other Federal surveys fielded in 2020 - suggest that users of the MEPS FY 2020 Consolidated PUF should exercise caution when interpreting estimates and assessing analyses based on these data as well as in comparing 2020 estimates to those of prior years.
There is a single full-year person-level weight (PERWT20F) assigned to each record for each key, in-scope person who responded to MEPS for the full period of time that he or she was in-scope during 2020. A key person was either a member of a responding NHIS household at the time of the interview or joined a family associated with such a household after being out-of-scope at the time of the NHIS (the latter circumstance includes newborns as well as those returning from military service, an institution, or residence in a foreign country). A person is in-scope whenever he or she is a member of the civilian noninstitutionalized portion of the U.S. population.
The person-level weight PERWT20F was developed in several stages. Person-level weights for Panel 23, Panel 24, and Panel 25 were created separately. The weighting process for each panel included an adjustment for nonresponse over time and calibration to independent population figures. The calibration was initially accomplished separately for each panel by raking the corresponding sample weights for those in-scope at the end of the calendar year to Current Population Survey (CPS) population estimates based on six variables. The six variables used in the establishment of the initial person-level control figures were:
- educational attainment of the reference person (no degree, high school/GED no college, some college, bachelor’s degree or higher);
- census region (Northeast, Midwest, South, West);
- MSA status (MSA, non-MSA);
- race/ethnicity (Hispanic; Black, non-Hispanic; Asian, non-Hispanic; and other);
- sex; and
- age.
A 2020 composite weight was then formed by multiplying each weight from Panel 23 by the factor 0.29, each weight from Panel 24 by the factor 0.36, and each weight from Panel 25 by the factor 0.35. The choice of factors reflected the relative sample sizes of the three panels, helping to limit the variance of estimates obtained from pooling the three samples. The composite weight was raked to the same set of CPS-based control totals.
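The compositing step can be illustrated with a minimal Python sketch. The three factors are the ones quoted above; the person identifiers and panel-specific weights are hypothetical and are not MEPS values. Because the factors sum to 1, the composited weights from three panels that each represent the full population again represent that population once.

```python
# Compositing factors quoted in the text for Panels 23, 24, and 25.
COMPOSITE_FACTORS = {23: 0.29, 24: 0.36, 25: 0.35}


def composite_weight(panel, panel_weight):
    """Scale a panel-specific person weight by that panel's compositing factor."""
    return COMPOSITE_FACTORS[panel] * panel_weight


# Hypothetical records: (person id, panel, panel-specific weight).
records = [
    ("A", 23, 30000.0),
    ("B", 24, 25000.0),
    ("C", 25, 20000.0),
]

# Composite weight for each hypothetical person.
composited = {pid: composite_weight(panel, w) for pid, panel, w in records}
```

The composited weights in this sketch are only the input to the subsequent raking step; they are not final weights.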
The standard approach for MEPS weighting is as follows. When the poverty status information derived from income variables becomes available, a final raking is undertaken. The full sample weight appearing on the Population Characteristics PUF for a given year is re-raked, establishing control figures reflecting poverty status rather than educational attainment. Thus, control totals are established using poverty status (five categories: below poverty, from 100 to 125 percent of poverty, from 125 to 200 percent of poverty, from 200 to 400 percent of poverty, at least 400 percent of poverty) as well as the other five variables previously used in the weight calibration.
This approach was modified for the full sample weights appearing on the FY 2020 Consolidated PUF. The raking of the Panel 23 weights was re-done as described in Section 3.3.1 below, and then the resulting Panel 23 weights were composited with those previously established for Panels 24 and 25 with the same factors as described previously, producing a new full sample weight. This new weight was then raked to control figures reflecting the standard five variables plus poverty status.
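For readers unfamiliar with raking, the following Python sketch shows the general technique (iterative proportional fitting) on a toy example. It is not the MEPS production procedure: the function name `rake`, the two dimensions, the starting weights, and the control totals are all hypothetical, and the sketch assumes every control category contains at least one record.

```python
def rake(records, weights, controls, max_iter=100, tol=1e-8):
    """Iterative proportional fitting, one categorical dimension at a time.

    records  : list of dicts, one per person, mapping dimension -> category
    weights  : list of starting weights, in the same order as records
    controls : dict mapping dimension -> {category: control total}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for dim, targets in controls.items():
            # Current weighted total in each category of this dimension.
            totals = {cat: 0.0 for cat in targets}
            for rec, wi in zip(records, w):
                totals[rec[dim]] += wi
            # Scale each weight so the category totals match the controls.
            for i, rec in enumerate(records):
                factor = targets[rec[dim]] / totals[rec[dim]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass leaves the weights essentially unchanged.
        if max_change < tol:
            break
    return w


# Toy example with two dimensions (hypothetical values throughout).
records = [
    {"sex": "M", "region": "NE"},
    {"sex": "M", "region": "S"},
    {"sex": "F", "region": "NE"},
    {"sex": "F", "region": "S"},
]
start = [100.0, 100.0, 100.0, 100.0]
controls = {
    "sex": {"M": 180.0, "F": 220.0},
    "region": {"NE": 150.0, "S": 250.0},
}
raked = rake(records, start, controls)
```

After raking, the weighted totals for each category of each dimension match the corresponding control totals. Adding a dimension, such as poverty status, amounts to adding another entry to `controls`.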
The person-level weight for MEPS Panel 23 was developed using the 2019 full-year weight for an individual as the initially assigned weight for 2019 survey participants present in 2020. For key, in-scope members who joined an RU some time in 2020 after being out-of-scope in 2019, the initially assigned person-level weight was the corresponding 2019 family weight. The weighting process included an adjustment for person-level nonresponse over Rounds 6 and 7 as well as raking to population control figures for December 2020 for key, responding persons in-scope on December 31, 2020. These control totals were derived by scaling back the population distribution obtained from the March 2021 CPS to reflect the December 31, 2020 estimated population total (estimated based on Census projections for January 1, 2021). Variables used for person-level raking included: education of the reference person (three categories: no degree; high school/GED only or some college; Bachelor’s or higher degree); Census region (Northeast, Midwest, South, West); MSA status (MSA, non-MSA); race/ethnicity (Hispanic; Black, non-Hispanic; Asian, non-Hispanic; and other); sex; and age. (It may be noted that for confidentiality reasons, the MSA status variables are no longer released for public use. This started with the Full-Year 2013 Person-Level Use PUF.) The final weight for key, responding persons who were not in-scope on December 31, 2020 but were in-scope earlier in the year was the nonresponse-adjusted person weight without raking.
In developing the person-level weight for Panel 23, an additional raking dimension was included beyond those based on the usual six variables. This dimension was added to adjust the distribution of event-based (i.e., office-based [MV] and/or outpatient [OP]) estimates to align with corresponding Panel 24 weighted estimates. The table below shows ratios of weighted totals (population estimates) associated with this additional raking dimension, reflecting the extent to which the Panel 23 estimates were modified in order to correspond to Panel 24 estimates. Generally, the weights of the records with any event in Q1 are inflated to account for the underreporting of events in Q1.
Ratio of Adjusted to Unadjusted Weights
# of Events | Ratio
1: No MV/OP Events | 0.8375
2: At least 1 event in Q1 and no events in other quarters | 2.7509
3: At least 1 event in Q2 and no events in other quarters | 0.9456
4: At least 1 event in Q3 and no events in other quarters | 0.7811
5: At least 1 event in Q4 and no events in other quarters | 0.7149
6: At least 1 event in Q1 and at least 1 event in at least 1 other quarter | 1.3188
7: At least 1 event in Q2 and at least 1 event in at least 1 of Q3 or Q4 | 0.7199
8: Other | 0.6908
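As an illustration of how the eight categories partition respondents, the Python sketch below assigns a category from quarterly MV/OP event counts and computes the ratio of adjusted to unadjusted weighted totals by category. It rests on one reading of the table: category 7 is taken to exclude persons with Q1 events (who fall in category 6), and category 8 is taken to cover persons with events in both Q3 and Q4 only. The function names and sample rows are hypothetical.

```python
from collections import defaultdict


def event_category(q1, q2, q3, q4):
    """Map quarterly MV/OP event counts to the eight table categories."""
    has = [n > 0 for n in (q1, q2, q3, q4)]
    n_quarters = sum(has)
    if n_quarters == 0:
        return 1
    if n_quarters == 1:
        # Events in exactly one quarter: Q1 -> 2, Q2 -> 3, Q3 -> 4, Q4 -> 5.
        return 2 + has.index(True)
    if has[0]:
        return 6  # Q1 plus at least one other quarter
    if has[1]:
        return 7  # Q2 plus Q3 and/or Q4, no Q1 events
    return 8      # remaining case: events in both Q3 and Q4 only


def ratio_by_category(rows):
    """rows: iterable of (category, unadjusted weight, adjusted weight).

    Returns the ratio of adjusted to unadjusted weighted totals per category.
    """
    before = defaultdict(float)
    after = defaultdict(float)
    for cat, w_unadjusted, w_adjusted in rows:
        before[cat] += w_unadjusted
        after[cat] += w_adjusted
    return {cat: after[cat] / before[cat] for cat in before}


# Hypothetical weights, chosen only to exercise the function.
example = ratio_by_category([
    (2, 100.0, 250.0),
    (2, 100.0, 300.0),
    (1, 200.0, 160.0),
])
```

The published ratios are ratios of weighted totals for each category, so they should not be read as a factor applied uniformly to each record.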
The Panel 23 2019 full-year weight used as the base weight for Panel 23 was derived from the 2018 MEPS Round 1 weight and reflected adjustment for nonresponse over the remaining data collection rounds in 2018 and 2019 as well as raking to the December 2018 and December 2019 population control figures.
For the raking variable “education of the reference person” there were four raking categories in prior years: no degree; high school/GED no college; some college; and Bachelor’s or a higher degree. However, as mentioned in the discussion of data quality issues in 2020 in Section 3.1, there was evidence that the onset of the COVID-19 pandemic in the years of 2020 and 2021 affected estimates associated with income and education (further details can be found in the references associated with the CPS data quality issues in 2020 and 2021 in the References section). For the full-year 2019 weights, March 2019 CPS was utilized instead of March 2020 CPS in the construction of control totals to avoid data quality issues connected to the COVID-19 pandemic. For the full-year 2020 weights, since there are no reliable education estimates from 2020 or 2021 CPS, a regression approach was implemented to derive education control figures. The regression approach involved two steps. The first step fit a linear regression model for each of the four education categories using the 2013-2018 CPS education of reference person distributions as the predictors in order to estimate the distribution for 2020, and the second step derived the education of reference person control figures by applying the estimated 2020 education distribution to the December 31, 2020 population total. The models for “no degree” and “Bachelor’s or a higher degree” performed extremely well, with R² values of 0.97 and 0.98, respectively. The models for “high school/GED no college” and “some college” showed a lower goodness of fit, especially for some college, with an R² value of 0.74. A linear regression for the two categories combined improved the R² value to 0.89, so the two levels were combined for the 2020 weight development.
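The two-step regression approach can be sketched in Python as follows. The exact model specification is not given here, so this sketch assumes a simple least-squares trend of each category's share on calendar year, extrapolated to 2020. The shares and the population total are hypothetical placeholders, not CPS or MEPS figures.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


years = [2013, 2014, 2015, 2016, 2017, 2018]

# Hypothetical shares by education category (each year sums to 1).
shares = {
    "no degree": [0.120, 0.117, 0.115, 0.111, 0.109, 0.106],
    "high school/GED or some college": [0.560, 0.557, 0.553, 0.552, 0.548, 0.545],
    "bachelor's or higher": [0.320, 0.326, 0.332, 0.337, 0.343, 0.349],
}

# Step 1: fit a trend per category and extrapolate the share to 2020.
fits = {cat: fit_line(years, ys) for cat, ys in shares.items()}
predicted = {cat: a + b * 2020 for cat, (a, b, _) in fits.items()}

# Step 2: apply the estimated distribution to a population total.
POP_TOTAL = 1_000_000  # hypothetical placeholder, not the MEPS control total
share_sum = sum(predicted.values())
education_controls = {
    cat: POP_TOTAL * p / share_sum for cat, p in predicted.items()
}
```

The third element of each entry in `fits` is the R² of that category's trend line, the goodness-of-fit measure referred to above.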
The person-level weight for MEPS Panel 24 was developed using the 2019 full-year weight for an individual as a “base” weight for survey participants present in 2019. For key, in-scope members who joined an RU some time in 2020 after being out-of-scope in 2019, the initially assigned person-level weight was the corresponding 2019 family weight. The weighting process included an adjustment for person-level nonresponse over Rounds 4 and 5 as well as raking to population control totals for December 2020 used for the MEPS Panel 23 weights for key, responding persons in-scope on December 31, 2020. The six standard variables employed for Panel 23 raking (education level, census region, MSA status, race/ethnicity, sex, and age) were also used for Panel 24 raking. Similar to Panel 23, the Panel 24 final weight for key, responding persons not in-scope on December 31, 2020 but in-scope earlier in the year was the nonresponse-adjusted person weight without raking.
Note that the 2019 full-year weight that was used as the base weight for Panel 24 was derived as follows: adjustment of the 2019 MEPS Round 1 weight for nonresponse over the remaining data collection rounds in 2019, and raking of the resulting nonresponse-adjusted weight to December 2019 population control figures.
The person-level weight for MEPS Panel 25 was developed using the 2020 MEPS Round 1 person-level weight as a “base” weight. The MEPS Round 1 weights incorporated the following components: the original household probability of selection for the NHIS, use of a subsample of the NHIS panels and quarters reserved for MEPS, an adjustment for NHIS nonresponse, the probability of selection for MEPS from NHIS responding households, adjustment for nonresponse at the dwelling unit level for Round 1, and poststratification to control figures at the person level obtained from the March CPS of the corresponding year. For key, in-scope members who joined an RU after Round 1, the Round 1 family weight served as a “base” weight.
The weighting process also included an adjustment for nonresponse over the remaining data collection rounds in 2020 as well as raking to the same population control figures for December 2020 used for the MEPS Panel 23 and Panel 24 weights for key, responding persons in-scope on December 31, 2020. The six standard variables employed for Panel 23 and Panel 24 raking (educational attainment of the reference person, census region, MSA status, race/ethnicity, sex, and age) were also used for Panel 25. The event-based raking dimension used for Panel 23 was not employed for Panel 25. Similar to Panel 23 and Panel 24, the Panel 25 final weight for key, responding persons who were not in-scope on December 31, 2020 but were in-scope earlier in the year was the person weight after the nonresponse adjustment.
The final raking of those in-scope at the end of the year has been described above. In addition, the composite weights of three groups of persons who were out-of-scope on December 31, 2020 were adjusted for expected undercoverage. Specifically, the weights of those who were in-scope some time during the year, were out-of-scope on December 31, and had entered a nursing home during the year and were still residing in a nursing home at the end of the year were poststratified to an estimate of the number of persons who were residents of Medicare- and Medicaid-certified nursing homes for part of the year (approximately 3-9 months) during 2014. This estimate was developed from data on the Minimum Data Set (MDS) of the Centers for Medicare & Medicaid Services (CMS). The weights of persons who died while in-scope were poststratified to corresponding estimates derived using data obtained from the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), Underlying Cause of Death, 1999-2020 on CDC WONDER Online Database, released in 2022, the latest available data at the time. Separate decedent control totals were developed for the “65 and older” and “under 65” civilian noninstitutionalized populations.
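Poststratification to a single control total reduces to one scaling factor per group. The Python sketch below illustrates this for two decedent groups; the group labels follow the text, while the weights and control totals are hypothetical and are not the published figures.

```python
def poststratify(weights, control_total):
    """Scale a group's weights by one factor so they sum to control_total."""
    factor = control_total / sum(weights)
    return [w * factor for w in weights]


# Hypothetical composite weights for decedents, by age group.
decedent_weights = {
    "under 65": [900.0, 1100.0],
    "65 and older": [1500.0, 2500.0, 1000.0],
}

# Hypothetical control totals (placeholders, not CDC WONDER estimates).
decedent_controls = {"under 65": 2500.0, "65 and older": 4000.0}

# Each group is adjusted separately to its own control total.
adjusted = {
    group: poststratify(ws, decedent_controls[group])
    for group, ws in decedent_weights.items()
}
```

Within a group, every weight is multiplied by the same factor, so relative weights inside the group are unchanged.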
Overall, the weighted population estimate for the civilian noninstitutionalized population for December 31, 2020 is 324,539,180 (PERWT20F >0 and INSC1231=1). The sum of person-level weights across all persons assigned a positive person-level weight is 328,545,297.
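The two totals above correspond to two different filters on the same weight variable. The Python sketch below shows both sums on hypothetical rows; it assumes the rows come from a person-level file that carries both PERWT20F and INSC1231, and the code 2 used for "not in scope" is an assumption of the sketch. The final lines simply difference the two published totals.

```python
def weighted_totals(rows):
    """rows: iterable of (PERWT20F, INSC1231) pairs.

    Returns (sum of weights for persons in-scope on 12/31,
             sum of weights for all persons with a positive weight).
    """
    in_scope_total = 0.0
    positive_total = 0.0
    for weight, insc1231 in rows:
        if weight > 0:
            positive_total += weight
            if insc1231 == 1:
                in_scope_total += weight
    return in_scope_total, positive_total


# Hypothetical rows, not MEPS data (INSC1231 code 2 is assumed here).
rows = [(12000.0, 1), (8000.0, 1), (5000.0, 2), (0.0, 1)]
in_scope_total, positive_total = weighted_totals(rows)

# Published totals quoted in the text.
PUBLISHED_IN_SCOPE = 324_539_180
PUBLISHED_POSITIVE = 328_545_297

# Weighted count of positively weighted persons not in-scope on 12/31.
not_in_scope_on_1231 = PUBLISHED_POSITIVE - PUBLISHED_IN_SCOPE
```

The difference of 4,006,117 is the weighted total attributable to persons with a positive weight who do not meet the INSC1231=1 condition.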
The target population for MEPS in this file is the 2020 U.S. civilian noninstitutionalized population. However, the MEPS sampled households are a subsample of the NHIS households interviewed in 2017 (Panel 23), 2018 (Panel 24), and 2019 (Panel 25). New households created after the NHIS interviews for the respective panels and consisting exclusively of persons who entered the target population after 2017 (Panel 23), after 2018 (Panel 24), or after 2019 (Panel 25) are not covered by MEPS. Neither are previously out-of-scope persons who join an existing household but are unrelated to the current household residents. Persons not covered by a given MEPS panel thus include some members of the following groups: immigrants; persons leaving the military; U.S. citizens returning from residence in another country; and persons leaving institutions. The set of uncovered persons constitutes a relatively small segment of the MEPS target population.
First, of course, we note that there are uncertainties associated with 2020 data quality as discussed in Section 3.1. Evaluations described in that section suggest that care should be taken in the interpretation of estimates based on data collected in 2020 as well as in comparisons over time. Trend analyses are challenging since the advent of the COVID-19 pandemic resulted in uncertain data quality for MEPS as well as standard benchmark sources such as the CPS, ACS, and NHIS while the pandemic also had an impact on the health and access to health care of the U.S. population. For such reasons, the extent to which 2020 health care parameters may differ from those of prior years is difficult to assess.
In terms of other factors to be aware of, MEPS began in 1996, and the utility of the survey for analyzing health care trends expands with each additional year of data; however, it is important to consider a variety of factors when examining trends over time using MEPS. Tests of statistical significance should be conducted to assess the likelihood that observed trends are not attributable to sampling variation. The length of time being analyzed should also be considered. In particular, large shifts in survey estimates over short periods of time (e.g. from one year to the next) that are statistically significant should be interpreted with caution unless they are attributable to known factors such as changes in public policy, economic conditions, or MEPS survey methodology.
With respect to methodological considerations, in 2013 MEPS introduced an effort focused on field procedure changes such as interviewer training to obtain more complete information about health care utilization from MEPS respondents with full implementation in 2014. This effort likely resulted in improved data quality and a reduction in underreporting starting in the second half of 2013 and throughout the 2014 full-year files, and it may have had some impact on analyses involving trends in utilization across years. The aforementioned changes in the NHIS sample design in 2016 could also potentially affect trend analyses. The new NHIS sample design is based on more up-to-date information related to the distribution of housing units across the U.S. As a result, it can be expected to better cover the full U.S. civilian, noninstitutionalized population, the target population for MEPS as well as many of its subpopulations. Better coverage of the target population helps to reduce the potential for bias in both NHIS and MEPS estimates.
A significant change to the Conditions file occurred in 2016 when ICD-10-CM condition codes replaced ICD-9-CM codes. In addition, beginning in 2018, MEPS transitioned to CCSR codes, and up to three CCSR codes were assigned to a single condition (see Section 2.5.2 for details). Previously, a single CCS code was assigned to each condition to group conditions into clinically meaningful categories. The 2016 and 2017 Medical Conditions files are scheduled to be updated to include up to three CCSR codes for each condition. Also in 2018, the inclusion criteria for conditions changed; therefore, fewer conditions are on the 2018 and later files compared to previous years. See Section 2.0 for a discussion of conditions included on the file.
Another change with the potential to affect trend analyses involved modifications to the MEPS instrument design and data collection process, particularly in the events sections of the instrument. These were introduced in the Spring of 2018 and thus affected data beginning with Round 1 of Panel 23, Round 3 of Panel 22, and Round 5 of Panel 21. Since the Full Year 2017 PUFs were established from data collected in Rounds 1-3 of Panel 22 and Rounds 3-5 of Panel 21, they reflected two different instrument designs. In order to mitigate the effect of such differences within the same full year file, the Panel 22 Round 3 data and the Panel 21 Round 5 data were transformed to make them as consistent as possible with data collected under the previous design. The changes in the instrument were designed to make the data collection effort more efficient and easy to administer. In addition, expectations were that data on some items, such as those related to health care events, would be more complete with the potential of identifying more events. Increases in service use reported since the implementation of these changes are consistent with these expectations. Data users should be aware of possible impacts on the data and especially trend analyses for these data years due to the design transition.
Process changes, such as data editing and imputation, may also affect trend analyses. For example, users should refer to the 2020 Consolidated file (HC-224) and, for more detail, the documentation for the prescription drug file (HC-220A) when analyzing prescription drug spending over time.
As always, it is recommended that data users review relevant sections of the documentation for descriptions of these types of changes that might affect the interpretation of changes over time before undertaking trend analyses.
Analysts may wish to consider using techniques to smooth or stabilize analyses of trends using MEPS data such as comparing pooled time periods (e.g. 1996-1997 versus 2011-2012), working with moving averages, or using modeling techniques with several consecutive years of MEPS data to test the fit of specified patterns over time.
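Two of the smoothing techniques mentioned above can be sketched in a few lines of Python. The annual estimates are hypothetical. The pooled figure here is a simple average of annual estimates, which is only a stand-in for a pooled analysis of the underlying files, and neither function addresses standard errors.

```python
def moving_average(values, window):
    """Trailing moving average; returns one value per complete window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def pooled_mean(estimates, years):
    """Simple average of annual estimates over the given years."""
    return sum(estimates[y] for y in years) / len(years)


# Hypothetical annual estimates (e.g., a rate per 100 persons).
annual = {2016: 10.0, 2017: 12.0, 2018: 11.0, 2019: 13.0}

# Two-year moving average across the series, in year order.
smoothed = moving_average([annual[y] for y in sorted(annual)], window=2)

# Compare two pooled periods rather than single adjacent years.
early = pooled_mean(annual, [2016, 2017])
late = pooled_mean(annual, [2018, 2019])
```

Comparing `early` with `late` is less sensitive to a single unusual year than comparing two adjacent annual estimates.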
Finally, statistical significance tests should be conducted to assess the likelihood that observed trends are not attributable to sampling variation. In addition, researchers should be aware of the impact of multiple comparisons on Type I error. Without making appropriate allowance for multiple comparisons, undertaking numerous statistical significance tests of trends increases the likelihood of concluding that a change has taken place when one has not.
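The effect of multiple comparisons on Type I error can be made concrete with a short Python sketch. No particular correction is prescribed here; the Bonferroni adjustment shown is one common and conservative choice, used as an assumption for illustration, and the error-rate formula assumes independent tests. The p-values are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni(p_values, alpha=0.05):
    """Flag tests that remain significant at family-wise level alpha."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# With ten independent tests at the 0.05 level, the chance of at least
# one spurious "significant" trend is about 0.40.
fwer_ten_tests = familywise_error_rate(10)

# Hypothetical p-values from four trend tests.
flags = bonferroni([0.001, 0.02, 0.04, 0.30])
```

In this example only the first test remains significant after adjustment, although three of the four p-values fall below 0.05 individually.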
Data from the current file can be used alone or in conjunction with other files. Merging characteristics of interest from person-level files expands the scope of potential estimates. Person-level characteristics can be merged to this Conditions File using the following procedure (example given for the SAS programming language):
- Sort the person-level file by person identifier, DUPERSID. Keep only DUPERSID and the variables to be merged onto the Conditions File.
- Sort the Conditions File by person identifier, DUPERSID.
- Merge both files by DUPERSID, and output all records in the Conditions File.
- If PERS contains the person-level variables, and COND is the Conditions File, the following code can be used to add person-level variables to the person’s conditions in the Condition-level file.
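/* Keep only the merge key and the person-level variables of interest. */
PROC SORT DATA=PERS(KEEP=DUPERSID AGE SEX EDUCYR HIDEG)
          OUT=PERSX;
  BY DUPERSID;
RUN;
/* Sort the Conditions File by the same key. */
PROC SORT DATA=COND;
  BY DUPERSID;
RUN;
/* Merge by DUPERSID; IF A keeps every record on the Conditions File. */
DATA COND;
  MERGE COND(IN=A) PERSX(IN=B);
  BY DUPERSID;
  IF A;
RUN;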
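For analysts not working in SAS, the same merge logic can be sketched in plain Python. This is an illustrative equivalent under stated assumptions, not part of the MEPS tooling: the function name and the sample rows are hypothetical, and each file is represented as a list of dictionaries keyed by variable name.

```python
def merge_person_onto_conditions(cond_rows, person_rows, keep):
    """Attach person-level variables to every condition record by DUPERSID.

    All condition records are kept (as with IF A in the SAS step); the
    person-level variables are None when no person record matches.
    """
    person_by_id = {p["DUPERSID"]: p for p in person_rows}
    merged = []
    for cond in cond_rows:
        person = person_by_id.get(cond["DUPERSID"], {})
        row = dict(cond)
        for var in keep:
            row[var] = person.get(var)
        merged.append(row)
    return merged


# Hypothetical rows, not MEPS data.
pers = [
    {"DUPERSID": "0001", "AGE": 45, "SEX": 1},
    {"DUPERSID": "0002", "AGE": 67, "SEX": 2},
]
cond = [
    {"DUPERSID": "0001", "CONDN": 1},
    {"DUPERSID": "0001", "CONDN": 2},
    {"DUPERSID": "0003", "CONDN": 1},
]
cond_with_person = merge_person_onto_conditions(cond, pers, ["AGE", "SEX"])
```

A person with several conditions contributes one merged row per condition, so person-level values repeat across that person's condition records.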
Data from this file can be used alone or in conjunction with other files for different analytic purposes. Each MEPS panel can also be linked back to the previous years’ National Health Interview Survey public use data files. For information on MEPS/NHIS link files, please see the AHRQ website.
Panel-specific longitudinal files are available for downloading in the data section of the MEPS website. As has been done routinely in past years, the longitudinal file for Panel 24 comprises MEPS survey data obtained in Rounds 1 through 5 of the panel and can be used to analyze changes over a two-year period. Unlike past years for MEPS, in 2020 Panel 23 had data collected for a third year. As such, two-year and three-year longitudinal files will be developed for Panel 23. These can be used to analyze changes over the corresponding two-year or three-year period. Variables in the file pertaining to survey administration, demographics, employment, health status, disability days, quality of care, patient satisfaction, health insurance, and medical care use and expenditures were obtained from the MEPS full-year Consolidated files from the years covered by each panel.
For more details or to download the data files, please see Longitudinal Data Files at the AHRQ website.
Bramlett, M.D., Dahlhamer, J.M., & Bose, J. (2021, September). Weighting Procedures and Bias Assessment for the 2020 National Health Interview Survey. Centers for Disease Control and Prevention.
Chowdhury, S.R., Machlin, S.R. & Gwet, K.L. Sample Designs of the Medical Expenditure Panel Survey Household Component, 1996-2006 and 2007-2016. Methodology Report #33. January 2019. Agency for Healthcare Research and Quality, Rockville, MD.
Cox, B. and Iachan, R. (1987). A Comparison of Household and Provider Reports of Medical Conditions. Journal of the American Statistical Association 82(400): 1013-18.
Current Population Survey: 2021 Annual Social and Economic (ASEC) Supplement. (2021). U.S. Census Bureau.
Dahlhamer, J.M., Bramlett, M.D., Maitland, A., & Blumberg, S.J. (2021). Preliminary evaluation of nonresponse bias due to the COVID-19 pandemic on National Health Interview Survey estimates, April-June 2020. National Center for Health Statistics.
Daily, D., Cantwell, P.J., Battle, K., & Waddington, D.G. (2021, October 27). An Assessment of the COVID-19 Pandemic’s Impact on the 2020 ACS 1-Year Data. U.S. Census Bureau.
Edwards, W. S., Winn, D. M., Kurlantzick, V., et al. Evaluation of National Health Interview Survey Diagnostic Reporting. National Center for Health Statistics, Vital Health 2(120). 1994.
Health Care Financing Administration (1980). International Classification of Diseases, 9th Revision, Clinical Modification (ICD-CM). Vol. 1. (Department of Health and Human Services Pub. No (PHS) 80-1260). Department of Health and Human Services: U.S. Public Health Services.
Johnson, Ayah E., and Sanchez, Maria Elena. (1993), “Household and Medical Reports on Medical Conditions: National Medical Expenditure Survey.” Journal of Economic and Social Measurement, 19, 199-223.
Lau, D.T., Sosa, P., Dasgupta, N., & He, H. (2021). Impact of the COVID-19 Pandemic on Public Health Surveillance and Survey Data Collections in the United States. American Journal of Public Health, 111 (12), pp. 2118-2121.
Rothbaum, J. & Bee, A. (2020). Coronavirus Infects Surveys, Too: Nonresponse Bias During the Pandemic in the CPS ASEC (SEHSD Working Paper Number 2020-10). U.S. Census Bureau.
Rothbaum, J. & Bee, A. (2021, May 3). Coronavirus Infects Surveys, Too: Survey Nonresponse Bias and the Coronavirus Pandemic. U.S. Census Bureau.
Rothbaum, J., Eggleston, J., Bee, A., Klee, M., & Mendez-Smith, B. (2021). Addressing Nonresponse Bias in the American Community Survey During the Pandemic Using Administrative Data. U.S. Census Bureau.
Villa Ross, C.A., Shin, H.B., & Marlay, M.C. (2021, October 27). Pandemic Impact on 2020 American Community Survey 1-Year Data. U.S. Census Bureau.
Zuvekas, S.H. & Kashihara, D. (2021). The Impacts of the COVID-19 Pandemic on the Medical Expenditure Panel Survey. American Journal of Public Health, 111 (12), pp. 2157-2166.
MEPS HC-222: 2020 MEDICAL CONDITIONS
UNIQUE IDENTIFIER VARIABLES
VARIABLE | LABEL | SOURCE [1]
DUID | Panel # + Encrypted DU Identifier | Assigned In Sampling
PID | Person Number | Assigned In Sampling
DUPERSID | Person ID (DUID + PID) | Assigned In Sampling
CONDN | Condition Number | CAPI Derived
CONDIDX | Condition ID | CAPI Derived
PANEL | Panel Number | Constructed
CONDRN | Condition Round Number | CAPI Derived
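The DUPERSID label above describes the person identifier as DUID followed by PID. A minimal sketch of that concatenation is shown here. The function name build_dupersid, the storage of both identifiers as text, and the zero-padded PID width of 3 are assumptions made for illustration and should be checked against the codebook before use.

```python
def build_dupersid(duid, pid, pid_width=3):
    """Concatenate DUID and PID into a person-level key (DUID + PID).

    pid_width=3 is an assumption for illustration, not a documented value;
    confirm the actual field widths in the codebook.
    """
    duid_text = str(duid).strip()
    # Zero-pad PID so that a numeric 1 and the text "001" yield the same key.
    pid_text = str(pid).strip().zfill(pid_width)
    return duid_text + pid_text


# Hypothetical identifier values, used only to show the call.
person_key = build_dupersid("2320005", 101)
```

Building the key as text rather than as a number avoids losing leading zeros when files are merged on the person identifier.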
MEDICAL CONDITION VARIABLES
VARIABLE | LABEL | SOURCE [1]
AGEDIAG | Age When Diagnosed | PE section
CRND1 | Has Condition Information In Round 1 | Constructed
CRND2 | Has Condition Information In Round 2 | Constructed
CRND3 | Has Condition Information In Round 3 | Constructed
CRND4 | Has Condition Information In Round 4 | Constructed
CRND5 | Has Condition Information In Round 5 | Constructed
CRND6 | Has Condition Information In Round 6 | Constructed
CRND7 | Has Condition Information In Round 7 | Constructed
INJURY | Was Condition Due To Accident/Injury | AH80
ACCDNWRK | Did Accident Occur At Work | AH90
ICD10CDX | ICD-10-CM Code For Condition Edited | HS40, ER30, OP60, MV70, HH80, PM120, PE Section (Edited)
CCSR1X | Clinical Classification Refined Code 1 Edited | HS40, ER30, OP60, MV70, HH80, PM120, PE section (Edited)
CCSR2X | Clinical Classification Refined Code 2 Edited | HS40, ER30, OP60, MV70, HH80, PM120, PE section (Edited)
CCSR3X | Clinical Classification Refined Code 3 Edited | HS40, ER30, OP60, MV70, HH80, PM120, PE section (Edited)
UTILIZATION VARIABLES
VARIABLE | LABEL | SOURCE [1]
HHNUM | # Home Health Events Assoc. w/ Condition | Constructed
IPNUM | # Inpatient Events Assoc. w/ Condition | Constructed
OPNUM | # Outpatient Events Assoc. w/ Condition | Constructed
OBNUM | # Office-Based Events Assoc. w/ Condition | Constructed
ERNUM | # ER Events Assoc. w/ Condition | Constructed
RXNUM | # Distinct Prescribed Medicines Per Round Assoc. w/ Cond | Constructed
WEIGHTS AND VARIANCE ESTIMATION VARIABLES
VARIABLE | LABEL | SOURCE [1]
PERWT20F | Expenditure File Person Weight, 2020 | Constructed
VARSTR | Variance Estimation Stratum, 2020 | Constructed
VARPSU | Variance Estimation PSU, 2020 | Constructed
[1] See the Household Component section under Survey Questionnaires on the MEPS home page for information on the MEPS HC questionnaire sections shown in the Source column (e.g., PE).
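To illustrate how the weight and variance estimation variables listed above are typically combined, the following is a minimal sketch of a weighted total with a Taylor-series (with-replacement, between-PSU) standard error. It is not the procedure prescribed by this documentation. The function name weighted_total_with_se, the record layout (one dictionary per person), the indicator field HAS_COND, and the toy values are all hypothetical. In practice a survey-analysis package should be used, and condition records must first be reduced to one record per person.

```python
from collections import defaultdict
from math import sqrt


def weighted_total_with_se(records, value_key, weight_key="PERWT20F",
                           stratum_key="VARSTR", psu_key="VARPSU"):
    """Return (weighted total, standard error) for person-level records.

    Variance is the stratified between-PSU estimator:
    sum over strata of n_h / (n_h - 1) * sum_i (z_hi - mean_h)^2,
    where z_hi is the weighted total within PSU i of stratum h.
    """
    psu_totals = defaultdict(lambda: defaultdict(float))
    for rec in records:
        weighted_value = rec[weight_key] * rec[value_key]
        psu_totals[rec[stratum_key]][rec[psu_key]] += weighted_value

    total = 0.0
    variance = 0.0
    for psus in psu_totals.values():
        z = list(psus.values())
        n_h = len(z)
        total += sum(z)
        if n_h < 2:
            # Sketch only: a single-PSU stratum contributes no variance here.
            continue
        mean_z = sum(z) / n_h
        variance += n_h / (n_h - 1) * sum((v - mean_z) ** 2 for v in z)
    return total, sqrt(variance)


# Toy person-level records (hypothetical values, not MEPS data).
toy = [
    {"VARSTR": 1, "VARPSU": 1, "PERWT20F": 100.0, "HAS_COND": 1},
    {"VARSTR": 1, "VARPSU": 1, "PERWT20F": 200.0, "HAS_COND": 0},
    {"VARSTR": 1, "VARPSU": 2, "PERWT20F": 150.0, "HAS_COND": 1},
    {"VARSTR": 2, "VARPSU": 1, "PERWT20F": 300.0, "HAS_COND": 1},
    {"VARSTR": 2, "VARPSU": 2, "PERWT20F": 100.0, "HAS_COND": 1},
]
total, se = weighted_total_with_se(toy, "HAS_COND")
```

Persons without the condition are kept in the input with an indicator of 0 rather than dropped, so that every stratum and PSU remains represented in the variance calculation.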
Angina/Angina Pectoris
Arthritis
Asthma
Attention Deficit Hyperactivity Disorder (ADHD)/Attention Deficit Disorder (ADD)
Cancer/Malignancy
Chronic Bronchitis
Coronary Heart Disease
Diabetes/Sugar Diabetes
Emphysema
Heart Attack/Myocardial Infarction (MI)
High Cholesterol
Hypertension/High Blood Pressure
Joint Pain
Other Heart Disease (not coronary heart disease, angina, or heart attack)
Stroke/Transient Ischemic Attack (TIA)/Mini-stroke