MEPS HC-221: 2020 Food Security Data File
September 2020

Due to the COVID-19 pandemic, changes were made to the 2020 MEPS data collection that analysts should keep in mind when doing trend analysis and pooling years of data:

1) The MEPS moved primarily to a telephone rather than an in-person survey.
2) Panels 23 and 24 were extended to nine rounds (four years) of data collection, as opposed to the historical five rounds (two years).

Because of the unforeseeable nature of the pandemic, data collection for 2020 included Round 5 interviews for Panel 23 that were fielded under the assumption that this interview would be the panel's last. Researchers using variables related to the first interview of the calendar year should read the documentation for their specific variables to understand the sources of the values for Panel 23.

Agency for Healthcare Research and Quality

A. Data Use Agreement

Individual identifiers have been removed from the micro-data contained in these files. Nevertheless, under sections 308(d) and 903(c) of the Public Health Service Act (42 U.S.C. 242m and 42 U.S.C. 299a-1), data collected by the Agency for Healthcare Research and Quality (AHRQ) and/or the National Center for Health Statistics (NCHS) may not be used for any purpose other than the purpose for which they were supplied; any effort to determine the identity of any reported cases is prohibited by law. Therefore, in accordance with the above-referenced Federal statute, it is understood that:
By using these data you signify your agreement to comply with the above-stated, statutorily based requirements, with the knowledge that deliberately making a false statement in any matter within the jurisdiction of any department or agency of the Federal Government violates Title 18, Part 1, Chapter 47, Section 1001 and is punishable by a fine of up to $10,000 or up to 5 years in prison.

The Agency for Healthcare Research and Quality requests that users cite AHRQ and the Medical Expenditure Panel Survey as the data source in any publications or research based upon these data.

B. Background

1.0 Household Component

The Medical Expenditure Panel Survey (MEPS) provides nationally representative estimates of health care use, expenditures, sources of payment, and health insurance coverage for the U.S. civilian noninstitutionalized population. The MEPS Household Component (HC) also provides estimates of respondents' health status, demographic and socio-economic characteristics, employment, access to care, and satisfaction with health care. Estimates can be produced for individuals, families, and selected population subgroups. The panel design of the survey, which includes five rounds of interviews covering two full calendar years (plus two additional rounds in 2020 covering a third year to compensate for the smaller number of completed interviews in Panel 25), provides data for examining person-level changes in selected variables such as expenditures, health insurance coverage, and health status. Using computer-assisted personal interviewing (CAPI) technology, information about each household member is collected, and the survey builds on this information from interview to interview. All data for a sampled household are reported by a single household respondent.

The MEPS HC was initiated in 1996. Each year a new panel of sample households is selected. Because the data collected are comparable to those from earlier medical expenditure surveys conducted in 1977 and 1987, it is possible to analyze long-term trends. Each annual MEPS HC sample size is about 15,000 households. Data can be analyzed at either the person or event level. Data must be weighted to produce national estimates.

The set of households selected for each panel of the MEPS HC is a subsample of households participating in the previous year's National Health Interview Survey (NHIS) conducted by the National Center for Health Statistics. The NHIS sampling frame provides a nationally representative sample of the U.S. civilian noninstitutionalized population. In 2006, the NHIS implemented a new sample design, which included Asian persons, in addition to households with Black and Hispanic persons, in the oversampling of minority populations. NHIS introduced a new sample design in 2016 that discontinued oversampling of these minority groups.

2.0 Medical Provider Component

Upon completion of the household CAPI interview and with permission from the household survey respondents, a sample of medical providers is contacted by telephone to obtain information that household respondents cannot accurately provide. This part of the MEPS is called the Medical Provider Component (MPC), and information is collected on dates of visits, diagnosis and procedure codes, charges, and payments. The Pharmacy Component (PC), a subcomponent of the MPC, does not collect charges or diagnosis and procedure codes but does collect drug detail information, including National Drug Code (NDC) and medicine name, as well as amounts of payment.
The MPC is not designed to yield national estimates. It is primarily used as an imputation source to supplement/replace household-reported expenditure information.

3.0 Survey Management and Data Collection

MEPS HC and MPC data are collected under the authority of the Public Health Service Act. Data are collected under contract with Westat, Inc. (MEPS HC) and Research Triangle Institute (MEPS MPC). Data sets and summary statistics are edited and published in accordance with the confidentiality provisions of the Public Health Service Act and the Privacy Act. The National Center for Health Statistics (NCHS) provides consultation and technical assistance.

As soon as data collection and editing are completed, the MEPS survey data are released to the public in staged releases of micro-data files and tables via the MEPS website. Additional information on MEPS is available from the MEPS project manager or the MEPS public use data manager at the Center for Financing, Access, and Cost Trends, Agency for Healthcare Research and Quality, 5600 Fishers Lane, Rockville, MD 20857 (301-427-1406).

C. Technical and Programming Information

1.0 General Information

This documentation describes the 2020 food security data file from the Medical Expenditure Panel Survey Household Component (MEPS HC). Released as an ASCII file (with related SAS, SPSS, Stata, and R programming statements and data user information), a SAS dataset, a SAS transport dataset, a Stata dataset, and an Excel file, this public use file provides information collected on a nationally representative sample of the civilian noninstitutionalized population of the United States for calendar year 2020. The file contains 17 variables and has a logical record length of 57 with an additional 2-byte carriage return/line feed at the end of each record. This file consists of MEPS survey data obtained in Round 6 of Panel 23, Round 4 of Panel 24, and Round 2 of Panel 25, and contains variables pertaining to food security. The following documentation offers a brief overview of the types and levels of data provided, the content and structure of the files, and programming information. It contains the following sections:
Both weighted and unweighted frequencies of most variables included in the 2020 food security data file are provided in the accompanying codebook file. The exceptions are the weight variable and the variance estimation variables, for which only unweighted frequencies are included in the accompanying codebook file. See the Weights Variables list in Section D, Variable-Source Crosswalk. A database of all MEPS products released to date can be found on the MEPS website.

2.0 Data File Information

This public use dataset contains variables and frequency distributions associated with the 11,992 households that participated in the MEPS Household Component in 2020. These households received a positive family-level weight and were part of one of the three MEPS panels for which food security data were collected: Round 6 of Panel 23, Round 4 of Panel 24, or Round 2 of Panel 25.

2.1 Codebook Structure

The codebook and data file sequence lists variables in the following order:
2.2 Reserved Codes
As part of the MEPS instrument design change in the spring of 2018, -9 (NOT ASCERTAINED) was removed from the MEPS instrument. This change took effect with Panel 23 Round 1, Panel 22 Round 3, and Panel 21 Round 5 and continues in subsequent panels and rounds. Cases that previously would have contained -9 (NOT ASCERTAINED) in MEPS variables are now distributed between -8 (DK) and -15 (CANNOT BE COMPUTED). Most of the cases that were previously -9 (NOT ASCERTAINED) are now assigned -8 (DK). However, -15 (CANNOT BE COMPUTED) is assigned for MEPS variables that are constructed from MEPS instrument variables when there is not enough information from the instrument to calculate the constructed variables. This lack of information is often the result of skip patterns in the data or of missing information resulting from -7 (REFUSED) or -8 (DK). Also note that reserved code -8 previously identified cases where the respondent chose "don't know" for a question; it now represents a broader category that includes cases where either the information from the question was not ascertained or the respondent chose "don't know."

2.3 Codebook Format

This codebook describes an ASCII data set (although the data are also provided as a SAS data set, a SAS transport file, a Stata data set, and an Excel file) and provides the following programming identifiers for each variable:
2.4 Variable Naming

Variable names reflect the content of the variable, within an eight-character limit. Historically, round dates have been indicated by two numbers following the variable name: the first number representing the round for second-panel persons (Panel 24) and the second number representing the round for first-panel persons (Panel 25). The variable names in this 2020 file have not been renamed from prior years; despite the addition of Round 6 of Panel 23, the round number (6) is not included in the variable names. Variables contained in this delivery were derived either from the questionnaire itself or from the CAPI. The source of each variable is identified in Appendix 1, "Variable-Source Crosswalk." Sources for each variable are indicated in one of three ways: (1) variables derived from CAPI or assigned in sampling are so indicated; (2) variables collected at one or more specific questions have those question numbers and questionnaire sections indicated in the "SOURCE" column; and (3) variables constructed from multiple questions using complex algorithms are labeled "Constructed" in the "SOURCE" column.

2.5 File Contents

2.5.1 Survey Administration Variables (HOMEIDX - RULETR42)

HOMEIDX uniquely identifies each household on the file and consists of the Dwelling Unit ID (DUID) followed by the Reporting Unit (RU) letter and the round number. The definitions of Dwelling Units (DUs) in the MEPS Household Survey are generally consistent with the definitions employed for the National Health Interview Survey (NHIS). The DUID is a seven-digit number consisting of a two-digit panel number followed by a five-digit random number assigned after the case was sampled for MEPS.

PANEL is a constructed variable that specifies the panel number for the person: Panel 23, Panel 24, or Panel 25 for each person on the file. Panel 23 is the panel that started in 2018, Panel 24 is the panel that started in 2019, and Panel 25 is the panel that started in 2020.

An RU is a person or group of persons in the sampled DU who are related by blood, marriage, adoption, or other family association. Each RU was interviewed as a single entity for MEPS; thus, the RU serves chiefly as a family-based "survey" operations unit rather than an analytic unit. Members of each RU within the DU in Round 6, Round 4, or Round 2 are identified by the variable RULETR42. Households are eligible for the Food Security PUF if the MEPS interview was completed by an RU member and if the household is not a student RU.
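The identifiers just described can be unpacked directly in any of the packages for which programming statements are released. The following is a minimal sketch in R; the file name h221.dta, the data frame name fs, and the use of the Stata release are illustrative assumptions rather than official conventions.

    # Read the Stata release of the 2020 Food Security file (file name assumed)
    library(haven)
    fs <- read_dta("h221.dta")

    # HOMEIDX = DUID (2-digit panel number + 5-digit random number) followed by
    # the RU letter and the round number (Section 2.5.1)
    fs$HOMEIDX   <- as.character(fs$HOMEIDX)
    duid         <- substr(fs$HOMEIDX, 1, 7)    # dwelling unit ID
    panel_number <- substr(fs$HOMEIDX, 1, 2)    # 23, 24, or 25
    ru_letter    <- substr(fs$HOMEIDX, 8, 8)    # reporting unit letter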
2.5.2 Food Security Variables (FSOUT42 - FSNEDY42)

Respondents were asked the following (the conditions under which a variable is coded "-1 Inapplicable" are noted in parentheses):

FSOUT42 - how often in the last 30 days anyone in the household worried whether food would run out before getting money to buy more

FSLAST42 - how often in the last 30 days the food purchased didn't last and the person/household didn't have money to get more

FSAFRD42 - how often in the last 30 days the person/household could not afford to eat balanced meals

FSSKIP42 - in the last 30 days, did the person/household reduce or skip meals because there wasn't enough money for food (coded as "-1 Inapplicable" when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15)

FSSKDY42 - how many meals were skipped in the last 30 days (coded as "-1 Inapplicable" when FSSKIP42 = 2, -7, -8, or -15, or when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15)

FSLESS42 - in the last 30 days, did the person/household ever eat less because there wasn't enough money for food (coded as "-1 Inapplicable" when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15)

FSHGRY42 - in the last 30 days, was the person/household ever hungry but didn't eat because there wasn't enough money for food (coded as "-1 Inapplicable" when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15)

FSWTLS42 - in the last 30 days, did anyone in the household lose weight because there wasn't enough money for food (coded as "-1 Inapplicable" when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15)

FSNEAT42 - in the last 30 days, did anyone in the household not eat for a whole day because there wasn't enough money for food (coded as "-1 Inapplicable" when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15; or when FSLESS42, FSHGRY42, and FSWTLS42 = 2, -7, -8, or -15)

FSNEDY42 - how many days in the last 30 days anyone in the household had not eaten for a whole day because there wasn't enough money for food (coded as "-1 Inapplicable" when FSOUT42, FSLAST42, and FSAFRD42 = 3, -7, -8, or -15; or when FSLESS42, FSHGRY42, and FSWTLS42 = 2, -7, -8, or -15; or when FSNEAT42 = 2, -7, -8, or -15)

2.6 Linking to Other Files

2.6.1 MEPS Public Use Files

This Food Security file can be linked to the 2020 Full Year Consolidated file by DUID and RULETR42 to obtain additional data for the families included in this file (a minimal R sketch of this linkage is provided below). The reference person of the RU can be identified in the Consolidated data file by the variable REFPRS42.

2.6.2 National Health Interview Survey

The set of households selected for MEPS is a subsample of those participating in the National Health Interview Survey (NHIS); thus, each MEPS panel can also be linked back to the previous year's NHIS public use data files. For information on obtaining MEPS/NHIS link files, please see the AHRQ website.

2.6.3 Longitudinal Analysis

Panel-specific longitudinal files are available for downloading in the data section of the MEPS website. As has been done routinely in past years, the longitudinal file for Panel 24 comprises MEPS survey data obtained in Rounds 1 through 5 of the panel and can be used to analyze changes over a two-year period. Unlike past years, in 2020 Panel 23 had data collected for a third year. As such, two-year and three-year longitudinal files will be developed for Panel 23; these can be used to analyze changes over the corresponding two-year or three-year period.
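As a complement to Section 2.6.1, the sketch below shows one way to carry out the family-level linkage in R. The file name h224.dta and the data frame names fs and consol are assumptions; DUID and RULETR42 are assumed to be present on both files (DUID can otherwise be recovered from HOMEIDX as sketched earlier), and the recoding of reserved codes is one common preparatory step rather than part of the official linkage.

    library(haven)
    consol <- read_dta("h224.dta")   # FY 2020 Consolidated PUF (HC-224); file name assumed

    # Optionally set the reserved codes -7 (REFUSED), -8 (DK), and -15 (CANNOT BE
    # COMPUTED) to missing; -1 (INAPPLICABLE) reflects the skip patterns described
    # in Section 2.5.2 and may warrant separate handling
    fs_items <- c("FSOUT42", "FSLAST42", "FSAFRD42", "FSSKIP42", "FSSKDY42",
                  "FSLESS42", "FSHGRY42", "FSWTLS42", "FSNEAT42", "FSNEDY42")
    fs[fs_items] <- lapply(fs[fs_items],
                           function(x) replace(x, x %in% c(-7, -8, -15), NA))

    # One-to-many, family-level link: each Food Security record matches the
    # Consolidated records that share its dwelling unit and RU letter
    linked <- merge(fs, consol, by = c("DUID", "RULETR42"))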
Variables in the longitudinal files pertaining to survey administration, demographics, employment, health status, disability days, quality of care, patient satisfaction, health insurance, and medical care use and expenditures were obtained from the MEPS full-year Consolidated files for the years covered by each panel. For more details or to download the data files, please see the Longitudinal Weight Files at the AHRQ website.

3.0 Survey Sample Information

3.1 Discussion of Pandemic Effects on Quality of 2020 MEPS Data

3.1.1 Summary

Data collection for in-person sample surveys in 2020 presented real challenges after the onset of the COVID-19 pandemic at a national level in mid-March of that year. After major modifications to the standard MEPS study design, it was possible to collect data safely, but there were naturally concerns about the quality of the data after such modifications. Some issues related to data quality were identified and are discussed below. As with most in-person surveys conducted in 2020, researchers are counseled to take care in interpreting 2020 estimates and in comparing them with estimates for other years.

3.1.2 Overview

The onset of the COVID-19 pandemic in 2020 had a major impact on the MEPS Household Component (MEPS HC), as it did on most major federal surveys and, of course, American life generally. The following discussion describes 1) the general impact of the pandemic on three major federal surveys (the effects on two of which also affect MEPS); 2) modifications to the MEPS sample design and field operations in 2020 due to the pandemic; and 3) potential data quality issues in the Full Year (FY) 2020 MEPS data related to the COVID-19 pandemic.

3.1.3 The Impact of the Pandemic on Some Major Federal Surveys

Many important federal surveys were collecting data when much of the nation shut down in the face of the pandemic in March 2020. Among them were the Current Population Survey (CPS), the American Community Survey (ACS), and the National Health Interview Survey (NHIS). The ACS and the NHIS field new samples each year. The CPS includes rotating panels, meaning some of the sampled households fielded had participated in prior years while others were fresh. Two of these surveys have important roles in MEPS: estimates for CPS subgroups serve as benchmarks for the MEPS weighting process (referred to below as "raking control totals"), while households fielded for Round 1 of MEPS each year are selected as a subsample of the NHIS responding households from the prior year.

Because data collection in 2020 occurred under such unusual circumstances, all three of these surveys have reported bias concerns. (In fact, the ACS decided not to release a standard database for 2020 due to the uncertain quality of the data, while the CPS and the NHIS released data but included reports discussing concerns about bias.) All three surveys have reported evidence of nonresponse bias; specifically, households at higher socio-economic levels were relatively more likely to respond, and the sample weighting was unable to fully compensate for this. As a result, analysts have been cautioned about the accuracy of survey estimates and the ability to compare resulting estimates with estimates obtained in the years prior to the pandemic. The quality of CPS data is of particular importance to the Full Year 2020 MEPS PUFs, as CPS estimates serve as the control totals for the raking component of the MEPS weighting process.
These control totals are based on the following demographic variables: age, sex, race/ethnicity, region, MSA status, educational attainment, and poverty status. The CPS estimates used in developing the FY 2020 MEPS PUF weights that were based on age, sex, race/ethnicity, region, and MSA status were evaluated by the Census Bureau and determined to be of high quality. However, similar evaluations of the corresponding CPS estimates associated with educational attainment and poverty status found that these estimates suffered from bias. A set of references discussing the fielding of these three surveys during the pandemic and the resulting bias concerns can be found in the References section of this document.

3.1.4 Modifications to the MEPS HC 2020 Sample Design and Implementation Effort in Response to the Pandemic

For the MEPS HC, face-to-face interviewing ceased due to the COVID-19 pandemic on March 17, 2020. At that time, there were two MEPS panels in the field for which 2020 data were being collected: Round 1 of Panel 25 and Round 3 of Panel 24. The sampled households for Panel 25 were being contacted and asked to participate in MEPS for the first time, while those from Panel 24 had already participated in MEPS for two rounds. A third MEPS panel was also in the field in early 2020, Round 5 of Panel 23, collecting data for the last portion of 2019.

In developing a plan for how best to resume MEPS data collection, the primary issues were how to do so safely for both sampled household members and interviewers and the potential impact on data quality. Telephone data collection, although not the preferred method of data collection for the MEPS HC in general, was the natural option because it did not require in-person contact with respondents and could be implemented relatively quickly. The impact of changing to telephone interviewing on both response rates and data quality was expected to be larger for Panel 25 Round 1 (e.g., those respondents had no experience with reporting health care events in the recent past). At the time in-person interviewing stopped in mid-March 2020, completion rates for Panels 23 and 24 were substantially higher than those for Panel 25.

AHRQ decided to field Panel 23 for at least one more year, asking Panel 23 respondents if they would be open to further participation in MEPS in newly added Rounds 6 and 7. Extending Panel 23 was meant both to offset the decrease in the number of cases in the FY 2020 data related to lower expected sample yields for Panel 25 and to improve data quality by retaining a set of participants who were familiar with MEPS. These decisions required major changes in survey operations, including adding a fall Panel 23 Round 6 interview covering all 2020 events from January 1, 2020 to the date of the interview.

3.1.5 Data Quality Issues for MEPS for FY 2020

Numerous analyses were conducted to examine potential impacts on data quality and to gain a more complete understanding of these issues. Zuvekas and Kashihara (2021) discuss some of these analyses and provide additional background information on how the MEPS study design was modified in 2020 in response to the pandemic. Three sources of potential bias that were identified are noted here: the long recall period for Round 6 of Panel 23; the switch from in-person to telephone interviewing, which likely had a larger impact on Panel 25; and the impact of CPS bias on the MEPS weights. Each is considered in turn.
Comparisons of health care utilization data for Panel 24 and Panel 23 indicated that the extended reference period for Panel 23 Round 6 may have resulted in recall issues for respondents. Round 6 was initially fielded in the late summer and early fall of 2020, and because the Round 5 reference period ended on December 31, 2019, the recall period for health care events and related information extended back to January 1, 2020, much longer than for typical MEPS rounds. For Panel 23 Round 6 respondents, events of a less salient nature, such as dental visits and office-based physician visits, occurring in early 2020 were under-reported. Underreporting was confirmed both through an examination of differential utilization across 2020 for Panel 23 respondents and through statistical comparisons of Panel 23 and Panel 24 event estimates. Adjustments were made to the sample weights for Panel 23 to help address this concern. Details on these adjustments can be found in Section 3.3.1.

Comparisons of Panel 25 with Panel 24 health care utilization data found that the differences in estimates reached statistical significance for several event types, with those from Panel 25 generally being the higher. The same comparisons between first- and second-year panels in MEPS in recent years showed relatively few such differences, with no differences at all in 2019.

Finally, AHRQ decided to calibrate, via raking, the FY 2020 Consolidated PUF weights to control totals reflecting CPS 2021 poverty status data. As discussed earlier, bias was identified by the Census Bureau in the 2020 and 2021 CPS income data and correlates. Nevertheless, the Census Bureau decided to use its standard sample weighting approach for both the 2020 and 2021 CPS ASEC data sets while recognizing some deficiencies in the nonresponse adjustment approach for the two years as a result of data collection during the pandemic. Similarly, MEPS has used poverty status based on the CPS estimates for calibration for many years and continued to do so for the 2020 Full Year Consolidated PUF, as it was decided that the advantages of doing so outweighed the disadvantages.

3.1.6 Discussion and Guidance

The additional procedures for developing person-level and family-level final weights for the 2020 Consolidated MEPS data were designed to correct for potential biases in the data due to changes in data collection and response bias. However, evaluations of MEPS data quality in 2020, corroborated in analyses of other federal surveys fielded in 2020, suggest that users of the MEPS FY 2020 Consolidated PUF should exercise caution when interpreting estimates and assessing analyses based on these data, as well as when comparing 2020 estimates to those of prior years.

3.2 Background on Sample Design and Response Rates

The MEPS is designed to produce estimates at the national and regional level over time for the civilian, noninstitutionalized population of the United States and some subpopulations of interest. The data in this public use file pertain to calendar year 2020. The data were collected in Rounds 1, 2, and 3 for MEPS Panel 25, Rounds 3, 4, and 5 for MEPS Panel 24, and Rounds 6 and 7 for MEPS Panel 23. As usual, Round 3 for a MEPS panel (this time Panel 25) is designed to overlap two calendar years, as illustrated below. However, with the fielding of a third panel in 2020 (as indicated in the data quality discussion in Section 3.1), the structure of other rounds has changed.
For 2020, Panel 23 Round 6 covers the reference period from January 1, 2020 to the date of the Round 6 interview (as discussed in the data quality subsection). The 2020 food security data were collected only in Round 6 of Panel 23, Round 4 of Panel 24, and Round 2 of Panel 25.

A sample design feature shared by Panel 23, Panel 24, and Panel 25 involved the partitioning of the sample domain "Other" (serving as the catchall stratum and consisting mainly of households with "White" members) into two sample domains. This was done for the first time in Panel 16. The two domains distinguished between households characterized as "complete" respondents to the NHIS and those characterized as "partial completes." NHIS "partial completes" typically have a lower response rate to MEPS, and for all three MEPS panels the "partial" domain was sampled at a lower rate than the "complete" domain. This approach has served to reduce survey costs, since the "partials" tend to have higher costs in gaining survey participation, but it has also increased sample variance due to the resulting increased variance in sampling rates. Starting with Panel 25, the "Other, Partial" domain includes the NHIS households that provided only a roster of household members. For detailed information on the MEPS sample design, see Chowdhury et al. (2019).

3.2.1 MEPS Linked to the National Health Interview Survey (NHIS)

Each responding household in this 2020 MEPS dataset is associated with one of three separate and overlapping MEPS panels: MEPS Panel 23, MEPS Panel 24, and MEPS Panel 25. These panels consist of subsamples of households participating in the 2017, 2018, and 2019 NHIS, respectively. The Full Year 2018 PUF was the first in which both MEPS panels reflected the new NHIS sample design first implemented in 2016. Whenever there is a change in sample or study design, it is good survey practice to assess whether such a change could affect the sample estimates. For example, increased coverage of the target populations with an updated sample design based on data from the latest Census can improve the accuracy of the sample estimates. MEPS estimates have been and will continue to be evaluated to determine whether an important change in the survey estimates might be associated with a change in design. Discussion of the potential effects of MEPS design changes in 2020 appears in Section 3.1. Background on the two NHIS sample designs of interest here is provided next.

Background on the NHIS Sample Redesign Implemented in 2016

Beginning in 2016, NCHS implemented another new sample design for the NHIS, which differed substantially from the prior design. Each of the 50 states as well as the District of Columbia serves as an explicit stratum for sample selection purposes, with the intent of providing the capability of state-level NHIS estimates obtained through pooling across years when the sample size for a single year would result in unreliable estimates. In contrast to the previous design, households in areas with relatively high concentrations of minorities are not oversampled. PSUs are still formed at the county level. However, within sampled PSUs, the clusters of addresses sampled for each year of the NHIS are no longer in the form of segments (consisting of one or more census blocks) as they were in the previous NHIS designs. For the 2016 NHIS, each such cluster consisted of roughly 25 subclusters selected using random systematic sampling across the full geography of the PSU.
Each subcluster is generally made up of four nearby addresses, for roughly 100 addresses per cluster in all. The number of subclusters per cluster can vary from year to year. Another major change is that the list of DUs (addresses) was obtained from the Computerized Delivery Sequence File (CDSF) of the U.S. Postal Service, a different approach than the standard listing process for area probability samples used in the pre-2016 designs. While addresses in the CDSF provide very high coverage of most areas of the country, coverage in rural areas can be somewhat lower. For rural areas where this was a concern, address lists were created through the conventional listing process. A description of the NHIS sample design is found on the NHIS website.

Panel 23 Household Sample Size

A subsample of 9,700 households (occupied DUs) was selected for MEPS Panel 23 from NHIS responding households in 2017, of which 9,694 were fielded for MEPS after the elimination of any units characterized as ineligible for fielding.

Panel 24 Household Sample Size

A subsample of 9,700 households was randomly selected for MEPS Panel 24 from the households responding to the 2018 NHIS, of which 9,684 were fielded for MEPS after the elimination of any units characterized as ineligible for fielding.

Panel 25 Household Sample Size

A subsample of 9,900 households was randomly selected for MEPS Panel 25 from the households responding to the 2019 NHIS, of which 9,888 were fielded for MEPS after the elimination of any units characterized as ineligible for fielding.

Implications of the New Design for MEPS Estimates

Under the new design, MEPS sampled households reflect the clustering of the NHIS, as described above, but to a somewhat lesser degree due to the sampling from NHIS respondents. Because the NHIS sample is spread in small subclusters across the PSU and the sampling is limited to NHIS respondents, the impact of clustering on the variance of MEPS estimates may be more limited. Also, in contrast to the previous design, the NHIS sampling rates at the address level currently do not vary due to oversampling of minorities (although this could change in subsequent years). On balance, the overall variation in sampling rates/weights at the national level for the NHIS is expected to be lower, with a corresponding positive impact on the precision of MEPS estimates. However, with a reduction in the sample sizes of minority households, precision levels of MEPS estimates for Blacks, Hispanics, and Asians may be reduced to some extent.

3.2.2 Sample Weights and Variance Estimation

In the dataset "MEPS HC-221: 2020 Food Security Data File," a weight variable is provided for generating MEPS estimates of totals, means, percentages, and rates for families in the civilian noninstitutionalized population. Procedures and considerations associated with the construction and interpretation of family estimates using these and other variables are discussed below.

3.3 The MEPS Sampling Process and Response Rates: An Overview

For most MEPS panels, a sample representing about three-eighths of the NHIS responding households is made available for use in MEPS. This was the case for MEPS Panel 23, Panel 24, and Panel 25. Because the MEPS subsampling has to be done soon after NHIS responding households are identified, a small percentage of the NHIS households initially characterized as NHIS respondents are later classified as nonrespondents for the purposes of NHIS data analysis.
This actually serves to increase the overall MEPS response rate slightly, since the percentage of NHIS households designated for use in MEPS (all those characterized initially as respondents from the NHIS panels and quarters used by MEPS for a given year) is slightly larger than the final NHIS household-level response rate, and some of these later-reclassified NHIS nonresponding households do participate in MEPS. However, as a result, these NHIS nonrespondents who are MEPS participants have no NHIS data available to link with MEPS data.

Once the MEPS sample is selected from among the NHIS households characterized as NHIS respondents, RUs consisting entirely of military personnel are deleted from the sample; military personnel not living in the same RU as civilians are ineligible for MEPS. After these exclusions, all RUs associated with households selected from among those identified as NHIS responding households are fielded in the first round of MEPS.

Table 3.1 shows in Rows A, B, and C the three informational components just discussed. Row A indicates the percentage of NHIS households eligible for MEPS. Row B indicates the number of NHIS households sampled for MEPS. Row C indicates the number of sampled households actually fielded for MEPS (after dropping the all-military RUs discussed above and a small number of NHIS households sampled in error). Note that all response rates discussed here are unweighted.
*Among the panels and quarters of the NHIS allocated to MEPS, the percentage of households that were considered to be NHIS respondents at the time the MEPS sample was selected.

3.3.1 Response Rates

To produce annual health care estimates for calendar year 2020 based on the full MEPS sample data from MEPS Panel 23, Panel 24, and Panel 25, the three panels are combined. More specifically, full calendar year 2020 data collected in Rounds 6 and 7 for the MEPS Panel 23 sample and Rounds 3 through 5 for the MEPS Panel 24 sample are pooled with data from the first three rounds of data collection for the MEPS Panel 25 sample (the general approach is described below). As mentioned above, all response rates discussed here are unweighted.

To understand the calculation of MEPS response rates, some features related to MEPS data collection should be noted. When an RU is visited for a round of data collection, changes in RU membership are identified. Such changes include the formation of student RUs as well as other new RUs created when RU members from a previous round have moved to another location in the U.S. Thus, the number of RUs eligible for MEPS interviewing in a given round is determined only after data collection is fully completed.

The ratio of the number of RUs completing the MEPS interview in a given round to the number of RUs characterized as eligible to complete the interview for that round represents the "conditional" response rate for that round, expressed as a proportion. It is "conditional" in that it pertains to the set of RUs characterized as eligible for MEPS for that round and thus is "conditioned" on prior participation rather than representing the overall response rate through that round. For example, in Table 3.1, for Panel 25 Round 2 the ratio of 4,677 (Row G) to 5,958 (Row F), multiplied by 100, is the response rate for that round (78.5 percent), conditioned on the set of RUs characterized as eligible for MEPS for that round. Taking the product of the percentage of the NHIS sample eligible for MEPS (Row A) and the ratios for a consecutive set of MEPS rounds beginning with Round 1 produces the overall response rate through the last MEPS round specified.

The overall unweighted response rate for 2020 for the combined sample, after pooling the respondents across the three panels, was obtained by computing the product of the compositing factor associated with each panel and the corresponding overall panel response rate, and then summing the three products. Panel 25 represents about 34.6 percent of the combined sample size, Panel 24 about 35.9 percent, and Panel 23 the remaining 29.4 percent. Thus, the combined response rate of 27.6 percent was computed as 0.29 times 28.0 (the overall Panel 23 response rate through Round 7), plus 0.36 times 28.8 (the overall Panel 24 response rate through Round 5), plus 0.35 times 25.9 (the overall Panel 25 response rate through Round 3); a small numeric check of this calculation is shown below. The overall response rate of 27.6 percent for 2020 is substantially lower than that for 2019 (39.5 percent), reflecting the impact of the pandemic on data collection efforts.

3.3.2 Panel 25 Response Rates

A total of 9,888 households for MEPS Panel 25 Round 1 were fielded in 2020 (Row C of Table 3.1), a randomly selected subsample of the households responding to the 2019 National Health Interview Survey (NHIS). Table 3.1 shows the number of RUs eligible for interviewing in each round of Panel 25 as well as the number of RUs completing the MEPS interview.
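To make the response-rate arithmetic of Section 3.3.1 concrete, the short R check below reproduces the figures quoted above; every number is taken directly from the text and Table 3.1.

    # Conditional response rate, Panel 25 Round 2 (Row G over Row F of Table 3.1)
    100 * 4677 / 5958                  # about 78.5 percent

    # Combined 2020 rate: compositing factors times overall panel response rates
    factors <- c(p23 = 0.29, p24 = 0.36, p25 = 0.35)
    rates   <- c(p23 = 28.0, p24 = 28.8, p25 = 25.9)   # percent
    sum(factors * rates)               # about 27.6 percent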
Computing the individual round "conditional" response rates as described in Section 3.3.1 and then taking the product of these three response rates and the factor 65.7 (the percentage of the NHIS sampled households designated for use in selecting a sample of households for MEPS) yields an overall response rate of 25.9 percent for Panel 25 through Round 3.

3.3.3 Panel 24 Response Rates

A total of 9,684 households for MEPS Panel 24 were fielded in 2019 (as indicated in Row C of Table 3.1), a randomly selected subsample of the households responding to the 2018 National Health Interview Survey (NHIS). Table 3.1 shows the number of RUs eligible for interviewing and the number completing the interview for all five rounds of Panel 24. The overall response rate for Panel 24 was computed in a similar fashion to that of Panel 25, but covering all five rounds of MEPS interviewing as well as the factor representing the percentage of NHIS sampled households eligible for MEPS. The overall response rate for Panel 24 through Round 5 is 28.8 percent.

3.3.4 Panel 23 Response Rates

A total of 9,694 households for MEPS Panel 23 were fielded in 2018 (as indicated in Row C of Table 3.1), a randomly selected subsample of the households responding to the 2017 National Health Interview Survey (NHIS). Table 3.1 shows the number of RUs eligible for interviewing and the number completing the interview for all seven rounds of Panel 23. The overall response rate for Panel 23 was computed in a similar fashion to that of Panel 24, but covering all seven rounds of MEPS interviewing as well as the factor representing the percentage of NHIS sampled households eligible for MEPS. The overall response rate for Panel 23 through Round 7 is 28.0 percent.

3.3.5 Annual (Combined Panel) Response Rate

A combined panel response rate for the survey respondents in this data set is obtained by taking a weighted average of the panel-specific response rates. The Panel 23 response rate was weighted by a factor of 0.29, the Panel 24 response rate by a factor of 0.36, and the Panel 25 response rate by a factor of 0.35, reflecting approximately the distribution of the overall sample among the three panels. The resulting response rate for the combined panels was computed as (0.29 x 28.0) plus (0.36 x 28.8) plus (0.35 x 25.9), or 27.6 percent (as shown in Table 3.1).

3.3.6 Oversampling

Oversampling is a feature of the MEPS sample design, helping to increase the precision of estimates for some subgroups of interest. Before going into details related to MEPS, the concept of oversampling is discussed. In a sample where all persons in a population are selected with the same probability and survey coverage of the population is high, the sample distribution is expected to be proportionate to the population distribution. For example, if Hispanics represent 15 percent of the general population, one would expect roughly 15 percent of the persons sampled to be Hispanic. However, in order to improve the precision of estimates for specific subgroups of a population, one might decide to select samples from those subgroups at higher rates than the remainder of the population. Thus, one might select Hispanics at twice the rate (i.e., at double the probability) of persons not oversampled. As a result, an oversampled subgroup comprises a higher proportion of the sample than it represents in the general population. Sample weights ensure that population estimates are not distorted by a disproportionate contribution from oversampled subgroups.
Base sample weights for oversampled groups will be smaller than for the portion of the population not oversampled. For example, if a subgroup is sampled at roughly twice the rate of the remainder of the population, members of the oversampled subgroup will receive base or initial sample weights (prior to nonresponse or poststratification adjustments) that are roughly half the size of those for the group not oversampled. As mentioned above, oversampling is implemented to increase sample sizes and thus improve the precision of survey estimates for particular subgroups of the population. The "cost" of oversampling is that the precision of estimates for the general population and for subgroups not oversampled will be reduced to some extent compared to the precision one could have achieved if the same overall sample size were selected without any oversampling. A toy illustration follows at the end of this subsection.

The NHIS no longer oversamples households with members who are Asian, Black, or Hispanic. Nevertheless, these minority groups are still of analytic interest for MEPS. As a result, for all three panels, all households in the Asian, Hispanic, and Black domains were sampled with certainty (i.e., all households assigned to those domains were included in the MEPS). In addition, all households in Panel 23 that had a member who was a veteran were selected with certainty. Among all remaining households for Panel 23, the "Other, complete" domain was sampled at a rate of about 69 percent while the "Other, partial complete" domain was sampled at a rate of about 43 percent. For Panel 24, the corresponding sampling rates for the "Other, complete" and "Other, partial complete" domains were about 79 percent and 50 percent, respectively. For Panel 25, the corresponding sampling rates were about 77 percent and 50 percent, respectively. The somewhat lower sampling rates for Panel 23 in the two "Other" domains arose due to the oversampling of veterans in that panel: with a specified overall sample size of 9,700, fewer households were needed from those assigned to the "Other" domains.

Within the "noncertainty" strata (the "Other" domains) for all panels, responding NHIS households were selected for MEPS using a systematic sample selection procedure from among those eligible. The selection of the households was with probability proportionate to size (pps), where the size measure was the inverse of the NHIS initial probability of selection. The pps sampling was undertaken to help reduce the variability in the MEPS weights incurred due to the variability of the NHIS sampling rates.

A note with respect to the interpretation of MEPS response rates, which are unweighted: sample allocations across sample domains typically change from one MEPS panel to another, and the sample domains used may also vary by panel, as is the case for Panel 23 versus Panel 24 and Panel 25. When comparing unweighted measures (e.g., response rates) between panels and years, one should take such differences into account. Suppose, for example, members of one domain have a lower propensity to respond than those of another domain. Then, if that domain has been allocated a higher proportion of the sample, the corresponding panel may have a lower unweighted response rate simply because of the differences in sample allocation.
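The relationship between sampling rates and base weights described above can be illustrated with a toy calculation in R; the selection probabilities are hypothetical and are not MEPS sampling rates.

    # Base weights are the inverse of the selection probabilities, so a domain
    # sampled at twice the rate receives base weights half as large
    p_not_oversampled <- 0.01                  # hypothetical selection probability
    p_oversampled     <- 2 * p_not_oversampled
    c(weight_not_oversampled = 1 / p_not_oversampled,   # 100
      weight_oversampled     = 1 / p_oversampled)       #  50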
3.4 Food Security Weight (FSWT42)

3.4.1 Background and Target Population

The Food Security questions are designed to learn more about existing food concerns among families in the U.S. These questions are incorporated into the MEPS survey instrument in the second round of data collection for each panel in a calendar year. Thus, for calendar year 2020, this took place in Round 6 of Panel 23, Round 4 of Panel 24, and Round 2 of Panel 25. To ensure that the data reflected family circumstances as accurately as possible, data were collected from a member of the RU. For virtually all MEPS interviews, this was the RU respondent. However, a relative handful of MEPS interviews are conducted with proxy respondents. As a result, food security data were not collected for such RUs, and they are not part of the target population for the Food Security weights. It may be noted that such families can be expected to be somewhat different from families generally. For example, a proxy respondent may be called for if his or her elderly parent was too sick to respond or had entered a nursing home.

Some RUs for which Food Security data were obtained may have gone out of scope prior to the end of 2020, while others may have become MEPS nonrespondents. As a result, the MEPS family weights established to reflect MEPS families in 2020 and appearing on the FY 2020 Consolidated PUF do not pertain to the target population associated with the Food Security weights; the Food Security weights were thus established with this in mind. The target population for the Food Security questions in 2020 can be described as MEPS families in the fall of 2020 that did not require a proxy respondent. As a reminder, single-person RUs are considered a MEPS family, as are partners who, though unrelated by marriage, consider themselves a single family unit.

3.4.2 Development of the Food Security Weights

The weights for the 2020 Food Security data were built from the MEPS family weights established for the FY 2020 Consolidated PUF, which already compensate for MEPS nonresponse at the family level across MEPS rounds. To reflect such nonresponse for the Food Security weights, the initial Food Security weight assigned to an RU (MEPS family) was the weight of the corresponding MEPS family at the end of 2020, as established through the weighting of families for the FY 2020 Consolidated PUF. Specifically, these weights were assigned to each responding RU at Rounds 6/4/2 of Panels 23/24/25 where an RU member completed the MEPS interview. MEPS families that responded in Rounds 6/4/2 but were a nonresponding RU at Round 7/5/3 would thus not receive a Food Security weight, as they would not have received a MEPS family weight. Proxy respondents at Rounds 6/4/2 were then removed from further consideration in the weighting process; slightly under 1 percent of the MEPS family population at Rounds 6/4/2 had their MEPS data reported by proxy respondents. Fewer than 20 RU respondents of the families otherwise eligible for the Food Security weights at Rounds 6/4/2 did not answer at least three of the 10 Food Security questions. These were treated as Food Security nonrespondents, and a global adjustment factor was applied to the weights of the respondents to the Food Security questions to determine the final value of the weight variable FSWT42, the 2020 Food Security weight appearing on the 2020 Food Security PUF.
For information on the derivation of FAMWT20F, the weight variable representing the MEPS family population appearing on the 2020 Full Year Consolidated PUF, see MEPS HC-224, the corresponding PUF Documentation. Table 3.2 shows the number of families in the Food Security data file by panel and the weighted total number of the families.
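A tabulation along the lines of Table 3.2 can be produced directly from the file; in this R sketch the data frame name fs is an assumption, while PANEL and FSWT42 are the variables described above.

    # Unweighted family counts and weighted family totals by panel (cf. Table 3.2)
    tapply(fs$FSWT42, fs$PANEL, length)   # number of families on the file, per panel
    tapply(fs$FSWT42, fs$PANEL, sum)      # sum of FSWT42, per panel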
3.5 Variance Estimation

The MEPS is based on a complex sample design. To obtain estimates of variability (such as the standard error of sample estimates or corresponding confidence intervals), analysts need to take into account the complex sample design of MEPS for both person-level and family-level analyses. Several methodologies have been developed for estimating standard errors for surveys with a complex sample design, including the Taylor-series linearization method, balanced repeated replication, and jackknife replication. Various software packages provide analysts with the capability of implementing these methodologies. MEPS analysts most commonly use the Taylor-series approach. Although this data file does not contain replicate weights, the capability of constructing replicate weights using the balanced repeated replication (BRR) methodology is also provided if needed to develop variances for more complex estimators (see Section 3.5.2).

3.5.1 Taylor-series Linearization Method

The variables needed to calculate appropriate standard errors based on the Taylor-series linearization method are included on this and all other MEPS public use files. Software packages that permit the use of the Taylor-series linearization method include SUDAAN, R, Stata, SAS (version 8.2 and higher), and SPSS (version 12.0 and higher). For complete information on the capabilities of a package, analysts should refer to the corresponding software user documentation.

With the Taylor-series linearization method, the variance estimation strata and the variance estimation PSUs within these strata must be specified. The variables VARSTR and VARPSU on this MEPS data file identify the sampling strata and primary sampling units required by the variance estimation programs. Specifying a "with replacement" design in one of the previously mentioned software packages will provide estimated standard errors appropriate for assessing the variability of MEPS survey estimates (a minimal R sketch is provided below). It should be noted that the number of degrees of freedom associated with estimates of variability indicated by such a package may not appropriately reflect the number available. For variables of interest distributed throughout the country (and thus throughout the MEPS sample PSUs), one can generally expect at least 100 degrees of freedom to be associated with the estimated standard errors for national estimates based on this MEPS database.

Prior to 2002, MEPS variance strata and PSUs were developed independently from year to year, and the last two characters of the strata and PSU variable names denoted the year. Beginning with the 2002 Point-in-Time PUF, the approach changed with the intention that variance strata and PSUs would be developed to be compatible with all future PUFs until the NHIS design changed. Thus, when pooling data across years from 2002 through the Panel 11 component of the 2007 files, the variance strata and PSU variables provided can be used without modification for variance estimation purposes for estimates covering multiple years of data. There were 203 variance estimation strata, each with either two or three variance estimation PSUs. Beginning with Panel 12 of the 2007 files, a new set of variance strata and PSUs was developed because of the introduction of a new NHIS design; starting with Panel 12 there are 165 variance strata, with either two or three variance estimation PSUs per stratum.
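Returning to the design specification described at the start of Section 3.5.1, the sketch below shows a Taylor-series setup in R using the survey package; the data frame name fs is an assumption, and omitting a finite population correction corresponds to the "with replacement" specification described above.

    library(survey)
    options(survey.lonely.psu = "adjust")   # guard against single-PSU strata after subsetting

    # Strata from VARSTR, PSUs from VARPSU, family weight FSWT42
    fs_design <- svydesign(ids = ~VARPSU, strata = ~VARSTR, weights = ~FSWT42,
                           data = fs, nest = TRUE)

    # Weighted distribution of one food security item, with standard errors
    svymean(~factor(FSOUT42), fs_design, na.rm = TRUE)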
Therefore, there are a total of 368 (203 + 165) variance strata in the 2007 Full Year file, as it consists of two panels that were selected under two independent NHIS sample designs. Since both MEPS panels in the Full Year files from 2008 through 2016 were based on the newer NHIS design, those files have only 165 variance strata. These variance strata (VARSTR values) are numbered from 1001 to 1165 so that they can be readily distinguished from those developed under the former NHIS sample design in the event that data are pooled across several years.

As discussed, a complete change was made to the NHIS sample design in 2016, effectively changing the MEPS design beginning with calendar year 2017. There were 117 variance strata originally formed under this new design, intended for use until the next fully new NHIS design was implemented. In order to make the pooling of data across multiple years of MEPS more straightforward, the numbering system for the variance strata changed: the strata associated with the new design (implemented in 2016) were numbered from 2001 to 2117. However, the NHIS sample design implemented in 2016 was further modified in 2018. With the modification in the 2018 NHIS sample design, the MEPS variance structure for the 2019 Full Year file also had to be modified, reducing the number of variance strata to 105. Consistency was maintained with the prior structure in that the 2019 Full Year file variance strata were also numbered within the range 2001-2117, although there are now gaps in the values assigned within this range. Due to the modification, each stratum can contain up to five variance estimation PSUs.

Some analysts may be interested in pooling data across multiple years of MEPS. As noted on the cover page of this document, due to data quality issues arising from collecting data during the COVID-19 pandemic in 2020, caution should be taken when interpreting the results of such pooling. If pooling across years is undertaken, it should be noted that, to obtain appropriate standard errors, it is necessary to specify a common variance structure. Prior to 2002, each annual MEPS public use file was released with a variance structure unique to the particular MEPS sample in that year. Starting in 2002, the annual MEPS public use files were released with a common variance structure that allowed users to pool data from 2002 through 2018. However, with the need to modify the variance structure beginning with 2019, this can no longer be routinely done. To ensure that variance strata are identified appropriately for variance estimation purposes when pooling MEPS data across several years, one can proceed as follows:
3.5.2 Balanced Repeated Replication (BRR) Method

BRR replicate weights are not provided on this MEPS PUF for the purposes of variance estimation. However, a file containing a BRR replication structure is made available so that users can form replicate weights, if desired, from the final MEPS weight to compute variances of MEPS estimates using either the BRR or Fay's modified BRR (Fay 1989) method. Replicate weights are useful for computing variances of complex non-linear estimators for which a Taylor linear form is not easy to derive and is not available in commonly used software. For instance, it is not possible to calculate the variance of a median or of the ratio of two medians using the Taylor linearization method. For these types of estimators, users may calculate a variance using BRR or Fay's modified BRR method. However, it should be noted that the replicate weights have been derived from the final weight through a shortcut approach. Specifically, the replicate weights are not computed starting with the base weight, and the adjustments made in the different stages of weighting are not applied independently in each replicate. Thus, the variances computed using this one-step BRR do not capture the effects of all weighting adjustments that would be captured in a set of fully developed BRR replicate weights. (The Taylor-series approach does not fully capture the effects of the different weighting adjustments either.)

The dataset HC-036BRR, MEPS 1996-2018 Replicates for Variance Estimation File, contains the information necessary to construct the BRR replicates. It contains a set of 128 flags (BRR1-BRR128) in the form of half-sample indicators, each coded 0 or 1 to indicate whether the person should or should not be included in that particular replicate. These flags can be used in conjunction with the full-year weight to construct the BRR replicate weights. For analysis of MEPS data pooled across years, the BRR replicates can be formed in the same way using the HC-036, MEPS 1996-2018 Pooled Linkage Variance Estimation File. For more information about creating BRR replicates, users can refer to the documentation for the HC-036BRR file on the AHRQ website.

References

Bramlett, M.D., Dahlhamer, J.M., & Bose, J. (2021, September). Weighting Procedures and Bias Assessment for the 2020 National Health Interview Survey. Centers for Disease Control and Prevention.

Chowdhury, S.R., Machlin, S.R., & Gwet, K.L. (2019, January). Sample Designs of the Medical Expenditure Panel Survey Household Component, 1996-2006 and 2007-2016. Methodology Report #33. Agency for Healthcare Research and Quality, Rockville, MD.

Current Population Survey: 2021 Annual Social and Economic (ASEC) Supplement. (2021). U.S. Census Bureau.

Dahlhamer, J.M., Bramlett, M.D., Maitland, A., & Blumberg, S.J. (2021). Preliminary Evaluation of Nonresponse Bias Due to the COVID-19 Pandemic on National Health Interview Survey Estimates, April-June 2020. National Center for Health Statistics.

Daily, D., Cantwell, P.J., Battle, K., & Waddington, D.G. (2021, October 27). An Assessment of the COVID-19 Pandemic's Impact on the 2020 ACS 1-Year Data. U.S. Census Bureau.

Fay, R.E. (1989). Theory and Application of Replicate Weighting for Variance Calculations. Proceedings of the Survey Research Methods Section, ASA, 212-217.
Lau, D.T., Sosa, P., Dasgupta, N., & He, H. (2021). Impact of the COVID-19 Pandemic on Public Health Surveillance and Survey Data Collections in the United States. American Journal of Public Health, 111(12), 2118-2121.

Rothbaum, J., & Bee, A. (2020). Coronavirus Infects Surveys, Too: Nonresponse Bias During the Pandemic in the CPS ASEC (SEHSD Working Paper Number 2020-10). U.S. Census Bureau.

Rothbaum, J., & Bee, A. (2021, May 3). Coronavirus Infects Surveys, Too: Survey Nonresponse Bias and the Coronavirus Pandemic. U.S. Census Bureau.

Rothbaum, J., Eggleston, J., Bee, A., Klee, M., & Mendez-Smith, B. (2021). Addressing Nonresponse Bias in the American Community Survey During the Pandemic Using Administrative Data. U.S. Census Bureau.

Villa Ross, C.A., Shin, H.B., & Marlay, M.C. (2021, October 27). Pandemic Impact on 2020 American Community Survey 1-Year Data. U.S. Census Bureau.

Zuvekas, S.H., & Kashihara, D. (2021). The Impacts of the COVID-19 Pandemic on the Medical Expenditure Panel Survey. American Journal of Public Health, 111(12), 2157-2166.

D. Variable-Source Crosswalk for MEPS HC-221: 2020 Food Security Data File