Researchers with approved research projects may be authorized, through the AHRQ Data Center, to use elements from
restricted data files that are not publicly released for reasons of confidentiality. Researchers may apply to use
restricted MEPS data through one of the three access options presented below.
- In-person data access at the AHRQ Data Center in Rockville, Maryland.
- In-person data access at a Federal Statistical Research Data Center (FSRDC). AHRQ and the Census Bureau partnered
to make restricted MEPS data files available to researchers through the FSRDC network. (Please visit the
Federal Statistical Research Data Centers page on
census.gov for additional information.) For information on the FSRDC research proposal process and the data sets
available, please read the AHRQ-Census Bureau agreement on access to restricted MEPS data.
- Remote data access through the MEPS-Amazon Web Services (AWS) SecureCloud. AHRQ's MEPS-AWS SecureCloud environment
is hosted by AWS and allows researchers to access certain restricted data elements without being present in person at the
AHRQ Data Center. AWS access is generally limited to 90 days. Through the MEPS-AWS SecureCloud, researchers can access data, perform analyses, and generate reports.
Eligibility for this access option is limited and researcher data needs must meet restrictive requirements. (See
Data Available Using the MEPS-AWS SecureCloud, below.)
Data Available In-Person at the AHRQ Data Center or at an FSRDC
Confidential data files
- Household Component-Insurance Component Linked File (available for 1996-1999 and 2001). For many
households, health insurance available through a household member's employer is a key source of coverage. However,
many details about the health insurance plans offered by and/or obtained through employers are not readily available
from household respondents. To fill this data gap, the employers of Household Component jobholders are contacted
through the Insurance Component survey. Employers are asked about health insurance offerings, premiums, employee
contributions to premiums and other plan details for their establishment as a whole. This information is then
linked, via a data file, to data collected during the household survey for the jobholder to provide a more complete
picture of the jobholder's insurance options. Significant survey non-response prevents these linked data from being
used to make nationally representative estimates and results cannot be generalized beyond the sample of persons
included in the file. Thus, no weight has been constructed for these files. Detailed documentation of these research
files for 1996-1999, including a crosswalk of variables to the MEPS-IC questionnaires and the codebooks, is
available for download. There are no plans for collecting these data in future years.
- Nursing Home Component (available for 1996 only). The Nursing Home Component was a survey of nursing
homes and persons resident in or admitted to nursing homes at any time during calendar year 1996. The survey
gathered information on the demographic characteristics, residence history, health and functional status, use of
services, use of prescription medications, and health care expenditures of nursing home residents. Nursing home
administrators and designated staff also provided information on facility size, ownership, certification status,
services provided, revenues and expenses, and other facility characteristics. A community questionnaire obtained
data from next of kin or other knowledgeable persons in the community on income, assets, family relationships, and
caregiving information for the sampled nursing home resident. Detailed documentation of these research files,
including links to the Nursing Home survey questionnaires and codebooks, is available for download.
There are no plans for collecting these data in future years.
- Medical Provider Component. The primary purpose of the Medical Provider Component is to collect detailed
charge and payment data from hospitals, physicians, home health care providers, and pharmacies to supplement
information received from Household Component respondents on their health care expenses. The billing data collected
includes whether a visit was capitated, charges for each procedure (CPT4) for office-based and outpatient department
visits, and amounts for each source of payment. Diagnosis codes for medical visits and stays and NDC codes for
prescription drugs are also collected. While the Medical Provider Component was not designed to make stand-alone
national estimates, the detailed information on payments, procedure codes, and diagnostic codes supports a variety of
analyses not possible with the Household Component data alone. However, linking the Medical Provider Component and
Household Component for analytical purposes is not possible.
- Area Health Resources Files. The Area Health Resources Files (AHRF) dataset provides current as well as
historic data for more than 6,000 variables for each of the nation's counties, as well as state and national data.
It contains information on health facilities, health professions, measures of resource scarcity, health status,
economic activity, health training programs, and socioeconomic and environmental characteristics. In addition, the
basic file contains geographic codes and descriptors which enable it to be linked to many other files and to
aggregate counties into various geographic groupings. The Area Health Resources Files (AHRF) data are designed to be
used by planners, policymakers, researchers, and others interested in the nation's health care delivery system and
factors that may impact health status and health care in the United States. Historically and when requested, data
from the AHRF has been merged onto MEPS files that are used in the AHRQ Data Center. This can still be done;
however, researchers need to show AHRQ that they have obtained a copy of the AHRF from the Health Resources and Services
Administration (HRSA). See https://data.hrsa.gov/data/download
for more information on obtaining copies of the AHRF. Please note that AHRQ does not permit access to the entirety
of the AHRF data. With their application, researchers must supply a specific list of the AHRF variable names to be
merged with the MEPS data by state/county.
- MEPS Link Files to NHIS. Each MEPS/NHIS link file contains a crosswalk that allows data users to merge
MEPS full-year public use data files to NHIS person-level public use data files that contain data collected for MEPS
respondents in the year prior to their initial year of MEPS participation. Documentation of these link files is
available for download.
Additionally, all MEPS Public Use Files (PUFs) currently available on the MEPS website are also available to researchers
with approved projects at the AHRQ Data Center, FSRDCs, or on the MEPS-AWS SecureCloud. Approved researchers will
work with Data Center staff to access the PUFs in these environments.
Confidential and non-public use variables
- Fully Specified ICD-9 and ICD-10 Codes: These codes allow medical conditions to be identified with
greater specificity. Access to all fully specified ICD codes is not permitted. Researchers must submit a list of the
specific codes needed for their analysis. Please contact the AHRQ Data Center Coordinator for additional details.
- Fully Specified Industry and Occupation Codes: These codes allow a worker's industry and occupation to be
identified with greater specificity.
- State and County FIPS Codes: These codes can be used to merge data from the Area Health Resources Files, or any
data at the State and/or County level, onto the MEPS data. Please note that these geographic codes are usually
either dropped or encrypted after the data merge. Please contact the AHRQ Data Center Coordinator for additional
details.
- Census Tract and Block-Group Codes: These codes can be used to merge data from the U.S. Census, or any
data at the tract or block-group level, onto the MEPS data. Please note that these geographic codes will be either
dropped or encrypted after the data merge.
- Metropolitan statistical area (MSA) indicator
- Non-Public Use Data Elements: These are data elements from our questionnaires that are not directly
identifiable data, but have yet to be edited or released, e.g., asset information and imputed NDC codes.
- Federal and State Marginal Tax Rates: Tax simulations using the National Bureau of Economic Research's
TAXSIM package are run for the Household Component full-year populations beginning in 1996. Tax amounts and marginal
tax rates are computed for Federal, State and FICA taxes. Property taxes, sales taxes, and city/county income taxes
are not simulated.
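The merge-then-mask handling described for the geographic codes above can be illustrated with a small sketch. It is not AHRQ's procedure: the field names (state, county, hosp_beds, area_id), the helper functions, and the use of a keyed hash to stand in for "encryption" are all assumptions made for illustration, written here in Python.

```python
import hashlib
import hmac


def fips_key(state, county):
    """Build a 5-character state+county FIPS string (2-digit state, 3-digit county)."""
    return f"{int(state):02d}{int(county):03d}"


def mask_code(fips, secret):
    """Return an opaque but repeatable stand-in for a FIPS code (keyed hash).

    Assumption: the secret key stays with data center staff, so the
    researcher cannot map the masked value back to a place.
    """
    digest = hmac.new(secret, fips.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:12]


def merge_and_mask(persons, area_rows, wanted, secret):
    """Attach only the requested area variables to each person record,
    then drop the raw geographic codes and keep a masked area code."""
    lookup = {fips_key(r["state"], r["county"]): r for r in area_rows}
    merged = []
    for person in persons:
        fips = fips_key(person["state"], person["county"])
        area = lookup.get(fips, {})
        # Copy everything except the raw geographic identifiers.
        out = {k: v for k, v in person.items() if k not in ("state", "county")}
        for name in wanted:
            out[name] = area.get(name)  # None when the county is not in the area file
        out["area_id"] = mask_code(fips, secret)
        merged.append(out)
    return merged
```

A call such as merge_and_mask(persons, area_rows, ["hosp_beds"], secret) would return person records that carry the requested county characteristic and a masked area_id, but no state or county code. Records from the same county share the same area_id, so clustering by area remains possible in an analysis.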
Data Available Using the MEPS-AWS SecureCloud
Only a limited number of low-risk variables have been approved for use by researchers accessing restricted MEPS data
through the MEPS-AWS SecureCloud. Additionally, all MEPS Public Use Files (PUFs) currently available on the MEPS website are
also available in the MEPS-AWS SecureCloud.
Confidential and non-public use variables
- State FIPS Codes: These codes can be used to merge data from the Area Health Resources Files, or any
data at the State level, onto the MEPS data. Please note that these geographic codes may be either
dropped or encrypted after the data merge. Please contact the AHRQ Data Center Coordinator for additional details.
- Metropolitan statistical area (MSA) indicator
Application Procedure and Form
Prospective researchers must submit an application, including a research proposal, for AHRQ review. The application
form is available in HTML and PDF formats.
AHRQ Data Center applications are accepted, reviewed, and approved on a rolling basis.
It is recommended that researchers have an initial discussion of the feasibility of the proposed project with AHRQ
Data Center staff. The complete application package should be emailed to
CFACTDC@AHRQ.HHS.GOV or mailed to the AHRQ Data Center address below.
The AHRQ Data Center Coordinator will review the material for completeness, obtain clarification from the researcher (if
needed), and send the material to the AHRQ Data Center Manager for approval. The proposal may go through several rounds
of revision between the researcher and the AHRQ Data Center Manager until an appropriate package is developed. If necessary, a cost estimate for
required services will be reviewed with the researcher/applicant once the project is approved.
The following forms are developed and specified for each project by the AHRQ Data Center Coordinator: 1) The AHRQ
Data Center agreement, which makes clear the scope of the project, data, and services to be provided by the researcher
and the AHRQ Data Center; the obligations of both parties; and the cost; 2) Task order/billing agreement for programming
support; and 3) Confidentiality affidavits and security training, as required.
AHRQ charges a user fee of $300.00 for approved projects to cover technical assistance and up to four hours of
programming support for simple file construction. The user fee is due after application approval. The user fee is
waived for full-time graduate students working on dissertations or other degree requirements, Federal Government
agencies, and FSRDC users. FSRDC users are responsible for any additional fees required by the Census Bureau while
working at the FSRDC.
The AHRQ Data Center Manager will review each proposal. The following will be considered in reviewing proposals:
- The suitability of existing data for the project, that is, whether it is possible for the research to be
conducted with the available information. On occasion, it is apparent from the outset that the sample will not
support the intended analysis. For instance, MEPS does not support estimates for smaller states or for some
- The risk of disclosure of restricted information, that is, whether the analysis can be conducted without
compromising the confidentiality promised to all respondents (persons, employers, insurers, hospitals and
- The availability of resources to support the project at the AHRQ Data Center. At this time, only limited
technical support can be provided on site, so researchers should make every effort to familiarize themselves with
the MEPS data before coming to the AHRQ Data Center.
- Whether the proposed project is in accordance with the mission of AHRQ as specified in its authorizing
Researchers should note that approval of the application does not constitute endorsement by AHRQ of the substantive,
methodological, theoretical, policy relevance or scientific merit of the proposed research. Approval only constitutes
a judgment that the research is feasible, broadly consistent with AHRQ's mission, and an appropriate use of the data in
terms of what confidentiality and privacy protections were promised to respondents or otherwise required by law.
Accessing the Data Files
Researchers must execute all computer runs either at facilities located within the AHRQ Data Center in Rockville,
Maryland or at one of the Federal Statistical Research Data Centers (FSRDC) located throughout the U.S. Additionally,
as described above, the MEPS-AWS SecureCloud provides a limited remote access option.
For researchers who have already visited the AHRQ Data Center to perform initial programming, we offer a limited
service that allows them to submit revised SAS, Stata, or R programs that will be run on their behalf by the AHRQ Data Center
Coordinator. This service is not for the purpose of program development and does not include programming support—the
researcher is responsible for developing a debugged program. The service is to support minor modifications to existing
programs, such as changes in variables in a statistical model. If there are numerous such requests for a single
project, a separate fee may be negotiated for providing this additional service.
The AHRQ Data Center allows researchers to supply their own data to be merged with AHRQ data. The researcher-supplied
data may consist of proprietary data collected and owned by the researcher. Researchers must provide the AHRQ Data
Center staff with complete documentation of any data proposed to be merged. The documentation should include
descriptive variable and value labels. Researchers are responsible for interacting with AHRQ Data Center staff to
ensure that the data can be merged and that the formats are consistent. Merging of characteristics of geographic areas
must be done in such a way that details of the sampling frame (i.e., what counties are included or excluded) of the
survey are not revealed to the researcher. Data merges will be conducted by the AHRQ Data Center or its contractors
prior to the arrival of the researcher. Files and information used in linking the data sets will not be made available
to the researcher. As a policy, AHRQ will catalog all data (including researcher-supplied data) used at the AHRQ Data
Center and will make that data available to other researchers.
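One way to merge an area-level characteristic without exposing the area itself is to coarsen it into categories before it reaches the analysis file. The sketch below is only an illustration under stated assumptions: the area codes, the numeric score, the rank-based three-way split, and both function names are hypothetical, and the actual merge would be performed by AHRQ Data Center staff using their own methods.

```python
def tercile_labels(values_by_area):
    """Map each area to 'low', 'medium', or 'high' by rank.

    Areas are ranked by value; ties keep their input order because
    Python's sort is stable.
    """
    ranked = sorted(values_by_area, key=values_by_area.get)
    n = len(ranked)
    labels = {}
    for position, area in enumerate(ranked):
        if position < n / 3:
            labels[area] = "low"
        elif position < 2 * n / 3:
            labels[area] = "medium"
        else:
            labels[area] = "high"
    return labels


def attach_category(persons, labels, area_key, new_name):
    """Add the category to each person record and drop the area identifier."""
    out = []
    for person in persons:
        rec = {k: v for k, v in person.items() if k != area_key}
        rec[new_name] = labels.get(person[area_key])  # None if the area has no score
        out.append(rec)
    return out
```

The analysis file produced this way carries only the category, not the area code or the underlying score. Whether three categories are coarse enough to protect the sampling frame is a judgment for the data center, not something this sketch settles.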
What output can be taken from the AHRQ Data Center, FSRDC, or MEPS-AWS SecureCloud?
- All materials to be removed from the AHRQ Data Center, FSRDC, or MEPS-AWS SecureCloud are subject to disclosure
- In addition to approved tables, researchers may request that the AHRQ Data Center Coordinator review and, if
approved, download the following to external media from the AHRQ Data Center: programs, word processing
documents, and electronic versions of approved printouts or statistical tables. These approved output files may also
be emailed to the researcher.
What output cannot be taken from the AHRQ Data Center, FSRDC, or MEPS-AWS SecureCloud?
- Any output that could potentially identify respondents or small geographic areas, either directly or
inferentially, cannot be removed from the AHRQ Data Center, FSRDC, or MEPS-AWS SecureCloud.
- Models using geographic area as the dependent variable cannot be removed from the AHRQ Data Center, FSRDC, or
MEPS-AWS SecureCloud, unless the values are encrypted.
- The identity of sampling units or information that might reveal sampling units, which could be used in efforts to
identify the data subject, cannot be removed.
- In general, any direct or inferential identifiers not revealed on the public use data files cannot be removed from
the AHRQ Data Center, FSRDC, or MEPS-AWS SecureCloud.
- No sample case printouts are ever allowed to be removed from the AHRQ Data Center, FSRDC, or MEPS-AWS SecureCloud.
- When using Proc Means, no cells with the same minimum and maximum or with cell sizes of less than 100 may be
removed.
- When using Proc Univariate or producing cross-frequencies, no tables with the same minimum and maximum, with only
one observation for the minimum and maximum, or tables with row and column sizes of less than 100 can be removed.
- Researchers will not have access to data files unless they are requested in their approved project.
- Researchers will not have access to files containing direct identifiers, such as names or addresses.
- Researchers will not have access to geographical data (county, census tract) unless it is encrypted. Depending
upon the approved research project, researchers will be given access to characteristics of those geographic areas.
Researchers may also be given access to files with dummy codes for places, but they will not be given the decodes
that allow association of place name with the code. Upon request, an entire file can be pre-coded into categories
(e.g., residing in a state with high/medium/low Medicaid generosity).
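Researchers may find it useful to screen their own summary tables against the Proc Means rule stated above before requesting disclosure review. The following Python sketch is an assumption-laden illustration, not an AHRQ tool: the function names and record layout are hypothetical, it covers only the cell-size and identical minimum/maximum conditions, and passing it does not imply that output will be approved.

```python
MIN_CELL_SIZE = 100  # threshold taken from the rule stated in the text


def summarize_cells(records, group_key, value_key):
    """Compute n, mean, min, and max of value_key within each group."""
    cells = {}
    for rec in records:
        cells.setdefault(rec[group_key], []).append(rec[value_key])
    summary = {}
    for group, values in cells.items():
        summary[group] = {
            "n": len(values),
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


def passes_screen(cell):
    """True when the cell has at least MIN_CELL_SIZE observations
    and its minimum differs from its maximum."""
    return cell["n"] >= MIN_CELL_SIZE and cell["min"] != cell["max"]


def flag_cells(summary):
    """Return {group: True/False} showing which cells pass the screen."""
    return {group: passes_screen(cell) for group, cell in summary.items()}
```

Cells flagged False would need to be suppressed or combined with other cells before a table is submitted. The additional conditions listed for Proc Univariate and cross-frequencies are not checked here.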