Methodology Report #37:
Outpatient Prescription Drugs: Data Collection and Editing in the 2021 Medical Expenditure Panel Survey
Salam Abdus, PhD, Steven C. Hill, PhD, and Rebecca Ahrnsbrak, MPS
Table of Contents
Introduction
Household Component
Data Collection
Editing Household Component Data
Pharmacy Component
Data Collection
Supplemental Data
Editing the Pharmacy-Reported Data
Imputing NDC
Imputing quantity dispensed
Editing third-party payers
Identifying acquisitions needing price imputation
Imputing price and payments
Results of Preparing the Pharmacy-Reported Data
Matching Pharmacy Data to Household Data
Overview of Matching
Details of Matching
The first approach: Within-person matching
The second approach: Imputation from a different person
Editing Matched Data
Imputed fills with too many days supplied
Free antibiotics, anti-diabetics, and prenatal vitamins
Prices paid by the government program CHAMPVA
Federal pharmacy prices
Editing year in crossover rounds
Medicare Part D and private insurance
Resolving payer inconsistencies arising from imputation
Additional variables
Editing for confidentiality
References
Notes
Abstract
The Medical Expenditure Panel Survey (MEPS), sponsored by the Agency for Healthcare Research and Quality (AHRQ), is a
nationally representative survey of the U.S. civilian noninstitutional population's medical care use and
expenditures. Household respondents report prescription drugs obtained by members of the household and the number of
times each drug was obtained, while a follow-back survey of pharmacies is the primary source of prices, payers, and
drug attributes. This report describes the household and pharmacy data collection and editing processes, the editing
and imputation techniques used to supply values for missing data in the pharmacy database, the procedure linking the
data reported by the pharmacy to each prescription drug reported by the household, and recent improvements in these
procedures. Statistics on these methods are presented for the 2021 MEPS.
The estimates in this report are based on the most recent data available at the time the report was written. However,
selected elements of Medical Expenditure Panel Survey (MEPS) data may be revised on the basis of additional analyses,
which could result in slightly different estimates from those shown here. Please check the MEPS website for the most
current file releases.
Center for Financing, Access and Cost Trends
Agency for Healthcare Research and Quality
5600 Fishers Lane, Mailstop 07W41A
Rockville, MD 20857
http://www.meps.ahrq.gov/
Disclaimer: Any opinions and conclusions expressed herein are those of the authors and do not necessarily reflect
those of the Agency for Healthcare Research and Quality, the Department of Health and Human Services, or the U.S.
Census Bureau. The Census Bureau has reviewed this data product for unauthorized disclosure of confidential
information and has approved the disclosure avoidance practices applied to this release. Disclosure Review Board
Approval Numbers CBDRB-FY22-047 and CBDRB-FY22-292; DMS project number 7514872.
Introduction
The Medical Expenditure Panel Survey (MEPS) is a nationally representative survey of medical care use and
expenditures sponsored by the Agency for Healthcare Research and Quality (AHRQ) covering the U.S. civilian
noninstitutional population. MEPS includes a Household Component (HC) and a Medical Provider Component (MPC).[a] MEPS data can produce estimates for individuals, families, and selected
population subgroups. The HC provides data on health status, demographic and socioeconomic characteristics,
employment, access to care, and experiences with healthcare. The MPC collects expenditures data from hospitals,
physicians, home healthcare providers, and pharmacies identified by MEPS-HC respondents. Its purpose is to supplement
the HC information.
Each year, a new panel of households is selected and participates in five rounds of HC interviews, covering 2 full
calendar years.[b] This set of households is a subsample of households
that participated in the previous year's National Health Interview Survey (NHIS) conducted by the National Center for
Health Statistics (NCHS) of the Centers for Disease Control and Prevention. The NHIS sampling frame provides a
nationally representative sample of the U.S. civilian noninstitutionalized population.
MEPS supports longitudinal analysis; each panel typically consists of five rounds of interviews that together collect
2 full calendar years of data. Longer-term trends can be analyzed by linking MEPS to the previous year's NHIS.
Cross-sectional analysis over a long period, going back to 1996 (when MEPS began), is possible as well.
MEPS defines expenditures as payments from all sources (including individuals, private insurance, Medicare, Medicaid,
and other sources) for healthcare services during the year. To construct estimates of expenditures, MEPS collects
detailed information about medical events including ambulatory visits (outpatient and office based), inpatient stays,
emergency care, home healthcare, dental care, other medical care, and prescription medicines. At each HC interview,
the household respondent supplies the names of any prescribed medicine that family members obtained and identifies the
pharmacies used. In addition, family members are asked for permission for MEPS to contact the pharmacies. With this
permission, the Pharmacy Component (PC), which is a subcomponent of the MPC, collects detailed information from the
pharmacies about the drugs obtained, including payments (the sum of which equals the drug price), payers, date each
prescription was filled, quantity dispensed, the National Drug Code, and precise drug attributes. Matching drugs
reported by the pharmacies in the PC to those reported by the household in the HC is accomplished through the use of
supplemental data and specialized software.
The MEPS conducts the PC to collect information that pharmacies can more easily and accurately provide than household
respondents. Because pharmacies receive payments for the drugs they dispense, they generally have more
accurate payment information than households, which may be more familiar with their out-of-pocket payments than
third-party payments. Some HC respondents may lack documentation, such as explanations of benefits, with details about
third-party payments. The PC collects information about payments for drugs, but data users should note that rebates
between manufacturers and pharmacies, pharmacy benefit managers, and government programs are not collected. A second
motivation for conducting the PC is that some households do not have easy access to details about their medications,
such as the number or strength of pills. In addition, some households have many fills for multiple medications during
the reference period, and asking them for information about each of these fills would be repetitive and overly
burdensome.
As with all surveys, inconsistent and missing data occur in MEPS because of reporting errors and failures to report information.
Some pharmacies do not respond to the PC, and some sample members deny permission to contact pharmacies, so PC data
are not available for every sampled person. Even when pharmacies respond, the data provided are not always complete.
This report describes the methods used to (1) supply values for missing HC and PC data and (2) link the data reported
by the pharmacies to each prescription drug name reported by the household. Table 1 summarizes the key variables
edited or imputed. Details of edits and imputation are in the body of the report. The objective of these procedures is
to maximize the amount and quality of data available for analysis and to reduce the risk of bias associated with
reporting error and missing data. Statistics on these methods are presented for the 2021 MEPS.
This report updates an earlier Methodology Report on prescription drugs in the 2011 MEPS.[1]
Since the 2011 data were released, several enhancements have been made to improve the quality and analytical capability
of the MEPS prescription drug data. Some improvements implemented include the following:
- Beginning with the 2013 data, a new drug name variable supplied by the Multum Lexicon Plus database of Oracle Health was added to the public use file. This drug name is the generic name of the
drug most commonly used by prescribing physicians. For the earlier years, this variable has been provided with
Addendum Files to MEPS Prescribed Medicines Files for 1996-2013.
- Beginning with the 2017 data, higher imputed prices were allowed to account for the rising prices of specialty
drugs. In 2017, this change in editing procedures accounted for more than 95 percent of the increase in total
expenditures for prescribed medicines relative to 2016.
- The MEPS instrument redesign in 2018 aimed to improve data collection for all sections of the MEPS HC, including (1)
prescribed medicines that sample members obtained earlier in the reference period but were no longer taking and (2)
medications to treat diabetes and asthma.
- Starting with the 2018 data, the pharmacy types are those reportedly used by the person. For prior data years, the
pharmacy types are those reportedly used by anyone in the household.
- Beginning with the 2020 data, the rules used to identify outlier prices for prescription medications were improved
based on newer price benchmarks and analyses.[2] New outlier thresholds were established based on the distribution of
the ratio of retail unit prices relative to the National Average Drug Acquisition Cost per unit, which better
reflects the prices paid for drugs. As a result, the prices paid for generics are lower in the 2020 data, compared
with the 2019 data, and fewer generic fills have third-party payments.
The importance of these changes depends on the research focus. For cross-sectional studies focused on a single year
of data, these changes may be of little consequence. However, studies that pool data across years or examine trends
over time should account for methodological changes that may affect the trends. In particular, care is needed when
comparing 2016 data to 2017 data (allowing higher imputed prices may account for part of the increase in
expenditures), 2017 data to 2018 data (the instrument redesign might affect trends), and 2019 data to 2020 data
(effects of the COVID-19 pandemic on data collection, as well as changes in price editing for generics).
This report is organized as follows. First, it describes data collection for the HC and the procedures used to edit
the HC data. Second, the discussion turns to data collection for the PC and the procedures employed for editing and
imputing data in the PC. Finally, an overview and details of matching pharmacy data to household data and the methods
of editing the matched data are presented.
Household Component
Data Collection
The MEPS-HC uses a panel design with two overlapping cohorts of the U.S. noninstitutionalized civilian population
combined to produce annual estimates.[3, 4, 5] A new cohort of households begins each year
and is interviewed five times to collect 2 calendar years of data. Additionally, due to the COVID-19 pandemic, the
panels that began in 2018 and 2019 (panels 23 and 24, respectively) each were extended to nine rounds of data
collection covering 4 calendar years. As a result, the 2021 MEPS includes data from four panels.
A single respondent in each household typically provides data about each household member during each interview. The
MEPS asks that this person be the family member most knowledgeable about health and healthcare use. In an interview,
the average recall period (the "round") is 5 months. The respondent is given a calendar and other materials to aid
memory and then asked to retain paperwork such as receipts and insurance benefit statements.
During each interview, the HC gathers information on healthcare services used, including prescribed medicines. The
first opportunity to report prescription drugs occurs when the HC asks about health service events other than
prescription drug purchases that took place during the round. The respondent supplies the names of any medications
prescribed as part of an inpatient stay or a visit to an emergency room, hospital outpatient clinic, or doctor's
office and subsequently filled.
For hospital stays, the respondent is asked to report only drugs that were prescribed on discharge. The HC does not
collect information on drugs administered in healthcare settings.
Another opportunity to mention prescriptions occurs in the dedicated prescribed medicines section of the survey where
the respondent can identify prescription medicines not already mentioned. This section of the survey was revised in
2018 to better capture drugs associated with two priority conditions, diabetes and asthma, as well as drugs that
sample members are no longer taking. If a respondent reports that a sample member has ever been diagnosed with
diabetes, they are asked to report "insulin or any other prescribed medications related to [their] diabetes." They are
then asked whether they obtained "any other diabetic equipment or supplies, typically prescribed by a physician."
(Note that any diabetic supplies purchased without a prescription represent a nonprescription addition to the
prescription drug expenditure and utilization data.) If a respondent reports that a sample member has ever been
diagnosed with asthma, they are specifically asked whether they "obtained any prescribed medicines related to their
asthma." For each sample member, household respondents are then asked to report (1) "any new prescriptions or refills"
obtained at any pharmacy, including mail order or online, and (2) "any other prescriptions, even if [they are] no
longer taking the medicine or only take it as-needed."
In all these sections of the HC interview, the respondent is encouraged to consult records such as medicine bottles
or receipts, so the reported medication name is often quite specific. However, the information can also be minimal,
for example, "pain pills." In spring 2022, the HC implemented a searchable lookup tool with commonly reported drugs
pre-programmed into the survey instrument. Drugs that are not included in the lookup tool are manually entered as text
strings. When drugs are reported, the drug names and any available information on strength and form are entered in a
dynamic roster.
In addition to the drug name, the HC collects the following information about each medicine:
- The number of times it was obtained in the round.
- The health condition it was prescribed for (if any) (asked only in the first interview in which it is mentioned).
- The year and month in which the person first used it (asked only in the first interview in which it is mentioned).
The names, addresses, and types of pharmacies that filled each household member's prescriptions are also requested,
along with permission for MEPS to acquire data from these pharmacies. Signed authorization forms allow pharmacies to
respond to the PC of MEPS when it is conducted. For the interviews collecting information about medication obtained in
2021, 58.5 percent of pharmacy permission forms were signed.
In 2021, household respondents reported 144,783 unique drug names obtained by family members in a round (hereafter
referred to as "person-round-drugs"), including insulin and diabetic supplies.[c] The end result of HC data collection is
a roster of drug names that each sample person obtained in the round, the person's health condition or conditions
associated with each drug, the number of acquisitions, and a roster of the person's pharmacies.
Because the HC collects no information on expenditures and limited information on drug characteristics, the
expenditure information and other details about each prescription medicine—quantities, form, strength, dates
obtained, price, payments by source, etc.—come from the PC. A supplementary data source (Multum Lexicon Plus
database of Oracle Health) provides information on therapeutic classes and pregnancy category.
Editing Household Component Data
Editing the HC drug data consists mainly of imputing a value for the number of times a drug was obtained in the round
when this information is missing or invalid as collected in the HC.
GPI coding. A first step in preparing the HC data for editing is to code each drug name to a Generic
Product Identifier (GPI). A proprietary dataset, Wolters Kluwer's Master Drug Data Base (MDDB®)[6]
Version 2.5, provides the GPI, a 14-digit code that identifies groups of pharmaceutically equivalent drugs that have
the same active ingredient(s), dosage form, and strength. The GPI is the common variable linking the PC data to the
HC. It allows information to be drawn from other sample members (imputed) in cases of missing information. The first
pair of digits in the GPI represents the drug group, the second pair represents the therapeutic class, and the third
pair represents the therapeutic subclass. The next four digits of the GPI (digits 7-10) represent the active
ingredient(s). The first 8 digits of the GPI are sufficient to identify most active ingredients, but 10 digits are
needed for compounds and salts. The remaining digits of the GPI represent dosage form and strength. For the HC,
professional coders use the MDDB to identify as many digits of the GPI as possible based on the medication name and
any other information provided by the respondent. Typically, 8 to 10 digits can be coded for household-reported drug
names, but in 2021, 5,702 (3.9 percent) of person-round-drugs could not be coded.
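The positional layout of the GPI described above can be illustrated with a short sketch. The function name `split_gpi`, the `GPIParts` field names, and the example code `12345678901234` are made up for illustration and are not part of MEPS or the MDDB. Treating digits 11-14 as a single form-and-strength field is an assumption, since the text says only that the remaining digits represent dosage form and strength.

```python
from typing import NamedTuple


class GPIParts(NamedTuple):
    drug_group: str         # digits 1-2
    therapeutic_class: str  # digits 3-4
    subclass: str           # digits 5-6
    ingredient: str         # digits 7-10
    form_strength: str      # digits 11-14 (kept as one field: an assumption)


def split_gpi(gpi: str) -> GPIParts:
    """Split a full or partially coded GPI into its positional fields.

    Fields that were not coded come back as empty strings.
    """
    digits = gpi.strip()
    if not digits.isdigit() or len(digits) > 14:
        raise ValueError(f"not a GPI or GPI prefix: {gpi!r}")
    return GPIParts(
        digits[0:2], digits[2:4], digits[4:6], digits[6:10], digits[10:14]
    )


# A made-up 14-digit code and an 8-digit partial code of the kind
# household-reported names typically yield.
full = split_gpi("12345678901234")
partial = split_gpi("12345678")
```

A partially coded name simply yields shorter or empty trailing fields, which is what lets later matching steps fall back to coarser GPI prefixes.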
Imputing the number of acquisitions. The number of times a drug identified by name in the HC was
obtained by a person in a round (person-round-drug) needs to be imputed when it is missing or extreme. In the 2021
data, for 4.8 percent of household-reported drug names, the respondent did not know or remember the number of times
the drug was obtained during the round. Outlier values for the number of times a household reported obtaining a drug
in a round occur as well and are determined by comparing the number of days in the round and the number of
acquisitions of the drug reported in the round. Person-round-drugs with more than five fills per month were
automatically deemed invalid, with limited exceptions. For each drug, the maximum fills per month across all
person-rounds with pharmacy data are also used to label outliers in the household data. Person-round-drugs for which
the drug names reported by the household cannot be coded to a GPI, and for which more than one and no more than five
fills per month were reported, are reviewed by a pharmacist to assess their plausibility.
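The screening rules in the paragraph above can be sketched as a small classifier. This is a simplified illustration, not the production edit: the function and label names are hypothetical, the conversion of days to months (an average month of 30.4375 days) is an assumption, and the "limited exceptions" mentioned in the text are not modeled.

```python
from typing import Optional

DAYS_PER_MONTH = 30.4375  # assumption: average calendar month length


def fills_per_month(n_fills: int, days_in_round: int) -> float:
    """Reported fills divided by the length of the round in months."""
    if days_in_round <= 0:
        raise ValueError("days_in_round must be positive")
    return n_fills / (days_in_round / DAYS_PER_MONTH)


def classify_fill_count(
    n_fills: int,
    days_in_round: int,
    gpi_coded: bool,
    drug_max_rate: Optional[float] = None,
) -> str:
    """Label a household-reported fill count.

    drug_max_rate is the maximum fills per month observed for the same drug
    in the pharmacy data, when one is available.
    """
    rate = fills_per_month(n_fills, days_in_round)
    if rate > 5:
        return "invalid"            # more than five fills per month
    if drug_max_rate is not None and rate > drug_max_rate:
        return "outlier"            # exceeds the pharmacy-data maximum
    if not gpi_coded and rate > 1:
        return "pharmacist_review"  # uncodable name, 1 < rate <= 5
    return "ok"
```

For example, 30 reported fills in a 150-day round works out to roughly six fills per month and would be flagged as invalid under this sketch.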
For missing and implausible values, a hot-deck procedure imputes a new number of acquisitions, drawing from the donor
pool of drugs with valid values. Specifically, days per acquisition are calculated in the donor pool, imputed to
recipients with replacement, and then the recipient's days in the round are divided by the days per acquisition to
calculate the number of acquisitions in the round. Four attempts are made to match a donor person-round-drug to a
recipient person-round-drug. The first attempt requires an exact match on drug, dosage form, and strength (14-digit
GPI). The second attempt requires an exact match on drug and dosage form (12-digit GPI). The third attempt requires an
exact match on drug (8-digit GPI). The fourth attempt requires an exact match on drug group (2-digit GPI), with
therapeutic class and subclass used as weighting variables in the match.
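A minimal sketch of this cascade follows, assuming donors and recipients are plain dictionaries with hypothetical keys `gpi`, `days`, and `fills`. It draws a donor at random from the first non-empty pool, ignores the weighted match variables entirely, and rounds the result to a whole number of at least one fill; the rounding rule and the floor of one are assumptions, as the text does not state them.

```python
import random
from typing import Optional, Tuple

PREFIX_LENGTHS = (14, 12, 8, 2)  # GPI digits required to match, per attempt


def impute_fills(
    recipient: dict, donors: list, rng: random.Random
) -> Tuple[Optional[int], Optional[int]]:
    """Return (imputed fills, GPI digits matched), or (None, None) if no donor."""
    for n in PREFIX_LENGTHS:
        key = recipient["gpi"][:n]
        if len(key) < n:
            continue  # recipient's GPI is not coded to this depth
        pool = [d for d in donors if d["gpi"][:n] == key]
        if pool:
            donor = rng.choice(pool)  # drawn with replacement across recipients
            days_per_fill = donor["days"] / donor["fills"]
            fills = max(1, round(recipient["days"] / days_per_fill))
            return fills, n
    return None, None


# Made-up records: the donor had 5 fills in 150 days, i.e. 30 days per fill.
donors = [{"gpi": "11223344556677", "days": 150, "fills": 5}]
exact = impute_fills({"gpi": "11223344556677", "days": 120}, donors, random.Random(0))
coarse = impute_fills({"gpi": "11223344999999", "days": 90}, donors, random.Random(0))
```

In the example, the first recipient matches on all 14 digits and receives 120 / 30 = 4 fills, while the second matches only on the 8-digit prefix and receives 90 / 30 = 3 fills.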
Weighted match variables used in all four attempts include whether the person used any mail-order pharmacies;
insurance status and potential payment sources for the person; private and public health maintenance organization
(HMO) enrollment; length of the round (in months); health conditions treated by any drug; whether the person is
enrolled in a high-deductible health plan; and the person's age, state of residence, census division, region, and
urbanicity. The weight for each variable is based on its relative contribution to the variation in days per
acquisition. Because patients usually have only one fill of an antibiotic, the imputation approach for antibiotics
differs slightly. Rather than imputing days per fill, the total number of fills is imputed. The donor pool consists of
antibiotics with valid values, and imputation does not require an exact match on any variable. Similarly, for
person-round-drugs with names that could not be GPI coded, imputation does not require exact match on any variable.
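The report does not specify how the matching software combines the weighted variables. One simple way to picture weighted matching is an agreement score: each variable on which donor and recipient agree contributes its weight, and the highest-scoring donor is preferred. The sketch below makes that assumption; the function names, variable names, and weights are all hypothetical.

```python
def match_score(recipient: dict, donor: dict, weights: dict) -> float:
    """Sum the weights of the variables on which donor and recipient agree."""
    return sum(
        w
        for var, w in weights.items()
        if recipient.get(var) is not None and recipient.get(var) == donor.get(var)
    )


def best_donor(recipient: dict, donors: list, weights: dict) -> dict:
    """Return the donor with the highest agreement score (first one on ties)."""
    if not donors:
        raise ValueError("empty donor pool")
    return max(donors, key=lambda d: match_score(recipient, d, weights))


# Made-up weights: mail-order use is treated as more informative than region.
weights = {"mail_order": 3.0, "region": 1.0}
recipient = {"mail_order": True, "region": "South"}
pool = [
    {"id": 1, "mail_order": False, "region": "South"},
    {"id": 2, "mail_order": True, "region": "West"},
]
chosen = best_donor(recipient, pool, weights)
```

Here donor 2 is chosen because agreement on the heavily weighted variable outweighs agreement on region. In the exact-match attempts, a scorer like this would be applied only within the pool that already matches on the required GPI digits.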
Table 2 summarizes the results of the process of imputing acquisitions in the 2021 HC. Most of the imputations (93.4
percent) were based on exact matches to pharmaceutically equivalent drugs (with the same active ingredients, dosage
form, and strength). After this imputation, the 2021 data file contained 333,859 acquisitions.
After these edits and imputation, the HC data are composed of each drug named during every HC interview. These data
include a GPI code and the number of acquisitions. The data are now ready to be linked to the payment and other drug
details from the PC.
Pharmacy Component
Data Collection
The PC collects data from the pharmacies identified by HC respondents as places where household members obtained
prescription medicines during the calendar year. During the second and subsequent HC interviews, sample members are
asked to sign permission forms authorizing MEPS to contact their pharmacies and authorizing the pharmacies to release
records to MEPS. Only those pharmacies for which an HC sample member signed this permission form and that voluntarily
participated are included in the PC.
The data collection protocol for pharmacies is as follows. For small retail pharmacies—that is, pharmacies not
associated with a large chain—the data collection staff contacts the pharmacy to explain the study's purpose, verifies
receipt of authorization forms, requests specific data elements, and determines whether patient profiles are
available. The data collection staff may collect information by telephone or request that the data be sent by fax or
mail or submitted electronically so that the necessary data elements can be abstracted. Pharmacy data are received in
any format, including hardcopy patient profiles, electronic files with patient profile data, or reports via telephone.
For large retail pharmacy chains, individual pharmacies are grouped by chain. Negotiators follow up with these
pharmacies in one of two basic ways:
- If the corporate office of the retail chain prefers that the local stores respond, data collection follows the
small retail model.
- If the pharmacy prefers that the data request be handled with a regional or central contact, the negotiator
facilitates the most efficient method for data collection.
The Veterans Health Administration (VA) and AHRQ have an interagency agreement for the VA to extract information from
centralized administrative data for MEPS HC sample members who sign authorization forms for the VA.
Each pharmacy is asked to provide patient profiles or information about each prescription filled or refilled for each
patient during the calendar year. For each acquisition of a drug, the following information is requested:
- The dates the prescription was filled or refilled.
- The National Drug Code (NDC).
- The quantity dispensed.
- The number of days supplied.
- The amount of out-of-pocket payment.
- Whether there were any third-party payments, and, if so, the third parties and payment amounts.
If the pharmacy does not provide the NDC, the PC asks instead for the medication name, dosage form, strength, and
strength unit.
In addition, pharmacy patient profiles are requested directly from HC family members who reported using one or more
of several pharmacy chains that had repeatedly refused to participate in the PC. Families are mailed a request for
profiles. To minimize burden on households, the mailing is limited to households that had completed their
participation in all rounds of the MEPS, and no follow-up attempts are made for those who do not reply to a mail
request. For the 2021 data collection, patient profile requests were mailed to 511 families with 720 patient-pharmacy
pairs. Completed profiles were received for 11.1 percent of the pairs. Despite the overall low rate of return, the
effort does provide a number of profiles for patients associated with pharmacy chains that otherwise would not be
represented in the pharmacy data.[d]
Table 3 presents the sample size and participation rates in the 2021 PC. Signed permission forms were obtained for
58.5 percent of those sought in the household interview; conditional on a signed form, more than four out of every
five eligible pharmacies responded. The responding pharmacies provided data on 257,596 unique acquisitions of drugs
during 2021.
To the extent allowed by survey response, the end result of PC data collection is a roster of each HC household
member's drug acquisitions during a calendar year. The information that the PC seeks to collect about each of these
acquisitions includes the transaction date, the NDC (or analogous identifying drug information), the quantity, the
number of days of the drug that the acquisition supplied, and payments to the pharmacy by source.
Supplemental Data
As mentioned above, matching the pharmacy data to the household data requires assigning a common set of codes to the
household- and pharmacy-reported drugs. Wolters Kluwer's proprietary Master Drug Data Base (MDDB®) Version 2.5 is the
source of such a code, the Generic Product Identifier (GPI), also described above. For the PC, automated and manual
coding generally succeed in recording all 14 digits of the GPI based on the NDC (which is more specific than the GPI)
or the medication name, dosage form, and strength. A negligible number of acquisitions in the PC cannot be coded to a
GPI based on the information the pharmacy provides.
The MDDB has several other uses besides GPI coding. Patent status, the average wholesale unit price (AWUP), and the
wholesale acquisition cost unit price (WACUP), all used in price editing and imputation, come from the
MDDB. The MDDB is also used to impute the NDC where necessary. (Price editing, price imputation, and NDC imputation
are discussed in more detail below.) If the pharmacy reports an NDC found in the MDDB, the MDDB provides information
on the drug name, dosage form, and strength.
Another important supplemental source of information used for pharmacy data editing is the National Average Drug
Acquisition Cost (NADAC) developed by the Centers for Medicare and Medicaid Services. Starting with the 2020 MEPS
data, NADAC is used as the primary price benchmark to identify implausibly high or low drug prices (outliers).
Two other datasets are used in editing the pharmacy data. A price list obtained from the VA that includes prices for
TRICARE, the VA, and the Indian Health Service is used to determine prices of drugs dispensed by federal pharmacies
when total payments are missing. The Multum Lexicon Plus database[e] of
Oracle Health[7] provides variables for therapeutic class.
These variables are merged into the PC data file by NDC.
Editing the Pharmacy-Reported Data
To maximize the donor pool for imputation, the entire PC data file (257,596 acquisitions in 2021) is edited before it
is matched to the HC. Key aspects of this process include imputing missing NDCs and missing quantity dispensed,
editing ambiguous third-party payer information, identifying acquisitions with deficient payment information, and
imputing missing payment information.
Most editing and imputation are conducted at the acquisition level, but there are exceptions. For imputing NDCs and
matching them to the HC, acquisitions are aggregated to the person-round-drug level. That is, for each person and
round, all pharmaceutically equivalent acquisitions (having the same 14-digit GPI, which represents the active
ingredient, dosage form, and strength) are grouped together. The GPI can group multiple NDCs, such as brand name drugs
and their generic equivalents.
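The aggregation to the person-round-drug level amounts to grouping acquisition records by person, round, and 14-digit GPI. A minimal sketch, assuming each acquisition is a dictionary with hypothetical keys `person_id`, `round`, `gpi`, and `ndc` (all values below are made up):

```python
from collections import defaultdict


def group_person_round_drug(acquisitions: list) -> dict:
    """Group acquisitions by (person, round, 14-digit GPI)."""
    groups = defaultdict(list)
    for acq in acquisitions:
        groups[(acq["person_id"], acq["round"], acq["gpi"])].append(acq)
    return dict(groups)


acquisitions = [
    {"person_id": 1, "round": 1, "gpi": "11223344556677", "ndc": "00000000001"},
    {"person_id": 1, "round": 1, "gpi": "11223344556677", "ndc": "00000000002"},
    {"person_id": 1, "round": 2, "gpi": "11223344556677", "ndc": "00000000001"},
]
groups = group_person_round_drug(acquisitions)
```

The first two records fall into one group even though their NDCs differ, which mirrors the point that a single GPI can cover a brand name drug and its generic equivalents.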
Imputing NDC
Among the 257,596 acquisitions reported by pharmacies in the 2021 data, 7,573 (2.9 percent) had a missing or an
invalid NDC. A missing NDC is imputed rather than left missing because the NDC distinguishes between brand name drugs and
generics, which is important for identifying price outliers and imputing missing payment data. Furthermore, many
analyses of medication use depend on the NDC to identify medications of interest. The NDC can identify the
manufacturer, the patent status, and the average wholesale price of a drug. The NDC, therefore, increases the accuracy
of any price imputation and the analytic utility of the data.
MEPS uses three approaches to identify the NDCs of acquisitions missing this item in the PC. First, wherever
feasible, the NDC is taken from one of the person's other acquisitions in the same round. For each person and round,
all acquisitions of drugs with the same 14-digit GPI (representing the active ingredient, dosage form, and strength)
are grouped together. Within these groups, if at least one acquisition is missing an NDC and at least one acquisition
has an NDC, then the missing NDC is assigned the reported NDC. If more than one NDC in the 14-digit GPI group exists,
one is selected at random.
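This first approach can be sketched as follows. The record keys and the `ndc_imputed` flag are hypothetical, and the NDC values are made up. Two details are assumptions rather than documented behavior: the random draw is uniform over the distinct NDCs reported in the group, and it is made once per group so that all of a group's missing values receive the same NDC.

```python
import random
from collections import defaultdict


def fill_ndc_within_group(acquisitions: list, rng: random.Random) -> list:
    """Return copies of the records with missing NDCs filled from the same
    person, round, and 14-digit GPI group, where a reported NDC exists."""

    def key(a):
        return (a["person_id"], a["round"], a["gpi"])

    reported = defaultdict(set)
    for a in acquisitions:
        if a.get("ndc"):
            reported[key(a)].add(a["ndc"])

    # One draw per group; sorting first keeps a seeded rng reproducible.
    chosen = {k: rng.choice(sorted(ndcs)) for k, ndcs in reported.items()}

    filled = []
    for a in acquisitions:
        out = dict(a)  # leave the input records untouched
        if not out.get("ndc") and key(a) in chosen:
            out["ndc"] = chosen[key(a)]
            out["ndc_imputed"] = True
        filled.append(out)
    return filled


records = [
    {"person_id": 1, "round": 1, "gpi": "11223344556677", "ndc": "00000000001"},
    {"person_id": 1, "round": 1, "gpi": "11223344556677", "ndc": None},
    {"person_id": 1, "round": 2, "gpi": "11223344556677", "ndc": None},
]
result = fill_ndc_within_group(records, random.Random(0))
```

The second record inherits the NDC reported in its group, while the third stays missing because no acquisition in round 2 reported one; such residual cases are the ones passed on to the later approaches.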
Second, for the residual acquisitions still missing NDC, the MDDB supplies it. The basic approach is to use matching
software to find the best match to a drug product on the MDDB based on the medication name, quantity units (for
example, milliliters), and the GPI. Medication name is used as a match variable because various brand name and generic
products, each with a unique NDC, can be grouped together under a single GPI. When a pharmacy reports NDCs, it almost
always reports the same NDC for all the person's acquisitions of a drug; so, to preserve homogeneity among the
recipient acquisitions, one NDC is imputed to all the acquisitions for the same person-round-drug.
Three attempts are made to impute an NDC by matching to the MDDB. In the first attempt, one NDC is taken from the
MDDB for all acquisitions with the same person, round, active ingredients, dosage form, and strength. In other words,
an exact match on the 14-digit GPI is required. For the second attempt, one NDC is imputed from the MDDB to all
acquisitions with the same person, round, and active ingredients (10-digit GPI). Here, the dosage form and strength
may differ, and, again, the medication name is a match variable. The third attempt is similar to the second attempt,
except that only an exact match on drug group (2-digit GPI) is required.
For the residual acquisitions missing NDCs and without valid GPIs, NDCs are imputed from other pharmacy records. Most
of these acquisitions lack the drug name as well as any information about the quantity dispensed and the strength. In
these cases, an NDC is imputed from the acquisitions of a person with similar characteristics reported in the HC and
for whom the pharmacy reported both drug names and valid NDCs. Match variables include the person's age, sex,
geographic division, urbanicity, health conditions, and potential payment sources. In addition to the NDC, the
recipient acquisition adopts any missing medication name, GPI, quantity dispensed, or strength.
Table 4 summarizes the results of the NDC imputation process for the 2021 data. Some imputations were from other
acquisitions of the same person in the same round. Most of the imputations relied on detailed information provided by
the pharmacies.
Imputing quantity dispensed
Pharmacies report quantity dispensed for nearly all acquisitions, but occasionally it must be imputed. If the NDC is
imputed and the quantity is missing, then the quantity is taken from the same acquisition that donated the NDC.
Otherwise, matching software imputes a quantity from another acquisition. Match variables include the NDC; active
ingredients, dosage form, and strength (GPI); and characteristics of the person reported in the HC (age, sex, health
conditions, and health status). Exact matching on the drug is required, and heavier weight is placed on the NDC,
followed by the dosage form and strength. In the 2021 data, the quantity dispensed was imputed for 0.5 percent of
the 257,596 acquisitions (Table 1). Subsequently, quantity was occasionally edited to create greater consistency
between price and quantity and greater consistency between number of fills and quantity dispensed.
Editing third-party payers
Sometimes the pharmacy does not clearly identify a third-party payer. For example, a pharmacy might report as a payer
a pharmacy benefit management company, which could be under contract with a private insurer, Medicare prescription
drug plan, Medicaid, or other public program. For some acquisitions, pharmacies do not report the third-party payer at
all. In these cases, information from the HC about insurance coverage and usual third-party payer can indicate the
type of payer. For example, if the respondent in the HC reports Medicare Part D as the usual third-party payer, then
the unknown payer in the PC is set to "Medicare".
In the 2021 data, among the 257,596 acquisitions reported by pharmacies, the third-party payer was edited for 16.5
percent of acquisitions (Table 1). Specifically, the category of third-party payer was assigned in three situations
using the insurance status reported in the HC:
- The third-party payer was reportedly a pharmacy benefit management company or HMO (39,500 acquisitions). The payer
in these cases was coded mainly as private insurance, Medicare, or Medicaid.
- The payer reportedly was a public clinic or some other ambiguous public payer (2,498 acquisitions). The payer in
these cases was coded mainly as Medicaid, Medicare, or Other State and Local.
- The third-party payer was unknown (484 acquisitions). The payer in these cases was also coded mainly as private
insurance, Medicare, Medicaid, or Other Insurance.
Additional edits to the PC acquisitions (patterned after edits applied to the nonprescription provider data) include
the following:
- Correcting the information from pharmacies that mistakenly report a payment from private insurance instead of a
Medicare prescription drug plan (910 acquisitions), Medicaid (971 acquisitions), or TRICARE (647 acquisitions).
- Eliminating the inconsistency of pharmacy providers incorrectly reporting a Medicare source of payment instead of
a Medicaid payment for those nonelderly who are not Medicare beneficiaries (3,003 acquisitions).
- Eliminating the inconsistency of pharmacy providers incorrectly reporting a Medicaid source of payment instead of
a private source of payment for people with private insurance (1,021 acquisitions).
Identifying acquisitions needing price imputation
The primary aim of MEPS is to collect information about healthcare expenditures, including the costs of prescription
drugs. Payment information that is altogether or partly missing, incorrectly reported as zero, or otherwise inaccurate
will cause distortion in estimated expenditures. Responding pharmacies provided all the information requested for 56.3
percent of acquisitions in 2021 (Table 5). In many cases, the source of incompleteness of payment data is that the
third-party payment amounts are missing. Federal pharmacies for a few federal programs are especially likely to omit
third-party payment amounts, because these pharmacies report little payment information generally.
Even when payment data appear to be complete, MEPS attempts to detect inaccurate payment data by comparing an
acquisition's price (the sum of payments) to prices provided in the NADAC data and MDDB. If the sum of payments falls
outside certain ranges relative to the benchmarks, then MEPS imputes a price and distributes it among sources of
payment.
The threshold used for defining what constitutes deficient payment information depends on the pattern of reported
payments. Generally, deficiency comes under one of four categories:
- The acquisition lacks any payment information at all.
- The acquisition has missing third-party payments.
- The acquisition has zero third-party payments, and the out-of-pocket amount is unrealistically low as a total
price.
- The acquisition's payments sum to an unrealistically high or low price.
The primary method for identifying unrealistically high and low prices ("price outliers") is to assess prices per
unit (tablet, capsule, ounces, etc.). The retail unit price (RUP) is the sum of payments for the acquisition divided
by the quantity dispensed. Price outliers are identified by comparing the acquisition's RUP reported in the PC to the
NDC's national average drug acquisition cost (NADAC) per unit. When NADAC is not available, the wholesale acquisition
cost unit price (WACUP) is used. When neither NADAC per unit nor WACUP is available, the average wholesale unit
price (AWUP) is used.
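As a minimal sketch of these definitions, the RUP and the choice of benchmark could be computed as follows. The function names and the use of `None` for an unavailable benchmark are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def retail_unit_price(payments: List[float], quantity: float) -> float:
    """RUP: the sum of payments for the acquisition divided by quantity dispensed."""
    if quantity <= 0:
        raise ValueError("quantity dispensed must be positive")
    return sum(payments) / quantity


def pick_benchmark(
    nadac: Optional[float], wacup: Optional[float], awup: Optional[float]
) -> Optional[Tuple[str, float]]:
    """Prefer NADAC per unit, then WACUP, then AWUP as the last resort."""
    for name, value in (("NADAC", nadac), ("WACUP", wacup), ("AWUP", awup)):
        if value is not None:
            return name, value
    return None
```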
Rules for identifying lower outliers are summarized in Table 6a. When using NADAC per unit or WACUP, the rules depend
on patent status, the completeness of the payment data, whether the pharmacy reported discounts or coupons for the
fill, and whether the fill was for a person with Medicare Part D who appeared to be in the donut hole. The thresholds
are lower for brand name drugs obtained by Medicare Part D beneficiaries who have spent enough to enter the donut
hole. The thresholds are lower because, starting in 2019, these acquisitions are discounted 70 percent. Acquisitions
in the donut hole are identified using the cumulative, total out-of-pocket payments reported by the pharmacies for
acquisitions covered by Part D. The rules for identifying lower outliers also depend on whether the drug was an
over-the-counter (OTC) drug and whether there was any third-party payment. The three patent statuses are (1)
single source brand name drugs, which are available only from one manufacturer; (2) originator drugs, which are brand
name drugs with therapeutically equivalent competitors; and (3) generic drugs, which are available from multiple
manufacturers and pharmaceutically equivalent to a brand name drug. The differences between the thresholds for NADAC
per unit and WACUP reflect differences in the distributions of RUP relative to NADAC per unit compared with RUP
relative to WACUP.2 When using AWUP, which is used only as a last resort for identifying outliers, no information
regarding OTC or third-party payment is used.
Rules for identifying upper outliers are summarized in Table 6b. The rules depend on patent status and dosage form.
Acquisitions are flagged as upper outliers when the RUP exceeds 50 times the NADAC per unit for generics, 8 times the
NADAC per unit for single source liquids, and 4 times the NADAC per unit for all other drugs among fills priced at $16
or more per fill. When using the WAC, cases are flagged as upper outliers (prices too high) when the RUP exceeds 20
times the WACUP for generics, 4 times the WACUP for single source liquids, and 2 times the WACUP for all other drugs.
When using AWUP, upper outliers are identified as RUP ≥ 10 times AWUP, regardless of patent status or dosage
form.f
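The upper-outlier multipliers quoted above can be collected into a single check. This is a sketch only: Table 6b remains the authoritative statement of the rules, the argument names and category labels are invented for illustration, and applying the $16 per-fill floor to every NADAC comparison is an assumption about how the sentence should be read.

```python
def is_upper_outlier(
    rup: float,
    benchmark: str,          # "NADAC", "WACUP", or "AWUP"
    benchmark_value: float,  # benchmark price per unit
    patent_status: str,      # "generic", "single_source", or "originator"
    is_liquid: bool,
    fill_price: float,       # total price of the fill in dollars
) -> bool:
    """Flag a unit price as too high relative to its benchmark."""
    if benchmark == "AWUP":
        # Last resort: no distinction by patent status or dosage form.
        return rup >= 10 * benchmark_value

    if benchmark == "NADAC":
        if fill_price < 16:  # assumption: floor applies to all NADAC checks
            return False
        generic, single_source_liquid, other = 50, 8, 4
    elif benchmark == "WACUP":
        generic, single_source_liquid, other = 20, 4, 2
    else:
        raise ValueError(f"unknown benchmark: {benchmark}")

    if patent_status == "generic":
        multiplier = generic
    elif patent_status == "single_source" and is_liquid:
        multiplier = single_source_liquid
    else:
        multiplier = other
    return rup > multiplier * benchmark_value
```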
2021 Results. Table 7 presents the number of acquisitions identified as potentially needing and
definitely not needing price imputation in the 2021 data, based on the rules for identifying upper and lower outliers
in unit prices. Few of the acquisitions with complete payment data, 1,046 of 144,929 or 0.7 percent, were flagged as
lower outliers, and a few more, 2,758 out of 144,929 or 1.9 percent, were flagged as upper outliers. Those flagged as
lower outliers were more likely to be generic (772 out of 1,046) than brand name, whereas about two-thirds of those
flagged as upper outliers were single source drugs (1,799 out of 2,758). For most of these lower outliers, imputed
prices replaced reported prices (865 out of 1,046), mainly because the imputed price was more than the reported
price.g Most of those were cases with zero third-party payment amounts (615 out of 865). About two-thirds of the upper
outliers in unit prices were deemed to not need any imputation (1,854 out of 2,758), for example because the total
price of the fill was less than $16.h Most of those upper outliers flagged for imputation had positive third-party
payment amounts (834 out of 904).
More than half of the acquisitions with partial payment data, 34,579 of 63,297, were flagged as lower outliers. A
very small number of the acquisitions with partial payment data were flagged as upper outliers (69 out of 63,297).
For acquisitions with partial payment data not identified as lower or upper outliers, missing payments were set to
zero. For a small proportion of partial payment data flagged as lower outliers that appeared to be partial fills of a
small number of pills, missing values were set to zero.i
Imputing price and payments
Donors for price imputation are the acquisitions with complete price data not identified as outliers and the
acquisitions with partial price data whose missing values are set to zero.j Eight attempts are made to match a donor
acquisition to a recipient acquisition. The first two attempts are the highest-quality imputations, where there is a
donor with the same NDC, so that the RUP can be directly imputed. The first attempt requires an exact match on NDC and
the set of payers, and the second requires an exact match on NDC. The remaining match attempts impute ratios of the
RUP to the NADAC per unit, the WACUP, or the AWUP, depending on whether the NADAC per unit (third and fourth attempts)
or the WACUP (fifth and sixth attempts) are available, and in cases when they are not available, the AWUP (seventh and
eighth attempts). The third, fifth, and seventh attempts require an exact match on GPI and patent status. The fourth,
sixth, and eighth attempts require an exact match on patent status, package unit, and discount flag.
All records flagged as in or after the donut hole are excluded from the donor pool for the second through eighth
matches, because if they matched to fills outside the donut hole, then the imputed prices would be too low. Also
excluded from the donor pool for the third through eighth matches are all records for generics with high price ratios,
because the high ratios are more common among very inexpensive drugs, but these ratios could get matched to expensive
drugs. The match variables in all eight attempts are third-party payer, Medicare Part D low-income subsidy
participation, enrollment in a high-deductible health plan, the person's number of fills for all drugs in the year up
to the date of the recipient fill (to reflect insurance benefit design over the course of the year), the person's age
and private and public HMO enrollment, pharmacy name, pharmacy chain name, quantity, and state. The weight for each
variable is based on its relative contribution to the variation in both price and the proportion of the price paid out
of pocket.
The imputed price is the product of the recipient acquisition's quantity dispensed and an imputed RUP. The
calculation of the imputed RUP depends on the attempt at which a match is found for imputation. If the first or second
match is successful, then the imputed RUP is the donor acquisition's RUP. If the third or fourth match is successful,
then the imputed RUP is based on the donor's ratio of RUP to NADAC per unit and the recipient's NADAC per unit:
Imputed RUP = (donor's RUP ÷ donor's NADAC per unit) × (recipient's NADAC per unit).
Similarly, if the fifth or sixth match is successful, then the imputed RUP is the donor's ratio of RUP to WACUP
multiplied by the recipient's WACUP. Lastly, if the seventh or eighth match is successful, then the imputed RUP is the
donor's ratio of RUP to AWUP multiplied by the recipient's AWUP. There are no limits on the number of times a donor can
match a recipient.
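The calculation by match attempt can be written compactly. In this sketch the function and argument names are assumptions; `donor_benchmark` and `recipient_benchmark` stand for whichever per-unit benchmark the attempt uses (NADAC per unit for attempts 3 and 4, WACUP for 5 and 6, AWUP for 7 and 8).

```python
from typing import Optional


def imputed_rup(
    attempt: int,
    donor_rup: float,
    donor_benchmark: Optional[float] = None,
    recipient_benchmark: Optional[float] = None,
) -> float:
    """Imputed retail unit price for a match found at the given attempt (1 to 8)."""
    if attempt in (1, 2):
        # Same NDC as the donor: the donor's RUP is imputed directly.
        return donor_rup
    if attempt in (3, 4, 5, 6, 7, 8):
        if not donor_benchmark or recipient_benchmark is None:
            raise ValueError("ratio-based attempts need both benchmark values")
        # Donor's ratio of RUP to benchmark, applied to the recipient's benchmark.
        return (donor_rup / donor_benchmark) * recipient_benchmark
    raise ValueError("attempt must be between 1 and 8")


def imputed_price(quantity_dispensed: float, rup: float) -> float:
    """Imputed price: the recipient's quantity dispensed times the imputed RUP."""
    return quantity_dispensed * rup
```

For instance, a donor with an RUP of 3.0 and a benchmark of 1.5 carries a ratio of 2, so a recipient whose benchmark is 2.0 receives an imputed RUP of 4.0 and, for 30 units, an imputed price of 120.0.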
Some corrections for dispensing fees are made for some very low imputed undiscounted prices that are not partial
fills. To better reflect Medicaid's dispensing fees, if the acquisition is brand name, Medicaid is a source of
payment but Medicare is not, and the imputed price is less than $10, then $10 is added to the imputed price. For brand
name acquisitions with payers other than Medicaid, if the imputed price is less than $1, then $1 is added to the
imputed price. Lastly, for a generic acquisition, if the imputed price is less than $0.40, then $1 is added to the
imputed price.
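The three dispensing-fee corrections can be sketched as a single function. Several points are assumptions: payer labels are plain strings, the caller has already restricted the input to undiscounted prices that are not partial fills, "payers other than Medicaid" is read as Medicaid not being a source of payment, and a brand name fill paid by both Medicaid and Medicare is left unchanged because the text states no rule for it.

```python
from typing import Iterable


def add_dispensing_fee(imputed_price: float, is_brand: bool, payers: Iterable[str]) -> float:
    """Apply the dispensing-fee correction to a very low imputed price."""
    payer_set = set(payers)
    if is_brand:
        if "Medicaid" in payer_set:
            # Medicaid is a source of payment but Medicare is not.
            if "Medicare" not in payer_set and imputed_price < 10:
                return imputed_price + 10
            return imputed_price
        # Brand name, Medicaid not among the payers.
        if imputed_price < 1:
            return imputed_price + 1
        return imputed_price
    # Generic acquisition.
    if imputed_price < 0.40:
        return imputed_price + 1
    return imputed_price
```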
Whether any exception is made to basing the imputed price on the imputed RUP, and how the price is distributed by
payer, depend on whether the recipient acquisition had any payment information and whether it was identified as a
lower outlier or upper outlier.
Lower outliers with complete payment data: In two types of cases, the imputed price is not allocated and the
original recipient price is retained (no change): (1) if the imputed price is less than the original recipient price
and (2) if it was a discounted single source drug for an uninsured person. Otherwise, the
increase in price due to imputation is allocated based on the recipient's household-reported type of insurance or
usual third-party payer for drugs. If there is one third-party payer, then the increase is assigned to that payer, and
if there is more than one third-party payer, then the increase is allocated according to a hierarchy of the
third-party payers. If the household did not report any insurance or usual third-party payer, then the increase is
assigned to self-payment.
Upper outliers with complete payment data: In two types of cases, the price is not reduced: (1) if only one pill is
reported, then the quantity is increased instead, and (2) if the price is less than or equal to $16 or the imputed
price is less than $2, then there is no change. Otherwise, the reduction is taken from the third-party payment. If
there is no positive third-party payment, the reduction is taken from the out-of-pocket payment.
Lower outliers with partial payment data: For the infrequent cases where the imputed price is less
than the original recipient price, the original price is retained, and missing values are set to zero. More commonly,
the imputed price less the reported payment amount is allocated among the reported payers, almost always to fill in a
missing third-party payment. That is, if out-of-pocket is the only reported payment, then the increase in price is
entirely allocated to the missing third-party payment.
Upper outliers with partial payment data: If only one pill is reported, then quantity is increased.
If the price is less than or equal to $16, then there is no change. Otherwise, the reduction in price is taken entirely
out of the positive source(s) of payment reported by the pharmacy provider.
Recipients with no payment data or zero reported payments: The donor acquisition supplies both the
unit price and the shares paid out of pocket and by third-party payers. The donor's distribution is sometimes modified
using the following hierarchy of rules:
- If the recipient's out-of-pocket payment amount was zero, the recipient's imputed out-of-pocket payment amount is
almost always set to zero.
- For the perfect matches on NDC and payers, the donor's proportions by payer are applied to the imputed price.
- For acquisitions from federal pharmacies, the out-of-pocket amounts are set using program rules, and the remaining
imputed price is allocated to federal programs.
- If the donor and recipient have different payers, then the share paid out of pocket is imputed and the donor's
share covered by the third-party payer is allocated to the recipient's third-party payer.
- If the recipient's pharmacy did not report a third-party payer and the household reported no insurance and no
usual third-party payer, then generally the out-of-pocket amount is set to the entire imputed price. If the
recipient's acquisition appears to be in the Medicare Part D donut hole, then the price is allocated between
out-of-pocket and Medicare based on program rules for that year. In particular, the manufacturer's share does not
appear in the MEPS data because the manufacturer pays itself.
- If Medicare Part D catastrophic coverage (beyond the donut hole) appears to apply to the recipient's acquisition,
then 95 percent of the price is allocated to Medicare and the rest to out-of-pocket.
Lastly, unusually high and low prices, and cases with gross inconsistencies across fills purchased by a person, are
reviewed by a pharmacist.
Results of Preparing the Pharmacy-Reported Data
The information collected in the PC about an acquisition—the date of the transaction, the NDC (or analogous
identifying drug information), the quantity, the number of days of the drug that the acquisition supplied, and
payments to the pharmacy by source—now also includes the GPI. Imputation has supplied values for any missing NDC or
quantity dispensed. Logical edits have determined third-party payers if necessary, and missing or unrealistic prices
have been replaced with imputed values. With these enhancements, the PC data are now ready to be linked to the drug
names reported in the HC.
Matching Pharmacy Data to Household Data
Overview of Matching
Two general approaches accomplish matching the data reported by pharmacies to the data reported by households. First,
for each of a person's acquisitions in the HC, an attempt is made to find the same or a similar acquisition obtained
by that person in the PC. If this approach fails, the second approach imputes pharmacy data from some other person.
Both matching procedures are conducted at the level of the person-round-drug rather than at the acquisition level. In
the HC data, a "drug" is a unique drug name reported for the person in the round. In the PC data, a "drug" is a set of
acquisitions of pharmaceutically equivalent drug products identical in the active ingredients, dosage form, and
strength (14-digit GPI), whether brand name or generic, by the person in the round. That is, the acquisitions in the
PC data are aggregated to the drug level to mirror the structure of the HC data. In 2021, 144,783 person-round-drugs
in the HC represented 335,291 acquisitions; in the PC, 257,596 acquisitions aggregated to 140,878 person-round-drugs.
After matching a PC person-round-drug to an HC person-round-drug to create a pair, the PC acquisitions within the
aggregated set are unrolled, the HC drug is fanned out into a set of acquisitions, and each HC acquisition is paired
with a PC acquisition in the drug set. If the number of acquisitions differs between the HC and PC, then the number of
acquisitions is determined by the HC and some randomization is used to allocate the PC acquisitions to the HC
acquisitions.
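Aggregating PC acquisitions to the person-round-drug level amounts to grouping on person, round, and 14-digit GPI. The sketch below assumes each acquisition is a dictionary with hypothetical keys `person_id`, `round`, `gpi14`, and `date`; none of these names come from the MEPS files.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

DrugKey = Tuple[str, int, str]  # (person_id, round, 14-digit GPI)


def aggregate_to_drugs(acquisitions: List[dict]) -> Dict[DrugKey, List[dict]]:
    """Group PC acquisitions into person-round-drug sets, kept in date order."""
    drugs: Dict[DrugKey, List[dict]] = defaultdict(list)
    for acq in acquisitions:
        key = (acq["person_id"], acq["round"], acq["gpi14"])
        drugs[key].append(acq)
    for fills in drugs.values():
        # Date order is kept so the set can later be unrolled into acquisitions.
        fills.sort(key=lambda a: a["date"])
    return dict(drugs)
```

Brand name and generic versions of the same product share a 14-digit GPI, so they fall into the same set, mirroring the definition of a PC "drug" above.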
Details of Matching
The first approach: Within-person matching
The first approach to matching—within the person—entails four attempts. First, the procedure seeks PC
data exactly matching on person, round, and pharmaceutically equivalent drug (14-digit GPI) for each
household-reported drug name. Once a match is made, the PC donor person-round-drug is removed from the donor pool and
not matched with any other household-reported drugs for the individual (i.e., the matches are made without
replacement). In the second attempt, the residual household- and pharmacy-reported drugs must match exactly on person
and round. Weighted match variables, which are not required to match exactly, are the drug group, therapeutic class,
active ingredients, and medication name. An HC-PC pair is accepted only if exactly matching on drug group or the
reported medication name. In the second attempt, matches are also made without replacement. For the HC residuals of
the second attempt, the requirement of matching within the round is removed, and all the person's pharmacy-reported
drugs are candidate donors. This third attempt requires exact matches on person and active ingredients; dosage form,
strength, and medication name are weighted match variables. For the HC residuals of the third attempt, the requirement
of exact match on active ingredients is removed (so only exact match on person is required), and all the person's
pharmacy-reported drugs are candidate donors. In this fourth attempt, active ingredients, dosage form, drug group,
drug class, therapeutic group, round, and medication name are weighted match variables. A match is accepted only if it
meets quality standards established for the match attempt. In the third and fourth attempts, matches are made with
replacement.
One variable in the weighted match is medication name, and matching medication names requires specialized software.
We use a method similar to Soundex to match medication name.k Matching PC and HC drugs within person leaves some HC
drugs unlinked to PC data. Some sample persons lack pharmacy data altogether, and some HC drug names remain unmatched
for other reasons (for example, the household respondent failed to mention one of multiple pharmacies used). A second
approach is needed for this reason.
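For readers unfamiliar with phonetic matching, the sketch below implements classic American Soundex. It is not the MEPS procedure, which the report describes only as similar to Soundex; it simply shows how differently spelled names can reduce to the same code.

```python
# Consonant groups and their Soundex digits.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if there are no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit:
            if digit != prev:  # skip repeats of the same code
                code += digit
            prev = digit
        elif ch not in "HW":
            prev = ""  # vowels separate repeated codes; H and W do not
    return (code + "000")[:4]
```

Under this scheme a misspelling such as "Lypiter" receives the same code as "Lipitor", while "Metformin" receives a different one.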
The second approach: Imputation from a different person
The second approach to matching—imputation from a different person—entails a total of 11 match attempts.
It draws on a donor pool of all pharmacy-reported drugs regardless of person (excluding specific free drugs, which are
reconciled later). PC drugs with imputed NDCs are part of the donor pool as well. All these attempts match with
replacement.
The first attempt requires an exact match on active ingredient, dosage form, and strength (14-digit GPI). Weighted
match variables used are the medication name; number of months per acquisition in the round; insurance status and
potential payment sources for the person; whether the person is enrolled in any high-deductible health plan; access to
Medicare Part D low-income subsidy; name of the pharmacy; whether the HC respondent reported that the person used any
mail-order pharmacies; the cumulative number of HC-reported acquisitions of all drugs in the prior and current rounds
of the calendar year (a proxy for healthcare utilization); and the person's age, sex, medical conditions, geographic
region and division, urbanicity, employment status, and self-reported health status.
The second attempt requires an exact match on active ingredient and dosage form (12-digit GPI), and the third
requires an exact match on active ingredient (8-digit GPI). The match variables used in these two attempts are the
same as the ones used in the first match attempt, except that in the third attempt, the 10-digit GPI is used as an
additional match variable.
The fourth attempt requires an exact match on drug class (4-digit GPI) and whether the drug is single source only
(the 10-digit GPI only has single source drugs in the MDDB). The fifth attempt requires an exact match on drug class
(4-digit GPI) only. For the fourth and the fifth attempts, all the weighted match variables for the first attempt are
used, in addition to the strength of the drug. The sixth attempt requires an exact match on drug group (2-digit GPI)
only, and all match variables from the fourth and fifth attempts are used, in addition to the drug class.
The seventh and the eighth attempts both require exact matches on therapeutic group, with the key difference between
the two attempts being that in the eighth attempt, medication name is not used as a match variable (zero weight)
because the drug name reported by the household is not specific. The ninth attempt matches nonspecific
household-reported steroids (e.g., drug name is "steroid") to steroids in the pharmacy data, without matching on drug
name.
The last two attempts do not require any exact matches. The key difference between the tenth and eleventh match
attempts is in the weight on medication name.
After matching person-round-drugs, the HC and PC drug records are expanded into acquisitions, as mentioned above.
Each drug name reported in the HC interview is fanned out to the number of acquisitions that the household reported;
each PC drug, a set of aggregated PC acquisitions, is unrolled back into distinct acquisitions as originally reported
in the PC. Then each HC acquisition is paired with a PC acquisition within the drug-to-drug matched set. Because the
household reports only the number of acquisitions, they have no natural order. Therefore, the acquisitions take the
date order of the PC acquisitions; the order is preserved only in the unique record identifiers. When the household
and the pharmacy both report the same number of acquisitions, then the pairing is one to one by the date order in the
PC. Otherwise, if the number of acquisitions differs between the HC and PC, then the PC acquisitions are paired with
the HC acquisitions in a randomization process that follows the date order of the PC and the unique identifiers of the
acquisitions in the HC. Note that when the drug match is imputed, the order of acquisitions represented by the record
identifier may not be analytically useful because the date order is from a different person or round.
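The pairing of HC and PC acquisitions within a matched drug can be sketched as follows. The report does not spell out the randomization, so the sampling rule used here (draw without replacement when the household reports fewer acquisitions than the pharmacy, reuse randomly chosen pharmacy fills when it reports more) is purely an assumption, as are the function name and the fixed seed.

```python
import random
from typing import List, Tuple


def pair_acquisitions(n_hc: int, pc_dates: List[str], seed: int = 0) -> List[Tuple[int, str]]:
    """Pair HC acquisition numbers 1..n_hc with PC fill dates (ISO strings).

    The number of acquisitions is determined by the HC; the order follows
    the date order of the PC fills.
    """
    if not pc_dates:
        raise ValueError("a matched drug needs at least one PC acquisition")
    rng = random.Random(seed)
    pc_sorted = sorted(pc_dates)
    if n_hc <= len(pc_sorted):
        # Equal counts yield a one-to-one pairing in date order.
        chosen = sorted(rng.sample(pc_sorted, n_hc))
    else:
        extra = [rng.choice(pc_sorted) for _ in range(n_hc - len(pc_sorted))]
        chosen = sorted(pc_sorted + extra)
    return list(enumerate(chosen, start=1))
```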
Matching diabetic supplies and equipment is similar. Pharmacy-reported acquisitions of diabetic supplies and
equipment are mostly aggregated into one "drug" record for each person-round, so that the PC data structure parallels
the HC. Aggregating in this way before matching generates a more accurate representation of the variety of diabetic
supplies purchased. Unfolding PC diabetic supplies and equipment acquired on the same day added 1,432 records to the
file, so the total number of records increased from 333,859 to 335,291.
Table 8 summarizes the results of matching pharmacy data to household data in 2021. Almost three-fourths (72.9
percent) of drugs were acquired by those sample members who had any pharmacy data. Matching was highly successful
for these drugs: 83.6 percent (line 5) of their acquisitions were from their own pharmacies. The imputed
pharmacy matches—acquisitions not matched within the person—made good use of the information given by
household respondents: 89.0 percent (line 11) were matches to the active ingredient of the drug reported by the
household.
In the 2021 PC donor database, 39.1 percent of acquisitions remained unmatched to the HC. A validation study found
that household respondents reported as many acquisitions as were found in claims data, but households reported fewer
drugs and more acquisitions per drug. The drugs acquired but not reported by the household tended to be for short-term
use. They entailed fewer acquisitions and included many anti-infectives, topical agents, and pain medications.8
Editing Matched Data
This section describes edits that apply to the matched HC-PC data, including supplemental data merged into the file
to enhance analytic utility. These four edits apply to fills with drug details and payment information imputed from
another person:
- Imputed fills with too many days supplied. This edit corrects for imputing 90-day fills to some people with more
frequent refills.
- Free antibiotics, anti-diabetics, anti-hypertensives, anti-asthmatics, and prenatal vitamins. This edit allows a
price of zero for certain free acquisitions, such as antibiotics, from chain pharmacies offering free drug programs,
but in a manner that preserves the anonymity of the chain.
- Prices paid by the Civilian Health and Medical Program of the Department of Veterans Affairs (CHAMPVA). This edit
corrects prices and payment amounts by source for some individuals with only CHAMPVA.
- Resolving payer inconsistencies arising from imputation. This edit reconciles differences in payers related to
having matched across persons.
Additional editing applies to some fills:
- Federal pharmacy prices. This edit improves the accuracy of prices of acquisitions from federal
pharmacies.
- Editing year in crossover rounds. This edit allocates to each year purchases reported in the
interviews that cover parts of 2 calendar years.
- Medicare Part D and private insurance. These edits change the payer from private insurance to
Medicare Part D and vice versa for certain people's acquisitions.
Imputed fills with too many days supplied
When imputing PC data from a different person or different round, mismatch in days supplied per fill can occur. For
example, for some recipients, the donors may have 60 or more days supplied, but the recipients may have more frequent
fills, suggesting fewer days supplied. This edit corrects fill details for some of these cases. In particular, if the
days supplied are twice or thrice the days in the round and there are more than 90 extra days supplied, then the
quantities, days supplied, prices, and third-party payer amounts are reduced for consistency with the higher
frequencies of fills.
Free antibiotics, anti-diabetics, and prenatal vitamins
Under certain conditions, payment data are edited for the free antibiotics, anti-diabetics, and prenatal vitamins
that pharmacies report. Among the household-reported acquisitions matched to a different person's pharmacy data, all
the price and payment variables are set to zero for selected free antibiotics, anti-diabetics, anti-hypertensives,
anti-asthmatics, and vitamins. This editing rule applies only if the household respondent reported that the person
used only one pharmacy; that pharmacy chain had a free antibiotic, anti-diabetic, anti-hypertensive, anti-asthmatic,
or prenatal vitamin program; and the antibiotic, anti-diabetic, anti-hypertensive, anti-asthmatic, or prenatal vitamin
was included in the chain's program. In addition, to maintain the pharmacy chain's confidentiality for prices of other
purchases, the person must have resided in a state with two or more pharmacy chains with free programs. In the 2021
data, 26 acquisitions were edited to a price of zero.
Prices paid by CHAMPVA
If a person-round had CHAMPVA insurance; did not have Medicare, Medicaid, or private insurance; did not obtain
medications only at a Department of Defense or an Indian Health Service pharmacy; and did not have a positive payment
amount from CHAMPVA, then the price and third-party payments are set according to the VA pricing rules for the year.
Federal pharmacy prices
When the household reported that the sample member used federal pharmacies, but the payment data were imputed from
nonfederal pharmacies, a federal price list is used to determine prices for acquisitions. For each NDC, the lowest
price on the list is used. These prices reflect the government's cost of acquiring medicines, so an estimate of
dispensing costs is added.
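A minimal sketch of this lookup is shown below. The layout of the price list (pairs of NDC and per-unit price), the flat per-fill dispensing cost, and all names are assumptions made for illustration; the report does not describe the list's structure or the size of the dispensing estimate.

```python
from typing import List, Optional, Tuple


def federal_price(
    ndc: str,
    quantity: float,
    price_list: List[Tuple[str, float]],  # hypothetical (ndc, unit_price) rows
    dispensing_cost: float,               # hypothetical flat estimate per fill
) -> Optional[float]:
    """Price a fill from the lowest listed price for its NDC plus dispensing cost.

    Returns None when the NDC is not on the list, in which case another
    rule would have to supply the price.
    """
    unit_prices = [price for listed_ndc, price in price_list if listed_ndc == ndc]
    if not unit_prices:
        return None
    return min(unit_prices) * quantity + dispensing_cost
```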
Federal prices are assigned to acquisitions in three situations. These situations are defined by information from
both the HC and the PC:
- The PC data are from a federal pharmacy that did not report payment information and the HC and PC acquisitions
match on person and round.
- The household respondent reported the person used only federal pharmacies in that round, and the PC data are from
a different person.
- The household respondent reported the person used at least one federal pharmacy in that round and the PC data,
which are from a different person, are from a federal pharmacy that did not report payment information.
In these three situations, the federal price is merged into the matched HC-PC acquisition by the pharmacy-reported
NDC. Sometimes, however, the NDC from the PC does not exactly match a drug on the federal price list—because the NDC
is imputed, or because there is no negotiated price with the manufacturer of the drug. In these cases, instead of the
NDC, the match is by active ingredient, dosage form, and strength (14-digit GPI) with drug name as a weighted match
variable. The pharmacy-reported NDC, whether reported or imputed, is replaced with the one selected from the federal
price list. The remaining cases retain the price from the PC (whether reported or imputed) because the federal price
schedule does not include every drug.
For the acquisitions with prices taken from the federal price list, the out-of-pocket and federal payments are set
using federal program rules.
In the 2021 data, 4,109 acquisitions (1.2 percent) were assigned federal prices.
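The price lookup described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the production procedure: the price list layout, the NDC and GPI values, the function name `federal_price`, and the flat `dispensing_cost` argument are all hypothetical. The fallback here matches on GPI alone and takes the lowest listed price; the weighted match on drug name is omitted.

```python
# Hypothetical federal price list rows: (NDC, 14-digit GPI, unit price in dollars).
PRICE_LIST = [
    ("00000000001", "27250050000320", 0.12),
    ("00000000001", "27250050000320", 0.10),  # same NDC listed at a lower price
    ("00000000002", "27250050000320", 0.08),
    ("00000000003", "39400010100310", 0.50),
]


def federal_price(ndc, gpi, quantity, dispensing_cost):
    """Return (price, ndc_used), or None if the drug is not on the list."""
    # First try an exact match on the pharmacy-reported NDC.
    candidates = [(price, n) for n, g, price in PRICE_LIST if n == ndc]
    if not candidates:
        # Fall back to drugs with the same active ingredient, dosage form,
        # and strength (GPI). Assumption: the lowest price is also used here.
        candidates = [(price, n) for n, g, price in PRICE_LIST if g == gpi]
    if not candidates:
        return None  # not on the list: the acquisition keeps its PC price
    unit_price, ndc_used = min(candidates)  # lowest listed price
    # Assumption: list prices are per unit, so multiply by quantity dispensed.
    return round(unit_price * quantity + dispensing_cost, 2), ndc_used
```

When the fallback is used, the returned `ndc_used` differs from the input NDC, which mirrors the replacement of the pharmacy-reported NDC with the one selected from the price list.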
Editing year in crossover rounds
In the crossover round interviews (the third interview for all panels; the fifth and seventh rounds for panels that
collected 4 years of data), in which the reference period spans the later part of one year and the early part of the
next, for each drug name, the household respondent is asked to report both the number of times the drug was obtained
since the last interview and the number of those times the drug was obtained in the previous year. When this
information is missing for a drug, the number of acquisitions is allocated to each year. The allocation is based on
the date the person started taking the medication and, for drugs with PC data matched exactly to the HC on person and
round, the number of acquisitions reported by pharmacies in each year. Otherwise, acquisitions are distributed in
proportion to the duration of the crossover round in each year. After this allocation, there were 303,394 acquisitions
in 2021; this is the number of acquisitions (records) in the public use file.
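The fallback rule, allocation in proportion to the duration of the crossover round in each year, can be illustrated with a short sketch. The function name `allocate_by_duration` and the rounding of the first-year share to the nearest whole acquisition are assumptions made for illustration; the report does not state how fractional acquisitions are handled.

```python
from datetime import date


def allocate_by_duration(n_acquisitions, round_start, round_end):
    """Split a count across the two calendar years of a crossover round.

    Returns (acquisitions in first year, acquisitions in second year).
    """
    # A crossover round starts in one year and ends in the next.
    assert round_end.year == round_start.year + 1
    year_end = date(round_start.year, 12, 31)
    days_first = (year_end - round_start).days + 1   # days in the first year
    days_total = (round_end - round_start).days + 1  # days in the whole round
    # Assumption: round the first-year share to a whole number of acquisitions.
    n_first = round(n_acquisitions * days_first / days_total)
    return n_first, n_acquisitions - n_first
```

For a round running from August 1, 2021, through January 31, 2022, 153 of the 184 days fall in 2021, so 6 acquisitions would be split as 5 in 2021 and 1 in 2022.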
Medicare Part D and private insurance
In two types of rare cases, payments are switched between Medicare and private insurance. For Medicare beneficiaries,
some private insurance payments are assumed to be Medicare Part D. This edit applies only to acquisitions for which
the pharmacy data are matched from the beneficiary's pharmacies, not imputed from another person. For a beneficiary
who has an acquisition with a private payment and, according to the HC, has Part D coverage, the private payment is
assumed to be from a Medicare Part D plan. In addition, for an elderly person who has an acquisition with a private
payment and, according to the HC, lacks private drug coverage, the private payment is assumed to be from a Medicare
Part D plan. This edit applied to 773 acquisitions in 2021.
Conversely, some Medicare payments are assumed to be private payments. These edits apply only to acquisitions for
which the pharmacy data were imputed from another person. If an acquisition has an imputed Medicare payment but no
private insurance payment, and the person is not elderly, is not disabled, and has private coverage but no Medicare
Part D, then the Medicare payment is reassigned to private payment. This edit applied to 603 acquisitions in 2021.
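The two edits in this section can be expressed as a pair of rules. The sketch below is illustrative only; the dictionary keys (`matched_within_person`, `part_d`, `private_drug_coverage`, and so on) are hypothetical names, not MEPS variable names, and whole payment amounts are assumed to move between payers.

```python
def edit_medicare_private(acq, person):
    """Return a copy of acq with Medicare and private payments switched
    when one of the two rules applies. All key names are hypothetical."""
    out = dict(acq)
    if acq["matched_within_person"]:
        # Rule 1 (pharmacy data from the person's own pharmacies):
        # a private payment is assumed to come from a Part D plan.
        has_part_d = person["medicare"] and person["part_d"]
        elderly_no_private_rx = (
            person["elderly"] and not person["private_drug_coverage"]
        )
        if acq["private"] > 0 and (has_part_d or elderly_no_private_rx):
            out["medicare"] = acq["medicare"] + acq["private"]
            out["private"] = 0.0
    else:
        # Rule 2 (pharmacy data imputed from another person):
        # an imputed Medicare payment is assumed to be a private payment.
        eligible = (
            not person["elderly"]
            and not person["disabled"]
            and person["private_coverage"]
            and not person["part_d"]
        )
        if acq["medicare"] > 0 and acq["private"] == 0 and eligible:
            out["private"] = acq["medicare"]
            out["medicare"] = 0.0
    return out
```

Because the first rule is restricted to within-person matches and the second to imputed matches, at most one of them can apply to a given acquisition.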
Resolving payer inconsistencies arising from imputation
As addressed above, matching pharmacy data to drug names reported in the HC is based primarily on the drug group,
therapeutic class, active ingredients, dosage form, and strength. Additional criteria include insurance status and
potential third-party payers (although, due to small cell sizes, these variables are not required to match exactly in
the second approach to matching, imputation from another person). Despite high-quality drug-to-drug matching,
inconsistency in payment patterns can result. The donor acquisition and recipient acquisition may have different
payers. For example, the pharmacy (donor)-reported third-party payments can be from an insurer that, according to the
HC, did not provide coverage to the sample person (recipient). This section describes the methods used to identify and
resolve these inconsistencies.
Acquisitions with inconsistencies are defined as those for which the HC-reported insurance coverage differs between
the donor drug record and the recipient drug record. Both the donor and recipient records are classified using a
hierarchy of combinations of insurance (see Table 9). The classification reflects common situations that relate to the
types of third-party payers and amount paid out of pocket. The most common situation is having only one type of
insurance for the entire round, but the classification accounts for other situations as well. When the donor and
recipient have different insurance classifications, they are considered inconsistent.
Most situations of inconsistency between a donor's sources of payment and the recipient's insurance require editing
or additional imputation for resolution. However, resolving inconsistency between insurance and payers is unnecessary
in several situations. First, acquisitions where pharmacy data have been matched within person are always treated as
consistent. Second, acquisitions with prices taken from the federal price list are defined as consistent, because
third-party payments have already been edited. Third, free acquisitions are consistent, because there are no payments.
Fourth, acquisitions are consistent for the small number of drugs covered by Medicare Part B if both the recipient and
the donor are Medicare beneficiaries and one of the following is true: Both have Medicaid as a source of payment; both
have private coverage or any prescription drug coverage; or both do not have Medicaid, private, or any prescription
drug coverage.
Among the 25,279 fills with inconsistencies requiring editing or imputation, two rare situations rely on editing.
First, payments for Medicare Part B drugs for Medicare beneficiaries are allocated to Medicare (78 acquisitions).
Second, for acquisitions that appear to occur after the person has exited the Medicare Part D donut hole and are
therefore in the catastrophic benefit portion, 95 percent of the price is allocated to Medicare and 5 percent to out
of pocket (7 acquisitions).
The method for correcting the remaining inconsistencies (25,194 acquisitions in 2021) is to impute the distribution
of payments from an acquisition with consistent payers using a hot-deck procedure. The donor pool is composed of
acquisitions with consistent payers: those where the PC data were matched within the person to the HC as well as those
from the federal price list. Free acquisitions are excluded, because they lack payment data, and fills for persons who
obtained all their fills from military or TRICARE pharmacies are excluded, because no individuals with these
characteristics will be in the recipient group, as fills obtained by these individuals already have been edited. Fills
with payments that have been edited for consistency are in neither the donor group nor the recipient group. A donor is
drawn from the donor pool with replacement. Class variables are, in order of importance: insurance classification,
patent status, price categories, whether any mail-order pharmacies were used by the person in the round, more detailed
price categories, drug group, drug class, active ingredients, dosage form, and strength. Except for insurance status,
if there are fewer donors than recipients in a cell, then cells are collapsed until the ratio of donors to recipients
is at least 1:1. The imputed distributions of payments are further adjusted for especially rare combinations of
insurance for better alignment between the donor's and the recipient's sources of payment.
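A hot-deck draw with cell collapsing can be sketched as follows. This is a simplified illustration, not the production procedure: the function name `hot_deck`, the record layout, and the collapsing order (dropping the least important remaining class variable one at a time) are assumptions. The first class variable, standing in for insurance classification, is never dropped, and at that final level any available donor is used even if donors are outnumbered, which is also an assumption.

```python
import random
from collections import defaultdict


def cell_key(record, keys):
    """Cell identifier: the record's values on the given class variables."""
    return tuple(record[k] for k in keys)


def hot_deck(recipients, donors, class_vars, rng):
    """Assign each recipient a donor drawn with replacement from its cell.

    class_vars is ordered from most to least important. Returns
    (assignments by recipient id, ids of recipients left without a donor).
    """
    assigned = {}
    pending = list(recipients)
    # Start with all class variables, then drop the least important one
    # at a time; the first variable is always kept.
    for k in range(len(class_vars), 0, -1):
        keys = class_vars[:k]
        donor_cells = defaultdict(list)
        for d in donors:
            donor_cells[cell_key(d, keys)].append(d)
        recipient_cells = defaultdict(list)
        for r in pending:
            recipient_cells[cell_key(r, keys)].append(r)
        pending = []
        for key, recs in recipient_cells.items():
            pool = donor_cells.get(key, [])
            # Require at least one donor per recipient, except at the
            # final level, where any donor in the cell is accepted.
            if len(pool) >= len(recs) or (k == 1 and pool):
                for r in recs:
                    assigned[r["id"]] = rng.choice(pool)["id"]
            else:
                pending.extend(recs)  # collapse this cell at the next level
    return assigned, [r["id"] for r in pending]
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible. Because `rng.choice` does not remove the selected donor from the pool, the same donor can serve several recipients, which is what drawing with replacement means here.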
Payment information from the new donor acquisition is then used to impute third-party amounts and either copayments
or out-of-pocket coinsurance payments. Copayments, including values of zero, are always imputed for Medicaid, the VA,
and TRICARE. For example, if the donor has a Medicaid payment, then the recipient acquisition's out-of-pocket amount
is set equal to the donor acquisition's out-of-pocket amount, and the recipient Medicaid payment is set equal to the
recipient price less the donor out-of-pocket amount, preserving the total. For the remaining third-party payers, either a
copayment or coinsurance is imputed, depending on whether the donor's out-of-pocket amount is a whole number. If the
donor acquisition is generic and the out-of-pocket amount is a whole number, or the out-of-pocket amount is a round
multiple of $5, then the copayment is imputed, and the third-party amount is set as the difference between the price
and copayment, preserving the total. Otherwise, the donor acquisition is treated as a coinsurance case. The proportion
of the donor acquisition's price paid out of pocket and by each third-party payer is calculated. The recipient
acquisition's price is distributed in the same proportions to calculate the recipient out-of-pocket amount and
payments by each third-party payer.
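The copayment and coinsurance rules above can be sketched in a few lines. The function name `impute_payments` and the donor record layout are hypothetical. Two details are assumptions not stated in the report: the copayment is capped at the recipient's price, and when the donor has several third-party payers the non-copayment remainder is split among them in the donor's proportions.

```python
COPAY_PAYERS = {"medicaid", "va", "tricare"}  # copayments are always imputed


def impute_payments(recipient_price, donor):
    """Distribute recipient_price using the donor's payment pattern.

    donor: {"price": float, "oop": float, "generic": bool,
            "payers": {payer_name: amount}} with at least one payer.
    """
    oop = donor["oop"]
    payers = donor["payers"]
    whole_dollar = float(oop).is_integer()
    is_copay = (
        any(p in COPAY_PAYERS for p in payers)
        or (donor["generic"] and whole_dollar)
        or oop % 5 == 0  # round multiple of $5, including zero
    )
    if is_copay:
        # Copayment: carry over the donor's dollar amount
        # (capped at the recipient's price, an assumption).
        new_oop = min(oop, recipient_price)
    else:
        # Coinsurance: carry over the donor's out-of-pocket share of price.
        new_oop = recipient_price * oop / donor["price"]
    # Third-party payers split the remainder in the donor's proportions,
    # so the payments always sum to the recipient's price.
    third_party_total = sum(payers.values())
    result = {"oop": new_oop}
    for payer, amount in payers.items():
        result[payer] = (recipient_price - new_oop) * amount / third_party_total
    return result
```

For a Medicaid donor with a $50 price and a $3 copayment, a recipient acquisition priced at $80 receives a $3 out-of-pocket amount and a $77 Medicaid payment. For a brand-name donor with private insurance paying $12.50 of a $50 price (25 percent), the same recipient receives $20 out of pocket and $60 from private insurance.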
Additional variables
Two sets of variables are merged into the dataset for release to the public: Multum drug name and therapeutic
classes. The source is the Multum MediSource Lexicon database from Multum Lexicon Plus of Oracle Health. These
variables are merged into the file by NDC.
Editing for confidentiality
Before the data are released to the public, automated masking procedures are reviewed by a pharmacist consultant to
ensure the confidentiality of the sample members. Drugs are censored if they are associated with very rare conditions,
particularly orphan drugs, or estimated to be used by fewer than 400,000 individuals, unless use of the drug does not
reveal specific information about the condition treated (for example, cold remedies). In these cases, the drug name is
replaced with a more general therapeutic class name and the NDC is set to "missing." Additional masking ensures
pharmacies are not identifiable. Confidentiality protection affected 10 percent of acquisitions in 2021.
References
1 Hill S, Stagnitti M, Roemer M. Outpatient Prescription Drugs: Data Collection and Editing in
the 2011 Medical Expenditure Panel Survey. MEPS Methodology Report #29. Rockville, MD: Agency for Healthcare Research
and Quality; March 2014. https://meps.ahrq.gov/data_files/publications/mr29/mr29.shtml.
Accessed January 19, 2024.
2 Ding Y, Hill SC. Evaluating Alternative Benchmarks to Improve Identification of Outlier Drug
Prices for Medical Expenditure Panel Survey Prescribed Medicines Data Editing. Working Paper #22001. Rockville, MD:
Agency for Healthcare Research and Quality; September 2022. https://meps.ahrq.gov/data_files/publications/workingpapers/wp_22001.pdf.
Accessed November 10, 2023.
3 Cohen J. Design and Methods of the Medical Expenditure Panel Survey Household Component. MEPS
Methodology Report No. 1. AHCPR Pub. No. 97-0026. Rockville, MD: Agency for Health Care Policy and Research; 1997. https://www.meps.ahrq.gov/mepsweb/data_files/publications/mr1/mr1.shtml.
Accessed November 10, 2023.
4 Cohen S. Sample Design of the 1996 Medical Expenditure Panel Survey Household Component. MEPS
Methodology Report No. 2. AHCPR Pub. No. 97-0027. Rockville, MD: Agency for Health Care Policy and Research; 1997. https://www.meps.ahrq.gov/mepsweb/data_files/publications/mr2/mr2.shtml.
Accessed November 10, 2023.
5 Cohen S. Design strategies and innovations in the Medical Expenditure Panel Survey. Medical
Care, 2003 July;41(7):Supplement III-5-III-12.
6 Master Drug Data Base (MDDB®), Version 2.5. Documentation Manual. Indianapolis, IN: Wolters
Kluwer Health, Inc., 2023.
7 Web Lexicon Plus™. [Documentation.] Denver, CO: Cerner Multum, Inc., 2016.
8 Hill SC, Zuvekas SH, Zodet MW. Implications of the accuracy of MEPS prescription drug data for health
services research. Inquiry. 2011;48(3):242-259.
Suggested Citation
Abdus S, Hill SC, Ahrnsbrak R. Outpatient Prescription Drugs: Data Collection and Editing in the 2021 Medical
Expenditure Panel Survey. Methodology Report #37. Rockville, MD: Agency for Healthcare Research and Quality;
January 2024.
http://www.meps.ahrq.gov/mepsweb/data_files/publications/mr37/mr37.shtml
Tables
Table 1. Summary of editing and imputation rates for key variables, 2021
Dataset and unit (number) | Variable to be imputed | Percentage with editing or imputation |
Household Component (HC) data |
Unique drug names reported for a person in a round (144,783) | Number of acquisitions | 5.1% |
Pharmacy Component (PC) data |
Unique acquisitions of a drug in a year (257,596) | National Drug Code | 2.9% |
Unique acquisitions of a drug in a year (257,596) | Quantity | 0.5% |
Unique acquisitions of a drug in a year (257,596) | Third-party payer | 16.5% |
Unique acquisitions of a drug in a year (257,596) | Price | 31.4% |
Matched HC-PC data |
Unique drug names reported for a person in a round (144,783) | Drug details | 40.3% |
Table 2. Method used to impute the number of acquisitions in the Medical Expenditure Panel Survey Household
Component, 2021
Match type | Number | Percentage |
Exact match to Household Component donor pool on: |
Active ingredients, dosage form, and strength | 7,256 | 93.4% |
Active ingredients and dosage form | 19 | 0.2% |
Active ingredients | 24 | 0.3% |
Drug group | 15 | 0.2% |
Nonexact match |
Antibiotics | 139 | 1.8% |
Missing Generic Product Identifier | 312 | 4.0% |
Total person-round-drugs requiring imputation | 7,765 | 100.0% |
Notes: A person-round-drug is a unique drug name within a person and round. Matching used Wolters Kluwer's Generic Product Identifier to identify drug names with the same active ingredients.
Table 3. Sample size and participation in the Medical Expenditure Panel Survey Pharmacy Component,
2021
Participation type | Pharmacies: number | Pharmacies: percentage | Person-pharmacy pairs: number | Person-pharmacy pairs: percentage |
Eligible sample | 9,079 | 100.0% | 17,698 | 100.0% |
Response | 7,501 | 82.6% | 14,362 | 81.2% |
Refusal | 183 | 2.0% | 1,986 | 11.2% |
Other nonresponse | 1,395 | 15.4% | 1,350 | 7.6% |
Notes: The sample for the Pharmacy Component is derived from all persons in the Household Component who signed permission forms to contact the pharmacies from which they reported obtaining drugs during the rounds that include 2021. Person-pharmacy pairs uniquely count the number of HC sample members reported in the HC as using each pharmacy. Other nonresponse includes unlocatable pharmacies and patients who had no services on record. Veterans Health Administration (VA) pairs: In 2021, there were 492 Pharmacy VA pairs. Data were collected for 470 pharmacy pairs, resulting in a completion rate of 95.5%.
Table 4. Method used to impute NDC in the Medical Expenditure Panel Survey Pharmacy Component, 2021
Imputation method | Number | Percentage |
Acquisitions | 257,596 | 100.0% |
Acquisitions with valid National Drug Codes (NDCs) | 250,023 | 97.1% |
Acquisitions missing or invalid NDCs | 7,573 | 2.9% |
Among acquisitions missing NDCs | 7,573 | 100.0% |
Imputed from same person and round in Pharmacy Component | 403 | 5.3% |
Imputed from the Master Drug Data Base | 6,753 | 89.2% |
Exact match on active ingredients, dosage form, and strength | 6,390 | 84.4% |
Exact match on active ingredients | 308 | 4.1% |
Exact match on drug group | 55 | 0.7% |
Imputed from an acquisition of a person with similar characteristics | 417 | 5.5% |
Note: Matching used Wolters Kluwer's Generic Product Identifier to identify drugs with the same active ingredients.
Table 5. Completeness of payment data in the Medical Expenditure Panel Survey Pharmacy Component,
2021
Payment data completeness | Number | Percentage |
Total acquisitions | 257,596 | 100.0% |
Complete payment data | 144,929 | 56.3% |
Partial payment data | 63,297 | 24.6% |
All payment data missing or zero | 49,370 | 19.2% |
Table 6a. Lower thresholds for identifying prices needing imputation using ratio of retail unit price to price
benchmarks, by payment data completeness, potential for being in Medicare Part D donut hole, flag for being
discounted or being over-the-counter (OTC) medication, any third-party payment, and by patent status, 2021
Pharmacy payment data | Any third-party payment, discounted, in donut hole, or OTC medication | Donut hole | Patent status: single source | Patent status: originator | Patent status: generic |
NADAC per unit |
Complete | Yes | | 0.01 | 0.01 | 0.01 |
Complete | No | | 0.85 | 0.01 | 0.01 |
Partial | | No | 0.95 | 0.95 | 0.42 |
Partial | | Yes | 0.45 | 0.45 | 0.42 |
WACUP (when NADAC is not available) |
Complete | Yes | | 0.01 | 0.01 | 0.01 |
Complete | No | | 0.85 | 0.01 | 0.01 |
Partial | | No | 0.85 | 0.85 | 0.12 |
Partial | | Yes | 0.4 | 0.40 | 0.12 |
AWUP (when neither NADAC nor WACUP is available) |
Complete | No | No | 0.65 | 0.20 | 0.03 |
Complete | Yes | No | 0.4 | 0.0 | 0.0 |
Partial | No or Yes | No | 0.75 | 0.7 | 0.15 |
Complete | No or Yes | Yes | 0.2 | 0.2 | 0.03 |
Partial | No or Yes | Yes | 0.3 | 0.3 | 0.15 |
Note: NADAC = national average drug acquisition cost, WACUP = wholesale acquisition cost unit price,
AWUP = average wholesale unit price. When AWUP is used as a benchmark, information about third-party payers or OTC
medication is not used.
Table 6b. Upper thresholds for identifying prices needing imputation using ratio of retail unit price to price
benchmarks, by patent status and dosage form, 2021
Measure | Patent status and dosage form: single source liquids | all other brand name drugs | generics |
NADAC per unit | 8 | 4 | 50 |
WACUP | 4 | 2 | 20 |
AWUP | 10 | 10 | 10 |
Note: NADAC = national average drug acquisition cost, WACUP = wholesale acquisition cost unit price.
AWUP = average wholesale unit price.
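As an illustration of how the thresholds in Tables 6a and 6b might be applied, the sketch below flags a price by comparing the ratio of the retail unit price to the benchmark against a lower and an upper threshold. The upper thresholds are copied from Table 6b; the function name `classify_price`, the category keys, and the use of strict inequalities at the thresholds are assumptions. The lower threshold is passed in by the caller because it depends on the additional factors shown in Table 6a.

```python
# Upper thresholds from Table 6b, keyed by benchmark and drug category.
UPPER = {
    "NADAC": {"single_source_liquid": 8, "other_brand": 4, "generic": 50},
    "WACUP": {"single_source_liquid": 4, "other_brand": 2, "generic": 20},
    "AWUP": {"single_source_liquid": 10, "other_brand": 10, "generic": 10},
}


def classify_price(retail_unit_price, benchmark_price, benchmark,
                   category, lower_threshold):
    """Flag a retail unit price as a lower outlier, upper outlier, or neither."""
    ratio = retail_unit_price / benchmark_price
    # Assumption: values exactly at a threshold are not flagged.
    if ratio < lower_threshold:
        return "lower outlier"
    if ratio > UPPER[benchmark][category]:
        return "upper outlier"
    return "within range"
```

For example, a generic with partial payment data outside the donut hole has a NADAC lower threshold of 0.42 in Table 6a, so a retail unit price equal to 30 percent of NADAC would be flagged as a lower outlier.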
Table 7. Editing categories by patent status in the Medical Expenditure Panel Survey Pharmacy Component,
2021
Type of payment data | Total | Brand name, single source | Brand name, originator | Generic |
Total acquisitions | 257,596 | 35,728 | 4,219 | 217,649 |
Complete payment data | 144,929 | 19,446 | 2,710 | 122,773 |
No editing | 141,125 | 17,403 | 2,630 | 121,092 |
Lower outlier | 1,046 | 244 | 30 | 772 |
Lower outlier: No change | 181 | 39 | 2 | 140 |
Lower outlier: No third-party payment (impute RUP) | 615 | 154 | 25 | 436 |
Lower outlier: Positive third-party payment (impute RUP) | 250 | 51 | 3 | 196 |
Upper outlier | 2,758 | 1,799 | 50 | 909 |
Upper outlier: No change | 1,854 | 1,588 | 30 | 236 |
Upper outlier: No third-party payment (impute RUP) | 70 | 33 | 3 | 34 |
Upper outlier: Positive third-party payment (impute RUP) | 834 | 178 | 17 | 639 |
Partial payment data | 63,297 | 8,973 | 731 | 53,593 |
Missing values set to zero | 28,649 | 302 | 122 | 28,225 |
Lower outlier | 34,579 | 8,654 | 607 | 25,318 |
Lower outlier: Missing values set to zero | 3,891 | 1,562 | 44 | 2,285 |
Lower outlier: Impute RUP | 30,688 | 7,092 | 563 | 23,033 |
Upper outlier | 69 | 17 | 2 | 50 |
Upper outlier: Missing values set to zero | 26 | 11 | 1 | 14 |
Upper outlier: Impute RUP | 43 | 6 | 1 | 36 |
Free antibiotics, prenatal vitamins, antidiabetics, and glucometers from some pharmacies (no change) | 961 | 312 | 10 | 639 |
Missing all payment data (impute RUP) | 48,409 | 6,997 | 768 | 40,644 |
Note: RUP = retail unit price.
Table 8. Matching and imputation of pharmacy-reported drugs and acquisitions to household-reported drug names
in the Medical Expenditure Panel Survey, 2021
# | Description | Drugs: number | Drugs: percentage | Acquisitions: number | Acquisitions: percentage |
1 | Household-reported totals | 144,783 | 100.0% | 335,291 | 100.0% |
2 | Person had any pharmacy data | 105,618 | 72.9% | 244,429 | 72.9% |
3 | Person had no pharmacy data | 39,165 | 27.1% | 90,862 | 27.1% |
Pharmacy data |
4 | Person had any pharmacy data (line 2) | 105,618 | 100.0% | 244,429 | 100.0% |
5 | Total matched within person (sum of lines 6, 7) | 86,403 | 81.8% | 204,334 | 83.6% |
6 | Matched within person-round | 74,987 | 71.0% | 183,078 | 74.9% |
7 | Matched within person | 11,416 | 10.8% | 21,256 | 8.7% |
8 | Not matched within person | 19,215 | 18.2% | 40,095 | 16.4% |
Unmatched within person |
9 | Total not matched within person (sum of lines 3, 8) | 58,380 | 100.0% | 130,957 | 100.0% |
10 | Imputation with exact match on (rows 11-21) | 57,119 | 97.8% | 128,347 | 98.0% |
11 | Active ingredient (sum of lines 12, 13, 14) | 51,359 | 88.0% | 116,587 | 89.0% |
12 | Active ingredient, dosage form, strength | 34,661 | 59.4% | 77,559 | 59.2% |
13 | Active ingredient, dosage form | 6,170 | 10.6% | 14,746 | 11.3% |
14 | Active ingredient | 10,528 | 18.0% | 24,282 | 18.5% |
15 | Drug class and only single source | 338 | 0.6% | 841 | 0.6% |
16 | Drug class | 1,461 | 2.5% | 3,382 | 2.6% |
17 | Drug group | 2,275 | 3.9% | 4,569 | 3.5% |
18 | Therapeutic group (sum of rows 19, 20) | 1,605 | 2.7% | 2,857 | 2.2% |
19 | Therapeutic group/medicine name | 818 | 1.4% | 1,656 | 1.3% |
20 | Therapeutic group | 787 | 1.3% | 1,201 | 0.9% |
21 | Steroid | 81 | 0.1% | 111 | 0.1% |
22 | Imputation without exact GPI/TG (sum of lines 23, 24) | 1,261 | 2.2% | 2,610 | 2.0% |
23 | Medicine name | 632 | 1.1% | 1,283 | 1.0% |
24 | Weighted match variables | 629 | 1.1% | 1,327 | 1.0% |
Note: In the Household Component, a "drug" is a unique drug name reported for the person in the
round. In the Pharmacy Component, a "drug" is a set of acquisitions of pharmaceutically equivalent drug products
identical in the active ingredients, dosage form and strength, whether brand name or generic, by the person in the
round. Matching used Wolters Kluwer's Generic Product Identifier to identify drugs with the same active ingredients.
Matches are weighted and take into account drug name, types of insurance, health status and chronic conditions, census
division and urbanicity, and demographics.
Table 9. Major categories of hierarchical insurance classification for payer consistency editing
Medicare Part D |
And Medicaid |
Other low-income subsidy (no Part D premium) |
And Veterans Health Administration (VA) |
Likely in the donut hole based on total expenditures on drugs covered by Part D in prior rounds in the calendar year |
And private drug coverage (no Medicaid) |
Other |
Medicaid (no Medicare Part D) |
The whole round, no private insurance |
Part of the round, no private insurance |
And private insurance |
Private drug coverage (no Medicare Part D or Medicaid) |
Part of the round, no TRICARE and no VA |
The whole round, no TRICARE and no VA |
Private insurance (no private drug coverage, Medicare Part D, or Medicaid) |
And VA (No TRICARE) |
And TRICARE |
TRICARE (no private insurance, Medicare Part D, or Medicaid) |
No VA |
And VA |
VA (no private insurance, Medicare Part D, or Medicaid) |
No Medicare |
And Medicare |
Other public |
Private insurance only and not elderly |
Private insurance only and elderly |
Medicare only |
Uninsured |