I have learned many lessons from working with the Medical Expenditure Panel Survey (MEPS) data from the Agency for Healthcare Research and Quality (AHRQ). Some I learned from my own mistakes, and some I learned from other people. The result is a short but evolving set of notes on MEPS and its nuances, which I plan to update as I learn more. If you are interested in what I’ve learned, you can read my notes on my RPubs page, which is here.
MEPS tutorial on interrupted time series analysis in R
I wrote a short tutorial on how to perform an interrupted time series analysis (ITSA) in R. I had a challenging time working on this because I wasn’t familiar with all the nuances of the ITSA. More importantly, I wasn’t able to leverage my Stata skills to do this in R. I’m used to the Stata margins command, which is great for creating contrasts. R has its own version of the margins command, but it lacks some of Stata’s features, such as the pwcompare option, which I use a lot in Stata. However, I found a workaround with linear splines, and I have uploaded this to my RPubs site (link). I hope you find this useful. I also saved my R Markdown code on my GitHub site (link).
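As a rough illustration of the linear spline idea (this is not the code from the tutorial, and it is written in Python rather than R), a segmented regression can carry the change in slope as a spline term, max(0, time - knot). The sketch below assumes a made-up, noise-free series with an intervention at time 6; the helper names `ols` and `itsa_row` are hypothetical.

```python
def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def itsa_row(t, knot):
    """Design row: intercept, time, post-period indicator, linear spline term."""
    return [1.0, float(t), 1.0 if t >= knot else 0.0, float(max(0, t - knot))]


KNOT = 6  # hypothetical intervention time
times = list(range(12))
# Made-up outcome: baseline slope 2, level jump 5, slope increase 1.5 after the knot
y = [10 + 2 * t + (5 if t >= KNOT else 0) + 1.5 * max(0, t - KNOT) for t in times]

X = [itsa_row(t, KNOT) for t in times]
b0, b_time, b_level, b_spline = ols(X, y)

pre_slope = b_time              # trend before the intervention
post_slope = b_time + b_spline  # trend after the intervention
print(round(pre_slope, 6), round(b_level, 6), round(post_slope, 6))
```

With this parameterization, the post-intervention trend is the sum of the time and spline coefficients, so the pre/post slope contrast can be read from the coefficients without a separate pairwise-comparison step.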
MEPS tutorials on linkage files and trend analysis
I created two MEPS tutorials recently. The first covers the use of condition-event linkage files to capture disease-specific costs, with migraine as a motivating example. In this tutorial, I go through the steps to identify migraine-related costs associated with office-based visits and inpatient stays. In the second tutorial, I review how to perform a simple trend analysis with linear regression models. I pooled MEPS data from 2016 to 2021 and applied the appropriate primary sampling units and strata from the pooled file.
The first tutorial is located on my RPubs page (MEPS Tutorial 4 - Using condition-event link (CLNK) file: A case study with migraine). The R Markdown code to create the tutorial is located in my GitHub repository (link).
The second tutorial is also located on my RPubs page (MEPS Tutorial 5 - Simple Trend Analysis with Linear Models). The R Markdown code to create the tutorial is located in my GitHub repository (link).
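To give a sense of the linkage logic in the first tutorial, here is a minimal sketch in Python (not the tutorial's R code). It assumes three toy tables: conditions, a condition-event link table, and events. The identifier names DUPERSID, CONDIDX, and EVNTIDX follow the MEPS naming as I understand it; the `condition`, `type`, and `expenditure` fields, the IDs, and all dollar values are made up for illustration.

```python
from collections import defaultdict

# Toy condition records (one row per person-condition)
conditions = [
    {"DUPERSID": "A1", "CONDIDX": "A1-001", "condition": "migraine"},
    {"DUPERSID": "A1", "CONDIDX": "A1-002", "condition": "hypertension"},
    {"DUPERSID": "B2", "CONDIDX": "B2-001", "condition": "migraine"},
]

# Toy link table: which events are tied to which condition
clnk = [
    {"CONDIDX": "A1-001", "EVNTIDX": "E1"},
    {"CONDIDX": "A1-001", "EVNTIDX": "E2"},
    {"CONDIDX": "A1-002", "EVNTIDX": "E2"},  # E2 is linked to two conditions
    {"CONDIDX": "A1-002", "EVNTIDX": "E3"},
    {"CONDIDX": "B2-001", "EVNTIDX": "E4"},
]

# Toy event records with hypothetical expenditure amounts
events = [
    {"EVNTIDX": "E1", "DUPERSID": "A1", "type": "office", "expenditure": 120.0},
    {"EVNTIDX": "E2", "DUPERSID": "A1", "type": "office", "expenditure": 80.0},
    {"EVNTIDX": "E3", "DUPERSID": "A1", "type": "office", "expenditure": 60.0},
    {"EVNTIDX": "E4", "DUPERSID": "B2", "type": "inpatient", "expenditure": 5000.0},
]


def condition_costs(conditions, clnk, events, target):
    """Sum event expenditures per person and event type for one condition."""
    # Step 1: condition IDs for the target condition
    target_conditions = {c["CONDIDX"] for c in conditions if c["condition"] == target}
    # Step 2: event IDs linked to those conditions (a set, so each event counts once)
    linked_events = {k["EVNTIDX"] for k in clnk if k["CONDIDX"] in target_conditions}
    # Step 3: add up expenditures for the linked events only
    totals = defaultdict(lambda: defaultdict(float))
    for event in events:
        if event["EVNTIDX"] in linked_events:
            totals[event["DUPERSID"]][event["type"]] += event["expenditure"]
    return {person: dict(by_type) for person, by_type in totals.items()}


migraine_costs = condition_costs(conditions, clnk, events, "migraine")
print(migraine_costs)
```

Collecting the linked event IDs in a set is a deliberate choice here: an event tied to more than one condition record should contribute its expenditure only once.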
Tweedie GLM model in R for Cost Data
I wrote a tutorial on using the Tweedie distribution to fit a gamma GLM for cost data in R. Unlike Stata, R is very particular about zeroes when fitting GLMs. Hence, I opted to use the Tweedie distribution, which let me mix and match the link function with the Gamma distribution. I settled on the identity link because it doesn’t involve retransformation and is easy to interpret.
My tutorial is available on my RPubs site and GitHub site.
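For readers new to the Tweedie family, the small Python sketch below (an illustration only, not the tutorial's R code) shows the two ideas at play. The Tweedie variance is proportional to the mean raised to a power, and a power of 2 corresponds to the gamma variance; with an identity link, the fitted mean is the linear predictor itself, so a coefficient reads directly in dollars. The function names and the numbers are hypothetical.

```python
def tweedie_variance(mu, power, dispersion=1.0):
    """Tweedie variance function: Var(Y) = dispersion * mu ** power."""
    return dispersion * mu ** power


def identity_link_mean(intercept, slope, x):
    """With an identity link, the mean equals the linear predictor (no retransformation)."""
    return intercept + slope * x


mu = 100.0
# power 0 matches the normal, 1 the Poisson, 2 the gamma variance function;
# powers strictly between 1 and 2 give a distribution that allows exact zeroes
for power in (0, 1, 1.5, 2):
    print(power, tweedie_variance(mu, power))

# Made-up coefficients: mean cost for a reference group and the added cost for group x = 1
cost_difference = identity_link_mean(4000.0, 1500.0, 1) - identity_link_mean(4000.0, 1500.0, 0)
print(cost_difference)
```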
MEPS Tutorial - Part 3: Applying survey weights using R
In this tutorial, we will review how survey weights from the Medical Expenditure Panel Survey (MEPS) are applied using R.
The tutorial is available on my GitHub site and RPubs.
The R Markdown code I used to generate this tutorial is available on my GitHub site.
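As a quick illustration of what applying a survey weight means (sketched in Python rather than R, with made-up numbers), each respondent's value is multiplied by the number of people that respondent represents. The sketch covers point estimates only; standard errors for MEPS also need the strata and primary sampling units, which this toy example ignores.

```python
def weighted_total(values, weights):
    """Estimated population total: sum of value * weight."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimated population mean: weighted total divided by the sum of weights."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical respondents: annual expenditure in dollars and person-level weight
expenditures = [0.0, 500.0, 2000.0, 10000.0]
person_weights = [12000.0, 8000.0, 5000.0, 1000.0]

unweighted = sum(expenditures) / len(expenditures)
weighted = weighted_mean(expenditures, person_weights)
print(unweighted, weighted)
```

In this toy example, the high spender represents few people, so the weighted mean falls well below the unweighted one.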
MEPS Tutorial - Part 2: Merging Data with R
MEPS Tutorial - Part 1: Loading Data into R
For the last couple of years, I have used Stata whenever I worked with MEPS data. Stata is a great statistical program that allows me to script and analyze data from complex survey designs such as MEPS. However, R is another powerful statistical program that researchers have been using to evaluate and analyze MEPS data. R is free and open source, and it has a large community that constantly builds packages to improve its utility. Because of these advantages, I wanted to start writing tutorials on how to use R to analyze MEPS data.
This first tutorial provides instructions on how to load MEPS data into R, which is a critical step for data analysis.
You can find the tutorial on my RPubs page (link); I also posted this on my GitHub page (link).
For those of you who are interested in how I developed this tutorial, the R Markdown code is located on my GitHub page (link).
In the coming months, I’ll continue to write more tutorials using R with MEPS data, so stay tuned.
Stata tutorial: Adding the 95% Confidence Interval to a Two-way Line Plot
I created a tutorial on how to add the 95% CI to a two-way line plot in Stata. I used the “connected” plot type to generate a line plot, and then I added the 95% CI to each value. Surprisingly, Stata does not have a native feature that lets users generate these 95% CIs on a two-way line plot.
I used the AHRQ Medical Expenditure Panel Survey (MEPS) database for the motivating example. In this tutorial, we plotted the average total healthcare expenditure from 2008 to 2019.
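The interval bounds themselves are simple to compute once each year's mean and standard error are in hand. The Python sketch below is an illustration only: it uses the normal approximation (mean plus or minus 1.96 standard errors), whereas survey software typically uses a t distribution with design-based degrees of freedom, and the yearly values are made up rather than taken from MEPS.

```python
def ci95(mean, se, z=1.96):
    """Lower and upper bounds of an approximate 95% confidence interval."""
    return (mean - z * se, mean + z * se)


# Hypothetical yearly estimates: year -> (mean expenditure, standard error)
yearly_estimates = {
    2008: (4500.0, 110.0),
    2009: (4650.0, 120.0),
    2010: (4700.0, 115.0),
}

# One (year, mean, lower, upper) row per year, ready to draw as a line with error bars
bands = [(year, mean, *ci95(mean, se)) for year, (mean, se) in sorted(yearly_estimates.items())]
for row in bands:
    print(row)
```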
I built this tutorial in Stata, but I used R Markdown to write it. The R Markdown code is located on my GitHub site (Stata - Line plot with 95% CI tutorial).
You can find the tutorial on my GitHub site and RPubs page.
I used Stata SE 17 to build this.
Medical Expenditure Panel Survey (MEPS) Guide - Part 1
INTRODUCTION
The Medical Expenditure Panel Survey (MEPS) is a publicly available dataset on healthcare expenditures that is nationally representative of the US civilian noninstitutionalized population.
The MEPS homepage contains vital information about the methods used to validate household responses and guides on how to properly use these data for research or exploration. You can learn about MEPS in its background section.
MEPS data files are available for download here. The most important files are the Full Year Consolidated Data Files, which contain the data for unique household responses on their characteristics and expenditures. These data are great practice for those interested in learning more about MEPS. Each of the Full Year Consolidated Data Files comes with Documentation and a Code Book that describe the data. For example, the 2017 Full Year Consolidated Data Files Documentation and Code Book are located here.
If you are a Stata user, there are Stata programming statements available to copy and paste into a Stata *.do file. These programming statements are used to transform the MEPS data into a *.dta file that is usable by Stata. Follow the instructions in the programming statement to properly transform the MEPS data. This is similar to an extract-transform-load (ETL) process.
MEPS has a library of reports that use its data. You can search for topics using their search engine. For example, Report #43 describes the annual opioid usage among adults treated for conditions associated with pain versus other conditions from 2013 to 2015.
Other examples of MEPS data being used in research include the following:
Hamad R, Niedzwiecki MJ. The short-term effects of the earned income tax credit on health care expenditures among US adults. Health Serv Res. 2019 Dec;54(6):1295-1304. doi: 10.1111/1475-6773.13204. Epub 2019 Sep 30.
Watanabe JH. Examining the Pharmacist Labor Supply in the United States: Increasing Medication Use, Aging Society, and Evolution of Pharmacy Practice. Pharmacy (Basel). 2019 Sep 19;7(3). pii: E137. doi: 10.3390/pharmacy7030137.
Bounthavong M, Li M, Watanabe JH. An evaluation of health care expenditures in Crohn's disease using the United States Medical Expenditure Panel Survey from 2003 to 2013. Res Social Adm Pharm. 2017 May - Jun;13(3):530-538. doi: 10.1016/j.sapharm.2016.05.042. Epub 2016 May 20.