March 31, 2026

Econometrics, MEPS, R Programming, Statistics & Probability

Two-Part Model with Bootstrap using R

In this article, I wanted to expand on a previous post that describes using a two-part model to model cost (or total expenditure) as an outcome with data from the Agency for Healthcare Research and Quality (AHRQ) Medical Expenditure Panel Survey (MEPS). In the previous article, I used the twopartm package, which is great at leveraging the two-part model approach. However, it does not appear to handle data from complex survey designs like MEPS.

The best way to handle complex survey design data with weights using the two-part model approach is to perform the estimations for each part separately and then combine them.

With a little help from some AI chatbots, I was able to write working code that not only estimates and combines both parts of the two-part model, but also bootstraps the results to generate 95% confidence intervals (CI).
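As a rough illustration of the idea (not the code from the full article), here is a minimal sketch using base R and simulated data. The variable names (`x`, `w`, `cost`) are hypothetical, and the bootstrap simply resamples rows, which ignores strata and primary sampling units. A real MEPS analysis would need design-aware estimation and replication, for example with the survey package.

```r
# Sketch only: simulated data with hypothetical variables
# x = binary group indicator, w = sampling weight, cost = expenditure
set.seed(123)
n <- 500
dat <- data.frame(x = rbinom(n, 1, 0.5), w = runif(n, 0.5, 2))
any_cost <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x))
dat$cost <- any_cost * rgamma(n, shape = 2, rate = 2 / exp(7 + 0.3 * dat$x))

# Fit both parts and return the difference in expected cost (x = 1 vs x = 0)
two_part_diff <- function(d) {
  d$any <- as.numeric(d$cost > 0)
  # Part 1: probability of any expenditure
  # (quasibinomial avoids warnings about non-integer weights)
  p1 <- glm(any ~ x, family = quasibinomial(), data = d, weights = w)
  # Part 2: mean expenditure among those with positive cost
  p2 <- glm(cost ~ x, family = Gamma(link = "log"),
            data = d[d$cost > 0, ], weights = w)
  nd <- data.frame(x = c(0, 1))
  # Combine: Pr(cost > 0) * E(cost | cost > 0)
  e <- predict(p1, nd, type = "response") * predict(p2, nd, type = "response")
  unname(e[2] - e[1])
}

est <- two_part_diff(dat)

# Naive bootstrap: resample rows with replacement and refit both parts
boot <- replicate(200, two_part_diff(dat[sample(n, replace = TRUE), ]))
ci <- quantile(boot, c(0.025, 0.975))

est
ci
```

Refitting both parts inside each bootstrap replicate is what lets the interval reflect the uncertainty from both models at once.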

The complete article on how to construct a two-part model with bootstrap using R is available on my RPubs site (link).

Mark Bounthavong

February 27, 2026

Methods, Statistics & Probability

One-sample z-test of proportions in R

There are situations where you will be asked to compare the performance of your institution with another institution. This is commonly done with projects that I’m on where data collection occurs at a single site, and stakeholders want to compare the single site’s findings with a reference site. More commonly, stakeholders want to compare their performance to a published paper’s findings. In other words, we want to compare an observed finding to a theoretical one.

In the case of proportions, we can compare the proportion of individuals who experienced an event at a single site to the proportion from a published study. To do that, we can use the one-sample z-test of proportions.
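A minimal sketch of the calculation, using made-up numbers (45 events out of 200 at the single site, compared with a published proportion of 0.30), might look like this. The z statistic is computed by hand and then checked against base R's `prop.test()` without continuity correction, whose chi-square statistic is the square of z.

```r
# Hypothetical numbers for illustration only
x  <- 45     # observed events at the single site
n  <- 200    # sample size at the single site
p0 <- 0.30   # expected (reference) proportion from a published study

p_hat <- x / n

# One-sample z-test: the standard error uses the null proportion p0
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value <- 2 * pnorm(-abs(z))   # two-sided p-value

# Cross-check with prop.test(); correct = FALSE turns off the continuity correction
pt <- prop.test(x, n, p = p0, correct = FALSE)

c(z = z, p_value = p_value)
pt$statistic   # X-squared, equal to z^2
pt$p.value
```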

I wrote a guide on how to perform a one-sample z-test of proportions to determine if the observed proportion of events is significantly different from an expected proportion, which is available on my RPubs site (link).

Mark Bounthavong

December 31, 2025

Epidemiology, Literary Cafe, Methods, R Programming, Statistics & Probability

Literary Cafe series: Policy analysis (Part 2) - Interrupted Time Series Analysis with publicly available data

I’m back with some Literary Cafe series updates.

I regularly have informal discussions with my students about interesting papers in the biomedical sciences. Recently, we discussed a great paper by Jurecka and colleagues on the impact of a statewide law to change the definition of fentanyl possession on opioid-related overdose death rates.

Jurecka and colleagues used publicly available data to perform their research, and I wanted to show my students how this was done using CDC WONDER data. Hence, I started this Literary Cafe series to document these exercises for others to learn from.

Last month, I wrote an article on how to get data from the CDC WONDER site, which you can read here. I considered this Part 1 (Getting the data).

This is the second part of a two-part series that illustrates how to use publicly available data to replicate the findings from a published study. In Part 2, I use the data from Part 1 to analyze the impact of the statewide fentanyl possession law on opioid-related overdose death rates using an interrupted time series analysis. I posted this on my RPubs site (link) along with Part 1 (link).
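For readers unfamiliar with the method, an interrupted time series analysis is often set up as a segmented regression. The sketch below uses simulated monthly rates and hypothetical variable names, and it assumes a single interruption at month 24. It is not the model from the paper or from my RPubs article, and it ignores autocorrelation, which a real analysis should check.

```r
# Simulated monthly rates (hypothetical data, not CDC WONDER)
set.seed(42)
its <- data.frame(time = 1:48)
its$post       <- as.numeric(its$time > 24)   # 1 after the law takes effect
its$time_after <- pmax(0, its$time - 24)      # months since the law

its$rate <- 10 + 0.2 * its$time + 3 * its$post -
  0.3 * its$time_after + rnorm(48, sd = 0.5)

# Segmented regression:
#   time       = pre-law trend
#   post       = immediate level change when the law takes effect
#   time_after = change in trend after the law
fit <- lm(rate ~ time + post + time_after, data = its)
round(coef(fit), 2)
```

The `post` coefficient estimates the immediate level change, and `time_after` estimates how much the slope changes after the interruption.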

Mark Bounthavong

November 30, 2025

Literary Cafe, Statistics & Probability, Epidemiology

Literary Cafe series - Policy Analysis (Part 1): Getting Data From CDC WONDER

This is Part 1 of a series of articles that I plan to write on how to perform analyses using publicly available data inspired by published studies.

To start, I wrote an article on how to get death data from CDC WONDER, which I posted on my RPubs site here.

I’m not sure how these articles will evolve, so I’ll start with something simple like this first part, which is to gather the data to perform the analysis (Part 2 is available here).

Meanwhile, I think I’ll call this series of articles the “Literary Cafe series.” (Note: I know that this title needs work.)

Mark Bounthavong

August 30, 2025

R Programming, Statistics & Probability

R - Tips and Tricks (Guide) - Part 2

I wrote a second R guide to help students navigate and use R and RStudio in their biostatistics course. I focused on creating vectors, matrices, and dataframes.
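As a quick taste of those three structures, here is a small example with made-up values (the object names are arbitrary and not taken from the guide):

```r
# A numeric vector
v <- c(2, 4, 6)

# A 2 x 3 matrix, filled column by column (the default)
m <- matrix(1:6, nrow = 2, ncol = 3)

# A data frame: columns can hold different types
df <- data.frame(id = 1:3, score = v, group = c("a", "b", "a"))

length(v)   # number of elements in the vector
dim(m)      # rows and columns of the matrix
str(df)     # structure of the data frame
```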

The guide can be found on my RPubs site.

Mark Bounthavong

July 29, 2025

Methods, R Programming, Statistics & Probability

Ratio of risk ratios in R

I ran into a problem where I had two risk ratios and wanted to test whether they were statistically different from each other. I couldn’t find an R package, but I found a paper by Altman and Bland that goes over the step-by-step process. I wrote a tutorial on how to perform this method using R, which is available on my RPubs page (link).
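In outline, the approach works on the log scale: recover each standard error from its 95% CI, take the difference of the log risk ratios, and test that difference with a z statistic. The numbers below are hypothetical, and this is my own condensed sketch rather than the code from the tutorial.

```r
# Hypothetical risk ratios with 95% CIs from two independent subgroups
rr1 <- 0.70; lo1 <- 0.55; hi1 <- 0.90
rr2 <- 0.95; lo2 <- 0.75; hi2 <- 1.20

zcrit <- qnorm(0.975)

# Standard errors of the log risk ratios, recovered from the CI widths
se1 <- (log(hi1) - log(lo1)) / (2 * zcrit)
se2 <- (log(hi2) - log(lo2)) / (2 * zcrit)

# Difference on the log scale and its standard error
d    <- log(rr1) - log(rr2)
se_d <- sqrt(se1^2 + se2^2)

# z-test for the difference
z <- d / se_d
p <- 2 * pnorm(-abs(z))

# Ratio of risk ratios with its 95% CI
rrr    <- exp(d)
rrr_ci <- exp(d + c(-1, 1) * zcrit * se_d)

c(RRR = rrr, lower = rrr_ci[1], upper = rrr_ci[2], z = z, p = p)
```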

Reference:

Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ. 2003 Jan 25;326(7382):219. doi: 10.1136/bmj.326.7382.219. PMID: 12543843; PMCID: PMC1125071.

Mark Bounthavong

May 26, 2025

R Programming, Statistics & Probability

Transform data from wide to long format using R

Often, when we input data into a spreadsheet, we use the wide format, where each repeated measurement gets its own column. But when we perform longitudinal analyses, we need to transform the data to the long format, where each measurement gets its own row.

Sometimes, I forget how to do this in R, so I decided to write a tutorial to remind myself how to do this.

Therefore, I wrote a tutorial on using the pivot_longer() function to transform data from the wide to long format in preparation for longitudinal data analysis. The tutorial is located on my RPubs page.
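Here is a small sketch of that transformation using a made-up dataset with three blood pressure measurements per person. The column names (`sbp_1`, `sbp_2`, `sbp_3`) are hypothetical, and the example assumes the tidyr package is installed.

```r
library(tidyr)

# Wide format: one row per person, one column per visit (hypothetical data)
wide <- data.frame(
  id    = 1:3,
  sbp_1 = c(120, 135, 128),
  sbp_2 = c(118, 130, 125),
  sbp_3 = c(115, 128, 126)
)

# Long format: one row per person-visit
long <- pivot_longer(
  wide,
  cols         = starts_with("sbp_"),
  names_to     = "visit",
  names_prefix = "sbp_",   # strip the prefix so only the visit number remains
  values_to    = "sbp"
)
long$visit <- as.integer(long$visit)

long
```

The result has nine rows (three people by three visits) with the columns `id`, `visit`, and `sbp`.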

Mark Bounthavong

April 30, 2025

R Programming, Statistics & Probability

Generate data using the simstudy package in R

There are times when you need a dataset to test code or a formula, but a suitable one is hard to find or is not publicly available. To get around this problem, we can generate our own data. R provides several tools for us to accomplish this.

I wrote a short guide on how to generate data using the simstudy package in R. You can read how to do this on my RPubs site (link).
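To give a flavor of the workflow, the sketch below defines three variables and generates 100 observations. The variable names and parameter values are made up for illustration, and the example assumes the simstudy package is installed.

```r
library(simstudy)

set.seed(2025)

# Define the variables one at a time (hypothetical names and parameters)
def <- defData(varname = "age", dist = "normal", formula = 50, variance = 100)
def <- defData(def, varname = "female", dist = "binary", formula = 0.5)
# sbp depends on the previously defined variables
def <- defData(def, varname = "sbp", dist = "normal",
               formula = "100 + 0.5 * age - 3 * female", variance = 64)

# Generate 100 observations; genData() adds an id column
dd <- genData(100, def)

head(dd)
```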

Mark Bounthavong

February 26, 2025

Econometrics, Epidemiology, MEPS, Methods, R Programming, Statistics & Probability

Propensity score matching in R

I wrote an introductory tutorial on how to perform propensity score matching using R, which has been posted on my RPubs site (link).

Propensity score matching is a statistical approach to balancing the observed covariates between groups. In observational studies, this method can mitigate confounding by observed covariates and support causal interpretations. However, there are a lot of approaches and nuances. This introductory tutorial presents the basics of propensity score methods and how we can use them in our conventional analyses.
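As a bare-bones illustration, the sketch below performs 1:1 nearest-neighbor matching on a logistic-regression propensity score. It assumes the MatchIt package and its bundled `lalonde` example dataset, which may differ from the package, data, and covariates used in the tutorial.

```r
library(MatchIt)

# Example dataset that ships with MatchIt
data("lalonde", package = "MatchIt")

# 1:1 nearest-neighbor matching without replacement;
# the propensity score is estimated with logistic regression (distance = "glm")
m_out <- matchit(treat ~ age + educ + married + nodegree + re74 + re75,
                 data = lalonde, method = "nearest", distance = "glm")

# Covariate balance before and after matching
summary(m_out)

# Matched sample for the outcome analysis
matched <- match.data(m_out)
table(matched$treat)
```

Checking covariate balance in the matched sample comes before any outcome analysis, since matching is only useful if it actually balances the observed covariates.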

Mark Bounthavong

January 30, 2025

Data visualization, Econometrics, MEPS, Stata programming, Statistics & Probability

Stata - marginsplot & mplotoffset commands for plotting average marginal effects

In Stata, users have a lot of flexibility with creating plots, particularly after the margins command has been executed. Once a regression command has been run, users can estimate the average marginal effect of a factor with respect to another variable using the margins command. Once the average marginal effect has been estimated, users can plot it using the marginsplot or mplotoffset commands. These are powerful tools that allow us to visualize the average marginal effects, particularly when we have interaction terms.

I posted a tutorial on my RPubs site that reviews some basic features of the marginsplot and mplotoffset commands and provides some practical examples of customization.
