February 27, 2026

One-sample z-test of proportions in R

There are situations where you will be asked to compare the performance of your institution with that of another institution. This comes up often in my projects, where data collection occurs at a single site and stakeholders want to compare that site’s findings with a reference site. More commonly, stakeholders want to compare their performance to a published paper’s findings. In other words, we want to compare an observed finding to a theoretical one.

In the case of proportions, we can compare the proportion of individuals who experienced an event at a single site to the proportion from a published study. To do that, we can use the one-sample z-test of proportions.

I wrote a guide on how to perform a one-sample z-test of proportions to determine if the observed proportion of events is significantly different from an expected proportion, which is available on my RPubs site (link).
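As a minimal sketch of the idea (not the code from the RPubs guide), the test can be computed by hand in base R and cross-checked against `prop.test()`. The counts and the reference proportion below are hypothetical values chosen only for illustration.

```r
# Hypothetical numbers, for illustration only
events <- 45    # individuals with the event at the single site
n      <- 200   # individuals observed at the single site
p0     <- 0.30  # reference proportion (e.g., from a published study)

p_hat <- events / n                  # observed proportion
se0   <- sqrt(p0 * (1 - p0) / n)     # standard error under the null hypothesis
z     <- (p_hat - p0) / se0          # z statistic
p_val <- 2 * pnorm(-abs(z))          # two-sided p-value

# Cross-check: without the continuity correction, prop.test() reports
# a chi-squared statistic equal to z^2 and the same two-sided p-value
check <- prop.test(x = events, n = n, p = p0, correct = FALSE)

print(c(z = z, p_value = p_val, chi_squared = unname(check$statistic)))
```

The normal approximation behind this test is usually considered reasonable when both n * p0 and n * (1 - p0) are not too small; for small samples, an exact test such as `binom.test()` may be a better choice.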

Mark Bounthavong

December 31, 2025

Epidemiology, Literary Cafe, Methods, R Programming, Statistics & Probability

Literary Cafe series: Policy analysis (Part 2) - Interrupted Time Series Analysis with publicly available data

I’m back with some Literary Cafe series updates.

I regularly have informal discussions with my students about interesting papers in the biomedical sciences. Recently, we discussed a great paper by Jurecka and colleagues on the impact of a statewide law to change the definition of fentanyl possession on opioid-related overdose death rates.

Jurecka and colleagues used publicly available data to perform their research, and I wanted to show my students how this was done using CDC WONDER data. Hence, I started this Literary Cafe series to document these exercises for others to learn from.

Last month, I wrote an article on how to get data from the CDC WONDER site, which you can read here. I considered this Part 1 (Getting the data).

This is the second part of a two-part series that illustrates how to use publicly available data to replicate the findings from a published study. In Part 2, I use the data from Part 1 to analyze the impact of the statewide fentanyl possession law on opioid-related overdose death rates using an interrupted time series analysis. I posted this on my RPubs site (link) along with Part 1 (link).

Mark Bounthavong

July 29, 2025

Methods, R Programming, Statistics & Probability

Ratio of risk ratios in R

I ran into a problem where I had two risk ratios and wanted to test whether they were statistically different from each other. I couldn’t find an R package, but I found a paper by Altman and Bland that goes over the step-by-step process. I wrote a tutorial on how to perform this method using R, which is available on my RPubs page (link).
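The general approach described by Altman and Bland can be sketched in a few lines of base R. This is an illustrative version rather than the code from the tutorial: the function name `ratio_of_rr` and the input values are hypothetical, and the calculation assumes the two risk ratios come from independent samples and that each confidence interval is symmetric on the log scale.

```r
# Compare two independent risk ratios on the log scale.
# ratio_of_rr() is a hypothetical helper written for this example.
ratio_of_rr <- function(rr1, lo1, hi1, rr2, lo2, hi2, level = 0.95) {
  zcrit <- qnorm(1 - (1 - level) / 2)

  # Recover each standard error from the width of its confidence interval
  se1 <- (log(hi1) - log(lo1)) / (2 * zcrit)
  se2 <- (log(hi2) - log(lo2)) / (2 * zcrit)

  d    <- log(rr1) - log(rr2)     # difference of the log risk ratios
  se_d <- sqrt(se1^2 + se2^2)     # standard error of the difference
  z    <- d / se_d                # test of interaction

  c(rrr     = exp(d),                   # ratio of risk ratios
    lower   = exp(d - zcrit * se_d),
    upper   = exp(d + zcrit * se_d),
    z       = z,
    p_value = 2 * pnorm(-abs(z)))
}

# Hypothetical estimates: RR 0.60 (0.40 to 0.90) versus RR 0.90 (0.70 to 1.16)
res <- ratio_of_rr(0.60, 0.40, 0.90, 0.90, 0.70, 1.16)
print(round(res, 3))
```

A ratio of risk ratios close to 1, with a confidence interval that includes 1, suggests that the two estimates are not statistically different.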

Reference:

Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ. 2003 Jan 25;326(7382):219. doi: 10.1136/bmj.326.7382.219. PMID: 12543843; PMCID: PMC1125071.

Mark Bounthavong

March 30, 2025

Epidemiology, Methods, R Programming

Medication adherence estimations using R - Part 1

I created a tutorial on how to use the AdhereR package in R to estimate the medication adherence rate for a sample of individuals with prescription claims data. I posted the tutorial on my RPubs page (link).

The two most common medication adherence measures are the Medication Possession Ratio (MPR) and the Proportion of Days Covered (PDC). This tutorial reviews how to estimate these medication adherence rates using AdhereR in R.
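To make the difference between the two measures concrete, here is a small base R sketch for one hypothetical patient. It does not use AdhereR, and it makes simplifying assumptions: a fixed 90-day observation window, an MPR defined as total days supplied divided by the days in the window, and a PDC that counts each calendar day once without shifting overlapping fills forward. Definitions vary across studies, so treat this as one possible version.

```r
# Hypothetical prescription fills for a single patient
fills <- data.frame(
  fill_date   = as.Date(c("2024-01-01", "2024-01-25", "2024-03-15")),
  days_supply = c(30, 30, 30)
)

# Hypothetical observation window (90 days in total)
window_start <- as.Date("2024-01-01")
window_end   <- as.Date("2024-03-30")
period_days  <- as.integer(window_end - window_start) + 1

# MPR: total days supplied over days in the window (can exceed 1)
mpr <- sum(fills$days_supply) / period_days

# PDC: flag each day in the window that is covered by at least one fill
first_day <- as.integer(fills$fill_date - window_start) + 1  # day 1 = window start
covered   <- logical(period_days)
for (i in seq_len(nrow(fills))) {
  days <- seq(first_day[i], length.out = fills$days_supply[i])
  days <- days[days >= 1 & days <= period_days]  # drop days outside the window
  covered[days] <- TRUE                          # overlapping days count once
}
pdc <- sum(covered) / period_days

print(c(MPR = mpr, PDC = pdc))
```

In this example the early refill and the supply that runs past the end of the window both inflate the MPR, while the PDC only counts the days that are actually covered within the window.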

Mark Bounthavong

February 26, 2025

Econometrics, Epidemiology, MEPS, Methods, R Programming, Statistics & Probability

Propensity score matching in R

I wrote an introductory tutorial on how to perform propensity score matching using R, which has been posted on my RPubs site (link).

Propensity score matching is a statistical approach to balancing the observed covariates between groups. In observational studies, this method has the potential to mitigate confounding and allow us to make causal interpretations. However, there are a lot of approaches and nuances. This introductory tutorial presents the basics of propensity score methods and how we can use these in our conventional analyses.

Mark Bounthavong

December 25, 2024

Econometrics, Epidemiology, Methods, R Programming, Statistics & Probability

Prepost analysis with continuous data using R - Part 1

I wrote a tutorial on how to perform a simple prepost analysis using R, which is available on my RPubs page. It covers how to compare two differences (the change in value before and after an intervention) using independent t-test and linear regression approaches. However, it doesn’t cover how to address the correlation between two dependent values. Part 2 of prepost analysis will cover those issues.
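The two approaches can be sketched with simulated data. This is an illustration under assumed values, not the code from the tutorial: the sample size, means, and effect sizes below are made up, and the t-test uses the equal-variance form so that it matches the regression exactly.

```r
set.seed(123)  # simulated data, for illustration only

n     <- 50
group <- rep(c(0, 1), each = n)                 # 0 = control, 1 = intervention
pre   <- rnorm(2 * n, mean = 100, sd = 10)      # value before the intervention
post  <- pre + 2 + 5 * group + rnorm(2 * n, sd = 5)
change <- post - pre                            # change score for each individual

# Approach 1: independent t-test comparing the mean change between groups
tt <- t.test(change ~ group, var.equal = TRUE)

# Approach 2: linear regression of the change score on group
fit <- lm(change ~ group)

print(tt$estimate)           # mean change in each group
print(coef(summary(fit)))    # the group coefficient is the difference in mean change
```

With equal variances assumed, the coefficient on `group` equals the difference in mean change between the groups, and its p-value matches the one from the t-test. The regression form is convenient because covariates can be added later.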

Mark Bounthavong

November 30, 2024

Econometrics, Methods, Statistics & Probability

Some cool websites on study design and biostatistics

This month (November 2024), I wanted to take a break from writing tutorials and articles. Instead, I wanted to update myself on (and share) some very helpful online resources.

A colleague of mine introduced me to a website called Datamethods. It’s mainly a discussion forum, but it has some useful resources. This particular post contains references that are very useful for anyone who is interested in study design and biostatistics (link). It is a collection of papers and articles that address common myths and practices regarding the application of biostatistics in study designs.

Another great website is Scott Cunningham’s Mixtape Sessions. He has a book about causal inference called Causal Inference: The Mixtape, and he has regular workshops. I attended his Causal Inference Part 2 workshop, and it was amazing. We learned about the basic difference-in-differences methods (coding in R and Stata), and the innovations surrounding these methods (e.g., Callaway & Sant’Anna’s staggered difference-in-differences approach). Scott also provides the historical perspectives on these methods, which are as insightful as they are entertaining. Moreover, he conducts interviews with prominent econometricians, which he posts on his YouTube channel.

Hopefully, these sites are as useful for you as they have been for me.

Mark Bounthavong

October 28, 2024

Methods, Stata programming, Statistics & Probability

Linear spline (piecewise) models in Stata

I wrote a tutorial on how to construct linear spline (also known as piecewise) models using Stata, which has been uploaded to my RPubs site.

Previously, I developed a tutorial on using the linear spline method for interrupted time series analysis with Stata. However, I did not properly go over the mkspline command.

In this tutorial, I review the mkspline command and its marginal option. Without the marginal option, the coefficients are interpreted as the slope within each segment; with it, they are interpreted as the change in slope between segments.

Mark Bounthavong

June 23, 2024

Epidemiology, Econometrics, Methods, R Programming, Statistics & Probability

Mediation analysis using R

It’s not uncommon to see covariates in a regression model that should not be there. For example, measurements that occur after the treatment assignment are sometimes included in a regression model as baseline covariates. Instead, one should consider a mediation analysis.

I wrote a tutorial on how to perform mediation analysis using R on my RPubs site (link).

I know that I make this mistake at times. This tutorial helped me to carefully consider which covariates to include in a regression model and which ones to consider for mediation analysis.

Mark Bounthavong

February 26, 2024

Epidemiology, Methods, Stata programming, Statistics & Probability

Survival Analysis - Immortal Time Bias with Stata

I wrote a tutorial on how to handle immortal time bias with survival analysis using Stata. In the tutorial, I used a time-varying predictor for the grouping variable and assigned the period before exposure to the control group. This was inspired by the paper Redelmeier and Singh wrote on “Survival in Academy Award-Winning Actors and Actresses.” There was a lot of debate about the rigor of their analyses, and Sylvestre and colleagues re-analyzed the data with immortal time bias in mind. This tutorial uses data from Sylvestre and colleagues to re-create their results.

The tutorial is on my RPubs page. Data used for the tutorial is located on my GitHub page.

To load the data, you can use the Stata import command:

  
    import delimited "https://raw.githubusercontent.com/mbounthavong/Survival-analysis-and-immortal-time-bias/main/Data/data1.csv"
  
