I wrote a tutorial on how to perform a simple pre-post analysis using R, which is available on my RPubs page. It covers how to compare two differences (the change in value before and after an intervention) using independent t-test and linear regression approaches. However, it doesn’t cover how to address the correlation between two dependent values. Part 2 of the pre-post analysis will cover those issues.
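As a rough illustration of comparing two differences, the sketch below computes each subject's change score (post minus pre) in two groups and then a Welch-style t statistic on those changes. It is written in Python with made-up numbers, not taken from the tutorial; the function names and data are hypothetical, and the unequal-variance (Welch) form is an assumption, since the tutorial may use the pooled-variance test instead.

```python
from math import sqrt
from statistics import mean, variance


def change_scores(pre, post):
    """Return post minus pre for each subject (lists must be paired)."""
    return [after - before for before, after in zip(pre, post)]


def welch_t(a, b):
    """Return (difference in means, Welch t statistic) for two samples."""
    # Standard error under unequal variances (an assumption of this sketch).
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    return diff, diff / se


# Hypothetical data: three subjects per group.
treated = change_scores(pre=[10, 12, 14], post=[13, 16, 19])
control = change_scores(pre=[10, 12, 14], post=[10, 13, 16])

diff, t_stat = welch_t(treated, control)
print("difference in mean change:", diff)
print("t statistic:", round(t_stat, 3))
```

Like the simple analysis described above, this treats the change scores as independent observations and ignores the correlation between each subject's pre and post values.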
Linear spline (piecewise) models in Stata
I wrote a tutorial on how to construct linear spline (also known as piecewise) models using Stata, which has been uploaded to my RPubs site.
Previously, I developed a tutorial on using the linear spline method for interrupted time series analysis with Stata. However, I did not properly go over the mkspline command.
In this tutorial, I review the mkspline command and the marginal option to generate coefficients that could be interpreted as the slope within each segment or the change in slope between segments, respectively.
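To make the two parameterizations concrete, here is a minimal sketch of how linear spline terms can be built for a single value. It is written in Python rather than Stata, purely for illustration; `spline_basis` is a hypothetical helper, and the formulas reflect my understanding of the usual linear spline construction, not output copied from mkspline.

```python
def spline_basis(x, knots, marginal=False):
    """Return the linear spline terms for one value x, given sorted knots."""
    if marginal:
        # First term is x itself; each later term is how far x lies past a knot.
        # Coefficients on the later terms read as changes in slope.
        return [x] + [max(0.0, x - k) for k in knots]

    # Otherwise each term is the part of x that falls inside one segment,
    # so each coefficient reads as the slope within that segment.
    edges = [float("-inf")] + list(knots) + [float("inf")]
    terms = []
    for i in range(len(edges) - 1):
        lo, hi = edges[i], edges[i + 1]
        if i == 0:
            terms.append(min(x, hi))
        else:
            terms.append(max(min(x, hi), lo) - lo)
    return terms


# Hypothetical knots at 10 and 20, evaluated at x = 25.
print(spline_basis(25, [10, 20]))                 # segment-wise terms
print(spline_basis(25, [10, 20], marginal=True))  # marginal terms
```

In the segment-wise form the terms always add up to x, which is why each coefficient can be read as a slope on its own segment.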
Staggered difference-in-differences using R
I was interested in learning how to apply the Callaway & Sant’Anna staggered difference-in-differences framework to my work. After reading several papers and watching the video by Sant’Anna, I wrote a short tutorial on how to apply this framework to simulated data. The tutorial is located on my RPubs site.
This method uses the R “did” package, which is based on the paper by Callaway & Sant’Anna.
Mediation analysis using R
It’s not uncommon to see covariates in a regression model that should not be there. For example, measurements that occur after the treatment assignment are sometimes included in a regression model as baseline covariates. Instead, one should consider a mediation analysis for these variables.
I wrote a tutorial on how to perform mediation analysis using R on my RPubs site.
I know that I make this mistake at times. This tutorial helped me to carefully consider which covariates to include in a regression model and which ones to consider for mediation analysis.
Interrupted time series analysis (ITSA) with Stata
Interrupted time series analysis (ITSA) is a study design used to study the effects of an intervention across time. An important feature of the ITSA is the time when the intervention occurs. The periods before and after the intervention are of interest because we want to visualize whether the trends are similar or different. Additionally, we want to visualize the change immediately after the intervention is implemented. I call this period the index date.
In this article, I’ll review the single-group ITSA and multiple-group ITSA. Then I’ll review how to perform an ITSA in Stata.
You can view the complete tutorial on my RPubs site.
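For readers who want to see the moving parts of a single-group ITSA, the sketch below builds the three terms commonly used in a segmented regression: time, a post-intervention indicator, and time since the intervention. It is in Python rather than Stata and is only an illustration; the function names, the coefficient values, and the choice of this particular parameterization are assumptions, not the tutorial's code.

```python
def itsa_terms(times, start):
    """Return (time, post, time_since_start) for each time point."""
    rows = []
    for t in times:
        post = 1 if t >= start else 0          # 1 from the index date onward
        rows.append((t, post, (t - start) * post))
    return rows


def predict(row, b0, b1, b2, b3):
    """Fitted value: b1 is the pre-intervention trend, b2 the immediate
    level change at the index date, b3 the change in trend afterward."""
    t, post, since = row
    return b0 + b1 * t + b2 * post + b3 * since


# Hypothetical series of six periods with the intervention at period 4.
rows = itsa_terms(range(1, 7), start=4)
for row in rows:
    # Coefficients here are invented for illustration only.
    print(row, predict(row, b0=10, b1=1, b2=5, b3=2))
```

With this coding, the coefficient on the indicator captures the change immediately after the intervention, and the coefficient on time since the intervention captures how the trend differs from the pre-intervention trend.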
Tweedie GLM model in R for Cost Data
I wrote a tutorial on using a Tweedie distribution for a GLM gamma model for cost data in R. Unlike Stata, R is very particular with zeroes when constructing GLM models. Hence, I opted to use the Tweedie distribution to mix and match the link function with the Gamma distribution. I settled on the identity link because it doesn’t involve retransformation and is easy to interpret.
My tutorial is available on my RPubs site and GitHub site.
Interpreting regression models
I wrote a short explanation on how to interpret regression models.
I have posted this on RPubs, and the code is saved on my GitHub site.
Sample size estimation using the odds ratio in a case-control study
Sample size estimation and Power analysis in R
Logistic regression in R - Part 2 (Goodness of fit tests)
In a previous tutorial, I discussed how to perform logistic regression using R. I wrote a follow-up tutorial on how to conduct goodness of fit tests for logistic regression models in R and posted it on RPubs. The R Markdown code is available on my GitHub site.
I’ve learned how to assess model fit using Pearson correlations, deviance, and modified Hosmer-Lemeshow (HL) goodness of fit (GOF) tests. I think these are important tools when assessing the fit of a logistic regression model. However, I wanted to focus on the HL GOF tests for this tutorial because there are a lot of nuances that I learned and wanted to share.
Additionally, I show how a calibration plot in R helps visualize whether the model over- or under-predicts the observed data.
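As a rough sketch of the idea behind the Hosmer-Lemeshow test, the Python snippet below sorts observations by predicted probability, splits them into groups, and compares observed with expected event counts in each group. This is the standard form of the statistic, not the modified version covered in the tutorial; `hosmer_lemeshow` is a hypothetical helper, the data are invented, and the p-value step (a chi-square lookup) is left out.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (HL statistic, degrees of freedom) for outcomes y and
    predicted probabilities p, using equal-sized groups sorted by p."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)   # expected number of events
        observed = sum(yi for _, yi in chunk)   # observed number of events
        # Assumes 0 < expected < size; otherwise this divides by zero.
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return stat, groups - 2


# Hypothetical outcomes and predicted probabilities.
y = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
p = [0.1, 0.2, 0.3, 0.35, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9]
stat, df = hosmer_lemeshow(y, p, groups=5)
print("HL statistic:", round(stat, 3), "df:", df)
```

The per-group observed and expected counts computed here are also the quantities a calibration plot displays, which is why the two tools complement each other.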