History

I'm a novice when it comes to programming. Most of my work has been in Stata and R. However, I've been learning how to code using SQL as part of my job. I took a class on Python and found it helpful for learning first-order logic. I find these skills incredibly useful and rewarding, especially for generating publication-worthy graphics and thinking logically. A major project that I will undertake is to redo many of my decision analytic models using R and to develop queries that will result in analyzable datasets.

Programming Links

I've compiled a list of helpful links related to programming and my own GitHub page, which is under construction. I hope to have more code and projects in my GitHub repository soon.

My GitHub page: https://github.com/mbounthavong

Stata programming and codes: https://github.com/mbounthavong/STATA-programming-and-codes

R programming and codes: https://github.com/mbounthavong/R-tutorials

Generate simulated data in `R`

I plan on writing a series on how to generate simulated data using R.

Part 1 - Generate data using the simstudy package in R (link)

Propensity score matching in `R`

I posted an introductory tutorial on using propensity score matching techniques in R on my RPubs site. I review nearest neighbor matching and inverse probability weighting for the average treatment effect (ATE) and the average treatment effect on the treated (ATT). You can view the tutorial here (link).
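To make the difference between the ATE and ATT weights concrete, here is a minimal Python sketch (the tutorial itself uses R). It applies the standard inverse probability weighting formulas to one subject at a time. The subjects and propensity scores are hypothetical, and `ipw_weights` is an illustrative helper, not a function from the tutorial or from any R package.

```python
def ipw_weights(treated, propensity):
    """Return (ATE weight, ATT weight) for one subject.

    ATE: treated get 1/ps, controls get 1/(1 - ps).
    ATT: treated get 1, controls get ps/(1 - ps).
    """
    if not 0 < propensity < 1:
        raise ValueError("propensity score must be strictly between 0 and 1")
    if treated:
        return 1 / propensity, 1.0
    return 1 / (1 - propensity), propensity / (1 - propensity)


# Hypothetical subjects: (treated flag, estimated propensity score)
subjects = [(1, 0.8), (1, 0.4), (0, 0.5), (0, 0.2)]
weights = [ipw_weights(t, ps) for t, ps in subjects]

for (t, ps), (w_ate, w_att) in zip(subjects, weights):
    print(f"treated={t} ps={ps:.1f} ATE weight={w_ate:.2f} ATT weight={w_att:.2f}")
```

Under the ATT weights, treated subjects keep a weight of 1 and controls are reweighted to resemble them, which is why the two estimands can give different answers on the same data.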

Stata - `marginsplot` & `mplotoffset` commands for plotting average marginal effects

I posted a tutorial on my RPubs site that reviews some basic features of the marginsplot and mplotoffset commands and provides some practical examples of customization. You can view the tutorial here (link).

Prepost analysis methods

I wrote a tutorial on how to perform simple prepost analysis with continuous data using R. It has been posted to my RPubs site (link). R code for this tutorial is located on my Github site (link).

Linear spline models in Stata

I wrote a tutorial on how to construct linear spline models using Stata. It has been posted to my RPubs site (link).

R and RStudio Guide

I started to collect some of the tips and tricks that I’ve learned about R and RStudio and posted these on my RPubs site (link). This is a work in progress, and I plan to update this in the future. Meanwhile, I hope it will be helpful for new users of R and RStudio.

Staggered difference-in-differences using R

I wrote a tutorial on how to apply the Callaway & Sant’Anna staggered difference-in-differences framework in R, which is located on my RPubs site.

Staggered difference-in-differences is very useful for health services research where implementation of an evidence-based program varies across sites or units. Callaway & Sant’Anna wrote a great paper explaining their framework, which I highly recommend reading.

Mediation analysis using R

I wrote a tutorial on how to perform mediation analysis in R, which is located on my RPubs site (link).

Mediation analysis should be used for those variables that are measured after the treatment assignment or index date and are associated with the endpoint or outcome of interest.

Google Drive and R

I wrote a tutorial on how to load data from Google drive into the R environment, which is on my RPubs site (link).

Google drive offers an opportunity to store and share data for my R projects.

I like the idea of using the Google Drive platform with the R environment, and I may build upon these tutorials in the future.

Survival analysis

Survival analysis in R

I wrote a tutorial on survival analysis using R. It is a simple introduction to survivor and hazard functions, Kaplan-Meier curves, and the Cox proportional hazards model.

The tutorial is located on my RPubs page.

The R Markdown code is located on my GitHub site.

Immortal Time Bias with Stata

I wrote a survival analysis tutorial on how to handle immortal time bias using time-varying covariates with Stata.

The tutorial is posted on my RPubs site.

The data are located on my GitHub page.

Interrupted Time Series Analysis using Stata

I wrote a tutorial on how to perform ITSA using Stata, which includes two approaches. One approach uses the mkspline command, and the other uses a triple interaction term. This article is available on my RPubs site (link).

Exact matching using R - MatchIt package

Exact matching is a useful method to match subjects on one or more variables without using a propensity score.

I wanted to learn how to perform this, so I created a tutorial to perform exact matching using the MatchIt package in R, which is posted on my RPubs site (link).
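The idea behind exact matching can be sketched in a few lines of Python (the tutorial uses the MatchIt package in R; this is not its API). Subjects are grouped into strata by their exact covariate values, and only strata that contain at least one treated and one control subject are kept. The subjects, covariates, and the `exact_match` helper below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical subjects: (id, treated flag, sex, age group)
subjects = [
    (1, 1, "F", "40-49"),
    (2, 0, "F", "40-49"),
    (3, 0, "F", "40-49"),
    (4, 1, "M", "50-59"),
    (5, 0, "M", "40-49"),
    (6, 1, "F", "60-69"),
    (7, 0, "M", "50-59"),
]


def exact_match(rows):
    """Group subjects by identical covariate values and keep only the
    strata that contain both treated and control subjects."""
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for subject_id, treated, *covariates in rows:
        arm = "treated" if treated == 1 else "control"
        strata[tuple(covariates)][arm].append(subject_id)
    return {
        key: arms
        for key, arms in strata.items()
        if arms["treated"] and arms["control"]
    }


matched = exact_match(subjects)
for key, arms in matched.items():
    print(key, arms)
```

Subjects 5 and 6 drop out because nobody in the other arm shares their covariate values, which illustrates the main cost of exact matching: unmatched subjects are discarded.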

Building a book using bookdown in R

I started a tutorial on how to build an online book using the R bookdown package.

Part 1 is located on my RPubs site (link).

Part 2 provides a high-level overview of adding chapters and a reference section to your e-book (link).

Future parts are forthcoming.

Modeling cost as a dependent variable

I started a series reviewing various approaches to modeling cost as a dependent variable.

The first model I reviewed was the two-part model. Specifically, I wanted to learn how to construct this in R. I have some background building this in Stata, but R proved to be a rewarding challenge.

Part 1 - Two-part model for cost data in R (RPubs link) (GitHub link)

Part 2 - Tweedie gamma model for cost data in R (RPubs link) (GitHub link)

Part 3 - Forthcoming

Building an HTML Presentation using R Markdown

This will be a series of tutorials to construct an HTML Presentation using R Markdown.

Part 1 - Introduction (RPubs link)

Part 2 - Adding a slide, image, and fade-in (RPubs link)

Part 3 - Font colors (RPubs link)

MEPS Tutorials using R

I’ve begun to create tutorials on how to use MEPS with R.

Part 1 - The first tutorial on loading MEPS data into R is posted on my RPubs page (link); I also posted this on my GitHub page (link). Additionally, the R Markdown code for building the tutorial is located on my GitHub page (link).

Part 2 - The second tutorial on merging MEPS data files is on my RPubs (link) and my GitHub (link) site. My R Markdown code can be found on my GitHub (link) site.

Part 3 - The third tutorial on applying survey weights in MEPS using R is available on my GitHub site and RPubs. The R Markdown code I used to generate this tutorial is available on my GitHub site.

Part 4 - Using the condition-event link file to identify disease-specific healthcare expenditure with MEPS data. This tutorial is available on my RPubs site: (MEPS Tutorial 4 - Using condition-event link (CLNK) file: A case study with migraine).

Part 5 - Performing trend analysis using linear regression model with MEPS data. This tutorial is available on my RPubs site: (MEPS Tutorial 5 - Simple Trend Analysis with Linear Models).

Part 6 - Conducting an interrupted time series analysis using R and MEPS data. The tutorial is available on my RPubs site, and the R Markdown code is available on my GitHub page.

Part 7 - “Some Helpful Notes” is an end of the year summary of some notes I collected about my experience using MEPS data. This is available on my RPubs site.

I will continue to write more tutorials and post them here, so stay tuned.

Adding Icons in R Markdown using the Fontawesome package

I discovered an interesting package that allowed me to insert icons into my R Markdown documents. I learned how to use some of the basic commands and wrote a short tutorial on how to do this. I posted the tutorial on my GitHub page. I also posted the R Markdown code on my GitHub site.

I was excited to learn this, so I generated the tutorial without understanding all the nuances of the code. Hence, expect to see updates to this in the future.

Stata - Add 95% CI to two-way line plot (R Markdown tutorial)

I created a tutorial on how to add 95% CIs to a two-way line plot in Stata. I used the “connected” command to generate a line plot in Stata, and then I added the 95% CI to each value. Surprisingly, Stata does not have a native feature to allow users to generate these 95% CIs on a two-way line plot.

I used the AHRQ Medical Expenditure Panel Survey (MEPS) database for the motivating example. In this tutorial, we plotted the average total healthcare expenditure from 2008 to 2019.

I built this tutorial in Stata, but I used R Markdown to write the tutorial. The R Markdown code is located on my GitHub site (Stata - Line plot with 95% CI tutorial).

You can find the tutorial on my GitHub site and RPubs page.

I used Stata SE 17 to build this.

R Plotly - Bar charts

I wrote a tutorial on how to use plotly, an R package that allows users to include interactive charts in R Markdown projects.

The tutorial is available on RPubs, and the R Markdown code is available on my GitHub page.

I really like using plotly for my R Markdown projects because it has some nice interactive features. Hopefully, this tutorial will open the doors to more creativity with R Markdown projects.

Sample size estimation and power analysis in R

I wrote a tutorial on how to perform sample size estimations and power analysis using the R “pwr” package. These are simple examples that will hopefully lead to more complicated estimations.

The tutorial is available on RPubs (link).

The R Markdown code is available on my GitHub site (link).

Update (29 June 2022): I wrote a short tutorial on sample size estimation using the odds ratio for a case-control study. I used the “epiR” package to perform this task. The tutorial is available on RPubs (link), and the R Markdown code is available on my GitHub page (link).
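As a rough illustration of what a sample size calculation does, here is a Python sketch for comparing two means. It uses the textbook normal approximation, n per group = 2(z_{1-alpha/2} + z_{1-beta})^2 / d^2, where d is the standardized effect size. This is an assumption made for illustration and is not the code from the tutorials, which use the pwr and epiR packages in R. The function name `n_per_group` and the inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical inputs: medium standardized effect size, 5% alpha, 80% power
n = n_per_group(0.5)
print(n)
```

Because this sketch uses the normal distribution rather than the t distribution, it will usually return a slightly smaller number than a t-based calculation for the same inputs.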

Logistic Regression Model in R

I wrote a couple of tutorials on how to construct logistic regression models and perform goodness of fit tests in R. These tutorials were published on RPubs:

RPubs tutorial - Part 1 (link)
RPubs tutorial - Part 2 (link)

In part 1, I go through the use of the glm() command to perform a crude logistic regression model and a multivariable logistic regression model.

In part 2, I wrote a follow-up tutorial on how to conduct goodness of fit tests for logistic regression models in R.

The RMarkdown code I used to create the tutorial is located on my GitHub site:

R Markdown code - Part 1 (link)
R Markdown code - Part 2 (link)

Data sources

The data (diabetes.csv) that I used for Parts 1 and 2 are located here.
The Evans County data (evans.csv) that I used for Part 2 are located here.

Visualizing linear regression models in R

I created a tutorial on how to visualize linear regression models using R and the predict3d package.

The tutorial focuses on visualizing linear regression models using R. It also uses visualization to assess whether the model’s residuals are associated with the predicted values and whether they are normally distributed.

I uploaded this tutorial on my RPubs page here.

The R Markdown code can be found on my GitHub site here.

R tutorial with epitools

R has a couple of packages that allow users to quickly estimate the risk ratio and odds ratio for a contingency table (e.g., epitools and epiR). I created a tutorial on using these in the context of assessing confounding and interaction in a typical exposure-disease pathway. The tutorial is located on my RPubs page here, and the R Markdown code is available on my GitHub page here.
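For readers who want to see the arithmetic behind these estimates, here is a small Python sketch that computes the risk ratio and odds ratio from a 2x2 table. The counts are hypothetical, and the helper functions are illustrative only. They are not the epitools or epiR functions, which also report confidence intervals.

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    a = exposed with disease,   b = exposed without disease
    c = unexposed with disease, d = unexposed without disease
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio of the 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts
a, b, c, d = 30, 70, 10, 90
rr = risk_ratio(a, b, c, d)
or_ = odds_ratio(a, b, c, d)
print(f"Risk ratio: {rr:.2f}, odds ratio: {or_:.2f}")
```

One common way to screen for confounding is to compute these measures within each stratum of a third variable and compare the stratum-specific estimates with the crude estimate.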

Forest plots in R

I wrote a tutorial on how to create forest plots in R. It is posted on the RPubs site here. I also posted the R Markdown code on my GitHub page here. The investment to learn how to construct forest plots in R was worth it. I normally do this in Stata or Excel, but I wanted to have some R code to generate forest plots for future studies and publications. In the end, this was a great experience.

Pharmacoeconomics

Decision tree models in R

I created a tutorial on developing decision tree models using R. I posted the tutorial on the RPubs site. The decision tree is based on a hypothetical scenario where a treatment decision for acute respiratory infection (ARI) is between two antibiotics (Treatment A and Treatment B). The outcomes for being cured and not cured are 20 Life Years and 10 Life Years, respectively. Once the expected costs and benefits are calculated, the tutorial provides instructions on how to generate the incremental cost-effectiveness ratio (ICER) using R. Additionally, readers will learn how to use R to generate a one-way sensitivity analysis and construct a tornado diagram.
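The expected value and ICER calculations can be sketched as follows in Python (the tutorial itself uses R). The life-year payoffs of 20 and 10 come from the scenario above. The cure probabilities and treatment costs are hypothetical placeholders chosen for illustration and are not the values used in the tutorial.

```python
def expected_value(p_cure, value_cured, value_not_cured):
    """Probability-weighted average over the two branches of the tree."""
    return p_cure * value_cured + (1 - p_cure) * value_not_cured


LY_CURED, LY_NOT_CURED = 20, 10  # life years, from the scenario

# Hypothetical inputs (assumptions, not the tutorial's values)
treatments = {
    "A": {"p_cure": 0.90, "cost": 500.0},
    "B": {"p_cure": 0.80, "cost": 300.0},
}

life_years = {
    name: expected_value(t["p_cure"], LY_CURED, LY_NOT_CURED)
    for name, t in treatments.items()
}

# ICER = difference in expected cost / difference in expected life years
delta_cost = treatments["A"]["cost"] - treatments["B"]["cost"]
delta_ly = life_years["A"] - life_years["B"]
icer = delta_cost / delta_ly
print(f"ICER of A versus B: {icer:.0f} per life year gained")
```

A one-way sensitivity analysis repeats this calculation while varying a single input, such as the probability of cure with Treatment A, over a plausible range.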

Markov model in Excel

A tutorial on how to construct Markov models using Excel is available here. The example is a static three-state Markov model, which means that the transition probabilities do not change over time.
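A static three-state Markov model can also be traced in a few lines of Python. This is a minimal sketch: the tutorial uses Excel, and the state names, transition probabilities, and cycle count below are hypothetical. Because the model is static, the same transition matrix is applied in every cycle.

```python
STATES = ["Healthy", "Sick", "Dead"]

# Hypothetical transition probabilities; each row sums to 1.
# Row = state at the start of the cycle, column = state at the end.
P = [
    [0.85, 0.10, 0.05],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
]


def run_cohort(initial, matrix, cycles):
    """Return the cohort distribution at cycle 0, 1, ..., cycles."""
    trace = [list(initial)]
    for _ in range(cycles):
        current = trace[-1]
        nxt = [
            sum(current[i] * matrix[i][j] for i in range(len(current)))
            for j in range(len(current))
        ]
        trace.append(nxt)
    return trace


trace = run_cohort([1.0, 0.0, 0.0], P, cycles=5)
for cycle, dist in enumerate(trace):
    print(cycle, [round(x, 4) for x in dist])
```

Each row of the trace sums to 1, which is a useful check that no part of the cohort has been lost or double counted.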

Distributions in cost-effectiveness analysis

I created a short tutorial on generating distributions for parameters in a cost-effectiveness analysis. The tutorial is available on my RPubs site.

Data wrangling

Transform data from wide to long format using R

I wrote a tutorial on using the pivot_longer() function to transform data from the wide to long format in preparation for longitudinal data analysis. The tutorial is located on my RPubs page.
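To show what the wide-to-long transformation does to the data, here is a plain Python sketch. It is not the tidyr pivot_longer() implementation, and the column names (`id`, `week_0`, `week_4`, `week_8`, `a1c`) are hypothetical. Each wide row with one column per visit becomes several long rows with one row per subject and visit.

```python
# Hypothetical wide data: one row per subject, one column per visit
wide = [
    {"id": 1, "week_0": 7.1, "week_4": 6.8, "week_8": 6.5},
    {"id": 2, "week_0": 8.0, "week_4": 7.9, "week_8": 7.4},
]


def wide_to_long(rows, id_col, names_to, values_to):
    """Stack every non-id column into (name, value) pairs, one per row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_col:
                continue
            long_rows.append(
                {id_col: row[id_col], names_to: column, values_to: value}
            )
    return long_rows


long_data = wide_to_long(wide, id_col="id", names_to="week", values_to="a1c")
for row in long_data:
    print(row)
```

The long format has one measurement per row, which is the layout most longitudinal models expect.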

Counting events and preparing data set for interrupted time series analysis

I have always had a difficult time preparing my data set for an interrupted time series (ITS) analysis. This requires further manipulation of the data (in the long format) to count backwards in time before the index date (time of the first event) and counting forward in time after the index date.

I finally took time to write the Stata code for version 15 and posted it on my GitHub page. Please feel free to let me know if there are any errors or suggestions for improvements.

I created an accompanying tutorial here.
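The counting step described above can be sketched in Python (the posted code is in Stata, and this is not a translation of it). Each event is assigned a month number relative to the index date, with negative numbers before it and positive numbers after it, and months without events are filled in with zero. The index date and event dates below are hypothetical.

```python
from collections import Counter
from datetime import date


def months_from_index(event_date, index_date):
    """Calendar months between the event and the index date.
    Negative values fall before the index month, positive values after."""
    return (event_date.year - index_date.year) * 12 + (
        event_date.month - index_date.month
    )


# Hypothetical index date and event dates
index_date = date(2020, 3, 15)
event_dates = [
    date(2019, 12, 2),
    date(2020, 1, 20),
    date(2020, 1, 28),
    date(2020, 3, 15),
    date(2020, 4, 1),
    date(2020, 6, 30),
]

counts = Counter(months_from_index(d, index_date) for d in event_dates)

# Fill in months with no events so the series has no gaps
series = [(m, counts.get(m, 0)) for m in range(min(counts), max(counts) + 1)]
for month, n_events in series:
    print(month, n_events)
```

Filling the gaps matters because an interrupted time series model needs an observation for every time point, including those with a count of zero.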

Performing Bayesian analysis with the bayesAB package in R

I learned to use the bayesAB package in R through a tutorial by BC Mullins, which can be found here. I was inspired to replicate the work and create an R Markdown file for it, which I saved as an HTML page on my RPubs site. This package allows you to explore the different posterior distributions even after increasing the trial (or sample) size of the experiments. In doing so, you update the Bayesian priors and improve the precision and accuracy of the posterior distribution. The advantage of using Bayesian analysis over the conventional Frequentist approach is the ability to provide a stakeholder with the probability that drug A will be better than drug B. Conventional Frequentist inference can only tell you whether drug A is statistically significantly different from drug B. Bayesian interpretation provides us with the ability to predict how often drug A will be better than drug B.

The GitHub code can be found here.
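The probability that drug A is better than drug B, as described above, can be approximated with a short Monte Carlo sketch in Python. This is not the bayesAB API. It assumes binary outcomes, uniform Beta(1, 1) priors on each response rate, and hypothetical trial counts, and `prob_a_better` is an illustrative helper.

```python
import random


def prob_a_better(successes_a, n_a, successes_b, n_b, draws=20000, seed=1):
    """Estimate P(rate_A > rate_B) by sampling from the two Beta posteriors.

    With a Beta(1, 1) prior, the posterior for a response rate is
    Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + successes_a, 1 + n_a - successes_a)
        rate_b = rng.betavariate(1 + successes_b, 1 + n_b - successes_b)
        if rate_a > rate_b:
            wins += 1
    return wins / draws


# Hypothetical data: 60 of 100 responders on drug A, 45 of 100 on drug B
p = prob_a_better(60, 100, 45, 100)
print(f"Estimated probability that A is better than B: {p:.3f}")
```

Rerunning the function with larger hypothetical samples at the same response rates shows how the posteriors narrow and the probability moves closer to 0 or 1.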

R Shiny - Tutorial: Histogram

I developed my first R Shiny app and posted it on www.shinyapps.io. R Shiny is an R package for building interactive web apps and graphics. I started to use www.shinyapps.io to host my R Shiny apps. You need to install the rsconnect package in R. A tutorial on how to install rsconnect and how to publish R Shiny apps on www.shinyapps.io can be found here.

The app is a simple histogram with a widget that goes from 1 to 100.

You can view the Shiny app here.

The code for this app can be viewed on my GitHub page here.

JAGS Tutorial in R

I recently wrote a JAGS tutorial in R using data from my professor, Dr. Beth Devine, that was used for her lecture. The tutorial goes over how to perform a simple pair-wise meta-analysis using a Bayesian framework. JAGS stands for "Just Another Gibbs Sampler" and provides a reliable and fast means for performing multiple parallel chains of operations for a single likelihood function. Martyn Plummer is credited with creating JAGS, and you can find his SourceForge account here.

You can view my tutorial here.

The Markdown code for my tutorial is located here.

You will need to have RStudio installed along with the rjags, coda, and boa packages.

Medication adherence estimations

Stata - Medication Possession Ratio and Proportion of Days Covered calculator

I created a Stata template to calculate MPR and PDC for one drug. The calculator is a *.do file that takes the prescription history of a patient for a single drug and estimates the adherence. You can view the Stata code here.
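To illustrate how the two adherence measures differ, here is a Python sketch based on common textbook definitions. These definitions are an assumption and may differ from the rules used in the Stata template. MPR is the total days supplied divided by the days in the observation period. PDC is the number of distinct days covered divided by the days in the period, so overlapping fills are not double counted. The fill history is hypothetical.

```python
from datetime import date, timedelta


def mpr_and_pdc(fills, start, end):
    """fills: list of (fill date, days supply) for a single drug."""
    period_days = (end - start).days + 1

    # MPR: sum the days supply of every fill dispensed within the period
    total_supply = sum(
        days for fill_date, days in fills if start <= fill_date <= end
    )

    # PDC: count each calendar day in the period at most once
    covered = set()
    for fill_date, days in fills:
        for offset in range(days):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)

    return total_supply / period_days, len(covered) / period_days


# Hypothetical prescription history: an early refill, then a gap
fills = [
    (date(2021, 1, 1), 30),
    (date(2021, 1, 25), 30),
    (date(2021, 3, 15), 30),
]
mpr, pdc = mpr_and_pdc(fills, start=date(2021, 1, 1), end=date(2021, 3, 31))
print(f"MPR = {mpr:.2f}, PDC = {pdc:.2f}")
```

In this example the early refill and the supply running past the end of the period inflate the MPR, while the PDC reflects the gap in coverage. Some PDC definitions also shift overlapping supply forward, which this sketch does not do.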

R - Estimating medication adherence using `AdhereR`

I created a tutorial on how to use the AdhereR package in R to estimate the Medication Possession Ratio (MPR) and the Proportion of Days Covered (PDC). I posted this tutorial on my RPubs site (link).

Hoyle and Henley's methods for improving curve fits to summary survival data

While I was working on a project, I had to simulate survival data based on a Weibull distribution. I found a useful article by Martin W. Hoyle and William Henley that details their methods for fitting curves to survival data, "Improved curve fits to summary survival data: application to economic evaluation of health technologies." The authors provided an Excel file with the necessary formulas to generate the Weibull distribution parameters (lambda and gamma) and offer suggestions for using these methods in a probabilistic sensitivity analysis. Although not necessarily a program, these methods can be used in R to write code for survival curve simulations in decision analytic models.
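Once lambda and gamma are available, survival times can be simulated by inverse transform sampling. The Python sketch below assumes the parameterization S(t) = exp(-lambda * t^gamma), which should be checked against the article and its Excel file before use. The parameter values are hypothetical, and this is not the authors' code.

```python
import math
import random
import statistics


def simulate_weibull(lam, gamma, n, seed=1):
    """Draw n survival times with S(t) = exp(-lam * t**gamma).

    Setting S(t) = U for U uniform on (0, 1] and solving for t gives
    t = (-ln(U) / lam) ** (1 / gamma).
    """
    rng = random.Random(seed)
    return [
        (-math.log(1 - rng.random()) / lam) ** (1 / gamma) for _ in range(n)
    ]


# Hypothetical parameters
lam, gamma = 0.05, 1.2
times = simulate_weibull(lam, gamma, n=20000)

# Check: the sample median should be close to the closed-form median
theoretical_median = (math.log(2) / lam) ** (1 / gamma)
sample_median = statistics.median(times)
print(f"Theoretical median: {theoretical_median:.2f}")
print(f"Sample median:      {sample_median:.2f}")
```

Comparing the simulated median with the closed-form median is a quick way to confirm that the parameterization in the code matches the one used to estimate lambda and gamma.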