August 30, 2021

Reproduction number—COVID-19

BACKGROUND

As the COVID-19 pandemic, which began in December 2019, continues into its second year, public health measures have been put into place to mitigate its spread. At the time of writing this article, there have been over 4.5 million deaths and over 216 million cases due to COVID-19.[1] Surveillance of COVID-19 remains an important public health tool for understanding its spread and impact. Daily reports such as the Johns Hopkins COVID-19 dashboard provide end users with visual and statistical information about the surges in cases and deaths associated with COVID-19. However, one measure that is of great interest is the reproduction number, or R0.

Reproduction number (R0) and effective reproduction number (Rt)

The basic reproduction number (R0) is the number of new cases directly caused by exposure to a single case.[2,3] Figure 1 provides a visual explanation of the basic reproduction number. However, the underlying assumption with R0 is that everyone in the population is susceptible to infection. With the introduction of vaccines, R0 isn’t a good measure of the reproductive capabilities of COVID-19. Instead, the effective reproduction number (Rt) is used to provide a more realistic reproduction number that accounts for the part of the population that has been infected, has recovered, or has been vaccinated. The Rt changes over time as the population susceptible to infection changes.

Figure 1. Basic reproduction number.
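To make the relationship between R0 and Rt concrete, here is a minimal sketch in Python. It assumes the common textbook approximation that Rt equals R0 multiplied by the fraction of the population that is still susceptible (which itself assumes a well-mixed population). The function name and the input values are hypothetical and are not estimates for COVID-19.

```python
def effective_reproduction_number(r0, susceptible_fraction):
    """Approximate Rt as R0 scaled by the share of the population still susceptible.

    Assumption: homogeneous mixing, so Rt = R0 * susceptible_fraction.
    """
    if not 0.0 <= susceptible_fraction <= 1.0:
        raise ValueError("susceptible_fraction must be between 0 and 1")
    return r0 * susceptible_fraction


# Hypothetical values for illustration only (not estimates for COVID-19).
r0 = 3.0
for immune_share in (0.0, 0.5, 0.8):
    rt = effective_reproduction_number(r0, 1.0 - immune_share)
    print(f"immune share {immune_share:.0%}: Rt = {rt:.2f}")
```

Under this approximation, Rt falls as more of the population is infected, recovers, or is vaccinated, which is why Rt drifts over time while R0 does not.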

I wanted to create a figure that would highlight the changes associated with the Rt for each state in the United States. To do this, I downloaded the Rt data from Xihong Lin's Group in the Department of Biostatistics at the Harvard T.H. Chan School of Public Health. They have an amazing COVID-19 tracker dashboard that captures the changing patterns of Rt for each state. Then I created a Cleveland plot to show where the Rt was near the beginning of the pandemic and where it is currently (August 2021). (Note: I wrote a tutorial on creating Cleveland plots that you can review here.) Here is the final figure (because of the length of the figure, I cropped it to show the first 30 states or territories):

Figure 2. Effective reproduction number (Rt) for U.S. states and territories, April 17, 2020 (past) to August 14, 2021 (recent).

The blue dots denote the most recent effective reproduction number (14 August 2021) and the red dots denote the earliest effective reproduction number (17 April 2020).

It seems that some states have gotten worse, in terms of an increased effective reproduction number, since the beginning of the pandemic. This could be due to a lack of good data in the early phases of the pandemic. However, what is of concern is the high effective reproduction numbers in some states (Rt > 2), which indicate that COVID-19 is still spreading at an alarming rate.

There were some missing data which are identified by a single dot (blue or red) or an empty field in the recent or past effective reproduction number. Rather than fill these in, I left them empty. There may be data in between the two time periods that I could have used, but I left those out.
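The data preparation behind a plot like this can be sketched in Python. This is only an illustration under assumptions: the column names (state, date, rt), the long-format layout, and the sample values are all hypothetical and do not come from the Lin Lab file. The sketch picks out the two endpoint dates for each state and leaves a missing endpoint as None rather than filling it in.

```python
import csv
import io

# Hypothetical long-format extract: one row per state per date.
# Column names and values are assumptions for illustration only.
SAMPLE = """state,date,rt
Alabama,2020-04-17,1.10
Alabama,2021-08-14,1.25
Alaska,2021-08-14,1.40
Arizona,2020-04-17,0.95
"""

PAST, RECENT = "2020-04-17", "2021-08-14"


def endpoints(rows, past=PAST, recent=RECENT):
    """Return {state: (past_rt, recent_rt)}, using None where a value is missing."""
    result = {}
    for row in rows:
        past_rt, recent_rt = result.get(row["state"], (None, None))
        if row["date"] == past:
            past_rt = float(row["rt"])
        elif row["date"] == recent:
            recent_rt = float(row["rt"])
        result[row["state"]] = (past_rt, recent_rt)
    return result


table = endpoints(csv.DictReader(io.StringIO(SAMPLE)))
for state, (past_rt, recent_rt) in sorted(table.items()):
    print(state, past_rt, recent_rt)
```

A state with only one value (Alaska or Arizona in this made-up sample) would appear in the plot as a single dot, which matches how the missing data show up in Figure 2.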

One thing to mention is that this Cleveland plot only tells us one dimension of the effective reproduction number story (the difference between the most recent Rt and the earliest Rt). It doesn’t tell us much about how the effective reproduction number changes across time. For that, I direct your attention to the Lin Laboratory Group at Harvard. They have a great figure that shows the fluctuation of the effective reproduction number for the U.S. and its states/territories (see example):

Source: Lin’s Laboratory Group at Harvard (link). [last accessed on 30 August 2021].

CONCLUSIONS

The effective reproduction number reveals some interesting patterns in the spread of COVID-19 by state/territory. It seems to have worsened over time, but this could be due to poor data early in the pandemic. There are some issues with the use of the effective reproduction number for policy decisions. Reporting delays can impact the estimates for the effective reproduction number. A technique called “nowcasting” is used to estimate the reproduction number.[3] But when I explored some of the work in this area, I found a variety of methods for performing this technique. Despite this limitation, the effective reproduction number may be useful for evaluating public health policy decisions intended to reduce the spread of COVID-19.[4,5]

DATA SOURCE

I provided the link to the COVID-19 Spread Tracker from the Lin Lab at Harvard. You can also download a curated version of the data for this article from my Dropbox folder. The data are current as of 17 August 2021. If you’re interested in recreating this Cleveland plot, I recommend downloading the most recent data to see how much the effective reproduction number has changed.

REFERENCES

1. Worldometers.info. COVID Live Update: 217,770,381 Cases and 4,521,936 Deaths from the Coronavirus - Worldometer. Accessed August 30, 2021. https://www.worldometers.info/coronavirus/
2. Lim J-S, Cho S-I, Ryu S, Pak S-I. Interpretation of the Basic and Effective Reproduction Number. J Prev Med Pub Health. 2020;53(6):405-408. doi:10.3961/jpmph.20.288
3. Adam D. A guide to R — the pandemic’s misunderstood metric. Nature. 2020;583(7816):346-348. doi:10.1038/d41586-020-02009-w
4. Inglesby TV. Public Health Measures and the Reproduction Number of SARS-CoV-2. JAMA. 2020;323(21):2186-2187. doi:10.1001/jama.2020.7878
5. Pan A, Liu L, Wang C, et al. Association of Public Health Interventions With the Epidemiology of the COVID-19 Outbreak in Wuhan, China. JAMA. 2020;323(19):1915-1923. doi:10.1001/jama.2020.6130

Mark Bounthavong

April 27, 2021

Data visualization

Communicating data effectively with data visualizations: Part 36 (Scatter plots and Moving Averages for COVID-19 vaccination)

INTRODUCTION

As of 27 April 2021, the Centers for Disease Control and Prevention (CDC) reported that approximately 42.7% of the U.S. population has received at least one dose of the coronavirus (COVID-19) vaccine and 29.1% are fully vaccinated.[1] Moreover, a total of over 200 million doses have been administered. Despite the availability and effectiveness of COVID-19 vaccines, there are some people who have decided not to receive the vaccine. The Economist published a data visualization that summarized the results of weekly surveys that asked respondents, “When a covid-19 vaccine becomes available to you, will you get vaccinated?” This figure shows that approximately 60% of the population will get or have already received the COVID-19 vaccine. However, a little over 20% stated that they will not get the vaccine and nearly 20% are unsure. Using data from The Economist, we can use Microsoft Excel to recreate the figure below:

* Source: The Economist – “When a covid-19 vaccine becomes available to you, will you get vaccinated?” (link). This is for educational purposes only.

DATA

The data to recreate the figure can be found at The Economist YouGov data site (link). I also abstracted the data and saved them in an Excel file, which is located here. Using the data from The Economist, we will create a scatter plot and use a 2-period moving average (each period is one weekly survey) to re-create the figure above.

Creating the Scatter Plot with Moving Average

Step 1. Inspect the data

The data include the percentage of respondents who answered “Yes” (includes both Yes and those who received the vaccine), “No”, and “Not Sure”. The timeline is composed of the calendar year, month, and the week the data were collected. Notice that some of the months had 5 weeks of data. For those months, I altered the scale so that the data would fit into a 4-week bin. Hence, you’ll notice that the weekly scale for those months is 1, 2, 3, 3.5, and 4.
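The week rescaling described above can be written as a small Python function. This is a sketch of my reading of the scale, assuming every month has either 4 or 5 survey weeks; the function name is hypothetical.

```python
def week_positions(weeks_in_month):
    """Map the survey weeks in a month onto a 1-to-4 scale."""
    if weeks_in_month == 4:
        return [1, 2, 3, 4]
    if weeks_in_month == 5:
        # The extra week is placed halfway between weeks 3 and 4.
        return [1, 2, 3, 3.5, 4]
    raise ValueError("expected a month with 4 or 5 survey weeks")


print(week_positions(4))
print(week_positions(5))
```

Keeping every month on the same 1-to-4 range means each month takes up the same width on the X-axis, regardless of how many surveys were run.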

Step 2. Highlight the data and insert a line chart

Step 3. Modify the data source

Once the line chart is inserted onto the workspace, you’ll notice that the scatter plot hasn’t been created. A few more steps are required. Right-click on the chart and click on “Select Data” to open the window that will allow you to make modifications to the data source.

Click on the “Edit” option and select the rows containing the Year and Month on the data sheet. This will create the X-axis labels for our data visualization. You should notice that the X-axis labels have changed to reflect the Year and Month.

To further modify the data source, right-click on the chart again and click on “Select Data….” At the data source window, remove “Year”, “Month”, and “Week” from the legend series to only display the percentage of respondents to the survey question. Once these have been removed, click “OK.” This will generate a line chart with the respondents’ answers to the survey question.

Step 4. Change line chart to scatter plot

Currently, the X-axis displays the data from 2021 before 2020. We can change this by right-clicking on the X-axis and then clicking on “Format Axis.” This brings up a window where we can check the box by “Categories in reverse order” so that the X-axis displays the respondents’ answers in chronological order.
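Outside of Excel, the same reordering amounts to sorting the records by year, month, and week. Here is a minimal Python sketch. The tuple layout and the values are hypothetical and are not taken from The Economist data.

```python
# Hypothetical records listed newest first:
# (year, month number, week position, percent answering "Yes").
records = [
    (2021, 1, 2, 47),
    (2021, 1, 1, 45),
    (2020, 12, 4, 41),
    (2020, 12, 3.5, 40),
]

# Sort oldest to newest so the chart reads left to right in time.
chronological = sorted(records, key=lambda r: (r[0], r[1], r[2]))

for year, month, week, yes_pct in chronological:
    print(year, month, week, yes_pct)
```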

To change the line chart to a scatter plot, right-click on the line and select “Format Data Series…” This will open the options to modify the data on the chart. Instead of a line chart, we want a scatter plot. To make this change, click on the paint bucket and under the line group click on “No line” to remove the line chart. Then under the Marker group, select the circle under “Type.” This will replace the line chart with a scatter plot.

Repeat this for the other lines and you should have a chart that looks like the following:

Step 5. Adjust the aesthetics and add a moving average line

The final step involves making changes to the aesthetics of the chart. You can remove the labels on the Y-axis, delete the gridlines, delete the chart title, eliminate the border of the X-axis, and then delete the legend. You can add moving average lines for each of the scatter plots by right-clicking on the data points and selecting “Add Trendline…” Then select “Moving average” with “2” periods. Change the moving average line from a dashed line to a solid line and increase its thickness.
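To see what a 2-period moving average does to the data, here is a minimal Python sketch. It assumes a trailing average, where each point is the mean of the current and previous observation and the first point has no average, which is how I understand Excel’s moving average trendline to work. The function name and the survey percentages are hypothetical.

```python
def moving_average(values, periods=2):
    """Trailing moving average; the first (periods - 1) points have no average."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    averaged = []
    for i in range(len(values)):
        if i + 1 < periods:
            averaged.append(None)  # not enough history yet
        else:
            window = values[i + 1 - periods : i + 1]
            averaged.append(sum(window) / periods)
    return averaged


yes_share = [40, 42, 41, 45]  # hypothetical weekly percentages
print(moving_average(yes_share))  # [None, 41.0, 41.5, 43.0]
```

With only 2 periods the line stays close to the individual points while smoothing out some of the week-to-week noise; a larger number of periods would give a smoother line that lags further behind the data.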

The final figure can be achieved by changing the colors of the scatter plots and moving average lines. You can add additional labels to emulate the figure from The Economist.