Communicating data effectively with data visualizations: Part 31 (Visualizing COVID-19 deaths using sparklines)

INTRODUCTION

As the COVID-19 pandemic continues to place a public health burden on our society, the death toll in the United States has reached a terrible milestone; over 240,000 deaths were reported as of November 17, 2020.[1] COVID-19 is now the third leading cause of death behind heart disease and cancer followed by accidents (unintentional injuries) and chronic respiratory diseases (reported by Scientific American).[2] Of note is the epidemiology spread of new deaths over time. New York has borne the brunt of the incidental deaths; early in the pandemic New York City had the highest 7-day moving average of COVID-related death in the country (1,193 deaths in April 21, 2020), but, recently, the 7-day moving average has dropped to 8 deaths in November 16, 2020. One way that we can visualize this is to use sparklines. Excel has a feature that allows users to plot sparklines onto the spreadsheet. We will use this feature to visualize the change in new COVID-related deaths over time by state.

 

MOTIVATING EXAMPLE

We will recreate the following figure using the Excel sparklines feature to visualize the changing number of new deaths due to COVID-19 from January to October 2020. Data can be downloaded from the CDC Tracker site or from the Dropbox folder. (Data was pulled on November 18, 2020.)

CONSTRUCTING THE SPARKLINES

Step 1. Open and examine the data

You can download and open the Excel file located here. The data should look like the following:

The data are in wide format and are ordered by Total Deaths in descending order.  The abbreviation NYC represents New York City instead of the state. The data also include districts (some cities) and US territories.

 

Step 2. Identify the location of the sparklines options

Excel’s sparkline feature is located in the ribbon area. You can locate this in the Insert tab under the sparklines area.

For this tutorial, we will use the “Column” option. This will generate bar charts within the cell that we identify.

 

Step 3. Insert the sparkline

In the cell next to the “Total Deaths” column, we will add the column label “New Deaths Over Time.” Select the empty cell for the first state and insert the sparkline. A window will open prompting for information. Data Range should include the data for the state from Jan to Oct. Once you’ve selected your data, click on “OK.”

Step 4. Change row height to fit the sparkline

The sparkline is inserted into a narrow cell with a low row height. We can change this by right-clicking on the row label, selecting “Row Height…” changing the “Row Height” to 45, and then confirming that this is the height you want. Feel free to change the height to fit your needs.

Step 5. Select other states

At this point, the sparklines are nearly complete. You can expand this to the other states by selecting the corner of the cell and dragging the sparkline to the other cells.

Step 6. Hide cells

We can hide the cells with the number of deaths from Jan to Oct. However, when you do this, the sparklines will disappear. Excel uses the unhidden cells to create the sparklines. When you hide the data, the sparklines will not generate the visual. To fix this, open the sparklines ribbon and select the “Edit Data” box. This opens several options. Select the “Hidden & Empty Cell…” and click on the “Show data in hidden rows and columns.” This will allow you to hide cells that the sparklines are using to generate the visuals.

Step 7. Final touches

You can change the color of the sparklines and add some table lines to separate the states. In this tutorial, the Arial Nova Light font was used. Here is the final sparklines for the first five states with the most total number of COVID-related deaths in the United States.

CONCLUSIONS

Using sparklines can enhance a data visualization by adding a time series of historical events (e.g., COVID-19 deaths). This is useful when you want to incorporate this alongside some data in a table, spreadsheet, or dashboard. In our tutorial, we used the historical of deaths for each states in the United States to give the audience some perspective of what has occurred despite the high number of deaths. For instance, New York City had the highest number of total deaths, but historically, these deaths occurred in the months of April to July and dropped significantly in recent months. Other states started out with low number of deaths but have many their total deaths in recent months. This provides us with another dimension to the visualization of how different the epidemiologic spread of deaths occurred across the United States.

REFERENCES

  1. CDC. COVID-19 Cases, Deaths, and Trends in the US | CDC COVID Data Tracker. Centers for Disease Control and Prevention. Published March 28, 2020. Accessed November 18, 2020. https://covid.cdc.gov/covid-data-tracker

  2. Stix YZ Gary. COVID-19 Is Now the Third Leading Cause of Death in the U.S. Scientific American. Accessed November 18, 2020. https://www.scientificamerican.com/article/covid-19-is-now-the-third-leading-cause-of-death-in-the-u-s1/