Communicating data effectively with data visualizations: Part 35 (Heat maps – COVID-19 Deaths by Week and State)

INTRODUCTION

With the COVID-19 pandemic entering its second year, the number of deaths in the United States (US), has total over 546 thousand deaths (data from March 26, 2021). Of note is the variation in death rates across US. CDC provides a daily update on the number of deaths reported by each state (and major cities) in the US, which is accessible through their portal. However, it can be difficult to visualize all these data; proper visualizations will assist the audience with interpretability and add to the narrative that the COVID-19 pandemic has been more severe is some states compared to others. The Economist provided some helpful data visualizations for Great Britain by using heat maps to report the deviations from the expected weekly death counts due to COVID-19. Here is an example.

* Source: The Economist – Tracking COVID-19 excess deaths across countries (link). This is for educational purposes only.

From this visualization, it is clear that all sectors of Britain were hit by COVID-19 at similar times. Some areas had higher deviations from the expected mortality rates, while other had lower deviations. Additionally, there appears to be another peak in the deviations from the expected number of deaths in late 2020. Using this example as a model, we can recreate this heat map for the US. Fortunately, The Economist has a GitHub portal with all the data for different countries including the US.

DATA

Data for this exercise is located the The Economist GitHub portal. We will use the “covid-19-excess-deaths-tracker,” which includes data on the deviation from the expected deaths (%). Data on the deviation from the expected deaths (%) for the US will be used to emulate the example from The Economist.

 

Heat map of deviations from the number expected death (%) for the US.

We will use Excel to create a heat map of the weekly deviations from the number of expected death because of COVID-19 in the US from January 2020 to February 2021.

Step 1. Inspect the data:

The data for this exercise is located here. Download and open the file. Review the data under the “exercise data” sheet. It should look like the following:

We will manipulate this data so copy and paste this into a new sheet.

Step 2. Add spaces:

One of the elements that The Economist added to their heat map were spaces between each state or territory.

This can be accomplished manually by inserted a row between each state or territory. However, when there are a lot of rows, this can be a burden. A systematic method includes a couple of columns (number and alternate) to create the spaces we need. These will represent the spaces between the states/territories.

Once you have created these, select the entire data set. Then sort them by “number” and then by “alternate”; see picture below:

You should have spaces between each state/territory now.

Step 3. Change column width and row height:

Select the columns and right click to change the width. Set the value to 2. Once you done this, the weekly deviations should look like the following:

Next, use the Filter feature in Excel to only select the “alternate” values  of 1. This should only show values for the “alternate” column that equal to 1. Select the rows where the column values are 1 and right-click. This will allow you to select “Row Height” so that you can change the value to 25. Please see the illustration on how to do this below:

Now that you’ve change the row height for the “alternate” column values for 1, we will change the row height for the “alternate” column values for 0 and 0.5 using the same method. The only difference is that the row height will be 5 instead of 25. See the illustration below on how to perform this step:

Step 4. Add condition formatting rules:

Now that the data column width and row height have been set properly, you can add the conditional formatting rules to change the color of the boxes to reflect the deviations from the number of expected death (%) due to COVID-19.

First, select all the data. Then go to the Home tab and select “Conditional Formatting.” Make sure to click on “New Rule...” Select the “3-Color Scale” because we will use the gradient levels to create our heat map. The darker color will denote the higher deviations and the lighter color will denote the low or no deviations from the number of expected deaths (%). Make sure to change the “Type” to “Number.” You can enter any values that work for you, but for this example, use “0”, “0.5”, and “1” for the “Minimum”, “Midpoint”, and “Maximum” types, respectively.

Next, with the data still selected, right-click on the data to open a window. Select “Format Cells” because we want to remove the values from the cells. In the Number tab, select “Custom” and enter the following: "";"";"";"". This will change the number values so that it doesn’t show in the cells.

The Excel file should start looking like a heat map. However, there are a few more features we can do to emulate the heat map on The Economist website.

Step 5. Adding borders to the heat map:

After formatting the cells so that the numbers are not visible, you can add borders to the heat map. Select the heat map area, right-click anywhere on the data and select “Format Cells”; you will select the “Border” tab and change the border color to white. Then apply the borders to the heat map.

Once you’ve applied the border color change, the heat map should look like the following:

Next, we will add a border to separate the states/territories from each other.

Use the “Filter” feature and select rows where the “alternate” values are 0.5. The rows will be very narrow since we set them to a value of 5 earlier. Now, select the data area including the state/territories column and right-click on the data. Select “Format Cells” and then change the border color to a light shade of gray and apply this to the top border. This will apply a light gray border to separate the states/territories from each other. See illustration below for how to perform this step.

Once you’ve added the border, the heat map should look like the following:

Step 6. Final steps:

In the last steps, you can change the font, hide some of the values, and add a label to identify the weekly deviations. For the final figure, I hid the “alternate” and “number” columns, and I hid the weekly labels. I replaced the weekly labels with a more informative timeline to based on quarters. I also rearranged the states/territories so that I had the United States at the top and some of the more active states/territories at the top (e.g., New York and New York City). This should be remarkably similar to the heat map developed by The Economist. The final figure can be downloaded from the following link.

CONCLUSIONS

Using the data from The Economist, we recreated their heat map for the US. We can visualize when the COVID-19-related deaths exceeded the expected all-cause mortality deaths on a weekly basis from January 2020 to February 2021. For example, New York City was hit with a high percentage of COVID-19 deaths around April 2020, which was also reported for The District of Columbia, New York State, and New Jersey. However, North Dakota, South Dakota, California, and Arizona were hit with high percentages of COVID-19 deaths that exceeded the expected number of deaths later in the year.

Heat maps provide us with a useful visual to compare and contrast different states/territories and the severity of COVID-19 deaths using simple tools provided by Excel.

 

REFERENCES

I used the lessons from a previous blog on developing heat maps (link).

Data on deviations from the number of expected deaths (%) in the US was provided by The Economist (link). However, I cleaned the data so that you can use it for your own exercise (link).