Power BI

Communicating data effectively with data visualizations: Part 27 (Building a COVID-19 surveillance dashboard using Power BI)

INTRODUCTION

With the advent of the COVID-19 pandemic, several high-profile dashboards have been created using data from around the world. Since Power BI is a new tool, I thought it would be a good opportunity to develop a simple COVID-19 dashboard with it using existing data.

We will use this opportunity to create the following dashboard:

DATA

Data for this tutorial came from the Our World in Data (OWID) GitHub site. You can download a version of the data that focuses on the United States from my Dropbox folder. There are a lot of variables in this dataset; the codebook describing all of them is located here.

 

POWER BI TUTORIAL

Step 1. Download data

The data for the United States is located here. Download this to a convenient location.

 

Step 2. Loading the data into Power BI

When you start Power BI, select “Get data.” Choose the Excel format and then click on “Connect.” Locate the file where you saved it and then, in the Navigator window, select “Sheet 1.” Click on “Load” to complete the process.

Step 3. Plotting total deaths

In the Visualizations panel, select the line chart. In the Fields panel, select the “date” variable. However, we will not use the built-in date hierarchy. Instead, we will use the individual dates in order (e.g., January 1, 2020; January 2, 2020; January 3, 2020; etc.).

After you have selected the date for the X-axis, add Total Deaths as the next variable.
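To make the idea of plotting by date order concrete, here is a minimal sketch in Python of what the line chart does with the data. The column names `date` and `total_deaths` are assumed to match the OWID file, and the rows are hypothetical values made up for illustration.

```python
from datetime import date

# Hypothetical rows; the counts are made up for illustration only.
rows = [
    {"date": "2020-03-03", "total_deaths": 9},
    {"date": "2020-03-01", "total_deaths": 1},
    {"date": "2020-03-02", "total_deaths": 6},
]

def by_date(records):
    """Return the records sorted chronologically by their ISO-format date."""
    return sorted(records, key=lambda r: date.fromisoformat(r["date"]))

ordered = by_date(rows)
x_axis = [r["date"] for r in ordered]          # one point per day, no hierarchy
y_axis = [r["total_deaths"] for r in ordered]  # cumulative deaths on that day
print(x_axis)
print(y_axis)
```

Each date becomes its own point on the X-axis, which is the behavior we want instead of the year/quarter/month grouping of the built-in hierarchy.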

Step 4. Change the labels

The current visualization uses the default labels. However, we can change these in the Visualization panel.

To change the name of the visual, you will need to use the Format option.

Once you’ve done these steps, the first visualization should be nearly complete. Additional edits, such as changing the background color and line color, can be made using the Format option (paint roller icon).

Step 6. Include the Card text

You can add additional visualizations, such as the number of deaths, using the card feature in the Visualization panel.
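When the card is fed a cumulative field such as total deaths, one reasonable number to display is the value on the most recent date. The sketch below illustrates that choice in Python; the helper name `latest_total`, the column names, and the sample rows are all hypothetical, and this is not necessarily the aggregation Power BI applies by default.

```python
def latest_total(records, field="total_deaths"):
    """Return the value of `field` on the most recent date, skipping missing values."""
    usable = [r for r in records if r.get(field) is not None]
    if not usable:
        return None
    # ISO-format date strings (YYYY-MM-DD) sort chronologically as plain text.
    return max(usable, key=lambda r: r["date"])[field]

# Hypothetical rows; the last day has no reported value yet.
sample = [
    {"date": "2020-03-01", "total_deaths": 1},
    {"date": "2020-03-02", "total_deaths": 6},
    {"date": "2020-03-03", "total_deaths": None},
]
card_value = latest_total(sample)
print(card_value)
```

It is worth checking which summarization the card is using, because summing a cumulative column would overstate the total.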

Step 7. Repeat steps for the other visualizations

You can repeat these steps for the other COVID-19 metrics (e.g., new deaths, total cases, and number of new cases).
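The cumulative and daily metrics are related: the new count for a day is the difference between that day's total and the previous day's total. The following Python sketch shows that relationship as a sanity check; the function name `daily_new` and the numbers are hypothetical, and the dataset is assumed to already provide the daily columns.

```python
def daily_new(totals):
    """Turn a list of cumulative totals into day-by-day new counts."""
    new_counts = []
    previous = 0
    for total in totals:
        new_counts.append(total - previous)
        previous = total
    return new_counts

cumulative = [1, 6, 9]          # hypothetical cumulative deaths on three days
print(daily_new(cumulative))    # new deaths on each of those days
```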

Once you have added all the visualizations to your dashboard, you will hopefully get something like the example below.

CONCLUSIONS

Power BI makes it very convenient to build dashboards using available data. Other software packages are also available, so I would recommend exploring them. However, I would also encourage you to explore other dashboards that were created using Power BI; Microsoft has a library of great examples. By using the methods presented in this tutorial, you can build other dashboards using different types of data.

Files used in this tutorial are available here.

REFERENCES

Data for this tutorial came from the Our World in Data GitHub site.

 

Communicating data effectively with data visualizations: Part 26 (COVID-19 choropleth)

INTRODUCTION

Power BI is a useful tool that allows you to build dashboards and data visuals. One ability that makes Power BI unique is its mapping feature. Although there are pre-built map functions in Power BI, better and easier applications are available through the AppSource store. This tutorial will go over how to create a choropleth of the COVID-19 pandemic in the United States (US) using the incidence rate (dated June 21, 2020), Power BI, and the Drilldown Choropleth application.

 

DATA

You will need to download two types of data: (1) the COVID-19 incidence rate for each county in the United States (you can download the Excel file from this Dropbox link) and (2) shape files for the United States. Fortunately, there are many sources of shape files for the United States, and I have made one available using the Albers USA projection, which places the states of Alaska and Hawaii in the lower left corner of the map. You can use the link to the map shape file here. Data for the shape files come from the US Census. [I downloaded the ZIP file and converted the SHP files to TopoJSON using the following conversion tool (Mapshaper.org).] The COVID-19 data contain US county codes based on the Federal Information Processing Standard Publication (FIPS), a 5-digit code that identifies counties. Incidence rate is defined as the number of confirmed cases per 100,000 persons.
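As a quick check on the incidence rate definition, the calculation can be sketched in a few lines of Python. The function name `incidence_rate` and the example county figures are hypothetical and are not taken from the dataset.

```python
def incidence_rate(confirmed_cases: int, population: int, per: int = 100_000) -> float:
    """Return the number of confirmed cases per `per` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return confirmed_cases * per / population

# Hypothetical county: 250 confirmed cases among 50,000 residents.
example_rate = incidence_rate(250, 50_000)
print(example_rate)
```

Expressing the rate per 100,000 persons is what makes counties of very different sizes comparable on the same color scale.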

Albers US projection

POWER BI TUTORIAL

For this tutorial, you will need to have Power BI installed on your desktop to re-create this choropleth.

 

Step 1. Get the data

When you start Power BI, you will see a start screen that allows you to select the data you need. Since the COVID-19 incidence rates for the US data are saved as an Excel file, you will need to select Excel as the data type and then connect to the data.

After clicking “Connect,” the Navigator will open. Select Sheet 1 containing the data. Notice that there are two variables (FIPS1 and FIPS). These are the data that will be used to pair the incidence rate to the US county code. The FIPS code has a “0” in front of some of the county codes (e.g., 05001) for a total of 5 digits. However, notice that these are in numeric format, so the leading “0” is missing. To remedy this, we need to transform the data. Make sure to select “Transform Data” from the Navigator screen.

After selecting “Transform Data,” you will need to change the FIPS variable from numeric to text.

You will be asked to either “Replace” or “Add New Step”; make sure that you select “Replace.” This will convert the numeric data to text, which is what we want for the FIPS variable. Check the data after you do this; notice that the “0” now precedes values that have fewer than 5 digits. (The FIPS code is a 5-digit county code.) After the data have been transformed, select “Close & Apply” to finalize the data. This will load the data into Power BI, which you will use to build the choropleth.
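If you prefer to prepare the file before loading it into Power BI, the same leading-zero fix can be sketched in Python. This is a minimal illustration rather than part of the Power BI workflow; the helper name `to_fips` is hypothetical.

```python
def to_fips(code) -> str:
    """Convert a county code to a 5-character, zero-padded text FIPS code."""
    text = str(code).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a valid county FIPS code: {code!r}")
    return text.zfill(5)

# Numeric codes lose their leading zero; text codes keep all 5 digits.
numeric_codes = [5001, 1001, 36061]
text_codes = [to_fips(c) for c in numeric_codes]
print(text_codes)
```

Storing the result as text matters because the shape file is matched on the full 5-character code, so a 4-digit value would fail to pair with its county.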

 

Step 2. Download and Install “Drilldown Choropleth” app

Power BI allows you to download apps from the AppSource store. Click on the “…” and select “Get more visuals.” This will open the AppSource store where you can type the name of the app you want.

Step 3. Enter data into the Drilldown Choropleth app

After you install the Drilldown Choropleth, an icon will appear in your Visualization panel. Click on this icon (Drilldown Choropleth); this will open up options in the Visualization panel. There are two important fields (“Location” and “Values”). Click and drag the FIPS variable into the “Location” field and the Incidence_Rate variable into the “Values” field.

Next, click on the paint roller icon to open the Drilldown Choropleth options. Expand the “Shape” option to change the “Projection” to “albersUSA” and then enter the link to the TopoJSON shape file for the United States counties. The TopoJSON link is here: https://raw.githubusercontent.com/mbounthavong/Maps/master/c_03mr20.json

 

 

After entering the projection type and the TopoJSON link, you can make changes to the color of the choropleth. Since this is a choropleth, you will need to enter values for the Minimum, Center, and Maximum colors. This will generate a gradient where the darker colors reflect high incidence rates and the lighter colors reflect low incidence rates of COVID-19.
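To illustrate how a Minimum, Center, and Maximum color can produce a gradient, here is a minimal Python sketch that linearly interpolates between three RGB colors. This is an assumption about the general technique, not the Drilldown Choropleth app's actual implementation, and the function name, colors, and thresholds are hypothetical.

```python
def gradient_color(value, vmin, vcenter, vmax, c_min, c_center, c_max):
    """Map a value to an RGB tuple on a three-stop linear gradient.

    Assumes vmin < vcenter < vmax; values outside the range are clamped.
    """
    value = max(vmin, min(vmax, value))
    if value <= vcenter:
        t = (value - vmin) / (vcenter - vmin)
        start, end = c_min, c_center
    else:
        t = (value - vcenter) / (vmax - vcenter)
        start, end = c_center, c_max
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))

# Hypothetical palette: light for low rates, dark for high rates.
LIGHT, MEDIUM, DARK = (255, 245, 235), (253, 141, 60), (127, 39, 4)
for rate in (0, 500, 1000, 5000):
    print(rate, gradient_color(rate, 0, 500, 1000, LIGHT, MEDIUM, DARK))
```

Because values above the maximum are clamped, a few counties with extreme incidence rates will share the darkest color instead of washing out the rest of the map.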

Finally, you can change the size of the visual and save it. Since a lot of data is being processed, updating the choropleth will take a few minutes. With a little patience, you will generate a suitable choropleth of the COVID-19 incidence across US counties for June 21, 2020.

CONCLUSIONS

Power BI allows us to create visually spectacular choropleths. Additionally, it allows us to create choropleths that are interactive. You can hover over each shape and see the incidence rate per county. However, there are limitations. Inserting a legend to describe what the colors represent is not easy (I was unable to solve this problem). There are roundabout ways to get a legend inserted into this choropleth, but they require additional software (as far as I know). The Drilldown Choropleth app could be improved by adding an easy way to include legends and labels. Despite this limitation, Power BI allows us to create these choropleths without having to resort to more complex programming in R or Python.

Files used in this tutorial are available here.

REFERENCES

Mapshaper.org (link: https://mapshaper.org/)

US Census Cartographic files (link: https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html)

COVID-19 data come from the Johns Hopkins GitHub site (link: https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_daily_reports/06-21-2020.csv)