Introduction
Data visualization is a form of visual communication that takes quantitative information and displays it as a graphic, an abstraction of the real world. Effective data communication makes complex statistical analysis accessible without excessive mental burden. It is also used to identify patterns through data exploration. Unlike information visualization which includes catch-phrases such as “Infoviz” and “Infographics,” data visualization is intuitive, informative, and “pretty” while simultaneously focused on scientifically structured comparisons, analytic precision, and statistical inference. The challenge is compressing all the quantitative information into a single chart or graphic that provides a narrative or purpose that can be synthesized and acted on with very little mental effort.
There are a variety of data visualizations that can be used such as choropleths, heatmaps, scatter plots, and dot plots (this list is not all inclusive). The selection is dependent on the data, audience, and narrative. How complex is the analysis? Who are you presenting this information to? Why should the audience care?
The best way to present data effectively is with a good story. Your graphic should be able to tell a story based on the quantitative information. Every graphic you create should be a self-contained narrative of the data. This can be achieved using simple tools, but the creation of effective data visualization depends more on your ability to tell a good story. The purpose of this article is to highlight some important principles of data visualization, review common data visualizations, and develop a mechanism to select the most effective data visualization.
Principles of data visualization
Data visualization can be traced to several different schools of thought (e.g., Edward Tufte and William S. Cleveland), but the fundamental principles are similar and often overlap. Edward Tufte identified several key principles when developing data visualizations (Table 1).