Eight Principles of Data Visualization
Imagine you are walking out of the office after a long day and your phone buzzes with a new email. Taking a quick glance, you see that it’s from Joe in operations: "Hey, wondering if you could run me a few numbers and put them in a nice chart to show how well our new store layouts are doing along with the latest sale promo we started last week. Need to put it into a presentation for the executive team next Monday. Thanks."
What does Joe really need? Where do you start? Collecting and managing raw data are becoming more pervasive tasks, and for anyone in a business environment who does either, the need to process that data into a human-usable form is increasingly common.
Visualizations, like the chart Joe asked for, are a great way to accomplish this, but they can be difficult to do properly, as anyone who has sat through a slide show presentation with an unreadable pie chart or vague growth projection graph can attest. As available data becomes more complex and extensive, weaving it into a visualization that invites engagement, understanding and decision-making is a bigger challenge, with a bigger opportunity for payoff.
Some of the traditional business standbys, like a one-off pie chart or simple line graph, even if done well, may not offer enough data to answer multi-faceted questions like Joe's. (See Figures 1 and 2, at left.) How can we take visualizations to the next level, so they can take on the challenge of today's business complexity?
Get the Fundamentals Right
The first step is to back up and focus on the basics. If you have ever played a team sport with a good coach, you may recall that he or she spent a lot of time working on fundamentals. Trick plays or advanced moves don’t win a game without solid fundamentals supporting them, and data visualization is no different. The most complex, data-rich graphic is useless unless it follows basic principles of good visualization:
1. Understand the problem domain. If you are producing visualization for your own use or that of your department, chances are good you already understand the area you will be working in. But if, as in our scenario with Joe, the visualization is for another department, or even an external stakeholder such as a customer or partner, you may need to ask questions and do more research to understand what is involved. In this case, you should investigate when these initiatives started, whether any others are in progress at the same time and what metrics the executive team will use to determine success.
2. Get sound data. This may seem obvious, but good data is at the heart of any effective visualization. Make sure the data you select is as accurate as possible, and that you have a sense of how it was gathered and what errors or inadequacies may exist. For example, maybe our store sales data for Joe is only current as of the last close of business, thanks to an older cash register system. Make sure you get relevant data and enough of it. We probably want not only sales data after these changes, but also the month or quarter before and even the same period in past years for comparison purposes. Above all, to create an effective visualization, you need to understand the meaning of the data you are working with. This can be a challenge if it has been stored as raw numbers. In this case, we may need to determine the store visitor counting method being used to know what those numeric tallies mean.
3. Show the data and show comparisons. Picking the best type of visualization is both an art and a science; however, the basic rule of thumb is to choose a spatial metaphor that will show your data and the relationships within it, with minimum distractions or effort on the part of the viewer. As Eddie Breidenbach explains, most graphic arrangements fall into one of four categories or metaphors (see Figure 3, at left):
- Network - to show connections, sometimes in a radial layout.
- Linear - to show how something varies over time or in relation to another factor, often on an X/Y space.
- Hierarchical - to show groupings and importance; these can come in many different layouts.
- Parallel - to show reach, frequency or shares of a whole; these can come in many different layouts.
For Joe's chart, we can start with a well-labeled, linear line graph since we want to see how sales have been affected since introducing these new initiatives. (See Figure 4, at left.)
4. Incorporate visual design principles. Using sound visual design elements, like line, form, shape, value and color, together with principles like balance and variety, makes a visualization both more inviting and easier to read for trends and comparisons. (See Figure 5, at left.) This will become particularly important as we take our linear metaphor visualization to the next level.
Bring in More Dimensions
Once we have good data and a sound underlying spatial metaphor (in this case, a linear metaphor), it is time to take account of the complexity at play. Though it might seem like we have satisfied the initial question at face value (“Sales are up since changing the store layout and starting the new promo”), this answer is likely to spur more questions.
Based on our knowledge and research into the problem domain, we can come up with initial follow-up questions after looking at the simple linear metaphor visualization:
- We started both of these initiatives right before a holiday weekend. How do we know that this uptick in sales is not just a seasonal trend?
- Total sales are up, but has the new store layout succeeded in improving the performance of some departments that were struggling before?
- Are we succeeding in getting more customers into the store and not just selling more to existing ones?
- Are customers shopping more departments and buying a more diverse mix of items?
Asking these kinds of questions is a great exercise to begin taking a visualization to the next level because they prompt us to add more dimensions that allow viewers to explore and understand the subject from additional angles and in more detail. A variety of solid techniques can help achieve this additional dimensionality. The four techniques below answer these questions in turn:
5. Add small multiples. As described by author Edward Tufte, small repeated variations of a graphic side-by-side allow for quick visual comparison. Whenever possible, scales should be kept the same and the axis of comparison, aligned. Adding some small, stacked thumbnails of our chart next to the main one allows a comparison of sales trends for the same period last year, and the one before that. (See Figure 6, at left.) This answers our first question: sales do normally go up this time of year, but the increase seems to be quite a bit bigger this time, so it is probably not just the normal seasonal cycle.
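One way to keep the scales identical across small multiples is to compute a single axis range from all panels before drawing any of them. The sketch below is only an illustration: the weekly sales figures, the panel names and the helpers `shared_axis_limits` and `percent_change` are invented for this example, and the actual drawing is left to whatever charting tool is in use.

```python
import math

# Hypothetical weekly sales (in thousands) for the same six-week window
# in three different years. The numbers are made up for illustration.
panels = {
    "two years ago": [40, 42, 41, 45, 47, 46],
    "last year": [43, 44, 43, 48, 50, 49],
    "this year": [45, 46, 47, 58, 63, 61],
}

def shared_axis_limits(series_by_panel, step=10):
    """Return one (low, high) range, rounded outward to `step`, covering every panel."""
    values = [v for series in series_by_panel.values() for v in series]
    low = math.floor(min(values) / step) * step
    high = math.ceil(max(values) / step) * step
    return low, high

def percent_change(series, split):
    """Percent change in the mean value from before index `split` to after it."""
    before = sum(series[:split]) / split
    after = sum(series[split:]) / (len(series) - split)
    return round(100 * (after - before) / before, 1)

# Every thumbnail would be drawn with these same vertical limits.
limits = shared_axis_limits(panels)

# Assumption: the seasonal uptick starts at week index 3 in every year.
uplift = {name: percent_change(series, 3) for name, series in panels.items()}
```

With a shared range, a taller rise in one panel really does mean a bigger increase, which is what lets the viewer separate the usual seasonal bump from this year's larger one.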
6. Add layers. Adding extra levels of information, while preserving the high-level summary data, can make a graphic more flexible and useful. Next, we are going to break down the "top line" of total sales into departments. (See Figure 7, at left.) The resulting stacked area chart answers our second question, showing that sales from the appliances department have increased as a proportion of the whole, but media department sales have not improved much.
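A stacked area chart is built by piling each layer on top of the running total of the layers beneath it, so the outermost edge is still the original "top line." The following is a minimal sketch of that bookkeeping, assuming invented department figures; the function names `stack_bands` and `shares` are hypothetical helpers, not part of any charting library.

```python
# Hypothetical weekly sales by department (in thousands), invented for illustration.
sales = {
    "appliances": [10, 11, 16, 20],
    "media": [12, 12, 12, 13],
    "apparel": [18, 19, 20, 21],
}

def stack_bands(series_by_layer):
    """Return ({layer: [(bottom, top), ...]}, totals) for a stacked area chart."""
    n = len(next(iter(series_by_layer.values())))
    running = [0] * n  # height of the stack so far at each time step
    bands = {}
    for layer, series in series_by_layer.items():
        bands[layer] = [(running[i], running[i] + series[i]) for i in range(n)]
        running = [running[i] + series[i] for i in range(n)]
    # After the last layer, `running` is the total sales "top line".
    return bands, running

def shares(series_by_layer, totals):
    """Return each layer's proportion of the total at every time step."""
    return {
        layer: [round(v / t, 3) for v, t in zip(series, totals)]
        for layer, series in series_by_layer.items()
    }

bands, totals = stack_bands(sales)
share = shares(sales, totals)
```

In this made-up data the appliances share grows from the first week to the last while the media share shrinks, which is the kind of shift the layered view makes visible without hiding the total.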
7. Add axes or coding patterns. Another way to get more dimensions in a graphic is to add patterns for coding information, such as varying the shape or color of points on a plot based on a variable. In some cases, an extra axis in space, alongside an existing one or in a new direction (for a 3D chart), can also be useful for showing new variables. It's important to be careful with this approach, as it can add clutter, but when used sparingly and with good design principles it can increase a graphic's usefulness. In Figure 8 (at left) we added a second vertical axis on the right to show daily foot traffic into the store, with its scale overlaid carefully to be comparable but distinct. This answers question number three: yes, we have increased foot traffic, but only after the sales promotion.
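Underneath, a second vertical axis is just a second mapping from data values to the same plot height. The sketch below shows that idea under stated assumptions: the daily figures, the axis ranges and the `axis_mapper` helper are all hypothetical, and positions are expressed as fractions of the plot height rather than pixels.

```python
def axis_mapper(data_min, data_max, plot_bottom=0.0, plot_top=1.0):
    """Return a function mapping a data value to a vertical plot position."""
    span = data_max - data_min

    def to_plot(value):
        return plot_bottom + (value - data_min) / span * (plot_top - plot_bottom)

    return to_plot

# Hypothetical daily figures, invented for illustration.
sales = [40, 44, 52, 60]              # left axis: thousands of dollars
foot_traffic = [800, 820, 810, 1000]  # right axis: visitors per day

# Each axis gets its own range but shares the same plot area.
left_axis = axis_mapper(0, 80)
right_axis = axis_mapper(0, 1200)

sales_positions = [left_axis(v) for v in sales]
traffic_positions = [right_axis(v) for v in foot_traffic]
```

Starting both ranges at zero is a design choice made here so that neither series has its changes exaggerated relative to the other; other choices are possible, but the two scales should always be labeled clearly so viewers do not read one series against the wrong axis.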
8. Combine metaphors. So far, we have used a linear metaphor for our visualization. However, to answer our last question, we want to add a network metaphor to show connections between product categories in purchases. A pair of circular relationship (chord) diagrams showing snapshots at the beginning and end of the time period under consideration can help compare these connections. As in a pie chart, each product category is assigned a section of the circle, by percentage of total sales, but the center of the circle is hollow. If a majority of purchases containing items in one category also included items in a second category, a line is drawn to that second category; line width is based on the average proportion of both categories in the mixed purchases. As shown in Figure 9 (at left), the increase in these chord lines from the first to second diagram suggests there are indeed more purchases that cross departments since our initiatives went into place.
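The rule for drawing chord lines can be sketched in a few lines of code. This is one possible reading, not a definitive implementation: the purchase data is invented, `chord_links` is a hypothetical helper, "majority" is taken to mean strictly more than half, and line width is assumed to be the average fraction of a mixed purchase's value that the two categories account for together.

```python
from itertools import permutations

# Hypothetical purchases: each maps a product category to the amount spent on it.
purchases = [
    {"appliances": 300, "media": 50},
    {"appliances": 200, "media": 40, "apparel": 60},
    {"appliances": 150},
    {"media": 30},
    {"media": 20, "apparel": 80},
    {"apparel": 45},
    {"apparel": 70},
]

def chord_links(baskets):
    """Return {(from_category, to_category): line_width} for qualifying pairs."""
    categories = sorted({c for basket in baskets for c in basket})
    links = {}
    for a, b in permutations(categories, 2):
        with_a = [p for p in baskets if a in p]
        mixed = [p for p in with_a if b in p]
        # Draw a line only if MORE than half of the purchases with `a` include `b`.
        if not with_a or len(mixed) * 2 <= len(with_a):
            continue
        # Assumed width: mean share of each mixed purchase taken up by a and b.
        widths = [(p[a] + p[b]) / sum(p.values()) for p in mixed]
        links[(a, b)] = round(sum(widths) / len(widths), 3)
    return links

links = chord_links(purchases)
```

Note that the rule is directional. In this sample data most appliance purchases also include media, so a line is drawn from appliances to media, but only half of the media purchases include appliances, so no line is drawn in the other direction.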
This relationship data would be even better if we could see it at any chosen point in time (for example, to see what effect, if any, the layout change alone had, before the promotion started). A zoomed-in view of the chord diagrams for detailed study might be useful, too. Clearly, some presentation media lend themselves to these opportunities more than others. As our graphics increase in complexity and sophistication, we need to think more carefully about how to deliver them.
Consider New (and Old) Delivery Methods
The point of any visualization is to be viewed by the right people, in the right context. Unfortunately, many business visualizations have a fleeting life on a slide, up one minute on a low-resolution projector to be scanned from across the room, and nothing but a vague memory the next.
What if, instead of a “flash on a slide” with all of these limitations, Joe's final visualization was printed in high-resolution color on a handout? Everyone could refer back to it as a touchstone during the whole presentation, seeing how the data backs up Joe's conclusions. Afterward, they could tack it up on a whiteboard for further study and follow-up.
On the other hand, maybe Joe needs people at a remote site to see this graphic or he would just prefer not to kill so many trees. He might consider putting a high-resolution version on the Web (or corporate intranet) for viewing on a PC or tablet. This could be as simple as a static graphic like the paper copy, but it also opens all kinds of possibilities for interactivity. To give just a few examples, we could enable scrubbing through time (great for seeing more network metaphors), drilling down and zooming out for a bird's eye view, seeing new data live as it becomes available or even manipulating future variables to watch different scenarios play out.
Toward the Future
As visualization moves toward delivery via electronic media, complex data visualization is increasingly blending into the disciplines of user experience design and programming. Business analysts, IT staff and knowledge workers will need more skills in designing, building and using fluid, interactive, dynamic visualizations. Fortunately, there are great tools and great groups of people focused on user experience. The potential payoff for the investment is huge: visualizations invite us to explore, understand and decide, not as one-off disposable products, but rather as robust, enduring touchstones that customers and leaders return to for insight, conversation and connection.
Note: For more on visualization fundamentals, a good place to start is Edward Tufte's excellent series beginning with “The Visual Display of Quantitative Information.” Also see “Visual Design Fundamentals: A Digital Approach” by Alan Hashimoto.