
Inf@Vis!

The digital magazine of InfoVis.net

Rules to make a bad graphic representation
by Juan C. Dürsteler [message nº 109]

Good graphics are those that aren’t noticed, the ones that support and show the data without interfering with it. We review some of the rules on how not to make a good graphic representation.

Many companies prepare and present next year’s budget around this time: presentations about the sales decisions that, along with severe cost cuts, will allow us to grow and remain profitable next year, even though the economic environment is not in its best shape.

Can you imagine the audience congratulating the presenter on his or her excellent graphics and careful selection of colours? Not me. First, because I suspect that this type of presentation no longer uses graphics. Since I don’t have objective data, I encourage you to tell me whether the budget presentation in your company is full of graphics and charts or not. Second, because if the presenter is congratulated on the beautiful graphics, it means that they are hiding the data.

An excellent graphic should make us exclaim: what interesting data! If what catches our attention in a graphic or a table is its colours or the way it is made, it means that either the data is not shown appropriately or it is of no interest at all. As we have already said about design on other occasions, a good graphic is one that is not perceived. It’s just there to reveal the data and to show the phenomena underlying it.

But what does it mean “to make a good graphic”? Howard Wainer, in his book “Visual Revelations”, summarises it this way: “The goal of a good [quantitative] graphic is to show the data with precision and clarity”.

And since, more often than not, it’s easier to know what doesn’t work than what does, Wainer turns the sentence into its negative and asks himself: how do you make a pitiful graphic? (Let’s agree that it’s more fun to look at it that way.)

The answer, then, is easy: don’t show much data and if you do, do it in an imprecise and obscure way.

Wainer develops these three concepts into 12 rules that we can’t enumerate in full here for the sake of brevity. However, these 12 rules have some common denominators that I will try to summarise in two: occultation and inconsistency.

Occultation: You can avoid showing the data in different ways 

  • Minimising the data density.

  • Minimising the data/ink ratio

  • Hiding the difference. 

  • Showing the data out of context.

  • Emphasising the trivial.

  • Labelling in an unreadable, incomplete and ambiguous way

Let's look at some examples:

Minimising the data density

[Figure: SinDatos_en.gif]

Not much data for a large chart. See also issue number 74. In this case a table or even a sentence would be enough to summarise only four items.

Minimising the data/ink ratio

[Figure: DataInk.gif]

Using little ink for the data and a lot of ink for the axes, the reference grid, the labels and the ancillary elements.
Inspired by an example in Wainer's book.

Hiding the difference

[Figure: OcultoEscala_en.gif]

Choosing a scale on which the differences between the data are barely perceptible.
Inspired by an example in Wainer's book.
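
The effect of the scale choice in "Hiding the difference" can be put into numbers. The following is only a minimal sketch in Python: the function name `visible_fraction` and the values are hypothetical and are not taken from the chart above. It computes which fraction of the axis is actually occupied by the spread of the data.

```python
def visible_fraction(values, axis_min, axis_max):
    """Return the fraction of the axis length spanned by the spread of the data."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    spread = max(values) - min(values)
    return spread / (axis_max - axis_min)


# Hypothetical values, invented for illustration only.
values = [48, 50, 52, 51]

# An axis from 0 to 1000 squeezes the whole variation into a sliver.
hidden = visible_fraction(values, 0, 1000)

# An axis fitted to the data lets the same variation fill a good part of the plot.
shown = visible_fraction(values, 45, 55)

print(f"axis 0-1000: the data spread uses {hidden:.1%} of the axis")
print(f"axis 45-55:  the data spread uses {shown:.1%} of the axis")
```

With these invented numbers the same differences occupy a tiny fraction of the first axis and a large part of the second one. Which of the two is the honest choice depends on whether the differences matter for the story the data tells.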

Showing the data out of context

[Figure: Villabeoda_en.gif]

For example, by eliminating the earlier data (showing only the data inside the red rectangle). Those earlier data show that the 1994 anti-alcohol campaign in Villabeoda raised consumption instead of reducing it.

The data and the story are, obviously, fictitious.

Emphasising the trivial

[Figure: OcultoAccidentes.gif]

Laying out the elements of a graphic so that what catches your attention is not the most important or the most negative conclusion.

Another method is to fill adjacent areas with saturated colours or dense patterns. This creates an annoying and discouraging effect.

The example shows the data on traffic accidents in Spain (total and fatal) per province for the year 2000. Source: Dirección General de Tráfico.

Labelling in an unreadable, incomplete and ambiguous way

In the car crash graphic it's impossible to know which series corresponds to the total accidents (the highest), which one to the fatal crashes (the intermediate) and what the lowest one represents (the percentage of fatal accidents over the total).
On the other hand, the value labels overlap and occlude one another.
It would suffice to represent the percentage along with the total number of accidents to reach the most interesting conclusions: which provinces have the deadliest accidents and which ones have the most accidents.

Inconsistency: Most graphics are based on some type of codification. For example, the length of a bar in a bar chart is proportional to the magnitude we want to show. The axis scales, in turn, allow us to contextualise and reference the phenomenon. This leads to different inconsistency techniques.

  • Ignore the codification. 

  • Codify in one dimension and represent it in many

  • Change regularity in the middle of the axis

  • Compare values between curves or change the position of the origin of different data to avoid comparing in the same conditions

Ignore the codification

[Figures: InconsBorduria.gif, InconSildavia.gif]

Make the lengths or the areas that represent the data non-proportional to their values, at your discretion.

Even better, invert the codification without saying so. In the example the two charts represent the exports to and the imports from two different imaginary countries. In the left one the darker colour represents imports. It appears, then, that we export more than we import from the two countries... except that in the right-hand chart the dark colour means exports. See the legends in the images.

Codify in one dimension and represent it in many

[Figure: InconsPomas.gif]

For example, codify lengths but show areas or, even better, volumes.

In the example the apples represent the production. The height of each apple is proportional to the production, but what we perceive is its area. Between the first and the last one there is a difference of 54%, but what we perceive is 139%!
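
The arithmetic behind the apples can be sketched in a few lines of Python. This is only an illustration under an assumption that the text does not state: that each apple keeps its proportions, so its width (and, for a 3D icon, its depth) grows by the same factor as its height. The function name `perceived_increase` is hypothetical.

```python
def perceived_increase(linear_increase, dimensions):
    """Relative growth of an icon whose every dimension is scaled like the data.

    linear_increase: relative difference in the data (0.54 means 54%).
    dimensions: 1 for a length, 2 for an area, 3 for a volume.
    Assumes the icon keeps its proportions when it is scaled.
    """
    return (1.0 + linear_increase) ** dimensions - 1.0


height = 0.54  # the 54% difference in height quoted in the text

area = perceived_increase(height, 2)    # what a flat drawing shows
volume = perceived_increase(height, 3)  # what a 3D-looking icon suggests

print(f"height: +{height:.0%}, area: +{area:.0%}, volume: +{volume:.0%}")
```

With a rounded 54% the area difference comes out at roughly 137%, close to the 139% quoted above, which was presumably computed from the unrounded production figures. For a volume the exaggeration is larger still.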

Change regularity in the middle of the axis

[Figures: InconsSalary.gif, InconsSalary2.gif]

In the example (left-hand chart) the axis begins in steps of 8 years and ends up in steps of 1 year. This is almost a logarithmic scale. It makes the salaries of group A, which increase nearly exponentially, appear to increase linearly, as those of group B actually do.
The chart on the right shows the same data correctly represented with a regular axis, revealing the imbalance between the groups.

Redrawn from data in the Washington Post, Jan. 11, 1979, as it appears in Wainer's book.

Compare values between curves or change the position of the origin of different data to avoid comparing in the same conditions

[Figure: InconSales.gif]

A typical example that leads to confusion is the stacked bar chart. Except for the total values, the individual contributions are very difficult to compare because their baselines begin in different places.
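
The irregular time axis described above can be checked numerically. The sketch below is only an illustration: the tick years and the salary values are hypothetical, not the Washington Post data, and the helper names are my own. It shows how a series that rises by the same amount from tick to tick, and therefore looks linear when the ticks are drawn equally spaced, is actually accelerating when measured per year.

```python
def axis_steps(ticks):
    """Differences between consecutive tick values."""
    return [b - a for a, b in zip(ticks, ticks[1:])]


def is_regular(ticks):
    """True when all steps are equal (assumes exactly comparable tick values)."""
    return len(set(axis_steps(ticks))) <= 1


def growth_per_year(ticks, values):
    """Increase of the values per unit of the axis between consecutive ticks."""
    return [
        (v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(zip(ticks, values), zip(ticks[1:], values[1:]))
    ]


# Hypothetical axis: the steps shrink from 8 years to 1 year.
years = [1940, 1948, 1952, 1954, 1955]
# Hypothetical salaries: +10 at every tick, a straight line if ticks are evenly drawn.
salaries = [10, 20, 30, 40, 50]

print("steps:", axis_steps(years))
print("regular axis:", is_regular(years))
print("growth per year:", growth_per_year(years, salaries))
```

The steps are not constant, and the growth per year doubles from one interval to the next even though the growth per tick is always the same. Redrawing the data on a regular axis is what makes that acceleration visible.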

Maybe this seems obvious to you, but you can find, more frequently than expected and probably more by mistake than anything else, graphics with one or more of these elements.

Making a good graphic requires a certain amount of practice but, above all, reflection; that is, in the end, time. Something that nobody seems to have in excess.

Links of this issue:

http://www.infovis.net/printRec.php?rec=llibre&lang=2#VisualRevelations   The book Visual Revelations by Howard Wainer
http://www.infovis.net/printMag.php?num=74&lang=2   Issue num. 74 titled Graphical Grammar
© Copyright InfoVis.net 2000-2014