### Lies, Damn Lies, Statistics, and Charts

The recent Irish Government report Delivering the Smart Economy gave me a number of great examples for a class I sometimes give on how poorly (often deliberately) designed charts and graphs can mislead without actually lying. Now I can replace some of my contrived examples with real-life ones.

The first example, Business Expenditure on R&D, is unusual. It shows business expenditure on R&D by indigenous and foreign companies over the years. Bizarrely, the graph codes the information with bars of three colours rather than two. The far too subtle drop shadow above the bars is meant to indicate that the bars are overlaid on top of each other. At a glance it appears as though the foreign investment is not significantly greater than the indigenous, because the total areas of their respective colours are comparable. A more honest designer would simply have used two colours and stacked them one atop the other. This (the designer might retort) would make the individual data points more difficult to read, since some mental subtraction would be required. But that problem is solved by printing the actual values on the chart, as is done anyway. My stacked two-colour version shows the relative performance of both sectors more honestly.

In Edward R. Tufte's superb book The Visual Display of Quantitative Information the author warns of the Lie Factor and cautions against using two- and three-dimensional shapes to represent one-dimensional data. The second graph, Growth in Total Turnover, is unusual in that it uses coloured circular areas to show the values on the linear axis in the centre. As with the previous graph, the overlay of one colour on the other is deceptive. But the use of circles here instead of rectangular bars adds another deceit. When the height of a rectangular bar is increased by 10%, say, the total area of the bar increases by the same amount. Increasing the height of a circle by 10% increases its area by 21%. Doubling the height of a circle increases its area four-fold. This effect makes the increase from 2004 to 2006 appear larger than it actually is.
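The arithmetic behind those percentages is easy to check. Here is a minimal sketch, assuming the "height" of a circle means its diameter; the function names are my own and the figures are the 10% and doubling cases from the paragraph above, not data from the report.

```python
import math


def circle_area(diameter: float) -> float:
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2) ** 2


def area_growth(height_scale: float) -> float:
    """Factor by which a circle's area grows when its height (diameter) is scaled."""
    return circle_area(height_scale) / circle_area(1.0)


# A bar's area grows linearly with its height; a circle's grows with the square.
for scale in (1.1, 2.0):
    print(f"height x{scale:g} -> bar area x{scale:g}, circle area x{area_growth(scale):.2f}")
```

The circle's area always grows by the square of the height ratio, which is why any increase in the data looks exaggerated once it is drawn as a disc.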

The Total R&D Expenditure chart pulls the same trick, but this time confuses the issue with off-centre circles and semi-circles. Strangely, the data points do not appear at the circumference of each coloured area, but at the mid-point between the area boundaries. The caption is confusing: I am not sure whether these are year-on-year figures or cumulative ones. Either way the Lie Factor is very big. The 2008 value is roughly 2.7 times that of 1998, yet the circle used to represent it has an area 7 times greater.
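To put a rough number on that, here is a small sketch of the Lie Factor as I understand Tufte's definition: the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is the relative change from the first value to the second. The function names are my own, and the inputs are the approximate ratios quoted above rather than exact figures from the report.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(data_ratio: float, graphic_ratio: float) -> float:
    """Effect shown in the graphic divided by the effect present in the data.

    Both arguments are ratios of the later value to the earlier one.
    """
    return effect_size(1.0, graphic_ratio) / effect_size(1.0, data_ratio)


# Approximate ratios read off the chart: data x2.7, drawn area x7.
print(round(lie_factor(2.7, 7.0), 2))  # 3.53
```

A faithful graphic would score close to 1, so a value of about 3.5 means the picture overstates the change several times over.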

A similar technique is used to illustrate the Trend in Higher Education R&D Expenditure. In this chart a near five-fold increase in expenditure is depicted with a circle well in excess of 100 times the size. My more mundane line chart reflects the increase more accurately.
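If circles had to be kept, the usual remedy is to make the area, not the diameter, proportional to the value, which means scaling the diameter by the square root of the data ratio. The sketch below is only an illustration of that rule using the rough five-fold figure above; the function name is hypothetical and nothing here is taken from the report's data.

```python
import math


def honest_diameter_scale(value_ratio: float) -> float:
    """Diameter multiplier that makes a circle's area grow by exactly value_ratio."""
    return math.sqrt(value_ratio)


ratio = 5.0  # the near five-fold increase in expenditure
scale = honest_diameter_scale(ratio)
print(f"diameter x{scale:.2f} gives area x{scale ** 2:.2f}")
```

Even drawn this way, areas are harder to compare by eye than lengths, which is one reason I prefer the plain line chart.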

## Comments