Visualisation and Presentation in Statistics

Invited oral presentations

John Aldrich (Division of Economics, School of Social Sciences, University of Southampton):
Graphs before graphics--some history

Abstract: Statistical graphics are today very prominent but they have a long history. The historian of statistical graphics, Michael Friendly, has divided that history into several periods: a formative period before 1850, a `golden age' between 1850 and 1900, the `modern dark ages' of 1900-50 and a `rebirth' in 1950-75. I examine some examples from each of these periods and consider how they reflect the changing character of statistics.


Rosemary A. Bailey (School of Mathematical Sciences, Queen Mary, University of London):
Bad Statistics

Abstract: The news media are full of statistics, often presented in a misleading way. Each year I set my first-year students an assignment to find a newspaper article reporting a statistical investigation and to comment on it: they always find some bad examples, some of which I shall present. However, bad presentations of statistics occur at a more serious level: for example, in the management of universities and research organizations, and in research publications, where many subject areas have conventions for statistical presentation that are not well suited to the complexity of today's experiments.


Martin Bland (Department of Health Sciences, University of York):
Reporting clinical trials with confidence

Abstract: The quality of the best clinical research has improved greatly over the past forty years. A key element in this has been the change from presenting results as significance tests to presenting them as confidence intervals; another has been the adoption of CONSORT standards for reporting clinical trials. I shall describe how these developments fit with other changes in research and give a prominent example where ignoring them led to potentially misleading conclusions.


Michael Blastland (Freelance writer and broadcaster):
Statistics: a game for two players

Abstract: Statistics often relies on concepts that many people outside the trade find intuitively difficult. For these audiences - which might include clients, politicians or the public - how we present our data and arguments can help enormously. I will discuss how non-statisticians respond to statistical ideas, and whether we can help them to experience or 'play' their way into an idea or data by using interactivity.


Tony Hirst (Communication and Systems Department, The Open University):
Visualisations for the rest of us - How to create rich interactive visualisations without any of the pain

Abstract: For many people, the ability to visualise data is limited to generating chart types provided by spreadsheet applications and statistical software tools, or delving into the intricacies of scientific programming libraries. However, the rapid growth in the publication of public data sets on the one hand, and the development of powerful graphics generating libraries capable of running in web browsers as well as in desktop applications on the other, have resulted in the proliferation of tools supporting interactive visual data exploration and analysis. The growth in "data journalism" has also spurred news media organisations such as the New York Times to invest in the development of online interactive data exploration tools.

In this presentation, I shall review some of the available tools and libraries, including the open source Gephi application (a graphical network analysis package) and IBM's Many Eyes suite of interactive charts, and demonstrate how they can be used to support interactive exploration of data sets without the requirement of programming knowledge. If time allows, the presentation will also cover the rise in visual "data cleansing" applications such as Google Refine and the Stanford Visualisation Lab's Data Wrangler tool.


Jill Leyland (Vice President, Royal Statistical Society):
The lack of confidence in UK official statistics - are communication problems responsible? - And what can be done?

Abstract: Surveys in the UK have consistently shown that only around one third of respondents believe that official statistics are generally accurate. And in 2007 a Eurobarometer poll found public confidence in the UK in official statistics to be the lowest in the EU. These results do not reflect the statistical quality of official statistics, so why is confidence so low? There are a number of reasons but the presentation will argue that communication issues, in the broadest sense, are a large part of the problem. These include perceptions of political interference (not helped by pre-release access granted to government ministers), skewed interpretations by politicians, media and others, the relatively low priority traditionally accorded by official statisticians to communication and to "marketing" their output, and a failure sometimes to remember that communication also includes listening and responding to user needs. Some recent developments have been on the right track but it will be argued that (much) more needs to be done.


Kevin J. McConway (Department of Mathematics and Statistics, The Open University):
Statistics in the media: publicity, entertainment and wallpaper

Abstract: Statisticians and scientists are rightly concerned about the way that mass media deal with their findings and reports. But much statistical material in the media does not come through from the usual statistical and scientific sources - it is produced and placed by PR agencies, or it is there to amuse and entertain, or sometimes just to fill up the end of a newspaper column or to make the page look nicer. I will discuss, with examples, whether statisticians should be concerned about these uses of statistics, and, if we are concerned, what we might do about the concerns.


David Spiegelhalter (Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge and MRC Biostatistics Unit, Cambridge):
Visualising risk and uncertainty: the power of movement

Abstract: Perception of probability, risk and uncertainty can be influenced by the choice of words, numbers and pictures. Preferences and understanding vary widely among people, which suggests the use of multiple presentations. Interactive animations allow the user to select the representation with which they feel most comfortable, and allow the option of exploring 'what-if' scenarios. Movement can be used to represent changes over time or uncertainty, and smooth morphing between representations is intended to be attractive, retain interest, and encourage a 'playful' approach. Optional detail can be added, such as acknowledging uncertainty in the stated risks. Examples will be drawn from areas such as health, weather forecasting, and natural disasters.

This is joint work with Mike Pearson (University of Cambridge) and Ian Short (The Open University).


Michel van de Velden (Econometric Institute, Erasmus School of Economics, Erasmus University Rotterdam, The Netherlands):
Perceptual maps: the good, the bad and the ugly

Abstract: Perceptual maps are often used in marketing to visually study relations between two or more attributes. However, in many perceptual maps published in the recent literature it remains unclear what is being shown, how the relations between the points in the map can be interpreted, or even what a point represents. The term perceptual map refers to plots obtained by a series of different techniques, such as principal component analysis, (multiple) correspondence analysis, and multidimensional scaling, each with its own requirements for producing the map and interpreting it. Some of the major flaws of published perceptual maps are omission of reference to the techniques that produced the map, non-unit shape parameters for the map, and unclear labelling of the points. The aim of this paper is to provide clear guidelines for producing these maps so that they are indeed useful and simple aids for the reader. To facilitate this, we suggest a small set of simple icons that indicate the rules for correctly interpreting the map. We present several examples, point out flaws and show how to produce better maps.


Contributed poster abstracts

Martin Ashley (London College of Communication, London):

Poster 1: ROTHSCHILD BANKING GROUP
Abstract: Part of a series of re-designed diagrams for the investment bank Rothschild. Over 50 diagrams were audited and re-designed. A visual toolkit was created in PowerPoint which could be assembled as required. The kit included all the required visual components: graphs, tables, type, headings, colour palette, maps, bullet points, lines, bars, etc.
Comprehensive training and locked-down templates were given to all bankers. The example at right shows how an economic statistician successfully created a highly credible designed diagram.
Design: Martin Ashley Associates


Poster 2: PROSTATE CANCER BROADSHEET
Abstract: Poster broadsheet to educate the public on prostate cancer, with symptoms to look out for.
Seral Mustafa: LCC - London College of Communication. Final Year student BA(Hons) Information Design.


Poster 3: OVERVIEW: VISUALISED STATISTICS RELATING TO PROSTATE, BREAST & PROSTATE CANCER
Abstract: A very user-friendly summary with clear diagrams/charts. The use of a light grey with clear typography means the data gets the maximum chance to communicate.
Seral Mustafa: LCC - London College of Communication. Final Year student BA(Hons) Information Design.


Poster 4: VISUALISATION OF FINANCIAL DATA IN 3D
Abstract: Evolving an effective way of overlaying multiple data sets of equities for financial instruments that was easier to read and had a higher density than existing solutions. Allowed 'drilling' down into detailed datasets.
Created by Alex Graul in tandem with venture capitalists for a London-based hedge fund.
Copyright/registration: Alex Graul.
Design: Alex Graul, LCC - London College of Communication. Final Year student BA(Hons) Information Design.


Jonathan Bright (AstraZeneca, Macclesfield):
The Power of Pictures

Abstract: Three simple examples to illustrate why a picture of the raw data may be valuable.


Robert Grant (Faculty of Health & Social Care Sciences, St. George's, University of London & Kingston University, London):
Composite performance indicators: bringing uncertainty out into the open


Tim Jupp (Mathematics Research Institute, University of Exeter):
Interpretation, verification and calibration of ternary probabilistic forecasts

Abstract: We develop a geometrical interpretation of ternary probabilistic forecasts in which forecasts and observations are regarded as points inside a triangle. Within the triangle, we define a continuous colour palette in which hue and colour saturation are defined with reference to the observed climatology. In contrast to current methods, forecast maps created with this colour scheme convey all of the information present in each ternary forecast. The geometrical interpretation is then extended to verification under quadratic scoring rules (of which the Brier Score and the Ranked Probability Score are well-known examples). Each scoring rule defines an associated triangle in which the square roots of the score, the reliability, the uncertainty and the resolution all have natural interpretations as root-mean-square distances. This leads to our proposal for a Ternary Reliability Diagram in which data relating to verification and calibration can be summarised. We illustrate these ideas with data relating to seasonal forecasting of precipitation in South America, including an example of nonlinear forecast calibration.
Code implementing these ideas has been produced using the statistical software package R and is available from the authors.
A draft paper is available at: http://arxiv.org/abs/1103.1303
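
As a rough illustration of the quadratic scoring mentioned in the abstract above, the sketch below computes a multi-category Brier Score for ternary forecasts as the mean squared Euclidean distance between each forecast point and the vertex of the triangle that was observed. It is written in Python rather than the authors' R, it is not their code, and the function name `brier_score` and the example forecasts are invented purely for illustration.

```python
import math

def brier_score(forecasts, outcomes):
    """Mean squared distance between forecast vectors and one-hot observations.

    forecasts: sequence of (p0, p1, p2) probability triples (hypothetical data)
    outcomes:  sequence of observed category indices (0, 1 or 2)
    """
    total = 0.0
    for probs, observed in zip(forecasts, outcomes):
        if abs(sum(probs) - 1.0) > 1e-9:
            raise ValueError("each forecast must sum to 1")
        # Squared distance from the forecast point to the observed vertex.
        total += sum(
            (p - (1.0 if k == observed else 0.0)) ** 2
            for k, p in enumerate(probs)
        )
    return total / len(forecasts)

# Made-up example: three ternary forecasts and the categories that occurred.
example_forecasts = [(0.2, 0.3, 0.5), (1 / 3, 1 / 3, 1 / 3), (0.6, 0.3, 0.1)]
example_outcomes = [2, 0, 0]

score = brier_score(example_forecasts, example_outcomes)
print("Brier score:", round(score, 4))
print("Root-mean-square distance:", round(math.sqrt(score), 4))
```

Taking the square root of the score gives the root-mean-square distance between forecasts and observations, which is the kind of geometric reading the abstract describes.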


Michel Wermelinger, Paul Piwek (Computing Department, The Open University):
iChart - Interactive Exploration of Data Charts

Abstract: We are interested in developing interactive and machine-readable charts (pie and line diagrams, etc.). Charts are widely used in newspapers, magazines and MCT course materials, and understanding them is a necessary skill, but they can be challenging for people with sight or numeracy problems. We aim to investigate techniques for automatically generating textual summaries of the charts (e.g. summaries that point out data trends). Such textual summaries not only help students understand what the chart is conveying, they can also be read aloud to students with visual disabilities. We will also investigate techniques to allow users to interactively explore a chart (showing maxima, computing growth rates, zooming in, etc.), thereby leading to better engagement with and understanding of numeric data.


Sandra Williams, Richard Power (Computing Department, The Open University):
Get the message across: Be vague, Flex the maths, and Make size matter!

Abstract: Three keys to communicating numerical data are: ban unnecessary precision, use mathematical forms that people can understand, and call attention to unexpected smallness or largeness. Our research investigates how to communicate numerical data in natural language (English). We have investigated which kinds of numbers people perceive as approximate and precise. We have built a cognitive model that automatically varies (i) level of precision, (ii) mathematical form and (iii) direction of approximation. We have investigated how professional authors describe numerical data by approximating and using hedges such as `a little bit under' and `dramatically more than' to emphasise largeness or smallness. Our poster illustrates our results.
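
To make the idea of trading precision for a hedge concrete, here is a deliberately simple Python sketch. It is not the authors' cognitive model: the function name `describe`, the choice of rounding step, and the three hedge phrases are all assumptions made for illustration. It rounds a number to the nearest multiple of a chosen step and reports the direction of the approximation in words.

```python
def describe(value, step):
    """Round an integer to the nearest multiple of step and attach a hedge.

    Hypothetical helper: the hedge wording and the rounding rule are
    illustrative choices, not taken from the research described above.
    """
    rounded = round(value / step) * step
    difference = value - rounded
    if difference == 0:
        hedge = "exactly"
    elif difference < 0:
        hedge = "a little under"   # the true value is below the round number
    else:
        hedge = "a little over"    # the true value is above the round number
    return f"{hedge} {rounded:,}"

for number in (4870, 5120, 5000):
    print(number, "->", describe(number, 1000))
```

A fuller treatment would also vary the mathematical form (fractions, percentages, ratios) and choose stronger hedges for larger gaps, as the abstract indicates.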


Thomas Woodcock, Alan Poots (Imperial College London and NIHR CLAHRC-NWL):
Time Warp: distorted interpretations of time series data in the media

Abstract: Time series data are frequently misrepresented - deliberately or otherwise. One manner in which this occurs is the selection of two observations from the series for direct comparison as evidence for non-stationarity - often implying an interesting or important difference from one point in time to the other.
A recent example [1], [2] pertaining to accident and emergency waiting times in England states "An extra 73,000 patients were left to wait for over four hours in the last three months of 2010 compared with the previous year." Whilst not incorrect, this fact cannot be used in isolation as evidence that a step change has occurred in the time series data. Another example regarding referral to treatment waiting times is discussed [3].
Using only elementary probability, we emphasize that direct comparison of two observations is not a valid way to analyse time series data. We then go on to suggest an alternative way to present such data, as a simple run chart, and apply this to the waiting time data. We discuss the advantages and disadvantages of this approach.

[1] guardian.co.uk, 2011. A&E waiting times increase sharply [online]. Available at http://www.guardian.co.uk/society/2011/apr/04/accident-emergency-waiting-times-increase (Accessed 6 May 2011)
[2] pulsetoday.co.uk, 2011. A&E waiting times jump after Government scales back target [online]. Available at http://www.pulsetoday.co.uk/story.asp?storycode=4127771 (Accessed 6 May 2011)
[3] Fullfact.org, 2011. NHS waiting times: the figures used at Prime Minister's Questions [online]. Available at http://fullfact.org/factchecks/NHS_waiting_times_Prime_Minister's_Questions_David_Cameron_Ed_Miliband_King's_Fund-2673 (Accessed 6 May 2011)
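
The run chart suggested in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis: the function name `longest_run` and the series are invented, and the threshold of six consecutive points on one side of the median is a commonly quoted rule of thumb for signalling a shift, not something stated in the abstract.

```python
from statistics import median

def longest_run(values):
    """Return the median and the longest run of points on one side of it."""
    centre = median(values)
    longest = current = 0
    last_side = 0
    for v in values:
        if v == centre:
            continue  # points on the median neither extend nor break a run
        side = 1 if v > centre else -1
        current = current + 1 if side == last_side else 1
        last_side = side
        longest = max(longest, current)
    return centre, longest

# Made-up series: comparing the 6th value (46) with the last (58) looks like
# a sharp rise, yet the series as a whole shows no sustained shift.
series = [50, 47, 55, 49, 51, 46, 54, 48, 53, 58]
SHIFT_THRESHOLD = 6  # assumed rule-of-thumb threshold

centre, run = longest_run(series)
print("Median:", centre)
print("Longest run on one side of the median:", run)
print("Shift signalled:", run >= SHIFT_THRESHOLD)
```

The point of the sketch is the contrast: a cherry-picked pair of observations can differ substantially while the run-based summary of the whole series gives no signal of change.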