Data journalism: brutal statistics

The typical journalist (a) mismanages his or her own finances, (b) is terrified of computations of a numerical nature, (c) believes that statistics lie but uses them anyway, and (d) knows that the time for data journalism has come.

Do you see a contradiction here? Like all stereotypes, the picture of Joe Soap, the scruffy good-at-heart reporter who uses intuition to get the story rather than cold reason, is misleading but contains more than a grain of truth. Financial journos are hot on share price analysis; environmental reporters know that the devil is usually hidden in the greenwash detail; and those covering public accounts and records in Parliament and government are eagle-eyed to spot the lying statistic.

Right now, the issue of police brutality in South Africa urgently requires the data journalism treatment.

Citizen mobile videos of SA police brutality highlight specific cases, but the big picture is that 932 people are said to have died in police custody in South Africa in 2011/12, according to a report citing the Independent Police Investigative Directorate (Ipid). It is hardly reassuring to the ordinary citizen to be told that the figure, adjusted to a mere 720 fatal incidents, represents a decrease of 10% compared to 2010/11. As the SAPS itself says, “The South African Police Service is responsible for the safety and security of all Therefore, each and every death in police custody or death as a result of police action…is a matter of concern”.

I’ll say. What we’d really like to know is how the stats stack up against those of other countries, both in the West and in developing countries like, say, Brazil or Kenya. This is where publicly available datasets come in – if you know how to source them and do the number-crunching. With the comparative stats in hand one can produce eye-catching graphics that both simplify and interpret the welter of figures.
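To give a flavour of the number-crunching involved, here is a minimal Python sketch, not a definitive method. It does two things: it backs out the previous year's figure implied by "720 incidents, down 10%", and it converts a raw count into a rate per 100,000 people, which is the usual first step before comparing countries of different sizes. The function names are my own, and the population figure is an assumption (roughly the 2011 census count for South Africa); swap in whatever your source gives you.

```python
def implied_previous(current: float, pct_decrease: float) -> float:
    """Back out last year's figure from this year's figure and a % decrease."""
    return current / (1 - pct_decrease / 100)


def rate_per_100k(count: float, population: float) -> float:
    """Express a raw count as a rate per 100,000 people."""
    return count / population * 100_000


# Figures quoted in the post: 720 fatal incidents, a 10% decrease on 2010/11.
incidents_2011_12 = 720
previous_year = implied_previous(incidents_2011_12, 10)
print(round(previous_year))  # 800

# ASSUMPTION: population of roughly 51.8 million, used for illustration only.
ASSUMED_POPULATION = 51_800_000
rate = rate_per_100k(incidents_2011_12, ASSUMED_POPULATION)
print(round(rate, 2))
```

The per-capita rate only becomes meaningful once the same calculation is run on figures for Brazil, Kenya or anywhere else, using counts that are defined in the same way.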

Yet for the most part, journalists find it hard to learn statistical skills and many like to go on in their old ways. The internet has changed things (not always for the better) by turning newsrooms into places where “media workers” study their screens. Legwork and shoe leather have given way to eyeballing and tweeting. A big change in real world reporting has been the rise of the citizen reporter with the video cellphone. These news hounds provide armchair editors with grist to the mill. Increasingly, the media are seeking patterns in the news; and the way to do this is with data.

“The ability to analyse and untangle datasets is a vital skill for journalists in the age of endless information,” says Journalism UK on its website, notifying colleagues of a podcast, Getting started in Data Journalism. It goes on: “…getting further than the basics can seem like a mountain of programming tools and coding languages, but the experts we spoke to describe how to take the first steps.” Those experts include:

  • Paul Bradshaw, online journalist, lecturer and blogger, Help Me
  • Marianne Bouchart, web producer and data journalism projects co-ordinator, Bloomberg News
  • Nicola Hughes, data journalist, Dataminer UK, 2011/12 Knight-Mozilla fellow at the Guardian, soon to join The Times

As this line-up shows, the data journalism movement is highly collaborative, pulling in the skills of a diverse range of media pros, IT geeks and statisticians. This is not about dumbing-down stats for public consumption but about clarifying, focusing and rendering complex collections of numbers to influence open debate and policy-making.

I’ve cited Paul Bradshaw in my journalism courses as the author of common-sense explanations of how data journalism works. He teaches at City University in London while publishing the Online Journalism Blog.

A useful tool is the Data Journalism Handbook, an open source reference book for anyone interested in learning techniques in this emerging field. A free web version is available but you can also buy it as a book through O’Reilly Media. And if you happen to be Russian there’s a translation for you too. Russian mafia, beware! – the statisticians are coming with their pencils to break your knee-caps. Or maybe just throw their laptops at you.

My own entry into this field came by way of what was called CAR – Computer-Aided Reporting – which in the old days, a year or two ago, meant using software like Excel or an Access database to collect, collate and analyse information. I ran a few courses for the Institute for the Advancement of Journalism (IAJ) in Johannesburg and also mentored corporate communicators who needed to unpack and explain company data for write-ups in annual reports and news releases. I’ve rebuilt the CAR courses around data-driven journalism, outlined here.

The Poynter Institute in Florida, US, had pointed the way to using computers to gather and analyse the data necessary to write news stories. The search power of the world wide web became an essential part of the strategy. But Poynter also drew attention to some likely problems:

“…computers are changing the news-gathering process. Turning to online sources for information or using computers to analyze information has become almost as commonplace as dropping in on city hall. And, with the increased use of different newsgathering methodology, editors might face new ethical challenges, or at least, new twists on old problems,” wrote Bob Steele, director of the journalism ethics programmes at Poynter. What if, he speculated, a journalist was caught making up the data for a prize-winning story?

Yes, it can happen. It would bear out the adage that there are lies, damned lies, and statistics. The whole point of data journalism should be to produce credible stories from number-crunching and drive out the lies.
