There are 3,431,688 articles in the English Wikipedia on a dizzying range of topics. Surely there must be one on data-driven journalism, right? Well, the answer is no: the article on database journalism focuses on computer-assisted reporting; the digital journalism entry is an unreadable, unsourced and unstructured block of text on journalism originating from the Web; and the online journalism contribution deals with fact reporting produced and distributed on the Internet. None of them emphasizes the open source tools that allow the extraction, analysis and visualization of data, or how such applications have become increasingly popular with newspaper giants such as The Guardian or The New York Times, especially in light of the Afghan War Diary documents released by WikiLeaks.
Posting a concise, objective and hopefully complete contribution seemed like a daunting task. Data-driven journalism is an emerging field, one that has not been extensively researched in academic circles. In fact, the very first event dedicated exclusively to this journalistic process was organized by the European Journalism Centre and took place less than two months ago, on the 24th of August, in Amsterdam. But the premises of data-driven journalism have been around for years: open data, open knowledge, open source software and information visualization. Recently they have come together in a complex and painstaking workflow which aims to create new stories out of raw data.
Given the relative novelty of this reporting practice and its continuous development due to the ‘open’ factor, I deemed it appropriate to write an entry that points the direction but does not lead the way, that outlines the main features of data-driven journalism but does not go into controversial debates on its storytelling benefits or democratic potential. Such aspects have been left to the reader, who should, in the same spirit of openness, experiment with the many applications that have become widely available. What remains to be seen is whether data-driven journalism will become an established practice among mainstream media and alternative sources. As new developments come about – or as I come across older ones – I will update and expand the article’s sections.
So how was it to write a Wikipedia entry? More complicated than I thought. The markup language is a strange concoction that lacks intuitiveness and takes a while to get used to. For example, bolding text is not done with the classic ‘ctrl+B’ key combination or the corresponding HTML tags ‘<b></b>’. Instead, the word must be wrapped in three apostrophes on each side. Other obstacles looming ahead were immediate deletion or endless message exchanges with cross editors. Fearing that my post was too short and likely to get a failing grade for completeness, I pessimistically gave it about an hour of Wiki-life, considering the aforementioned, somewhat related entries gravitating around the same topic and the negative experiences some of my New Media Practices colleagues had. Four hours later, the entry is still standing and it seems like it is here to stay.
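For anyone tripping over the same bolding quirk, here is a minimal sketch in Python of what the conversion from HTML-style bold to wiki-style bold looks like. The function name `html_bold_to_wikitext` is my own invention for illustration, not something Wikipedia provides, and the sketch assumes simple, non-nested `<b>` tags without attributes.

```python
import re

# Hypothetical helper for illustration only; it is not part of MediaWiki.
# Assumes plain, non-nested <b>...</b> pairs without attributes.
_BOLD_TAG = re.compile(r"<b>(.*?)</b>", flags=re.IGNORECASE | re.DOTALL)


def html_bold_to_wikitext(text: str) -> str:
    """Replace each <b>...</b> pair with three apostrophes on each side."""
    return _BOLD_TAG.sub(lambda m: "'''" + m.group(1) + "'''", text)


if __name__ == "__main__":
    sample = "<b>Data-driven journalism</b> is a journalistic process."
    print(html_bold_to_wikitext(sample))
```

Text without any `<b>` tags passes through unchanged, so running the helper over a whole draft should be harmless under the assumptions above.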
This raises the question: what did I do right? In addition to making the entry as concise as possible, I stripped the topic down to its essentials and talked about definitions that have been agreed upon by experts, gave examples of tools used to select, format, examine and ultimately visualize data, and of course referred to the most prominent example of data-driven journalism. To these brief sections I added lists of related topics, mostly the conceptual foundations of this practice, data desks of major newspapers and organizations interested in promoting it. This is by no means a recipe for success; in spite of countless pages filled with guidelines, and given the collaborative nature of this platform, editor or user intervention seems inevitable. But constructive suggestions are always welcome and a lively discussion is what Wikipedia should be about. My message to fellow Wikipedians? Bring it on!