Debugging the backlash to data journalism

March 26, 2014 by

While the craft and context that underlie “data journalism” are well known to anyone who knows the history of computer-assisted reporting (CAR), the term itself is a much more recent creation.

This past week, data journalism broke into the hurly burly of mainstream discourse, with the predictable cycle of hype and then backlash, for two reasons:

1) The launch of Nate Silver’s FiveThirtyEight, where he explicitly laid out his vision for data journalism in a manifesto on “what the fox knows.” He groups statistical analysis, data visualization, computer programming and “data-literate reporting” under the rubric of data journalism.

2) A story in USA Today on the booming market for data journalists and the scoops and audiences they create and enable. The “news nerd job” openings at both born-digital and traditional media institutions show clear demand across the industry.

There are several points that I think are worth making in light of these two stories.

First, if you’re new to this discussion, Mark Coddington has curated the best reviews, comments and critiques of FiveThirtyEight.com in his excellent weekly digest of the news at the Nieman Journalism Lab. The summary ranges from the quality of 538’s stories to criticism of Nate Silver’s detachment, or even of “data journalism” itself, and to questions about the notion of journalists venturing into empirical projects at all. If you want more context, start there.

Second, it’s great to see the topic of data journalism getting its moment in the sun, even if some of the reactions to Silver’s effort may mistake the man or his vision for the whole practice. Part of the backlash has something to do with high expectations for Silver’s effort. FiveThirtyEight is a new, experimental media venture in which a smart guy has been empowered to try to build something that can find signal in the noise (so to speak) for readers. I’m more than willing to give the site and its founder more time to find their feet.

Third, while FiveThirtyEight is new, as are various other startups or ventures within media companies, data journalism and its practice are not new, nor are critiques of its practices or of programming in journalism generally. There are powerful new digital tools and platforms. If we broaden the debate to include screeds asserting that journalists don’t have to know how to code, it’s much easier to find a backlash, along with apt responses about the importance of courses in journalism school or digital literacy, grounded in the importance of looking ahead to the future of digital media, not its ink-stained past.

Fourth, a critical backlash against computers, coding and databases in the media isn’t new. As readers of this blog certainly know, data journalism’s historic antecedent, computer-assisted reporting, has long since been recognized as an important journalistic discipline, as my colleague Susan McGregor highlighted last year in Columbia Journalism Review.

Critics have been assessing the credibility of CAR for years. If you take a longer view, database-driven journalism has been with us since journalists first started using mainframes, arriving in most newsrooms in a broad sense over two decades ago.

The idea of “computer-assisted reporting” now feels dated, though, inherited from a time when computers were still a novelty in newsrooms. There’s probably not a single reporter or editor working in a newsroom in the United States or Europe today who isn’t using a computer in the course of journalism.

Many members of the media may use several of them over the course of the day, from the powerful handheld computers we call smartphones to laptops and desktops, crunching away at analysis or transformations, or servers and cloud storage, for processing big data at Internet scale.

After investigating the subject for many months, I think it’s fair to say that powerful new tools and increased sophistication differentiate the CAR of decades ago from the way data journalism is being practiced today.

While I’ve loosely defined data journalism as “gathering, cleaning, organizing, analyzing, visualizing and publishing data to support the creation of acts of journalism,” a more succinct definition might be the “application of data science to journalism.”

Other observers might suggest that data journalism involves applying the scientific method or social science and statistical analysis to journalism. Philip Meyer called the latter “precision journalism” in the 1970s.

2014 was the year that I saw the worm really turn on the use of the term “data journalism,” from its adoption by David Kaplan, a pillar of the investigative journalism community, to its use as self-identification by dozens of attendees at the annual conference of the National Institute for Computer-Assisted Reporting (NICAR), where nearly a thousand journalists from 20 countries gathered in Baltimore to teach, learn and connect. Its younger attendees use titles like “data editor,” “data reporter” or “database reporter.”

The NICAR conference has grown by leaps and bounds since its first iteration, two decades ago, tripling in size in just the past four years. That rapid expansion is happening for good reason: the strong, clear market demand for data journalists at both born-digital and traditional media outlets that I mentioned earlier.

The size of NICAR 2014 may have given some long-time observers pause, in terms of the effect upon the vibrant community that has grown around it for years or the focus on tools.

“I’m a little worried that NICAR has gotten too big, like SXSW, and that it will lose its soul,” said Matt Waite, a professor of practice at the College of Journalism and Communications at the University of Nebraska, in an interview. “I don’t think it’s likely.”

Fifth, there is something important happening around the emergence of data journalism. I thought that the packed hallways and NICAR sessions accurately reflected what’s happening in the industry.

“Five years ago, this kind of thing was still seen in a lot of places at best as a curiosity, and at worst as something threatening or frivolous,” said Chase Davis, assistant editor for interactive news at the New York Times, in an interview.

“Some newsrooms got it, but most data journalists I knew still had to beg, borrow and steal for simple things like access to servers. Solid programming practices were unheard of. Version control? What’s that? If newsroom developers today saw Matt Waite’s code when he first launched PolitiFact, their faces would melt like Raiders of the Lost Ark.

“Now, our team at the Times runs dozens of servers. Being able to code is table stakes. Reporters are talking about machine frickin’ learning, and newsroom devs are inventing pieces of software that power huge chunks of the web.”

What’s happening today does have some genuinely interesting novelty to it, from the use of Amazon’s cloud to the maturation of various open source tools that have been funded by the Knight Foundation, like the Overview Project, DocumentCloud and the PANDA Project, or free or open source tools like Google Spreadsheets, Fusion Tables and OpenRefine.

These are still relatively new and powerful tools, which will both justify excitement about their applications and prompt understandable skepticism about what difference they will make if a majority of practicing journalists aren’t quite ready to use them yet.

One broader challenge that the adoption of “data journalism” in mainstream discourse has created is that the term may be divorced from the long history that came before it, as Los Angeles Times data editor Ben Welsh reminded this year’s NICAR conference in a brilliant lightning talk.

Whatever we call it, if you look around the globe, the growing importance of data journalism is now clear, given the explosion in data creation. Data and journalism have become deeply intertwined, with increased prominence.

To make sense of the data deluge, journalists today need to be more numerate, technically literate and logical. They need to be able to add context, fact-check sources, and weave in narrative, interrogating data just as a reporter would skeptically interview human sources for hidden influences and biases.

If you read Anthony DeBarros’ post on CAR and data journalism in 2010, you’d be connected to the past, but it’s fair to guess that most people who read Nate Silver’s magnum opus on FiveThirtyEight’s approach to data journalism had not. In 3500 words or so, Silver didn’t link to DeBarros, Philip Meyer, or a single organization that’s been practicing, researching or expanding data journalism in the past decade, perhaps the most fertile time for the practice in history.

Journalists have been gathering data and analyzing it for many decades, integrating it into their stories and broadcasts in tables, charts and graphics, like a box score that compares the on-base percentage for baseball players at a given position over time. Data is a critical component in substantiating various aspects of a story, as it’s woven into the way that the story was investigated and reported.

There have been reporters going to libraries, agencies, city halls and courts to find public records about nursing homes, taxes, and campaign finance spending for decades. The difference today is that in addition to digging through dusty file cabinets in court basements, they might be scraping a website, or pulling data from an API that New York Times news developer Derek Willis made, because he’s the sort of person who doesn’t want to have to repeat a process every time and will make data available to all, where possible.

Number-crunching enables Pulitzer Prize-winning stories like the one on speeding cops in Florida that Welsh referenced in his NICAR talk, or The Los Angeles Times’ analysis of ambulance response times. That investigation showed the public and the state something important, which was that the quality of the data used to analyze performance was poor because the fire stations weren’t logging it well.

The current criticism of data journalism is a tiny subset of a broader backlash against the hype around “big data,” a term which has grown in use in recent years, adopted all the way up to President Obama in the White House. Professional pundits and critics will always jump on opportunities to puncture hype. (They have families and data plans to support too, after all.)

I may even have inadvertently participated in creating hype around “data journalism” myself over the years, although I maintain that my interest and coverage have always been grounded in my belief that its importance has grown because of bigger macro trends in society. The number of sensors and mobile devices that are going to come online in the next couple of years is going to exponentially expand the amount of data available to interrogate. As predictive policing or “personalized redlining” become real, intrusive forces in the lives of Americans, data journalism will become a crucial democratic bulwark against the increased power of algorithms in society.

That puts a huge premium upon the media having the capacity to do this kind of work, and upon editors hiring people who can do it. They should: data journalism is creating both scoops and audiences. It’s also a fine reason to be focused on highlighting that demand and to celebrate the role that NICAR and data journalism MOOCs have in training an expanding tribe, along with the willingness of the people who have gone before to help others who want to learn.

I expect to see more mainstream pushback regarding data journalism from members of the media who are highly proficient at interviewing, writing and editing, but perhaps less so with other skills that are now part of the reporter’s modern toolkit, like video, social media, Web development or mobile reporting. Professional pundits who don’t ground their assertions in history or science may not fare quite as well, in this world. Researchers who blog, by contrast, will. As more sources for expert, data-driven analysis of law, science, medicine or technology go direct online, opinion journalists without deep subject matter expertise are going to have to recalibrate.

It’s possible that there could also be a (much smaller) backlash from long-time practitioners who see too much of a focus on the tools at NICAR.

“I’m concerned that it’s become too focused on data, and not enough on journalism,” said Waite. “There used to be much more on stories, with a focus on beats. People would talk about how they reported out stories, not technology. The number of panels about algorithm design are growing, and the number of story panels are shrinking. They’re not as well attended. That’s a reflection of the wishes of the attendees, but it troubles me.”

There may also be people who push back against the meaning of “data journalist” being diluted, though I doubt we’ll see much of it. People at the top of the profession have serious technical chops, which enable them to do much more than download a .csv file and make it into an infographic. These folks are proficient in Python, R and other programming languages, able to pursue scraping, cleaning and interrogation of huge data sets with complicated statistical analyses. At the edges of that gradient, there is computational journalism, although that is a specialty that doesn’t seem to exist outside of the academy.

Every one of the data journalists I’ve met over the years, however, cared a lot more about good code, clean data and beautiful design than the semantics of what to call them, or defending their professional turf.

Of the 997 NICAR attendees, how many were students, investigative reporters and editors who had shown up for the first time to learn these skills? If you told me a majority, I wouldn’t be surprised.

My sense was that in 2014, the unprecedented number of people who came had internalized the message that data journalism was important and that they needed to know how to do some of these things, or at least know what they are. They want to know what forking code on GitHub means, or at least what GitHub is and how people use it.

I don’t mean to knock the digital literacy of the NICAR attendees, as my sense was that it is higher than any other gathering of journalists in the world, but it’s easy for people to forget that there’s a significant portion of the public for whom these concepts are novel.

I think that’s true of the new media industry too, in which digital literacy and numeracy are perhaps not what they could be. There’s now more pressure on people in the industry to learn more, and for those who want to enter it to have more basic data skills. That’s driven some changes in the NICAR program.

“The temptation is that NICAR will become all about code-sharing,” said Waite. “That would lose the value-add, which is how the code relates to journalism. What’s different, versus programming or Web development?”

This reflects a common dividing line I’ve seen between people in the business world: the “suits” versus hoodies, jeans versus khakis, or MBAs versus developers. Today, the world of the “news hacker” is being democratized (a good thing), so there’s always going to be a little bit of discomfort around something that has stretched from a smaller, self-identifying tribe into something bigger.

I expect that the backlash within the NICAR community to its expanded ranks and role in the industry will be minimal, leaving people room to work, collaborate, learn and teach. We’d be better off focused on the journalism itself, from storytelling to rigorous fact checking, and a bit less focused upon the tools, however new and shiny some may be.

“I’m not overly pessimistic about NICAR — quite the opposite,” said Waite, “but this focus on the data part of data journalism and less on the journalism part of data journalism is a nagging worry in the back of my head.”

That’s not to say that the technology isn’t worth considering or covering, as I have for years. We have huge amounts of data going online today, more than we ever had before, and media have access to much more powerful personal machines and cloud computing to process it.

Even with the new tech, they’re still doing something old: practicing journalism! The approach may start to look a bit more scientific, over time. An editor might float an assertion or hypothesis about something new in the world, and then assign an investigative journalist to go find out whether it’s true or not. To do that, you need to go find data, evidence and knowledge about it. To prove to skeptical readers that the conclusion is sound, the data journalist may need to show his or her work, from the data sources to the process used to transform and present them.

It now feels cliched to say it in 2014, but in this context transparency may be the new objectivity. The latter concept is not one that has much traction in the scientific community, where observer effects and experimenter bias are well-known phenomena. Studies and results that can’t be reproduced are regarded with skepticism for a reason.

Such thinking about the scientific method and journalism isn’t new, nor is its practice by journalists around the country who have pioneered the craft of data journalism with much less fanfare than FiveThirtyEight.

“As we all know, there’s a lot of data out there,” said Ben Welsh, editor of the Los Angeles Times Data Desk, “and, as anyone who works with it knows, most of it is crap. The projects I’m most proud of have taken large, ugly datasets and refined them into something worth knowing: a nut graf in an investigative story or a data-driven app that gives the reader some new insight into the world around them.”

The graphic atop this post comes from that Data Desk. While you can’t see the work that created the image here, it’s online if you want to look for it: The Los Angeles Times released both the code and data behind the open source maps of California’s emergency medical agencies it published in the series.

Moreover, it wasn’t the first time. As Welsh wrote, the Data Desk has “previously written about the technical methods used to conduct [the] investigation, released the base layer created for an interactive map of response times and contributed the location of LAFD’s 106 fire station to the Open Street Map.”

This is what an open source newsroom that practices open data journalism looks like. It’s not just applying statistics and social science to polls and publishing data visualizations. If FiveThirtyEight, Vox, The New York Times’ Upshot or other outlets want to publish data journalism and build out the newsroom stack, that’s the high bar that’s been set. (Update: I was heartened to learn that FiveThirtyEight has a GitHub account.) In sharing not only its code but its data, the Los Angeles Times also set a notable example for the practice of open journalism in the 21st century.

I don’t know about you, but I think that’s a much more compelling vision for what data journalism is and how it has been, is being and could be applied in the 21st century than the fox’s tale.

Postscript: Good news: 538 is both listening and acting.
