Profile of the Data Journalist: Dan Hill

Part of my research into data journalism’s past, present and future has been interviews with veteran practitioners like Aron Pilhofer, given the insight that those talks offer for debugging debates about “what it all means,” and younger journalists like Jeremy Bowers or Dan Hill. Their recent paths to the profession should offer insight and inspiration to others who would follow in their footsteps.



Hill was kind enough to discuss his work with me this spring. Our interview follows, lightly edited for clarity and content, and hyperlinked for context.

Where do you work now? What is a day in your life like?

I joined The Texas Tribune as a full-time news apps developer in January. Our team is responsible for both larger-scale “explorer” apps and what I’d call “daily interactives.” My day often involves writing and processing public information requests, designing interactives and working on Django apps, depending on the scale of my project.

How did you get started in data journalism? Did you get any special degrees or certificates? What quantitative skills did you start with?

I’ve always wanted to be a reporter, but the work of Phillip Reese at The Sacramento Bee and The Chicago Tribune’s news apps team inspired me to enhance my storytelling with data. I was a student fellow for the Northwestern University Knight Lab and studied journalism and computer science, but an internship with The Washington Post taught me how to apply what I was learning in a newsroom.

Did you have any mentors? Who? What were the most important resources they shared with you?

I’ve had awesome mentors. Bobby Calvan and Josh Freedom du Lac were the first to treat me like a real reporter. Jon Marshall helped me explore my interests. Phillip Reese showed me how to find untold stories in spreadsheets and Brian Boyer encouraged me to learn Python. Serdar Tumgoren and Jeremy Bowers showed me how a team of news developers operates. Travis Swicegood taught me how to deal with real-world data.
My mentors remind me to always be learning and asking questions.

What does your personal data journalism “stack” look like? What tools could you not live without?

I use Excel, OpenOffice, Google Docs, Django and iPython notebooks for data analysis. R is creeping into my workflow for exploring datasets and experimenting with visualizations. We use d3 and chartjs for web graphics and Mapbox for web maps. I could probably survive without Backbone, but we use it a lot.

What are the foundational skills that someone needs to practice data journalism?

I think a data journalist needs news judgment and attention to detail in order to identify the newsworthiness and limitations of datasets.
Statistics can help explain a dataset’s strengths and weaknesses, so I wish I had paid more attention during my stats classes in school.
In addition to finding the stories, data journalists also need to be able to explain why data is significant to their audience, so visual journalists need design skills — and, of course, reporting and writing.

Where do you turn to keep your skills updated or learn new things?

I check Source, the Northwestern Knight Lab blog and the NICAR listserv for new ideas. Lately, I’ve been teaching myself statistics and R with r-tutor and Machine Learning for Hackers.

What are the biggest challenges that newsrooms face in meeting the demand for people with these skills and experience? Are schools training people properly?

I think the differences between the developer and newsroom cultures make it hard for newsrooms to find people with tech and journalism skills, and to coordinate projects with developers and reporters.
As a student in journalism school, I was inspired to learn more about data when professor Darnell Little showed how it could enhance my reporting and help me find stories hidden in datasets.
I learned more developer-journalist skills like database management and web design from meetups, tutorials and classes outside the j-school, but the journalism school exposed me to what journalists with those skills could do.
I’ll add that I’m impressed with the data literacy of the Texas Tribune newsroom, where reporters request spreadsheets and use data to verify claims on their beats. Even if reporters don’t have the programming chops to make an interactive graphic, for example, they’re great about identifying potential data stories.

What data journalism project are you the most proud of working on or creating?

My summer intern project at The Washington Post, a study of every Washington D.C. homicide case between 2000 and 2011, was my first experience making a news app in a newsroom. I was honored to get to work with the investigative reporters as a newbie intern and learned a ton from building the database and doing analysis with Serdar. All of my contributions were on the backend, but I was thrilled to work with that dataset as an intern.

What data journalism project created by someone else do you most admire?

ProPublica’s Message Machine was my favorite project from the 2012 presidential election, because it took a unique approach to identify trends in email metadata.
I’m excited for more stories that collect everyday metadata or use sensors to explore the data around us.

How has the environment for doing this kind of work changed in the past five years?

I’d never heard of a “news apps team” five years ago. I knew I wanted to be an investigative reporter but never thought I would write code every day. I admired reporters like Phillip Reese who were working with data and making interactive graphics, but I didn’t see as many teams of specialized developer-journalists.

What’s different about practicing data journalism today, versus 10 years ago?

I wasn’t even a teenager 10 years ago, but I would gander… THE INTERNET. Online data portals, open government and open Web stuff are important to the data journalism I do. I’m not sure they were as common a decade ago.

Is data journalism the same thing as computer-assisted reporting or computational journalism? Why or why not?

I think of “data journalism” as an umbrella term that refers to the use of data in reporting or presentation, whereas I think of CAR and computational journalism as subsets of data journalism that involve analyzing a dataset.

Why are data journalism and “news apps” important, in the context of the contemporary digital environment for information?

I’m excited to work with data because of its widespread use in decision making. I think news apps can help people understand meaningful data and uphold accountability for people who create and make decisions with data.
Be A Newsnerd has better answers.

What’s the one thing people always get wrong when they talk about data journalism?

Although the web plays a big role in the growth of data journalism, I don’t think you need to be online to do data journalism.