Research

American Newspapers and the Built Environment


The shrinkage of traditional news – in staff, circulation, ad revenue, profit – has been fairly well documented. But a different kind of transformation is slowly taking place as news companies get out of the real-estate business: the abandoning of once-grand, overflowing newsrooms that are at once too big and at the same time remarkably poorly suited for work in the digital age. All across America, from Rochester, NY, to Detroit to Dayton to Los Angeles, the news buildings themselves are in varying stages of being bought, sold and repurposed. This project, which I’ve undertaken through Tow, focuses on the relationship between physical space, digital space and digital change in a post-industrial economy. But to assess what’s going on, we need to step back and offer a bit of a historical overview.

The newspaper as an institution has played a significant role in shaping the culture, structure, politics, and urban design of cities – the newspaper has been more than just a building; it has been a civic engineer, as NYU professor Aurora Wallace argues in her book, Newspapers and the Making of Modern America: A History (2005). Tied up tightly in the history of news in America is a story of deliberate and strategic nation-building, city-building, and corruption-baiting and busting. In particular, according to Wallace, newspapers have played a key role in physically creating cities.

Initially, my starting point for this project was to journey to news organizations that quite literally had left their newsrooms for a new, post-industrial setting. These newsrooms had said goodbye to their physical history, leaving behind giant block-long buildings meant for composing rooms, staffs three times the size of their present newspapers, presses, and, in some cases, city centers. Digital workflow rather than physical, print workflow was the new focus of these newsrooms; the industrial legacy product, it seemed, would take second place. I wanted to assess whether newsrooms that had left their newsrooms were indeed moving into a truly post-industrial world, where news hubs and mobile journalism would replace a fetishization of strictly routinized print production. The newsrooms that had left their buildings seemed the most promising starting point for this journey: to see what happened when journalists had a chance for a do-over to meet today’s needs.

Wallace’s project takes a different approach and examines the larger urban environment as it has been constructed by newspapers. In the 1920s and ’30s, nearly all big newspapers had big publishers who in turn became entangled in politics. William Randolph Hearst ran for president. Being publisher or editor of a newspaper often also meant sitting on development committees, sewer commissions, public works commissions, education panels, and the like. But in specific instances, particularly Des Moines, Miami, Los Angeles, and Long Island, the city newspapers, their founders, and their publishers helped shape the built environment not just of a single district but of an entire city and state. Notably, Des Moines and Miami were two key sites of my research.

In Des Moines, The Des Moines Register’s Cowles family undertook a particular strategy that helped make Iowa the politically significant state that it is today. The publishing family reasoned that the only way to keep advertising dollars growing was to keep circulation growing, and that meant making The Des Moines Register move beyond the city center and into 56,000 square miles of farm state. To keep the newspaper relevant, the Cowles family wanted to figure out ways to get people news by the next day. So the family became particularly invested in supporting the creation of roads and trains. The goal was to have a newspaper distribution system in place for all 99 counties, and today the roads and trains map onto the legacy of the Cowles family.

The place of Iowa in the political psyche is also due, in part, to the newspaper. Its consistently pro-farmer editorial position helped distinguish the Iowa point of view as “singular and important” (p. 36). The newspaper began using Gallup polls, thanks to George Gallup, a doctoral student at the University of Iowa, to measure public opinion as well as to determine what kinds of stories would have the most appeal. Wallace argues that the constant printing of opinion polls showed that the state had a coherent point of view, and offered insight into the mood of people on political, economic and social issues. And because the Cowles family believed, oddly, that no competition meant better journalism (and thus bought all the major newspapers in Des Moines), they felt free to cover news in depth from a statewide perspective. In this way, the newspaper was able to help establish Iowa, and its caucus, as singularly significant in offering the pulse of the American electorate.

The post-war era, though, probably underscores the most obvious ways in which newspapers played a significant role in the development of their cities and regions. My favorite example from Wallace’s book is The Los Angeles Times. The Chandler family, The LA Times’ owners, had a diversified portfolio of holdings, from aerospace to oil to rubber to automotive manufacturing to technology. The fact that Southern California has the roads, ports, and harbors that it does is due, in part, to the paper’s coverage favoring Chandler positions and to the Chandlers’ influence in city circles. The sprawl we see in Los Angeles is the deliberate result of the Chandlers’ belief that roads were the best way to move goods across land, a belief that, of course, played to both the rise of car culture in the U.S. and the automotive interests of the Chandler family.

The general position of the paper was to constantly promote Southern California (dubbing it the “Southland”) as a place of perpetually wonderful weather, cheap land and good agriculture. The existence of the San Fernando Valley is due to the Chandlers: sixty thousand acres of land owned by a Chandler syndicate were repurposed to give soldiers coming home from the war a slice of the American dream – their own home with easy access to the industrial plants where they could work. What we know as Los Angeles was quite literally geographically defined by where Chandler interests started and ended. Through relentless boosterism (earthquakes were reported as occurring not in Los Angeles but in San Francisco), the population boomed.

The Miami Herald and Newsday also played key roles in post-war America. Newsday figured out that it could fix its circulation problem by creating a circulation for itself, and its relentless promotion of Levittown and suburban living helped create the Long Island we know today. The Miami Herald was guilty of what Florida continues to be guilty of today: relentless land speculation. Cheap land and real estate were heavily promoted to northerners, and to the many GIs who had been trained in Florida for World War I. In fact, a 1925 edition of the then Miami Daily News ran 504 pages, mostly real estate ads – a record for the largest newspaper ever printed (Wallace, p. 101).

Like The LA Times, The Miami Herald ignored hurricanes in its coverage, arguing that the worst thing Floridians could do for recovery would be to talk about the hurricanes. That also meant the newspaper did little to appeal for aid in rebuilding, for fear of tarnishing the city’s image. Despite Miami’s role as the “Chicago of the South,” where rules on liquor, prostitution, vice and the like were ignored by the beach community, the newspaper never touched the potentially sordid details. This, Miami Herald editors reasoned, was why tourists were coming, and the newspaper did not want to hamper easy access to all of it through dedicated work on corruption. Just cheery news.

The Miami Herald took on a more direct role in developing the built environment when the Knight brothers bought the paper in the depths of the Depression and began remaking the city. They focused on better local utilities, better street paving (for celebrity automobiles), cleaner beaches, and downtown redevelopment – which culminated in the building along Biscayne Bay that The Miami Herald abandoned in 2013 for a building closer to the airport.

I find this history useful: it is a reminder of the power of the press, and of the people behind the press, to create and shape the cities of old. One wonders to what extent this power remains, particularly for metropolitan dailies with diminishing circulation and, potentially, stature. My focus has not been on the newspaper as the voice of the community, but grounding the project in how newspapers have shaped an external, urban (and suburban) environment gives us some sense of how they, in turn, might be shaped by some of these same forces.

Research

What if news orgs took a page out of Acxiom’s book?


The data-mining company Acxiom made a big splash last week with the launch of AboutTheData.com, which allows consumers to view some of the data the company has collected about them.

I gave the site a shot and wasn’t too impressed or terrified (these emotions so often go together for me when reading about big data that we may need to invent a new term that combines them: impressified?). Acxiom thinks I am French (my family is Italian and Welsh), enjoy camping (I’m a born-and-raised city kid whose idea of camping is sleeping on a pullout couch) and, most bizarrely, that I have a child (nope). The site also had no data on the characteristics of my home and didn’t know whether or not I owned a vehicle.

So far, though, AboutTheData has come under fire not for inaccuracies in the data but for lies of omission: some privacy advocates have charged that it deliberately leaves out some of the creepier things Acxiom knows about us, such as whether your household has a “diabetic focus” and whether you are a “potential inheritor.” Some have argued the company is only taking this step in anticipation of coming regulation against data brokers.

Still, whatever the motive, the move towards giving consumers more access to and control over their data is noteworthy. And it got me thinking: what if news organizations did the same thing? After all, nearly all news sites are tracking our clicks and shares, and many are also keeping track of our referral sites, the amount of time spent on each page, and how far down we scroll. But we, the readers, never get to see this information. Sure, we get hints here and there, usually in the form of ads and personalized article recommendations. But the process behind such recommendations is so opaque, and the suggested articles so often odd, that it’s easy to write them off as the result of algorithmic quirks rather than a reflection of our actual reading habits.

But what if news organizations started giving us unfettered access to the information they collect about us – providing, in effect, a FitBit for news? How would this shape the experience of news-reading? Would it change the content we consume, share, and engage with? Sociologists of quantification have pointed out that public measurements of human beings are reactive: they end up shaping the very behaviors they are trying to capture. For example, in their study of the impact of U.S. News and World Report rankings on law schools, Wendy Espeland and Michael Sauder found that, even as many of them disagreed with the ranking system, administrators and faculty were dramatically changing school practices to achieve a higher rank: spending far more on marketing, privileging LSAT scores over other admissions criteria, and shifting financial aid resources from need-based scholarships to merit-based ones.

While such reactivity to data proved detrimental to legal education, it might actually be a good thing for news. There is, of course, the basic transparency argument: if companies are going to collect data about us and make money from it, we should have the right to see it. But there’s another reason too. Journalists often complain that metrics create incentives to produce fluffy, frivolous content, because that’s what audiences like to click on (see: the Onion’s skewering of CNN.com for making Miley Cyrus’s VMA performance, not Syria, its top story). This may not be totally fair: one could argue – and some have – that if “serious” news were better, with less he-said-she-said reporting and less deference to powerful actors, we’d read more of it. But who among us hasn’t clicked on a story (or ten) that we’re not entirely proud of, while studiously ignoring that long, thoughtful article about gerrymandering or the latest data on carbon emissions? Maybe having to (or at least having the chance to, via an opt-in system) come face to face with our baser news-consuming instincts would be just the push we need to bypass that clickbait-y piece decrying flip-flops as an aesthetic and hygienic abomination, and finally dive into the latest investigative reporting on the NSA.

Research

NPR news app team experiments with making data-driven public media with the public


News applications, or ‘apps’ as they are commonly known, rely on data. They increasingly give mobile users better ways to understand the world they’re moving through, from general topics like news, weather and traffic down to Little League baseball scores. In the past, journalists seeking the raw data behind such applications requested it and gathered it from public sources, in libraries or courthouse basements.

Today, journalists have many more options. Newsroom reporters and developers can download, scrape and digitize data from a wealth of sources. In the future, journalists will create it themselves and look to their distributed audiences of readers, listeners and watchers to help gather it for them.

In the most forward-thinking newsrooms, this is already happening. Earlier this year, WNYC asked its listeners to help it track the emergence of cicadas with inexpensive sensors. This week, the NPR news applications team released a project around accessible playgrounds. The NPR team made a request of its community of listeners and readers: help public media collect the data that drives it and make the resource better for everyone. The NPR Web app enables parents and children to search for accessible playgrounds; it takes a model commonly used for consumer recommendation engines and adds a strong public service element.

“This is sort of like Yelp, except for playgrounds for kids with special needs,” said Brian Boyer, the head of NPR’s news applications team, in an interview. “It is the first of its kind, nation-wide database of playgrounds that are well suited to kids in wheelchairs, kids with autism, or kids with other special needs.”

[Screenshot of the NPR accessible playgrounds app]

As Robert Benincasa reported for NPR, changes to federal requirements now define playground accessibility as a civil right, which has resulted in more places to play for kids with special needs. Every playground built or altered after March 14, 2013 has to be wheelchair-friendly and support children with physical challenges. The challenge that their parents face, however, lies in knowing where those playgrounds are located in the neighborhoods and cityscapes around them.

Previously, while parents could turn to informal networks of friends, schools, advocacy organizations or government websites for information about accessible playgrounds, finding a place to play wasn’t easy. That’s where NPR’s investigative reporters and news app team saw an opportunity, working together over the course of two months to make it real.

“Robert, the reporter on this, was looking into the issue of playgrounds and ADA [Americans with Disabilities Act] compliance nationwide,” said Boyer.
“He had this chunk of data that he’d been gathering, came to us and said, ‘Hey, I’ve got this list that we’ve been cleaning up. It’s certainly not everything, but it’s something, and it could be useful to folks.’ We said that sounds incredible. He asked us to build a website. There are people in the news who would have said ‘This isn’t complete, we can’t publish it.’ Much to the credit of our editors and the investigative team, they said that’s OK. Let’s build something instead that creates the whole dataset, that is both a guide and a way to create a much better guide.”

Given the range of news bureaus around the United States, NPR is perhaps better placed to engage its audience in extending its reporting than any other national news organization. Given a mission and vision focused upon “creating a more informed public,” directly involving the public in creating data for the public is a natural progression for an organization that was virtually defined by radio decades ago, as it goes digital in the 21st century.

This data is “horribly incomplete,” said Boyer. “We know that, and we’re OK with that, because we’re asking our audience, and hopefully everybody, to contribute. We’re hoping they’ll walk down the street, look at their playground, and see if it’s got some of these features that we’re looking for — and if it does, we’re hoping that people will edit the playground or add the playground to the database to help everybody else.”

If NPR is able to activate its audience to become active participants in data collection, much in the same way that Audubon’s Christmas Bird Count and eBird are crowdsourcing data collection on bird species, it will have created both a notable case study in the power of public engagement and an important database of public data that government itself can consult. There are decades of precedent for a listening or viewing audience collaborating with a media organization in collecting images, videos or stories. What remains relatively new in 2013 is the capacity for a networked populace to contribute data, whether it comes from sensors during droughts or Geiger counters near potential sources of radiation. If turning data into stories is now a core element of investigative journalism, NPR’s news applications team is showing how to do it better and serve the public in the process.

Interview with Brian Boyer:

So what does the playground app do?

Boyer: You can search for your town and see what playgrounds are near you. If you know about a playground nearby that isn’t listed, there’s a very simple button, “Add a Playground,” and it takes just a moment. You can use our little map to pinpoint where it is and tell us about the playground.

There are basically no required fields. There’s a lot of data we could be gathering but people don’t often even know the name of their playground, right down the street, and they certainly don’t know what organization built it. The mapping element is built so that you don’t need to know the street address, because who knows the street address of a playground?

The mapping is built sort of like how you request a car with Uber, where you drag a little dot around and place it. We tried to make the process of adding extremely low overhead.
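What Boyer describes implies a deliberately permissive data model: a pinned location is the only essential field, and everything else can be filled in later. A minimal sketch of such a record in Python (the field names are illustrative guesses, not NPR’s actual schema):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Playground:
    # The map pin supplies coordinates, so no street address is required.
    latitude: float
    longitude: float
    # Everything else is optional; submitters often don't know these.
    name: Optional[str] = None
    agency: Optional[str] = None  # organization that built or runs it
    features: list[str] = field(default_factory=list)  # e.g. "ramp", "smooth surface"
```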

How did your team make an intuitive mobile user interface for this data?

Boyer: We take a user-centric approach to design. We try to put ourselves in our audience’s shoes. We would ask folks, “Do you know the address of your playground?” and they said, “no.” There isn’t address information on playground signage, right? It quickly became obvious that we would need to have a non-address oriented location thing.

We’re all avid app users. Someone brought up the Uber idea, and it was perfect. What’s technically interesting about about how this website works is that it is [lightweight] . One thing we like to do is build applications without servers. This whole thing is deployed to [Amazon] S3. It’s all deployed as flat files. When people contribute and add an edit, once a night, our servers pick up a lot of updates and then apply them, and then the server goes back to sleep. It’s extremely scalable. There’s no way that this website will ever go down.

What did it take to build this, in terms of resources and time?

Boyer: It’s the largest project that we’ve done as a team in the last year, with the exception of the election last November. For two months, it was pretty much the whole team heads down, with two people you’d most characterize as software developers, two people that you’d most characterize as designers — although our designers are coders as well — and then Matt Stiles, our resident reporter, working on gathering data, cleaning up data, contacting our sources. He worked with a reporter on the investigative team, who’d initially gathered the seed data.

Like all of our projects, this was an iterative process. We worked in one-week cycles, showing our work every week to the investigative team, constantly changing it and adjusting the language, and then doing what I think Joel Spolsky called “hallway usability testing.” We’d grab people by the arm as they walked by and ask them to try it out. There are a lot of people who work at NPR headquarters who have kids. We found a lot of potential users just in the building with whom we were able to test the messages and the interactions.

On the technical side, it uses our app template, which is a baseline that we’ve built for all of our projects. It makes the first 90% of a project really easy, so that we can focus on the top 10% and the user experience, rather than worrying about the servers and deployment.

If you visit GitHub/NPRApps, can you see the code and adopt it?

Boyer: Yes and no. We’ve been going back and forth with our lawyers about open sourcing our work. Everybody is really into it, but the problem is that there is some stuff in this application that can’t be open sourced. The words we write, NPR is not interested in giving away. The photographs in the app, we can’t open source. It creates a very complicated problem where we would need to completely decouple what you might call the “content” from the code — even though, in my mind, code is content — and decoupling them is such an onerous task that it would add a month of work. We’ve been struggling with this as a team for the last year, trying to figure out what, exactly, is the right thing to do.

So, if you look on our GitHub account, you will find the source code there, and there’s going to be a copyright statement on it that says “Copyright NPR 2013, All Rights Reserved,” etc., and then a note that says “if you’d like to use this, just shoot us an email.” That’s not my ideal scenario, but it’s the best thing we can do. If we receive that email, what we’ll do is tell them, “Sounds good: grab the code, fork it. You can’t use our name, you can’t use the words, you can’t use the pictures, but we’d be delighted if you use the platform — and we’ll give you permission.” It’s messy. It’s not a library, decoupled from the work that we do.

At the same time, I think it’s important to have the code out there, so that people can see what we’re doing, copy what we’re doing, and borrow ideas. A big part of this is about educating our member stations and folks within our community, as well as new journalism students and other interested parties.

Our work is copyrighted, but it does have a very good README that explains in great detail exactly how to set this thing up and get it running on your machine. What we hope is that people will grab our stuff and learn from it. If someone wants to create an accessible playgrounds app for Britain, just shoot us an email and we’ll tell you how.

So what can be adapted and adopted here?

Boyer: We feel that the most reusable asset we’ve created here is the dataset, which we’ve gone to great lengths to give away and make usable. If you were gonna take this code and build this thing for Britain, it would be really different. Street addresses are just different in the United Kingdom.

To make a project that is generic enough to be readily used anywhere means you’ve built Drupal. We’re not trying to do that — we’re trying to build tight, focused code that only does exactly what we need it to do, both for our audience’s benefit and because we have deadlines. We build the code for all of our projects. We understand that the code isn’t something that you can just pick up and reuse. You have to make something different for a different story or audience.

NPR’s apps team is taking a daring approach to data collection: they’re trusting their audience. Is that risky?

Boyer: We’re extremely optimistic about edits. We’ve found, when we’ve done projects where we ask our audience to contribute, that no one’s rude. People are generally pretty nice. By default, the site accepts all edits.

What we’ve done is create a daily newsletter that goes out to the editors here who care about the site. They can eyeball the edits to make sure they look reasonable, as opposed to a process of editorial approval. How do you approve a playground edit for a playground in South Dakota you’ve never been to? The approval doesn’t matter: you don’t know. We can’t know. We’re trusting our audience to be friendly and responsible and helpful. In other efforts, in projects like this, it turns out that people are totally cool about that.
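A sketch of what such a daily edits digest might look like, assuming contributions are queued as JSON with a date stamp (the recipients, field names, and mail setup here are hypothetical, not NPR’s):

```python
import json
import smtplib
from datetime import date, timedelta
from email.message import EmailMessage

EDITORS = ["editors@example.org"]  # hypothetical recipient list

def send_daily_digest(pending_path="pending-edits.json"):
    yesterday = (date.today() - timedelta(days=1)).isoformat()
    with open(pending_path) as f:
        edits = [e for e in json.load(f) if e.get("date") == yesterday]

    # Summarize each edit so editors can eyeball it for reasonableness;
    # nothing blocks publication — this is review after the fact.
    lines = [f'{e.get("playground", "?")}: {e.get("summary", "?")}' for e in edits]

    msg = EmailMessage()
    msg["Subject"] = f"{len(edits)} playground edits on {yesterday}"
    msg["From"] = "apps@example.org"
    msg["To"] = ", ".join(EDITORS)
    msg.set_content("\n".join(lines) or "No edits yesterday.")
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```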

Still, letting edits go live without approval seems a bit daring. Are NPR listeners a different slice of the online public?

Boyer: Maybe. Maybe it’s just the questions we’re asking. We did a project about the inauguration, where we asked folks to take a picture of themselves with a sign with a message to the president. We decided to be optimistic about it and said, “OK, we’re going to publish these signs the second someone submits them,” as opposed to applying an editorial filter. With maybe two exceptions, even people who had negative messages were still very thoughtful about it. I do think that’s probably a product of the NPR audience, but also the NPR brand. We had a lot of people who aren’t necessarily NPR listeners contributing to the website.

I don’t think people are going to be jerks. It’s for kids with special needs. We don’t need a “jerk filter.” We’re following the Shirky model: build a system where it’s harder to vandalize than it is to clean up, which is what Shirky says about Wikipedia. The idea is that it’s easier to tidy it up than to screw it up, which we’ve found to be a pretty good guiding principle.

How did you find all the data?

Boyer: The data was gathered from a handful of different sources, including Mara Kaplan, who gave us the seed. She’s an activist who’s been gathering playgrounds that are accessible and runs a website that has some of this data.

We contacted a big mailing list of parks administrators, and the people who run it let us send a pitch to everybody asking them to help us out. New York City has a really great parks website, so I think we downloaded that data instead of scraping it. Some of it came piecemeal, some we gathered, some of it was us calling folks. We gathered data from other states and municipalities that had made it available online. And then folks from different parks districts all around the country contributed data.

We didn’t have any FOIAs, mostly because the turnaround time would have been too long. We’re hoping that folks will say, hey, that’s a good idea: we should just give you this stuff without making you jump through the FOIA hoop. Immediately after we launch, we’re going to try to contact all of the people we tried to contact before and say that we built this thing, here it is, do you want to contribute to us now?

We’re hoping that we’re going to see, immediately after we launch, that we double or triple the number of playgrounds in the app very quickly.* People, when they see the example, when they see it done, they’ll get it. There are people that probably didn’t want to give us data because they thought we were going to do a takedown piece — and this isn’t a takedown piece. I can imagine municipalities, seeing a journalism organization calling and asking for this kind of information without a FOIA in hand, thinking that they didn’t have to respond. I think we’ll see a lot of new data from people contributing and a lot more from agencies who see what we’ve done and think it’s cool.

*Postscript: In the first 48 hours after the app was launched, data for 336 more playgrounds was added to the database, for a total of 1,293 to date.

Where does the data live?

Boyer: It’s on the home page of the app. Just scroll down near the bottom and look for the links. Download as .csv or JSON — they’re our two favorite formats. The JSON file includes some summary information, just some basic stats about how many playgrounds are in there and that sort of thing. It’s just downloadable files; there’s nothing fancy about it. There’s no API. It didn’t seem necessary. We didn’t want to build something that people were going to build a live application on; that seemed heavyweight and low value. The data has latitude-longitude and geodata. All of the information that is in the app is in the data, from the agency that owns the playground to the features it has.
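Working with the download is then ordinary file handling. A quick sketch in Python (the URL and column names are placeholders; the real links live on the app’s home page):

```python
import csv
import urllib.request

# Hypothetical download URL; the real CSV/JSON links are on the app's home page.
CSV_URL = "https://example.org/playgrounds.csv"

with urllib.request.urlopen(CSV_URL) as resp:
    rows = list(csv.DictReader(resp.read().decode("utf-8").splitlines()))

# Every field shown in the app is in the data, including coordinates and
# the owning agency, so simple filters need no API:
with_ramps = [r for r in rows if "ramp" in r.get("features", "").lower()]
print(len(rows), "playgrounds,", len(with_ramps), "with ramps listed")
```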

One feature that we haven’t built yet — that I hope we’ll be adding when we update it later — is a way to search by feature. It seems sort of obvious, but let’s say you’ve got a child with a certain special need, you’d want to not just search for all playgrounds but search for playgrounds with a specific feature. The reason we didn’t build that in is because we don’t have that in the data yet. We are relying on our audience to go and tag these playgrounds. The playgrounds are often in there but they don’t have that sort of coding. Hopefully, as people annotate the website, we’ll have the data to support the feature and then we’ll build it.

Can you talk more about how, as a public media organization, you’re creating a public database with the public?

Boyer: We’ve asked all of these different groups to contribute data. We’re going to continue asking them to contribute data after we launch. We’re also giving all the data away. We’ve written the screen scrapers, we’ve made the phone calls, and we’ve requested the data — we’d be jerks if you had to scrape our website to do something different with this data.

Part of the site offers a complete data download, so you can use this information in a different way or build your website out of it. Giving the data away and having the code be public is important. There are reasons you want to show your work and be transparent about your methods.

Our new catchphrase at NPR is that it’s important to “work in public.” We are a public resource. We are public media. We are of the people. Being proprietary about things? That’s not public. This whole public media thing is a collaboration with our audience. This project should be just as collaborative.

Research

The Rise of The Single Subject Platform


A Research Project by News Deeply &

The Tow Center for Digital Journalism at Columbia University

By Lara Setrakian and Kristin Nolan 

News has never been more readily available, to users and to the journalists who serve them. Any reporter with an Internet connection can set up a web-based media outlet, covering a unique beat. In our digital age, it is the equivalent of becoming an overnight publisher of a niche news magazine.

The democratization of access – a decentralized ownership of the means of production – has led to the proliferation of single-subject websites, where one topic is covered with intensity and focus. From tracking how your favorite NHL hockey team is going to deal with the new salary cap (Capgeek.com) to covering government transparency in addressing violent crime (Homicidewatch.com), single-subject websites are serving the passionate, niche news consumers looking for more information than mainstream outlets can provide.

This trend has expanded quickly, with niche news sites gaining both visibility and credibility; this year, a niche publication called InsideClimate News won the Pulitzer Prize. The trend warrants further investigation as it spreads. Are single-topic niche news sites developing a new model for modern publishing and setting new journalistic norms?


Niche-News Publishing: A Growing Market

The past five years have seen the emergence of single-subject online news outlets, from Health Map to Tehran Bureau to Homicide Watch. These single-subject sites claim to respond to a consumer desire for in-depth coverage of a particular topic; in each case, there was a niche audience that felt underserved by the mainstream press, which, by its estimation, had failed to provide consistent and comprehensive coverage of a certain topic. In response, entrepreneurial journalists stepped in to fill this gap, addressing that niche audience through in-depth, subject-specific coverage.

In the case of domestic issues, the emergence of single-subject outlets is a journalistic response to an overall shrinking of investigative news. As one example, Homicide Watch in Washington, D.C., and Chicago, emerged from a lack of follow-through on the part of local news outlets and a perceived need for a richer information environment around crime reporting. The impact of its work appears to be substantial: it has yielded more transparency into homicide cases across DC, advancing the justice process and public safety for the District.

In the case of foreign issues, these single-subject sites have emerged as a response to a decline in international news coverage in the mainstream press. This came as a result of commercial pressures within the mainstream media environment, forcing the closure of foreign news bureaus abroad. According to the American Journalism Review, over 20 papers and companies have cut their foreign bureaus entirely since its first media census in 1998.[1] Furthermore, steep cuts have been seen since the last census in 2003, from 307 full-time foreign correspondents to the most recent measure of 234 full-time correspondents in 2011. While the use of freelance journalists may have partially filled the gap, this still represents a dramatic reduction in the supply of international news offered through traditional outlets. A 2010 analysis of eight daily US newspapers found that the number of stories dedicated to foreign news had been slashed by more than half, from 689 stories to 321, while the percentage of staff dedicated to covering foreign news at these papers declined from 14% to 4% over the same period.[2]

The significance of this trend in foreign news is representative of a larger dilemma for the media industry. In a world where information is expected to be free, how do we produce a sustainable business model that can generate revenue without making the audience pay for content? At the user level, how do we better serve the audience for niche news stories? At a macro level, how do we ensure that cost pressures on mainstream news outlets don’t leave vital stories underreported – absent from the knowledge pool? In an increasingly complex and interconnected world, a failure to provide consistent, in-depth reporting will have lasting effects on the strength of democracy.

For the pool of emerging niche publishers, the stakes and the potential benefits are high: if they succeed, single-subject websites could dramatically raise the supply of high-quality journalism covering complex and chronic issues that go underreported in modern media. Niche publishers can take advantage of the digital tools of the day to create a lean operations model; to borrow a phrase from Eric Ries’s book, it is “Lean Startup” philosophy applied to the media world.

Moreover, for the beat reporter, it represents an unprecedented opportunity to serve a hyper-focused audience, capturing the market and building a community among return users. This can enhance the quality of reporting, as user feedback and potential sources become part of a steady flow of information.


Exploring the Single-Subject Website

With the support of the Tow Center for Digital Journalism at Columbia University and the Knight Foundation, Executive Editor Lara Setrakian and Lead Researcher Kristin Nolan will explore this general trend in the digital world. Emphasis will be placed on studying how single-subject websites have responded to the lack of in-depth coverage of international issues and crises.

The discussion and study will cover a variety of online archetypes to capture the single-subject trend while it is still in its infancy. In summary, we hope to jumpstart dialogue, create a community of like-minded leaders, and build a roadmap for others wishing to grow their own single-subject websites.

This project will involve three phases: A) The Exploratory Phase (July-November 2013); B) The Conference (November 8-10 2013); and C) The Research and Discovery Phase (November 2013-April 2014).

During the Exploratory Phase, we will conduct a baseline assessment of the realm of single-subject websites, forming an easily digestible map of the industry based upon topic and area of focus. This phase will also include working with our partners to collect quantitative and qualitative data on single-subject websites. This phase will culminate in the release of a White Paper at the conference in November.

The Conference will provide an excellent opportunity for interested parties to network, gather data, workshop their websites, and learn more about this model on an in-depth level. The primary goal of this conference is to analyze, as a group, the areas of success as well as the room for growth within this model. The outcome of this conference will be an established network of single-subject publishers, called the Single-Subject News Network, whose members can set the standard for future participants.

The final phase of this project, the Research and Discovery Phase, will be dedicated to collecting all data from the initial two phases and producing a final Research Report, a Handbook for Journalists, and a Teaching Kit for Journalism Schools. Also during this time, the Single-Subject News Network will expand its reach, attempting to produce higher-quality information on its niche news network.


Defining a Single-Subject News Site

The world of the single-subject platform is a vast one that covers all of the topics one would find in the traditional newspaper, broken up and divided out across single-subject platforms to fill the gaps these entrepreneurs see in the news space. As a result, one can find single-subject platforms in all the veins of the traditional newspaper: domestic news, international news, and what we call the “features” categories, including science/technology, arts, sports, the gossip column, home and garden, and even the op-ed section. The phenomenon is easy to spot, but the websites that have emerged in the domestic and international reporting space require an element of in-depth investigation that has replaced the beat of the former domestic or foreign bureau member. This makes the space they occupy much different from that of a features-oriented website, and it is these reporting-driven sites that this study will focus on. However, we recognize the significance and substance of other types of sites, and within this study we will explore and map an emblematic few from each features category (excluding op-ed, which is not a reporting- or fact-based section), drawing upon the lessons learned from these sections, including how they monetize their features.

Because this study will explore single-subject websites that are aimed at delivering substantive journalism, based on research and fact-based reporting, we have set parameters for inclusion and exclusion. A single-subject website is one that (1) addresses one topic that is (2) sufficiently narrow in scope and (3) fact-driven, and that (4) started online and (5) is independently funded by private actors (i.e., non-governmental entities); in addition, (6) city and local newspapers are excluded, and (7) the study is English-language focused due to time and language constraints.

(1)  A website must address one topic in depth, delving into a single story or a single angle within a broader story; it brings a narrow focus to a topic perceived by the founder to be underreported and underserved within mainstream sources.

(2)  The topic addressed must be sufficiently narrow in scope. For example, a website that deals with “US News” as a topic is not a single-subject website, but one that deals with “US Healthcare” in depth is, because it addresses a niche audience within a larger topic, filling a gap in available information about that issue. Often, the test of a niche news site will come in the website’s audience; the target audience, or user base, will reflect the relatively narrow focus of the reporting.

(3)  The website must feature a clear emphasis on fact-based reporting versus opinion. For instance, some blogs may be similar to a single-subject website in topical focus; however, if they are overwhelmingly based on opinion rather than reporting the facts, we would not consider them eligible for this study. Similarly, “conspiracy theory” websites aimed at promoting or debunking a particular point of view would not be considered in this study.

(4)  The organization must have its origins online, rather than having been converted from a prior publication or existing news media outlet. The rationale behind this qualifier is to highlight the rising trend of entrepreneurial websites designed to fill a gap, not pre-existing news providers that transitioned to the online market.

(5)  The website must be funded by private, non-governmental actors.

(6)  City newspapers, while niche in focus, are not eligible for this study. While many city newspapers conduct in-depth investigative research on issues of local importance, this coverage tends not to go to the national level, and therefore does not fill a gap in mainstream media, although there are clear exceptions where a local story has gone national.

(7)  While there are many concrete examples of non-English sources that do in-depth investigative journalism, the language constraints of the research team mean these sources cannot be studied at this time.


Methodology

We will conduct initial research and a baseline assessment of single-subject news sites, mapping the landscape. Subsequently, we will refine our database of single-subject websites, selecting a series of 20 “Single Subject Publishers” who will share their experience and operating model, including funding information, data analytics, and choice of information architecture. By collecting this data, we can begin to assess the case studies and best practices for single-subject websites. In time, we will compile those into two handbooks that can assist future journalist-entrepreneurs looking to create single-subject websites, and develop a Single Subject News Network, where a wide variety of single-subject websites can come together to further their knowledge and promote their unique sites.

Deliverables over the course of this project will include:

- A Research Report on the single-subject model, listing notable case studies and charting their net impact. This would include:

  • An analysis of metric data, such as return rates, time on site, and clear traffic drivers.
  • Analyses of business models to understand where and how specific sites have achieved sustainability.
  • An analysis of journalistic ethics, in cases where specific issues were raised over the course of the website’s news coverage.

- A Handbook for Journalists seeking to build single-subject sites, including applications of journalistic ethics in the digital context.

- A Teaching Kit for Journalism Schools that want to use these platforms as case studies or to show their students viable career opportunities after graduation.

- A Workshop/Conference at the Tow Center, November 8-10, 2013, convening single-subject journalists. Through this conference, we will create the Single-Subject News Network, a network of specialized digital journalists, otherwise known as “Publishers,” under the Tow Center’s umbrella. The initial class of 20 Publishers will contribute to the data gathered for this study. The conference itself and subsequent follow-ups will provide additional data for research purposes.


Call for Single Subject Publishers

This project is seeking an elite first class of 20 “Single-Subject Publishers” to participate in the conference as well as help provide anonymous information that can assist this study. Data collected will include analytics, site traffic numbers, financial avenues, qualitative/quantitative analysis of website construction, return rates, traffic origins, etc. Publishers will help us shape and build the Single Subject News Network; in so doing, they will have the opportunity to workshop and improve upon their websites, ultimately providing better information to their audiences, while networking with other like-minded individuals to learn from their successes and mistakes.

For more information, to be included in this study, or for conference updates, please email your name, organizational affiliation, request for inclusion/request for conference updates, and any comments or questions you may have to kln2120@columbia.edu. You will be notified of your inclusion by October 1, 2013.

For more information on the Single Subject News Network and this research project, please follow us at:

Website: http://beta.syriadeeply.org/

Facebook: https://www.facebook.com/SingleSubjectNewsNetwork

Twitter: @hypertopical


[1] American Journalism Review. “Retreating from the World.” http://www.ajr.org/article.asp?id=4985. <Accessed February 28, 2013>.

[2] American Journalism Review. “Shrinking Foreign Coverage.” http://www.ajr.org/article.asp?id=4998. <Accessed June 4, 2013>.


CU Community, The Tow Center

What can journalism learn from computer science?


Journalism needs an algorithm. That’s not to say machines should replace reporters, but that reporters should be thinking more like machines: systematically. From computer programs that automate news stories to data-driven narratives and mobile app development, journalism’s relationship with computer science is becoming ever more involved. Integrating technology into journalism, however, doesn’t simply mean installing Excel on newsroom computers, or teaching journalism students basic HTML and CSS. Applying core computing concepts to reporting and storytelling can not only improve journalists’ production efficiency, but also shape their narratives.

CU Community

Columbia University and Reuters to work on Advanced Data Visualization Project


Columbia University and Thomson Reuters announced the launch of the Advanced Data Visualization Project (ADVP) based at Columbia’s Graduate School of Architecture, Planning and Preservation (GSAPP). The initiative, sponsored by Thomson Reuters, will facilitate research into data visualization and its implications for academia and industry in a world increasingly awash with data.

Read the full Reuters press release here.


(Photo: AP/Diane Bondareff)

Past Events

Event: Journalism and Technology Breakfast


The Tow Center hosted its inaugural Journalism and Technology Breakfast on Wednesday, May 30, at Soho House. Journalists and tech entrepreneurs gathered at the swanky Chelsea members’ club to discuss the interplay of digital innovation and journalism over artisan granola and baked goods. The event, moderated by Tow Director Emily Bell, is the first of a twice-yearly series that aims to plug Columbia Journalism School further into the New York tech community. In his opening remarks, the dean of the Journalism School, Nicholas Lemann, said the event was in keeping with the school’s move toward further engagement with the digital journalism world.

The first speaker, John Borthwick, CEO and founder of betaworks, spoke of the changing landscape of technology and its impact on journalism. Borthwick said that when he founded the new media investment company four years ago, he did so “outside of the noise” around current media-tech startups. Borthwick described betaworks as a company rather than a fund, a position that allows the organization to participate in the development of its investment projects without becoming trapped in the politics of legacy organizations. Speaking about bitly, one of betaworks’ investments, Borthwick emphasized the importance of data in the newsroom. “The data layer is a shadow because it’s part of how we live; it’s there but usually not observed,” he said.

Since its launch last year, the New York World has focused on producing heavily data-driven stories about government accountability in New York. Editor Alyssa Katz introduced the work of two of her team members that particularly demonstrated the role of data in finding stories. Via video link, Michael Keller presented a four-part interactive, Our Future Selves. Keller was unable to attend the breakfast because he was in Paris receiving the second-place prize for the project at the Global Editors Network International Data Journalism Awards. The piece, originally produced for Columbia Journalism School’s News21 workshop, was published by the Washington Post. It uses census data – collected and analyzed by Keller and his partners on the project, Jason Alcorn and Emily Liedel – to show the effect of an aging population.

Alice Brennan went on to explain another project she produced with Keller and other members of the New York World team. Using NYPD stop-and-frisk data, the New York World worked on a series of stories about stops around the city and the demographics behind the figures. Brennan said the biggest challenge the team faced was the state of the data, which took three weeks to clean and required interrogation of 117 columns of data.

CEO and co-founder of BuzzFeed, Jonah Peretti, and editor-in-chief Ben Smith closed the event with a discussion that built on Borthwick’s remarks about the changing nature of the web. Peretti described the BuzzFeed homepage as “a place to share,” catering to the shift in the behavior of internet users. Smith went on to explain the impact of Twitter and how it’s changed the way people converge on the social web. “The beast wandered off to tweet. People were no longer hitting refresh on their RSS feeds anymore,” he said.

Like Borthwick, Peretti and Smith acknowledged the importance of data in the newsroom. Web publication is not only a cheaper production option than print, they said, but also gives editors and journalists a clearer picture of their audience. There had been, they suggested, a fetishization of the new that lacked engagement with the larger picture; the initial excitement generated by the gizmos has now faded, and journalists and developers have arrived at a place where they can think more critically.

Past Events

Reconstruction of International Journalism

The Tow Center for Digital Journalism is hosting an evening panel discussion at the Columbia Graduate School of Journalism.

“The Reconstruction of International Journalism: Changes in Large Newsrooms”

Tuesday, May 22 from 5-7pm
Columbia Graduate School of Journalism, Stabile Student Center — main level, turn left as you enter the building.

Chair: C.W. Anderson (CUNY)

Panelists:

Caitlin B Petre (New York University): “Interviewing the Interviewer: The Challenges and Opportunities of Questioning Journalists”

Nikki Usher (George Washington University): “Ethnography in a Time of Big Newsroom Uncertainty”

Valerie Belair-Gagnon (City University, London): “Beyond the Physicality of the BBC Newsroom(s)”

Respondent: Michael Schudson (Columbia)

How It's Made

How it’s made: Stop-and-frisk stepper graphic


The other week, the New York World published a data reporting project with the Guardian examining the NYPD’s controversial Stop, Question and Frisk policy. Last year NYPD Commissioner Kelly issued an order to curtail low-level marijuana arrests following stop-and-frisks. WNYC had previously reported that the NYPD manufactured such arrests by ordering people to remove marijuana from their pockets and then charging them with the more serious crime of possession in public view.

Our investigation found that marijuana arrests actually rose after Kelly’s order. But finding that story involved diving into the data.

Thanks to a recent lawsuit, the NYPD releases a database each year of every single “stop-and-frisk” that officers make. Unfortunately, the database is so big it can’t easily be opened in Excel and the data also requires some serious “cleaning” to be usable.

To address these issues, we analyzed the data using the open-source statistics program R, which can handle data cleaning, interrogation, and visualization in one program. Because R lets you type in commands that apply across multiple files, it removes the need to switch among Excel windows. R also supports SQL-like queries, through the sqldf extension package, of the kind that make more complex database systems so powerful.

Cleaning
Because we were interested in when certain types of stop-and-frisk incidents had taken place, we used R to split the day, month, and year of each incident’s date field into individual columns. This set the data up for the next step in our analysis, which was to count up how many marijuana arrests occurred each month.
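The team did this cleaning in R; for illustration, here is an equivalent step in Python with pandas (the filename and date column are guesses at the shape of the NYPD file, not its actual layout):

```python
import pandas as pd

# Load the stop-and-frisk CSV (too big to open comfortably in Excel).
stops = pd.read_csv("sqf_2011.csv", low_memory=False)  # hypothetical filename

# Split each incident's date field into separate year/month/day columns
# so incidents can later be grouped and counted by month.
dates = pd.to_datetime(stops["datestop"], errors="coerce")  # guessed column name
stops["year"] = dates.dt.year
stops["month"] = dates.dt.month
stops["day"] = dates.dt.day
```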

Querying
Using SQL queries, we were able to group and count the data by month and crime type. We focused our searches on marijuana possession (which in the NYPD data was spelled “marihuana”).
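In R those queries ran through sqldf; the sketch below shows the same kind of GROUP BY query issued from Python's built-in sqlite3, with guessed table and column names:

```python
import sqlite3

import pandas as pd

# Load and clean as in the previous sketch (hypothetical filename/columns).
stops = pd.read_csv("sqf_2011.csv", low_memory=False)
dates = pd.to_datetime(stops["datestop"], errors="coerce")
stops["year"], stops["month"] = dates.dt.year, dates.dt.month

# Put the data frame in an in-memory SQLite database so it can be
# queried with SQL, much as sqldf exposes SQL over R data frames.
conn = sqlite3.connect(":memory:")
stops.to_sql("stops", conn, index=False)

# Count suspected-marijuana stops per month, using the NYPD's own
# spelling, "marihuana". "crimsusp" is a guessed column name.
monthly = pd.read_sql(
    """
    SELECT year, month, COUNT(*) AS n
    FROM stops
    WHERE UPPER(crimsusp) LIKE '%MARIHUANA%'
    GROUP BY year, month
    ORDER BY year, month
    """,
    conn,
)
print(monthly)
```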

We ran a number of queries to see month-to-month trends and also compared across years to see how 2011 data compared to 2008. This gave us valuable context: stops actually dropped in November and December of 2011, but not as much as they did in those same months in prior years. If Kelly’s order had impacted officer behavior, we should have seen relatively dramatic decreases during those months, but we found only slight declines. This context was vital to our story, and explaining why the 2011 drop was not significant was a high priority for our final visualization.

We also ran queries comparing arrests to stops as well as isolating specific precincts. However, only a few of these queries yielded results that were worthy of inclusion in the final interactive.

Visualizing Part 1:
In order to find the trends mentioned above, though, we first had to visualize our query results, which R can do, too. An extension package for R called ggplot2 will generate high-quality, customizable line graphs that could be used directly for print graphics. However, we wanted ours to be interactive, which required some additional work.

Visualizing Part 2:
SVG (Scalable Vector Graphics) is a graphics format that is drawn dynamically on a computer screen, which means that it can be highlighted, clicked, rolled over, or animated in ways that .jpg, .gif and .png files can’t. The ggplot2 graphics can be converted to SVG and then published to the web using a JavaScript library called Raphaël. Although this requires some copying-and-pasting, the clean, dynamic graphics it produces are worth it.

Putting it all together
To better tell the story, we compiled four sets of charts that we incorporated into a so-called “stepper graphic.” Thanks to the newsapps team at ProPublica, there is a great open-source library (http://www.propublica.org/nerds/item/anatomy-of-a-stepper-graphic) for building these graphics. Turning my four charts into four different “slides” was as easy as creating a function for each of them and then copying in their Raphaël code. The stepper graphic library took care of numbering and transitions. We built the grid and axes with standard HTML and CSS, and made label fades using simple jQuery fadeIn() and fadeOut() methods.

Finally, once we confirmed we were running the story with the Guardian, we adjusted the styles to make sure it would mesh well with their design. So we made the months lowercase, the font Georgia, and the line fuchsia – perhaps the most important part.

Past Events

Event: Doing Data Journalism


The archived video of this event can also be accessed here.

A panel of six of journalism’s movers and shakers convened at Columbia Journalism School on March 28 to debate the current state of data-centered reporting and interactive visualizations.

The panel, moderated by Columbia professor Susan McGregor, tackled numerous issues surrounding data journalism, but their first hurdle was to simply define data journalism — a type of storytelling quickly gaining momentum in newsrooms.

“Data journalism is just journalism,” said Julia Angwin, the Wall Street Journal’s technology editor.

Angwin likened the collection of data sets to the age-old process of conducting interviews. The difference is that the technology available today allows journalists to examine data sets more exhaustively, beyond the limits of interviews and common knowledge, she added.

Angwin has worked on projects like “What They Know” which examined the tangled nature of online privacy. Her team secured data using code forensics, which she said helps “break stories” and “expand journalism.”

Jo Craven McGinty, projects editor for Computer Assisted Reporting at The New York Times, called data journalism “documents reporting on steroids,” implying data journalism allows journalists to dive into larger and more complicated data sets with the help of database systems and spreadsheets.

Scott Klein, editor of news applications at ProPublica, said the field of data journalism should also recognize the potential of “news applications” which weigh the presentation of data as greatly as its gathering, reporting and analysis.


Blog post on data scraping for ProPublica's "Dollars for Docs." Photo: Rani Molla.

ProPublica projects like Dollars for Docs — which examined doctor payoffs from drug companies using data scraped from pharmaceutical websites — allow users to search for their own doctors and view any payments they received.

Klein said this type of user interface was a key component of data journalism: “It can tell your personal story…and how it matters to you.”

But the use of data or technology in storytelling does not change the inherent concepts of journalism, Klein added.

“This is journalism that is native to the web, but it’s still just journalism,” he said. “The rules all still apply, the methodology is the same, the rigor is the same…the editorial judgement is all the same.”

It is the concept of data, though, that might need restructuring, according to Aron Pilhofer, editor of interactive news at The New York Times. Tools like Document Cloud (which he and Klein helped develop) allow even plain text documents to become data, by enriching them with metadata and providing search functionality.

The panel then turned to a discussion about which comes first: the data or the story. The panelists unanimously agreed the story idea almost always leads to the data research.

But data analysis rarely — almost never, according to Mo Tamman, a Reuters data journalist — yields the expected results. It almost always reroutes the story to an unanticipated conclusion.

Angwin stressed this kind of journalism can be thought of as “testing hypotheses.” It is ultimately using data to verify or rethink story ideas.

Tamman added it is crucial to bring in outside experts almost immediately and “suck their brains dry” in order to better understand, authenticate and contextualize the meaning of the data.

But as the burgeoning practice of data journalism expands, newsrooms must adapt, according to Angwin. Newsrooms are currently “allergic to margins of error,” she said, and they must learn to cope with results that cannot be verified 100 percent — a typical situation when dealing with large data sets.

Newsrooms must also become more math friendly and data literate — something McGinty says can mean simply knowing when data and documents can successfully augment one’s storytelling.

Embracing data journalism may even support new business models, such as the new joint venture from Reuters and The New York Times data teams which will offer “white glove” Olympics coverage, Pilhofer said. However, even small changes in newsrooms — like seating data teams together — can be essential in fostering innovative thinking among the staffers.

And most importantly, Tamman said, newsrooms and journalists doing data driven journalism must incorporate into their reporting practice the process of finding a story’s “empirical spine.”

The fundamentals of this process rely on using data analysis to develop the story’s hypothesis, and then allowing the reporting to “flow” from that analysis or “spine.” This process contrasts with the practice of many journalists, panelists said, who only look to data after substantively completing their story — sometimes to discover that that story is completely inconsistent with the data.

Technical consultant and privacy expert Ashkan Soltani says this issue can be addressed in part by having a reporter seek out qualitative interviews while a data team independently looks into the quantitative data, thereby simultaneously obtaining both sides of the story.

“You can come together and ask ‘Do they confirm each other or have different findings?’,” he said. “That can then merge together to form the spine of the story.”

Angwin says another solution could stem from news organizations collecting their own data sets.

“Data itself is political,” Angwin said, referring to choices and process involved in gathering data.  If news organizations amass their own data, she said, it could help reporters find the data that best addresses the questions raised by their qualitative reporting – something existing data sets are not always sufficient to do.

The panelists also debated the best platform to convey a data driven story, but ultimately felt nuance can be expressed in graphic visualizations just as well as in long-form narratives or news apps.

It is the journalistic backbone and purpose of such pieces — which use data intelligently and appropriately — that truly makes them data journalism. Visualizations or data sets without these qualities don’t deserve the title.

As Pilhofer put it: “If you aren’t telling a story in the presentation piece or approaching it with a journalistic intent, then you’re wasting everyone’s time.”

Featured image by Rani Molla.