Claire Wardle joins Tow Center as Research Director

We are pleased to announce the appointment of Claire Wardle as research director for the Tow Center. Wardle is currently the senior social media officer at the United Nations High Commissioner for Refugees (UNHCR) and has a significant background in academia, journalism research and working with journalists in the field.

“We are delighted that Wardle is joining the Tow Center to lead our research program. She has the rare combination of being a leading figure in academic research in the field and someone who has a firm grasp of the practical needs of newsrooms,” said Emily Bell, director of the Tow Center for Digital Journalism.

Wardle holds a PhD in Communications and an MA in Political Science from the University of Pennsylvania. She taught at the Cardiff School of Journalism, Media and Cultural Studies in the United Kingdom for 5 years. She is one of the world’s experts on user-generated content, and has led two substantial research projects investigating how it is handled by news organizations. She was previously director of News Services for Storyful, and currently sits on the World Economic Forum’s Committee on Social Media. In October 2014, Wardle co-founded Eyewitness Media Hub, a non-profit initiative committed to providing content creators and publishers with continuing research and resources. From October 2013 to May 2014, she was a research fellow at the Tow Center, and published the Tow-Knight Report, “Amateur Footage: A Global Study of User-Generated Content.”

In early January 2015, the Tow Center announced $3 million in new funding from the Knight Foundation. The funding will build on the Tow Center for Digital Journalism’s innovative research and field experiments that explore the changing relationship of journalism and technology, while helping newsrooms and educators meet tomorrow’s information needs.

“The Tow Center has established an extraordinary, distinctive record of research on the intersection of journalism’s democratic function and the new tools of the digital age. The Center’s work has been at once visionary and actionable. Wardle’s research and career is in a similar vein, and we are very fortunate to have her leadership at the school,” said Dean Coll.

Wardle will work with Tow Center Director Emily Bell to develop a new research program for the Center, with a focus on the following four areas:

    • Computation, Algorithms and Automated Journalism will explore ways to bring computer science into the practice of journalism and look at the benefits and challenges that automated reporting tools, large-scale data and impact evaluation metrics, among other solutions, can bring to the newsroom.
    • Data, Impact and Metrics will extend the work of Al Jazeera data journalist Michael Keller and metrics specialist Brian Abelson who are using technology tools and data to explore which stories have impact and ways to reproduce these effects.
    • Audiences and Engagement will study the new relationship between the journalist and the audience, examining the impact and new demands that social media, participatory journalism, crowdsourcing and other developments are creating in the field.
  • Experimental Journalism, Models and Practice will develop field experiments with journalists around themes such as the influence of philanthropy on news startups; surveillance technologies used by and against journalists; applying game design techniques in newsrooms; and gender balance and diversity in journalism.

The Tow Center, established in 2010, works at the intersection of journalism and technology, researching new challenges in the rapidly changing world of digital journalism. The Center has quickly become a leader in the field, spearheading research in areas like sensors and journalism, algorithmic accountability, digital security and source protection for journalists, and the digitally native foreign correspondent.

In conjunction with our announcing Wardle’s appointment, we are pleased to announce a formal call for research fellows and research project proposals. Read more here.

All applications for research fellow positions must be submitted via the Columbia University Recruitment of Academic Personnel Site (RAPS). Read the full job description and submit an application here.

Call for Research Applications – Spring 2015

In January 2015, the Tow Center for Digital Journalism was awarded $3 million in new funding from the Knight Foundation to expand research into the following 4 areas: Computation, Algorithms and Automated Journalism; Data, Impact and Metrics; Audiences and Engagement; and Experimental Journalism, Models and Practice. These 4 areas are expanded upon in greater detail below.

Call for Research Fellows

We are pleased to announce a call for applications for research fellowships to lead and oversee research in these areas. Research fellows will occupy full-time positions with the Tow Center, and direct the course of research projects in their area of specialization. Persons interested in these positions should read more below, and apply via the Columbia University Recruitment of Academic Personnel Site (RAPS), here:

Call for Project Proposals

We likewise are pleased to announce a call for research project proposals. We invite students, researchers, faculty and practitioners in the fields of computer science and journalism to propose potential research projects that fall within our 4 areas of inquiry. We outline the proposal process for research projects in further detail below, and encourage you to adhere closely to the outline noted.

Research Fellows

Research Fellows are employed full-time with the Tow Center for Digital Journalism at Columbia Journalism School in the role of Associate Research Scholar. Research Fellows will be responsible for developing research streams that tackle large and cutting-edge issues in digital journalism through a combination of field research, workshops, events and published articles, briefs, and reports. Applicants must apply through the Columbia University Recruitment of Academic Personnel Site (RAPS) at the link below.

Apply here:


The ideal candidate will have:

  • Demonstrated deep knowledge of the current practice, thinking and future potential in one or more of the Tow Center’s research themes (articulated above)
  • A set of original ideas for research projects in one or more of those themes
  • Demonstrated ability to produce research that is rigorous, authoritative, contains original ideas and analysis, and relevant to the journalism industry
  • A graduate degree or PhD in journalism, communications or a closely related field; candidates with experience working in research & development in journalism, communications or a related field will also be considered.
  • Demonstrated experience with qualitative and quantitative research methods
  • A network in the innovative journalism community, or the journalism research community
  • Demonstrated ability to write and edit well for the Tow Center’s community: working journalists, newsroom managers and leaders, academic researchers, journalism students and educators
  • Demonstrated ability to work in small teams.


Research Fellows will:

  • Work with the Tow Center’s Research Director to lead the Tow Center’s research in one of the four themes
  • Generate ideas for research projects
  • Produce their own research (which may include written reports, journalistic experiments, events, workshops, panels or other forms)
  • Ensure that work by contracted research fellows is delivered at high quality and to an appropriate timetable
  • Regularly blog on their areas of expertise
  • Help disseminate the Tow Center’s work throughout the digital journalism community

Call for Project Proposals – Scope, Format, and Process

Project Proposal Scope

Research projects range from small to large in scope; a small project might comprise a few months of local field research and writing to produce a white paper on a specific topic at the forefront of the study and practice of digital journalism; a large project might comprise the design and implementation of a technology or process to then be tested and evaluated in an applied journalism context.

Project Proposal Format

Project proposals are reviewed on a 6 month basis. Project proposals should be limited to three pages only. All project proposals should be sent to the Tow Center’s senior research fellow, Fergus Pitt, at: as an MS Word Document, and must include:

  • A brief description of the proposed research activities
  • A distinct explanation of the significance and timeliness of the research, and the anticipated impact and applicability of the resulting findings.
  • Project deliverables, including blog posts, papers, events and programming, applications
  • Project personnel, including short biographies and links to CVs or resumes for all proposed team members, indicating why they are capable of delivering the work
  • Project timeline, clearly indicating proposed timeline for work, project phases, and expected time commitments from team members
  • Project budget, including labor, materials and/or travel

If you have ideas for a research project but would like to meet with a member of our research team prior to the submission of a proposal, contact the Tow Center’s senior research fellow, Fergus Pitt, at:

Project Proposal Review Process

We will announce and advertise open calls for project proposals every 6 months. Our timeline for the next year is projected below.

March 31, 2015: Research Project Proposal period opens
April 30, 2015: Research Project Proposals are due
May 15, 2015: We will have contacted Project Proposal applicants whose work we are interested in

September 15, 2015: Research Project Proposal period opens
October 15, 2015: Research Project Proposals are due
October 30, 2015: We will have contacted Project Proposal applicants whose work we are interested in

Applications received after each closing deadline must be re-submitted by the applicant during the next Research Project Proposal window to receive consideration.

Our Research Themes, expanded:

  • Computation, Algorithms and Automated Journalism
    Computer science now plays a key role in journalism. Whether it is the investigation of algorithms, large-scale data projects, automated reporting tools or detailed metrics evaluations data, a whole new area of professional expertise now sits alongside more traditional journalistic skills. This stream will explore the leading edge of this new practice, work with institutions that are actively and progressively embracing computation, and find innovative ways to bring computer scientists into the practice and development of journalism.
  • Data, Impact and Metrics
    The Tow Center is already known for its work in data journalism, both in research and the classroom, and looks to extend this work over the next three years. Journalists often get into the profession inspired to “make a difference”, but newsrooms are naïve in the science of figuring out which stories have impact, why that’s so and how to reproduce those effects. Developing work around the use of metrics in newsrooms has been a cornerstone of the Tow Center since its inception four years ago. Work is needed in developing new metrics and measurement as our access to data increases and ways of finding and consuming journalism develop.
  • Audiences and Engagement
    Participatory journalism, social journalism, crowdsourcing, open journalism: there are many ways of describing the new pact between the journalist and the audience they seek to engage and inform through new platforms and tools. With experiments taking place in newsroom comment systems and social platforms, publishers are grappling with how to evaluate the direct relationship with audiences. As some news organisations withdraw from commenting or outsource their interactions to other social platforms, we see this area as needing far more research to examine questions such as, What are best practices? Do news organisations which employ more engagement tools improve key metrics? When data scraping is more efficient than crowdsourcing, what is the future of participatory techniques and social media teams?
  • Experimental Journalism, Models and Practice
    This track will develop our field experiments with practicing journalists and newsrooms to test new techniques and technologies. In models and practice there are a number of areas where we are already establishing research or will start new investigations. These may include the transformation of broadcast newsrooms, the influence of philanthropy and non-profit funding on news start-ups, a continuation of our work on surveillance technologies used by and against journalism

Research Audiences

The Tow Center’s different research projects engage particular communities in different ways – both in terms of who produces the work and who consumes it. Researchers and project proposers should let the target audiences inform their plans for research content, activities and dissemination.

The practicing digital journalism community forms the Tow Center’s biggest constituency; these are largely working reporters and newsroom managers. For these people the Tow Center should identify, articulate and analyse the new digital journalism ideas that will affect their work over the coming years. If your project is likely to be of interest to this group, you may want to include specific training or seminars, panels at mainstream journalism conferences, alongside publication through the Tow Center’s website and social media channels. To reach these audiences, you may want to pitch articles based on your work to journalism news channels including PBS Mediashift, Nieman Lab, Poynter and the influential industry bloggers.

For news executives the Tow Center aims to provide awareness of significant new ideas, access to our fellows for their expertise, workshop space to focus on ‘the next thing’, and avenues through which their teams can be involved. If your research is likely to be of particular interest to news executives, you may want to include in your deliverables specific meetings or events.

Current journalism students and educators are another constituency for the Tow Center. They are often reached through similar activities and channels to practicing journalists; however, fellows may also choose to publicize and distribute their work directly to educators. We welcome collaboration with faculty, researchers and students at other academic institutions.

For the academic research community, the Tow Center also aims to provide opportunities to publish quality, relevant work on fast timelines. The Tow Center will reach this community through their specialist conferences, including ISOJ and NICAR, and online channels including their email lists and Twitter groups.

Research Type: Tone, Language & Output Rhythm

The qualities of the tone, language and output rhythm of the Tow Center’s research flow logically from the Tow Center’s identity, the Center’s mission, and the Center’s audiences: The Tow Center is a great research institution, housed within an excellent journalism school and designed to respond to and serve the rapidly changing journalism industry. The Tow Center’s mission is to help individual journalists, students, faculty, news organizations and policymakers to develop and expand their thinking around and practice in digital journalism.

Tow Center research projects need to serve busy people who consume a lot of information: The tone and language must therefore efficiently convey that the content is insightful, stimulating, backed by quality research that would otherwise be unavailable, and is relevant to the audiences’ professional lives. Different projects will reach slightly different constituencies, so the tone and language can likewise vary, but they should not stray far from the description above.

The output rhythm should similarly serve the constituencies: Although the Center will study topics that remain relevant over the medium to long term, the precise trajectories of journalistic movements may be unpredictable and our audiences need to respond quickly. For that reason, each program of work should be agile, producing regular audience-facing deliverables every few months and adjusting course as needed.

Eyewitness Media – how do online and broadcast compare?

This time last year, almost to the day, Sam Dubberley and I presented the results of a new study on user-generated content and its use by broadcast media. You can find the full report here, and our presentation here. Our research assistant on that project was the super smart Pete Brown.

Over the past year Pete has been working on a partner study, examining the integration of user-generated content (which we now call eyewitness media – see why here) by online news sites. His research was published last week, and you can find the research website and full pdf here.

What I want to do here is compare the two reports. What are the main differences between the ways in which television broadcasters and news websites use eyewitness media? There are differences, and it turns out digital media performs better, but not so much that we can all go home.

In the first study we analysed three weeks of output (November 25 until December 15, 2013) by eight different international news broadcasters – Al Jazeera Arabic, Al Jazeera English, BBC World, CNN International, euronews, France 24 English, NHK, and Telesur.

In this second study, Pete Brown again examined three weeks of coverage, although different dates to the first sample (August 26 until September 15, 2014). For this study his sample included eight newspaper websites, which were located across five continents, and had the largest relative readership. They were: Cairo Post[1], Clarín (Argentina), the Daily Mail (UK), the Guardian (UK), New York Times, Sydney Morning Herald and the Times of India.

How much do broadcasters use in comparison to the news websites?

On TV, there were 2,115 pieces of eyewitness media broadcast on the eight channels over the three weeks. Online, there were 4,974 – almost 2.5 times as much content. Each broadcaster had 24 hours of programming to fill. Online publishers, in comparison, have unlimited space. Online, there was a vast difference in terms of the number of articles that were featured on the homepage. The Daily Mail had an average of 465 articles each day, compared to Clarín, which had 37 per day. Overall, both studies show the consistent and significant use of eyewitness media.


Measure | Broadcasters | News websites
Number of ‘items’ of eyewitness media over 21 consecutive days | 2,115 | 4,974
Average | Eyewitness media was used every half an hour of programming | Eyewitness media was used once every 5.6 articles


Some broadcasters – such as Telesur – used hardly any, but those who understood how to use eyewitness media in their output did use it, particularly as a way of covering breaking news stories. Online, it’s also certainly a way of creating articles that otherwise would not be written. For example, a story of a bear dancing on a golf course needs pictures!

(Here is the Daily Mail article on the dancing bear)


In what types of stories is it used?

On television, eyewitness media was almost entirely related to breaking news coverage, or the ongoing conflict in Syria. Photographs and videos were used as part of packages to add depth and colour.

Online, eyewitness media was also used for breaking news coverage, but it was also used to create photo galleries to illustrate a particular trend or experience, to create stand-alone stories based around a particular video, or used as part of more in-depth coverage, for example explainer videos about ISIS (this was an area in which the New York Times was particularly strong). The variety of formats that exist online meant that eyewitness media appeared in a number of different ways. There were definitely more examples of ‘softer’ types of stories, such as talented (or potty-mouthed) children, or entertaining animal encounters.


How much was credited?

Our first study received quite a bit of interest because of the evidence that eyewitness media is frequently not credited. We found only 16% of the eyewitness media broadcast during the study period had been actively credited. (We say ‘actively’ because we did not consider a watermarked logo in the top left hand corner of a video uploaded by a Syrian activist group to be a clear credit, from the audience’s perspective).

In the online study, the results were quite different. 49% of the content was credited. It’s worth noting that 19% of that was an ‘automatic’ credit because the content was embedded, which means the DNA of the piece of content is automatically available as part of that process, i.e. you can see the username, click through to the picture or video on the social network that hosts it.


Measure | Broadcasters | News websites
Percentage of eyewitness media credited | 16% | 49%


Overall, it’s pretty depressing to note that less than half of all content was credited to the eyewitness. In certain situations, crediting is not the right thing to do. Sometimes eyewitnesses do not want to be credited, either for safety or privacy reasons, and best practice is for journalists to ask the eyewitness whether they would like to be credited and how. It’s certainly not the case that 51% asked not to be credited, particularly as this research also included a number of conversations with the eyewitnesses whose content had been found during the study. Many of them revealed that they hadn’t even had their permission sought, let alone been asked how they would like to be credited. In one example, the eyewitness gave permission for her photo on the condition that she was credited, only for it to be re-used the following day without credit.


How much was embedded?

Embedding is a really interesting question when comparing broadcasters with online publishers, because broadcasters can’t embed. TV producers who want to use footage they’ve found on YouTube or Facebook don’t have the option to embed. They have to download the content and re-upload to their own systems. In order to do that, the broadcasters should be seeking permission from the person who filmed the content (crucially not the person who uploaded it; the person who filmed the footage and therefore retains the copyright). Embedding therefore has significant benefits, particularly during breaking news events as online news publishers can embed photos and videos without having to wait for permission.

Embedding also comes with a couple of downsides. The main one is that if the eyewitness deletes the post, your online site suffers from a black hole where the content used to be. Embedding also means that the person’s identity is automatically being shared much more widely with the world. Someone caught up in an active shooter situation might post to Instagram to share information with family and friends, but embedding that post on a news site that receives significant traffic might not be the ethical thing to do.

Taking this into account, it’s still worth looking at embedding practices. Would it be more ethical if newsrooms asked permission to embed, for example? There is certainly a conversation to be had here. Overall only 19% of eyewitness media was embedded by the news sites included in this study.

How much content was labelled?

Crediting involves identifying who the eyewitness was, by including their username or real name on screen or within a caption. Labelling simply means explaining that the content audiences are viewing was captured by someone unrelated to the newsroom. Interestingly, in some UK-based focus group research carried out by Pete Brown (to be published soon) many participants argued that they didn’t need labels as they could always ‘tell’ when something was eyewitness media. As a research team that has been examining eyewitness media for almost two years, we need to explain that it is certainly not always clear, and often as part of the research we had to carry out quite detailed investigations to confirm the provenance of a piece of footage.

It is also much more than simply labeling whether a piece of footage is eyewitness media; it is also about providing context for the audience. If a piece of footage was captured by an innocent bystander, someone caught up in a breaking news event, that is one thing. But many other people, filming on their mobile phones and uploading to the social web, have other motivations. It could be relatively benign, such as a humanitarian worker documenting their experiences in the field, or it could be less so – such as when it is someone related to ISIS documenting the capture of a town – or it could be an activist at a protest. It is important for the audience to know the ‘source’ of the picture or video as it provides crucial context, and I use the term ‘source’ deliberately here.


Measure | Broadcasters | News websites
Percentage of eyewitness media labelled | 28% | 74%


On television, only 28% of the eyewitness media examined included a label. Online, it was 74%, a significant difference. The additional space online provides more options, undoubtedly, but I would also argue labeling is an example of how online there are higher expectations in terms of transparency.


And finally…

No research is perfect, and this sort of comparison is a little bit unfair. The three weeks studied were not the same as the earlier three weeks examined for the broadcast study. Similarly the averages used here hide really significant differences between broadcasters or publishers. But still there are trends that can be seen here, that suggest there are important differences in the way broadcasters and publishers are using eyewitness media.

The most significant take-away is that eyewitness media is a really important source for news outlets. And as both of these reports show in different ways, particularly the qualitative aspects (the interviews with journalists, as well as uploaders), it is a wild west out there. There is a great deal of ignorance both in newsrooms and amongst eyewitnesses about the legal and ethical questions that remain.

The original Tow Center research led to the formation of the Eyewitness Media Hub, and as a result this latest research. Pete Brown has continued researching this area, and his report on the way audiences think about the news industry’s use of eyewitness media will be published very soon. The bottom line about all of this research is that there is still a long way to go in supporting newsrooms so they know how to use it, as well as educating eyewitnesses on their rights. We hope the research will provide evidence for newsrooms that they need to think about their current practices, and we are continuing to build resources to help people navigate their way through these issues.

[1] Pete had originally intended to focus solely on newspaper websites ranked in the top 1000 for web traffic by Alexa. He had hoped to include Youm7, but as a non-Arabic speaker he had to make a compromise. He decided to include the Cairo Post instead because it is English language and produced by Youm7.

Supporting Citizen Journalism in Turkey

An audio recording of a recent discussion with the co-founder of 140journos is available here on SoundCloud.

Thanks to the rise of social media and crowd-sourcing, an increasing amount of reporting now comes from individuals without specific journalistic training. While communications researchers often focus on the impact of citizen reporting on journalism, we rarely interrogate how closely it is linked to civic engagement: Are citizens who send reports from the ground already engaged in other areas of civic life? What are the ways of building diverse citizen news communities online? And how can social media be used to spur a conversation – not to mention action – despite the constant stream of new information that these tools provide?

Engin Onder, co-founder of Turkey’s 140journos, a Twitter-based citizen journalism network, visited Columbia Journalism School last week to talk about his organization’s experience of building and engaging a news community online. Founded in early 2012, 140journos set out to counter media manipulation and censorship in Turkey, whose media environment has been degrading for several years. In its 2014 press freedom rankings, Freedom House downgraded Turkey from “partly free” to “not free.” “Turkey’s TV stations aired 44 hours of live speeches by President Erdoğan in one week,” said Onder – meaning many major stories that matter to certain political, social, ethnic, religious or intellectual communities remain underreported.

In its first 18 months, 140journos operated in relative obscurity. Onder and his co-founders were reporting directly from street protests or courtrooms, sending 140-character Twitter reports about events that mainstream media largely ignored. When the Gezi protests began, however, Onder said their role changed almost overnight. In the information vacuum left by mainstream media, hundreds of people turned to Twitter to report and share news. Sifting through thousands of tweets, 140journos began curating and verifying social media content instead of reporting on the ground. “We wanted to keep a neutral identity so we were always avoiding to be part of the conversation during the Gezi protests,” Onder said.

During Gezi, 140journos gained a loyal following on social media and a reputation as a reliable citizen news network. Today, the organization has a core group of citizen reporters around the country that has already been vetted by 140journos’ editors. “We’re friends with many of them,” said Onder, adding that the group can generally rely on these reporters when there’s a particular story to be covered. 140journos also uses social media in creative ways to build intelligence around what kind of stories might become critical via Twitter lists or Facebook groups based on events or places.

The success of the group has prompted wider interest in their methods. 140journos was recently awarded a European Cultural Foundation grant to organize citizen journalism workshops in cities where there are not enough citizen reporters. Editors from 140journos are meeting with local change-makers or activists who already use social media platforms, but have yet to use them for news reporting. “In Turkey more than 30 million people use Facebook and more than 10 million are on Twitter, but not everyone may not be aware of how to better use the existing infrastructure,” said Onder.

It is one thing to have a large, diverse network of citizen reporters, and another to engage this network in an ongoing conversation. Turkey’s social media can be highly partisan and host to heated commentaries. 140journos, however, is very meticulous about using neutral language – to the extent that they keep a collection of “controversial words on social media.” While 140journos might not be liked on social media because their coverage is so neutral, Onder admits, he believes it is essential in order to reach a diverse set of communities.

140journos strives to turn heated moments into “meaningful discussions” by using explanatory tools, such as maps. Onder described how 140journos created a community discussion around the death of Turkey’s former president Kenan Evren, who died on May 9. Evren was strongly disliked in much of the country at the time of his death, so the news immediately became the most popular topic on Twitter. Among all the visuals and commentaries people were sharing on social media, the 140journos team noticed a front-page story from the year Evren was elected president. Though it was a poor-quality image, it displayed the results of Evren’s 1982 referendum, showing how cities around Turkey had voted. Using that image and the data it contained, 140journos created these new maps that, Onder explained, offer many insights into Turkey’s current political controversies, thus generating a more informed debate.

140journos is now working on two fronts by growing their team and community, and using better information and visuals to contextualize partisan issues. A key part of this, Onder said, is getting to know the country better by actually visiting cities and organizing workshops. “I can’t really tell where all these efforts are going to take us,” Onder confessed. “But we believe it is going to empower people.”

An audio recording of the event can be accessed on SoundCloud, here.

Read about Professor Susan McGregor’s upcoming work in Turkey with the President’s Global Innovation Fund at Columbia University.

Stills from the Showcase

On Tuesday, May 12th, the Tow Center co-hosted the Columbia Journalism School Showcase with the Brown Institute for Media Innovation.  The showcase is an annual open house event that allows students and researchers to share their work with professional journalists, industry partners, entrepreneurs, technologists, academics, and the public.

This year’s showcase featured projects on data visualization, computational journalism, video and audio storytelling and research. Selected projects were installed and presented in the Brown Institute in Pulitzer Hall, and over 100 people came to celebrate the end of the year, and see the students’ and fellows’ work.

Digital Media Associate Joanna Plucinska photographed the event; a selection of her photos appears below.


Assistant Director Susan McGregor to Partner with Global Centers

Read the announcement by the Columbia Provost here.
The Tow Center for Digital Journalism at Columbia Journalism School is proud to announce that Assistant Director Susan McGregor was among those selected to receive 2015 funding from the Columbia University President’s Global Innovation Fund for her project, the Global Operational Data Index. McGregor’s project was one of sixteen proposals chosen for funding from among the more than fifty submissions received.
Through a collaboration between the Tow Center and Columbia University’s Global Center in Istanbul, the Global Operational Data Index will collate information about on-the-ground communication conditions in countries around the world. By gathering and publishing region-specific information about the legal and technical circumstances affecting digital communications, the Global Operational Data Index will serve as a centralized, up-to-date reference for journalists, academics, human rights workers and others who depend on these communication systems for their work.
“In conversations with a wide range of journalistic and non-governmental organizations, it has become clear that accurate contextual information is essential to operating safely” in a given region, McGregor said. “Yet despite the wealth of knowledge contained in these organizations’ networks, a centralized repository of this information does not currently exist.  By drawing on the diversity of Columbia University’s students and faculty, in addition to forging partnerships with the Global Centers and others, the Global Operational Data Index can provide an essential resource for everyone wishing to work safely and effectively in a region foreign to their experience.”
Tow Center Director Emily Bell adds, “Freedom of the press and security for sources and journalists in a digital age is a key area of interest for the Tow Center. Working with Global Centers gives us an opportunity to develop our research and resources internationally.”
To learn more about the President’s Global Innovation Fund and McGregor’s project,  read the Columbia University press release and full project description.
For regular updates about the Tow Center, subscribe to our newsletter or follow Susan McGregor and the Tow Center on Twitter.

A Measurable Effect: Using Metrics in the Newsroom

By Sybile Penhirin

Metrics have become an inevitable component of today’s journalism. Many websites, such as Chartbeat and Google Analytics, offer various ways for newsrooms to measure and develop their audience. Even people who do not necessarily subscribe to these services can gauge their stories’ success by seeing how well they do on social media such as Twitter and Facebook.

However, there has been little empirical research on how these metrics are produced and how they affect newsrooms’ cultures and journalists’ daily work, said Caitlin Petre, who recently published a report on the topic.

Petre presented the main points of her study, “Traffic Factories: Metrics at Chartbeat, Gawker Media and The New York Times” at a Tow Center event on Thursday night. You can read her report here and the key findings here.

She then invited three panelists to join the conversation: Sam Henig, one of The New York Times’ digital deputies in charge of broadening the use of metrics throughout the company’s newsroom; Chadwick Matlin, a features editor for FiveThirtyEight who is helping develop that newsroom’s approach to audience analytics; and John Herrman, whose ‘Content Wars’ series for The Awl examines the journalism industry’s metrics-driven moves and counter moves.

When understood and used well, analytics can help newsrooms know and target their audience better, the panelists said. Editors can manipulate different elements, such as headlines or the time a story is published, and then analyse the metrics in real time to determine which option attracts the most readers.

Matlin said he recently changed the photo on a published piece about Deflategate, which proved to be an effective move to retain readers.

Similarly, The New York Times published the first part of its year-long investigation into nail salons early on Thursday morning to optimize the number of readers reached.

“It’s very obvious to us that we should be publishing things at times when our readers are coming to us,” Henig said when talking about the publishing process of the company.

This strategy not only ensures that an optimal number of readers see the piece, but also increases the potential for the story to be shared and to go viral.

Thanks to analytics, media companies can explore dozens of features to optimize their stories’ spread.

“If you work at a place that is truly analytics-savvy, the thing that is most fluid is form,” Herrman noted.

A place like BuzzFeed, which is very analytics-savvy, seeks new ways of delivering its entertainment and news stories by experimenting with innovative concepts and formats, he said.

“You can categorize that as a slippery slope, but I’m not sure what’s at the bottom, a lot of traffic and a lot of posts that people really like and share online,” he said.

Metrics, used as feedback, can also have a positive impact on writers and editors. In her report, Petre found that journalists sometimes turn to these data as a reassuring reminder of their professional competence.

“If you write online, you can sort of get the feeling sometimes that no one is reading or that you might not even exist, you need to be reminded a lot,” said Herrman when asked why metrics matter.

But data used in newsrooms also come with several underlying issues, Herrman, Henig, Matlin and Petre noted.

One of the pitfalls, for instance, is that the company’s end goal becomes getting as many “clicks” as possible.

News organisations experimenting with their stories’ features in order to optimize their audience have to be careful that a piece’s form does not overwhelm its editorial substance, Matlin said. A standalone slideshow on the Syrian conflict might get a lot of visitors but is probably not the best editorial way to tell the story, for example.

“You write to be engaging and you can take that way way too far, analytics provide a lot of temptations to do that,” Herrman said, adding that this trap was not so much inherent to analytics as the result of a loss of perspective on the institution’s part.

“That’s the kind of thing that happens in institutions where you don’t have structures built around analytics, you don’t have someone to interpret it for you,” Herrman said.

To be well understood and used, metrics need to be placed into context and they cannot be interpreted on the fly, Petre noted in her report.

Henig agreed and explained that after its Innovation report, The New York Times brought together several specialists – including a former hedge-fund employee, analytics experts and employees from the company’s product side – to form its metrics team.

To learn more about Metrics at The New York Times, but also at Gawker Media and Chartbeat, you can download Petre’s report by clicking here.

Research Director Claire Wardle Arrives at Tow

It’s my fourth day on the job. My new job as Research Director at the Tow Center for Digital Journalism. I’ve spent the last few hours trying to come up with a way of describing how this feels and failed miserably. I realised no words sum it up, although this gif of Peggy Olson from this week’s episode of Mad Men comes pretty close to how I felt as I walked into the building on Monday morning.


For the past 6 years, I’ve been incredibly fortunate, being paid to travel the world training and consulting for newsrooms, international organisations and NGOs in social media, verification and user generated content. I loved my time working with incredibly smart people who were eager, and also at times utterly terrified, about embracing the communications revolution that was fundamentally changing the way they worked every day. My time as a freelancer was interspersed with periods at Storyful and UNHCR, one a small and agile startup, and one a very traditional organisation trying to make the move to digital. Both were utterly fascinating but challenging in so many different ways.

But fundamentally, I’m an academic. I completed a PhD at the Annenberg School for Communication at the University of Pennsylvania, which was followed by five years at the Cardiff University School of Journalism. In 2013 I undertook some research for the Tow Center on user-generated content, and so loved the experience of having time to think, reflect, and analyse, that this job as Research Director makes me want to do more than sashay down a corridor like Peggy Olson.


I’m writing this to encourage people to come and feel the same excitement. The Tow Center is a very special place. Through the generosity of the Knight Foundation and with Emily Bell’s leadership, there has been some incredible research published over the past three years. It has focused on the ways in which technology is changing journalism, both the ways in which news is produced and how audiences are consuming that news. If you haven’t had a look, browse our publications. You will find all sorts of gems on the ways in which algorithms are impacting the news we consume, how misinformation goes viral, the evolving use of sensors in journalism, and an in-depth look at how newsrooms understand metrics (that research is going to be launched tonight. You can join us in person or via the livestream during the event.)

Our aim is to commission research that will benefit the industry, as well as add to a wider understanding of what is happening in the ‘journalism space’ at this crucial period. But we’re always thinking about how we can look forward. It would be all too easy to see something interesting today, only to find that by the time we had commissioned the research, and someone had completed the data collection and written up the results, it felt very out of date, because change is happening at this breakneck speed.

As I struggled with this idea today I started day-dreaming about having some sort of standby research taskforce so that when Paul Lewis starts periscoping from Baltimore, a red light goes off, and somewhere around the world, someone drags themselves out of bed in their pyjamas and starts researching. You don’t need to have a crystal ball to know that I’m going to want to press that red button the first time that New York Times content is published within Facebook, or someone captures a breaking news event on their Apple Watch, or we discover that the most shared story of 2016 was actually written by a robot.

So if you are a curious journalist with some time to spare, an academic who needs some additional support for data collection, a policy person, or simply a smart interested news junkie, please get in touch. The deadline for this round of proposals has just passed, and this week is all about reading a number of fine-looking project plans.

But I write this to sow some seeds. The next deadline will be October 15, 2015. If you’re interested, take a look here, as we outline our four main areas of research, and please do get in touch if you fancy a chat about any ideas you might have. I’d love to hear from you. It really is such an interesting time to be working/thinking about journalism. Come and be part of it.

Key points from “The Traffic Factories”

Audience metrics have become ubiquitous in news organizations, but there has been little empirical research on how the data is produced or how it affects newsroom culture and journalists’ daily work. The Tow Center sought to understand how the use of metrics changes reporters’ behavior and what this means for journalism. Thus, researcher Caitlin Petre conducted an ethnographic analysis of the role of metrics in journalism, focusing on three case studies: Chartbeat, a dominant metrics vendor; Gawker Media, a newsroom intently focused on metrics; and The New York Times, a legacy news outlet where metrics are currently more peripheral. Petre offers the following key points based on her findings. The full report is online here, and available as PDF and ebook downloads here.

  • Metrics exert a powerful influence over journalists’ emotions and morale.
    Metrics inspire a range of strong feelings in journalists, such as excitement, anxiety, self-doubt, triumph, competition, and demoralization. When devising internal policies for the use of metrics, newsroom managers should consider the potential effects of traffic data not only on editorial content, but also on editorial workers.
  • Traffic-based rankings can drown out other forms of evaluation.
    It is not uncommon for journalists to become fixated on metrics that rank them or their stories, even if these are not the sole criteria by which they are evaluated. Once rankings have a prominent place on a newsroom wall or website, it can be difficult to limit their influence.
  • News organizations can benefit from big-picture, strategic thinking about analytics.
    Most journalists are too busy with their daily assignments to think extensively or abstractly about the role of metrics in their organization, or which metrics best complement their journalistic goals. As a result, they tend to consult, interpret, and use metrics in an ad hoc way. But this data is simply too powerful to implement on the fly. Newsrooms should create opportunities—whether internally or by partnering with outside researchers—for reflective, deliberate thinking removed from daily production pressures about how best to use analytics.
  • When a news organization is choosing an analytics service, it should consider the business model and the values of the vendor.
    We have a tendency to see numbers, and by extension analytics dashboards, as authoritative and dispassionate reflections of the empirical world. When selecting an analytics service, however, it’s important to remember that analytics companies have their own business imperatives. Newsroom managers should consider which analytics companies’ values, branding strategy, and strategic objectives best align with their own goals.
  • Not everything can — or should — be counted.
    Efforts to improve audience analytics and to measure the impact of news are important and worthwhile. But newsrooms, analytics companies, funders, and media researchers might consider how some of journalism’s most compelling and indispensable traits, such as its social mission, are not easily measured. At a time when data analytics are increasingly valorized, we must take care not to equate what is quantifiable with what is valuable.

The full report is online here, and available as PDF and ebook downloads here.

CJS Showcase 2015

Columbia Journalism Innovation Showcase

The Columbia Journalism School Showcase is an open house event for students and researchers to share their work with professional journalists, industry partners, entrepreneurs, technologists, academics, and the public.

The showcase will feature data visualization, computational journalism, video and audio storytelling and research. Selected projects will be installed and presented in the Brown Institute for Media Innovation and the lobby of Pulitzer Hall on Tuesday, May 12th, from 6-9 pm, accompanied by an evening reception.

The event creates a space for students to forge connections with those in the industry, and for media professionals to learn about some of the creative and cutting-edge ideas informing student work and research at the school. Join us for an evening of conversation and drinks as we celebrate a diverse body of work. RSVP

Brown Institute Projects


Lenses is a new open-source tool that lets anyone build and transform interactive graphics for mobile audiences. Lenses empowers people to explore open data without any programming skills. Unlike existing data visualization platforms, it is open-source and extensible, meaning that additional features can be added by its users, and the potential of the tool grows as more people use it.  Each data visualization created in Lenses preserves the steps taken to create it, enabling new users to learn how to make sophisticated graphs by seeing how more advanced users have produced visualizations. This project is funded by a NYC Media Lab seed grant in partnership with News Corporation and the Integrated Digital Media program at NYU Polytechnic School of Engineering.


Reframe Iran

Alexandra Glorioso | Joao Inada | Matteo Lonardi | Matt Yu

Journalists can glean remarkable insights into the social and cultural tensions of a region by studying the lives and experiences of its artists.  These insights are particularly important in countries whose cultures have been misconstrued by traditional reporting in mainstream media. Built on this notion, Reframe Iran will present 40 profiles of Iranian artists living both in Iran and abroad, using text, photo, and the innovative medium of immersive video.


Science Surveyor

Marguerite Holloway | Laura Kurgan | Juan Francisco Saldarriaga | Dennis Tenen

One of the biggest challenges facing science journalists is quickly contextualizing the journal articles they are reporting on under deadline. Science Surveyor is a tool that can help science journalists and others rapidly and effectively characterize the scientific literature for any topic by providing a contextual consensus, a timeline of publications surrounding the topic, and categorized funding.


Tow Center Projects

Documentary in Virtual Reality 

Fergus Pitt | Taylor Owen | PBS FRONTLINE | Secret Location

For this in-progress work exploring the narrative opportunities of virtual reality, the Tow Center partnered with the prestigious documentary program FRONTLINE to send film director Dan Edge to West Africa, recording the people and locations that were fundamental to Ebola’s spread. The rough cut available at this showcase will give audiences an early glimpse at the virtual reality platform.


NewsLynx: The Impact Platform

Michael Keller | Brian Abelson

The NewsLynx team has built a platform for tracking what happens after journalists publish their stories. Journalistic impact is a topic of crucial interest to the industry right now, as funding models change, and new management techniques come to the fore.


Student Work 

Pill Puzzle: HIV Wonder Drug Stirs Hope and Unease

Yasmin Nouh | @YasminNouh


Last June, Governor Cuomo announced a plan to end AIDS in New York. One of the initiative’s key priorities was to increase access to Truvada. Although the drug was federally approved in 2012 as an HIV prevention treatment, uptake remains remarkably slow. Many people still don’t know about the drug and concerns over controversy, stigma, high costs and long-term side effects have stymied uptake.

Clinics based in poorer areas of New York City with greater rates of the disease, in particular, face significant challenges educating their communities about the drug. High rates of illiteracy and low rates of insurance along with pre-existing stigmas of HIV and homosexuality are to blame. Thus the people who can potentially most benefit from the drug are least likely to take it.


HSLDA: Homeschooling’s Guard Dog

Jessica Huseman | @JessicaHuseman



This project takes an 8-month look at homeschooling regulations across the country, revealing gaping holes in the nation’s homeschooling laws. While public school students may be flagged if they are chronically truant, in most states home-schooled children may be illiterate, suffer from an acute medical condition, or endure abuse and no one would notice.

The lack of regulation has come about after two decades of scorched-earth style lobbying by the Home School Legal Defense Association, a small but fierce advocacy group based in Purcellville, Virginia, and founded by lawyer and ordained Baptist minister Michael Farris. Despite representing only about 15 percent of the nation’s homeschooling population, the HSLDA’s tactics have given it a leading role in blocking or dismantling most attempts to regulate homeschooling nationwide in the last 20 years.

The result is a patchwork of laws that make little sense from one state to the next: While one state may require parents to submit quarterly reports with detailed grading, parents in a neighboring state may not even have to inform the state they are homeschooling. Today, many states lack the legal means to know such basic information as whether a homeschooled child even exists, let alone is being properly educated. Only three states have the legal authority to prevent parents previously convicted of abuse from homeschooling — and even those laws are severely limited. And it’s illegal for most states to request proof that homeschooled students are meeting basic educational standards.


Elder Abuse: The Hidden Crisis

Kate Cox | @thekatecox


In this country, as many as five million people over the age of 60 suffer abuse every year. Family members are often the abusers, and people who are incapacitated or cognitively impaired are especially vulnerable. The problem is generations old, but it hasn’t always been considered what it now is — a crime. That’s partly because elder abuse can be hard to detect, and only about one in every 23 cases is ever reported.

In a little more than fifteen years, one in five people in the United States will be over age 65. That means more elders in need of care, more caretakers under strain and a whole lot more potential for elder abuse. At the same time as baby boomers are redefining what it means to enter the third wave of our lives, they are becoming increasingly vulnerable to abuse. It is a phenomenon advocates call “the hidden crisis.”


The Trial: When foreign workers are injured in America’s wars, who pays the price?

J.P. Lawrence | @jplawrence3


During America’s wars in Iraq and Afghanistan, 20,000 Ugandan guards were contracted to U.S. Army bases in Iraq. As told in Sarah Stillman’s New Yorker article, these contractors formed an invisible army supporting U.S. military efforts there. “The Trial” is about Charles Mbule, a Ugandan guard who was injured on an American base in Iraq, and his long legal struggle as he sued for compensation for his injuries. Charles was shot on his second day in Iraq and had to confront a massive and faceless bureaucracy, one that means well but is completely unsuited for today’s environment.


The Making of Snapchat

Gurman Bhatia | @GurmanBhatia



On March 11, 2015, Alibaba was said to be investing in Snapchat at a $15 billion valuation. In February 2015, rumours put Snapchat’s valuation at $19 billion. These valuations come even though, until January this year, Snapchat had no source of revenue at all. Created by millennials, for millennials, the mobile-based social network became the fastest growing mobile app of 2014, according to a study by the Global Web Index. Shortly after that study came out, the company launched Snapchat Discover, its first step towards revenue.

Amidst sky-high valuations, the company is putting down the first steps for structuring a business model. This project tells the story of the brand within a Snapchat-like interface. Done as an individual project for an Interactive design and storytelling class at Columbia Journalism School, it uses a mix of pngs and gifs to tell that story in “snaps”.

“I’m in 7th grade with Tourette syndrome and my school thinks I have an attitude problem.”

Michelle Inaba | @michelleinaba



Joseph Pizarro suffers from a mild case of Tourette’s, but his co-existing conditions make it really hard for him at school. Teachers and administrators don’t understand his case, as he doesn’t fit the stereotype of Tourette’s, and his co-existing conditions – OCD and ADHD – are invisible to the naked eye. At the time of reporting, he was failing four classes at school because the teachers and the administration believed he had a behavior problem.

The subject of a child with Tourette Syndrome in NYC public schools is such a taboo that nobody in the administration wanted to talk about it. The schools didn’t want to comment on it, the school districts didn’t want to comment, and not even the press office at the DOE wanted to comment on it. It took a months-long struggle before an official from the DOE agreed to talk on background. It is estimated that 300,000 children suffer from Tourette Syndrome in the United States, and many of them go through the same struggles as Joseph.


Transform Or Move to Poland

Polish Businesses in Greenpoint Fall to Gentrification

Joanna Socha | @joa_soc


For the many Poles who have lived in the old Greenpoint in Brooklyn, the neighborhood was a place where Polish culture flourished, where one could taste Polish bread, where real Polish folk music was played, and where blue-collar Polish immigrants worked hard to support their families. Some were building small businesses, helping to grow the Greenpoint economy. For years, the Greenpoint neighborhood was a mecca for Polish immigrants leaving their home country for economic or political reasons.  It was one of the biggest “Little Polands” in the U.S., which enriched New York’s multicultural flavor. Now, that is changing.

New York’s Little Poland is slowly but steadily disappearing. Many Poles and Polish businesses are moving out to other neighborhoods or states, or even returning to Poland. But as the Polish small businesses move out, or transform, some of the children of Polish immigrants assimilate into the American environment and create new businesses, often original and innovative.


Born Into This

Sean Ryon | @seanryon89 | Lea Zora Claxton Scruggs | @LeaScruggs


Born Into This is an immigrant father and son’s American Dream told through one of the most violent and difficult professional sports. Junior “Sugar Boy” Younan is a 19-year-old Super Middleweight boxer from Brooklyn, New York. His father Sherif, 46, has been his trainer his entire life. Now, after 14 years of personal strife and physical adversity, Junior and Sherif are starting to live out their goals of making it big in the fight game. However, their greatest challenge still lies ahead: surviving the unforgiving business of boxing without sacrificing their family bond. Is Sherif helping his son succeed as a professional boxer, or pushing him too far?


Blog: The Responsive Cities Initiative

On Tuesday, April 28, Tow Fellow Susan Crawford hosted a panel of civic tech advocates to launch a white paper titled “The Responsive Cities Initiative: What a University Could Do to Help.” The white paper was generated from a series of meetings hosted by the Tow Center in the fall of 2014, which gathered together government officials, fiber internet access builders, journalists, and others to discuss how a university center could help governments leverage technology to better serve their citizens.

Crawford opened with a call to action: the United States is destined to fall behind other countries unless civic leaders begin to value and implement fiber Internet access. Having fiber Internet access is as different from not having it as having electricity was from not having electricity. With fiber, city governments could better serve their constituents in a myriad of ways. Through better data, increased transparency, and innovative services, the end goal of fiber matches the ultimate goal of the fiber advocates, journalists, and city officials: to improve lives in communities.

Lev Gonick (CEO of OneCommunity), Brett Goldstein (Fellow in Urban Science, U. of Chicago, Board Member of CFA), Elin Katz (Consumer Counsel for State of Connecticut), and Oliver Wise (Director of the Office of Performance and Accountability for the City of New Orleans) joined Susan to offer their thoughts on how a university center could help, what is useful about the cross-disciplinary work that the university center will focus on, and the toughest problems that the country faces that could be addressed by the responsive city approach.

Gonick explained that there are four models for what a university center could do: creating a service model in which universities can discover fiber competencies, figuring out a faculty engagement strategy, generating an action agenda for students, and combining industry and philanthropy with sponsored engagement with industry partnerships. “We are facing real issues in our community, and how we attack them with fiber, with sensors, with data, and with representations of data is really the stuff that universities and policymakers need to be able to do together.”

Wise believes that the hardest problems facing our cities and country are resilience, income inequality, and trust in government – all of which can be ameliorated with data science and increased transparency. He adds that data can help package and deliver to citizens services that they actually need. “When you’re in local government, you’re focused on the here and now, and addressing the needs of your citizens.”

Goldstein said that the problem that the government needs to solve is silos: government is built from a very functional perspective, but the problems that cities face today are actually cross-disciplinary. For example, crime isn’t just a police problem, it’s about everything from economics to schooling to garbage pickup. We need to accept the technology footprint that we have and the data from silos, but we need to figure out a way to make sense of enormous amounts of data to solve problems.

Katz shared that in Connecticut, they have worked to create public-private relationships. It has helped in that the towns are looking for data, and need to explain to citizens why it’s important. At UConn, a group of business students is studying how fiber impacts communities. She also justifies investment in fiber Internet access on the grounds that young people are growing up with the Internet at their fingertips, and as they grow up they expect fast, available Internet access and a state that is future-oriented. The towns that she is working with are dedicated to providing an open-access network, and touching every home in the community – an example that other towns and cities would do well to follow.

What emerged from the conversation was the power of cities, and the need for more cross-disciplinary cooperation. A university center that sits at the crossroads of these concerns could create a diverse talent pool. It could also serve as a research center for how a city government can effectively wield technology to build stronger communities. The paper and the conversations that preceded it will provide a springboard for such a center to create more trust between communities and governments and improve cities as a whole.


Towards a Standard for Algorithmic Transparency in the Media

Last week, on April 21st, Facebook announced a few updates to its algorithmically curated news feed. The changes were described as motivated by “improving the experience” and making the “balance of content the right one” for each individual. And while a post with some vague information is better than the kind of vapid corporate rhetoric Jay Rosen recently complained about, we need to develop more sophisticated algorithmic transparency standards than a blog post along the lines of, essentially, “well, it’ll be a little less of this, and a little more of that.”

That’s why, last month at the Tow Center for Digital Journalism, we convened about fifty heavy hitters from the media industry and from academia to talk about algorithmic transparency in the media. The goal of the workshop was to discuss and work towards ideas to support a robust policy of news and information stewardship via algorithm. To warm things up we heard seven amazing five-minute “firestarter” talks with provocative titles like “Are Robots Too Liberal?”, “Accessible Modeling” and “An Ethical Checklist for Robot Journalism”. The videos are all now online for your viewing pleasure.

But the bulk of the workshop was spent delving into three case studies where we see algorithms operating in the public media sphere: “Automatically Generated News Content”, “Simulation, Prediction, and Modeling in Storytelling”, and “Algorithmically Enhanced Curation”. Participants broke out into groups and spent an hour discussing each of these with a case study facilitator. They brainstormed dimensions of the various algorithms in use that could be disclosed relating to how they work or are employed. They evaluated these dimensions on whether they would be feasible, technically or financially, and what the expected impact and significance to the public would be. And they confronted dilemmas like how and whether the algorithm could be gamed.

Based on the numerous ideas generated at the workshop, I boiled things down into five broad categories of disclosable information, including:

Human Involvement

There is a lot of interest in understanding the human component of how algorithms are designed, how they evolve and are adjusted over time, and how they are kept in operation. Facebook and Google: we know there are people behind your algorithm! At a high level, transparency here might involve explaining the goal, purpose, and intent of the algorithm, including editorial goals and the human editorial process or social context crucible from which the algorithm was cast. Who at your company has direct control over the algorithm, who has oversight and is accountable? Ultimately we want to know who are the authors, or the designers, or the team that created this thing. Who is behind these algorithms?

More specifically, for an automatically written story, this type of transparency might include explaining if there were bits that were written by a person, and if so which bits, as well as if the whole thing was reviewed by a human editor before being published. For algorithmic curation this would include disclosing what the algorithm is optimizing for, as well as rationale for the various curation criteria. Are there any hard-coded rules or editorial decisions in the system?


The Data

Algorithmic systems often have a big appetite for data, without which they couldn’t do any fancy machine learning, make personalization decisions, or have the raw material for things like automatically written stories. There are many opportunities to be transparent about the data that are driving algorithms in various ways. One opportunity for transparency here is to communicate the quality of the data, including its accuracy, completeness, and uncertainty, as well as its timeliness, magnitude (when training a model), and assumptions or other limitations. But there are other dimensions of data processing that can also be made transparent such as how it was collected, transformed, vetted, and edited (either automatically or by human hands). Some disclosure could be made about whether the data was private or public, and if it incorporated dimensions that if disclosed would have personal privacy implications. Finally, in the case of automatically written text, it would be interesting to show the connection between a given chunk of text and the underlying data that contributed to it.

The Model

Modeling involves building a simplified microcosm of some system using data and a method that predicts, ranks, associates, or classifies. This really gets into the nuts and bolts, with many potential avenues for transparency. Of high importance is knowing what the model actually uses as input: what are the features or variables used in the algorithm? Oftentimes those features are weighted: what are those weights? If there was training data used in some machine learning process: characterize the data used for that along all of the potential dimensions enumerated above. Since some software modeling tools have different assumptions or limitations: what were the tools used to do the modeling?

Of course this all ties back into human involvement as well, so we want to know the rationale for weightings and the design process for considering alternative models or model comparisons. What are the assumptions (statistical or otherwise) behind the model and where did those assumptions arise from? And if some aspect of the model was not exposed in the front-end, why was that?
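To make the idea of disclosing features and weights concrete, here is a minimal sketch of a weighted scoring model that can report how much each input contributed to an item's score. The feature names and weights are hypothetical, invented purely for illustration; they do not describe any real news feed or system discussed at the workshop.

```python
# Hypothetical features and weights, invented for illustration only.
WEIGHTS = {"recency": 0.5, "friend_affinity": 0.3, "click_rate": 0.2}


def score(features):
    """Return the weighted sum of feature values; missing features count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())


def explain(features):
    """Return each feature's contribution to the score, largest first."""
    contributions = {
        name: w * features.get(name, 0.0) for name, w in WEIGHTS.items()
    }
    return sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)


# Example item with made-up feature values.
item = {"recency": 1.0, "friend_affinity": 0.5, "click_rate": 0.0}
print(score(item))
for name, contribution in explain(item):
    print(f"{name}: {contribution:.2f}")
```

Publishing something like the `WEIGHTS` table, along with the rationale for each value, would cover the inputs and weightings; real models are rarely this simple, so the sketch only illustrates the shape such a disclosure could take.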


The Inference

Algorithms often make inferences, such as classifications or predictions, leaving us with questions about the accuracy of these techniques and of the implications of possible errors. Algorithm creators might consider benchmarking the inferences in their algorithms against standard datasets and with standard measures of accuracy to disclose some key statistics. What is the margin of error? What is the accuracy rate, and how many false positives versus false negatives are there? What kinds of steps are taken to remediate known errors? Are errors a result of human involvement, data inputs, or the algorithm itself? Classifiers oftentimes produce a confidence value and this too could be disclosed in aggregate to show the average range of those confidence values as a measure of uncertainty in the outcomes. The disclosure of uncertainty information would seem to be a key factor, though also a fraught one. What are the implications of employing a classifier that you disclose to be accurate only 80% of the time?
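As a rough illustration of the statistics mentioned above, the following sketch computes an accuracy rate, false positive and false negative counts and rates, and an aggregate summary of confidence values for a binary classifier. The function name `error_report` and the sample labels and confidences are assumptions made up for this example, not a standard or a method proposed at the workshop.

```python
def error_report(y_true, y_pred, confidences):
    """Summarize binary classifier errors and confidence for disclosure."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    if not confidences:
        raise ValueError("confidences must be non-empty")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        "false_positives": fp,
        "false_negatives": fn,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
        "mean_confidence": sum(confidences) / len(confidences),
        "confidence_range": (min(confidences), max(confidences)),
    }


# Made-up labels and confidence values, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]
confidences = [0.9, 0.8, 0.6, 0.7, 0.9, 0.55, 0.8, 0.95, 0.85, 0.7]

report = error_report(y_true, y_pred, confidences)
print(report["accuracy"])  # 0.8
print(report["false_positives"], report["false_negatives"])  # 1 1
```

In this toy example the classifier is right 80% of the time, with one false positive and one false negative. Reporting the two error types separately matters because the consequences of each can differ considerably for the people affected.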

Personalization, Visibility, and the Algorithmic Presence

Throughout the discussions there was a lot of interest in knowing if and when algorithms are being employed, in particular when personalization may be in use, but also just to know for instance if A/B testing is being employed. One participant put it as a question: “Am I being watched?” If personalization is in play, then what types of personal information are being used and what is the personal profile of the individual that is driving the personalization? Essentially, people want to know what the algorithm knows about them. But there are also questions of visibility, which implies maintaining access to elements of a curation that have been filtered out in some way. What are the things you’re not seeing, and conversely what are the things that you’re posting (e.g. in a news feed) that other people aren’t seeing? These comments are about having a viewpoint into an algorithmic curation that differs from your own personalized version, so that you can compare and contrast the two. There was also an interest in having algorithmic transparency for the rationale of why you’re seeing something in your feed. What exactly caused an element to be included?


So, there’s your laundry list of things that we could potentially be transparent about. But the workshop was also about trying to evaluate the feasibility of transparency for some of these dimensions. And that was incredibly hard. There are several stakeholders here with poorly aligned motivations. Why would media organizations voluntarily provide algorithmic transparency?  The end game here is about accountability to the public, and transparency is just one potential avenue towards that. But what does ethical accountability even look like in a system that relies on algorithmic decision making?

We can’t really ever expect corporate entities to voluntarily disclose information that makes them look bad. If the threat of regulation were there they might take some actions to get the regulators off their backs. But what is really the value proposition for the organization to self-motivate and disclose such information? What’s really at stake for them, or for users for that matter? Credibility and legitimacy were proffered, yet we need more research here to measure how algorithmic transparency might actually affect these attributes, and to what extent. To be most persuasive, the value proposition perhaps needs to be made as salient as: you will lose income or users, or some other direct business metric will be negatively impacted unless you disclose X, Y, and Z.

Users will likely be interested in more details when something is at odds, or something goes wrong, like a salient error. The dimensions enumerated above could be a starting point for the range of things that could be disclosed in the event of user demand for more information. Users are likely to care most when they themselves are the error, like if they were censored incorrectly (e.g. in a false positive category). If corporations were transparent with predictions about individuals, and had standards for due process in the face of a false positive event, then this would not only empower users by allowing them to correct the error, but also provide feedback data that improves the algorithm in the future. This idea is perhaps the most promising for aligning the motivations between individuals and corporate actors. Corporations want more and better data for training their algorithms. Transparency would allow the users that care most to find and correct errors, which is good for the user, and for the company because they now have better training data.

There was no consensus that there is a clear and present general demand from users for algorithmic transparency. But this is challenging, since many users don’t know what they don’t know. Many people may ultimately simply not care, but others will, and this raises the challenge of trying to meet the needs of many publics, while not polluting the user experience with a surfeit of information for the uninterested. We need more research here too, along several dimensions: to understand what really matters to users about their consumption of algorithmically-created content, but also to develop non-intrusive ways of signaling to those that do care.

Organizations might consider different mechanisms for communicating algorithmic transparency. The notion of an algorithm ombudsperson could help raise awareness and assuage fears in the face of errors. Or, we might develop new and better user interfaces that address transparency at different levels of detail and user interest. Finally, we might experiment with the idea of an “Algorithmic Transparency Report” that would routinely disclose aspects of the five dimensions enumerated above. But what feels less productive are the vague blurbs that Facebook and others have been posting. I hope the outcome of this workshop at least gets us all on the path towards thinking more critically about whether and how we need algorithmic transparency in the media.

Nicholas Diakopoulos is an Assistant Professor at the University of Maryland, College Park College of Journalism and a member of the UMD Human Computer Interaction Lab (HCIL).