Computational Campaign Coverage
Algorithms for automatically generating stories from machine-readable data have been shaking up the news industry, not least since the Associated Press started to automate the production and publication of quarterly earnings reports in 2014. Once developed, such algorithms can create an unlimited number of news stories for a routine and repetitive topic, and can do so faster, cheaper, and with fewer errors than any human journalist ever could.
Within the “Computational Campaign Coverage” research project, researchers teamed up with the Germany-based software company AX Semantics to develop automated news based on forecasting data for the 2016 U.S. presidential election. The data was provided by the PollyVote research project, which also hosted the platform on which the resulting texts were published. The process of generating the news was completely automated, from collecting and aggregating the forecasting data, to exchanging the data with AX Semantics and generating the texts, to publishing those texts at pollyvote.com. Over the course of the project, nearly 22,000 automated news articles were published in English and German.
The project built on prior work published in the “Guide to Automated Journalism,” which provided an overview of the state of automated journalism based on interviews and a review of the literature. The goal of the “Computational Campaign Coverage” project was to conduct our own primary research and gain firsthand experience with the potential and limitations of automated journalism. The project’s key learnings can be summarized as follows:
- Multilingual texts, as well as texts based on a single row in the dataset, are easy to automate.
- Adding further insights quickly increases complexity to a level that is difficult to manage.
- Because the process was fully automated, the rate of errors in the final texts was high.
- Most errors were caused by errors in the source data.
- Efforts for quality control, troubleshooting, and onboarding were higher than expected.
- It is difficult to develop a “one-size-fits-all” algorithm for different story types.
- Contextual knowledge is a boundary of automation that is reached quickly.
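
To make the first learning and the point about source-data errors more concrete, the following is a minimal sketch of template-based text generation from a single data row, in English and German. It is not the project's or AX Semantics' actual implementation. The `ForecastRow` fields, the template wording, the `validate` and `render` helpers, and the placeholder candidate names are all hypothetical, and the sketch assumes the vote share is a two-candidate share, so that 50 percent marks a tie.

```python
from dataclasses import dataclass


@dataclass
class ForecastRow:
    """One hypothetical row of forecasting data (field names are assumptions)."""
    date: str
    candidate: str
    opponent: str
    vote_share: float  # assumed two-candidate vote share, in percent


# Hypothetical sentence templates, one per output language.
TEMPLATES = {
    "en": "As of {date}, the forecast puts {candidate} at {share} percent "
          "of the vote, {relation} {opponent}.",
    "de": "Stand {date} sieht die Prognose {candidate} bei {share} Prozent "
          "der Stimmen, {relation} {opponent}.",
}

# Language-specific wording for the relation between the two candidates.
RELATIONS = {
    "en": {"ahead": "ahead of", "behind": "behind", "tied": "level with"},
    "de": {"ahead": "vor", "behind": "hinter", "tied": "gleichauf mit"},
}


def validate(row: ForecastRow) -> None:
    """Reject implausible rows so bad source data never reaches a published text."""
    if not 0.0 <= row.vote_share <= 100.0:
        raise ValueError(f"vote share out of range: {row.vote_share}")
    if not row.candidate or not row.opponent:
        raise ValueError("missing candidate name")


def render(row: ForecastRow, lang: str = "en") -> str:
    """Turn one validated data row into one sentence in the requested language."""
    validate(row)
    if row.vote_share > 50.0:
        key = "ahead"
    elif row.vote_share < 50.0:
        key = "behind"
    else:
        key = "tied"
    share = f"{row.vote_share:.1f}"
    if lang == "de":
        share = share.replace(".", ",")  # German uses a decimal comma
    return TEMPLATES[lang].format(
        date=row.date,
        candidate=row.candidate,
        opponent=row.opponent,
        share=share,
        relation=RELATIONS[lang][key],
    )


if __name__ == "__main__":
    row = ForecastRow("2016-11-07", "Candidate A", "Candidate B", 52.5)
    print(render(row, "en"))
    print(render(row, "de"))
```

In a sketch like this, adding a language only requires another template and relation table, which fits the observation that multilingual, single-row texts are easy to automate. Statements that compare several rows, such as a trend over time, would need extra logic for every story type. The `validate` step stands in for the kind of check a fully automated pipeline needs, since nothing else sits between a faulty data row and the published text.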
In addition to developing automated news, the project team also conducted an online experiment to study how news consumers perceive the quality of the generated texts (specifically regarding their credibility and readability) and how these quality perceptions depend on various levels of algorithmic transparency.