Towards a Standard for Algorithmic Transparency in the Media
Last week, on April 21st, Facebook announced a few updates to its algorithmically curated news feed. The changes were described as motivated by “improving the experience” and making the “balance of content the right one” for each individual. And while a post with some vague information is better than the kind of vapid corporate rhetoric Jay Rosen recently complained about, we need to develop more sophisticated algorithmic transparency standards than a blog post along the lines of, essentially, “well, it’ll be a little less of this, and a little more of that.”
That’s why, last month at the Tow Center for Digital Journalism, we convened about fifty heavy hitters from the media industry and from academia to talk about algorithmic transparency in the media. The goal of the workshop was to discuss and work towards ideas to support a robust policy of news and information stewardship via algorithm. To warm things up we heard seven amazing five-minute “firestarter” talks with provocative titles like “Are Robots Too Liberal?”, “Accessible Modeling” and “An Ethical Checklist for Robot Journalism”. The videos are all now online for your viewing pleasure.
But the bulk of the workshop was spent delving into three case studies where we see algorithms operating in the public media sphere: “Automatically Generated News Content”, “Simulation, Prediction, and Modeling in Storytelling”, and “Algorithmically Enhanced Curation”. Participants broke out into groups and spent an hour discussing each of these with a case study facilitator. They brainstormed dimensions of the various algorithms in use that could be disclosed relating to how they work or are employed. They evaluated these dimensions on whether they would be feasible, technically or financially, and what the expected impact and significance to the public would be. And they confronted dilemmas such as how and whether the algorithm could be gamed.
Based on the numerous ideas generated at the workshop, I boiled things down into five broad categories of disclosable information:
Human Involvement
There is a lot of interest in understanding the human component of how algorithms are designed, how they evolve and are adjusted over time, and how they are kept in operation. Facebook and Google: we know there are people behind your algorithms! At a high level, transparency here might involve explaining the goal, purpose, and intent of the algorithm, including editorial goals and the human editorial process or social context crucible from which the algorithm was cast. Who at your company has direct control over the algorithm? Who has oversight, and who is accountable? Ultimately we want to know the authors, the designers, the team that created this thing. Who is behind these algorithms?
More specifically, for an automatically written story, this type of transparency might include explaining whether any portions were written by a person, and if so which ones, as well as whether the whole thing was reviewed by a human editor before being published. For algorithmic curation this would include disclosing what the algorithm is optimizing for, as well as the rationale for the various curation criteria. Are there any hard-coded rules or editorial decisions in the system?
Data
Algorithmic systems often have a big appetite for data, without which they couldn’t do any fancy machine learning, make personalization decisions, or have the raw material for things like automatically written stories. There are many opportunities to be transparent about the data that are driving algorithms in various ways. One opportunity for transparency here is to communicate the quality of the data, including its accuracy, completeness, and uncertainty, as well as its timeliness, magnitude (when training a model), and assumptions or other limitations. But there are other dimensions of data processing that can also be made transparent, such as how it was collected, transformed, vetted, and edited (either automatically or by human hands). Some disclosure could be made about whether the data was private or public, and whether it incorporated dimensions that, if disclosed, would have personal privacy implications. Finally, in the case of automatically written text, it would be interesting to show the connection between a given chunk of text and the underlying data that contributed to it.
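As a minimal sketch, a couple of the data-quality dimensions mentioned above (completeness and timeliness) could be computed and disclosed quite simply; the dataset here is invented for illustration:

```python
# Toy dataset: records with a possibly-missing "value" field and an
# ISO-formatted update date. All values here are hypothetical.
records = [
    {"value": 3.2, "updated": "2015-04-20"},
    {"value": None, "updated": "2015-04-21"},
    {"value": 1.8, "updated": "2015-04-21"},
    {"value": 2.4, "updated": "2015-03-01"},
]

# Completeness: fraction of records with a non-missing value.
completeness = sum(r["value"] is not None for r in records) / len(records)

# Timeliness: date of the most recent update (ISO strings sort correctly).
latest = max(r["updated"] for r in records)

print(f"completeness={completeness:.0%}, most recent update={latest}")
```

Even this crude level of disclosure would tell a reader how much of the underlying data is actually present and how fresh it is.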
The Model
Modeling involves building a simplified microcosm of some system using data and a method that predicts, ranks, associates, or classifies. This really gets into the nuts and bolts, with many potential avenues for transparency. Of high importance is knowing what the model actually uses as input: what are the features or variables used in the algorithm? Oftentimes those features are weighted: what are those weights? If there was training data used in some machine learning process: characterize the data used for that along all of the potential dimensions enumerated above. Since some software modeling tools have different assumptions or limitations: what were the tools used to do the modeling?
Of course this all ties back into human involvement as well, so we want to know the rationale for weightings and the design process for considering alternative models or model comparisons. What are the assumptions (statistical or otherwise) behind the model and where did those assumptions arise from? And if some aspect of the model was not exposed in the front-end, why was that?
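To make the features-and-weights question concrete, here is a minimal, hypothetical sketch of what disclosing a simple weighted ranking model might look like; the feature names and weights are invented for illustration and do not describe any real system:

```python
# Hypothetical disclosure of a simple weighted news-feed ranking model.
# Feature names and weights are invented for illustration only.
FEATURE_WEIGHTS = {
    "recency_hours": -0.5,        # older items score lower
    "friend_interactions": 2.0,   # engagement from friends boosts rank
    "content_type_video": 1.2,    # hard-coded preference for video
    "reported_as_spam": -5.0,     # hard-coded editorial rule
}

def score(item_features):
    """Rank score = weighted sum of the disclosed features."""
    return sum(FEATURE_WEIGHTS[name] * value
               for name, value in item_features.items()
               if name in FEATURE_WEIGHTS)

print(score({"recency_hours": 2, "friend_interactions": 3}))  # -> 5.0
```

Publishing even a table like `FEATURE_WEIGHTS`, together with the rationale for each weight, would answer most of the questions above without exposing the full system.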
Inference
Algorithms often make inferences, such as classifications or predictions, leaving us with questions about the accuracy of these techniques and of the implications of possible errors. Algorithm creators might consider benchmarking the inferences in their algorithms against standard datasets and with standard measures of accuracy to disclose some key statistics. What is the margin of error? What is the accuracy rate, and how many false positives versus false negatives are there? What kinds of steps are taken to remediate known errors? Are errors a result of human involvement, data inputs, or the algorithm itself? Classifiers oftentimes produce a confidence value and this too could be disclosed in aggregate to show the average range of those confidence values as a measure of uncertainty in the outcomes. The disclosure of uncertainty information would seem to be a key factor, though also a fraught one. What are the implications of employing a classifier that you disclose to be accurate only 80% of the time?
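As a sketch, the kinds of benchmark statistics described above (accuracy, false positives and negatives, average confidence) are straightforward to compute and disclose; the labels, predictions, and confidence values below are toy data:

```python
# Toy benchmark for a hypothetical binary classifier.
# 1 = positive class, 0 = negative class; all values are invented.
actual     = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
predicted  = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
confidence = [0.9, 0.6, 0.8, 0.55, 0.95, 0.7, 0.85, 0.9, 0.75, 0.8]

# Accuracy: fraction of predictions matching the ground truth.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# False positives: predicted 1 when the truth was 0 (and vice versa).
false_positives = sum(p == 1 and a == 0 for a, p in zip(actual, predicted))
false_negatives = sum(p == 0 and a == 1 for a, p in zip(actual, predicted))

# Aggregate confidence as a rough measure of uncertainty in the outcomes.
avg_confidence = sum(confidence) / len(confidence)

print(f"accuracy={accuracy:.0%}, FP={false_positives}, "
      f"FN={false_negatives}, mean confidence={avg_confidence:.2f}")
```

Note that on this toy data the classifier is right 80% of the time, which is exactly the kind of number whose disclosure raises the fraught questions above.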
Personalization, Visibility, and the Algorithmic Presence
Throughout the discussions there was a lot of interest in knowing if and when algorithms are being employed, in particular when personalization may be in use, but also just to know, for instance, if A/B testing is being employed. One participant put it as a question: “Am I being watched?” If personalization is in play, then what types of personal information are being used, and what is the personal profile of the individual that is driving the personalization? Essentially, people want to know what the algorithm knows about them. But there are also questions of visibility, which implies maintaining access to elements of a curation that have been filtered out in some way. What are the things you’re not seeing, and conversely, what are the things that you’re posting (e.g. in a news feed) that other people aren’t seeing? These comments are about having a viewpoint into an algorithmic curation different from your own personalized version, so as to compare and contrast the two. There was also an interest in having algorithmic transparency for the rationale of why you’re seeing something in your feed. What exactly caused an element to be included?
So, there’s your laundry list of things that we could potentially be transparent about. But the workshop was also about trying to evaluate the feasibility of transparency for some of these dimensions. And that was incredibly hard. There are several stakeholders here with poorly aligned motivations. Why would media organizations voluntarily provide algorithmic transparency? The end game here is about accountability to the public, and transparency is just one potential avenue towards that. But what does ethical accountability even look like in a system that relies on algorithmic decision making?
We can’t really ever expect corporate entities to voluntarily disclose information that makes them look bad. If the threat of regulation were there they might take some actions to get the regulators off their backs. But what is really the value proposition for the organization to self-motivate and disclose such information? What’s really at stake for them, or for users for that matter? Credibility and legitimacy were proffered, yet we need more research here to measure how algorithmic transparency might actually affect these attributes, and to what extent. To be most compelling, the value proposition perhaps needs to be made as concrete as: you will lose income or users, or some other direct business metric will suffer, unless you disclose X, Y, and Z.
Users will likely be interested in more details when something seems off or goes wrong, like a salient error. The dimensions enumerated above could be a starting point for the range of things that could be disclosed in the event of user demand for more information. Users are likely to care most when they themselves are affected by an error, for instance if they were incorrectly censored (a false positive). If corporations were transparent with predictions about individuals, and had standards for due process in the face of a false positive event, then this would not only empower users by allowing them to correct the error, but also provide feedback data that improves the algorithm in the future. This idea is perhaps the most promising for aligning the motivations between individuals and corporate actors. Corporations want more and better data for training their algorithms. Transparency would allow the users who care most to find and correct errors, which is good for the user, and for the company, because it now has better training data.
There was no consensus that there is a clear and present demand from users for algorithmic transparency. But this is challenging to assess, since many users don’t know what they don’t know. Many people may ultimately simply not care, but others will, and this raises the challenge of trying to meet the needs of many publics while not polluting the user experience with a surfeit of information for the uninterested. We need more research here too, along several dimensions: to understand what really matters to users about their consumption of algorithmically created content, but also to develop non-intrusive ways of signaling to those who do care.
Organizations might consider different mechanisms for communicating algorithmic transparency. The notion of an algorithm ombudsperson could help raise awareness and assuage fears in the face of errors. Or, we might develop new and better user interfaces that address transparency at different levels of detail and user interest. Finally, we might experiment with the idea of an “Algorithmic Transparency Report” that would routinely disclose aspects of the five dimensions enumerated above. But what feels less productive are the vague blurbs that Facebook and others have been posting. I hope the outcome of this workshop at least gets us all on the path towards thinking more critically about whether and how we need algorithmic transparency in the media.
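As a thought experiment, such an “Algorithmic Transparency Report” could start life as nothing more than a structured disclosure covering the five categories above; every field and value in this sketch is hypothetical:

```python
# Hypothetical skeleton for an "Algorithmic Transparency Report".
# Every field and value is invented to illustrate the five categories.
transparency_report = {
    "human_involvement": {
        "purpose": "Rank feed items by predicted relevance",
        "accountable_team": "News Feed Ranking Team",
        "human_review": "Editors review top-ranked items daily",
    },
    "data": {
        "sources": ["user interactions", "publisher metadata"],
        "collection": "logged in-app, updated hourly",
        "known_limitations": "sparse data for new users",
    },
    "model": {
        "type": "weighted linear ranking",
        "features": ["recency", "friend interactions", "content type"],
    },
    "inference": {
        "benchmark_accuracy": 0.80,
        "error_remediation": "user-reported errors reviewed weekly",
    },
    "algorithmic_presence": {
        "personalization": True,
        "ab_testing": True,
        "personal_data_used": ["click history", "friend graph"],
    },
}

# A report is only useful if it covers all five categories.
assert len(transparency_report) == 5
```

Even a skeletal report like this, published routinely, would say more than the vague blurbs we get today.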