The NewsLynx Tool For Impact Tracking: a Walkthrough
As the public sphere grows increasingly complex and the funding sources for journalism change, researchers and working journalists alike are asking “what does our work do in the world?” This question has become all the more pressing as content is filtered and curated online with metrics like “clicks” and “attention minutes,” which are often disconnected from journalistic goals.
Last year [in May], we officially announced NewsLynx, a tool for newsrooms to keep track of their journalism’s impact: what happens in the world after reporters publish their work. NewsLynx aims to help answer this question by combining quantitative metrics with qualitative annotations. Here’s a first look at the platform and some of the design decisions that went into it.
How do you get data into it and how do you organize it?
A key issue in designing an easy-to-use impact-assessment platform is whether to integrate it within a CMS or not. Although building NewsLynx inside a CMS would solve some issues such as how to load in articles or how to encourage staff members to use the platform, choosing which CMS to support would limit the number of newsrooms that could use the platform. It would also mean that the tool would become outdated when a particular CMS falls out of favor.
When you sign up for NewsLynx, it grabs your articles based on RSS feeds that you enter on the Settings page. Not all RSS feeds are built alike, however, so we needed to build a custom RSS reader to deal with the edge cases. We also use Embedly’s excellent content extraction API to fetch each article’s full text. This approach makes getting up and running with NewsLynx relatively painless.
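To make the ingestion step concrete, here is a minimal sketch of reading articles out of an RSS feed using only Python's standard library. It is not NewsLynx's actual reader: the sample feed, the `parse_feed` function and the output field names are all assumptions made for illustration, and the full-text extraction step is left out entirely. It handles one edge case (an item with no link) to show the kind of defensive parsing a custom reader needs.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Newsroom</title>
    <item>
      <title>Fracking permits surge</title>
      <link>https://example.org/fracking-permits</link>
      <pubDate>Mon, 01 Jun 2015 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Untitled draft</title>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return a list of article dicts, skipping items that have no link."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            # Edge case: without a URL there is nothing to fetch later.
            continue
        articles.append({
            "title": (item.findtext("title") or "").strip(),
            "url": link,
            # Kept as the raw string; None when the feed omits the date.
            "published": (item.findtext("pubDate") or "").strip() or None,
        })
    return articles


articles = parse_feed(SAMPLE_FEED)
```

In this sketch only the first item survives, because the second has no `link` element. A production reader would also need to cope with Atom feeds, relative URLs and malformed dates.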
Another important consideration was how best to categorize qualitative and anecdotal indicators of impact. In developing NewsLynx, we have strived to create an impact taxonomy that is capable of accounting for a wide variety of events and flexible enough to adapt to shifting contexts. This is crucial because, without such a taxonomy, it is impossible to make comparisons across different articles, newsrooms, or even industries. In constructing our impact taxonomy, we were heavily inspired by the work of Chalkbeat and the Center for Investigative Reporting (CIR).
Chalkbeat’s system, named MORI, drastically simplifies impact into two categories: “Citations” (is someone talking about it?) and “Change” (did it affect something?). While these categories can contain more specific subcategories such as “story pickup” or “a bill was passed”, this simplification helps them get a sense of the “big picture” by enabling newsroom-level comparisons.
A related strategy might involve quantifying these values somehow. For example, maybe an article that led to a change in the law is worth eight points, an article cited by a politician is worth four (maybe?) and an article that led to a proposed bill and vigorous public debate but failed to become law is worth seven (or maybe nine?). You can see how difficult it becomes to assign these values. It’s similar to the Borges story “On Exactitude in Science,” in which mapmakers pursue one-to-one precision between a map and the land it represents.
“… In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.”
This level of specificity obviates the need for a map, which necessarily must sacrifice some level of detail for usability. In statistical language, overly specific categories are like overfitting your model to your data — it works perfectly for the specific information you’re looking at, but you lose the ability to generalize and compare, which is the point of making a model.
In addition to “Citation” and “Change” we added two more categories:
- “Achievement” such as whether an article won an award or maybe saw record traffic
- “Other” in case we missed something
From CIR, we borrowed the idea that impact can occur at different scales such as to an individual or to an entire institution. In their system, they classify impact into “Micro”, “Meso”, and “Macro” levels. We reimagined these as the five levels below:
- Individual: A single person.
- Community: A group of people, loosely defined by the newsroom.
- Institutional: A government or organization.
- Media: A media organization that republishes, picks up, localizes or cites a newsroom’s work.
- Internal: An article had a strong effect internally, possibly opening the door to future articles on this topic, more organizational support, or the advancement of a staff member’s career (potentially through a promotion or an increase in external attention).
We combined these two concepts such that organizations create an “impact tag” that has a category and a level. This lets newsrooms define their own taxonomy while allowing for broad comparisons across two dimensions: what kind of impact was it (the category) and who did it affect (the level).
Here’s an example configuration:
We also allow for “subject tags,” which are equivalent to the tags that most news organizations currently use. These groupings come in handy when you want to quickly drill down to a specific coverage area or make comparisons across multiple coverage areas.
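One way to picture the taxonomy is as a small data structure in which each newsroom-defined impact tag carries exactly one category and one level. The sketch below is an assumption about how such tags could be modeled, not NewsLynx's actual schema. The `ImpactTag` class, the `count_by` helper and the three example tags are hypothetical; only the category and level names come from the lists above.

```python
from dataclasses import dataclass

# The four categories and five levels described in the text.
CATEGORIES = ("Citation", "Change", "Achievement", "Other")
LEVELS = ("Individual", "Community", "Institutional", "Media", "Internal")


@dataclass(frozen=True)
class ImpactTag:
    """A newsroom-defined tag pinned to one category and one level."""
    name: str
    category: str
    level: str

    def __post_init__(self):
        # Reject tags that fall outside the shared taxonomy.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")


def count_by(tags, attr):
    """Count tags along one dimension: 'category' or 'level'."""
    counts = {}
    for tag in tags:
        key = getattr(tag, attr)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical tags a newsroom might define.
tags = [
    ImpactTag("Cited by a legislator", "Citation", "Institutional"),
    ImpactTag("Story pickup", "Citation", "Media"),
    ImpactTag("Bill passed", "Change", "Institutional"),
]

by_category = count_by(tags, "category")
by_level = count_by(tags, "level")
```

Because every tag maps onto the same two dimensions, tag names can differ from newsroom to newsroom while counts by category or by level remain comparable.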
Building a better impact workflow
In a previous post, we discussed the problem of tracking mentions across the internet and proposed one solution, dubbed the Approval River.
On this page, newsrooms create different “Recipes,” which are little scripts that go off and monitor the internet for the specified search. We currently have recipes that monitor Google Alerts, Twitter lists, Twitter users, Twitter search, Facebook pages, and Reddit.
For example, let’s say you want to monitor members of Congress when they tweet a link to your article. You would set up a Twitter List recipe and input your domain name, and any matching tweets would show up in the Approval River, where they can be approved, given a corresponding impact tag (most likely a Citation-Institutional tag) and assigned to an article. You could also make a general Twitter search recipe for articles on your domain and filter by minimum number of followers in order to more easily see when someone with a large reach is sharing your journalism.
We’ve automated as much of this process as possible, but impact collection, we feel, still requires a person in the loop and can’t be completely outsourced to the robots. We designed the Approval River after extensive interviews with people who are currently in charge of this type of clip searching. Many of the features in it are designed to alleviate pain points in that process.
As we continue to develop the platform, we are striving to make the process of contributing new recipes as simple as possible. Currently, this means writing a short Python script that grabs data from a source and feeds it through a series of filters. Another potential approach would be to integrate NewsLynx as a channel on IFTTT. Both approaches get at our desire to make NewsLynx less of an isolated service and more of a node in the rich ecosystem of tools that newsrooms use to monitor their impact and influence online.
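The filtering in that example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than NewsLynx's recipe code: the tweet dictionaries, their field names (`user`, `followers`, `urls`) and the functions `links_to_domain`, `has_min_followers` and `run_recipe` are all hypothetical, and fetching tweets from Twitter is left out.

```python
from urllib.parse import urlparse


def links_to_domain(tweet, domain):
    """True if any URL in the tweet points at the domain or a subdomain."""
    for url in tweet["urls"]:
        host = urlparse(url).hostname or ""
        if host == domain or host.endswith("." + domain):
            return True
    return False


def has_min_followers(tweet, minimum):
    """True if the tweet's author has at least `minimum` followers."""
    return tweet["followers"] >= minimum


def run_recipe(tweets, domain, min_followers=0):
    """Return the tweets that would be queued for human approval."""
    return [
        t for t in tweets
        if links_to_domain(t, domain) and has_min_followers(t, min_followers)
    ]


# Hypothetical tweets standing in for the output of a Twitter source.
tweets = [
    {"user": "rep_smith", "followers": 52000,
     "urls": ["https://www.example.org/fracking-permits"]},
    {"user": "local_reader", "followers": 140,
     "urls": ["https://example.org/fracking-permits"]},
    {"user": "sen_jones", "followers": 310000,
     "urls": ["https://othernews.example.com/story"]},
]

pending = run_recipe(tweets, "example.org", min_followers=1000)
```

Here only the tweet from `rep_smith` passes both filters. Nothing is tagged or assigned automatically; the matches simply wait for a person to approve them, which keeps a human in the loop.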
One of the biggest frustrations we saw people having with their current analytics platforms was the lack of context in the numbers they saw. If an article got 125,000 pageviews, was that above average or below average? Was it below average for the main website but above average for, say, feature articles or articles on the environment? Similarly, if you see a spike in traffic at 11:34am, did that correlate with your own promotion efforts on the home page, Twitter or Facebook? Many of these concerns came out of the work that Brian did when he was at the New York Times on “Pageviews Above Replacement”.
With this question of context in mind, we built two analytics modes: comparison and detail.
The comparison view is really where tagging your articles with subject or impact tags shows its power. You can filter your articles by tag and then add them to the comparison view, which, for now, shows metrics such as Twitter shares and Facebook likes over time, pageviews, and time on page. In this interface, you can quickly add your top performing articles on fracking, for instance, and see how they compare with politics articles. Because you can filter by impact tag or sort by more quantitative metrics, “top performing” is now up to the newsroom to define and no longer limited to what Google Analytics or a similar product outputs.
Each of these blue bars has a marker, which is the average of all articles on that metric. In this manner, you can quickly see if an article is above or below average. In case you have one article that was such a high performer that it skews your averages, you can also have this marker represent the median value. Similarly, you can set this value to be the average or median of a subset of articles, as defined by your subject tags.
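A short worked example shows why the choice between average and median matters. The sketch below uses Python's `statistics` module on made-up numbers. The article records and the `marker` function are assumptions for illustration, not part of NewsLynx.

```python
from statistics import mean, median

# Hypothetical articles; article D is an outlier with very high traffic.
articles = [
    {"title": "A", "tags": {"fracking"}, "pageviews": 1200},
    {"title": "B", "tags": {"fracking"}, "pageviews": 1500},
    {"title": "C", "tags": {"politics"}, "pageviews": 900},
    {"title": "D", "tags": {"fracking", "politics"}, "pageviews": 125000},
]


def marker(articles, metric, tag=None, use_median=False):
    """Compute the comparison marker for one metric.

    With tag=None every article counts; otherwise only articles
    carrying that subject tag do. Returns None if nothing matches.
    """
    values = [a[metric] for a in articles if tag is None or tag in a["tags"]]
    if not values:
        return None
    return median(values) if use_median else mean(values)


overall_mean = marker(articles, "pageviews")
overall_median = marker(articles, "pageviews", use_median=True)
fracking_median = marker(articles, "pageviews", tag="fracking", use_median=True)
```

With these numbers the mean across all four articles is 32,150 pageviews, a bar that three of the four articles fall far below, while the median is 1,350. Restricting the marker to the fracking tag gives a median of 1,500.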
The detail view aims to give context to the life of a story by showing a time series of pageviews and Twitter and Facebook shares, alongside promotional events such as when that article appeared on the home page and when your newsroom’s Facebook and Twitter accounts posted it. We also show any events that you’ve added from the Approval River or created manually.
We continue contextualizing your Google Analytics data on the “How people are reading & finding it” tab, which shows breakdowns of device use, internal/external visits and referrers. Again, the marker represents either the average or the median value, depending on your preference.
We also display every tweet that mentions this article, sorted by the number of followers, so you can have a record of who was talking about it and, if it makes sense to do so, create an impact event.
That’s NewsLynx v1! We have a few features we’re working on to make things faster and improve the workflow. We are currently close to our max for trial newsrooms, but if you’re interested, send us an email at firstname.lastname@example.org.