Past Events

Why We Like Pinterest for Fieldwork: Research by Nikki Usher and Phil Howard


Nikki Usher, GWU

Phil Howard, UW and CEU

7/16/2014

Anyone tackling fieldwork these days can choose from a wide selection of digital tools to put in their methodological toolkit. Among the best of these tools are platforms that let you archive, analyze, and disseminate at the same time. It used to be that these were fairly distinct stages of research, especially for the most positivist among us. You came up with research questions, chose a field site, entered the field site, left the field site, analyzed your findings, got them published, and shared your research output with friends and colleagues.

 

But the post-positivist approach that many of us like involves adapting your research questions—reflexively and responsively—while doing fieldwork. Entering and leaving your field site is not a cool, clean and complete process. We analyze findings as we go, and involve our research subjects in the analysis. We publish, but often in journals or books that can’t reproduce the myriad digital artifacts that are meaningful in network ethnography. Actor network theory, activity theory, science and technology studies and several other modes of social and humanistic inquiry approach research as something that involves both people and devices. Moreover, the dissemination of work doesn’t have to be something that happens after publication or even at the end of a research plan.

 

Nikki’s work involves qualitative ethnographic research at field sites, where fieldwork can last anywhere from five months to a brief week-long visit to a quick drop-in day. She learned the hard way from her research for Making News at The New York Times that failing to find a good way to organize and capture images was a missed opportunity once data collection ended. Since then, Nikki has been using Pinterest for fieldwork image gathering quite a bit. Phil’s work on The Managed Citizen was set back when he lost two weeks of field notes on the chaotic floor of the Republican National Convention in 2000 (security incinerates all the detritus left by convention-goers). He’s been digitizing field observations ever since.

 

Some people put together personal websites about their research journey. Some share over Twitter. And there are plenty of beta tools, open source or otherwise, that people play with. We’ve both enjoyed using Pinterest for our research projects. Here are some points on how we use it and why we like it.

 

How To Use It

  1. When you start, think of this as your research tool and your resource. If you dedicate yourself to it as your primary archiving system for digital artifacts, you are more likely to build it up over time. If you think of it as a social media publicity gimmick for your research, you’ll eventually lose interest, and it will be less useful to anyone else.
  2. Integrate it with your mobile phone because this amps up your capacity for portable, taggable, image data collection.
  3. Link the board posts to Twitter or your other social media feeds. Pinterest itself isn’t that lively a place for researchers yet. The people who want to visit your Pinterest page are probably already following you on other platforms, so be sure to let content flow across them.
  4. Pin lots of things, and lots of different kinds of things. Include decent captions, though be aware that if you are feeding Twitter you need to fit its character limit.
  5. Use it to collect images you have found online and images you’ve taken yourself during fieldwork, and invite the communities you are working with to contribute their own.
  6. Back up and export things once in a while for safekeeping. There is no built-in export function, but there is a wide variety of hacks and workarounds for transporting your archive; one such workaround is sketched just after this list.
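Pinterest offers no official export, so any backup is a do-it-yourself workaround. As one hedged example (our kind of hack, not a Pinterest feature): if you keep a plain text list of the direct image URLs you have pinned, a few lines of Python can mirror them to a local folder.

```python
# backup_pins.py -- minimal sketch for mirroring pinned images locally.
# Assumes you maintain pin_urls.txt yourself (one direct image URL per line);
# Pinterest does not provide this file or an official export API.
import os
import requests

def backup_pins(url_file="pin_urls.txt", out_dir="pin_backup"):
    os.makedirs(out_dir, exist_ok=True)
    with open(url_file) as f:
        urls = [line.strip() for line in f if line.strip()]
    for i, url in enumerate(urls):
        # Keep the original extension when the URL has one.
        ext = os.path.splitext(url.split("?")[0])[1] or ".jpg"
        path = os.path.join(out_dir, f"pin_{i:04d}{ext}")
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        with open(path, "wb") as out:
            out.write(resp.content)
        print(f"saved {url} -> {path}")

if __name__ == "__main__":
    backup_pins()
```

Even a crude mirror like this means a deleted pin or a vanished source page doesn’t take your field imagery with it.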

 

What You Get

  1. Pinterest makes it easy to track the progress of the image data you gather. You may find yourself taking more photos in the field because they can be easily arranged, saved and categorized.
  2. Using it regularly adds another layer of data: photos and documents captured on your phone and added to Pinterest can be captioned quickly in the field and re-catalogued later, giving you a chance to review the visual and built environment of your field site and interrogate your observations afresh.
  3. Visually-enhanced constant comparative methods: post-data collection, you can go beyond notes to images and captions that are easily scanned for patterns and points of divergence. This may be going far beyond what Glaser and Strauss had imagined, of course.
  4. Perhaps most important, when you forget what something looks like when you’re writing up your results, you’ve got an instant, easily searchable database of images and clues to refresh your memory.

Why We Like It

  1. It’s great for spontaneous presentations. Images are such an important part of presenting any research. Having a quick, publicly accessible archive of content allows you to speak, on the fly, about what you are up to. You can’t give a tour of your Pinterest page for a job talk. But having the resource there means you can call on images quickly during a Q&A period, or quickly load something relevant on a phone or browser during a casual conversation about your work.
  2. It gives you a way to interact with subjects. Having the Pinterest link allows you to show a potential research subject what you are up to and what you are interested in. During interviews it allows you to engage people on their interpretation of things. Having visual prompts handy can enrich and enliven any focus group or single-subject interview. They not only prompt further conversation; they can prompt subjects to give you even more links, images, videos, and other digital artifacts.
  3. It makes your research interests transparent. Having the images, videos and artifacts for anyone to see is a way for us to show what we are doing. Anyone with interest in the project and the board link is privy to our research goals. Our Pinterest page may be far less complicated than many of our other efforts to explain our work to a general audience.
  4. You can disseminate as you go. If you get the content flow right, you can tell people about your research as you are doing it. Letting people know what you are working on is always a good career strategy. Sharing images rather than article abstracts and draft chapters gives people something to visualize and improves the ambient contact with your research community.
  5. It makes digital artifacts more permanent. As long as you maintain your Pinterest account, what you have gathered remains a stable resource for anyone interested in your subjects. As sites and material artifacts change, your boards preserve an easily accessible snapshot of a particular moment of inquiry for posterity.

 

Pinterest Wish-list

One of us is a Windows Phone user (yes, really), and it would be great if there were a real Pinterest app for Windows Phone. One-touch integration from the iPhone camera roll, much like Twitter, Facebook, and Flickr offer, would also be welcome (though there is an easy hack).

 

We wish it were easier to have open, collaborative boards. Right now, the only person who can add to a board is you, at least at first. You can invite other people to join a “group board” via email, but Pinterest does not have open boards that allow anyone with a board link to add content.

 

Here’s a look at our Pinboards: Phil Howard’s Tech + Politics board, and Nikki Usher’s boards on U.S. Newspapers. We welcome your thoughts…and send us images!

 

 

 

 

Nikki Usher is an assistant professor at the George Washington University’s School of Media and Public Affairs. Her project is Post Industrial News Spaces and Places with Columbia’s Tow Center for Digital Journalism. Phil Howard is a professor at the Central European University and the University of Washington. His project is a book on Political Power and the Internet of Things for Yale University Press.

 

Research

Knight Foundation joins The Tow Foundation as a sponsor for the initiative headed by Columbia University’s Tow Center for Digital Journalism


“Tow Center program defends journalism from the threat of mass surveillance” by Jennifer Henrichsen and Taylor Owen on Knight Blog

NEW YORK – June 10, 2014 – The Journalism After Snowden initiative, a project of The Tow Center for Digital Journalism at Columbia University Graduate School of Journalism, will expand to further explore the role of journalism in the age of surveillance, thanks to new funding from the John S. and James L. Knight Foundation.

Journalism After Snowden will contribute high-quality conversations and research to the national debate around state surveillance and freedom of expression through a yearlong series of events, research projects and articles that will be published in coordination with the Columbia Journalism Review.

Generous funding from The Tow Foundation established the initiative earlier in the academic year. The initiative officially kicked off in January with a high-level panel of prominent journalists and First Amendment scholars who tackled digital privacy, state surveillance and the First Amendment rights of journalists.

Read more in the press release from the Knight Foundation.

Past Events

Glenn Greenwald Speaks | Join the Tow Center for an #AfterSnowden Talk in San Francisco on June 18, 2014


Join the Tow Center for an evening lecture with Glenn Greenwald, who will discuss the state of journalism today and his recent reporting on surveillance and national security issues, on June 18, 2014 at 7pm at the Nourse Theater in San Francisco.

In April 2014, Greenwald and his colleagues at the Guardian received the Pulitzer Prize for Public Service. Don’t miss hearing Greenwald speak in person as he fits all the pieces together, recounting his high-intensity eleven-day trip to Hong Kong, examining the broader implications of the surveillance detailed in his reporting, and revealing fresh information on the NSA’s unprecedented abuse of power with never-before-seen documents entrusted to him by Snowden himself. The event is sponsored by Haymarket Books, the Center for Economic Research and Social Change, the Glaser Progress Foundation, and the Tow Center for Digital Journalism at Columbia Journalism School. Reserve your seat for Glenn Greenwald Speaks: Edward Snowden, the NSA, and the U.S. Surveillance State.

Please note: this is a ticketed event. Tickets are $4.75 each.  | Purchase Tickets

This event is part of Journalism After Snowden, a yearlong series of events, research projects and writing from the Tow Center for Digital Journalism in collaboration with the Columbia Journalism Review. For updates on Journalism After Snowden, follow the Tow Center on Twitter @TowCenter #AfterSnowden.

Journalism After Snowden is funded by The Tow Foundation and the John S. and James L. Knight Foundation.

Lauren Mack is the Research Associate at the Tow Center. Follow her on Twitter @lmack.

Past Events

Tow Center Launches Amateur Footage: A Global Study of User-Generated Content in TV and Online News Output


Crediting is rare, there is a huge gulf between how senior managers and newsdesks talk about UGC, and there is a significant reliance on news agencies for discovery and verification. These are some of the key takeaways of Amateur Footage: A Global Study of User-Generated Content in TV and Online News Output, published today by the Tow Center for Digital Journalism.

 

The aim of this research project was to provide the first comprehensive report about the use of user-generated content (UGC) among broadcast news channels. For this report, UGC means photographs and videos captured by people unrelated to the newsroom who would not describe themselves as professional journalists.

 

Some of the principal findings are:

  • UGC is used by news organizations daily and can produce stories that otherwise would not, or could not, be told. However, it is often used only when other imagery is not available. 40% of UGC on television was related to Syria.
  • There is a significant reliance on news agencies in terms of discovering and verifying UGC. The news agencies have different practices and standards in terms of how they work with UGC.
  • News organizations are poor at acknowledging when they are using UGC and worse at crediting the individuals responsible for capturing it. Our data showed that 72 percent of UGC was not labeled or described as UGC, and just 16 percent of UGC on TV had an onscreen credit.
  • News managers are often unaware of the complexities involved in the everyday work of discovering, verifying, and clearing rights for UGC. Consequently, staff in many newsrooms do not receive the training and support required to develop these skills.
  • Vicarious trauma is a real issue for journalists working with UGC every day – and it’s different from traditional newsroom trauma. Some newsrooms are aware of this – but many have no structured approach or policy in place to deal with it.
  • There is a fear amongst rights managers in newsrooms that a legal case could seriously impact the use of UGC by news organizations in the future.

 

This research was designed to answer two key questions.  First, when and how is UGC used by broadcast news organizations, on air as well as online?  Second, does the integration of UGC into output cause any particular issues for news organizations? What are those issues and how do newsrooms handle them?

 

The work was completed in two phases. The first involved an in-depth, quantitative content analysis examining when and how eight international news broadcasters use UGC; 1,164 hours of TV output and 2,254 Web pages were analyzed. The second was entirely qualitative and saw the team interview 64 news managers, editors, and journalists from 38 news organizations based in 24 countries across five continents. This report draws on both phases to provide a detailed overview of the key findings.

 

The research provides the first concrete figures we have about the level of reliance on UGC by international news channels. It also explores six key issues that newsrooms face in terms of UGC. The report is designed around those six issues, meaning you can dip into any one particular issue:

1) Workflow – how is UGC discovered and verified? Do newsrooms do this themselves, and if so, which desk is responsible? Or is UGC ‘outsourced’ to news agencies?

2) Verification – are there systematic processes for verifying UGC? Is there a threshold that has to be reached before a piece of content can be used?

3) Permissions – how do newsrooms seek permissions? Do newsrooms understand the copyright implications around UGC?

4) Crediting – do newsrooms credit UGC?

5) Labeling – are newsrooms transparent about the types of UGC that they use in terms of who uploaded the UGC and whether they have a specific agenda?

6) Ethics and Responsibilities – how do newsrooms consider their responsibilities to uploaders, the audience and their own staff?

 

The full report can be viewed here.

Announcements, Events, Past Events, Research

Digital Security and Source Protection For Journalists: Research by Susan McGregor


EXECUTIVE SUMMARY

The law and technologies that govern the functioning of today’s digital communication systems have dramatically affected journalists’ ability to protect their sources.  This paper offers an overview of how these legal and technical systems developed, and how their intersection exposes all digital communications – not just those of journalists and their sources – to scrutiny. Strategies for reducing this exposure are explored, along with recommendations for individuals and organizations about how to address this pervasive issue.

 

DOWNLOAD THE PDF


Order a (bound) printed copy.

Comments, questions & contributions are welcome on the version-controlled text, available as a GitBook here:

http://susanemcg.gitbooks.io/digital-security-for-journalists/

DIGITAL SECURITY AND SOURCE PROTECTION FOR JOURNALISTS

Preamble

Digital Security for Journalists: A 21st Century Imperative

The Law: Security and Privacy in Context

The Technology: Understanding the Infrastructure of Digital Communications

The Strategies: Understanding the Infrastructure of Digital Communications

Looking Ahead

Footnotes

 

Research

Diversity in the Robot Reporter Newsroom


The Associated Press recently announced a big new hire: A robot reporter from Automated Insights (AI) would be employed to write up to 4,400 earnings report stories per quarter. Last year, that same automated writing software produced over 300 million stories — that’s some serious scale from a single algorithmic entity.

So what happens to media diversity in the face of massive automated content production platforms like the one Automated Insights created? Despite the fact that we’ve done pretty abysmally at incorporating a balance of minority and gender perspectives in the news media, I think we’d all like to believe that by including diverse perspectives in the reporting and editing of news we fly closer to the truth. A silver lining to the newspaper industry crash has been a profusion of smaller, more nimble media outlets, allowing for far more variability and diversity in the ideas that we’re exposed to.

Of course software has biases, and although the basic anatomy of robot journalists is comparable, there are variations within and among different systems, such as the style and tone that is produced as well as the editorial criteria that are coded into them. Algorithms are the product of a range of human choices, including the criteria, parameters, or training data used, and they can pass along inherited, systematic biases. So while a robot reporter offers the promise of scale (and of reducing costs), we need to be wary of over-reliance on any single automated system. For the sake of media diversity, the one bot needs to fork itself and become 100,000.

We saw this unfold in microcosm over the last week. The @wikiparliament bot was launched in the UK to monitor edits to Wikipedia from IP addresses within parliament (a form of transparency and accountability for who was editing what). Within days it had been mimicked by the @congressedits bot, which was set up to monitor the U.S. Congress. What was particularly interesting about @congressedits, though, is that it was open sourced by creator Ed Summers. That allowed the bot to quickly spread and be adapted for different jurisdictions like Australia, Canada, France, Sweden, Chile, Germany, and even Russia.
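For readers curious about the mechanics, bots in this family typically watch Wikipedia’s public recent-changes feed and flag anonymous edits whose IP address falls inside a known institutional range. Below is a minimal Python sketch of that pattern; it is not the actual @congressedits code, and the IP range is a placeholder rather than a real congressional address block.

```python
# edit_watch.py -- minimal sketch of a @congressedits-style monitor.
# Polls Wikipedia's recent-changes API for anonymous edits and flags those
# whose IP falls inside ranges you care about. The range below is a made-up
# placeholder, not a real congressional or parliamentary address block.
import ipaddress
import time
import requests

WATCHED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]   # placeholder range
API = "https://en.wikipedia.org/w/api.php"

def watched(ip_str):
    try:
        ip = ipaddress.ip_address(ip_str)
    except ValueError:
        return False
    return any(ip in net for net in WATCHED_RANGES)

def poll(seen):
    params = {
        "action": "query", "list": "recentchanges", "format": "json",
        "rcshow": "anon",                       # anonymous edits list the IP as the user
        "rcprop": "user|title|ids|timestamp", "rclimit": 50,
    }
    data = requests.get(API, params=params, timeout=30).json()
    for rc in data["query"]["recentchanges"]:
        if rc["rcid"] not in seen and watched(rc["user"]):
            seen.add(rc["rcid"])
            # A real bot would tweet here; this sketch just prints.
            print(f'{rc["timestamp"]}  "{rc["title"]}" edited anonymously from {rc["user"]}')

if __name__ == "__main__":
    seen = set()
    while True:
        poll(seen)
        time.sleep(60)
```

Swapping in a different IP range and a posting step is essentially all that separates the national variants from one another, which is part of why the open-sourced version spread so quickly.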

Tailoring a bot for different countries is just one (relatively simple) form of adaptation, but I think diversifying bots for different editorial perspectives could similarly benefit from a platform. I would propose that we need to build an open-source news bot architecture that different news and journalistic organizations could use as a scaffolding to encode their own editorial intents, newsworthiness criteria, parameters, data sets, ranking algorithms, cultures, and souls into. By creating a flexible platform as an underlying starting point, the automated media ecology could adapt and diversify faster and into new domains or applications.

Such a platform would also enable the expansion of bots oriented towards different journalistic tasks. A lot of the news and information bots you find on social media these days are parrots of various ilks: they aggregate content on a particular topical niche, like @BadBluePrep, @FintechBot and @CelebNewsBot or for a geographical area like @North_GA, or they simply retweet other accounts based on some trigger words. Some of the more sophisticated bots do look at data feeds to generate novel insights, like @treasuryio or @mediagalleries, but there’s so much more that could be done if we had a flexible bot platform.

For instance, we might consider building bots that act as information collectors and solicitors, moving away from pure content production to content acquisition. This isn’t so far off, really. Researchers at IBM have been working on this for a couple of years already and have built a prototype system that “automatically identifies and ask[s] targeted strangers on Twitter for desired information.” The technology is oriented towards collecting accurate and up-to-date information from specific situations where crowd information may be valuable. It’s relatively easy to imagine an automated news bot being launched after a major news event to identify and solicit information, facts, or photos from people most likely nearby or involved in the event. In another related project, the same group at IBM has been developing technology to identify people on Twitter who are more likely to propagate (read: retweet) information relating to public safety news alerts. Essentially they grease the gears of social dissemination by identifying just the right people for a given topic, at a particular time, who are most likely to further share the information.

There are tons of applications for news bots just waiting for journalists to build them: fact-checking, information gathering, network bridging, audience development, and so on. Robot journalists don’t just have to be reporters. They can be editors, or even (hush) work on the business side.

What I think we don’t want to end up with is the Facebook or Google of robot reporting: “one algorithm to rule them all”. It’s great that the Associated Press is exploring the use of these technologies to scale up their content creation, but down the line when the use of writing algorithms extends far beyond earnings reports, utilizing only one platform may ultimately lead to homogenization and frustrate attempts to build a diverse media sphere. Instead the world that we need to actively create is one where there are thousands of artisanal news bots serving communities and variegated audiences, each crafted to fit a particular context and perhaps with a unique editorial intent. Having an open source platform would help enable that, and offer possibilities to plug in and explore a host of new applications for bots as well.

Research

The Anatomy of a Robot Journalist


Given that an entire afternoon was dedicated to a “Robot Journalism Bootcamp” at the Global Editors Network Summit this week, it’s probably safe to say that automated journalism has finally gone mainstream — hey it’s only taken close to 40 years since the first story writing algorithm was created at Yale. But there are still lots of ethical questions and debates that we need to sort out, from source transparency to corrections policies for bots. Part of that hinges on exactly how these auto-writing algorithms work: What are their limitations and how might we design them to be more value-sensitive to journalism?

Despite the proprietary nature of most robot journalists, the great thing about patents is that they’re public. And patents have been granted to several major players in the robo-journalism space already, including Narrative Science, Automated Insights, and Yseop, making their algorithms just a little bit less opaque in terms of how they operate. More patents are in the pipeline from both heavyweights like CBS Interactive and start-ups like Fantasy Journalist. So how does a robo-writer from Narrative Science really work?

Every robot journalist first needs to ingest a bunch of data. Data rich domains like weather were some of the first to have practical natural language generation systems. Now we’re seeing a lot of robot journalism applied to sports and finance — domains where the data can be standardized and made fairly clean. The development of sensor journalism may provide entirely new troves of data for producing automated stories. Key here is having clean and comprehensive data, so if you’re working in a domain that’s still stuck with PDFs or sparse access, the robots haven’t gotten there yet.

After data is read in by the algorithm, the next step is to compute interesting or newsworthy features from the data. Basically the algorithm is trying to figure out the most critical aspects of an event, like a sports game. It has newsworthiness criteria built into its statistics. So, for example, it looks for surprising statistical deviations like minimums, maximums, or outliers, big swings and changes in a value, violations of an expectation, a threshold being crossed, or a substantial change in a predictive model. “Any feature the value of which deviates significantly from prior expectation, whether the source of that expectation is due to a local computation or from an external source, is interesting by virtue of that deviation from expectation,” the Narrative Science patent reads. So for a baseball game the algorithm computes “win probability” after every play. If win probability has a big delta between two plays, it probably means something important just happened, and the algorithm puts that on a list of events that might be worthy of inclusion in the final story.
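As a toy illustration of this feature-detection step (a sketch of the general idea, not the patented Narrative Science system), an algorithm can scan play-by-play win probabilities and flag the plays where the swing crosses a threshold; the data values below are invented.

```python
# Toy sketch of the "compute newsworthy features" step: flag plays where
# win probability swings by more than a threshold. Data values are invented.
plays = [
    {"inning": 7, "desc": "two-run homer",   "win_prob_before": 0.35, "win_prob_after": 0.72},
    {"inning": 8, "desc": "ground out",      "win_prob_before": 0.72, "win_prob_after": 0.70},
    {"inning": 9, "desc": "walk-off single", "win_prob_before": 0.55, "win_prob_after": 1.00},
]

def newsworthy_plays(plays, threshold=0.25):
    """Return plays whose win-probability delta exceeds the threshold."""
    flagged = []
    for p in plays:
        delta = abs(p["win_prob_after"] - p["win_prob_before"])
        if delta >= threshold:
            flagged.append({**p, "delta": delta})
    # Biggest swings first: these are the candidate events for the story.
    return sorted(flagged, key=lambda p: p["delta"], reverse=True)

for p in newsworthy_plays(plays):
    print(f'inning {p["inning"]}: {p["desc"]} (swing {p["delta"]:.2f})')
```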

Once some interesting features have been identified, angles are then selected from a pre-authored library. Angles are explanatory or narrative structures that provide coherence to the overall story. Basically they are patterns of events, circumstances, entities, and their features. An angle for a sports story might be “back-and-forth horserace”, “heroic individual performance”, “strong team effort”, or “came out of a slump”. Certain angles are triggered according to the presence of certain derived features (from the previous step). Each angle is given an importance value from 1 to 10 which is then used to rank that angle against all of the other proposed angles.
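A crude sketch of angle selection, with the same caveat that this is an illustration rather than the patented system: each pre-authored angle pairs a trigger condition over the derived features with an importance score, and the triggered angles are ranked. The features, angle names, and scores below are invented.

```python
# Toy sketch of angle selection: each pre-authored angle pairs a trigger
# condition over derived features with an importance score (1-10).
features = {"lead_changes": 5, "max_win_prob_swing": 0.45, "top_player_hits": 4}

ANGLE_LIBRARY = [
    {"name": "back-and-forth horserace",
     "importance": 8,
     "trigger": lambda f: f["lead_changes"] >= 4},
    {"name": "heroic individual performance",
     "importance": 7,
     "trigger": lambda f: f["top_player_hits"] >= 4},
    {"name": "dramatic late swing",
     "importance": 9,
     "trigger": lambda f: f["max_win_prob_swing"] >= 0.4},
]

# Keep only the angles whose trigger fires, then rank them by importance.
selected = sorted(
    (a for a in ANGLE_LIBRARY if a["trigger"](features)),
    key=lambda a: a["importance"],
    reverse=True,
)
print([a["name"] for a in selected])
```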

Once the angles have been determined and ordered they are linked to specific story points, which connect back to individual pieces of data like names of players or specific numeric values like score. Story points can also be chosen and prioritized to account for personal interests such as home team players. These points can then be augmented with additional factual content drawn from internet databases such as where a player is from, or a quote or picture of them.

The last step the robot journalist takes is natural language generation, which for the Narrative Science system is done by recursively traversing all of the angle and story point representations and using phrasal generation routines to generate and splice together the actual English text. This is probably by far the most straightforward aspect of the entire pipeline — it’s pretty much just fancy templates.
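“Fancy templates” can be as simple as phrase fragments spliced around the story points. Here is a minimal sketch of that idea, again not the patented pipeline, with invented team names and numbers.

```python
# Minimal sketch of the final text-generation step: splice phrase templates
# around story points. Team names, player names, and scores are invented.
story_points = {
    "winner": "River City Rockets", "loser": "Bayside Bears",
    "score": "5-4", "hero": "J. Ramirez", "key_play": "a walk-off single in the ninth",
}

TEMPLATES = {
    "dramatic late swing": "The {winner} edged the {loser} {score} on {key_play}.",
    "heroic individual performance": " {hero} drove in the decisive run to cap a standout night.",
}

def generate(angles, points):
    """Concatenate the phrase template for each selected angle, in rank order."""
    return "".join(TEMPLATES[a].format(**points) for a in angles if a in TEMPLATES)

print(generate(["dramatic late swing", "heroic individual performance"], story_points))
```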

So, there you have it, the pipeline for a robot journalist: (1) ingest data, (2) compute newsworthy aspects of the data, (3) identify relevant angles and prioritize them, (4) link angles to story points, and (5) generate the output text.

Obviously there can be variations on this basic pipeline as well. Automated Insights, for example, uses randomization to provide variability in output stories and also incorporates a more sophisticated use of narrative tones that can be used to generate text. Based on a desired tone, different text might be generated to adhere to an apathetic, confident, pessimistic, or enthusiastic tone. YSeop, on the other hand, uses techniques for augmenting templates with metadata so that they’re more flexible. This allows templates, for instance, to conjugate verbs depending on the data being used. A post-generation analyzer (you might call it a robot editor) from YSeop further improves the style of a written text by looking for repeated words and substituting synonyms or alternate words.
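A toy version of such a post-generation pass (not YSeop’s product) simply counts repeated words and swaps later occurrences for alternatives from a hand-made synonym table.

```python
# Toy sketch of a "robot editor" pass: swap later repeats of a word for a
# synonym from a hand-made table. The synonym table is invented for illustration.
import re

SYNONYMS = {"strong": ["robust", "solid"], "rose": ["climbed", "increased"]}

def vary_repeats(text):
    counts = {}
    out = []
    for token in re.findall(r"\w+|\W+", text):   # keep punctuation and spacing as-is
        key = token.lower()
        if key in SYNONYMS:
            n = counts.get(key, 0)
            counts[key] = n + 1
            if n > 0:   # leave the first occurrence, vary later ones
                token = SYNONYMS[key][(n - 1) % len(SYNONYMS[key])]
        out.append(token)
    return "".join(out)

print(vary_repeats("Revenue rose on strong demand; profit rose on strong margins."))
```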

From my reading, I’d have to say that the Narrative Science patent seems to be the most informed by journalism. It stresses the notion of newsworthiness and editorial in crafting a narrative. But that’s not to say that the stylistic innovations from Automated Insights, and template flexibility of YSeop aren’t important. What still seems to be lacking though is a broader sense of newsworthiness besides “deviance” in these algorithms. Harcup and O’Neill identified 10 modern newsworthiness values, each of which we might make an attempt at mimicking in code: reference to the power elite, reference to celebrities, entertainment, surprise, bad news, good news, magnitude (i.e. significance to a large number of people), cultural relevance to audience, follow-up, and newspaper agenda. How might robot journalists evolve when they have a fuller palette of editorial intents available to them?
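To make that concrete, here is one hedged sketch of what a broader newsworthiness score might look like, with a handful of Harcup and O’Neill’s values reduced to naive boolean checks; the weights, checks, and example event are entirely invented rather than drawn from any existing robot journalist.

```python
# Illustrative sketch: scoring an event against a few of Harcup and O'Neill's
# newsworthiness values. The weights, checks, and example event are invented.
ELITES = {"president", "prime minister", "ceo"}

CRITERIA = {
    "power_elite": lambda e: any(r in ELITES for r in e.get("roles", [])),
    "surprise":    lambda e: e.get("deviation_sigma", 0) > 2,     # statistical surprise
    "magnitude":   lambda e: e.get("people_affected", 0) > 100_000,
    "bad_news":    lambda e: e.get("harm", False),
    "follow_up":   lambda e: e.get("relates_to_prior_story", False),
}
WEIGHTS = {"power_elite": 3, "surprise": 2, "magnitude": 3, "bad_news": 1, "follow_up": 1}

def newsworthiness(event):
    """Weighted count of the newsworthiness criteria the event satisfies."""
    return sum(WEIGHTS[name] for name, check in CRITERIA.items() if check(event))

event = {"roles": ["ceo"], "deviation_sigma": 3.1, "people_affected": 250_000, "harm": True}
print(newsworthiness(event))   # -> 9 with these invented weights
```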

Research

Sensors and Journalism, A Major Report


Today, the Tow Center is publishing a report about the uses, the opportunities and the risks of sensors and journalism.

The report describes the landscape where sensors and journalism combine, and continues on to define the terms necessary for understanding this area of research. Reporters are using sensors in an era when the rapid development of technology is moving data into the mainstream of journalism. The increasing ubiquity, capability, and accessibility of sensors are on the supply side, while investigative reporters, computer-assisted reporters, and journalist/technologists are on the demand side.

We are including drones in the field of sensing, partially because of the amount of attention they’re currently receiving, and partially because of their potential to extend human sight far beyond our bodily bounds.

While recent commentaries about journalistic sensing have focused just on sensors that journalists have built themselves (or commissioned), our definition also includes journalistic uses of data from sensor systems that are not controlled by the reporters themselves. We have excluded opinion polling, information gathered by humans’ five senses, and data produced by monitoring computer processes like bit-torrent networks.

That said, our description should not be used to separate sensor-based journalism from other reporting processes. The intellectual tools we discuss may be useful for many data-intensive projects, and sensor reporting needs to be integrated with traditional forms.

The introduction also includes a chapter by scholar Charles Berret, who has written a sensor history, charting humanity’s efforts to extend the reach of our five natural senses. It starts with the scales unearthed by archeologists, the Neolithic markers like Stonehenge, and the agricultural tools from the Nile region. Berret notes that, in the 1500s, the astronomer Tycho Brahe built a data network using the post, which compiled sensor readings to draw the most accurate and comprehensive star maps of his time. Cameras and sound sensors came in the nineteenth century, a moment ‘when mechanical sensors were first treated with greater credibility than the human observer.’ The history outlined here only goes as far as the first half of the twentieth century, but within our time period it covers early medical sensors in the form of René Laennec’s stethoscope and Willem Einthoven’s electrocardiogram, and meteorologists’ use of Doppler radar.
The introduction finishes by outlining the characteristics of sensors that make them useful—or not—and helping readers identify what elements of the world can be sensed.

Case Studies

The report’s second section, containing case studies, examines seven projects that used sensors for journalism. Each study includes the story of what happened and then offers analysis in which we identify its distinctive or noteworthy elements, as well as the lessons journalists may take from the projects. The case studies start to show distinct types of sensor uses that suit different journalistic goals. The first type is when investigative reporters (environmental reporters in these two examples) design a sensing process to collect data with the intent of testing a hypothesis. They used relatively mature professional equipment and consulted with experts. They had justifiable confidence in their data, even though their processes were quite different from how scientists work when intending to publish in a peer-reviewed journal, or when doing work on behalf of regulators.
Another type of sensor use by journalists is accessing data from municipal sensor systems. The Sun Sentinel won a Pulitzer Prize by using tollgate data to investigate widespread speeding by off-duty Florida police. The Washington Post published an extensive explanatory feature based on data from a network of microphones installed by city law enforcement. A separate type of journalistic sensing involves DIY hardware development. At the moment, these projects value participation and informal science education, and the equipment they use is unlikely to produce data that can be heavily relied upon in legal or health settings. However, the makers in this part of the field see great long-term potential, inspired by open-source software, a phenomenon that has returned great value to newsrooms. In the case studies, we have also added analysis of the U.S. drone journalism industry as it stands right now. At the high end, a small number of organizations are using footage shot by specialist pilots of professional cinematography drones. In the middle, enterprising news industry employees are experimenting with pro-am equipment costing hundreds, not thousands, of dollars. Mainstream media organizations are also sourcing drone footage shot by hobbyists. All of this activity is proceeding despite a rapidly changing, highly contested regulatory environment.

Laws, Ethics, Sensors and Journalism

For the next section, about laws and ethics for reporting with sensors, we recruited 12 experts to write a chapter each. They applied their considerable knowledge and ability from professions in law, technology, ethics, academic research, and the sciences. In the individual essays, each helps identify and navigate the key issues that arise when their field intersects with sensors and journalism.

The authors who address the privacy and surveillance issues write that these are early days for the field. The courts have thus far dealt with consent to record, defamation, and false light in the context of cameras and microphones. The potential for journalists to break those laws using different types of sensors certainly exists, and if legal claims are made, courts will likely consider the ethical standards that emerge in these next few years of sensor reporting. The field is emerging even as the relationship between newsrooms and their audiences transforms. Our authors suggest that journalists should involve their communities as they negotiate the tricky questions of who owns and controls personal data from sensors.

Newsroom managers who have staff making or acquiring hardware should also acquaint themselves with the basics of open-source licensing. Often, journalists who design their own sensing systems will lean towards sharing their work under open-source principles, but this may involve legal liability if hardware goes wrong and causes physical damage. The risks are avoidable, however, as the article by Diana Cooper makes clear.

Still in the realm of legal issues for hardware makers, this current phase of rapid, widespread DIY development is moving a lot faster than The Federal Communications Commission. The FCC requires that any electronic device that might emit radio interference be tested and approved before marketing. However, that regime did not consider many conceivable journalistic uses of custom sensors produced in small batches.

If and when sensing, in particular drone use, becomes a widespread journalistic practice, human error is likely. Serious mistakes will attract negligence claims, following in a tradition codified by The Digest of Justinian in the mid-sixth century. It contained a section on ‘Those Who Pour or Throw Things Out of Buildings’. The laws extended to falling things, as well. Despite this history, the novelty of drone journalism will make insurance tricky and expensive until the industry has more data on which to model risk profiles.

The last group within the legal and ethical section concerns truth and accuracy. Sensors may seduce journalists into thinking their output is objective and free from the errors inherent in human testimony. That is a risky belief. We have drawn on the expertise of the EPA to show how reporters might design a sensor-based data collection process to improve their accuracy. For journalists, the concept of ‘ground-truthing’—supplementing sensor information with human input—will be valuable. It should help introduce nuance, guard against mistakes and treat fairly the people at the heart of our stories.

For the final section, we have distilled this report into a set of recommendations, including groups of strategic moves, good work practices, and efforts the industry may collectively consider.

Strategic Recommendations

  • Identify and cultivate sensor sources for the beats you’ve prioritized.
  • Put a watching brief on open source sensing systems.
  • News nerds should do hardware too.

Work Practice Recommendations

  • Before sensing, articulate your hypothesis.
  • Work with experts on complex stories.
  • Understand the entire pipeline for your story’s sensor data.
  • Combine sensing with traditional reporting.

Recommendations for the industry, collectively

  • Journalists have an opportunity and a responsibility to report on sensor systems.
  • Advocate for access to data from publicly funded sensor systems.

However, even though we have documented significant amounts of journalistic sensing here, we hope that this report will need updates as newsrooms keep combining their reporters’ ideas with new sensing opportunities.

Research

The Art and Science of Data-Driven Journalism


Journalists have been using data in their stories for as long as the profession has existed. A revolution in computing in the 20th century created opportunities for data integration into investigations, as journalists began to bring technology into their work. In the 21st century, a revolution in connectivity is leading the media toward new horizons. The Internet, cloud computing, agile development, mobile devices, and open source software have transformed the practice of journalism, leading to the emergence of a new term: data journalism.

Although journalists have been using data in their stories for as long as they have been engaged in reporting, data journalism is more than traditional journalism with more data. Decades after early pioneers successfully applied computer-assisted reporting and social science to investigative journalism, journalists are creating news apps and interactive features that help people understand data, explore it, and act upon the insights derived from it. New business models are emerging in which data is a raw material for profit, impact, and insight, co-created with an audience that was formerly reduced to passive consumption. Journalists around the world are grappling with the excitement and the challenge of telling compelling stories by harnessing the vast quantity of data that our increasingly networked lives, devices, businesses, and governments produce every day.

While the potential of data journalism is immense, the pitfalls and challenges to its adoption throughout the media are similarly significant, from digital literacy to competition for scarce resources in newsrooms. Global threats to press freedom, digital security, and limited access to data create difficult working conditions for journalists in many countries. A combination of peer-to-peer learning, mentorship, online training, open data initiatives, and new programs at journalism schools rising to the challenge, however, offer reasons to be optimistic about more journalists learning to treat data as a source.

Following is a list of the 14 findings, recommendations and predictions explored in detail in the full report, which can be downloaded here (PDF).

1) Data will become even more of a strategic resource for media.

2) Better tools will emerge that democratize data skills.

3) News apps will explode as a primary way for people to consume data journalism.

4) Being digital first means being data-centric and mobile-friendly.

5) Expect more robojournalism, but know that human relationships and storytelling still matter.

6) More journalists will need to study the social sciences and statistics.

7) There will be higher standards for accuracy and corrections.

8) Competency in security and data protection will become more important.

9) Audiences will demand more transparency on reader data collection and use.

10) Conflicts over public records, data scraping, and ethics will surely arise.

11) Collaborate with libraries and universities as archives, hosts, and educators.

12) Expect data-driven personalization and predictive news in wearable interfaces.

13) More diverse newsrooms will produce better data journalism.

14) Be mindful of data-ism and bad data. Embrace skepticism.

Past Events

Tow Center Launches Three Tow Reports on UGC, Sensors, and Data-driven Journalism


The Tow Center team is thrilled to launch three new research reports.

Amateur Footage: A Global Study of User-Generated Content in TV and Online News Output, written by Claire Wardle and Sam Dubberley is the result of a major global study into the integration of User Generated Content (UGC) in news output in television broadcasts and online.

Sensors and Journalism, led by Fergus Pitt and including a wide range of contributors, explores how recent advances in sensor networks, citizen science, unmanned vehicles, and community-based data collection can be used by a new generation of sensor journalists to move from data analysis to data collection. The report critically reviews recent prominent uses of sensors by journalists, explores the ethical and legal implications of sensing for journalism, and makes a series of recommendations for how sensors can be integrated into newsrooms.

The Art and Science of Data-Driven Journalism, by Alex Howard, provides a recent history of and current best practices in data and computational journalism, based on dozens of interviews with industry leaders.

This research was made possible by grants from the Knight and Tow Foundations.  More details on the Tow Center research program can be found here.

All three reports will be launched at today’s Tow Center conference, Quantifying Journalism: Data, Metrics, and Computation, which will also include panels on newsroom metrics, data journalism, and sensors, as well as talks by a range of Tow Fellows. Further information on today’s conference can be found here. All sessions will be broadcast live starting at 9am (EST) here.

Past Events

LIVE BLOG: Quantifying Journalism: Data, Metrics, and Computation

From left: Amanda Cox, Dan Gardner, and Mark Hansen.

UPDATED June 3, 2014:

The following is the Live Blog from the Tow Center’s first Tow research conference Quantifying Journalism: Data, Metrics, and Computation held Friday, May 30, 2014 at Columbia Journalism School.

The day-long conference included panel discussions, lectures, lightning talks, and the launch of three Tow Center reports. All sessions can be viewed here: http://cuj.tw/1g8MvZU 

Download a PDF of the Conference Program. Live blog curated by:

Yumi Araki (@yaraki)
Lauren Beck
Rachel Delia Benaim
Anna Ruela-Browne
Julien Gathelier
Jessica Quan Li (@CURJournal)
Rachel Lowry (@rachelllowry)

9:15am–10:30am PANEL | Beyond Clickbait: How are news organizations actually using analytics, and what does it mean for content?

  • Caitlin Petre, Tow Fellow (@cbpetre)
  • James Robinson, Director of News Analytics, The New York Times (@JamesGRobinson)
  • Tony Haile, CEO, Chartbeat (@arctictony)
  • Daniel Mintz, Director of Business Intelligence, Upworthy (@danielmintz)

[9:10 a.m.] Welcome to the live blog! We’re kicking off today’s talks with a subject that is on the minds of many newsrooms: how to leverage analytics to drive meaningful traffic. Our panelists are just about to hit the stage.

Tune into our live stream!

[9:15 a.m.] Tow Center Director Emily Bell takes the stage and welcomes the panel to the Tow Center’s first research conference.

[9:20 a.m.] “We need to talk about the hashtag!”

[9:21 a.m.] Caitlin Petre introduces the panelists on stage. Petre says the goal for the talk is to map out the landscape of metrics, and to gain a more nuanced understanding of how newsrooms are using metrics.

[9:27 a.m.] Tony Haile, CEO of Chartbeat, takes the mic. Haile informs the crowd that even when newsrooms have data, it’s difficult to precisely predict reader engagement. Choosing the right metrics to align with the end goal is important. Here are some of the trends Haile has noticed in the world of analytics:

“Caring about traffic to caring about audience.”

 

“New and better ways to measure.”

[9:35 a.m.] Petre introduces Daniel Mintz, Director of Business Intelligence of Upworthy: Mintz says choosing the right metric to measure audience engagement is vital. He says there is a distinction between page views, clicks, and how long a user spent on a page, article, etc. (Upworthy uses a metric called Attention Minutes.)

“Data just for data’s sake is useless.” “You are what you measure.”

Mintz says that as Upworthy fights the zero-sum game for attention for things that really matter, choosing the right kind of metrics to advance the website’s goals is the way to go. [9:41 a.m.] Petre introduces James Robinson, Director of News Analytics at the New York Times: The difference between reporting and insight-generated analytics.

“How did my story do?”

Metrics are “a means to an end.”

[9:50 a.m.] Petre throws out a question to the panel about commensuration: how do we compare metrics? Can we compare them?

Petre then asks, who is going to be interpreting the data? Who should and who does have the role of interpreting the data? Should it be reporters? Should it be the ones who understand what a p-value is? Editors? “The answer is ‘yes’,” says Haile. Haile says the key question is: “What can I do for this story right now?”

“If you just give numbers to people, that’s no good.”

Mintz says that data is only useful insofar as it helps make decisions in context. His analytics team handles engagement; his business team handles monetizing the engagement. Robinson doesn’t have a rule about who handles the analytics.


[10:09 a.m.] Questions from the audience:

Q: What’re some of the metrics used?

A: Are you paying attention to the content or not? Mintz says you can pull up a video’s API (application programming interface) and see how long a user has been playing the video. (This is like Upworthy’s Attention Minutes.) Google Analytics is “super janky” and better for measuring engagement on e-commerce sites, not necessarily for news content.

Q: To what extent are advertisers considering attention?

A: Haile says “increasingly.” Brand advertisers want to be able to communicate their message to audiences that are paying attention. Advertisers are increasingly getting specific about how much time they want to show their ads to X customer.

Q: What are good tools to measure social shares?

A: There is a set of standard tracking tools; the panelists recommend buying off-the-shelf analytics suites for internal use. Robinson says social and mobile are often connected.

Q: [To Robinson] – Any advice for building up a baseline for parsing out differentiation (of users)? What are the most valuable lines to draw–is it age? Demographics? 

A: Robinson says they’re still in the prototype days [so it's hard to say, exactly].

Q: Where do we draw the line between making decisions based on data versus based on intuition and experience?

A: Robinson says it’s a combination of both relying on analytics, statistics, and intuition. Mintz says, “I ask people to tell me a story.” If you can’t tell him how A got to B, then there is no correlation.

That’s a wrap! Stay tuned in for our next panel.

10:30am–11:15am

TOW REPORT LAUNCH | The Art and Science of Data Journalism

  • Alexander Howard, Tow Fellow

[10:35 a.m.] Alexander Howard says data journalism originated in the 1960s with computer-assisted reporting and has since escalated alongside a surge in data creation from new devices: “This is a trendy thing, but not a new thing.”

[10:40 a.m.] Howard applauds news outlets such as WNYC, New York Times and La Nación for innovative data journalism. 

[10:44 a.m.] “This is just another set of tools, but the story itself still matters,” Howard says, predicting that data journalism will cease to be a niche in the future. “We don’t talk about telephone journalism, or email journalism — it’s just journalism.”

[10:48 a.m.] Howard says people need to understand the basics of data analysis and numeracy: average vs. median, statistical significance, correlation and causation.

[10:56 a.m.] There is a huge amount of data flowing now, Howard says. From startups to social data flowing across social networks, as well as open government data platforms, there is an explosion of tools that allows people to put data to use and make sense of it. A question of rights to these data mining tools becomes relevant.

[10:59 a.m.] Howard says there are new risks of discrimination, with personalized redlining. “People who understand data and statistics will find examples of it.”

[11:05 a.m.] “Data journalism is the new punk,” Howard says. “Anyone can learn new punk. And there is a lot of bad punk music out there, but the fact is that we all can learn these things.” He says we won’t all be computer whizzes right away, but there are many opportunities for data journalism for the masses.

[11:09 a.m.] Howard notes the necessity of government data.

[11:09 a.m.] Data-ism is a thing, Howard says. Embrace it. Be a skeptic. This kind of work matters to reach everyone and report on everyone.


Alexander Howard, Tow Fellow

@rachelllowry: stay tuned for our next panel.

11:30am–12:45pm

PANEL | Data: What is (and isn’t) it good for?

  • Jonathan Stray, Tow Fellow
  • Amanda Cox, Graphics Editor, The New York Times
  • Dan Gardner, Author and Journalist
  • Jen Lowe, Data Scientist, datatelling
  • Mark Hansen, Director, David and Helen Gurley Brown Institute for Media Innovation & Professor of Journalism, Columbia University

[11:32 a.m.] “Data is never just data,” Jonathan Stray says. “It’s never about answering the question.” There are politics attached to it. “Can you really use data to decide whether two people of the same gender can marry?” Stray says not everything journalists deal with is an empirical question that we can use data to answer; sometimes, he says, it’s difficult to determine which is which.

[11:41 a.m.] Data-backed journalism is opinion journalism, Stray says, quoting Richard Lanham: “There is no truth. There is only opinion.” 

[11:48 a.m.] And yet, on the other hand, our political system would not work without data, Stray says. How, then, to reconcile the two and distinguish between the quantitative and the qualitative?

[11:48 a.m.] Dan Gardner says the problem today is ignoring the empirical evidence unless it happens to coincide with our biases. How to guard against such biases? “You have to demand more and better evidence,” Gardner says. He hopes that one day the real problem will be that we are paying too much attention to the data. For Gardner, we’ve got a long way to go.

[12:02 p.m.] It is important to be aware of the human side of data and its implications, Mark Hansen says. Journalists should be able to interrogate and tell stories around data: “Stories may come from a clever use of data that was used for an entirely different purpose,” Hansen says. “Data can be a source of speculation, exploration and answers. It can be useful for helping us arrive at the right question.”

[12:08 p.m.] Stray: With increasingly sophisticated models and techniques, if we can’t explain how we arrived at a conclusion, nobody will believe us.

[12:10 p.m.] Gardner speaks about an upcoming book he is writing. The book brings in volunteer intelligence analysts who have access to classified information: “One of them is a pipeline worker in Alaska,” Gardner says. “And he’s kicking the CIA’s ass.”


Left to right: Jonathan Stray, Amanda Cox, Dan Gardner, Mark Hansen.

[12:25 p.m.] Stray asks if journalism should be representative. And if so, with whom and how do we do that? “Journalists do a lot of generalization without really looking at it closely.” How to guard against that?

[12:28 p.m.] Often, Gardner says, journalists and politicians must make an empirical claim to cover a moral claim. Good journalism has to have both a data and a non-data component.

[12:31 p.m.] Stray asks, if we want to improve the quality of data journalism, does there have to be a standard? Gardner: “It’s that great collective argument that eventually hashes out the truth.”

[12:35 p.m.] Hansen adds: Yes, we need to set best practices and be tool builders and not tool users, but it can be a trap to focus on the places we get it wrong rather than pay attention to the places where we get it right.

[12:36 p.m.] Questions from the audience:

Q: Is it even possible to do representative journalism and can we learn from non-representative journalism?

A: No, Cox says. She is comfortable with slight bias: “Representativeness is not always desirable.”

Q: Are journalists going to be able to use data to call out bullshit on politicians?

A: Gardner says there are many strong improvements: you have to have some faith in the progression of man, and as politicians are held to the fire, we are seeing more evidence that data can reward that kind of behavior. Hansen agrees. But he says there needs to be a place where children, at a K-12 level, learn how to use data.

LUNCH TALKS | Reports from Tow Fellows’ Ongoing Tow Research Projects

  • Andy Carvin
  • Brian Abelson and Michael Keller | NewsLynx
  • Nicholas Diakopoulos | Data Journalism: Algorithmic Accountability
  • Susan E. McGregor | Journalism Security

Brian Abelson and Michael Keller discuss NewsLynx, a suite of open-source tools for online analytics and a research project. It combines data from many sources (Google Analytics, Twitter, Facebook, press clippings, etc.) and incorporates a framework for logging qualitative ‘impact events.’

Brian Abelson and Michael Keller on NewsLynx

Software features: tracking of social media “mentions” and “likes” over time, and integration with Google Analytics.

Nicholas Diakopoulos, Tow Fellow, talks about algorithmic power. @ndiakopoulos says algorithms are becoming pervasive in society, including in romance. Open questions about algorithms: How is an algorithm discriminatory or unfair? Does it make a mistake that denies a service? Censorship? Does it break the law or a social norm? False prediction? Diakopoulos’ research addresses teaching journalists algorithmic accountability, legal issues, algorithms in the newsroom, and transparency policy.

[1:37] Susan E. McGregor presents her paper, which will be released in full next month, on source protection. “If we don’t have sources, we don’t have journalism,” she said.

Susan McGregor on Journalism Security

Source protection is non-negotiable, she said. All reporters need to worry about source protection, not just national security reporters. There is a technology known as a Stingray that can be used to intercept cell phone communications. These devices mimic a cell phone tower and can be used to triangulate the location of a cell phone signal. The majority of the devices are controlled by the federal government, but sometimes they are shared with local law enforcement officers who use the tech to identify communications. Unfortunately there isn’t a lot of clarity about what our [reporters’] rights are in the context of the law. There is often a sense of helplessness when thinking about how to resolve these protection issues.

McGregor suggests that we need to educate ourselves about what is visible and how these systems work so we can protect ourselves. We need to educate; we need to organize; we need to innovate. “Digital security is herd protection.” By doing our due diligence and learning about and using digital security, we will be doing a service to reporters around the world who may not have access to these sorts of technologies. Look out for McGregor’s full report, which will be released on June 18th!

[1:48] Andy Carvin on Broken News. Carvin speaks about what happens when news organizations get it wrong, and social media makes it worse.

Andy Carvin on Broken News

He recounts NPR’s misreporting of Gabby Giffords’ shooting and death. The misreporting took place on social media, but so did the correction. Because of the inherent call-and-response nature of social media, there is a quick way to correct misinformation and misreporting.

A similar situation happened with the Newtown massacre, when CNN misreported Ryan Lanza as the “murderer.” This information spread like wildfire on Twitter.

Boston Bombing: Carvin flips the data approach on its head. Here, online communities made the mistake and the media followed, as opposed to the other way around as in the Giffords and Newtown cases. What I’m really trying to do is understand the interplay between social media and the news cycle, he says. Carvin’s project will look at how journalists can embed themselves in these communities and avoid these mistakes.

[2:00pm] Journalism by the numbers: Measuring a rapidly moving target. Jesse Holcomb (speaker)

Jesse Holcomb on The State of Data on Journalism

“We do data–that’s our hedgehog on politics.” “We try to tell big stories.” How is journalism being produced, consumed, and distributed?

[2:06pm] “Let’s get journalism out of the ivory tower,” says Holcomb.

[2:09pm] Holcomb explores the nonprofit news landscape. Remark: we seem to be publishing data journalism about data journalism… Questions that remain from ongoing conversations: What is a nonprofit newsroom? How many are there? 16,000 news jobs have been lost in the past decade. What’s happening at digital news publishers? About 5,000 jobs hosted by roughly 500 digital news outlets.

[2:13pm] With rising citizen efforts in measurement, we still have limitations in how well we can collect data. Data on digital revenue has been harder to come by.

[2:15pm] Professional journalism revenue: $63–65 billion today. The distribution of revenue has changed, more in favor of audience and non-traditional revenue. Media deserts: communities that lack good sources of information, often communities of color or communities where English is a second language. What are some data challenges that we’re encountering today? The CNNs, BuzzFeeds, etc.: how are we evaluating the engagement, consumption, and quality of these sources? It’s harder than ever for people to remember where they got their news, with so many sources, whether radio, newspaper, TV, or online/mobile news. Variation in the social desirability of “important” information. Young people consider news part of the social atmosphere, present on Facebook, etc. “Ambient” news: present on feeds, but not actively sought out.

[2:20pm] We are finding that it is becoming increasingly difficult to aggregate and normalize data from social media sites (secured by firms, guarded by individual users, etc.), which makes studying digital and social news behavior more and more difficult. Shout-out to work being done at MIT on mapping the diffusion of information across digital and social networks:

  • Detecting and Tracking Political Abuse in Social Media: http://bit.ly/1nB0mvx
  • Modeling Social Diffusion Phenomena using Reality Mining: http://bit.ly/1mRroLL
  • Trends Prediction Using Social Diffusion Models: http://web.media.mit.edu/~yanival/SBP-Behavior-shaping.pdf
  • Information Diffusion Through Blogspace: http://people.csail.mit.edu/dln/papers/blogs/idib.pdf

Holcomb warns that audiences ought to continue to expect incomplete, imperfect sources of data and information.

He emphasizes the importance of remaining transparent about exactly how much the data can say and about the validity of studies.

[2:30pm] Families are becoming more and more multi-screen users, watching various news sources simultaneously. So we start to see the disappearance or obsolescence of certain legacy platforms, but they’re not going away tomorrow, and they are still important to our studies. Our obligation is to understand how people are still engaging with news and information in places where they have not become completely digitally immersed.

Check out this Storify, made by Yangbo Du (@mitgc_cm).

2:45pm–3:30pm TOW REPORT LAUNCH | Amateur Footage: A Global Study of User-Generated Content in TV and Online News Output

  • Claire Wardle, Tow Fellow
  • Sam Dubberley, Tow Fellow

[2:51pm] Now, in 2014, we take social media’s involvement in the dissemination of information about current events for granted, says Wardle. “User-generated content”: anyone have suggestions for a better phrase? requests Wardle. UGC is the wild west, with no normalization of practices. Many newsrooms, as a consequence, really wanted to know what other newsrooms were doing.

[2:55pm] Two phases of the research, as described by Wardle: 1) the how, when, and why that the issues are built upon. The team studied eight news channels from around the world to look at UGC, giving the project an international scope. They found a way to record the channels (surprisingly, most news channels don’t actually record their own output) and observed the differences among news sources, internet vs. TV, in how they used UGC. News sources in different countries also broadcast this content at different rates and in different amounts. Some countries are making motions to use UGC as data.

[3:05pm] “The V word: verification.” Journalists often groan when asked to verify things, yet news sources are always absolutely terrified of putting out incorrect information, which is why they are especially hesitant to rely on social media sources for information. How do we verify information? Claiming experience as a journalist who has been in the field for X number of years is simply not sufficient; how about the empirical methods we have developed to study the flow and sourcing of information posted on social media outlets? –Wardle, on verification.

Journalists need to take the legal aspects of crediting and verification seriously. Their reaction to advocates for properly crediting sources is often that their creativity is being “stifled” and that this whole issue is simply “bollocks.” –Sam Dubberley, on crediting.

[3:10pm] Crediting vs. labeling: naming the source of the photo vs. simply acknowledging that it isn’t your own. Being transparent with your audience about where the information and photography is coming from: whose work is it? We need an industry standard for how we ought to cite and label content.

[3:15pm] Some suggestions: 1) Crediting: newsrooms often gave “screen clutter” as a reason for not crediting sources. 2) Newsroom technology: even newsrooms with advanced media asset management systems could not carry details about the content they used through the system; credit needs to be burned into the UGC video before it enters the system, so that information is not lost in the process. 3) Agencies: all have different standards, and newsrooms need to ask some important questions. Reuters, for example, does not credit and couldn’t even if it wanted to, because it so often does not speak to its sources directly, so there is no way of knowing the accuracy of the sources’ identities. 4) Social networks need to work towards developing a standard of use, available 24/7, for news sources, something like Creative Commons: a common standard across the industry so newsrooms know what they can do with content. 5) There are very limited resources and very limited training for journalists to develop these skills and to be aware of these practices.

Question: What sources can journalists look to learn these skills?

Answer: The Verification Handbook, edited by Craig Silverman. It specifies how to verify videos and tweets. You can use Storify as well. There has been a rise in these resources for learning how to credit, and an absence of these skills in newsrooms; if you’re a freelancer, it is really worth the time to look into them.

[3:25pm]

REPORT LAUNCH | Sensors and Journalism

  • Fergus Pitt, Tow Fellow
  • Scott Klein, Assistant Managing Editor, ProPublica
  • Shannon Dosemagen, Co-founder and Executive Director, Public Lab
  • Joe Procopio, VP of Product, Automated Insights
  • Nabiha Syed, Associate, Levine Sullivan Koch & Schulz

Fergus Pitt on robot reporters and on what new journalistic data-gathering tools, such as drones and sensors, are on the horizon.

Sensors and Journalism — report.

[3:30pm] Joe Procopio elaborates on the development of automated content and its application to a wide array of websites. Algorithms handle tone, style, topic, lexicon, and prioritization of content. They are usually applied to content that journalists leave out: fantasy football recaps and other sports stories, produced thousands at a time, compiling information in a timely manner, presenting this information to journalists, and efficiently canceling out outliers. As long as the data is there, Automated Insights can provide insights.
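
To give a sense of how this kind of template-driven generation works, here is a minimal sketch in Python. The game data, wording rules, and thresholds are illustrative assumptions on our part, not Automated Insights’ actual system.

```python
# A toy template-based recap generator: pick a verb from a small lexicon based
# on the margin of victory, then fill a sentence template with the numbers.
def recap(game):
    margin = abs(game["home_score"] - game["away_score"])
    winner, loser = ((game["home"], game["away"])
                     if game["home_score"] > game["away_score"]
                     else (game["away"], game["home"]))
    verb = "edged" if margin <= 3 else "beat" if margin <= 14 else "routed"
    high = max(game["home_score"], game["away_score"])
    low = min(game["home_score"], game["away_score"])
    return f"{winner} {verb} {loser} {high}-{low}."

print(recap({"home": "Lions", "away": "Bears", "home_score": 31, "away_score": 10}))
# -> "Lions routed Bears 31-10."
```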

[3:35pm] Scott Klein, ProPublica, a nonprofit news outlet producing long-form investigative journalism, speaking on satellite journalism. News Applications is ProPublica’s data journalism effort: statistics, data science, building large-scale interactive databases, and tools that let you look up why data is important and interesting to individuals and their communities.

[3:40pm] Reference to the sensor journalism workshop at the Tow Center: http://bit.ly/1mRQWsb

[3:45pm] Using satellite journalism to draw attention to important phenomena: for instance, the fact that Louisiana is experiencing serious land erosion along its southern coast. Seen from above, it is losing football-field-sized pieces of land every hour.

Using people’s stories, audio, photographs, satellite imagery, to tell stories and to enhance journalism.

New way of recording history, live cams of ongoing problems, orbiting satellites capturing sections of the earth quickly every day.

[3:50pm] Shannon Dosemagen, Co-founder + Executive Director of Public Lab

Public Lab is a nonprofit, open-source community supporting open technology and science. It started with low-cost air sampling and worked with environmental justice groups. During the BP oil spill there was a complete media blackout, so people took to boats and beaches with basic cameras on balloons to take pictures of the events as they unfolded: 100,000 different images. They mapped about 100 miles of coastline in a community-driven manner.
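
Stitching those balloon photos into a single map-like mosaic is the core technical step. The snippet below is only a rough illustration of that general idea using OpenCV’s generic panorama stitcher; it is not Public Lab’s actual software, and the filenames are placeholders.

```python
# Combine overlapping aerial photos into one composite image with OpenCV's
# built-in feature-matching stitcher. Filenames are placeholders.
import cv2

paths = ["balloon_001.jpg", "balloon_002.jpg", "balloon_003.jpg"]
images = [cv2.imread(p) for p in paths]

stitcher = cv2.Stitcher_create()           # detects features, matches, and warps images
status, mosaic = stitcher.stitch(images)   # returns a status code and the composite

if status == 0:                            # 0 == cv2.Stitcher_OK (success)
    cv2.imwrite("coastline_mosaic.jpg", mosaic)
else:
    print("Stitching failed, status code:", status)
```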

They created open-source software platforms to allow this data to be sent out to the public, and open archiving systems that allow individuals to download metadata and maps and to see exact coordinates, the individuals involved in creating the data, videos, pictures, ground field notes, etc. Public Lab is now doing work on air quality sensing and water quality sensing.

Community-driven monitoring. Engage people as researchers, not as subjects. Create access to low-cost tools to involve people in community monitoring and journalism, and to involve individuals in the process of science and journalism. Pull complexity off the shelf: turn a simple camera into an effective data-collection device. Reimagine our relationship with the manufacturing environment.

[3:55pm] Build in openness and accountability. Create collaborative workflows. Maintain public data archives. Mainstream true accountability. Create local versions of tools.

http://t.co/FGXKZvgWHc a tool from @PublicLabs that allows you to stitch together aerial photos. #towtalk

— Nick Diakopoulos (@ndiakopoulos) May 30, 2014

[4:00pm] Losing a sense of concepts like good, bad, superlatives, and the number 10, in favor of percentages, ratios, etc.: values in context, rather than raw data. Producing a more robust report. Saying an athlete “had a great day” vs. saying an athlete had x number of touchdowns, performing at a given percentile compared to other players or to him/herself. — Joe Procopio

“We don’t have to be great but we can never be wrong.” – Joe Procopio, VP of Product at Automated Insights, on automated journalism #towtalk

— Matt Waite (@mattwaite) May 30, 2014
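
As a rough illustration of what Procopio means by putting values in context, the sketch below turns a raw stat into a percentile-based sentence instead of an unguarded superlative. The player name, stat history, and wording are hypothetical.

```python
# Report a verifiable, contextualized number ("better than N% of his previous
# games") instead of a subjective superlative ("had a great day").
from bisect import bisect_left

def percentile(value, history):
    """Share of past performances that this value exceeds."""
    ranked = sorted(history)
    return 100.0 * bisect_left(ranked, value) / len(ranked)

def describe(player, touchdowns, history):
    pct = percentile(touchdowns, history)
    return (f"{player} scored {touchdowns} touchdowns, "
            f"better than {pct:.0f}% of his previous games.")

print(describe("Smith", 3, [0, 1, 1, 2, 0, 1, 2, 3, 0, 1]))
# -> "Smith scored 3 touchdowns, better than 90% of his previous games."
```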

[4:08pm] Public Lab: emphasizing access. — Shannon Dosemagen

Federal agencies have approached @PublicLab about helping to fill gaps in their data #towtalk — Current Public Media (@currentpubmedia) May 30, 2014

Lots for newsrooms to learn from @PublicLab’s process in terms of openness, transparency, engagement and more. Thanks @sdosemagen. #towtalk — Josh Stearns (@jcstearns) May 30, 2014

[4:09pm] Producing new knowledge and information: research scientists and journalists come at things with different intentions and use the same tools, but are after the same “truth.” There is very much a symbiotic flow between the two fields. — Scott Klein

9 Key Principles for Open Tech, by @PublicLab: Summary of @SDosemagen’s #TowTalk https://t.co/EpXStaKPeI — Jeremy Caplan (@jeremycaplan) May 30, 2014

[4:11pm] Bottom-up research: a problem identified by a community member calls upon a team of interdisciplinary forces to come together and engage people in the solution. — Shannon Dosemagen

The idea of translating the intention of these satellite tools will be the next challenge. So long as we emphasize that these efforts are expressive, they are safe in First Amendment territory. It is key, then, to highlight the fact that these robots are actually directed by real people with real goals. — Nabiha Syed

Thanks to all for coming and following the Tow Center at today’s “Quantifying Journalism” conference at Columbia University! Feel free to follow up with us with any questions or comments you have after the event. We look forward to hearing from you!

Top Tweets from Today’s #TowTalk “Quantifying Journalism” Event http://t.co/EL1Lsd4b6N via @SeenCo