Tag Archives: Knight Foundation

Expanding the Television Archive

When we started archiving television in 2000, people shrugged and asked, “Why? Isn’t it all junk anyway?” As the saying goes, one person’s junk is another person’s gold. From 2010 to 2018, scholars, pundits and, above all, reporters have spun journalistic gold from the data captured in our 1.5 million hours of television news recordings. Our work has been fueled by visionary funders(1) who saw the potential impact of turning television – from news reports to political ads – into data that can be analyzed at scale. Now the Internet Archive is taking its Television Archive in new directions. In 2018 our goals for television will be: better curation in what we collect; broader collection across the globe; and working with computer scientists interested in exploring our huge data sets. Simply put, our mission is to build and preserve comprehensive collections of the world’s most important television programming and make them as accessible as possible to researchers and the general public. We will need your help.

“Preserving TV news is critical, and at the Internet Archive we’ve decided to rededicate ourselves to growing our collection,” explained Roger MacDonald, Director of Television at the Internet Archive. “We plan to go wide, expanding our archives of global TV news from every continent. We also plan to go deep, gathering content from local markets around the country. And we plan to do so in a sustainable way that ensures that this TV will be available to generations to come.”

Libraries, museums and memory institutions have long played a critical role in preserving the cultural output of our creators. Television falls within that mandate. Indeed some of the most comprehensive US television collections are held by the Library of Congress, Vanderbilt University and UCLA. Now we’d like to engage with a broad range of libraries and memory institutions in the television collecting and curation process. If your organization has a mandate to collect television or researcher demand for this media, we would like to understand your needs and interests. The Internet Archive will undertake collection trials with interested institutions, with the eventual goal of making this work self-sustaining.

Simultaneously, we are looking to engage researchers interested in the non-consumptive analysis of television at scale, in ways that continue to respect the interests of rights holders. The tools we’ve created may be useful. For instance, we hope the tools the Internet Archive used to detect TV campaign ads can be applied by researchers in new and different ways. If your organization has an interest in computing with television as data at scale, we are interested in working with you.

This groundbreaking interface for searching television news, based on the closed captions associated with US broadcasts, was developed between 2009 and 2012.

A brief history of the Internet Archive’s Television collection:

2000 Working with pioneering engineer, Rod Hewitt, IA begins archiving 20 channels originating from many nations.

Oct. 2001 September 11, 2001 Collection established, and enhanced in 2011.

2009-2012 With funding from the Knight Foundation and many others, we built a service to allow public searching, citation and borrowing of US television news programs on DVD.

2012-2014 Public TV news library launched with tools to search, quote and share streamed snippets from television news.

2014 Pilot launched to detect political advertisements broadcast in the Philadelphia region, leading to the development of open-source audio fingerprinting techniques.

2016 Political ad detection, curation, and access expanded to 28 battleground regions for 2016 elections, enabling journalists to fact check the ads and analyze the data at scale. The same tools helped reporters analyze presidential debates.  This resulted in front-page data visualizations in The New York Times, as well as 150+ analyses by news outlets from Fox News to The Economist to FiveThirtyEight.

2017-date Experiments with artificial intelligence techniques, employing facial identification and on-screen optical character recognition, to aid searching and data mining of television. Special curated collections of top political leaders and fact-check integrations.

In the run-up to the 2016 presidential elections, journalists at the NYT and elsewhere began analyzing television as data, in this case looking at the different sound bites each network chose to replay.

Embarking on a new direction also means shifting away from some of our current services. Our dedicated television team has been focusing on metadata enhancement and assisting journalists and scholars to use our data. We will be wrapping up some of these free services in the next three to four months.  We hope others will take up where we left off and build the tools that will make our collection even more valuable to the public.

Now more than ever in this era of disinformation, our world needs an open, reliable, canonical reference source of television news. This cannot exist without the diligent efforts of technologists, journalists, researchers, and television companies all working together to create a television archive open for all. We hope you will join us!

To learn more about the work of the TV News Archive outreach and metadata innovation team over the last few years, please see our blog posts.

(1) Funding for the Television Archive has come from diverse donors, including the John S. and James L. Knight Foundation, Democracy Fund, Rita Allen Foundation, craigslist Charitable Fund and The Buck Foundation.

Pro-Airbnb advertising dominated recent political TV ads in San Francisco

Based on algorithmic analysis, pro-Airbnb advertising dominated political TV ads in San Francisco in the weeks leading up to Election Day. Two thirds of the minutes devoted to political ads on several initiatives and races before voters focused on arguments against a proposal to curb the company’s operations in the city, according to a review of the Internet Archive television archive. Voters ended up rejecting Proposition F, whose opponents claimed it would encourage neighbors to spy on each other and increase lawsuits, by a margin of 55 to 45 percent.

Minutes of TV Political Ads in San Francisco

The Archive identified a total of 1,959 minutes of ads (4,591 plays) opposing Proposition F, out of 2,895 minutes devoted to all political TV ads, or roughly two thirds of the airtime.
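As a quick check of the "roughly two thirds" figure, the arithmetic can be reproduced from the numbers quoted above. This is only a sketch: the variable names are ours, and the average seconds per play is a derived value, not one reported by the Archive.

```python
# Figures quoted in the post above.
anti_prop_f_minutes = 1959    # minutes of ads opposing Proposition F
anti_prop_f_plays = 4591      # number of times those ads aired
all_political_minutes = 2895  # minutes of all political TV ads found

# Share of political ad airtime that opposed Proposition F.
share = anti_prop_f_minutes / all_political_minutes

# Derived value (not reported in the post): average length of one airing.
avg_seconds_per_play = anti_prop_f_minutes * 60 / anti_prop_f_plays

print(f"Share of airtime: {share:.1%}")
print(f"Average seconds per play: {avg_seconds_per_play:.1f}")
```

The share comes out slightly above two thirds, and the average airing is a little under 30 seconds, which would be consistent with a mix of 15- and 30-second spots.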

To put that in perspective, Mayor Ed Lee, who won his reelection easily, was the subject of only 55 minutes of ads. Though he appeared in and narrated hundreds of ads supporting Propositions A and D, the only ads that mention his mayoral race were airings of a support ad paid for not by his own campaign, but rather by an independent expenditure from Clint Reilly, a local real estate developer and former professional political consultant.

Samples of all ads found to be related to 2015 San Francisco elections can be viewed here, and metadata about those that occurred in archived television can be downloaded from this page.

The only political ad that aired on television in support of Proposition F was this one, which was observed for a total of 16 minutes between October 16th and 25th. The ad, which features a parody of the Eagles’ song “Hotel California,” was pulled from YouTube and the ShareBetterSF campaign website because of claims of copyright infringement. Dale Carlson, a spokesman for the campaign who contacted the Archive, wrote “We believe the ad is parody and did not constitute a copyright violation. But it had already run its course and we weren’t going to spend money on legal bills to defend an ad that was already off the air.”

In all, the Archive identified 14 unique ads opposing Proposition F that aired on TV. In the final days of the campaign, the opponents devoted airtime to this ad that calls the proposal “too extreme,” quotes from the San Francisco Chronicle, and cites high-profile opponents such as Lt. Gov. Gavin Newsom and Mayor Lee. This 30-second ad aired 423 times on 10 channels in San Francisco (CNBC, CNN, FOXNEWS, KGO, KNTV, KOFY, KPIX, KRON, KTVU, MSNBC).

This review updates an earlier one issued last week focused exclusively on Airbnb ads, broadening the analysis to include all political TV ads aired from August 25th through November 3.  The Archive identified ads through a number of sources, including SFGov’s Summary of Third Party Expenditures Regarding San Francisco Candidates hosted by the City of San Francisco. An audio fingerprint was created for each ad and used to find matches in some 35,000 hours of archived local station programming and cable news network shows available in the San Francisco region.  The Internet Archive’s television news research library presents public opportunities to search, compare and contrast news programs in its archive.  Entertainment programming is only available for select algorithmic study within its server environment.
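To give a feel for how matching an ad's audio fingerprint against archived programming can work, here is a deliberately simplified sketch. It is not the Archive's fingerprinting code: the energy-difference fingerprint, the `FRAME` size, the `max_mismatch` threshold, and the synthetic "audio" are all assumptions chosen for illustration.

```python
import random

FRAME = 256  # samples per frame (arbitrary choice for this sketch)

def fingerprint(samples, frame=FRAME):
    """One bit per pair of adjacent frames: 1 if energy rises, else 0."""
    energies = [sum(s * s for s in samples[i:i + frame])
                for i in range(0, len(samples) - frame + 1, frame)]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]

def find_matches(ad_fp, stream_fp, max_mismatch=0.1):
    """Return frame offsets where the ad fingerprint fits the stream."""
    n = len(ad_fp)
    hits = []
    for offset in range(len(stream_fp) - n + 1):
        window = stream_fp[offset:offset + n]
        mismatches = sum(a != b for a, b in zip(ad_fp, window))
        if mismatches <= max_mismatch * n:
            hits.append(offset)
    return hits

def noise(rng, frames):
    """Synthetic stand-in for audio: uniform random samples."""
    return [rng.uniform(-1.0, 1.0) for _ in range(frames * FRAME)]

# Hypothetical data: a 64-frame "ad" placed twice in a longer "broadcast".
rng = random.Random(0)
ad = noise(random.Random(1), 64)
stream = noise(rng, 40) + ad + noise(rng, 25) + ad + noise(rng, 10)

hits = find_matches(fingerprint(ad), fingerprint(stream))
print(hits)  # frame offsets at which the ad was detected
```

In this toy setup the ad copies start exactly on frame boundaries, so they match bit for bit. Real broadcast audio is not aligned or noise-free, which is why production fingerprinting systems typically rely on more robust features, such as spectral peaks, instead of raw frame energies.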

The Internet Archive’s review of political TV ads relating to Proposition F is part of experimentation in preparation for our new Knight Foundation funded project to track political TV ads in key primary states. Stay tuned for news about our December launch.

Research by Trevor von Stein

Pro-Airbnb political TV ads air at rate of 100:1 as San Franciscans head to polls

For every minute of political ads aired in favor of a contentious ballot initiative intended to further regulate Airbnb’s growing presence in the city where it is headquartered, more than 100 minutes of ads urging voters to vote “no” have aired on local San Francisco area TV stations, according to an assessment of the Internet Archive’s television archive.

Audio fingerprinting of YouTube-hosted advertising was used to identify the same ads in local station programming and cable news networks available in the region, from August 25th through October 26th.  Sample ads can be viewed here, and metadata about their occurrences can be downloaded from this page.

Proposition F, which is backed by a coalition of unions, land owners, housing advocates, and neighborhood groups, would restrict private rentals to 75 nights per year as well as enact rules that would ensure that hotel taxes are paid and city code followed. It would also allow private party lawsuits by neighbors against private renters suspected of violating the law.

The Internet Archive found just one TV ad favoring the initiative, which also appeared on the Proposition F campaign website. The Archive discovered 32 instances of this ad airing on local TV stations, for a total of 16 minutes of airplay. However, the ad, which features a parody of the Eagles’ song “Hotel California” (the lyrics were replaced with “Hotel San Francisco”), was recently removed from the official website because of a claim of copyright infringement.

In contrast, in our sample range, Airbnb supporters aired more than 26 hours of ads against the initiative. One example ad, which is below, claims that the initiative would “encourage neighbors to spy on each other,” and “create thousands of new lawsuits.” This ad played at least 358 times in recent weeks, for a total of 179 minutes of airtime.
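The airtime figures in this post can be reproduced from the play counts. The sketch below assumes each airing is a 30-second spot, which is inferred from the reported numbers and not stated for these ads. The helper name `airtime_minutes` is ours.

```python
AD_SECONDS = 30  # assumption: inferred from 32 plays = 16 minutes

def airtime_minutes(plays, seconds_per_play=AD_SECONDS):
    """Convert a number of airings into minutes of airtime."""
    return plays * seconds_per_play / 60

pro_minutes = airtime_minutes(32)        # the single pro-Proposition F ad
example_minutes = airtime_minutes(358)   # the example anti-Proposition F ad

# "More than 26 hours" of opposing ads gives a lower bound on the ratio.
anti_minutes_floor = 26 * 60
ratio_floor = anti_minutes_floor / pro_minutes

print(pro_minutes, example_minutes, ratio_floor)
```

Exactly 26 hours against 16 minutes works out to 97.5 to 1, so the 100:1 figure implies that the opposing total was closer to 26.7 hours or more.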

Overall, according to reports filed with the San Francisco Ethics Commission, opponents of Proposition F have reported spending $6.5 million compared to $256,000 from organizations supporting the initiative.

Of course, the ad campaigns are not limited to television. Airbnb apologized last week after it caught flak for a series of controversial bus station and billboard ads that critics called “passive aggressive” and “whiny” for complaining about how public institutions, such as libraries, spent their tax revenue-derived budgets.

But TV remains a key way that political operators try to influence voters. As Nate Ballard, a Democratic strategist, recently said on a local newscast: “That’s how you win campaigns in California, on TV.”

The Internet Archive’s review of political TV ads relating to Proposition F is part of experimentation in preparation for our new Knight Foundation funded project to track political TV ads in key primary states. Stay tuned for news about our December launch.

Research by Trevor von Stein


Mapping 400,000 Hours of U.S. TV News

We are excited to unveil a couple of experimental data-driven visualizations that literally map 400,000 hours of U.S. television news. One of our collaborating scholars, Kalev Leetaru, applied “fulltext geocoding” software to our entire television news research service collection. These algorithms scan the closed captioning of each broadcast looking for any mention of a location anywhere in the world, disambiguate each mention using the surrounding discussion (Springfield, Illinois vs Springfield, Massachusetts), and ultimately map each location. The resulting CartoDB visualizations provide what we believe is one of the first large-scale glimpses of the geography of American television news, beginning to reveal which areas receive outsized attention and which are neglected.
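The disambiguation step described above can be illustrated with a toy example. This is not the software used for the maps: the tiny `GAZETTEER`, its cue words, the approximate coordinates, and the `geocode` function are all hypothetical, and a real system would draw on a full gazetteer and much richer context.

```python
import re

# Hypothetical mini-gazetteer: each ambiguous name maps to candidate
# places, with cue words that hint at which one is meant.
GAZETTEER = {
    "springfield": [
        {"name": "Springfield, Illinois", "lat": 39.80, "lon": -89.64,
         "cues": {"illinois", "chicago"}},
        {"name": "Springfield, Massachusetts", "lat": 42.10, "lon": -72.59,
         "cues": {"massachusetts", "boston"}},
    ],
}

def geocode(caption, window=8):
    """Resolve place mentions using cue words within `window` words."""
    words = re.findall(r"[a-z]+", caption.lower())
    resolved = []
    for i, word in enumerate(words):
        candidates = GAZETTEER.get(word)
        if not candidates:
            continue
        context = set(words[max(0, i - window):i + window + 1])
        best = max(candidates, key=lambda c: len(c["cues"] & context))
        # Skip mentions with no supporting cue instead of guessing.
        if best["cues"] & context:
            resolved.append((best["name"], best["lat"], best["lon"]))
    return resolved

print(geocode("Lawmakers in Springfield met the Illinois governor today"))
print(geocode("Flooding hit Springfield as Boston braced for the storm"))
print(geocode("Springfield residents lined up early this morning"))
```

The third caption returns an empty list because nothing nearby says which Springfield is meant. A mention that cannot be resolved, or that is resolved from the wrong cue, is one source of the errors these pilot maps contain.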

Watch: TV news mentions of places throughout the world for each day.

Compare and contrast: select a TV station and time window to view its representations of places.


Keep in mind that as you explore, zoom-in and click the locations in these pilot maps, you are going to find a lot of errors. Those range from errors in the underlying closed captioning (“two Paris of shoes”) to locations that are paired with onscreen information (a mention of “Springfield” while displaying a map of Massachusetts on the screen). Thus, as you click around, you’re going to find that some locations work great, while others have a lot more error, especially small towns with common names.

What you see here represents our very first experiment with revealing the geography of television news and required bringing together a bunch of cutting-edge technologies that are still very much active areas of research. While there is still lots of work to be done, we think this represents a tremendously exciting prototype for new ways of interacting with the world’s information by organizing it geographically and putting it on a map where it belongs!

Virtual Machines: Unlocking Media for Research

In addition to our public web-based research service, we are helping scholars like Kalev and other researchers apply advanced data treatments to our entire collection, at a speed and scale beyond any individual’s capacity. As responsible custodians of an enormous collection of television news content created by others, we endeavor to secure their work within the context of our library. Therefore, rather than lending out copies of large portions of the collection for study, we have researchers work in our “virtual reading room,” where they may run their computer algorithms on our servers within the physical confines of the Archive. We hope our evolving demonstrations of this “data queries in, results out” process may help forge a new model for how exceptional public interest value can be derived from media without challenging their value and integrity to their creators.

The Knight Foundation and other insightful donors are providing critical support in our ongoing efforts to open television news and join with others in re-visioning how digital libraries can respectfully address the educational potential of other diverse media. We hope you will consider lending your support.

The Atlantic