Tag Archives: television archive

Expanding the Television Archive

When we started archiving television in 2000, people shrugged and asked, “Why? Isn’t it all junk anyway?” As the saying goes, one person’s junk is another person’s gold. From 2010 to 2018, scholars, pundits and, above all, reporters have spun journalistic gold from the data captured in our 1.5 million hours of television news recordings. Our work has been fueled by visionary funders(1) who saw the potential impact of turning television – from news reports to political ads – into data that can be analyzed at scale. Now the Internet Archive is taking its Television Archive in new directions. In 2018 our goals for television will be better curation of what we collect, broader collection across the globe, and collaboration with computer scientists interested in exploring our huge data sets. Simply put, our mission is to build and preserve comprehensive collections of the world’s most important television programming and make them as accessible as possible to researchers and the general public. We will need your help.

“Preserving TV news is critical, and at the Internet Archive we’ve decided to rededicate ourselves to growing our collection,” explained Roger MacDonald, Director of Television at the Internet Archive. “We plan to go wide, expanding our archives of global TV news from every continent. We also plan to go deep, gathering content from local markets around the country. And we plan to do so in a sustainable way that ensures that this TV will be available to generations to come.”

Libraries, museums and memory institutions have long played a critical role in preserving the cultural output of our creators. Television falls within that mandate. Indeed some of the most comprehensive US television collections are held by the Library of Congress, Vanderbilt University and UCLA. Now we’d like to engage with a broad range of libraries and memory institutions in the television collecting and curation process. If your organization has a mandate to collect television or researcher demand for this media, we would like to understand your needs and interests. The Internet Archive will undertake collection trials with interested institutions, with the eventual goal of making this work self-sustaining.

Simultaneously, we are looking to engage researchers interested in the non-consumptive analysis of television at scale, in ways that continue to respect the interests of rights holders. The tools we’ve created may be useful. For instance, we hope the tools the Internet Archive used to detect TV campaign ads can be applied by researchers in new and different ways. If your organization has an interest in computing with television as data at scale, we are interested in working with you.

This groundbreaking interface for searching television news, based on the closed captions associated with US broadcasts, was developed between 2009 and 2012.

A brief history of the Internet Archive’s Television collection:

2000 Working with pioneering engineer Rod Hewitt, IA begins archiving 20 channels originating from many nations.

Oct. 2001 September 11, 2001 Collection established, and enhanced in 2011.

2009-2012 With funding from the Knight Foundation and many others, we built a service to allow public searching, citation and borrowing of US television news programs on DVD.

2012-2014 Public TV news library launched with tools to search, quote and share streamed snippets from television news.

2014 Pilot launched to detect political advertisements broadcast in the Philadelphia region, leading to the development of open-source audio fingerprinting techniques.

2016 Political ad detection, curation, and access expanded to 28 battleground regions for the 2016 elections, enabling journalists to fact-check the ads and analyze the data at scale. The same tools helped reporters analyze presidential debates. This resulted in front-page data visualizations in The New York Times, as well as 150+ analyses by news outlets from Fox News to The Economist to FiveThirtyEight.

2017-date Experiments with artificial intelligence techniques, including facial identification and on-screen optical character recognition, to aid searching and data mining of television. Special curated collections of top political leaders and fact-check integrations.

In the run-up to the 2016 presidential elections, journalists at the NYT and elsewhere began analyzing television as data, in this case looking at the different sound bites each network chose to replay.

Embarking on a new direction also means shifting away from some of our current services. Our dedicated television team has been focusing on metadata enhancement and helping journalists and scholars use our data. We will be wrapping up some of these free services in the next three to four months. We hope others will take up where we left off and build the tools that will make our collection even more valuable to the public.

Now more than ever in this era of disinformation, our world needs an open, reliable, canonical reference source of television news. This cannot exist without the diligent efforts of technologists, journalists, researchers, and television companies all working together to create a television archive open for all. We hope you will join us!

To learn more about the work of the TV News Archive outreach and metadata innovation team over the last few years, please see our blog posts.

(1) Funding for the Television Archive has come from diverse donors, including the John S. and James L. Knight Foundation, Democracy Fund, Rita Allen Foundation, craigslist Charitable Fund and The Buck Foundation.

Get your Dem debate visualizations here

Hot off the internet presses, here is media analyst Kalev Leetaru’s visualization tool, fueled by Internet Archive data, which enables users to trace particular phrases used in broadcast news coverage in the first 24 hours after would-be presidential nominees appeared in the first Democratic debate of the 2016 election.

Scroll down and what sticks out immediately are the two subjects that captured most of the news broadcasters’ attention: Bernie Sanders’ “damn emails” quote and guns.

When the controversy over Clinton’s decision to do public work from a private email server came up, Sanders defended her rather than attacking her:

“Let me say — let me say something that may not be great politics. But I think the secretary is right, and that is that the American people are sick and tired of hearing about your damn e-mails.”

According to Internet Archive data, that sound bite aired 496 times across stations.

The other issue that grabbed attention was gun violence: Sanders, who hails from gun-friendly rural Vermont, was taken to task for his vote to make it tougher to hold gun manufacturers liable when the guns they make are used in a crime. Answering a question from CNN moderator Anderson Cooper on whether Sanders is tough enough on guns, Clinton said:

“No, not at all. I think that we have to look at the fact that we lose 90 people a day from gun violence. This has gone on too long and it’s time the entire country stood up against the NRA. The majority of our country…(APPLAUSE)… supports background checks, and even the majority of gun owners do.”

This clip aired 260 times across stations.

However, these are just the top take-aways from this massive data crunching tool. It provides a search mechanism for the user to do deeper dives into the data and discover trends across and within certain types of news broadcasts.

Leetaru’s own analysis is here, on the Washington Post’s Monkey Cage. Among his observations:

There was also variation in how much attention each network paid to each candidate (you can see for yourself using the interactive visualization). Telemundo favored Sanders with 41 percent, followed by O’Malley with 24 percent and Clinton at just 21 percent, though admittedly, they broadcast a relatively small number of excerpts. FOX Business also favored Sanders 50 percent to Clinton’s 38 percent, as did CSPAN with Sanders at 52 percent to Clinton’s 44 percent. All other networks favored Clinton, though sometimes by a relatively close margin — like CNBC (50 percent Clinton to 43 percent Sanders) or PBS affiliates (41 percent Clinton to 38 percent Sanders).
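The percentages quoted above are shares of each network's debate excerpts that featured a given candidate. As a rough sketch of that arithmetic (the function name and the counts below are hypothetical placeholders, not Leetaru's code or Internet Archive data), the shares for one network could be computed like this:

```python
def attention_shares(excerpt_counts):
    """Convert per-candidate excerpt counts for one network into percentage shares."""
    total = sum(excerpt_counts.values())
    if total == 0:
        # No excerpts aired, so every candidate's share is zero.
        return {candidate: 0.0 for candidate in excerpt_counts}
    return {
        candidate: round(100 * count / total, 1)
        for candidate, count in excerpt_counts.items()
    }


# Hypothetical counts for a single network (placeholders, not real data).
hypothetical_counts = {"Candidate A": 30, "Candidate B": 15, "Candidate C": 15}
shares = attention_shares(hypothetical_counts)
print(shares)
```

Because shares are computed within each network, a network that aired only a handful of excerpts can show a lopsided split, which is the caveat the quoted analysis raises about Telemundo.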

This tool is also part of the Internet Archive’s testing of technology that we’ll use in our new Knight Foundation funded project to track political TV ads in key primary states, which will launch in early December.

Dig in and have fun.

As Democratic candidates debate, Internet Archive will be gathering data

When Hillary Clinton and Bernie Sanders take the podium tonight along with other contenders for the 2016 Democratic presidential nomination, their debate will be televised. The Television Archive will be tracking the news coverage surrounding the debate and making it viewable and searchable here.

And this tool, developed by political scientist Kalev Leetaru and fueled by Internet Archive data, allows users to see how many times a particular candidate’s name is mentioned in news coverage. Going into the debate, Hillary Clinton is getting more than twice as many mentions as Sen. Bernie Sanders.
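For readers curious how mention counts like these might be tallied from closed-caption text, here is a minimal sketch. It assumes captions are available as plain strings; the function name and the sample snippets are invented for illustration and do not reflect how Leetaru's tool or the Archive's search actually works.

```python
import re
from collections import Counter


def count_mentions(captions, names):
    """Count case-insensitive, whole-word occurrences of each name in caption strings."""
    counts = Counter({name: 0 for name in names})
    for text in captions:
        for name in names:
            pattern = r"\b" + re.escape(name) + r"\b"
            counts[name] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Made-up caption snippets, used only to exercise the function.
sample_captions = [
    "HILLARY CLINTON PREPARES FOR TONIGHT'S DEBATE",
    "Clinton leads in the polls while Bernie Sanders draws big crowds",
    "Analysts expect Clinton to face questions on email",
]
mentions = count_mentions(sample_captions, ["Clinton", "Sanders"])
print(mentions)
```

Matching on a surname alone is a simplification: a real tally would need to handle other people who share the name, as well as misspellings in live captioning.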

We take for granted that candidates will debate on screen, but it wasn’t always so. The faceoff between Republican Vice President Richard Nixon and Democratic U.S. Senator Jack Kennedy in 1960, 55 years ago last month, marked the first time that Americans were able to watch candidates for the nation’s highest office from the comfort of their living rooms. You can see part one of the debate here, preserved on the Archive’s servers:

The received wisdom about this famous debate was that, from this point on, candidates had to think not just about what they said on the campaign stump, but how they looked. This could make a huge difference in how the public and the media perceived who “won” the debate. Nixon looked tired and like he needed a shave. Kennedy looked healthy and vibrant. Those who listened on the radio thought Nixon won.

“It’s one of those unusual points in the timeline of history where you say things changed very dramatically–in this case, in a single night,” Alan Schroeder, a media historian and associate professor at Northeastern University, told Time Magazine in 2010.

Here’s part II of the Kennedy-Nixon 1960 debate:

We don’t know yet who the perceived winner of tonight’s debate will be. The Internet Archive’s data will provide one way to evaluate this. Stay tuned.