Some Very Entertaining Plastic, Emulated at the Archive

It’s been a little over four years since the Internet Archive started providing in-browser emulation from our software collection; millions of plays of games, utilities, and everything else that shows up on a screen have happened since then. While we continue to refine the technology (including adding WebAssembly as an option for running the emulations), we have also tried to expand to various platforms, computers, and anything else we can, based on the work of the emulation community, especially the MAME Development Team.

For a number of years, the MAME team has been moving towards emulating a class of hardware and software that, for some, stretches the bounds of what emulation can do, and we have now put up a collection of some of their efforts here at archive.org.

Introducing the Handheld History Collection.

This collection of emulated handheld games, tabletop machines, and even board games stretches from the 1970s well into the 1990s. These are digital recreations of the LCD-, VFD-, and LED-based machines that sold, often cheaply, at toy stores and booths over the decades.

We have done our best to add instructions, and in some cases to link to scanned versions of the original manuals for these games. They range from notably simplistic efforts to complicated, many-buttoned affairs that are difficult to learn, much less master.

They are, of course, entertaining in themselves – these are attempts to put together inexpensive versions of video games of the time, or to bring new properties into existence wholecloth. Often sold cheaply enough that they were sealed in plastic and stocked in the same stores as a screwdriver set or a flashlight, these little systems tried to pack as much “game” as possible into a small, custom plastic case running on batteries. (Some were, of course, better built than others.)

They also represent the difficulty ahead for many aspects of digital entertainment, and as such are worth experiencing and understanding for that reason alone.

Taking a $2,600 machine and selling it for $20

The shocking difference between the arcade stand-ups as originally sold and their toy store equivalents can be seen, for example, in the arcade game Q*bert, which you can play at the Archive.

The original Arcade machine looks like this:

And the videogame itself looks like this:

Meanwhile, some time after the release of the arcade machine, a plastic tabletop version of the game came out, and it looked like this:

Using VFD (Vacuum Fluorescent Display) technology, the pre-formed art is lit up by circuits that try to act like the arcade game as much as possible, without using an actual video screen or even the same programming. As a result, the “video” is much more abstract, fascinatingly so:

The music and speech synthesis are gone, a small plastic joystick replaces the metal and hard composite of the original, and the colors are a fraction of what they were. But somehow, if you squint, the original Q*bert game is in there.

This sort of Herculean effort to squeeze a major arcade machine into a handful of circuits and a beeping, booping shell of what it once was is an ongoing story – where once the challenge was making arcade machines work on home consoles like the 2600 and ColecoVision, the same was true of these plastic toy games. Work of this sort continues today, as mobile games take charge and developers work to bring huge immersive experiences to a phone in a way that hits all the same notes.

The work in this area often speaks for itself. Check out some of these “screenshots” in the VFD games and see if you recognize the originals:

Naturally, these simple screens came packed in the brightest, most colorful stickers and plastic available, to lure in customers. The original containers, while not “emulated” in this browser-based version, definitely represent an important part of the experience.

A Major Bow to the Emulation Developers

The efforts behind accurately reflecting video game and computer experiences in an emulator, which the Archive then uses to provide our in-browser Emularity, are impressive in their own right, and should be highlighted as the lion’s share of the effort. Groups like the MAME team, as well as the teams behind projects like Dolphin, Higan, and many others, are all poking and prodding code to bring accuracy, speed, and depth to software preservation. They are an often overlooked legion of volunteers addressing technical hurdles that no one else is approaching.

While this entry could be filled with many paragraphs about these efforts, one particularly strong example sticks out: Bringing emulation of LCD-based games to MAME.

Destroying The Artifact to Save It

In the case of most emulation, the chips of a circuit board, as well as the storage media connected to a machine, can be read non-destructively: the information is pulled off the original, the parts are returned to their places, and the copies are used to present emulated versions. An example might be an arcade machine whose chips are pulled from the circuit board, read, and then plugged back in, allowing the arcade machine to keep functioning. (Occasionally, an arcade machine or computer will use measures like glue or batteries to prevent this sort of duplication, but that is generally rare, due to maintenance concerns for operators.)

In the case of an LCD game machine, however, it is sometimes necessary to pull the item completely apart to get all the information out of it. On the MAME team, contributor Sean Riddle and his collaborator “hap” have been tireless in digging the information out of both LCD games and general computer chips.

To get the information off an LCD game, it has to be pulled apart and all of its components scanned, vectorized, and traced to turn them into a software version of themselves. Among the information captured is the LCD display itself, which has a pre-formed set of non-overlapping images representing every possible permutation of the game’s visual data. This will make almost no sense without illustrations, so here are some.

When playing the LCD version of “The Nightmare Before Christmas,” the game looks like this:

That is a drawn background (also scanned in this process) with a clear liquid-crystal display over it, showing Jack Skellington, the tree, and an elf. The artistry and the intense technical challenge of both the original programming/design and the recovery of this information become clear when you see the LCD layer with all the elements “on” at once:

This sort of intense work is everywhere in the background of these LCD games. Here are some more:


(There are many more examples of these at this page at Sean Riddle’s site.)

Not only must the LCD panel be disassembled, but the circuit board beneath it as well, to determine the programming involved. These are scanned and then studied to work out the cross-connections that tell the game when to light up what. The work has been optimized and can often go relatively quickly, but only thanks to years of experience behind the effort – experience which, again, comes from a volunteer force. Unfortunately, the machine does not survive, but the argument is made, quite rightly, that otherwise these toys would fade into oblivion. Now they can be played by thousands or millions, and for a long time to come.
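
To make the idea concrete, here is a toy sketch in Python (ours, not MAME’s actual code; all segment names are invented) of what this recovered structure amounts to: the traced LCD becomes a fixed list of non-overlapping segments, and the recovered game logic simply switches them on and off.

# A toy model - not MAME's implementation - of a dumped LCD game.
# Hypothetical segment names, as if vectorized from scans of the panel.
SEGMENTS = ["jack_standing", "jack_jumping", "tree", "elf", "score_digit"]

def render(state: int) -> list[str]:
    """Return the segments lit for a given bitmask from the game logic."""
    return [name for bit, name in enumerate(SEGMENTS) if state & (1 << bit)]

print(render(0b00101))  # ['jack_standing', 'tree']
print(render(0b00110))  # ['jack_jumping', 'tree']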

The Fundamental Question: What Needs to be Emulated?

Floating in the back of this new collection, and of the many new LCD and electronic games being emulated by the MAME team, is the core question: what will bring the most of the old game to life, so that it can be experienced and studied? With “standard” arcade games, it is often just a case of providing the video and speaker output and accepting the control panel signals, either through a keyboard or through connected hardware. While you do not get the full role-play of being inside a dark arcade in the 1980s, you do get the chance to play the original program, as well as to study its inner workings and the discoveries made in the process. Additional efforts to photograph or reference control panels, outside artwork, and so on are also being made to the extent possible.

This question comes into sharp focus, however, with these electronic toys. The plastic is such a major component of the experience that it may not be enough for some researchers and users to be handed a version of the visual output to really know what the game was like. Compare the output of Bandai Pair Match:

…to what the original toy looked like:

The “core” is there, but a lot is left to the side out of necessity. Documentation, research, and capturing all aspects of these machines will be required if they are ever to be recreated or understood in the future.

It’s the best of times that we are able to ask these questions while originals are still around, and it’s a testament to the many great teams and researchers who are bringing these old games into the realms of archives.

So please, take a walk through the Handheld History collection (as well as our other emulation efforts) and relive those plastic days of joy again.

Shout Outs and Thanks

Many different efforts and projects were brought together to make the Handheld History collection what it is. (We intend to expand it over time.) As always, a huge thanks to the MAME Developers for their tireless efforts to emulate our digital history; a special shout-out to Ryan Holtz for his announcements and highlighting of advances in this effort that inspired this collection to be assembled. Thanks to Daniel Brooks for maintenance of The Emularity, as well as for expanding the capabilities of the system to handle these new emulations. Sources for the photographs of the original plastic systems include The Handheld Games Museum and Electronic Plastic. (It is amazing how few photos of the original toy systems exist; in some cases eBay sales are the only documented photographs of any resolution.) As a reference for knowing which systems are emulated and how, we relied heavily on the work of the Arcade Italia Database site. Thanks to Azma and Zeether for providing metadata on images and control schemes for these games; and thanks as well to all the photographers, documenters, scanners, and reviewers who have been chronicling the history of these games for decades.

Posted in Announcements, News | 9 Comments

TV News Record: Glorious ContextuBot making progress

A round-up of what’s happening at the TV News Archive, by Katie Dahl and Nancy Watzman.

This week, we present an update on the video context project Glorious Contextubot, two recent news reports that use TV News Archive data, and fact-checks of TV appearances by the DNC chair and the president.

Fueled by the TV News Archive, the Glorious ContextuBot is making progress

Let’s say a friend posts a YouTube video link to a politician’s statement on Facebook, but you have a feeling it’s taken out of context. The clip is tightly edited, and you’re curious to see the rest of the statement. Was the politician answering a question? Was the statement part of a larger discussion?

Enter the Glorious ContextuBot. For the past nine months, veteran media innovators Mark Boas and Laurian Gridinoc of Hyperaudio and Trint, led by the Internet Archive’s own Dan Schultz, senior creative technologist of the TV News Archive, have been building a prototype of the ContextuBot, fueled by the TV News Archive. The ContextuBot is one of 20 winners of the Knight Prototype Fund’s $1 million challenge, announced in June 2017.

With the ContextuBot, it’s possible to use video to search video. Just paste a link to a video snippet into an interface and pull up a transcript that puts things in the context of what came before and after. Built from the Duplitron 5000, an audio fingerprinting tool Schultz developed to track political ads for the Political TV Ad Archive, the ContextuBot demonstrates how open technology built by the TV team can be repurposed and improved by motivated technologists – technology that has already captured the attention of the University of Iowa Informatics department, which is considering adopting it for researchers.
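
For a feel of how audio fingerprinting of this kind works, here is a toy Python sketch (ours, using numpy; it illustrates the general technique, not the Duplitron’s actual code): hash the loudest spectrogram peak of each short window, then vote on time offsets to locate a snippet inside a longer recording.

import numpy as np
from collections import Counter

RATE, WIN = 8000, 256

def fingerprint(samples):
    """Yield (peak_bin, frame) pairs - a crude spectral fingerprint."""
    for frame, start in enumerate(range(0, len(samples) - WIN, WIN)):
        spectrum = np.abs(np.fft.rfft(samples[start:start + WIN]))
        yield int(np.argmax(spectrum)), frame

# A synthetic ten-second "broadcast" (a rising tone) and a two-second snippet.
t = np.arange(RATE * 10) / RATE
broadcast = np.sin(2 * np.pi * (200 + 50 * t) * t)
snippet = broadcast[RATE * 3 : RATE * 5]

index = {}                        # peak bin -> frames where it occurs
for peak, frame in fingerprint(broadcast):
    index.setdefault(peak, []).append(frame)

votes = Counter()                 # vote on alignment offsets
for peak, frame in fingerprint(snippet):
    for hit in index.get(peak, []):
        votes[hit - frame] += 1

best_offset, _ = votes.most_common(1)[0]
print(f"snippet found ~{best_offset * WIN / RATE:.1f}s into the broadcast")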

To date, the team has:

  • Made it easier to scale audio search. Audio fingerprint matching within a corpus of TV news can now be scaled up and down by adding or removing individual computers or compute clusters. Our Duplitron would take eight hours to search a year of television; the ContextuBot makes it much easier to spread that computation across multiple machines.
  • Built a demo interface. You can see a clip in context, with a transcript of what comes before and after. Click on a word in the transcript, and you’ll jump to that point in the video stream (see the sketch after this list).
  • Begun to explore a “comic view.” The team’s biggest goal is to explore ways to communicate the essence of a longer clip in a short amount of time. One approach: converting video into a comic. This would lay the groundwork for automatically extracting (and rendering) a storyboard from a video clip.
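
The word-level jumping in the demo interface comes down to each transcript word carrying a start time. A minimal sketch (hypothetical data, not the ContextuBot’s actual format):

# Clicking word i seeks the player to that word's start time.
transcript = [
    {"word": "The", "start": 12.0},
    {"word": "senator", "start": 12.4},
    {"word": "said", "start": 12.9},
]

def seek_time(word_index: int) -> float:
    """Seconds into the video stream to jump to for a clicked word."""
    return transcript[word_index]["start"]

print(seek_time(1))  # 12.4 - jump to where "senator" is spoken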

The team will present the prototype shortly before the International Symposium on Online Journalism conference in Austin in April 2018.


The Washington Post finds stark differences in cable TV coverage of Jared Kushner

After a heavy news week of developments related to Jared Kushner, President Trump’s son-in-law and a senior adviser, The Washington Post’s Philip Bump dug into the TV News Archive and found that while MSNBC and CNN had numerous mentions of Kushner’s name, Fox News had just ten.


The Washington Post examines coverage of Parkland shooting

Rachel Siegel used the TV News Archive to compare coverage of the Parkland shooting with several other high-profile shootings, and found that this time cable TV’s attention span is a bit longer.


Fact-Check: the DNC raised record amounts in January (Two Pinocchios)

In a recent interview, Democratic National Committee Chairman Tom Perez said, “We raised more money in January… of 2018 than any January in our history. So if the question is, ‘Do we have enough money to implement our game plan?’ Absolutely.”

This claim earned “two Pinocchios” from Salvador Rizzo, reporting for The Washington Post’s Fact Checker:  the “DNC raised $6 million in January 2018… That was below what it raised in January 2014 ($6.6 million), January 2012 ($13.2 million), January 2011 ($7.1 million) and January 2010 ($9.1 million).”  A spokesman for Perez “backed off from those comments when we reached out with FEC figures that told a different story.”


Fact-Check: Congressman fears NRA downgrade for gun legislation (misleading)

In a meeting with lawmakers to talk gun legislation, President Donald Trump suggested that an increase in the age requirement for purchasing guns was not included in a 2013 reform effort by Sen. Pat Toomey, R-Pa., “because you’re afraid of the NRA, right?”

Reporting by FactCheck.org’s Eugene Kiely, Lori Robertson, and Robert Farley calls this statement misleading: “As a result of the legislation, Toomey’s rating with the NRA dropped from an ‘A’ to a ‘C,’ and the endorsements and contributions Toomey got from the NRA in previous House and Senate races disappeared. In 2016, the NRA stayed out of Toomey’s Senate race altogether; his Democratic opponent, Katie McGinty, had an ‘F’ grade from the NRA. In that race, Toomey got the endorsement of a gun-control group, Everytown for Gun Safety, which ran ads supporting him.”


Follow us @tvnewsarchive, and subscribe to our biweekly newsletter here.

Posted in News, Television Archive | Comments Off on TV News Record: Glorious ContextuBot making progress

Archive video now supports WebVTT for captions

We now support .vtt files (Web Video Text Tracks) for captioning your videos, in addition to the .srt (SubRip) files we have supported for years.

It’s as simple as uploading a “parallel filename” to your video file(s).

Examples:

  • myvid.mp4
  • myvid.srt
  • myvid.vtt

Multi-lang support:

  • myvid.webm
  • myvid.en.vtt
  • myvid.en.srt
  • myvid.es.vtt

Here’s a nice example item:
https://archive.org/details/cruz-test

VTT with caption picker (and upcoming A/V player too!)

(We will have an updated A/V player with a better “picker” for so many language tracks in days – have no fear! 😎)

Enjoy!


Posted in Technical, Television Archive, Video Archive | Comments Off on Archive video now supports WebVTT for captions

10 Ways To Explore The Internet Archive For Free

The Internet Archive is a treasure trove of fascinating media, texts, and ephemera – items that, if they didn’t exist here, would be lost forever. Yet so many of our community members have difficulty describing what exactly it is…that we do here. Most people know us for the Wayback Machine, but we are so much more. To that end, we’ve put together a fun and useful guide to exploring the Archive. So grab your flashlight and pith helmet and let your digital adventure begin…

1. Pick a place & time you want to explore. Search our eBooks and Texts collection and download or borrow one of the 3 million books for free, offered in many formats, including PDF and EPUB.

2. Enter a time machine of old time films. Explore films of historic significance in the Prelinger Archives.

3. Want to listen to a live concert? The Live Music Archive holds more than 12,000 Grateful Dead concerts.

4. Who Knows What Evil Lurks in the Hearts of Men? Only the Shadow knows. You can too. Listen to “The Shadow” as he employs his power to cloud minds to fight crime in Old Time Radio.

5. To read or not to read? Try listening to Shakespeare with the LibriVox Free Audiobook Collection.

6. Need a laugh? Search the Animation Shorts collection for an old time cartoon.

7. Before there was PlayStation 4… there was Atari. Play a classic video game on an emulated old-time console, right in the browser. Choose from hundreds of games in the Internet Arcade.

8. Are you a technophile? Take the Oregon Trail or get nostalgic with the Apple II programs. You have instant access to decades of computer history in the Software Library.

9. Find a television news story you missed. Search our Television News Archive for all the channels that presented the story. How do they differ? Quote a clip from the story and share it.

10. Has your favorite website disappeared? Go to the Wayback Machine and type in the URL to see if this website has been preserved across time. Want to save a website? Use “Save Page Now.”

What does it take to become an archivist? It’s as simple as creating your own Internet Archive account and diving in. Upload photos, audio, and video that you treasure. Store them for free. Forever.


Sign up for free at https://archive.org.

Posted in Announcements, News | Comments Off on 10 Ways To Explore The Internet Archive For Free

Andrew W. Mellon Foundation Awards Grant to the Internet Archive for Long Tail Journal Preservation

The Andrew W. Mellon Foundation has awarded a research and development grant to the Internet Archive to address the critical need to preserve the “long tail” of open access scholarly communications. The project, Ensuring the Persistent Access of Long Tail Open Access Journal Literature, builds on prototype work identifying at-risk content held in web archives by using data provided by identifier services and registries. Furthermore, the project expands on work acquiring missing open access articles via customized web harvesting, improving discovery and access to these materials from within extant web archives, and developing machine learning approaches, training sets, and cost models for advancing and scaling this project’s work.

The project will explore how adding automation to the already highly automated systems for archiving the web at scale can help address the need to preserve at-risk open access scholarly outputs. Instead of specialized curation and ingest systems, the project will work to identify the scholarly content already collected in general web collections, both those of the Internet Archive and collaborating partners, and implement automated systems to ensure at-risk scholarly outputs on the web are well-collected and are associated with the appropriate metadata. The proposal envisages two opposite but complementary approaches:

  • A top-down approach involves taking journal metadata and open data sets from identifier and registry sources such as ISSN, DOAJ, Unpaywall, CrossRef, and others, and examining the content of large-scale web archives to ask “is this journal being collected and preserved and, if not, how can collection be improved?” (A rough sketch of this check follows this list.)
  • A bottom-up approach involves examining the content of general domain-scale and global-scale web archives to ask “is this content a journal and, if so, can it be associated with external identifier and metadata sources for enhanced discovery and access?”
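
As a rough illustration of the top-down check (our sketch against the public Wayback Machine CDX API, not the project’s actual tooling; the journal URL is a placeholder), one can ask whether a journal’s site has ever been captured:

import json
from urllib.parse import urlencode
from urllib.request import urlopen

def capture_count(url: str) -> int:
    """Count Wayback Machine captures of a URL via the public CDX API."""
    params = urlencode({"url": url, "output": "json",
                        "fl": "timestamp,statuscode", "limit": "200"})
    with urlopen("https://web.archive.org/cdx/search/cdx?" + params) as resp:
        body = resp.read().decode("utf-8").strip()
    rows = json.loads(body) if body else []
    return max(0, len(rows) - 1)  # the first row is a field-name header

# A count of zero flags a potential collection gap for this journal.
print(capture_count("example.com"))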

The grant will fund work to use the output of these approaches to generate training sets and test them against smaller web collections in order to estimate how effective this approach would be at identifying the long-tail content, how expensive a full-scale effort would be, and what level of computing infrastructure is needed to perform such work. The project will also build a model for better understanding the costs for other web archiving institutions to do similar analysis upon their collections using the project’s algorithms and tools. Lastly, the project team, in the Web Archiving and Data Services group with Director Jefferson Bailey as Principal Investigator, will undertake a planning process to determine resource requirements and work necessary to build a sustainable workflow to keep the results up-to-date incrementally as publication continues.

In combination, these approaches will both improve the current state of preservation for long-tail journal materials and develop models for how this work can be automated and applied to existing corpora at scale. We thank the Mellon Foundation for their support of this work, and we look forward to sharing the project’s open-source tools and outcomes with a broad community of partners.

Posted in Announcements, News | Comments Off on Andrew W. Mellon Foundation Awards Grant to the Internet Archive for Long Tail Journal Preservation

27 Public Libraries and the Internet Archive Launch “Community Webs” for Local History Web Archiving

The lives and activities of communities are increasingly documented online; local news, events, disasters, celebrations — the experiences of citizens are now largely shared via social media and web platforms. As these primary sources about community life move to the web, the need to archive these materials becomes an increasingly important activity of the stewards of community memory. And in many communities across the nation, public libraries, as one of their many responsibilities to their patrons, serve the vital role of stewards of local history. Yet public libraries have historically been a small fraction of the growing national and international web archiving community.

With generous support from the Institute of Museum and Library Services, as well as the Kahle/Austin Foundation and the Archive-It service, the Internet Archive and 27 public library partners representing 17 different states have launched a new program: Community Webs: Empowering Public Libraries to Create Community History Web Archives. The program will provide education, applied training, cohort network development, and web archiving services for a group of public librarians to develop expertise in web archiving for the purpose of local memory collecting. Additional partners in the program include OCLC’s WebJunction training and education service, and the public libraries of Queens, Cleveland, and San Francisco, which will serve as “lead libraries” in the cohort. The program will result in dozens of terabytes of public library administered local history web archives; a range of open educational resources in the form of online courses, videos, and guides; and a nationwide network of public librarians with expertise in local history web archiving and the advocacy tools to build and expand the network. A full listing of the participating public libraries is below and on the program website.

In November 2017, the cohort gathered at the Internet Archive for a kickoff meeting of brainstorming, socializing, and, of course, talking all things web archiving. Partners shared details on their existing local history programs and ideas for collection development around web materials. Attendees talked about building collections documenting their demographic diversity or focusing on local issues, such as housing availability or changes in community profile. As an example, Abbie Zeltzer from the Patagonia Public Library spoke about the changes in her community of 913 residents as the town redevelops a long-dormant mining industry. Zeltzer intends to develop a web archive documenting this transition and the related community reaction and changes.

Since the kickoff meeting, the Community Webs cohort has been actively building collections, from hyper-local media sites in Kansas City, to neighborhood blogs in Washington D.C., to Mardi Gras in East Baton Rouge. In addition, program staff, cohort members, and WebJunction have been building out an extensive online course space with educational materials for training on web archiving for local history. The full course space and all open educational resources will be released in early 2019 and a second full in-person meeting of the cohort will take place in Fall 2018.

For further information on the Community Webs program, contact Maria Praetzellis, Program Manager, Web Archiving [maria at archive.org] or Jefferson Bailey, Director, Web Archiving [jefferson at archive.org].

Participating public libraries (library – city, state):

  • Athens Regional Library System – Athens, GA
  • Birmingham Public Library – Birmingham, AL
  • Brooklyn Public Library, Brooklyn Collection – New York City, NY
  • Buffalo & Erie County Public Library – Buffalo, NY
  • Cleveland Public Library – Cleveland, OH
  • Columbus Metropolitan Library – Columbus, OH
  • County of Los Angeles Public Library – Los Angeles, CA
  • DC Public Library – Washington, DC
  • Denver Public Library, Western History and Genealogy Department and Blair-Caldwell African American Research Library – Denver, CO
  • East Baton Rouge Parish Library – East Baton Rouge, LA
  • Forbes Library – Northampton, MA
  • Grand Rapids Public Library – Grand Rapids, MI
  • Henderson District Public Libraries – Henderson, NV
  • Kansas City Public Library – Kansas City, MO
  • Lawrence Public Library – Lawrence, KS
  • Marshall Lyon County Library – Marshall, MN
  • New Brunswick Free Public Library – New Brunswick, NJ
  • Schomburg Center for Research in Black Culture (NYPL) – New York City, NY
  • Patagonia Library – Patagonia, AZ
  • Pollard Memorial Library – Lowell, MA
  • Queens Library – New York City, NY
  • San Diego Public Library – San Diego, CA
  • San Francisco Public Library – San Francisco, CA
  • Sonoma County Public Library – Santa Rosa, CA
  • The Urbana Free Library – Urbana, IL
  • West Hartford Public Library – West Hartford, CT
  • Westborough Public Library – Westborough, MA
Posted in Announcements, Archive-It | Comments Off on 27 Public Libraries and the Internet Archive Launch “Community Webs” for Local History Web Archiving

Mass downloading 78rpm record transfers

To preserve or discover interesting 78rpm records, you can download them to your own machine rather than using our collection pages. You can download many at once on a Mac or Linux machine using a command line utility.

Preparation: download the ia command line tool, like so:

$ curl -LO https://archive.org/download/ia-pex/ia
$ chmod +x ia
$ ./ia help

Option 1: if you want just a set of MP3s to play, download them to your /tmp directory (the -g glob limits which filenames are fetched):

./ia download --search "collection:georgeblood" --no-directories --destdir /tmp -g "[!_][!7][!8]*.mp3"

or just blues (or hillbilly or other searches):

./ia download --search "collection:georgeblood AND blues" --no-directories --destdir /tmp -g "[!_][!7][!8]*.mp3"

Option 2: if you want to preserve the FLAC, MP3, and metadata files for the best version of each 78rpm record we have. (If you are using a Mac, install Homebrew and then type “brew install parallel”. On Linux, try “apt-get install parallel”.)

./ia search 'collection:georgeblood' --sort=publicdate\ asc --itemlist > itemlist.txt
cat itemlist.txt | parallel --joblog download.log './ia download {} --destdir /tmp -g "[!_][!7][!8]*"'

If any downloads fail, rerun just the failed jobs:

parallel --retry-failed --joblog download.log './ia download {} --destdir /tmp -g "[!_][!7][!8]*"'
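
The same bulk download can also be scripted with the internetarchive Python library that powers the ia tool (a sketch; install with “pip install internetarchive”):

# Fetch the filtered MP3s for every item a search returns, as above.
from internetarchive import download, search_items

for result in search_items("collection:georgeblood AND blues"):
    download(result["identifier"],
             glob_pattern="[!_][!7][!8]*.mp3",  # same filename filter as above
             destdir="/tmp",
             no_directory=True)                 # like --no-directories
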
Posted in 78rpm, Audio Archive | Comments Off on Mass downloading 78rpm record transfers

TV News Record: Television Explorer 2.0, shooting coverage & more

A round-up of what’s happening at the TV News Archive, by Katie Dahl and Nancy Watzman.

Explore Television Explorer 2.0

Television Explorer, a tool to search closed captions from the TV News Archive, keeps getting better. Last week GDELT’s Kalev Leetaru added new and improved features:

  • 163 channels are now available to search, from C-SPAN to Al Jazeera to Spanish-language content from Univision and Telemundo.
  • Results are now reported as a percentage of 15-second clips, making comparisons between channels simpler (see the worked example after this list).
  • The context word function for searches is similarly redesigned, counting a matching 15-second clip as well as searching the 15-second clips immediately before and after it, helping to alleviate some previous issues with overcounting.
  • You can now see normalization timelines on the site, with newly available data about the total number of 15-second clips monitored each day and hour included in your query.
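
To make the normalization concrete, here is a worked example with invented numbers (purely illustrative): a query’s matches are reported against all of the clips the channel aired in the same period.

matching_clips = 120    # 15-second clips that mention the search term
monitored_clips = 5760  # all 15-second clips monitored that day
share = 100 * matching_clips / monitored_clips
print(f"{share:.2f}% of monitored airtime")  # 2.08%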

Take the revamped Television Explorer for a spin.

Here’s what we found when we used the new tools to track the use of the term “cryptocurrency.” The rapid ascent, and occasional fall, of the value of cryptocurrencies such as bitcoin has led to rises and dips in TV news coverage as well. In May 2017, international TV news channels began to run stories featuring the term; use increased rapidly in November and peaked just last week with BBC News. Television Explorer shows that Deutsche Welle led the pack, ahead of BBC News and Al Jazeera, in covering cryptocurrency. Among US networks, Bloomberg uses the term more than twice as often as Deutsche Welle. A search for the term “bitcoin” shows a similar trajectory, with CNBC coverage spiking December 11, 2017, a few days before bitcoin hit its historic peak in value to date.

Florida high school shooting TV news coverage shows familiar pattern

Within a broader analysis of how responses to the most recent school shooting compare with others, The Washington Post’s Philip Bump used TV News Archive closed caption data, via GDELT’s Television Explorer, to examine the pattern of use of the term “gun control” on CNN, Fox, and MSNBC: “After the mass shooting in Las Vegas last October, a political discussion about banning ‘bump stocks’ — devices that allowed the shooter to increase his rate of fire — soon collapsed.” And: “So far, the conversation after Parkland looks similar to past patterns.”

Washington Post graphic

Fact-check: Trump never said Russia didn’t meddle in election (Pants on Fire!)

“I never said Russia did not meddle in the election”

Reacting to the indictments of Russian nationals by Special Counsel Robert Mueller, President Donald Trump wrote, “I never said Russia did not meddle in the election, I said, ‘It may be Russia, or China or another country or group, or it may be a 400 pound genius sitting in bed and playing with his computer.’ The Russian ‘hoax’ was that the Trump campaign colluded with Russia – it never did!”

Fact-checkers moved quickly to investigate this claim.

The Washington Post’s Fact Checker, Glenn Kessler: “According to The Fact Checker’s database of Trump claims, Trump in his first year as president then 44 more times denounced the Russian probe as a hoax or witch hunt perpetuated by Democrats. For instance, here’s a tweet from the president after reports emerged about the use of Facebook by Russian operatives, a key part of the indictment: ‘The Russia hoax continues, now it’s ads on Facebook. What about the totally biased and dishonest Media coverage in favor of Crooked Hillary?’”

PolitiFact’s Jon Greenberg:  “Pants on Fire!” The president “called the matter a ‘made-up story,’ and a ‘hoax.’ He has said that he believes Russian President Putin’s denial of any Russian involvement. He told Time, ‘I don’t believe they (Russia) interfered.’”

Vox on Fox (& CNN & MSNBC): Mueller indictment, Florida shooting

In an analysis of Fox News, CNN, and MSNBC during the 72 hours following the announcement of the indictment of 13 Russians, Vox’s Alvin Chang used TV News Archive closed captioning data and the GDELT Project’s Television Explorer to show “how Fox News spun the Mueller indictment and Florida shooting into a defense of the president.” Chang uses the data to show that “[I]nstead of focusing on the details of the indictment itself, pundits on Fox News spent a good chunk of their airtime pointing out that this isn’t proof of the Trump administration colluding with Russia.”


TV news coverage and analysis in one place

Scholars, pundits, and reporters have used the data we’ve created here in the TV News Archive in ways that continue to inspire us, adding much-needed context to our chaotic public discourse as seen on TV.  All that content is now in one place, showcasing the work of these researchers and reporters who turned TV news data into something meaningful.

Follow us @tvnewsarchive, and subscribe to our weekly newsletter here.

Posted in Announcements, Television Archive | Comments Off on TV News Record: Television Explorer 2.0, shooting coverage & more

Expanding the Television Archive

When we started archiving television in 2000, people shrugged and asked, “Why? Isn’t it all junk anyway?” As the saying goes, one person’s junk is another person’s gold. From 2010 to 2018, scholars, pundits and, above all, reporters have spun journalistic gold from the data captured in our 1.5 million hours of television news recordings. Our work has been fueled by visionary funders (1) who saw the potential impact of turning television – from news reports to political ads – into data that can be analyzed at scale. Now the Internet Archive is taking its Television Archive in new directions. In 2018, our goals for television are better curation of what we collect, broader collection across the globe, and collaboration with computer scientists interested in exploring our huge data sets. Simply put, our mission is to build and preserve comprehensive collections of the world’s most important television programming and make them as accessible as possible to researchers and the general public. We will need your help.

“Preserving TV news is critical, and at the Internet Archive we’ve decided to rededicate ourselves to growing our collection,” explained Roger MacDonald, Director of Television at the Internet Archive. “We plan to go wide, expanding our archives of global TV news from every continent. We also plan to go deep, gathering content from local markets around the country. And we plan to do so in a sustainable way that ensures that this TV will be available to generations to come.”

Libraries, museums and memory institutions have long played a critical role in preserving the cultural output of our creators. Television falls within that mandate. Indeed some of the most comprehensive US television collections are held by the Library of Congress, Vanderbilt University and UCLA. Now we’d like to engage with a broad range of libraries and memory institutions in the television collecting and curation process. If your organization has a mandate to collect television or researcher demand for this media, we would like to understand your needs and interests. The Internet Archive will undertake collection trials with interested institutions, with the eventual goal of making this work self-sustaining.

Simultaneously, we are looking to engage researchers interested in the non-consumptive analysis of television at scale, in ways that continue to respect the interests of rights holders. The tools we’ve created may be useful. For instance, we hope the tools the Internet Archive used to detect TV campaign ads can be applied by researchers in new and different ways. If your organization is interested in computing with television as data at scale, we are interested in working with you.

This groundbreaking interface for searching television news, based on the closed captions associated with US broadcasts, was developed between 2009 and 2012.

A brief history of the Internet Archive’s Television collection:

2000: Working with pioneering engineer Rod Hewitt, the Internet Archive begins archiving 20 channels originating from many nations.

Oct. 2001: The September 11, 2001 Collection is established; it is enhanced in 2011.

2009-2012: With funding from the Knight Foundation and many others, we build a service allowing public searching, citation, and borrowing of US television news programs on DVD.

2012-2014: The public TV news library launches, with tools to search, quote, and share streamed snippets from television news.

2014: A pilot to detect political advertisements broadcast in the Philadelphia region leads to the development of open-sourced audio fingerprinting techniques.

2016: Political ad detection, curation, and access expand to 28 battleground regions for the 2016 elections, enabling journalists to fact-check the ads and analyze the data at scale. The same tools help reporters analyze presidential debates. This results in front-page data visualizations in The New York Times, as well as 150+ analyses by news outlets from Fox News to The Economist to FiveThirtyEight.

2017-present: Experiments with artificial intelligence techniques employ facial identification and on-screen optical character recognition to aid searching and data mining of television, along with special curated collections of top political leaders and fact-check integrations.

In the run-up to the 2016 presidential elections, journalists at the NYT and elsewhere began analyzing television as data, in this case looking at the different sound bites each network chose to replay.

Embarking on a new direction also means shifting away from some of our current services. Our dedicated television team has been focusing on metadata enhancement and assisting journalists and scholars in using our data. We will be wrapping up some of these free services in the next three to four months. We hope others will take up where we leave off and build the tools that will make our collection even more valuable to the public.

Now more than ever in this era of disinformation, our world needs an open, reliable, canonical reference source of television news. This cannot exist without the diligent efforts of technologists, journalists, researchers, and television companies all working together to create a television archive open for all. We hope you will join us!

To learn more about the work of the TV News Archive outreach and metadata innovation team over the last few years, please see our blog posts.

(1) Funding for the Television Archive has come from diverse donors, including the John S. and James L. Knight Foundation, Democracy Fund, Rita Allen Foundation, craigslist Charitable Fund and The Buck Foundation.

Posted in Announcements, News | Comments Off on Expanding the Television Archive