TV News Record: Third Eye goes to Trump press conference

A weekly round up on what’s happening and what we’re seeing at the TV News Archive by Katie Dahl and Nancy Watzman. Additional research by Robin Chin.

All three major U.S. cable news networks covered President Donald Trump’s impromptu press conference with Sen. Mitch McConnell, R., Ky., on Monday, October 16, but there were notable differences in their editorial choices for chyrons – the captions that appear in real time on the bottom third of the screen – throughout the broadcast. We used the TV News Archive’s new Third Eye chyron extraction data tool to demonstrate these differences, similar to how The Washington Post examined FBI director James B. Comey’s hearing in June 2017.

The beauty of the Third Eye tool is you can do this too, any time there is breaking news or a widely covered live event, like yesterday’s Senate judiciary committee hearing where AG Jeff Sessions testified (7:31am-9:46am PT) or the October 5 White House briefing about Puerto Rico (11:20am-11:48am PT). Third Eye data – which includes chyrons from BBC News, CNN, Fox News, and MSNBC – is available for data download, via API, in both raw and filtered formats. (Get into the weeds over on the Third Eye collection page.) Please take Third Eye for a spin, and let us know if you have questions: tvnews@archive.org or @tvnewsarchive.

For example, at 11:03 PT, Trump begins answering a question about pharmaceutical companies “making money.” MSNBC chooses a chyron that characterizes Trump’s statements as a claim, whereas Fox News displays Trump’s assertion that Obamacare is a disaster. CNN goes with a chyron saying that Trump is “very happy” to end Obamacare subsidies. In the following minute, 11:04, Fox News chooses other bold statements from Trump: “I do not need pharma money” and “I want tax reform this year.” CNN’s chyron instead says Trump “would like to see” tax reform, a less bold statement.

(Note: these are representative chyrons from each one-minute period and did not necessarily display for the full 60 seconds.)
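The minute-by-minute comparison above can be sketched in a few lines of Python. This is only an illustration: the three-column layout (timestamp, channel, text), the channel labels, and the placeholder rows are assumptions made for the example, not the documented Third Eye format, so check the Third Eye collection page for the real column layout before adapting it.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows. The column order (timestamp, channel, text) is an
# assumption for illustration, not the documented Third Eye schema.
SAMPLE_TSV = "\n".join([
    "2017-10-16 11:03:10\tCNN\tplaceholder text A",
    "2017-10-16 11:03:40\tCNN\tplaceholder text A",
    "2017-10-16 11:03:15\tFOX\tplaceholder text B",
    "2017-10-16 11:04:05\tFOX\tplaceholder text C",
    "2017-10-16 11:04:30\tFOX\tplaceholder text D",
    "2017-10-16 11:04:20\tMSNBC\tplaceholder text E",
])


def chyrons_by_minute(tsv_text):
    """Group distinct chyron texts by (minute, channel), keeping first-seen order."""
    grouped = defaultdict(list)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for timestamp, channel, text in reader:
        minute = timestamp[:16]  # keep "YYYY-MM-DD HH:MM", drop the seconds
        if text not in grouped[(minute, channel)]:
            grouped[(minute, channel)].append(text)
    return dict(grouped)


grouped = chyrons_by_minute(SAMPLE_TSV)
for (minute, channel), texts in sorted(grouped.items()):
    print(minute, channel, texts)
```

Grouping by minute and channel puts the networks side by side for the same moment, and dropping repeated text within a minute reflects the note above that a chyron may not have been on screen for the full minute.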

Later in the press conference, the discussion turns to natural disasters before then focusing on the proposed wall on the border with Mexico. Again, Fox News features Trump making bold, simple assertions: “we are getting high marks for our hurricane response,” and “PR was in bad shape before the storm hit.” MSNBC instead uses the word “claims”: “Trump claims Puerto Rico now has more generators than any place in the world.”

Watch the Trump-McConnell press conference in context on C-Span.

Fact-check: Sen. McCaskill not present for bill to weaken DEA (four Pinocchios)

The day following The Washington Post-60 Minutes report on legislation passed by Congress and signed by President Barack Obama to weaken the authority of the Drug Enforcement Administration, Sen. Claire McCaskill, D., Mo., called for repeal of the law. In an interview, she also said, “Now, I did not go along with this. I wasn’t here at the time. I was actually out getting breast cancer treatment. I don’t know that I would have objected. I like to believe I would have, but the bottom line is, once the DEA [Drug Enforcement Administration] kind of, the upper levels at the DEA obviously said it was okay, that’s what gave it the green light.”

But “despite her claim that she ‘wasn’t here at the time,’ McCaskill was clearly back at the Senate, participating in votes and hearings,” according to The Washington Post’s Fact Checker’s Glenn Kessler. “McCaskill’s staff acknowledged the error, saying that they had forgotten she had come back at that time. ‘It was sloppy on our part, and we take responsibility,’ a spokesman said.”


Fact-check: Pressure from Trump led to stepped up NATO members’ defense spending (half true)

In an interview on October 15, Secretary of State Rex Tillerson said, “The president early on called upon NATO member countries to step up their contributions — step up their commitment to NATO, modernize their own forces… He’s been very clear, and as a result of that countries have stepped up contributions toward their own defense.”

PolitiFact reporter Allison Graves found that “25 NATO allies plan to increase spending in real terms in 2017.” And “according to NATO, over the last 3 years, European allies and Canada spent almost $46 billion more on defense, meaning increases in spending have occurred before Trump’s presidency. Experts said it’s possible that Trump’s pressure has contributed to the continuation of the upward trend, but Tillerson’s explanation glazes over the other factors that have led to increases, including the conflict in the Ukraine in 2014.”

Follow us @tvnewsarchive, and subscribe to our weekly newsletter here.

Posted in Announcements, News, Television Archive

Fifth Annual Aaron Swartz Day & International Hackathon

Aaron Swartz Weekend

by Lisa Rein, Cofounder and Coordinator, Aaron Swartz Day

In memory of Aaron Swartz, whose social, technical, and political insights still touch us daily, Lisa Rein, in partnership with the Internet Archive, will be hosting a weekend of events on Saturday, November 4th and Sunday, November 5th. Friends, collaborators, and hackers can participate in a two-day Hackathon and Aaron Swartz Day Evening Reception.

Schedule of events held at the Internet Archive:

Saturday, November 4th, from 10 am – 6 pm and Sunday, November 5th, from 11am – 5pm — Participate in the hackathon, which will focus on SecureDrop, the whistleblower submission system originally created by Aaron just before he passed away, and other projects inspired by Aaron’s work.

Saturday night, November 4th, from 6:00pm – 9:30pm — Celebrate and remember Aaron, and also the grand tradition of working hard to make the world a better place, at the Aaron Swartz Day Evening Celebration:
Reception: 6:00pm – 7:00pm – Come mingle with the speakers and enjoy nectar, wine & tasty nibbles.

Migrate your way upstairs: 7:10-7:30pm – Finish your nibbles and wine at the reception, exchange contact info, and make your way upstairs to grab a seat to watch the speakers, which will begin promptly at 7:30 pm – a half hour earlier than usual, because we have so many amazing speakers this year.

Speakers 7:30-9:30 pm (Break 8:15-8:30pm)

  • Chelsea Manning (Network Security Expert, Former Intelligence Analyst)
  • Lisa Rein (Chelsea Manning’s Archivist, Co-founder Creative Commons, Co-founder Aaron Swartz Day)
  • Daniel Rigmaiden (Transparency Advocate)
  • Barrett Brown (Journalist, Activist, Founder of the Pursuance Project) (via SKYPE)
  • Jason Leopold (Senior Investigative Reporter, Buzzfeed News)
  • Jennifer Helsby (Lead Developer, SecureDrop, Freedom of the Press Foundation)
  • Cindy Cohn (Executive Director, Electronic Frontier Foundation)
  • Gabriella Coleman (Hacker Anthropologist, Author, Researcher, Educator)
  • Caroline Sinders (Designer/Researcher, Wikimedia Foundation, Creative Dissent Fellow, YBCA)
  • Brewster Kahle (Co-founder and Digital Librarian, Internet Archive, Co-founder Aaron Swartz Day)
  • Steve Phillips (Project Manager, Pursuance)
  • Mek Karpeles (Citizen of the World, Internet Archive)
  • Brenton Cheng (Senior Engineer, Open Library, Internet Archive)

GET TICKETS HERE

Saturday, November 4th, 2017
10:00 am Hackathon
6:00 pm Reception
7:30 pm Program

Sunday, November 5th, 2017
11:00 am Hackathon

Internet Archive
300 Funston Ave.
San Francisco, CA 94118
For more information, contact:
lisa@lisarein.com
https://www.aaronswartzday.org

Posted in Event

The 20th Century Time Machine

by Nancy Watzman & Katie Dahl

Jason Scott

With the turn of a dial, some flashing lights, and the requisite puff of fog, emcees Tracey Jaquith, TV Architect, and Jason Scott, Free Range Archivist, cranked up the Internet Archive 20th Century Time Machine on stage before a packed house at the Internet Archive’s annual party on October 11.

Eureka! The cardboard contraption worked! The year was 1912, and out stepped Alexis Rossi, director of Media and Access, her hat adorned with a 78rpm record.

1912

D’Anna Alexander (center) with her mother (right) and grandmother (left).

“Close your eyes and listen,” Rossi asked the audience. And then, out of the speakers floated the scratchy sounds of Billy Murray singing “Low Bridge, Everybody Down” written by Thomas S. Allen. From 1898 to the 1950s, some three million recordings of about three minutes each were made on 78rpm discs. But these discs are now brittle, the music stored on them precious. The Internet Archive is working with partners on the Great 78 Project to store these recordings digitally, so that we and future generations can enjoy them and reflect on our music history. New collections include the Tina Argumedo and Lucrecia Hug 78rpm Collection of dance music collected in Argentina in the mid-1930s.

1927

Next to emerge from the Time Machine was David Leonard, president of the Boston Public Library, which was the first free, municipal library founded in the United States. The mission was and remains bold: make knowledge available to everyone. Knowledge shouldn’t be hidden behind paywalls, restricted to the wealthy, but rather should operate under the principle of open access as a public good, he explained. Leonard announced that the Boston Public Library would join the Internet Archive’s Great 78 Project, by authorizing the transfer of 200,000 individual 78s and LPs to preserve and make accessible to the public, “a collection that otherwise would remain in storage unavailable to anyone.”

David Leonard and Brewster Kahle

Brewster Kahle, founder and Digital Librarian of the Internet Archive, then came through the time machine to present the Internet Archive Hero Award to Leonard. “I am inspired every time I go through the doors,” said Kahle of the library, noting that the Boston Public Library was the first to digitize not just a presidential library, that of John Adams, but also modern books. Leonard was presented with a tablet imprinted with the Boston Public Library homepage by Internet Archive 2017 Artist in Residence, Jeremiah Jenkins.

1942

Kahle then set the Time Machine to 1942 to explain another new Internet Archive initiative: liberating books published from 1923 to 1941. Working with Elizabeth Townsend Gard, a copyright scholar at Tulane University, the Internet Archive is liberating these books under a little known, and perhaps never used, provision of US copyright law, Section 108h, which allows libraries to scan and make available materials published 1923 to 1941 if they are not being actively sold. The name of the new collection: the Sonny Bono Memorial Collection, named for the late congressman who led the passage of the Copyright Term Extension Act of 1998, which included the 108h provision as a “gift” to libraries.

One of these books is “Your Life,” a tome written by Kahle’s grandfather, Douglas E. Lurton, a “guide to a desirable living.” “I have one copy of this book and two sons. According to the law, I can’t make one copy and give it to the other son. But now it’s available,” Kahle explained.

1944

Sab Masada

With the Time Machine cranked to 1944, out came Rick Prelinger, Internet Archive Board member, archivist, and filmmaker. Prelinger introduced a new addition to the Internet Archive’s film collection: long-forgotten footage of an Arkansas Japanese internment camp from 1944. As the film played on the screen, Prelinger welcomed Sab Masada, 87, who lived at this very camp as a 12-year-old.

Masada talked about his experience at the camp and why it is important for people today to remember it. “Since the election I’ve heard echoes of what I heard in 1942,” Masada said. “Using fear of terrorism to target the Muslims and people south of the border.”

1972

Next to speak was Wendy Hanamura, the director of partnerships. Hanamura explained how as a sixth grader she discovered a book at the library, Executive Order 9066, published in 1972, which chronicled photos of Japanese internment camps during World War II.

“Before I was an internet archivist, I was a daughter and granddaughter of American citizens who were locked up behind barbed wire in the same kind of camps that incarcerated Sab,” said Hanamura. That one book – now out of print – helped her understand what had happened to her family.

Inspired by making it to the semi-final round of the MacArthur 100&Change initiative with a proposal that provides libraries and learners with free digital access to four million books, the Internet Archive is forging ahead with plans, despite not winning the $100 million grant. Among the books the Internet Archive is making available: Executive Order 9066.

1985

The year display turned to 1985, and Jason Scott reappeared on stage, explaining his role as a software curator. New this year to the Internet Archive are collections of early Apple software, he explained, with browser emulation allowing the user to experience just what it was like to fire up a Macintosh computer back in its heyday. This includes a collection of the then wildly popular “HyperCards,” a programmatic tool that enabled users to create programs that linked materials in creative ways, before the rise of the World Wide Web.

1997

Vinay Goel

After this tour through the 20th century, the Time Machine was set to 1997. Mark Graham, Director of the Wayback Machine, and Vinay Goel, Senior Data Engineer, stepped on stage. Back in 1997, when the Wayback Machine began archiving websites on the still new World Wide Web, the entire thing amounted to 2.2 terabytes of data. Now the Wayback Machine contains 20 petabytes. Graham explained how the Wayback Machine is preserving tweets, government websites, and other materials that could otherwise vanish. One example: this report from The Rachel Maddow Show, which aired on December 16, 2016, about Michael Flynn, then slated to become National Security Advisor. Flynn deleted a tweet he had made linking to a falsified story about Hillary Clinton, but the Internet Archive saved it through the Wayback Machine.

Goel took the microphone to announce new improvements to Wayback Machine Search 2.0. Now it’s possible to search for keywords, such as “climate change,” and find not just web pages from a particular time period mentioning these words, but also different format types — such as images, pdfs, or yes, even an old Internet Archive favorite, animated gifs from the now-defunct GeoCities–including snow globes!

Thanks to all who came out to celebrate with the Internet Archive staff and volunteers, or watched online. Please join our efforts to provide Universal Access to All Knowledge, whatever century it is from.

Editor’s Note, 10/16/17: Watch the full event at https://archive.org/details/youtube-j1eYfT1r0Tc

Posted in 78rpm, Announcements, Event, News, Open Library, Wayback Machine - Web Archive

Syncing Catalogs with thousands of Libraries in 120 Countries through OCLC

We are pleased to announce that the Internet Archive and OCLC have agreed to synchronize the metadata describing our digital books with OCLC’s WorldCat. WorldCat is a union catalog that itemizes the collections of thousands of libraries in more than 120 countries that participate in the OCLC global cooperative.

What does this mean for readers?
When the synchronization work is complete, library patrons will be able to discover the Internet Archive’s collection of 2.5 million digitized monographs through the libraries around the world that use OCLC’s bibliographic services. Readers searching for a particular volume will know that a digital version of the book exists in our collection. With just one click, readers will be taken to archive.org to examine and possibly borrow the digital version of that book. In turn, readers who find a digital book at archive.org will be able, with one click, to discover the nearest library where they can borrow the hard copy.

There are additional benefits: in the process of the synchronization, OCLC databases will be enriched with records describing books that may not yet be represented in WorldCat.

“This work strengthens the Archive’s connection to the library community around the world. It advances our goal of universal access by making our collections much more widely discoverable. It will benefit library users around the globe by giving them the opportunity to borrow digital books that might not otherwise be available to them,” said Brewster Kahle, Founder and Digital Librarian of the Internet Archive. “We’re glad to partner with OCLC to make this possible and look forward to other opportunities this synchronization will present.”

“OCLC is always looking for opportunities to work with partners who share goals and objectives that can benefit libraries and library users,” said Chip Nilges, OCLC Vice President, Business Development. “We’re excited to be working with Internet Archive, and to make this valuable content discoverable through WorldCat. This partnership will add value to WorldCat, expand the collections of member libraries, and extend the reach of Internet Archive content to library users everywhere.”

We believe this partnership will be a win-win-win for libraries and for learners around the globe.

Better discovery, richer metadata, more books borrowed and read.

Read the OCLC press release.

Posted in Announcements, News, Open Library

Boston Public Library’s Sound Archives Coming to the Internet Archive for Preservation & Public Access

Today, the Boston Public Library announced the transfer of significant holdings from its Sound Archives Collection to the Internet Archive, which will digitize, preserve and make these recordings accessible to the public. The Boston Public Library (BPL) sound collection includes hundreds of thousands of audio recordings in a variety of historical formats, including wax cylinders, 78 rpms, and LPs. The recordings span many genres, including classical, pop, rock, jazz, and opera – from 78s produced in the early 1900s to LPs from the 1980s. These recordings have never been circulated and were in storage for several decades, uncataloged and inaccessible to the public. By collaborating with the Internet Archive, the Boston Public Library’s audio collection can be heard by new audiences of scholars, researchers and music lovers worldwide.

Some of the thousands of 20th century recordings in the Boston Public Library’s Sound Archives Collection.

“Through this innovative collaboration, the Internet Archive will bring significant portions of these sound archives online and to life in a way that we couldn’t do alone, and we are thrilled to have this historic collection curated and cared for by our longtime partners for all to enjoy going forward,” said David Leonard, President of the Boston Public Library.

78 rpm recordings from the Boston Public Library Sound Archive Collection

Listening to the 78 rpm recording of “Please Pass the Biscuits, Pappy,” by W. Lee O’Daniel and his Hillbilly Boys from the BPL Sound Archive, what do you hear? Internet Archive Founder, Brewster Kahle, hears part of a soundscape of America in 1938.  That’s why he believes Boston Public Library’s transfer is so significant.

“Boston Public Library is once again leading in providing public access to their holdings. Their Sound Archive Collection includes hillbilly music, early brass bands and accordion recordings from the turn of the last century, offering an authentic audio portrait of how America sounded a century ago,” says Brewster Kahle, Internet Archive’s Digital Librarian. “Every time I walk through Boston Public Library’s doors, I’m inspired to read what is carved above it: ‘Free to All.’”

The 78 rpm records from the BPL’s Sound Archives Collection fit into the Internet Archive’s larger initiative called The Great 78 Project. This community effort seeks to digitize all the 78 rpm records ever produced, supporting their preservation, research and discovery. From about 1898 to the 1950s, an estimated 3 million sides were published on 78 rpm discs. While commercially viable recordings will have been restored or remastered onto LPs or CDs, there is significant research value in the remaining artifacts, which often include rare 78rpm recordings.

“The simple fact of the matter is most audiovisual recordings will be lost,” says George Blood, an internationally renowned expert on audio preservation. “These 78s are disappearing right and left. It is important that we do a good job preserving what we can get to, because there won’t be a second chance.”

George Blood LP’s 4-arm turntable used for 78 digitization.

The Internet Archive is working with George Blood LP and the IA’s Music Curator, Bob George of the Archive of Contemporary Music, to discover, transfer, digitize, catalog and preserve these often fragile discs. This team has already digitized more than 35,000 sides. The BPL collection joins more than 20 collections already transferred to the Internet Archive for physical and digital preservation and access. Curated by many volunteer collectors, these collections will be preserved for future generations.

The Internet Archive began working with the Boston Public Library in 2007, and our scanning center is housed at its Central Library in Copley Square.  There, as a digital-partner-in-residence, the Internet Archive is scanning bound materials for Boston Public Library, including the John Adams Library, one of the BPL’s Collections of Distinction.

To honor Boston Public Library’s long legacy and pioneering role in making its valuable holdings available to an ever wider public online, we will be awarding the 2017 Internet Archive Hero Award to David Leonard, the President of BPL, at a public celebration tonight at the Internet Archive headquarters in San Francisco.

 

 

Posted in Announcements, News

Books from 1923 to 1941 Now Liberated!

[press: boingboing]

The Internet Archive is now leveraging a little known, and perhaps never used, provision of US copyright law, Section 108h, which allows libraries to scan and make available materials published 1923 to 1941 if they are not being actively sold. Elizabeth Townsend Gard, a copyright scholar at Tulane University, calls this “Library Public Domain.” She and her students helped make the first scanned books of this era available online in a collection named for the author of the bill making this necessary: The Sonny Bono Memorial Collection. Thousands more books will be added in the near future as we automate. We hope this will encourage libraries that have been reluctant to scan beyond 1923 to start mass scanning their books and other works, at least up to 1942.

While good news, it is too bad it is necessary to use this provision.

Trend of Maximum U.S. General Copyright Term by Tom W Bell

If the Founding Fathers had their way, almost all works from the 20th century would be public domain by now (14-year copyright term, renewable once if you took extra actions).

Some corporations saw adding works to the public domain as a problem, and when Sonny Bono got elected to the House of Representatives, representing Riverside County, near Los Angeles, he helped push through a law extending copyright’s duration another 20 years to keep things locked up back to 1923. This has been called the Mickey Mouse Protection Act due to one of the motivators behind the law, but it was also a result of Europe extending copyright terms an additional twenty years first. If not for this law, works from 1923 and beyond would have been in the public domain decades ago.

Lawrence Lessig

Creative Commons founder Larry Lessig fought the new law in court as unreasonable, unneeded, and ridiculous. In support of Lessig’s fight, the Internet Archive made an Internet bookmobile to celebrate what could be done with the public domain. We drove the bookmobile across the country to the Supreme Court to make books during the hearing of the case. Alas, we lost.

Internet Archive Bookmobile in front of Carnegie Library in Pittsburgh: “Free to the People”

There is, however, an exemption from this extension of copyright, but only for libraries and only for works that are not actively for sale: we can scan them and make them available. Professor Townsend Gard had two legal interns work with the Internet Archive last summer to work out how we can automate finding appropriate scanned books that could be liberated, and they hand-vetted the first books for the collection. Professor Townsend Gard has just released an in-depth paper giving libraries guidance as to how to implement Section 108(h) based on her work with the Archive and other libraries. Together, we have called them “Last Twenty” Collections, as libraries and archives can copy and distribute to the general public qualified works in the last twenty years of their copyright.
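To see why the window currently covers 1923 to 1941, here is a rough sketch of the “last twenty years” arithmetic. It rests on simplifying assumptions that are not spelled out in this post: a flat 95-year term counted from the publication year for works published between 1923 and 1977, and everything published before 1923 treated as already in the public domain. It ignores renewal formalities and does not model the “not actively sold” test at all, so treat it as an illustration rather than legal guidance. The function and constant names are made up for the example.

```python
# Assumptions for illustration only (see the paragraph above).
TERM_YEARS = 95              # assumed term for works published 1923-1977
WINDOW_YEARS = 20            # the "last twenty" years of the term
FIRST_PROTECTED_YEAR = 1923  # earlier works assumed to be public domain


def last_twenty_window(pub_year):
    """Return (first_year, last_year) of the final 20 years of the assumed term."""
    last_year = pub_year + TERM_YEARS          # protection runs through this year
    first_year = last_year - WINDOW_YEARS + 1
    return first_year, last_year


def in_last_twenty(pub_year, current_year):
    """True if a work published in pub_year is inside its last-twenty window."""
    if pub_year < FIRST_PROTECTED_YEAR:
        return False  # assumed already public domain, so no exemption is needed
    first_year, last_year = last_twenty_window(pub_year)
    return first_year <= current_year <= last_year


eligible_2017 = [y for y in range(1900, 1978) if in_last_twenty(y, 2017)]
print(eligible_2017[0], "to", eligible_2017[-1])
```

Under these assumptions the window slides forward one publication year per calendar year, which is why the range of candidate works grows over time.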

Today we announce the “Sonny Bono Memorial Collection” containing the first books to be liberated. Anyone can download, read, and enjoy these works that have been long out of print. We will add another 10,000 books and other works in the near future. “Working with the Internet Archive has allowed us to do the work to make this part of the law usable,” reflected Professor Townsend Gard. “Hopefully, this will be the first of many “Last Twenty” Collections around the country.”

Now is the chance for libraries and citizens who have been reluctant to scan works beyond 1923 to push forward to 1941, and the Internet Archive will host them. “I’ve always said that the silver lining of the unfortunate Eldred v. Ashcroft decision was the response from people to do something, to actively begin to limit the power of the copyright monopoly through action that promoted open access and CC licensing,” says Carrie Russell, Director of ALA’s Program of Public Access to Information. “As a result, the academy and the general public has rediscovered the value of the public domain. The Last Twenty project joins the Internet Archive, the HathiTrust copyright review project, and the Creative Commons in amassing our public domain to further new scholarship, creativity, and learning.”

We thank and congratulate Team Durationator and Professor Townsend Gard for all the hard work that went into making this new collection possible. Professor Townsend Gard, along with her husband, Dr. Ron Gard, has started a company, Limited Times, to assist libraries, archives, and museums implementing Section 108(h), “Last Twenty” collections, and other aspects of the copyright law.

Prof. Elizabeth Townsend Gard

Tomi Aina
Law Student

Stan Sater
Law Student

Hundreds of thousands of books can now be liberated. Let’s bring the 20th century to 21st-century citizens. Everyone, rev your cameras!

Posted in Announcements, News, Open Library

Internet Archive’s Annual Bash this Wednesday! — Get your tickets now before we run out!

UPDATE: Tickets for the 20th Century Time Machine are officially Sold Out! If you would like to join our waitlist, we’ll release tickets as they become available and let you know via email.

Click here to join the waitlist


Limited tickets left for 20th Century Time Machine — the Internet Archive’s Annual Bash – happening this Wednesday at the Internet Archive from 5pm-9:30pm. In case you missed it, here’s our original announcement.

Tickets start at $15 here.

Once tickets sell out, you’ll have the opportunity to join the waitlist. We’ll release tickets as spaces free up and let you know via email.

We’d love to celebrate with you!

Posted in Announcements, News

History is happening, and we’re not just watching

  1. Which recent hurricane got the least amount of attention from TV news broadcasters?
    1. Irma
    2. Maria
    3. Harvey
  2. Thomas Jefferson said, “Government that governs least governs best.”
    1. True
    2. False
  3. Mitch McConnell shows up most on which cable TV news channel?
    1. CNN
    2. Fox News
    3. MSNBC

Answers at end of post.

The Internet Archive’s TV News Archive, our constantly growing online, free library of TV news broadcasts, contains 1.4 million shows, some dating back to 2009, searchable by closed captioning. History is happening, and we preserve how broadcast news filters it to us, the audience, whether it’s through CNN’s Jake Tapper, Fox’s Bill O’Reilly, MSNBC’s Rachel Maddow or others. This archive becomes a rich resource for journalists, academics, and the general public to explore the biases embedded in news coverage and to hold public officials accountable.

Last October we wrote how the Internet Archive’s TV News Archive was “hacking the election,” then 13 days away. In the year since, we’ve been applying our experience using machine learning to track political ads and TV news coverage in the 2016 elections to experiment with new collaborations and tools to create more ways to analyze the news.

Helping fact-checkers

Since we launched our Trump Archive in January 2017, and followed in August with the four congressional leaders, Democrat and Republican, as well as key executive branch figures, we’ve collected some 4,534 hours of curated programming and more than 1,300 fact-checks of material on subjects ranging from immigration to the environment to elections.

The 1,340 fact-checks–and counting–represent a subset of the work of partners FactCheck.org, PolitiFact and The Washington Post’s Fact Checker, as we link only to fact-checks that correspond to statements that appear on TV news. Most of the fact-checks–524–come from PolitiFact; 492 are by FactCheck.org, and 324 from The Washington Post’s Fact Checker.
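As a quick check on the figures above, the per-partner counts can be tallied and turned into shares. This is just arithmetic on the numbers quoted in this post; the variable and function names are made up for the example.

```python
# Fact-check counts as quoted in this post, keyed by partner.
fact_checks = {
    "PolitiFact": 524,
    "FactCheck.org": 492,
    "The Washington Post's Fact Checker": 324,
}

total = sum(fact_checks.values())


def share(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {partner: share(count, total) for partner, count in fact_checks.items()}

for partner, count in fact_checks.items():
    print(f"{partner}: {count} fact-checks ({shares[partner]}% of {total})")
```

The three counts sum to the 1,340 total, with PolitiFact accounting for a bit under two-fifths of the linked fact-checks.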

We’re also proud to be part of the Duke Reporters’ Lab’s new Tech & Check collaborative, where we’re working with journalists and computer scientists to develop ways to automate parts of the fact-checking process. For example, we’re creating processes to help identify important factual claims within TV news broadcasts to help guide fact-checkers where to concentrate their efforts. The initiative received $1.2 million from the John S. and James L. Knight Foundation, the Facebook Journalism Project and the Craig Newmark Foundation.

See the Trump, US Congress, and executive branch archives and collected fact-checks.

TV News Kitchen

We’re collaborating with data scientists, private companies and nonprofit organizations, journalists, and others to cook up new experiments available in our TV News Kitchen, providing new ways to analyze TV news content and understand ourselves.

Dan Schultz, our senior creative technologist, worked with the start-up Matroid to develop Face-o-Matic, which tracks faces of selected high level elected officials on major TV cable news channels: CNN, Fox News, MSNBC, and BBC News. The underlying data are available for download here. Unlike caption-based searches, Face-o-Matic uses facial recognition algorithms to recognize individuals on TV news screens. It is sensitive enough to catch this tiny, dark image of House Minority Leader Nancy Pelosi, D., Calif., within a graphic, and this quick flash of Senate Minority Leader Chuck Schumer, D., N.Y., and Senate Majority Leader Mitch McConnell, R., Ky.

The work of TV Architect Tracey Jaquith, our Third Eye project scans the lower thirds of TV screens, using OCR, or optical character recognition, to turn these fleeting missives into downloadable data ripe for analysis. Launched in September 2017, Third Eye tracks BBC News, CNN, Fox News, and MSNBC, and collected more than four million chyrons in just over two weeks, and counting.

Download Third Eye data. API and TSV options available.

Follow Third Eye on Twitter.

Vox news reporter Alvin Chang used the Third Eye chyron data to report how Fox News paid less attention to Hurricane Maria’s destruction in Puerto Rico than it did to Hurricanes Irma and Harvey, which battered Florida and Texas. Chang’s work followed a similar piece by Dhrumil Mehta for FiveThirtyEight, which used Television Explorer, a tool developed by data scientist Kalev Leetaru to search and visualize closed captioning on the TV News Archive.
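A comparison along these lines can be approximated by counting how many chyron rows on each channel mention each storm. The sketch below is not the method Vox or FiveThirtyEight used; the rows are invented placeholders and the (channel, text) layout is an assumption for illustration, not the Third Eye schema.

```python
from collections import Counter, defaultdict

# Invented placeholder rows as (channel, chyron text) pairs, for illustration only.
SAMPLE_ROWS = [
    ("CNN", "sample chyron mentioning Maria"),
    ("CNN", "sample chyron mentioning Irma"),
    ("FOX", "sample chyron mentioning Harvey"),
    ("FOX", "sample chyron mentioning Irma and Harvey"),
    ("MSNBC", "sample chyron mentioning MARIA"),
    ("MSNBC", "sample chyron with no storm name"),
]


def count_mentions(rows, keywords):
    """Count, per channel, the rows whose text contains each keyword (case-insensitive)."""
    counts = defaultdict(Counter)
    for channel, text in rows:
        lowered = text.lower()
        for keyword in keywords:
            if keyword.lower() in lowered:
                counts[channel][keyword] += 1
    return {channel: dict(counter) for channel, counter in counts.items()}


mentions = count_mentions(SAMPLE_ROWS, ["Harvey", "Irma", "Maria"])
for channel in sorted(mentions):
    print(channel, mentions[channel])
```

Note that this counts rows, not airtime: a chyron that stays on screen for a long stretch may appear as many rows or as one, depending on how the data was captured and filtered.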

 

FiveThirtyEight used TV News Archive captions to create this look at how cable networks covered recent hurricanes.

CNN’s Brian Stelter followed up with a similar analysis on “Reliable Sources” October 1.

We’re also working with academics who are using our tools to unlock new insights. For example, Schultz and Jaquith are working with Bryce Dietrich at the University of Iowa to apply the Duplitron, the audio fingerprinting tool that fueled our political ad airing data, to analyze floor speeches of members of Congress. The study identifies which floor speeches were aired on cable news programs and explores the reasons why those particular clips were selected for airing. A draft of the paper was presented at the 2017 Polinfomatics Workshop in Seattle and will begin review for publication in the coming months.

What’s next? Our plans include making more than a million hours of TV news available to researchers from both private and public institutions via digital public library branches of the Internet Archive’s TV News Archive. These branches would be housed in computing environments, where networked computers provide the processing power needed to analyze large amounts of data. Researchers will be able to conduct their own experiments using machine learning to extract metadata from TV news. Such metadata could include, for example, speaker identification–a way to identify not just when a speaker appears on a screen, but when she or he is talking. Metadata generated through these experiments would then be used to enrich the TV News Archive, so that any member of the public could do increasingly sophisticated searches.

Going global

We live in an interdependent world, but we often lack understanding about how other cultures perceive us. Collecting global TV could open a new window for journalists and researchers seeking to understand how political and policy messages are reported and spread across the globe. The same tools we’ve developed to track political ads, faces, chyrons, and captions can help us put news coverage from around the globe into perspective.

We’re beginning work to expand our TV collection to include more channels from around the globe. We’ve added the BBC and recently began collecting Deutsche Welle from Germany and the English-language Al Jazeera. We’re talking to potential partners and developing strategy about where it’s important to collect TV and how we can do so efficiently.

History is happening, but we’re not just watching. We’re collecting, making it accessible, and working with others to find new ways to understand it. Stay tuned. Email us at tvnews@archive.org. Follow us @tvnewsarchive, and subscribe to our weekly newsletter here.

Answer Key

  1. b. (See: “The Media Really Has Neglected Puerto Rico,” FiveThirtyEight.)
  2. b. False. (See: Vice President Mike Pence statement and linked PolitiFact fact-check.)
  3. c. MSNBC. (See: Face-O-Matic blog post.)

Members of the TV News Archive team: Roger Macdonald, director; Robin Chin, Katie Dahl, Tracey Jaquith, Dan Schultz, and Nancy Watzman.


TV News Record: 1,340 fact checks collected and counting

A weekly round up on what’s happening and what we’re seeing at the TV News Archive by Katie Dahl and Nancy Watzman. Additional research by Robin Chin.

In an era when social media algorithms skew what people see online, the Internet Archive TV News Archive’s collections of on-the-record statements by top political figures serve as a powerful model for how preservation can provide a deep resource for who really said what, when, and where.

Since we launched our Trump Archive in January 2017, and followed in August with the four congressional leaders, Democrat and Republican, as well as key executive branch figures, we’ve collected some 4,534 hours of curated programming and more than 1,300 fact-checks of material on subjects ranging from immigration to the environment to elections.

The 1,340 fact-checks–and counting–represent a subset of the work of partners FactCheck.org, PolitiFact and The Washington Post’s Fact Checker, as we link only to fact-checks that correspond to statements that appear on TV news. The largest share of the fact-checks–524–come from PolitiFact; 492 are by FactCheck.org, and 324 are from The Washington Post’s Fact Checker.

As a library, we’re dedicated to providing a record – sometimes literally, as in the case of 78s! – that can help researchers, journalists, and the public find trustworthy sources for our collective history. These clip collections, along with fact-checks, now largely hand-curated, provide a quick way to find public statements made by elected officials.

See the Trump, US Congress, and executive branch archives and collected fact-checks.

The big picture

Given his position at the helm of the government, it is not surprising that Trump garners most of the fact-checking attention. Three out of four, or 1,008 of the fact-checks, focus on Trump’s statements. Another 192 relate to the four congressional leaders: Senate Majority Leader Mitch McConnell, R., Ky.; Senate Minority Leader Chuck Schumer, D., N.Y.; House Speaker Paul Ryan, R., Wis.; and House Minority Leader Nancy Pelosi, D., Calif. We’ve also logged 140 fact-checks related to key administration figures such as Sean Spicer, Jeff Sessions, and Mike Pence.

pie chart
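For anyone who wants to check the arithmetic behind these breakdowns, the short Python sketch below recomputes the shares from the figures quoted in this post. The dictionary labels and the helper name shares are ours, chosen for illustration; only the counts come from the text.

```python
# Counts quoted in this post; labels are illustrative.
TOTAL = 1340

by_partner = {
    "PolitiFact": 524,
    "FactCheck.org": 492,
    "The Washington Post's Fact Checker": 324,
}

by_subject = {
    "Trump": 1008,
    "Congressional leaders": 192,
    "Administration figures": 140,
}


def shares(counts):
    """Return each category's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Both breakdowns should add up to the headline total.
assert sum(by_partner.values()) == TOTAL
assert sum(by_subject.values()) == TOTAL

if __name__ == "__main__":
    for label, counts in (("By partner", by_partner), ("By subject", by_subject)):
        print(label)
        for name, pct in shares(counts).items():
            print(f"  {name}: {pct}%")
```

Both breakdowns sum to 1,340, and Trump’s share works out to roughly 75 percent, which matches the “three out of four” figure above.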

The topics

The topics covered by fact-checkers run the gamut of national and global policy issues, history, and everything in between. For example, the debate on tax reform is grounded with fact-checks of the historical and global context posited by the president. Fact-checkers have also examined his aides’ claims on the impact of the current reform proposal on the wealthy and on the deficit. They’ve also followed the claims made by House Speaker Paul Ryan, R., Wis., the leading GOP policy voice on tax reform.

Another large set of fact-checks covers health care, going back as far as this claim made in 2010 by Pelosi about job creation under healthcare reform. (PolitiFact rated it “Half True.”) The most recent example is the Graham-Cassidy bill that aimed to repeal much of Obamacare. One of the most sharply contested debates about that legislation was whether or not it would require coverage of people with pre-existing conditions. Fact-checkers parsed the he-said he-said debate as it unfolded on TV news, for example examining dueling claims by Schumer and Trump.

Browse or download fact-checked TV clips by topic

The old stuff

The collection of Trump fact-checks includes a few dating back to 2011, long before his successful presidential campaign. Here he is at the CPAC conference that year claiming no one remembered now-former President Barack Obama from school, part of his campaign to question Obama’s citizenship. (PolitiFact rated: “Pants on Fire!”) And here he is with what FactCheck.org called a “100 percent wrong” claim about the Egyptian people voting to overturn a treaty with Israel.

This fact-check of McConnell dates back to 2009, when PolitiFact rated “false” his claim of how much federal spending occurred under Obama’s watch: “In just one month, the Democrats have spent more than President Bush spent in seven years on the war in Iraq, the war in Afghanistan and Hurricane Katrina combined.”

Meanwhile, this 2010 statement by Schumer, rated “mostly false” by PolitiFact, asserted that the U.S. Supreme Court “decided to overrule the 100-year-old ban on corporate expenditures.” The ban on giving directly to candidates is still in place; however, corporations are free to spend unlimited funds on elections provided they do so separately from a candidate’s official campaign.

The repetition

Twenty-four million people will be forced off their health insurance, young farmers have to sell the farm to pay estate tax, NATO members owe the United States money, millions of women turn to Planned Parenthood for mammograms, and sanctuary cities lead to higher crime. These are all examples of claims found to be inaccurate or misleading, but that continued or continue to be repeated by public officials.

The unexpected

Whether you lean one political direction or another, there are always surprises from the fact-checkers that can keep all our assumptions in check. For example, if you’re opposed to building a wall on the southern border to keep people from crossing into the U.S., you might guess Trump’s claim that people use catapults to toss drugs over current walls is an exaggeration. In fact, that statement was rated “mostly true” by PolitiFact. Or if you’re conservative, you might be surprised to learn an often repeated quote ascribed to Thomas Jefferson, in this case by Vice President Mike Pence, is in fact falsely attributed to him.

How to find

If you’re looking for the most recent TV news statements with fact-checks, you can see the latest offerings on the TV Archive’s homepage by scrolling down.

screen grab of place on tv homepage

You can review whole speeches, scanning for just the fact-checked claims by looking for the fact-check icon on a program timeline. For example, starting in the Trump Archive, you can choose a speech or interview and see if and how many of the statements were checked by reporters.

screen grab of timeline w icons

You can also find the fact-checks in the growing table, also available to download, which includes details on the official making the claim, the topic(s) covered, the url for the corresponding TV news clip, and the link to the fact-checking article.

image of fact-checks table

To receive the TV News Archive’s email newsletter, subscribe here.

###


Wayback Machine Playback… now with Timestamps!

The Wayback Machine has an exciting new feature: it can list the dates and times, the Timestamps, of all page elements compared to the date and time of the base URL of a page.  This means that users can see, for instance, that an image displayed on a page was captured X days before the URL of the page or Y hours after it.  Timestamps are available via the “About this capture” link on the right side of the Wayback Toolbar.  Here is an example:

The Timestamps list includes the URLs and date and time difference compared to the current page for the following page elements: images, scripts, CSS and frames. Elements are presented in a descending order. If you put your cursor over a list element on the page, it will be highlighted and if you click on it you will be shown a playback of just that element.

Under the hood

Web pages are usually a composition of multiple elements such as images, scripts and CSS. The Wayback Machine tries to archive and playback web pages in the best possible manner, including all their original elements.  Each web page element has its own URL and Timestamp, indicating the exact date and time it was archived. Page elements may have similar Timestamps but they could also vary significantly for various reasons which depend on the web crawling process. By using the new Timestamps feature, users can easily learn the archive date and time for each element of a page.
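To illustrate the comparison described above, here is a minimal Python sketch that works out how far each element’s capture time is from the page’s capture time. It assumes timestamps in the 14-digit YYYYMMDDhhmmss form that appears in Wayback Machine URLs. The element list, the example.com URLs, and the function names are hypothetical, and sorting by largest difference first is our own choice. This is not the Wayback Machine’s actual implementation.

```python
from datetime import datetime

# 14-digit capture timestamp as seen in Wayback Machine URLs (YYYYMMDDhhmmss).
WAYBACK_FORMAT = "%Y%m%d%H%M%S"


def parse_ts(ts):
    """Parse a 14-digit capture timestamp into a datetime."""
    return datetime.strptime(ts, WAYBACK_FORMAT)


def offset_seconds(page_ts, element_ts):
    """Signed difference in seconds; negative means captured before the page."""
    return int((parse_ts(element_ts) - parse_ts(page_ts)).total_seconds())


def describe_offset(page_ts, element_ts):
    """Describe an element's capture time relative to the page's capture time."""
    seconds = offset_seconds(page_ts, element_ts)
    if seconds == 0:
        return "same time as the page"
    direction = "after" if seconds > 0 else "before"
    days, rem = divmod(abs(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    units = (("day", days), ("hour", hours), ("minute", minutes), ("second", secs))
    parts = [f"{n} {name}{'s' if n != 1 else ''}" for name, n in units if n]
    return f"{', '.join(parts)} {direction} the page"


if __name__ == "__main__":
    page_ts = "20171005120000"
    # Hypothetical page elements: (URL, capture timestamp).
    elements = [
        ("http://example.com/logo.png", "20171002120000"),
        ("http://example.com/app.js", "20171005150000"),
        ("http://example.com/style.css", "20171005120000"),
    ]
    # Largest time difference first (an assumed ordering).
    elements.sort(key=lambda e: abs(offset_seconds(page_ts, e[1])), reverse=True)
    for url, ts in elements:
        print(url, "-", describe_offset(page_ts, ts))
```

An element captured days or weeks away from the page itself is the kind of case the Timestamps list is meant to bring to a reader’s attention.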

Why this is important

The Wayback Machine is increasingly used in critical procedures such as legal evidence or political debate material. It is important that what is presented is clear and transparent, even in the light of a web that was not designed to be archived. One of the ways a web archive could be confusing is via anachronisms, displaying content from different dates and times than the user expects. For example, when an archived page is played back, it could include some images from the current web, making it look like the image came from the past when it did not. We implemented Timestamps to provide users with more context about, and in turn hopefully greater confidence in, what they are seeing.
