
A New Approach To Understanding War Through Television News: Introducing The TV News Visual Explorer & The Belarusian, Russian & Ukrainian TV News Archive

For more than 20 years, the Internet Archive’s Television News Archive has monitored television news, preserving more than 9.5 million broadcasts totaling more than 6.6 million hours from across the world, with a continuous archive spanning the past decade. Today just a small sliver of that archive is accessible to journalists and scholars, because video at this scale is so hard to navigate: no human can make sense of that much television news by fast forwarding through it. The small fraction of programs that contain closed captioning, speech recognition transcripts or OCR’d onscreen text can be keyword searched through the TV Explorer and TV AI Explorer, but for the majority of this global multi-decade archive, there has until now been no way for researchers to assess and understand the narratives of television news at scale. That is especially true of the visual landscape that distinguishes television from other forms of media and that is so central to understanding many of the world’s biggest stories, from war to pandemics to the economy.

As the TV News Archive enters its third decade, it is increasingly exploring the ways in which it can preserve the domestic and international response to global events as it did with 9/11 two decades ago. As a first step towards this vision, over the last few months the Archive has preserved more than 46,000 broadcasts from domestic Belarusian, Russian and Ukrainian television news channels, including (in the order they were added to the Archive) Russia Today (part of the Archive since July 2010 but included in this collection starting January 1), Russian channels 1TV, NTV and Russia 1 (from March 26) and Russia 24 (from April 25), Ukrainian channel Espreso (from April 25) and Belarusian channel Belarus 24 (from May 16).

Why preserve television news coverage in a time of war? For journalists today it makes it possible to digest and report on how the war is being framed and narrated, with an eye towards how these narratives influence and shape popular support for the conflict and its potential future trajectory. For future generations of scholars, it makes it possible to look back at the contemporary information environment and prevailing public information, perspectives, and narratives.

While there are myriad options for the general public to watch these channels today in realtime, there is no research-oriented archival interface designed for journalists and scholars to understand their coverage at the scale of days to months, to scan for key visuals and events and to comment, discuss and illustrate how nations are portraying major stories.

To address this critical need, today we are tremendously excited to unveil the Television News Visual Explorer, a collaboration of the GDELT Project, the Internet Archive’s Television News Archive and the Media-Data Research Consortium to explore new approaches to enabling rapid exploration and understanding of the visual landscape of television news.

The Visual Explorer converts each broadcast into a series of thumbnails, one every 4 seconds, displayed in a grid six frames wide that scrolls vertically through the entire program, making it possible to skim an hour-long broadcast in a matter of seconds. Clicking on any thumbnail plays a brief 30-second clip of the broadcast at that point, making it trivial to rapidly triage a broadcast for key moments. The underlying thumbnails can even be downloaded as a ZIP file to enable non-consumptive computational analysis, from OCR to augmented search.
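To give a rough sense of the grid described above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the handling of a partial final interval (rounding up) is an assumption, not a description of how the Visual Explorer itself is built.

```python
import math


def grid_dimensions(duration_seconds, interval_seconds=4, frames_per_row=6):
    """Return (thumbnail_count, row_count) for one broadcast.

    Assumes one thumbnail per interval and rounds a partial
    final interval or row up to a whole one.
    """
    thumbnails = math.ceil(duration_seconds / interval_seconds)
    rows = math.ceil(thumbnails / frames_per_row)
    return thumbnails, rows


def thumbnail_position(offset_seconds, interval_seconds=4, frames_per_row=6):
    """Return the zero-based (row, column) of the thumbnail covering an offset."""
    index = int(offset_seconds // interval_seconds)
    return divmod(index, frames_per_row)


# An hour-long broadcast at one frame every 4 seconds, six frames per row.
thumbnails, rows = grid_dimensions(60 * 60)
print(thumbnails, rows)          # 900 150
print(thumbnail_position(125))   # (5, 1)
```

Under these assumptions an hour of television becomes 900 thumbnails in 150 rows, which is why a whole program can be scrolled through in seconds.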

Machines today can catalog the basic objects and activities they see in video and generate transcripts of their spoken and written words, but the ability to contextualize and understand the meaning of all that coverage remains a uniquely human capability. No person could watch the entirety of the Archive’s 6.6 million hours of broadcasts, yet even just the 46,000 broadcasts in this new collection would be difficult for a single researcher to watch or even fast forward through in their entirety. Television’s linear format means coverage has historically been consumed a single moment at a time like a flashlight in a darkened warehouse. In contrast, this new interface makes it possible to see an entire broadcast all at once in a single display, making television news “skimmable” for the first time.

The Visual Explorer and this new research collection of Belarusian, Russian and Ukrainian television news coverage represent early glimpses into a new initiative reimagining how memory institutions like the Archive can make their vast television news archives more accessible to scholars, journalists and informed citizens. Beneath the simple and intuitive interface lies an immensely complex and highly experimental set of workflows prototyping both an entirely new scholarly and journalistic interface to television news and entirely new approaches to rapidly archiving international television coverage of global events.

Over the coming weeks, additional channels from the TV News Archive will become available through the new Visual Explorer, as well as a variety of experiments with the new lenses that tools like automatic transcription and translation can offer in helping journalists and scholars make sense of such vast realtime archives.

Get Started With The Television News Visual Explorer!

About Kalev Leetaru

For more than 25 years, GDELT’s creator, Dr. Kalev H. Leetaru, has been studying the web and building systems to interact with and understand the way it is reshaping our global society. One of Foreign Policy Magazine’s Top 100 Global Thinkers of 2013, his work has been featured in the presses of over 100 nations and fundamentally changed how we think about information at scale and how the “big data” revolution is changing our ability to understand our global collective consciousness.

Reflecting on 9/11: Twenty Years of Archived TV News – Special Event and Resources

On Thursday, September 9, the Internet Archive will host an online webinar, “Reflecting on 9/11: Twenty Years of Archived TV News.” Learn from scholars, journalists, archivists, and data scientists about the importance of archived television for gaining insights into our evolving understanding of history and society.

Participants include the Internet Archive, The American Archive of Public Broadcasting, The Vanderbilt Television News Archive and UCLA Library’s NewsScape TV News Archive. Speakers will include Roger Macdonald (Founder, Internet Archive’s TV News Archive), Jim Duran (Director, Vanderbilt Television News Archives), Karen Cariani (David O. Ives Executive Director, GBH Archives and GBH Project Director, American Archive of Public Broadcasting), Todd Grappone (UCLA Associate University Librarian for Digital Initiatives and Information Technology), Kalev Leetaru (Founder, Global Database of Events, Language and Tone Project), and Philip Bump (Washington Post national correspondent focused largely on the numbers behind politics).

Please register in advance for the September 9 webinar (11:00 AM – 12:30 PM PDT)

Journalists and scholars: as you prepare 20th anniversary 9/11 reporting and analysis, these unique resources are available:

  • Internet Archive’s 9/11 Television News Archive – a browsable library of TV news from U.S. and international broadcasters from 19 networks, over seven days, from the morning of September 11 through September 17, 2001. Contact: Josh Baran 917-797-1799
  • The Vanderbilt Television News Archive (VTNA) – Founded in 1968, the Archive’s collection includes TV news coverage of the attacks on 9/11/2001 and of the following weeks, broadcast by ABC, NBC, CBS and CNN. Over 270 hours of footage are available for viewing and research. The VTNA records and preserves national television broadcasts of the evening news on ABC, CBS, and NBC, with the addition of the primetime news program on CNN in 1995 and the Fox News Channel in 2004. In addition to these nightly recordings, the VTNA also monitors television news networks for breaking live events. Contact: Jim Duran – 615-936-4019
  • The American Archive of Public Broadcasting (AAPB) marks the 20th anniversary of the 9/11 terrorist attacks by releasing a new 9/11 Special Coverage Collection of 68 public television and radio programs from stations across the country covering the events of the attacks and the aftermath. Among the featured programs are coverage of 9/11 and its anniversaries by The Newshour with Jim Lehrer, the PBS News Hour, and much more. The AAPB is a collaboration between Boston public media producer GBH and the Library of Congress to preserve and make accessible culturally significant public media programs from across the country. Contact: Emily Balk, GBH External Communications Manager – 617-300-5317
  • UCLA Library’s NewsScape TV News Archive contains digitized television news programs collected from cable and broadcast sources in the Los Angeles area from 2005 to the present, as well as a smaller number of news programs from other domestic, international, and online sources collected from 2004 to the present. The archive includes hundreds of thousands of hours of news programs, which are indexed and time-referenced via their closed captions and other associated metadata to enable full-text searching and interactive streaming playback.
Interface for browsing TV news on 19 networks – September 11, 2001 through September 17th – Internet Archive

Background:

  • 500+ archived 9/11-related websites curated by The National September 11 Memorial Museum using the Internet Archive’s Archive-It service
  • Internet Archive’s Open Library offers a list of 2,630 published works about the 9/11 attack
  • A decade ago, on the 10th anniversary of 9/11, NYU’s Department of Cinema Studies hosted a conference, “Learning from Recorded Memory,” that featured work by scholars using television news materials to help us understand how TV news presented the events of 9/11 and the international response.
  • This fall, the Internet Archive celebrates its 25th anniversary.
  • The Internet Archive’s TV News Archive repurposes closed captioning as a search index for nearly three million hours of U.S. local and national TV news (2,239,000+ individual shows) from mid-2009 to the present. The public interest library is dedicated to helping journalists, scholars, and the public compare, contrast, cite, and borrow specific portions of the collection. Advanced quantitative analysis opportunities and data visualizations are available via the collaborating GDELT Project’s Television Explorer and AI Television Explorer.
  • Roger Macdonald, founder of the Internet Archive’s TV News Archive, is available for background interviews and to help journalists access the archive.

Internet Archive 9/11 Event and Resources Media Contact:  pressinfo@archive.org

Archiving Online Local News with the News Measures Research Project

Over the past two years Archive-It, Internet Archive’s web archiving service, has partnered with researchers at the Hubbard School of Journalism and Mass Communication at University of Minnesota and the DeWitt Wallace Center for Media and Democracy at Duke University in a project designed to evaluate the health of local media ecosystems as part of the News Measures Research Project, funded by the Democracy Fund. The project is led by Phil Napoli at Duke University and Matthew Weber at University of Minnesota. Project staff worked with Archive-It to crawl and archive the homepages of 663 local news websites representing 100 communities across the United States. Seven crawls were run on single days from July through September and captured over 2.2TB of unique data and 16 million URLs. Initial findings from the research detail how local communities cover core topics such as emergencies, politics and transportation. Additional findings look at the volume of local news produced by different media outlets, and show the importance of local newspapers in providing communities with relevant content.

The goal of the News Measures Research Project is to examine the health of local community news by analyzing the amount and type of local news coverage in a sample of communities. In order to generate a random and unbiased sample of communities, the team used US Census data. Prior research suggested that average income in a community is correlated with the amount of local news coverage; thus the team decided to focus on three different income brackets (high, medium and low), using the Census data to break up the communities into categories. Rural areas and major cities were eliminated from the sample in order to reduce the number of outliers; this left a list of 1,559 communities ranging in population from 20,000 to 300,000 and in average household income from $21,000 to $215,000. Next, a random sample of 100 communities was selected, and a rigorous search process was applied to build a list of 663 news outlets that cover local news in those communities (based on Web searches and established directories such as Cision).
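The selection steps described above (filter by population, draw a random sample, group by income bracket) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project’s actual procedure: the field names, the income cutoffs, and the choice to draw one simple random sample and bracket it afterwards are all assumptions made for the example.

```python
import random


def eligible(community, min_pop=20_000, max_pop=300_000):
    """Keep communities inside the population range quoted in the text."""
    return min_pop <= community["population"] <= max_pop


def income_bracket(community, low_cutoff, high_cutoff):
    """Assign a bracket; the cutoffs are hypothetical, not the project's."""
    income = community["avg_household_income"]
    if income < low_cutoff:
        return "low"
    if income < high_cutoff:
        return "medium"
    return "high"


def sample_communities(communities, sample_size, low_cutoff, high_cutoff, seed=0):
    """Draw a simple random sample of eligible communities, grouped by bracket."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    pool = [c for c in communities if eligible(c)]
    groups = {"low": [], "medium": [], "high": []}
    for community in rng.sample(pool, sample_size):
        bracket = income_bracket(community, low_cutoff, high_cutoff)
        groups[bracket].append(community["name"])
    return groups


# Synthetic stand-in for Census records (invented values, for illustration only).
data_rng = random.Random(42)
communities = [
    {
        "name": f"community-{i}",
        "population": data_rng.randint(5_000, 1_000_000),
        "avg_household_income": data_rng.randint(21_000, 215_000),
    }
    for i in range(3000)
]

groups = sample_communities(communities, 100, low_cutoff=50_000, high_cutoff=100_000)
print({bracket: len(names) for bracket, names in groups.items()})
```

Fixing the random seed is a design choice worth keeping in any real version, since it lets other researchers reproduce exactly which communities were drawn.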

The News Measures Research Project web captures provide a unique snapshot of local news in the United States. The work is focused on analyzing the nature of local news coverage at a local level, while also examining the broader nature of local community news. At the local level, the 100 community sample provides a way to look at the nature of local news coverage. Next, a team of coders analyzed content on the archived web pages to assess what is being covered by a given news outlet. Often, the websites that serve a local community are simply aggregating content from other outlets, rather than providing unique content. The research team was most interested in understanding the degree to which local news outlets are actually reporting on topics that are pertinent to a given community (e.g. local politics). At the global level, the team looked at interaction between community news websites (e.g. sharing of content) as well as automated measures of the amount of coverage.

The primary data for the researchers was the archived local community news data, but in addition, the team worked with census data to aggregate other measures such as circulation data for newspapers. These data allowed the team to examine how the amount and type of local news change depending on the characteristics of the community. Because the team was using multiple datasets, the Web data is just one part of the puzzle. The WAT data format proved particularly useful for the team in this regard. Using the WAT file format allowed the team to avoid digging deeply into the data; rather, the WAT data allowed the team to examine high level structure without needing to examine the content of each and every WARC record. Down the road, the WARC data allows for a deeper dive, but the lighter metadata format of the WAT files has enabled early analysis.
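As a rough illustration of why WAT metadata is enough for structural questions, the sketch below counts the external hosts a page links to using only the JSON metadata of a single record. The sample record is invented, the function name is hypothetical, and the field names follow the WAT layout as commonly documented (an "Envelope" holding "WARC-Header-Metadata" and "Payload-Metadata"), which should be checked against real files. Real WAT files are compressed WARC containers and would need a WARC reader to iterate over records; that step is omitted here.

```python
import json
from collections import Counter
from urllib.parse import urljoin, urlsplit

# Invented example of one WAT record's JSON payload (assumed layout).
SAMPLE_WAT_PAYLOAD = json.dumps({
    "Envelope": {
        "WARC-Header-Metadata": {
            "WARC-Type": "response",
            "WARC-Target-URI": "http://news.example.com/",
        },
        "Payload-Metadata": {
            "HTTP-Response-Metadata": {
                "HTML-Metadata": {
                    "Head": {"Title": "Example Local News"},
                    "Links": [
                        {"path": "A@/href", "url": "/politics/council-vote"},
                        {"path": "A@/href", "url": "http://wire.example.org/story/123"},
                        {"path": "IMG@/src", "url": "http://cdn.example.net/logo.png"},
                        {"path": "A@/href", "url": "http://wire.example.org/story/456"},
                    ],
                }
            }
        },
    }
})


def external_link_hosts(wat_json):
    """Return (page_host, Counter of external hosts linked via anchor tags)."""
    envelope = json.loads(wat_json)["Envelope"]
    page_url = envelope["WARC-Header-Metadata"]["WARC-Target-URI"]
    page_host = urlsplit(page_url).hostname
    html_meta = (
        envelope.get("Payload-Metadata", {})
        .get("HTTP-Response-Metadata", {})
        .get("HTML-Metadata", {})
    )
    hosts = Counter()
    for link in html_meta.get("Links", []):
        # Only count hyperlinks, not images, scripts or stylesheets.
        if not link.get("path", "").startswith("A@"):
            continue
        # Resolve relative links against the page URL before comparing hosts.
        host = urlsplit(urljoin(page_url, link.get("url", ""))).hostname
        if host and host != page_host:
            hosts[host] += 1
    return page_host, hosts


page_host, hosts = external_link_hosts(SAMPLE_WAT_PAYLOAD)
print(page_host, dict(hosts))  # news.example.com {'wire.example.org': 2}
```

Aggregating these per-page counts across a crawl would give a simple picture of which outlets link to which, without opening the full page content stored in the WARC records.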

Stay tuned for more updates as research utilizing this data continues! The websites selected will continue to be archived and much of the data are publicly available.

TV news highlights with fact checks

By Nancy Watzman and Katie Dahl

Last week, our national fact checking partners concentrated on two events featuring President Donald Trump: a press conference on February 16, and his rally in Melbourne, Florida on February 18. The Conservative Political Action Conference is being hosted this week. Look out for fact-checking of President Trump’s speech soon.  Here are some highlights, along with TV news segments from the Trump Archive and TV News Archive.

Steve Bannon and Reince Priebus addressed the conference yesterday. Bannon again called the press the “opposition party.”

Claim: Obama released Gitmo detainee that recently became a suicide bomber (wasn’t him)

Deputy assistant to the president, Sebastian Gorka, on Fox & Friends: “So President Obama released lots and lots of people that were there for a very good reason, and what happened? Almost half the time, they returned to the battlefield. This individual… goes and executes a suicide attack in Iraq.” At FactCheck.org, Farley wrote “Gorka wrongly suggested the man was released by President Barack Obama. He was transferred… President George W. Bush… then wrongly claimed that among detainees released by Obama, ‘almost half the time, they returned to the battlefield.’ According to the Office of the Director of National Intelligence, about 12.4 percent of those transferred from Gitmo under Obama are either confirmed or suspected of reengaging.”

Claim: there are 13, 14, 15 million undocumented people in the country (too high)

At a press briefing this week, White House Press Secretary Sean Spicer said “12, 14, 15 million people [are] in the country illegally,” but Michelle Ye Hee Lee gave him Three Pinocchios for The Washington Post’s Fact Checker. “Spicer’s statement that there are about 12 million people in the country illegally is safely within the margin of error in credible demographics research. But once he enters the realm of ‘13, 14, 15 million’ or ‘potentially more,’ his claim becomes problematic.”

Claim: Thomas Jefferson said “nothing can be believed which is seen in a newspaper.” (out of context)

At his rally in Florida, Trump said President Thomas Jefferson had said that “nothing can be believed which is seen in a newspaper. Truth itself….becomes suspicious by being put into that polluted vehicle.”

However, “Trump selectively quotes from Jefferson here, who, for most of his life, was a fierce defender of the need for a free press,” Kessler wrote for The Washington Post’s Fact Checker. PolitiFact staff made a similar point, using this quote as evidence: “And were it left to me to decide whether we should have a government without newspapers, or newspapers without a government, I should not hesitate a moment to prefer the latter.”

Claim: something happened in Sweden. (Not exactly)

By far the quote from the president’s rally that received the most attention was his comment about Sweden: “We’ve got to keep our country safe … You look at what’s happening last night in Sweden… Sweden? Who would believe this? Sweden. They took in large numbers. They’re having problems like they never thought possible.”

“This was a very strange comment. Nothing had happened the night before in Sweden,” wrote Kessler for The Washington Post’s Fact Checker.  A White House spokesperson said later that he “was talking about rising crime and recent incidents in general and not referring to a specific incident.”

PolitiFact reporter Miriam Valverde reported on the Fox News interview on Swedish crime rates, which aired the night before the rally and purportedly inspired Trump’s comments. Valverde quoted several Swedish experts countering the argument that crime rates are rising in Sweden, including political scientist Henrik Selin, who said that “[i]n general, crime statistics have gone down the last (few) years, and no there is no evidence to suggest that new waves of immigration has lead to increased crime.”

Robert Farley reported for  FactCheck.org, “Swedish authorities and criminologists say President Donald Trump is exaggerating crime in Sweden as a result of its liberal policy of accepting refugees from Syria and other Middle Eastern countries.”

Claim: The stock market has hit record numbers (mostly true)

The President mentioned the economy at a press conference, saying “The stock market has hit record numbers, as you know. And there has been a tremendous surge of optimism in the business world.” At PolitiFact, Miriam Valverde rated this as “Mostly True,” reporting “All three major stock indexes closed at record highs for five days in row on Feb. 15.”

Claim: the media is less trustworthy than Congress (mostly false, but…)

Also at the press conference, President Trump excoriated the media, saying journalists “will not tell you the truth and treat the wonderful people of our country with the respect that they deserve,” that the “press is out of control,” and that the media has a “lower approval rate than Congress, I think that’s right, I don’t know.”

PolitiFact reporter Jon Greenberg rated the trust claim as “mostly false”: “Congress actually ranks below the news media, according to surveys from three different research groups spanning several years. In two polls, mistrust in the media broke 40 percent, which is hardly anything to brag about. But in those studies, mistrust in Congress was over 50 percent.”

Glenn Kessler and Michelle Ye Hee Lee at The Washington Post’s Fact Checker agreed that Congress ranks lower than the media–but that that isn’t saying much: “[B]esides Congress, only ‘big business’ ranks lower than the media — but it’s enough to make Trump’s claim incorrect.”

FactCheck.org chimed in, noting that the “public’s approval of Congress is lower than its trust in the media,” but pointed out there’s more public trust in Trump than in the media: “Trump would have been correct to say that trust in the media is even lower than approval of himself. According to Gallup, Trump’s approval rating stood at 41 percent, as of the week ending Feb. 12, while the public’s trust in the media was down to 32 percent.”  

Claim: Trump had biggest electoral college win since Ronald Reagan. (False)

President Trump claimed his victory marked “the biggest electoral college win since Ronald Reagan.” NBC reporter Peter Alexander challenged him on the spot, saying, “Why should Americans trust you when you have accused information they have received as being fake when you have been providing information that is fake?” Trump didn’t answer the question, but rather pivoted by asking whether the reporter agreed that his victory was substantial.  

According to our fact-checking partners, there have been three presidents since Reagan who received more electoral college votes than Trump. FactCheck.org noted “Trump’s Electoral College victory margin ranks 46th out of 58 presidential elections.” Kessler and Lee wrote: “Of the nine presidential elections since 1984, Trump’s electoral college win ranks seventh.”

Claim: Hillary Clinton gave away 20 percent of the uranium in the United States (false)

President Trump repeated a claim that The Washington Post’s Fact Checker has given Four Pinocchios: that Hillary Clinton “gave away 20 percent of the uranium in the United States.” He went on to say, “you know what uranium is, right? This thing called nuclear weapons and other things like lots of things are done with uranium, including some bad things,” insinuating that the uranium could be used in a Russian nuclear weapon. FactCheck.org wrote: “The deal Clinton had a role in approving gave Russia ownership of 20 percent of U.S. production capacity — not existing stocks of uranium. Furthermore, Clinton alone could not have stopped the deal; only the president could have done that with a finding that national security would be endangered. Lastly, none of the uranium goes to Russia. That would require export licenses.”