Archiving Online Local News with the News Measures Research Project

Over the past two years, Archive-It, Internet Archive’s web archiving service, has partnered with researchers at the Hubbard School of Journalism and Mass Communication at the University of Minnesota and the DeWitt Wallace Center for Media and Democracy at Duke University in a project designed to evaluate the health of local media ecosystems as part of the News Measures Research Project, funded by the Democracy Fund. The project is led by Phil Napoli at Duke University and Matthew Weber at the University of Minnesota. Project staff worked with Archive-It to crawl and archive the homepages of 663 local news websites representing 100 communities across the United States. Seven crawls were run on single days from July through September and captured over 2.2TB of unique data and 16 million URLs. Initial findings from the research detail how local communities cover core topics such as emergencies, politics and transportation. Additional findings look at the volume of local news produced by different media outlets, and show the importance of local newspapers in providing communities with relevant content.

The goal of the News Measures Research Project is to examine the health of local community news by analyzing the amount and type of local news coverage in a sample of communities. In order to generate a random and unbiased sample of communities, the team used US Census data. Prior research suggested that average income in a community is correlated with the amount of local news coverage; thus the team decided to focus on three different income brackets (high, medium and low), using the Census data to break up the communities into categories. Rural areas and major cities were eliminated from the sample in order to reduce the number of outliers; this left a list of 1,559 communities ranging in population from 20,000 to 300,000 and in average household income from $21,000 to $215,000. Next, a random sample of 100 communities was selected, and a rigorous search process was applied to build a list of 663 news outlets that cover local news in those communities (based on Web searches and established directories such as Cision).
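The selection steps described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the record fields (`population`, `income`), the function names, and the use of income terciles to define the low, medium and high brackets are all assumptions made for the example.

```python
import random


def assign_bracket(income, low_cut, high_cut):
    """Label an income as low, medium or high (the cut points are assumptions)."""
    if income < low_cut:
        return "low"
    if income < high_cut:
        return "medium"
    return "high"


def sample_communities(communities, n, seed=0, min_pop=20_000, max_pop=300_000):
    """Filter by population, then draw a simple random sample of n communities."""
    # Drop rural areas and major cities using the population bounds from the text.
    eligible = [c for c in communities if min_pop <= c["population"] <= max_pop]

    # Hypothetical bracket definition: split eligible communities into income terciles.
    incomes = sorted(c["income"] for c in eligible)
    low_cut = incomes[len(incomes) // 3]
    high_cut = incomes[(2 * len(incomes)) // 3]

    # A seeded generator makes the draw reproducible.
    rng = random.Random(seed)
    chosen = rng.sample(eligible, n)

    # Return copies of the chosen records with the bracket label attached.
    return [dict(c, bracket=assign_bracket(c["income"], low_cut, high_cut))
            for c in chosen]
```

With a list of Census-derived records, `sample_communities(records, 100)` would return 100 communities, each tagged with an income bracket for later comparison.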

The News Measures Research Project web captures provide a unique snapshot of local news in the United States. The work is focused on analyzing the nature of local news coverage at a local level, while also examining the broader nature of local community news. At the local level, the 100-community sample provides a way to look at the nature of local news coverage. A team of coders analyzed content on the archived web pages to assess what is being covered by a given news outlet. Often, the websites that serve a local community are simply aggregating content from other outlets, rather than providing unique content. The research team was most interested in understanding the degree to which local news outlets are actually reporting on topics that are pertinent to a given community (e.g. local politics). At the global level, the team looked at interaction between community news websites (e.g. sharing of content) as well as automated measures of the amount of coverage.

The primary data for the researchers was the archived local community news data, but in addition, the team worked with census data and aggregated other measures such as circulation data for newspapers. These data allowed the team to examine how the amount and type of local news changes depending on the characteristics of the community. Because the team was using multiple datasets, the Web data is just one part of the puzzle. The WAT data format proved particularly useful for the team in this regard. Using the WAT file format allowed the team to avoid digging deeply into the data; rather, the WAT data allowed the team to examine high-level structure without needing to examine the content of each and every WARC record. Down the road, the WARC data allows for a deeper dive, but the lighter metadata format of the WAT files has enabled early analysis.
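As a rough illustration of what working from WAT metadata can look like, the Python sketch below reads the JSON payload of a single WAT record and lists the other hosts a page links to, without touching the page content itself. It is not the research team's code. The function name is hypothetical, and the key names (`Envelope`, `WARC-Header-Metadata`, `Payload-Metadata`, `HTTP-Response-Metadata`, `HTML-Metadata`, `Links`) are assumed to follow the commonly documented WAT layout; they should be checked against the actual files.

```python
import json
from urllib.parse import urljoin, urlparse


def outbound_hosts(wat_json_text):
    """Return (page_host, sorted list of other hosts linked from the page)."""
    record = json.loads(wat_json_text)
    envelope = record.get("Envelope", {})

    # URL of the captured page (assumed location within the WAT envelope).
    page_url = envelope.get("WARC-Header-Metadata", {}).get("WARC-Target-URI", "")

    # Link metadata extracted from the HTML (assumed location within the envelope).
    links = (envelope.get("Payload-Metadata", {})
             .get("HTTP-Response-Metadata", {})
             .get("HTML-Metadata", {})
             .get("Links", []))

    page_host = urlparse(page_url).netloc
    hosts = set()
    for link in links:
        # Resolve relative links against the page URL before comparing hosts.
        target = urljoin(page_url, link.get("url", ""))
        host = urlparse(target).netloc
        if host and host != page_host:
            hosts.add(host)
    return page_host, sorted(hosts)
```

Run over every record in a crawl, output like this could feed a simple site-to-site link table, one possible way to look at interaction between community news websites.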

Stay tuned for more updates as research utilizing this data continues! The websites selected will continue to be archived and much of the data are publicly available.

TV news highlights with fact checks

By Nancy Watzman and Katie Dahl

Last week, our national fact-checking partners concentrated on two events featuring President Donald Trump: a press conference on February 16, and his rally in Melbourne, Florida on February 18. The Conservative Political Action Conference is being hosted this week. Look out for fact-checking of President Trump’s speech soon. Here are some highlights, along with TV news segments from the Trump Archive and TV News Archive.

Steve Bannon and Reince Priebus addressed the conference yesterday. Bannon again called the press the “opposition party.”

Claim: Obama released Gitmo detainee that recently became a suicide bomber (wasn’t him)

Deputy assistant to the president, Sebastian Gorka, on Fox & Friends: “So President Obama released lots and lots of people that were there for a very good reason, and what happened? Almost half the time, they returned to the battlefield. This individual… goes and executes a suicide attack in Iraq.” At FactCheck.org, Robert Farley wrote “Gorka wrongly suggested the man was released by President Barack Obama. He was transferred… President George W. Bush… then wrongly claimed that among detainees released by Obama, ‘almost half the time, they returned to the battlefield.’ According to the Office of the Director of National Intelligence, about 12.4 percent of those transferred from Gitmo under Obama are either confirmed or suspected of reengaging.”

Claim: there are 13, 14, 15 million undocumented people in the country (too high)

At a press briefing this week, White House Press Secretary Sean Spicer said “12, 14, 15 million people [are] in the country illegally,” but Michelle Ye Hee Lee gave him Three Pinocchios at The Washington Post’s Fact Checker. “Spicer’s statement that there are about 12 million people in the country illegally is safely within the margin of error in credible demographics research. But once he enters the realm of ‘13, 14, 15 million’ or ‘potentially more,’ his claim becomes problematic.”

Claim: Thomas Jefferson said “nothing can be believed which is seen in a newspaper.” (out of context)

At his rally in Florida, Trump said President Thomas Jefferson had said that “nothing can be believed which is seen in a newspaper. Truth itself….becomes suspicious by being put into that polluted vehicle.”

However, “Trump selectively quotes from Jefferson here, who, for most of his life, was a fierce defender of the need for a free press,” Glenn Kessler wrote for The Washington Post’s Fact Checker. PolitiFact staff made a similar point, using this quote as evidence: “And were it left to me to decide whether we should have a government without newspapers, or newspapers without a government, I should not hesitate a moment to prefer the latter.”

Claim: something happened in Sweden. (Not exactly)

By far the quote from the president’s rally that received the most attention was his comment about Sweden: “We’ve got to keep our country safe … You look at what’s happening last night in Sweden… Sweden? Who would believe this? Sweden. They took in large numbers. They’re having problems like they never thought possible.”

“This was a very strange comment. Nothing had happened the night before in Sweden,” wrote Kessler for The Washington Post’s Fact Checker. A White House spokesperson said later that he “was talking about rising crime and recent incidents in general and not referring to a specific incident.”

PolitiFact reporter Miriam Valverde reported on the Fox News interview on Swedish crime rates, which aired the night before the rally and purportedly inspired Trump’s comments. Valverde quoted several Swedish experts countering the argument that crime rates are rising in Sweden, including political scientist Henrik Selin, who said that “[i]n general, crime statistics have gone down the last (few) years, and no there is no evidence to suggest that new waves of immigration has lead to increased crime.”

Robert Farley reported for FactCheck.org, “Swedish authorities and criminologists say President Donald Trump is exaggerating crime in Sweden as a result of its liberal policy of accepting refugees from Syria and other Middle Eastern countries.”

Claim: The stock market has hit record numbers (mostly true)

The President mentioned the economy at a press conference, saying “The stock market has hit record numbers, as you know. And there has been a tremendous surge of optimism in the business world.” At PolitiFact, Miriam Valverde rated this as “Mostly True,” reporting “All three major stock indexes closed at record highs for five days in row on Feb. 15.”

Claim: the media is less trustworthy than Congress (mostly false, but…)

Also at the press conference, President Trump excoriated the media, saying journalists “will not tell you the truth and treat the wonderful people of our country with the respect that they deserve,” that the “press is out of control,” and that the media has a “lower approval rate than Congress, I think that’s right, I don’t know.”

PolitiFact reporter Jon Greenberg rated the trust claim as “mostly false”: “Congress actually ranks below the news media, according to surveys from three different research groups spanning several years. In two polls, mistrust in the media broke 40 percent, which is hardly anything to brag about. But in those studies, mistrust in Congress was over 50 percent.”

Glenn Kessler and Michelle Ye Hee Lee at The Washington Post’s Fact Checker agreed that Congress ranks lower than the media–but that that isn’t saying much: “[B]esides Congress, only ‘big business’ ranks lower than the media — but it’s enough to make Trump’s claim incorrect.”

FactCheck.org chimed in, noting that the “public’s approval of Congress is lower than its trust in the media,” but pointed out there’s more public trust in Trump than in the media: “Trump would have been correct to say that trust in the media is even lower than approval of himself. According to Gallup, Trump’s approval rating stood at 41 percent, as of the week ending Feb. 12, while the public’s trust in the media was down to 32 percent.”  

Claim: Trump had biggest electoral college win since Ronald Reagan. (False)

President Trump claimed his victory marked “the biggest electoral college win since Ronald Reagan.” NBC reporter Peter Alexander challenged him on the spot, saying, “Why should Americans trust you when you have accused information they have received as being fake when you have been providing information that is fake?” Trump didn’t answer the question, but rather pivoted by asking whether the reporter agreed that his victory was substantial.  

According to our fact-checking partners, there have been three presidents since Reagan who received more electoral college votes than Trump. FactCheck.org noted “Trump’s Electoral College victory margin ranks 46th out of 58 presidential elections.” Kessler and Lee wrote: “Of the nine presidential elections since 1984, Trump’s electoral college win ranks seventh.”

Claim: Hillary Clinton gave away 20 percent of the uranium in the United States (false)

President Trump repeated a claim that The Washington Post’s Fact Checker has given Four Pinocchios, that Hillary Clinton “gave away 20 percent of the uranium in the United States,” going on to say, “you know what uranium is, right? This thing called nuclear weapons and other things like lots of things are done with uranium, including some bad things,” insinuating that the uranium could be used in a Russian nuclear weapon. FactCheck.org wrote: “The deal Clinton had a role in approving gave Russia ownership of 20 percent of U.S. production capacity — not existing stocks of uranium. Furthermore, Clinton alone could not have stopped the deal; only the president could have done that with a finding that national security would be endangered. Lastly, none of the uranium goes to Russia. That would require export licenses.”