
Web Archive 96: How the Smithsonian Helped Create One of the First Wayback Machine Collections

Screenshot from the Wayback Machine of the Web Archive 96 project page (October 11, 1997).

In 1996, the World Wide Web was starting to catch on. Politicians were just beginning to explore how to use online communication to reach voters. And in a house in San Francisco, the fledgling Internet Archive was starting to archive pieces of the web before they disappeared.

That same year, a letter arrived from Washington, D.C., with the Smithsonian Institution’s iconic sunburst logo at the top. The Smithsonian had agreed to partner with the Internet Archive to preserve the digital record of the 1996 U.S. presidential election.

“It was a major milestone for us,” recalls Internet Archive founder Brewster Kahle. “The big Smithsonian was working with this new little Internet Archive nonprofit library.”

Together, the two institutions launched Web Archive 96, one of the first web collections the Internet Archive ever created. It captured the early campaign webpages of candidates Bill Clinton, Bob Dole, and Ross Perot — online brochures filled with policy positions, photos, and promises — along with news coverage of the race. It was a pioneering effort to preserve the political life of a nation as it moved onto the web. The collection is now a foundational part of our cultural history on the web, and is available for public access via the Wayback Machine.

Explore Web Archive 96 via the Wayback Machine

Nearly thirty years later, that collaboration still stands out as visionary: two institutions, one old and one new, working together to recognize the internet as part of our shared cultural record.

Politics Goes Digital

We the People: Winning the Vote exhibit installation, National Museum of American History, 1996-2000.

In Washington, D.C., the National Museum of American History added a personal computer displaying online presidential election website content to its “We The People” campaign exhibit. “It was delivered and was displayed next to campaign buttons from the 1800s,” Kahle recalled.

Indeed, Smithsonian curators Larry Bird and Harry Rubenstein traveled to New Hampshire and Iowa every four years to collect buttons, signs and physical memorabilia from the campaign offices. Just as television changed the political landscape in the 1960s, they recognized the potential influence of the web in 1996. When they heard Kahle was archiving campaigns, Bird said they were “ecstatic” to collaborate.

“We were all over it,” said Bird, now a curator emeritus from the Smithsonian division of political history. “We were super glad that we could take this non-dimensional thing and for it to have a presence on the floor – even in this most rudimentary, stripped down way – limited to the candidates’ websites. It was an acknowledgement of where things were heading.”

Jeff Ubois, who forged the partnership in 1996, recalled “Why would anyone care about the ephemera of the web?” as the prevailing attitude at the time. “The Smithsonian helped change some of that.”

Once the Internet Archive partnered with the Smithsonian, “it wasn’t possible to dismiss web archiving as irrelevant, impossible, useless,” Ubois said.

People contact the Smithsonian often, Bird said, and the Internet Archive outreach was unexpected, but welcome. “We were constantly looking at the way things were shifting in politics, which always takes what’s popular and successful in the real world and bends it into its own political world or reality,” he said. “And this just seemed to be yet, the latest iteration of that as a cultural phenomenon….To have [the Internet Archive] assemble it wasn’t anything that any of us could have done at the time.”

‘Collection of Record for the Web’

Bird said the Internet Archive is a “remarkable resource” that he and other researchers have relied on for years.

“The museum is the collection of record for material things, objects, and dimensional things. And the Internet Archive is the collection of record for the web and all that implies,” Bird said. “There’s hardly anything that it doesn’t touch anymore. It didn’t start out that way, but it’s become that. It’s the collection of record that people use and cite and compare. It’s a tremendous historical resource.”

Preserving the evolution of political campaigns is important to anyone trying to do research or understand political trends over time, said David Almacy, president and chief executive officer of Far Post Media, a digital public affairs firm in Virginia, and former White House E-Communications Director for President George W. Bush. In 1996, campaign websites were primarily online brochures: just text and photos without much customization. Today, websites are more advanced, integrating video and interactive elements that can be tailored to the user.

“The value is to provide an archive and a record of what was said, and basically a snapshot in time politically,” Almacy said. “It actually becomes fascinating to go back and look at the issues that were facing the country that would be deemed priorities in 1996 and how that compares to today. I assume a lot are the same – the economy, education, immigration, national security, global peace – but they’ve evolved in different ways. Many are very important to Americans, just as they were back then.”

New Research Tool for Visualizing Two Million Hours of Television News

Guest post by Kalev Leetaru

Today the Internet Archive announces a new interactive timeline visualization–the Television Explorer–that lets you trace how any keyword–think “emails”, “tax returns”, “alt-right”–has been covered on U.S. television news over the past half-decade.

See the Television Explorer, a new tool for exploring TV News.


Over the past year and a half, the GDELT Project and the Internet Archive’s Television News Archive have worked closely together to visualize how U.S. television news has covered the contentious 2016 political campaign.

One of the tools we created was the 2016 Candidate Television Tracker, which used closed captioning to count how many times each of the presidential candidates was mentioned on television and offered a day-by-day timeline showing the ebbs and flows of who was “winning” the free media wars. (Answer: President-elect Donald Trump.) This tool was used by such media outlets as The Atlantic, The Washington Post, FiveThirtyEight, Politico and The Guardian, among many others.

Now we are adapting this tool to allow more sophisticated searches: rather than just the presidential candidates, now you can trace television news coverage of any keyword of your choosing. You can even run advanced searches that find words in conjunction with other words or phrases, such as finding mentions of Hillary Clinton that also discuss her email server. All search results are available for download via CSV and JSON export, making it possible for data journalists, researchers, and advocates to fine-tune their analysis of the data.

When searching, you get back a visual timeline showing how often that word or phrase has appeared on American television news over the past half-decade. Nearly two million hours of television news totaling more than 5.7 billion words from over 150 distinct stations spanning July 2009 to present (though not all stations were monitored for the entire period) are searchable in this interface.

Unlike the Internet Archive’s Television News Archive interface, which returns results at the level of an hour or half-hour “show,” the interface here reaches inside of those six and a half years of programming, breaks the more than one million shows into individual sentences, and counts how many of those sentences contain your keyword of interest. Instead of reporting that CNN had 24 hour-long shows yesterday that mentioned Donald Trump one or more times, the interface here will count how many sentences uttered on CNN yesterday mentioned his name, a vastly more accurate metric for assessing media attention.
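To make the difference between show-level and sentence-level counting concrete, here is a minimal Python sketch. It is an illustration only, not the Television Explorer's actual code: the sentence splitter is deliberately naive, the function names are hypothetical, and the caption text is invented sample data rather than real transcripts.

```python
import re

def split_sentences(caption_text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", caption_text.strip())
    return [p for p in parts if p]

def count_matching_sentences(caption_text, keyword):
    """Count sentences containing the keyword as a whole word or phrase, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return sum(1 for s in split_sentences(caption_text) if pattern.search(s))

# Hypothetical closed-caption text for two shows (invented, not real data).
shows = {
    "show_a": "Donald Trump spoke today. The weather was mild. "
              "Analysts discussed Donald Trump again.",
    "show_b": "Markets fell sharply. Donald Trump was mentioned once.",
}

keyword = "Donald Trump"

# Show-level metric: how many shows mention the keyword at least once.
show_level = sum(
    1 for text in shows.values() if count_matching_sentences(text, keyword) > 0
)

# Sentence-level metric: how many individual sentences mention the keyword.
sentence_level = sum(
    count_matching_sentences(text, keyword) for text in shows.values()
)

print(show_level, sentence_level)
```

In this toy sample both shows count once at the show level, while the sentence-level tally separates the show that returns to the topic from the one that mentions it in passing.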

Explore how CNN covered the presidential campaign of 2012 versus 2016 and understand just how big a media event this year’s election really was. See precisely when Edward Snowden burst onto the scene and how Wikileaks got more coverage during the 2016 presidential election than its debut in 2010. Watch the seasonal spikes of Thanksgiving, or see how Ebola received little attention, even as thousands died in Africa, becoming a topic only after the first Americans became infected.

Using the “near” search feature, plot coverage of Wikileaks that also mentioned either “Podesta,” “email,” or “emails” nearby and discover that FOX paid far more attention to the DNC and Podesta email hacks than CNN, MSNBC, CNBC or Bloomberg. In contrast, CNN focused more intensely on the Trayvon Martin shooting (Aljazeera America and Bloomberg were not yet being monitored by the Archive), while Aljazeera led coverage of the Michael Brown and Eric Garner deaths.


Search of term “Wikileaks” near Podesta, emails, Clinton
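A “near” search of this kind can be approximated in a few lines of Python. This sketch makes one loud assumption: “nearby” is taken to mean “in the same sentence,” which may not match how the Television Explorer defines proximity. The station labels, function names, and sentences are all hypothetical.

```python
import re

def contains_term(sentence, term):
    """True if the term appears as a whole word or phrase, ignoring case."""
    return re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE) is not None

def near_count(sentences, keyword, near_terms):
    """Count sentences that contain the keyword AND at least one of the near terms.

    Assumption: "near" means the same sentence; the real tool may use a different window.
    """
    return sum(
        1
        for s in sentences
        if contains_term(s, keyword) and any(contains_term(s, t) for t in near_terms)
    )

# Hypothetical caption sentences per station (invented, not real data).
station_sentences = {
    "STATION_A": [
        "Wikileaks released more Podesta emails today.",
        "Wikileaks was in the news.",
        "The email server story continued.",
    ],
    "STATION_B": [
        "Wikileaks published a new email.",
        "Wikileaks and Podesta were discussed.",
    ],
}

near_terms = ["Podesta", "email", "emails"]

counts = {
    station: near_count(sentences, "Wikileaks", near_terms)
    for station, sentences in station_sentences.items()
}

print(counts)
```

Comparing the per-station totals is the same kind of comparison described above, only on a toy scale.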

Search for “ivory” to see that Aljazeera America (which ceased operation in April 2016) devoted vastly more of its coverage to elephant poaching in Africa than any other monitored national network. It also paid the most attention to “Africa” and to the “refugee” crisis. On the other hand, Bloomberg has devoted much more of its time to “China” and to the economic crisis in “Greece” last year.

We look forward to seeing what people do with this new tool. Please share your favorite searches on Twitter with the hashtag “#internetarchivetvsearch”. If you have any questions, please email kalev.leetaru5@gmail.com or nancyw@archive.org.

Kalev Leetaru is an independent data journalist.