Wayback Machine Playback… now with Timestamps!

The Wayback Machine has an exciting new feature: it can list the dates and times, the Timestamps, of all page elements compared to the date and time of the base URL of a page.  This means that users can see, for instance, that an image displayed on a page was captured X days before the URL of the page or Y hours after it.  Timestamps are available via the “About this capture” link on the right side of the Wayback Toolbar.  Here is an example:

The Timestamps list includes the URLs and date and time difference compared to the current page for the following page elements: images, scripts, CSS and frames. Elements are presented in a descending order. If you put your cursor over a list element on the page, it will be highlighted and if you click on it you will be shown a playback of just that element.

Under the hood

Web pages are usually a composition of multiple elements such as images, scripts and CSS. The Wayback Machine tries to archive and play back web pages in the best possible manner, including all their original elements. Each web page element has its own URL and Timestamp, indicating the exact date and time it was archived. Page elements may have similar Timestamps, but they can also vary significantly for reasons that depend on the web crawling process. By using the new Timestamps feature, users can easily learn the archive date and time for each element of a page.
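To make the comparison concrete, here is a minimal sketch of how an element's capture time can be expressed relative to its page. It assumes capture times are written as 14-digit `YYYYMMDDhhmmss` strings, the form used in Wayback Machine playback URLs. The function names and the sample timestamps are hypothetical and are not taken from the Wayback Machine's own code.

```python
from datetime import datetime, timedelta

WAYBACK_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp, e.g. 20171005120000


def parse_timestamp(ts: str) -> datetime:
    """Parse a 14-digit capture timestamp into a datetime."""
    return datetime.strptime(ts, WAYBACK_FORMAT)


def describe_offset(page_ts: str, element_ts: str) -> str:
    """Describe when an element was captured relative to its page."""
    delta = parse_timestamp(element_ts) - parse_timestamp(page_ts)
    if delta == timedelta(0):
        return "same time as the page"
    direction = "after" if delta > timedelta(0) else "before"
    total_seconds = int(abs(delta).total_seconds())
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    units = ((days, "day"), (hours, "hour"), (minutes, "minute"), (seconds, "second"))
    # Keep only the non-zero units and pluralize where needed.
    parts = [f"{n} {unit}{'' if n == 1 else 's'}" for n, unit in units if n]
    return ", ".join(parts) + f" {direction} the page"


# Hypothetical captures: one page and two of its elements.
page = "20171005120000"
elements = {
    "logo.png": "20171002120000",   # captured earlier than the page
    "style.css": "20171005150000",  # captured later than the page
}
for name, ts in elements.items():
    print(name, "->", describe_offset(page, ts))
```

An element whose offset is large in either direction is the kind of anachronism the Timestamps list is meant to surface.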

Why this is important

The Wayback Machine is increasingly used in critical contexts such as legal evidence or political debate. It is important that what is presented is clear and transparent, even in the light of a web that was not designed to be archived. One of the ways a web archive could be confusing is via anachronisms: displaying content from different dates and times than the user expects. For example, when an archived page is played back, it could include some images from the current web, making it look like the image came from the past when it did not. We implemented Timestamps to provide users with more context about, and in turn hopefully greater confidence in, what they are seeing.

Posted in Announcements, News | 1 Comment

TV News Record: Wayback Machine saves deleted prez tweets

A weekly round up on what’s happening and what we’re seeing at the TV News Archive by Katie Dahl and Nancy Watzman. Additional research by Robin Chin.

In this week’s TV News Archive roundup, we explain how presidential tweets are forever, show how different TV cable news networks summarized NFL protests via Third Eye chyron data, and present FiveThirtyEight’s analysis of hurricane coverage (hint: Puerto Rico got less attention.)

Wayback Machine preserved deleted prez tweets; PolitiFact fact-checks legality of prez tweet deletions (murky)

The Internet Archive’s Wayback Machine has preserved President Donald Trump’s deleted tweets praising failed GOP Alabama U.S. Senate candidate Luther Strange following his defeat by Roy Moore on September 26. So has the Pulitzer Prize-winning investigative journalism site ProPublica, through its Politwoops project.

The story of Trump’s deleted tweets about Strange was reported far and wide, including this segment on MSNBC’s “Deadline: White House” that aired on September 27.

In a fact-check on the legality of a president deleting tweets, linked in the TV News Archive clip above, John Kruzel reports for PolitiFact that the law is murky and still being fleshed out:

Experts were split over how much enforcement power courts have in the arena of presidential record-keeping, though most seemed to agree the president has the upper hand.

“One of the problems with the Presidential Records Act is that it does not have a lot of teeth,” said Douglas Cox, a professor at the City University of New York School of Law. “The courts have held that the president has wide and almost unreviewable discretion to interpret the Presidential Records Act.”

That said, many of the experts we spoke to are closely monitoring how the court responds to the litigation around Trump administration record-keeping.

He also provides background on that litigation, a lawsuit brought by Citizens for Responsibility and Ethics in Washington. The case is broadly about requirements for preserving presidential records, and a previous set of deleted presidential tweets is a part of it.

Fact Check: NFL attendance and ratings are way down because people love their country (Mostly false)

Speaking of Trump’s tweets, the president ignited an explosion of coverage with an early morning tweet on Sunday, Sept. 24, ahead of a long day of football games: “NFL attendance and ratings are WAY DOWN. Boring games yes, but many stay away because they love our country.”

Manuela Tobias of PolitiFact rated this claim as “mostly false,” reporting, “Ratings were down 8 percent in 2016, but experts said the drop was modest and in line with general ratings for the sports industry. The NFL remains the most watched televised sports event in the United States.” “As for political motivation, there’s little evidence to suggest people are boycotting the NFL. Most of the professional sports franchises are dealing with declines in popularity.”

How did different cable TV news networks cover the NFL protests?

We first used the Television Explorer tool to see where there was a spike in the use of the word “NFL” near the word “Trump.” It looked like Sunday showed the most use of these words. After a closer look, we saw that MSNBC, Fox News, and CNN all showed the most mentions of these terms around 2 pm Pacific.

Spike at 2 pm (Pacific) for CNN, MSNBC, and Fox News

Then we downloaded data from the new Third Eye project, which turns TV News chyrons into data, filtering for that date and hour. We were able to see how the three cable news networks were summarizing the news at that particular point in time.
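The filtering step can be sketched roughly as follows. This assumes each chyron record carries a UTC timestamp, a channel name, and the chyron text; that row layout, the placeholder rows, and the helper name `in_local_hour` are assumptions for illustration, not the documented Third Eye format.

```python
from datetime import datetime, timedelta, timezone

# Pacific Daylight Time (UTC-7) was in effect on September 24, 2017.
PDT = timezone(timedelta(hours=-7))

# Hypothetical rows: (UTC timestamp, channel, chyron text).
rows = [
    ("2017-09-24 20:58:00", "CNN", "placeholder chyron before the hour"),
    ("2017-09-24 21:02:00", "CNN", "placeholder chyron A"),
    ("2017-09-24 21:02:00", "MSNBC", "placeholder chyron B"),
    ("2017-09-24 21:10:00", "FOXNEWS", "placeholder chyron C"),
    ("2017-09-24 22:01:00", "FOXNEWS", "placeholder chyron after the hour"),
]


def in_local_hour(utc_text, day, hour, tz=PDT):
    """Return True if a UTC timestamp falls within the given local date and hour."""
    utc_time = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    local = utc_time.astimezone(tz)
    return local.date().isoformat() == day and local.hour == hour


# Keep only chyrons shown between 2:00 and 2:59 pm Pacific on September 24.
selected = [row for row in rows if in_local_hour(row[0], "2017-09-24", 14)]
for utc_text, channel, text in selected:
    print(utc_text, channel, text)
```

Converting to local time before comparing matters here, because 2 pm Pacific on that date falls in the 21:00 hour in UTC.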

At about 2:02, CNN broadcast this chyron: “NFL teams kneel, link arms in defiance of Trump.”

Screen grab of chyron caught by Third Eye from 2:02 pm 9/24/17 on CNN

Fox News chose the following, also seen below tweeted from one of the Third Eye Twitter bots: “Some NFL owners criticize Trump’s statements on player protests, link arms with players”

Meanwhile, MSNBC chose a different message: “Taking a knee: NFL teams send a message.”

Screen grab of chyron caught by Third Eye from 2:02 pm 9/24/17 on MSNBC

About eight minutes later, all three cable channels were still reporting on the NFL protests:

Puerto Rico’s hurricane Maria got less media attention than hurricanes Harvey & Irma

Writing for FiveThirtyEight.com, Dhrumil Mehta demonstrated that both online news sites and TV news broadcasters paid less attention to Puerto Rico’s hurricane Maria than to hurricanes Harvey and Irma, which hit the mainland U.S. primarily in Texas and Florida. Mehta used TV News Archive data via Television Explorer, as well as data from Media Cloud on online news coverage, to help make his case:

While Puerto Rico suffers after Hurricane Maria, much of the U.S. media (FiveThirtyEight not excepted) has been occupied with other things: a health care bill that failed to pass, a primary election in Alabama, and a spat between the president and sports players, just to name a few. Last Sunday alone, after President Trump’s tweets about the NFL, the phrase “national anthem” was said in more sentences on TV news than “Puerto Rico” and “Hurricane Maria” combined.

To receive the TV News Archive’s email newsletter, subscribe here.

 

 

Posted in Announcements, News, Television Archive | Comments Off on TV News Record: Wayback Machine saves deleted prez tweets

Experiments Day Hackathon 2017

Join us this Saturday, September 23 @ 10:30am PT for our Experiments Day Hackathon

It’s almost that time again: October 11, the day the Internet Archive invites you to celebrate another year of preserving our cultural heritage and the progress our community has made towards building tools that facilitate universal access to all knowledge.

Making these collections as discoverable and accessible as possible is a huge task, and we need your help! It’s often our community members who bring our items to life.

Now’s your chance!

Champions of open access, unite: This Saturday, September 23 @ 10:30am PT, join us in person at the Internet Archive HQ or join us remotely online for an Experiments Day Hackathon, a day of camaraderie and civic action fuelled by freshly ground coffee and abundant pizza.

Let’s team up to prototype experimental interfaces, remix content, and build tools to make knowledge more accessible to those who need it most.

What experiments would you craft with 2M hours of TV news, 5B archived images, 3M books, and petabytes of free storage? Proposed themes include #decentralization, #accessibility, #books, #scholarly-papers, #annotations.

We’ve helped back up over 2M hours of television news, hundreds of billions of webpages through time, audio for tens of thousands of live music concerts and 78rpms, and have helped digitize and lend millions of public domain and modern books. The breadth, archival quality, and uniqueness of our collections make the Internet Archive a rich terrain for experimentation and hacking. Many of our top engineers will be on hand to guide you through the APIs to build with.

Then, be sure to come back on October 11 for our annual celebration, where many of our experiments will be on display!

Register/RSVP: https://www.eventbrite.com/e/internet-archive-experiments-hackathon-2017-tickets-37012125263

Join our chat: https://gitter.im/ArchiveExperiments/Lobby

Schedule + more event details: https://experiments.archivelab.org/hackathon

Watch remotely: https://archive.org/details/archive-experiments-hackathon-2017

Posted in News | 2 Comments

MacArthur Foundation’s $100 Million Award Finalists

Today, the MacArthur Foundation announced the finalists for its 100&Change competition, awarding a single organization $100 million to solve one of the world’s biggest problems. The Internet Archive’s Open Libraries project, one of eight semifinalists, did not make the cut to the final round. Today we want to congratulate the 100&Change finalists and thank the MacArthur Foundation for inspiring us to think big. For the last 15 months, the Internet Archive team has been building the partnerships that can transform US libraries for the digital age and put millions of ebooks in the hands of more than a billion learners. We’ve collaborated with the world’s top copyright experts to clarify the legal framework for libraries to digitize and lend their collections. And we’ve learned an amazing amount from the leading organizations serving the blind and people with disabilities that impact reading.

To us, that feels like a win.

In the words of MacArthur Managing Director, Cecilia Conrad:

The Internet Archive project will unlock and make accessible bodies of knowledge currently located on library shelves across the country. The proposal for curation, with the selection of books driven not by commercial interests but by intellectual and cultural significance, is exciting. Though the legal theory regarding controlled digital lending has not been tested in the courts, we found the testimony from legal experts compelling. The project has an experienced, thoughtful and passionate team capable of redefining the role of the public library in the 21st Century.

Copyright scholar and Berkeley Law professor, Pam Samuelson (center), convenes a gathering of more than twenty legal experts to help clarify the legal basis for libraries digitizing and lending physical books in their collections.

So, the Internet Archive and our partners are continuing to build upon the 100&Change momentum. We are meeting October 11-13 to refine our plans, and we invite interested stakeholders to join us at the Library Leaders Forum. If you are a philanthropist interested in leveraging technology to provide more open access to information—well, we have a project for you.

For 20 years, at the Internet Archive we have passionately pursued one goal: providing universal access to knowledge. But there is almost a century of books missing from our digital shelves, beyond the reach of so many who need them. So we cannot stop. We now have the technology, the partners and the plan to transform library hard copies into digital books and lend them as libraries always have. So all of us building Open Libraries are moving ahead.

Members of the Open Libraries Team at the Internet Archive headquarters, part of a global movement to provide more equitable access to knowledge.

Remember: a century ago, Andrew Carnegie funded a vast network of public libraries because he recognized democracy can only exist when citizens have equal access to diverse information. Libraries are more important than ever, welcoming all of society to use their free resources, while respecting readers’ privacy and dignity. Our goal is to build an enduring asset for libraries across this nation, ensuring that all citizens—including our most vulnerable—have equal and unfettered access to knowledge.

Thank you, MacArthur Foundation, for inspiring us to turn that idea into a well thought-out project.

Onward!

–The Open Libraries Team

Posted in Announcements, Books Archive, News, Open Library | 6 Comments

TV News Record: Debt ceiling, hurricane funding, GDP

A weekly round up on what’s happening and what we’re seeing at the TV News Archive by Katie Dahl and Nancy Watzman. Additional research by Robin Chin.

In this week’s TV News Archive roundup, we examine the latest Face-o-Matic data (you can too!); and present our partners’ fact-checks on Sen. Ted Cruz’s claims that Hurricane Sandy emergency funding was filled with “unrelated pork” and President Donald Trump’s claims about other countries’ GDPs.

What got political leaders sustained face-time on TV news last week?

What got Trump, McConnell, Schumer, Ryan, and Pelosi the longest clips on TV cable news screens this past week? Thanks to our new trove of Face-O-Matic data developed with the start-up Matroid’s facial recognition algorithms, reporters and researchers can get quick answers to questions like these.

House Minority Leader Nancy Pelosi, D., Calif., got almost six minutes – an unusually large amount of sustained face-time for the Democrat from California – from “MSNBC Live” on September 7 covering her press conference following President Donald Trump’s surprise deal with congressional Democrats on the debt ceiling.

Senate Majority Leader Mitch McConnell, R., Ky., also enjoyed his longest sustained face-time segment last week on the debt ceiling, clocking in at 34 seconds on MSNBC’s “Morning Joe.”

House Speaker Paul Ryan, R., Wis., got 11 minutes on September 7 on Fox News’ “Happening Now,” for his weekly press conference, where he was shown discussing a variety of topics, including Hurricane Harvey, tax reform, and debt relief. For Senate Minority Leader Chuck Schumer, D., N.Y., the topic that got him the most sustained time (21 seconds) was also his unexpected deal with the president on the debt ceiling.

For President Donald Trump, however, who never lacks for TV news face-time, his longest sustained appearance on TV news this past week was his speech at this week’s 9/11 memorial at the Pentagon.

Fact-check: Hurricane Sandy relief was 2/3 filled with pork and unrelated spending (false)

In the aftermath of Hurricane Harvey, Sen. Ted Cruz, R., Texas, came under criticism for supporting federal funding for Harvey victims while having opposed such funding for victims of Hurricane Sandy in 2013. Cruz defended himself by saying, “The problem with that particular bill is it became a $50 billion bill that was filled with unrelated pork. Two-thirds of that bill had nothing to do with Sandy.”

But Lori Robertson of FactCheck.org labeled this claim as “false,” noting that a Congressional Research Service study pegged at least 69 percent of that bill’s funding as related to Sandy, and that even more of the money could be attributed to hurricane relief funding: “Cruz could have said he thought the Sandy relief legislation included too many non-emergency items. That’s fair enough, and his opinion. But he was wrong to specifically say two-thirds of the bill ‘had nothing to do with Sandy,’ or ‘little or nothing to do with Hurricane Sandy.’”

Fact-check: Trump spoke with world leader unhappy with nine percent GDP growth rate (three Pinocchios)

At a recent press conference on his tax reform plan, President Donald Trump remarked that other foreign leaders are unhappy with higher rates of growth in gross domestic product (GDP) than the U.S. has. “I spoke to a leader of a major, major country recently. Big, big country. They say ‘our country is very big, it’s hard to grow.’ Well believe me this country is very big. How are you doing, I said. ‘Cause I have very good relationships believe it or not with the leaders of these countries. I said, how are you doing? He said ‘not good, not good at all. Our GDP is 7 percent.’ I say 7 percent? Then I speak to another one. ‘Not good. Not good. Our GDP is only 9 percent.’”

Nicole Lewis of The Washington Post’s Fact Checker gave this claim “three Pinocchios”: “Of the 58 heads of state he’s met or spoke with since taking office, not one can claim 9 percent GDP growth. Perhaps Trump misheard. Or perhaps the other leader was fibbing. Or maybe Trump just thought the pitch for a tax cut sounded better if he could quote two leaders….In any case, Trump is making a major economic error in comparing the GDP of a developed country to a developing one. For his half-truths, and for comparing apples to oranges, Trump receives Three Pinocchios.”

To receive the TV News Archive’s email newsletter, subscribe here.

Posted in News | Comments Off on TV News Record: Debt ceiling, hurricane funding, GDP

Rubbing the Internet Archive

In July 2017, Los Angeles-based artist Katie Herzog visited our headquarters in San Francisco and created Rubbing the Internet Archive — a 10-foot high by 84-foot wide rubbing of the exterior of the building, made using rubbing wax on non-fusible interfacing. The imposing 1923 building—formerly a Christian Science church and now a library—features an intricate facade that translated well into two dimensions.

The drawing is now adhered to the walls of Klowden Mann’s main exhibition space, allowing the to-scale exterior of the Internet Archive to form the interior built environment of the gallery.

Rubbing the Internet Archive is on view at Klowden Mann, 6023 Washington Blvd., Culver City, California, through October 14th.



Posted in News | Comments Off on Rubbing the Internet Archive

Face-o-Matic data show Trump dominates – Fox focuses on Pelosi; MSNBC features McConnell

For every ten minutes that TV cable news shows featured President Donald Trump’s face on the screen this past summer, the four congressional leaders’ visages were presented for one minute, according to an analysis of free, downloadable Face-o-Matic data fueled by the Internet Archive’s TV News Archive and made available to the public today.

Face-o-Matic is an experimental service, developed in collaboration with the start-up Matroid, that tracks the faces of selected high-level elected officials on major TV cable news channels: CNN, Fox News, MSNBC, and the BBC. It first launched as a Slack app in July; after receiving feedback from journalists, the TV News Archive is now making the underlying data available to the media, researchers, and the public. It will be updated daily here.

Unlike caption-based searches, Face-o-Matic uses facial recognition algorithms to recognize individuals on TV news screens. Face-o-Matic finds images of people when TV news shows use clips of the lawmakers speaking; frequently, however, the lawmakers’ faces also register if their photos or clips are being used to illustrate a story, or they appear as part of a montage as the news anchor talks.  Alongside closed caption research, these data provide an additional metric to analyze how TV news cable networks present public officials to their millions of viewers.

Our concentration on public officials and our bipartisan tracking is purposeful; in experimenting with this technology, we strive to respect individual privacy and extract only information for which there is a compelling public interest, such as the role the public sees our elected officials playing through the filter of TV news. The TV News Archive is committed to doing this right by adhering to these Artificial Intelligence principles for ethical research developed by leading artificial intelligence researchers, ethicists, and others at a January 2017 conference organized by the Future of Life Institute. As we go forward with our experiments, we will continue to explore these questions in conversations with experts and the public.

Download Face-o-Matic data here.

We want to hear from you:

What other faces would you like us to track? For example, should we start by adding the faces of foreign leaders, such as Russia’s Vladimir Putin and North Korea’s Kim Jong-un? Should we add former President Barack Obama and contender Hillary Clinton? Members of the White House staff? Other members of Congress?

Do you have any technical feedback? If so, please let us know by contacting tvnews@archive.org or participating in the GitHub Face-o-Matic page.

Trump dominates, Pelosi gets little face-time

Overall, from July 13 through September 5, analysis of Face-o-Matic data shows:

  • Altogether, we found 7,930 minutes, or some 132 hours, of face-time for President Donald Trump and the four congressional leaders. Of that amount, Trump dominated with 90 percent of the face-time. Collectively, the four congressional leaders garnered 15 hours of face-time.
  • House Minority leader Nancy Pelosi, D., Calif., got the least amount of time on the screen: just 1.4 hours over the whole period.
  • Of the congressional leaders, Senate Majority Leader Mitch McConnell’s face was found most often: 7.6 hours, compared to 3.8 hours for House Speaker Paul Ryan, R., Wis.; 1.7 hours for Senate Minority Leader Chuck Schumer, D., N.Y., and 1.4 hours for Pelosi.
  • The congressional leaders got bumps in coverage when they were at the center of legislative fights, such as in this clip of McConnell aired by CNN, in which the senator is shown speaking on July 25 about the upcoming health care reform vote. Schumer got coverage on the same date from the network in this clip of him talking about the Russia investigation. Ryan got a huge boost on CNN when the cable network aired his town hall on August 21.
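The figures in the list above can be cross-checked with a few lines of arithmetic. This is only a sketch using the rounded numbers quoted in the post; Trump's hours are not stated directly, so they are derived here by subtraction, and the variable names are illustrative.

```python
# Figures quoted in the list above.
total_minutes = 7930
leader_hours = {
    "McConnell": 7.6,
    "Ryan": 3.8,
    "Schumer": 1.7,
    "Pelosi": 1.4,
}

total_hours = total_minutes / 60
leaders_total = sum(leader_hours.values())
trump_hours = total_hours - leaders_total  # derived, not stated in the post
trump_share = trump_hours / total_hours

print(f"Total face-time: {total_hours:.1f} hours")
print(f"Congressional leaders: {leaders_total:.1f} hours")
print(f"Trump (derived): {trump_hours:.1f} hours, {trump_share:.0%} of the total")

# Rank the leaders by face-time and show each one's share of the leaders' total.
ranked = sorted(leader_hours.items(), key=lambda item: item[1], reverse=True)
for name, hours in ranked:
    print(f"  {name}: {hours:.1f} hours ({hours / leaders_total:.0%} of leaders' time)")
```

With these rounded inputs the leaders' combined time comes to about 14.5 hours and Trump's share to roughly 89 percent, which is consistent with the "15 hours" and "90 percent" reported above.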

Fox shows most face-time for Pelosi; MSNBC, most Trump and McConnell

The liberal cable network MSNBC gave Trump more face-time than any other network. Ditto for McConnell. A number of these stories highlight tensions between the senate majority leader and the president. For example, here, on August 25, the network uses a photo of McConnell, and then a clip of both McConnell and Ryan, to illustrate a report on Trump “trying to distance himself” from GOP leaders. In this excerpt, from an August 21 broadcast, a clip of McConnell speaking is shown in the background to illustrate his comments that “most news is not fake,” which is interpreted as “seem[ing] to take a shot at the president.”

MSNBC uses photos of both Trump and McConnell in an August 12 story on the “feud” between the two.

While Pelosi does not get much face-time on any of the cable news networks examined, Fox News shows her face more than any other. In this commentary report on August 20, Jesse Watters criticizes Pelosi for favoring the removal of Confederate statues placed in the Capitol building. “Miss Pelosi has been in Congress for 30 years. Now she speaks up?” On August 8, “Special Report With Bret Baier” uses a clip of Pelosi talking in favor of women having a right to choose the size and timing of their families as an “acid test for party base.”

Example of Fox News using a photo of House Minority Leader Nancy Pelosi to illustrate a story, in this case about a canceled San Francisco rally.

While the BBC gives some Trump face-time, it gives scant attention to the congressional leaders. Proportionately, however, the BBC gives Trump less face-time than any of the U.S. networks.

On July 13 the BBC’s “Outside Source” ran a clip of Trump talking about his son, Donald Trump, Jr.’s, meeting with a Russian lobbyist.

For details about the data available, please visit the Face-O-Matic page. The TV News Archive is an online, searchable, public archive of 1.4 million TV news programs aired from 2009 to the present.  This service allows researchers and the public to use television as a citable and sharable reference. Face-O-Matic is part of ongoing experiments in generating metadata for reporters and researchers, enabling analysis of the messages that bombard us daily in public discourse.

 

Posted in Announcements, News | 4 Comments

Why Bitcoin is on the Internet Archive’s Balance Sheet

 

A foundation was curious as to why we have Bitcoin on our balance sheet, and I thought I would explain it publicly.

The Internet Archive explores how bitcoin and other Internet innovations can be useful in the non-profit sphere, and this is part of it. We want to see how donated bitcoin can be used, not just sold off. We are doing this publicly so others can learn from us. And it is fun. And it is interesting.

We started receiving donations in bitcoin in 2012. The first year we got about 2,700 and we sold them to an employee who was heavily involved (for the prevailing $2 per bitcoin). The next year, we held onto them and offered them to employees as an optional way to get their salary; ⅓ took some. We set up an ATM at the Internet Archive. We got the sushi place next door to take bitcoins, and encouraged our employees to buy books at Green Apple Books in bitcoin. We set up a vanity address. Started taking bitcoin in our swag store. Tried (and failed) to get our credit union to help bitcoin firms.

Another year we gave a small amount as an xmas bonus to those who set up a wallet (from a matching grant of bitcoins from me).

We paid vendors and contractors in bitcoin when they wanted it. Started getting micropayments from the Brave Browser. Hosted a movie with filmmakers on living on bitcoin. We publicly tested if people are stealing bitcoins like the press was saying (didn’t steal ours).

A few years later, the price had gone up so much that I personally bought some at the going rate to decrease financial risk to the Internet Archive, but then I did not just cash those in for dollars. We may seem like we are geniuses, but we are not: we saw the price go down as well and we did not sell out then either.

Recently Zcash folks helped us set up a Zcash address, and we would love people to donate there.

What we are doing is trying to “play the game” and see how it works for non-profits. It is not an investment for us, it is testing a technology in an open way. If you want to see the donations to us in bitcoin, they are here. Zcash here.

Bitcoin donations have been decreasing in recent years, which may reflect we are not moving with the times. I am hoping that someone will say, gosh, I will donate a thousand bitcoins to these guys who have been so good :). Here is to hoping.

So the Internet Archive has some bitcoin on its balance sheet to be a living example of an organization that is trying this innovative Internet technology. We do the same with BitTorrent, Tor, and decentralized web tech.

Please donate and we will put them to good use supporting the Internet Archive’s mission.

 

Posted in Announcements | 2 Comments

The Internet Archive’s Annual Bash – Come Celebrate With Us!

What’s your personal rabbit hole?

78 rpm recordings?
20th Century women writers?
Friendster sites?
Vintage software?
Educational films from the 50s?

Find out at the Internet Archive’s Annual Bash:

The Internet Archive invites you to enter our 20th Century Time Machine to experience the audio, books, films, web sites, ephemera and software fast disappearing from our midst. We’ll be connecting the centuries—transporting 20th century treasures to curious minds in the 21st. Come explore the possibilities at our annual bash on Wednesday, October 11, 2017, from 5-9:30 pm.

Tickets start at $15 here.

We’ll kick off the evening with cocktails, food trucks and hands-on demos of our coolest collections. Come scan a book, play in a virtual reality arcade, or talk about 78 rpm recordings with DJ Chas Gaudi. When you arrive, be sure to get your library card. If you “check out” all the stations on your card, we’ll reward you with a special Internet Archive gift.

Starting at 7 p.m., we’ll unveil the latest media the Internet Archive has to offer, presented by the artists, writers, and scientists who lose themselves in our collections every day. And to keep you dancing into the evening, DJ Phast Phreddie the Boogaloo Omnibus will once again be spinning records from 8-9:30. Come join our celebration!

Event Info: Wednesday, October 11th
5pm: Cocktails, food trucks, and hands-on demos
7pm: Program
8pm: Dessert and Dancing

Location:  Internet Archive, 300 Funston Avenue, San Francisco

Get your tickets now!

Free bike valet parking available!

 

Posted in Announcements, Event, News | 6 Comments