Guest Post: Preserving Digital Music – Why Netlabel Archive Matters

The following entry is by Simon Carless, who worked for the Internet Archive in the early 2000s before moving on to work in media and conferences, while simultaneously maintaining collections at the Internet Archive and running the free game information site MobyGames.

It’s fascinating that early Internet-era (digital) data can sometimes be trickier to preserve & access than pre-Internet (analog) data. A prime example is the amazing work of the Netlabel Archive, which I wanted to both laud and highlight as ‘digital archiving done right’.

Created in 2016 by the amazing Zach Bridier, the Netlabel Archive has preserved the catalogs of 11 early ‘netlabels’ and counting, a number of which involve music that was either completely unavailable online, or difficult to listen to online. One of these netlabels is the one that I ran from 1996 to 2009, Mono/Monotonik. So obviously, I’m particularly delighted by that project. But a number of the other netlabels are also great and previously tricky to access, and I’m even more excited for those. (Reminder: all these netlabels freely distributed their music at the time, which makes it a great thing to archive and bring back.)

The nub of the problem with early netlabels, particularly from 1996 to 2003, is that PCs & the Internet (& pre-Internet BBSes!) were just not fast enough, and did not have enough storage, to support MP3 downloads at that time.

So this early netlabel music – on PCs and even other computers like Commodore Amigas – was composed as much smaller (in kB!) module files, which were created and played back on computers using sample data and MIDI-style ‘note triggering’ with rudimentary real-time effects. This allowed 5-minute songs to be just 30kB-300kB in size, versus the 5MB or more that an MP3 takes.
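To put rough numbers on that size gap, here is a minimal back-of-the-envelope sketch. The 128 kbit/s constant bitrate is an assumption of mine (the post only says "5MB or more"), and the function name is purely illustrative.

```python
# Rough size comparison: a 5-minute MP3 versus a tracker module.
# ASSUMPTION: a constant 128 kbit/s MP3 bitrate (not stated in the post).
MP3_BITRATE_BITS_PER_SEC = 128_000
SONG_SECONDS = 5 * 60


def mp3_size_bytes(seconds: int, bitrate: int = MP3_BITRATE_BITS_PER_SEC) -> int:
    """Approximate size of a constant-bitrate MP3, ignoring tags and headers."""
    return seconds * bitrate // 8


mp3_bytes = mp3_size_bytes(SONG_SECONDS)

# 30 kB and 300 kB are the module sizes quoted in the post.
for module_kb in (30, 300):
    ratio = mp3_bytes / (module_kb * 1000)
    print(f"{module_kb} kB module: the MP3 is roughly {ratio:.0f}x larger")
```

Under that assumed bitrate the MP3 comes out at a little under 5MB, which lines up with the figure above.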

For the more recent history of netlabels, I founded the Netlabels collection at the Internet Archive back in 2003, and that’s grown to hold over 65,000 individual music releases – and hundreds of thousands of tracks – by 2016. But the Internet Archive’s collection was largely designed to hold MP3 and OGG files, and so the early .MODs, .XMs and .ITs were not always preserved as part of this collection – and they were certainly not listenable to in-browser.

Additionally, there were a number of netlabels that used their own storage instead of the Internet Archive’s, even after 2003. But if that storage disappeared, their data disappeared with it, and music files are generally too large to be archived by the saintly Wayback Machine.

So where early netlabel archives exist, they exist as ZIP/LHA archives on relevant demoscene FTP sites. (Netlabels were spawned from the demoscene to some extent, since demo soundtracks use the same format of .MODs and .XMs.) And tracker music is annoyingly hard to play on today’s PCs and Macs – there are programs (such as VLC & more specialist apps) which do it, but it’s not remotely mainstream & not web browser-streamable.

So what Zach has done is keep the original .ZIP/.LHA files, which often had additional ASCII art & release info in them, save the .MODs and .XMs, convert everything to .MP3, painstakingly catalog all of the releases, and then upload the entire caboodle (both original and converted files) to both the Internet Archive and additionally to YouTube, where there are gigantic playlists for each label. So there are now multiple opportunities for in-browser listening & the original files are also properly preserved.
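As a rough illustration of the cataloging step only (this is not Zach’s actual tooling, which the post does not describe), the sketch below sorts a release archive’s entries into tracker modules and everything else. The file names and the `catalog_release` helper are hypothetical, and the stand-in ZIP is built in memory so the example runs without real files.

```python
import io
import os
import zipfile

# Extensions named in the post; real releases may use others.
MODULE_EXTENSIONS = {".mod", ".xm", ".it"}


def catalog_release(zip_bytes: bytes) -> dict:
    """Split a release archive's entries into tracker modules and extras."""
    modules, extras = [], []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            ext = os.path.splitext(name)[1].lower()
            (modules if ext in MODULE_EXTENSIONS else extras).append(name)
    return {"modules": sorted(modules), "extras": sorted(extras)}


# Hypothetical stand-in release: one module plus a release-info file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("TRACK01.XM", b"placeholder module data")
    archive.writestr("FILE_ID.DIZ", b"placeholder release info")

catalog = catalog_release(buffer.getvalue())
print(catalog)
```

Keeping the extras listed alongside the modules mirrors the point above: the ASCII art and release info are part of what is worth preserving, not just the music.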

This means we can now all easily browse and listen to the complete catalog of Five Musicians, a seminal early global PC tracker group/netlabel, as well as the super-neat Finnish electronic music netlabel Milk, the aggressive chiptune/noise label mp3death, and a host of others. And I recently uploaded a rare FTP backup from 1998 which allowed him to put up the 10 releases (that we know about!) from funky electronic netlabel Cutoff. These may have been partially online in databases like Modland, but certainly weren’t this accessible, complete, or well-collected.

What’s somewhat crazy about this is that we’re not even talking about ancient history here – at most, these digital files are 20 years old. And they’re already becoming difficult to access, listen to, or in a few cases even find.

For example, I had to dig deep into backup CD-ROMs to find some of the secret bootleg No’Mo releases that we deliberately _didn’t_ put on the Mono website back in 1996 – opting to distribute them via BBSes instead. These files literally didn’t exist on the Internet any more, despite being small and digital-native.

I think that’s – hopefully – the exception rather than the rule. But without diligent work by Zach (much kudos to him!) & similar work by other citizen digital activists like the 4am Apple II archiver, Jason Scott (obviously!) and a host of others, we’d have issues. And we may need more help still – some of these digital-first materials may disappear permanently, as the CD-ROMs or other media they are on become unreadable.

But we’re still doing a PRETTY good job on preservation, especially with CD-ROMs being ingested in massive amounts onto the Internet Archive regularly. (I’m working with MobyGames & another to-be-announced organization on preserving video game press CD-ROMs, for example, and Jason Scott’s CD-ROM work is many magnitudes larger than mine.)

Yet I actually think contextualization and access to these materials is just as big a problem, if not bigger. Once we’ve got this raw data, who’s available to look through it, pick out the relevant stuff, and make it easily viewable or streamable to anyone who wants to see it? That’s why the game art/screenshots on those press CD-ROMs are also being extracted and uploaded to MobyGames for easy Google Images access, and why Netlabel Archive’s work to put streamable versions of the music on the Internet Archive and YouTube is so vital. (And why playable-in-browser emulation work is SO very important!)

In the end, you can preserve as much data as you want, but if nobody can find it or understand it, well – it’s not for naught, but it’s also not the reason you went to all the trouble of archiving it in the first place. And the fact the Netlabel Archive does both – the preserving AND the accessibility – makes it a gem worth celebrating. Thanks again for all your work, Zach.


Posted in News | 1 Comment

Persistent URL Service Now Run by the Internet Archive


OCLC and the Internet Archive today announced the results of a year-long cooperation to ensure the future of OCLC’s persistent URL (PURL) service. The organizations have worked together to build a new service hosted by the Internet Archive that will manage its persistent URLs and sub-domain redirections.

Since its introduction by OCLC Research in 1995, the service has provided a source of Persistent URLs (PURLs) that redirect users to the correct hosting location for documents, data, and websites as they change over time.
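The redirect idea can be sketched in a few lines. This is only an illustration of the general concept, not how the real service is implemented; the table, paths, hosts, and function names below are all made up for the example.

```python
from typing import Optional, Tuple

# Hypothetical registry: persistent path -> current hosting location.
purl_table = {
    "/example/spec": "https://old-host.example/spec.html",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return an HTTP-style (status, Location) pair for a persistent path."""
    target = purl_table.get(path)
    if target is None:
        return 404, None
    return 302, target


def update_target(path: str, new_target: str) -> None:
    """Repoint a persistent path when the document moves; the path itself stays fixed."""
    purl_table[path] = new_target


print(resolve("/example/spec"))
update_target("/example/spec", "https://new-host.example/docs/spec.html")
print(resolve("/example/spec"))
```

The point of the design is that anyone citing the persistent path never has to change their link; only the owner’s entry in the table changes.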

With more than 2,500 users, including publishing and metadata organizations such as Dublin Core, the service has become important to the smooth functioning of the Web, data on the Web, and the Semantic Web in particular.

Brewster Kahle of the Internet Archive said, “We share a common belief with OCLC that what is shared on the Web should be preserved, so it makes perfect sense for us to add this important service to our set of tools and services, including the Wayback Machine, as part of our mission to promote universal access to all knowledge.”

Lorcan Dempsey of OCLC welcomed the announcement as “a major step in the future sustainability and independence of this key part of the Web and linked data architectures. OCLC is proud to have introduced persistent URLs in the early days of the Web, and we have continued to host and support the service for the last twenty years. We welcome the move of the service to the Internet Archive, which will help them continue to archive and preserve the world’s knowledge as it evolves.”

All previous PURL definitions have been transferred to the Internet Archive and can continue to be maintained by their owners through a new web-based interface.

About OCLC:
OCLC is a nonprofit global library cooperative providing shared technology services, original research and community programs so that libraries can better fuel learning, research and innovation. Through OCLC, member libraries cooperatively produce and maintain WorldCat, the most comprehensive global network of data about library collections and services. Libraries gain efficiencies through OCLC’s WorldShare, a complete set of library management applications and services built on an open, cloud-based platform. It is through collaboration and sharing of the world’s collected knowledge that libraries can help people find answers they need to solve problems. Together as OCLC, member libraries, staff and partners make breakthroughs possible.

About Internet Archive:
The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library, with the purpose of offering permanent access for researchers, historians, and scholars to historical collections that exist in digital format.

Posted in Announcements, News | 6 Comments

Tales from the TV News Archive presidential debate near real-time livestream

During last night’s presidential debate, the Internet Archive’s TV News Archive experimented with something new: a near real-time live stream of the first presidential debate. This online video stream is editable, embeddable, and shareable on social media. We were the only public library of the debate capturing these clips within minutes, while the candidates were still duking it out. The debate is preserved on the TV News Archive site for posterity. And when the vice presidential candidates, Tim Kaine and Mike Pence, meet for their debate on October 4, the TV News Archive will be making this live stream available to journalists and the general public.

During the debate, we matched up TV debate video with fact checks from our Political TV Ad Archive partners, including PolitiFact. Here are some representative examples from last night’s debate:

Minute 15: Hillary Clinton said, “Donald thinks that climate change is a hoax perpetrated by the Chinese.” “I do not say that,” said Trump. “Mostly True,” read the fact check posted by PolitiFact reporters. Jessica Clark, founder of Dot Connector Studio and a consultant to the TV News Archive, was able to link the two in a tweet.

Minute 20: Donald Trump said, “I was against the war in Iraq.” One of our fact-checking partners posted a timeline of Trump’s statements about the Iraq war, pointing out that Trump had voiced support for the war in 2002 in an interview with “shock jock” Howard Stern. I tweeted that timeline.

Minute 36: Donald Trump said, “You learn a lot from financial disclosures” as opposed to tax returns. “False,” posted PolitiFact, “Trump has not released his tax returns, which experts say would offer valuable details on his effective tax rate, the types of taxes he paid, and how much he gave to charity, as well as a more detailed picture of his income-producing assets.” This sort of information is not included on financial disclosure forms. I linked to the fact check in a tweet.


Minute 44: Hillary Clinton said: “The gun epidemic is the leading cause of death of young African American men, more than the next nine causes put together.” “True,” posted PolitiFact. Roger Macdonald, TV News Archive director, tweeted a link to the TV debate clip, along with the fact check.

Overall, fact checking was a crucial part of last night’s debate, as Clark noted.

The near real-time live stream experiment was part of our collaboration around the debates with the Annenberg Public Policy Center, to bring context to the 2016 presidential debates. Stay tuned: today we are drilling down on how TV news is covering the debates. Which video clips are they picking up from the debates in post-debate analyses? We’ll be making that information available to the public, as well as to academic researchers at the Annenberg Public Policy Center for integration into their post-debate surveys.

Posted in News

The Internet Archive Turns 20!

For 20 years, the Internet Archive has been capturing the Web, that amazing universe of images, audio, text and software that forms our shared digital culture. Now it’s time to celebrate and we’re throwing a party! Please join us for our 20th Anniversary celebration on Wednesday, October 26th, 2016, from 5-9:30 pm.


Get your free tickets here.

We’ll kick off the evening with cocktails, taco trucks and hands-on demos of our coolest tools. Come scan a book, play in a virtual reality arcade, or try out the brand new search feature in the Wayback Machine. When you arrive, be sure to get your library card. “Check out” all the stations on your card and we’ll reward you with a special gift commemorating our 20th anniversary.


For the program starting at 7 p.m., we’ve commissioned Paul D. Miller aka DJ Spooky, composer, author and multimedia artist, to create a short musical montage drawn from the Internet Archive’s audio collections. We’ll look back on some of the defining digital moments of the past 20 years, and explore how media and messaging captured in our Political TV Ad Archive are impacting the 2016 Election.

And to keep you dancing into the evening, DJ Phast Phreddie the Boogaloo Omnibus will be spinning 45rpm records from 8-9:30. We hope you can join our celebration!

Event Info:

Wednesday, October 26th
5pm: Cocktails, tacos, and hands-on demos
7pm: Program
8pm: Dessert, Dancing and more Demo stations

Location:  Internet Archive, 300 Funston Avenue, San Francisco

Be sure to reserve your ticket today!


Posted in Announcements, News

Dear Congress: Please Don’t Make It More Difficult And Dangerous To Be A Library

Last Friday, the Internet Archive and several of our library, archive, and museum partners sent a letter to House Judiciary Committee Chairman Bob Goodlatte (R-VA) urging him not to make it more difficult and dangerous to be a library.

As we wrote over the summer, the U.S. Copyright Office is proposing to completely rewrite Section 108, the part of the law that is designed to support traditional library functions such as preservation and inter-library loans. Although the proposal has not been made public yet, we understand from our meeting with the Copyright Office that it wants to redefine who gets to be a library, making it harder for small players and virtual libraries to be protected under the law. The proposal is also likely to be damaging to fair use and may add new, burdensome regulations on libraries that archive the web (among other things).

Thankfully, the Copyright Office does not write the law; that is up to Congress. Our letter explains that now is not the time to scrap the old law, which is working well. The Copyright Office’s proposal is not only unnecessary, but potentially harmful to library efforts to increase access to information. We hope Congress will take the strong objections of the library community seriously when considering the Copyright Office’s proposal to rewrite the law that applies to libraries.

Posted in News | 3 Comments

SAVE THE DATE — The Internet Archive Turns 20!

Our 20th anniversary is coming up and we’re throwing a party! Please save the date and join us for our annual celebration on Wednesday, October 26th, 2016.

We’ll kick off the evening with cocktails, tacos and hands-on demo stations. Come scan a book, play in a virtual reality arcade, search billions of Web pages in our Wayback Machine and so much more! Then check out the interactive new media projects by talented artists working with our collections.

Starting at 7 p.m., Paul D. Miller aka DJ Spooky (composer, author, teacher, electronics DJ and multi-media artist) will perform a short, original musical retrospective of the Internet Archive’s audio collections. We’ll look back on some of the defining moments of the past 20 years, and explore how media and messaging are impacting the 2016 Election.

And to keep you dancing into the evening, DJ Phast Phreddie the Boogaloo Omnibus will be spinning 45rpm records from 8-9:30. We hope you can join our celebration!

Event Info:

Wednesday, October 26th
5pm: Cocktails, tacos, and hands-on demos
7pm: Program
8pm: Dessert and Dancing

Location: Internet Archive, 300 Funston Avenue, San Francisco, CA 94118

Posted in Event | 5 Comments

Rock Against the TPP is Coming to San Francisco…TOMORROW!

On Friday, September 9th, hip hop icons Dead Prez, actress Evangeline Lilly, punk legend Jello Biafra, Grammy winners La Santa Cecilia, and others will play a free concert at the Regency Ballroom in San Francisco to protest the Trans-Pacific Partnership (TPP).

The TPP is a contentious trade agreement that is getting quite a bit of negative press in the 2016 U.S. election cycle. Among many other issues, the TPP would govern how signatory countries protect and enforce intellectual property rights. The TPP could have a large negative impact on libraries by increasing copyright term limits and neglecting the essential limitations on copyright law that libraries around the world rely on. Many different groups have vocally opposed the TPP, both for its substance and for the secrecy of the negotiations process.

Organized by Fight for the Future and Rage Against the Machine guitarist Tom Morello, the Rock Against the TPP tour is designed to pull new audiences into the fight against the TPP.

The concert will be followed by a teach-in on “How to Fight the TPP” on Saturday, Sept. 10th from 1pm – 3pm at 1999 Bryant Street, hosted by experts from a wide range of organizations opposing the TPP.

Posted in Event, Music, News | 2 Comments

Saving the 78s

Written by B. George, the Director of ARChive of Contemporary Music in NYC, and Curator of Sound Collections at the Internet Archive in San Francisco.

While audio CDs whiz by at about 500 revolutions per minute, the earliest flat disks offering music whirled at 78rpm. They were mostly made from shellac, i.e., beetle (the bug, not The Beatles) resin, and were the brittle predecessors to the LP (microgroove) era. The format is obsolete, the surface noise is often unbearable, and just picking them up can break your heart as they break apart in your hands. So why does the Internet Archive have more than 200,000 in our physical possession?

A little over a year ago New York’s ARChive of Contemporary Music (ARC) partnered with the Internet Archive to focus on preserving and digitizing audio-visual materials. ARC is the largest independent collection of popular music in the world. When we began in 1985 our mandate was microgroove recordings – meaning vinyl – LPs and forty-fives. CDs were pretty much rumors then, and we thought that other major institutions were doing a swell job of collecting earlier formats, mainly 78rpm discs. But donations and major research projects like making scans for The Grammy Museum and The Ertegun Jazz Hall of Fame placed about 12,000 78s in our collection.

For years we had been getting calls offering 78 collections that we were unable to accept. But when space and shipping became available through the Internet Archive, it was now possible to begin preserving 78s. Here’s a short history of how in only a few years ARC and the Internet Archive have created one of the largest collections in America.

Our first major donation came from the Batavia Public Library in Illinois, part of the Barrie H. Thorp Collection of 48,000 78s.

We’re always a tad suspicious of large collections like these. First thought is, “Must be junk.” Secondly, “It’s been cherrypicked.” But the Thorp Collection was screened by former ARC Board member Tom Cvikota, who found the donor, helped negotiate the gift and stored it. That was in 2007. Between then and our 2015 pickup Tom arranged for some of the recordings to be part of an exhibition at the Greengrassi Gallery, London, (UK, Mar-Apr, 2014) by artist Allen Ruppersberg, titled, For Collectors Only (Everyone is a Collector).

What makes the Thorp collection unique is the obsessive typewritten card catalog featured in a short film hosted on the exhibition’s webpage. Understanding why you collect and how you give your interests meaning is a part of Allen’s work – artworks that focus on the collector’s mentality. One nice quote by Allen, referenced in Greil Marcus’ book The History of Rock ’n’ Roll in Ten Songs, is, “In some cases, if you live long enough, you begin to see the endings of things in which you saw the beginnings.”

Philosophical musings aside, there are 48,000 discs to deal with. That meant taking poorly packed boxes — many of them open for 20 years — and re-boxing them for proper storage. The picture below shows an example of how they arrived (on the right), and how they were palletized (on the left.)

The trick to repacking in a timely fashion is to not look at the records. It’s a trick that is never performed successfully. Handling fragile 78s requires grabbing one or just a few at a time. So we’re endlessly reading the labels, sleeving and resleeving, all the time checking for rarities, breakage and dirt.

Now we didn’t do all this work on our own. Working another part of the warehouse was two-and-a-half month old Zinnia Dupler — the youngest volunteer ever to give us a hand. Mom also helped a bit.


A few minutes after the snap I found this gem in the Thorp collection. Coincidence? I don’t think so…

“Burpin” is a country novelty tune from out of Texas by Austin broadcaster and humorist Richard “Cactus” Pryor (1923 – 2011). It came from a box jam-packed with country and hillbilly discs. This was a pleasant surprise, as we expected the collection to be like most we encounter – big band and bland pop. But here was box-after-box of hillbilly, country, and Western swing records. Now, I use’ta think I knew a bit about music. But with this collection, it was back to school for me. Just so many artists I’ve never heard of or held a record by. As we did a bit of sorting, in the ‘G’s alone there’s Curly Gribbs, Lonnie Glosson and the Georgians. Geeez! Did you know that Hank Snow had a recordin’ kid, Jimmy, and he cut “Rocky Mountain Boogie” on 4 Star Records, or that Cass Daley, star of stage and screen, was the ‘Queen of Musical Mayhem’? Me neither. The Davis Sisters, turns out, included a young Skeeter Davis(!) and are not to be confused with the Davis Sisters gospel group, also in this collection. Then there’s them Koen Kobblers, Bill Mooney and his Cactus Twisters, and Ozie Waters and the Colorado Hillbillies. No matter they should be named the Colorado Mountaineers, they’re new to me.

For us this donation is a dream: it allows us to preserve material that was otherwise going to be thrown away; it has a larger cultural value beyond the music; and it contained a mountain of unfamiliar music, much of it quite rare. And most of it is not available online.

It was a second large donation that prompted the Internet Archive to move toward the idea that we should digitize all of our 78s. The Joe Terino Collection came to us through a cold call, the collection professionally appraised at $500,000. The 70,000 plus 78s were stored in a warehouse for more than 40 years, originally deposited by a distributor. Here’s the kicker: they said that we could have it all, but we had to move it – NOW! Internet Archive did and it came in on 72 pallets, in three semis, from Rhode Island to San Francisco, looking like this…

So Fred Patterson and the crackerjack staff out in our Richmond warehouses (Marc Wendt, Mark Graves, Sean Fagan, Lotu Tii, Tracey Gutierrez, Kelly Ransom, and Matthew Soper) pulled everything off the ramshackle pallets and carefully reboxed this valuable material.


How valuable? Well, we’re really not so sure yet, despite the appraisal, as just receiving and reboxing was such a chore. One hint is this sweet blues 78 that we managed to skim off the top of a pile.


The next step is curating this material, acquiring more collections and moving towards preservation through digitization. Already we have a pilot project in the works with master preservationist George Blood to develop workflow and best digitization practices.

We’re doing all this because there’s just no way to predict if the digital will outlast the physical, so preserving both will ensure the survival of cultural materials for future generations to study and enjoy. And, it’s fun.


Posted in Announcements, Audio Archive, Music | 8 Comments

Hacking Web Archives

The awkward teenage years of the web archive are over. It is now 27 years since Tim Berners-Lee created the web and 20 years since we at Internet Archive set out to systematically archive web content. As the web gains ever more “historicity” (i.e., it’s old and getting older, just like you!), it is increasingly recognized as a valuable historical record of interest to researchers and others working to study it at scale.

Thus, it has been exciting to see, and for us to support and participate in, a number of recent efforts in the scholarly and library/archives communities to hold hackathons and datathons focused on getting web archives into the hands of researchers and users. The events have served to help build a collaborative framework to encourage more use, more exploration, more tools and services, and more hacking (and similar levels of the sometimes-maligned-but-ever-valuable yacking) to support research use of web archives. Get the data to the people!

First, in May, in partnership with the Alexandria Project of L3S at University of Hannover in Germany, we helped sponsor “Exploring the Past of the Web: Alexandria & Archive-It Hackathon” alongside the Web Science 2016 conference. Over 15 researchers came together to analyze almost two dozen subject-based web archives created by institutions using our Archive-It service. Universities, archives, museums, and others contributed web archive collections on topics ranging from the Occupy Movement to Human Rights to Contemporary Women Artists on the Web. Hackathon teams geo-located IP addresses, analyzed sentiments and entities in webpage text, and studied mime type distributions.
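For a flavor of the simplest of those analyses, here is a minimal sketch of a mime type distribution. It is not the hackathon teams’ code; the record layout, field names, and sample URLs are hypothetical stand-ins for whatever a real web archive index provides.

```python
from collections import Counter

# Hypothetical capture records; real web archive indexes carry many more fields.
records = [
    {"url": "http://example.org/", "mime": "text/html"},
    {"url": "http://example.org/about", "mime": "text/html"},
    {"url": "http://example.org/logo.png", "mime": "image/png"},
    {"url": "http://example.org/report.pdf", "mime": "application/pdf"},
]


def mime_distribution(recs):
    """Return each mime type's share of all captures, most common first."""
    counts = Counter(r["mime"] for r in recs)
    total = sum(counts.values())
    return {mime: n / total for mime, n in counts.most_common()}


distribution = mime_distribution(records)
for mime, share in distribution.items():
    print(f"{mime}: {share:.0%}")
```

At archive scale the same counting would be done on a cluster rather than in a single Python process, but the shape of the question is the same.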

Similarly, in June, our friends at Library of Congress hosted the second Archives Unleashed datathon, a follow-on to a previous event held at University of Toronto in March 2016. The fantastic team organizing these two Archives Unleashed hackathons has created an excellent model for bringing together transdisciplinary researchers and librarians/archivists to foster work with web data. In both Archives Unleashed events, attendees formed into self-selecting teams to work together on specific analytical approaches and with specific web archive collections and datasets provided by Library of Congress, Internet Archive, University of Toronto, GWU’s Social Feed Manager, and others. The #hackarchives tweet stream gives some insight into the hacktivities, and the top projects were presented at the Save The Web symposium held at LC’s Kluge Center the day after the event.

Both events show a bright future for expanding new access models, scholarship, and collaborations around building and using web archives. Plus, nobody crashed the wi-fi at any of these events! Yay!

Special thanks go to Altiscale (and Start Smart Labs) and ComputeCanada for providing cluster computing services to support these events. Thanks also go to the multiple funding agencies, including NSF and SSHRC, that provided funding, and to the many co-sponsoring and hosting institutions. Super special thanks go to key organizers, Helge Holzmann and Avishek Anand at L3S and Matt Weber, Ian Milligan, and Jimmy Lin at Archives Unleashed, who made these events a rollicking success.

For those interested in participating in a web archives hackathon/datathon, more are in the works, so stay tuned to the usual social media channels. If you are interested in helping host an event, please let us know. Lastly, for those that can’t make an event, but are interested in working with web archives data, check out our Archives Research Services Workshop.


Here’s to more happy web archives hacking in the future!

Posted in Archive-It, News | 3 Comments

The Hidden Shifting Lens of Browsers


Some time ago, I wrote about the interesting situation we had with emulation and Version 51 of the Chrome browser – that is, our emulations stopped working in a very strange way and many people came to the Archive’s inboxes asking what had broken. The resulting fix took a lot of effort and collaboration with groups and volunteers to track down, but it was successful and ever since, every version of Chrome has worked as expected.

But besides the interesting situation with this bug (it actually made us perfectly emulate a broken machine!), it also brought into a very sharp focus the hidden, fundamental aspect of Browsers that can easily be forgotten: Each browser is an opinion, a lens of design and construction that allows its user a very specific facet of how to address the Internet and the Web. And these lenses are something that can shift and turn on a dime, and change the nature of this online world in doing so.

An eternal debate rages on what the Web is “for” and how the Internet should function in providing information and connectivity. For the now-quite-embedded millions of users around the world who have only known a world with this Internet and WWW-provided landscape, the nature of existence centers around the interconnected world we have, and the browsers that we use to communicate with it.


Avoiding too much of a history lesson at this point, let’s instead just say that when Browsers entered the landscape of computer usage in a big way, circa 1995, after being one of several resource-intensive experimental programs, the effect on computing experience and acceptance was unparalleled since the plastic-and-dreams home computer revolution of the 1980s. Suddenly, in one program came basically all the functions of what a computer might possibly do for an end user, all of it linked and described and seemingly infinite. The more technically-oriented among us can point out the gaps in the dream and the real-world efforts behind the scenes to make things do what they promised, of course. But the fundamental message was: Get a Browser, Get the Universe. Throughout the late 1990s, access came in the form of mailed CD-ROMs, or built-in packaging, or Internet Service Providers sending along the details on how to get your machine connected, and get that browser up and running.

As I’ve hinted at, though, this shellac of a browser interface was the rectangular window to a very deep, almost Brazil-like series of ad-hoc infrastructure, clumsily-cobbled standards and almost-standards, and ever-shifting priorities in what this whole “WWW” experience could even possibly be. It’s absolutely great, but it’s also been absolutely arbitrary.

With web anniversaries aplenty now coming into the news, it’ll be very easy to forget how utterly arbitrary a lot of what we think the “Web” is, happens to be.

There’s no question that commercial interests have driven a lot of browser features – the ability to transact financially, to ensure the prices or offers you are being shown, are of primary interest to vendors. Encryption, password protection, multi-factor authentication and so on are sometimes given lip service for private communications, but they’ve historically been presented for the store to ensure the cash register works. From the early days of a small padlock icon being shown locked or unlocked to indicate “safe”, to official “badges” or “certifications” being part of a webpage, the browsers have frequently shifted their character to promise commercial continuity. (The addition of “black box” code to browsers to satisfy the ability to stream entertainment is a subject for another time.)

Flowing from this same thinking has been the overriding need for design control, where the visual or interactive aspects of webpages are the same for everyone, no matter what browser they happen to be using. Since this was fundamentally impossible in the early days (different browsers have different “looks” no matter what), the solutions became more and more involved:

  • Use very large image-based mapping to control every visual aspect
  • Add a variety of specific binary “plugins” or “runtimes” by third parties
  • Insist on adoption of a number of extra-web standards to control the look/action
  • Demand all users use the same browser to access the site

Evidence of all these methods pops up across the years, with varying success.

Some of the more well-adopted methods include the Flash runtime for visuals and interactivity, and the use of Java plugins for running programs within the confines of the browser’s rectangle. Others, such as the wide use of Rich Text Format (.RTF) for reading documents, or the Realaudio/video plugins, gained followers or critics along the way, and ultimately faded into obscurity.

And as for demanding all users use the same browser… well, that still happens, but not with the same panache as the old Netscape Now! buttons.


This puts the Internet Archive into a very interesting position.

With 20 years of the World Wide Web saved in the Wayback Machine, and URLs by the billions, we’ve seen the moving targets move, and how fast they move. Where a site previously might be a simple set of documents and instructions that could be arranged however one might like, there is now a whole family of sites with much more complicated inner workings than will be captured by any external party, in the same way you would capture a museum by photographing its paintings through a window from the courtyard.

When you visit the Wayback and pull up that old site and find things look different, or are rendered oddly, that’s a lot of what’s going on: weird internal requirements, experimental programming, or tricks and traps that only worked in one brand of browser and one version of that browser from 1998. The lens shifted; the mirror has cracked since then.

This is a lot of philosophy and stray thoughts, but what am I bringing this up for?

The browsers that we use today, the Firefoxes and the Chromes and the Edges and the Braves and the mobile white-label affairs, are ever-shifting in their own right, more than ever before, and should be recognized as such.

It was inevitable that constant-update paradigms would become dominant on the Web: you start a program and it does something and suddenly you’re using version 54.01 instead of version 53.85. If you’re lucky, there might be a “changes” list, but that luck might be variant because many simply write “bug fixes”. In these updates are the closing of serious performance or security issues – and as someone who knows the days when you might have to mail in for a floppy disk to be sent in a few weeks to make your program work, I can totally get behind the new “we fixed it before you knew it was broken” world we live in. Everything does this: phones, game consoles, laptops, even routers and medical equipment.

But along with this shifting of versions comes the occasional fundamental change in what browsers do, along with making some aspect of the Web obsolete in a very hard-lined way.

Take, for example, Gopher, a (for lack of an easier description) proto-web that allowed machines to be “browsed” for information that would be easy for users to find. The ability to search, to grab files or writings, and to share your own pools of knowledge were all part of the “Gopherspace”. It was also rather non-graphical by nature and technically oriented at the time, and the graphical “WWW” utterly flattened it when the time came.

But since Gopher had been a not-insignificant part of the Internet when web browsers were new, many of them would wrap in support for Gopher as an option. You’d use the gopher:// URI, and much like the ftp:// or file:// URIs, it co-existed with http:// as a method for reaching the world.

Until it didn’t.

Microsoft, citing security concerns, dropped Gopher support out of its Internet Explorer browser in 2002. Mozilla, after a years-long debate, did so in 2010. Here’s the Mozilla Firefox debate that raged over Gopher Protocol removal. The functionality was later brought back externally in the form of a Gopher plugin. Chrome never had Gopher support. (Many other browsers have Gopher support, even today, but they have very, very small audiences.)

The Archive has an assembled collection of Gopherspace material here.  From this material, as well as other sources, there are web-enabled versions of Gopherspace (basically, http:// versions of the gopher:// experience) that bring back some aspects of Gopher, if only to allow for a nostalgic stroll. But nobody would dream of making something brand new in that protocol, except to prove a point or for the technical exercise. The lens has refocused.
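For anyone who never met the protocol, here is a minimal Python sketch of what a gopher:// client has to do, as I understand RFC 1436 and the gopher URL convention: open a TCP connection (port 70 by default), send the selector followed by CRLF, and read until the server hangs up. Menu lines come back as a one-character item type, a display string, and tab-separated selector, host and port. The function names and the example.org host are made up for illustration; this is not the code any particular browser shipped.

```python
import socket
from urllib.parse import urlsplit, unquote


def gopher_request(url):
    """Turn a gopher:// URL into (host, port, bytes to send)."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL: " + url)
    port = parts.port or 70  # 70 is the conventional Gopher port
    path = unquote(parts.path)
    # The path is "/<item type><selector>"; drop the slash and the type char.
    selector = path[2:] if len(path) > 1 else ""
    return parts.hostname, port, selector.encode("latin-1") + b"\r\n"


def parse_menu_line(line):
    """Split one menu line into its tab-separated fields."""
    line = line.rstrip("\r\n")
    item_type, rest = line[0], line[1:]
    display, selector, host, port = rest.split("\t")[:4]
    return {
        "type": item_type,      # e.g. "0" = text file, "1" = menu
        "display": display,
        "selector": selector,
        "host": host,
        "port": int(port),
    }


def fetch(url, timeout=10):
    """Send the request and read until the server closes the connection."""
    host, port, payload = gopher_request(url)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

That the whole exchange fits in a few dozen lines is a fair hint at why http:// gateways to Gopherspace are practical to build.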

In the present, Flash is beginning a slow, harsh exile into the web pages of history – browser support dropping, and even Adobe whittling away support and upkeep of all of Flash’s forward-facing projects. Flash was a very big deal in its heyday – animation, menu interface, games, and a whole other host of what we think of as “The Web” depended utterly on Flash, and even specific versions and variations of Flash. As the sun sets on this technology, attempts to keep it viewable, like the Shumway project, will hopefully allow the lens a few more years to be capable of seeing this body of work.

As we move forward in this business of “saving the web”, we’re going to experience “save the browsers”, “save the network”, and “save the experience” as well. Browsers themselves drop or add entire components or functions, and being able to touch older material becomes successively more difficult, especially when you might have to use an older browser with security issues. Our in-browser emulation might be a solution, or special “filters” on the Wayback for seeing items as they were back then, but it’s not an easy task at all – and it’s a lot of effort to see information that is just a decade or two old. It’s going to be very, very difficult.

But maybe recognizing these browsers for what they are, and coming up with ways to keep these lenses polished and flexible, is a good way to start.

Posted in Emulation, Technical, Wayback Machine | 2 Comments

No More 404s! Resurrect dead web pages with our new Firefox add-on.

Have you ever clicked on a web link only to get the dreaded “404 Document not found” (dead page) message? Have you wanted to see what that page looked like when it was alive? Well, now you’re in luck.

Recently the Internet Archive and Mozilla announced “No More 404s”, an experiment to help you to see archived versions of dead web pages in your Firefox browser. Using the “No More 404s” Firefox add-on you are given the option to retrieve archived versions of web pages from the Internet Archive’s 20-year store of more than 490 billion web captures available via the Wayback Machine.


To try this free service, and begin to enjoy a more reliable web, view this page with Firefox (version 48 or newer) then:

  1. Install the Firefox “Test Pilot”:
  2. Enable the “No More 404s” add-on:
  3. Try viewing this dead page:

See the banner that came down from the top of the window offering you the opportunity to view an archived version of this page?  Success!
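If you would rather script the same lookup than click through a browser, here is a minimal Python sketch. It assumes the Wayback Machine's public availability endpoint (https://archive.org/wayback/available) and the JSON shape shown in the comments; both are my understanding of that API and not a description of how the add-on itself works. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Pull the snapshot URL out of a response, or return None.

    Assumed response shape:
    {"archived_snapshots": {"closest": {"available": true, "url": "..."}}}
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def find_archived_copy(page_url, timestamp=None):
    """Ask the endpoint over the network for the closest archived copy."""
    with urlopen(availability_url(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

A caller would try the live page first and only fall back to `find_archived_copy` when the server answers 404, which is the same order of operations the banner suggests.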

For 20 years, the Internet Archive has been crawling the web, and is currently preserving web captures at the rate of one billion per week. With support from the Laura and John Arnold Foundation, we are making improvements, including weaving the Wayback Machine into the fabric of the web itself.

“We’d like the Wayback Machine to be a standard feature in every web browser,” said Brewster Kahle, founder of the Internet Archive. “Let’s fix the web — it’s too important to allow it to decay with rotten links.”

“The Internet Archive came to us with an idea for helping users see parts of the web that have disappeared over the last couple of decades,” explained Nick Nguyen, Vice President, Product, Firefox.

The Internet Archive started with a big goal — to archive the web and preserve it for history. Now, please help us. Test our latest experiment and email any feedback to

Posted in Announcements, Wayback Machine | 10 Comments

Microphone Check: Thousands of Hip-Hop Mixtapes at the Archive

The Internet Archive has been growing an interesting sub-collection of music for the past few months: Hip-Hop Mixtapes. The resulting collection still has a way to go before it’s anywhere near what is out there (limited by bandwidth and a few other technical factors), but now that it’s past 150 solid days of music on there, it’s quite enough to browse and “get the idea”, should you be so inclined.

Note: Hip-Hop tends to be for a mature audience, both in subject matter and language.

I’m sure this is entirely old knowledge for some people, but it was new to me, so I’ll describe the situation and the thinking.


There are some excellent introductions and writeups about mixtapes in Hip-Hop culture at these external articles:

So, in quick summary, there have been mixtapes of many varieties for many years, going back to the 1970s and the dawn of what we call Hip-Hop, and in the time since, the “tapes” have become CDs and ZIP files and are now still being released out into “the internet” to be spread around. The goal is to gain traction and attention for your musical act, or for your skills as a DJ, or any of a dozen reasons related to getting music to the masses.

There is an entire ecosystem of mixtape distribution and access. There are easily tens of thousands of known mixtapes that have existed. This is a huge, already-extant environment out there, that was established, culturally critical, and born-digital.

It only made sense for a library like the Internet Archive to provide it as well.

There’s a lot coded into the covers of these mixtapes (not to even mention the stuff coded into the lyrics themselves) – there’s stressing of riches, drug use, power, and oppression. There’s commentary on government, on social issues, and on the meaning of entertainment and celebrity. There’s parody, there’s aggrandizement, and there’s every attempt to draw in the listeners in what is a pretty large pile of material floating around. It’s not about this song or that grandiose portrait, though – it’s about the fact this whole set of material has meaning, reality and relevance to many, many people.

How do I know this has relevance? Within 24 hours of the first set of mixtapes going onto the Archive, many of the albums already had hundreds of listeners, and one of them broke a thousand views. Since then, a good amount have had tens of thousands of listens. Somebody wants this stuff, that’s for sure. And that’s fundamentally what the Archive is about – bringing access to the world.

The end goal here is simple: Providing free access to huge amounts of culture, so people can reference, contextualize, enjoy and delight over material in an easy-to-reach, linkable, usable manner. Apparently it’s already taken off, but here you go too.

Get your drank on here.

Posted in Announcements, Music, News | 2 Comments

Wayback Machine captures Melania Trump’s deleted internet bio

Melania Trump’s personal website is now gone from the internet — but is preserved by the Internet Archive’s Wayback Machine — after a Huffington Post reporter and other news outlets began questioning elements of the would-be First Lady’s biography.

Yesterday Christina Wilkie, a national political reporter for the Huffington Post, published a story noting that Melania Trump’s elaborate website, which existed as recently as July 20, now redirects to the Trump Organization’s official website. The removal of the website followed questions about a biography that appeared on it, that claimed that Melania Trump had “earned a degree in design and architecture at University in Slovenia.”

Many media outlets have followed suit, writing that the website has now disappeared.

Today Melania Trump tweeted that the website was taken down because “it does not accurately reflect my current and professional interests.”



Wilkie and other reporters had questioned whether Trump truly obtained those degrees from the university. The inquiries took on new potency after she was accused of possible plagiarism in her speech before the Republican National Convention last week. The campaign has not answered questions about the biography. has reported that there is no “University of Slovenia.”

Meanwhile, Melania’s original biography is preserved on the Internet Archive’s Wayback Machine, which crawls websites to create a historical archive. The most recent snapshot was taken on July 20 — see the screenshot below.

Screenshot 2016-07-28 13.00.56


The Political TV Ad Archive is tracking and archiving political ads in the 2016 elections. In addition, we’ve set up a special Archive-It collection to track candidates’ and political organizations’ social media websites here, with more than 320 million captures to date.

Cross posted on the Political TV Ad Archive. July 29: quote from Melania Trump’s defunct website corrected.

Posted in Announcements, News | 9 Comments

Pokébarbarians at the Gate

Millions of people from around the world visit the Internet Archive every day to read books, listen to audio recordings, watch films, use the Wayback Machine to revisit almost half a trillion web pages, and much more. Lately, though, we’ve had a different kind of visitor: gaggles of Pokémon Go players.

(In case you’ve been living in a cave without Internet connectivity for the last month, Pokémon Go is an augmented reality Internet game. Participants on three different teams band together to find and capture as many types of Pokémon as they can, sending Nintendo a goldmine of personal data in the process.)


It turns out that the stairs of the Internet Archive’s San Francisco headquarters are a PokéGym, a site where players can train their Pokémon and fight with other Pokémon. Fortunately, the Pokémon warriors aren’t rowdy or disruptive; they resemble somnambulistic zombies stumbling around under the control of their glowing smartphone screens.

As Jean Cocteau noted, “Fashion is everything that goes out of fashion.” Pokémon will join pet rocks, beanie babies, and chia pets in the annals of popular fads sooner than later. Perhaps then the gamers will take advantage of their Internet devices to discover that the Internet Archive has much more to offer than the ephemeral, pixelated creatures outside of our doors.

Posted in Announcements, News | 7 Comments

The Copyright Office is trying to redefine libraries, but libraries don’t want it — Who is it for?

The Library Copyright Alliance (which represents the American Library Association and the Association of Research Libraries) has said it does not want changes; the Society of American Archivists has said it does not want changes. The Internet Archive does not want changes, DPLA does not want changes… So why is the Copyright Office holding “hush hush” meetings to “answer their last questions” before going to Congress with a proposed rewrite of the section of Copyright law that pertains to libraries?

This recent move, which has its genesis in an outdated set of proposals from 2008, is just another in a series of out-of-touch ideas coming from the Copyright Office. We’ve seen them propose “notice and staydown” filtering of the Internet and disastrous “extended collective licensing” for digitization projects. These and other proposals have led some to start asking whose Copyright Office this is, anyway. Now the Copyright Office wants to completely overhaul Section 108 of the Copyright Act, the “library exceptions,” in ways that could break the Wayback Machine and repeal fair use for libraries.

We are extremely concerned that Congress could take the Copyright Office’s proposal seriously, and believe that libraries are actually calling for these changes. That’s why we flew to Washington, D.C. to deliver the message to the Copyright Office in person: now is not the time for changes to Section 108. Libraries and technology have been evolving quickly. Good things are beginning to happen as a result. Drafting a law now could make something that is working well more complicated, and could calcify processes that would otherwise continue to evolve to make digitization efforts and web archiving work even better for libraries and content owners alike.

In fact, just proposing this new legislation will likely have the effect of hitting the pause button on libraries. It will lead to uncertainty for the libraries that have already begun to modernize by digitizing their analog collections and learning how to collect and preserve born-digital materials. It could lead libraries who have been considering such projects to “wait and see.”

Perhaps that’s the point. Because the Copyright Office’s proposal doesn’t seem to help libraries, or the public they serve, at all.

Posted in Announcements, News | 13 Comments

Is it 1968? Not really — but past convention video clips show controversy

Research by Robin Chin

Is it 1968? Many pundits have been asking this question in recent days, in the lead up to what is expected to be a contentious–and some worry, violent–GOP convention in Cleveland, where Donald Trump is expected to accept the GOP nomination. A spate of mass gun killings, the deaths of two African American men in recent weeks at the hands of police, the murder of five police officers by a sniper during a demonstration and then three more by a lone gunman in Baton Rouge, terrorism here and abroad, involvement overseas in intractable conflicts, growing economic inequality: none of these developments quite parallels the tumultuous events of the 1960s. But the situation was volatile then, and it’s volatile now.

To set the scene, thanks to the TV News Archive, the Internet Archive’s online free library of TV news clips, revisiting some of the more “crazy” conventions of years past (headline by Politico), or simply notable or controversial moments, is just a search away. All of these clips are editable, embeddable, and shareable on social media.

Chicago, 1968

When the Democrats met in Chicago in 1968, it was in the shadow of the assassinations of Martin Luther King and Democratic primary candidate Robert Kennedy. Vice President Hubert Humphrey had the support of some 60 percent of the delegates, largely local party leaders — people who would be super delegates today. While a liberal, Humphrey’s support of the war as Lyndon B. Johnson’s vice president made him unpopular in the anti-war movement.

As described by Politico, “With Humphrey’s nomination all but certain, protesters associated with the Youth International Party (the Yippies) and National Mobilization Committee to End the War in Vietnam (the MOBE) took to the streets outside Chicago’s convention hall; inside, city policemen allied with the local political machine roughed up liberal delegates and journalists in plain view of news cameras. ‘I wasn’t sentenced and sent here!’ a prominent New York Democrat bellowed as a uniformed officer dragged him off the floor. ‘I was elected!’”

The clip below, from the CNN documentary series, “The Sixties,” shows police beating up protestors on the streets. A special commission appointed to investigate the protests characterized the violent events as a “police riot” directed at protesters and recommended prosecution of police who used indiscriminate violence.

That same night, Humphrey took to the podium to accept the nomination. He referred to the violence outside when he said, “[O]ne cannot help but reflect, the deep sadness that we feel over the troubles and the violence which have erupted regrettably and tragically in the streets of this great city and for the personal injuries that have occurred. Surely we have now learned the lesson that violence breeds counter violence and it cannot be condoned whatever the source.”

San Francisco, 1964

In 1964, GOP moderates Nelson Rockefeller and George Romney, then governor of Michigan, led an unsuccessful campaign against conservative insurgent Barry Goldwater, at a convention Goldwater biographer Robert Alan Goldberg later dubbed the “Woodstock of the right.” (Romney was former presidential candidate Mitt Romney’s father.) Goldwater was a fierce opponent of the Civil Rights Act and strong supporter of military intervention against the Soviet Union.

Some have compared him to Trump because of his belligerence and unpopularity with the establishment Republicans. For example, like Trump, he was not one to mince words about his enemies. At the convention, when asked by a reporter about LBJ and the Civil Rights Act, he replied, “He’s the phoniest individual who ever came around.”

The convention was raucous, filled with delegates booing the moderates — as when Rockefeller called on the crowd to reject extremists. But the moment most remembered was when Goldwater took the podium to accept the nomination, when, to enormous applause, he said:

“I would remind you that extremism in the defense of liberty is no vice. [applause] And let me remind you also that moderation in the pursuit of justice is no virtue.”

Goldwater went on to lose the election, badly, to Lyndon B. Johnson.

Other historic moments

The TV News Archive is full of many other convention speech clips of moments that turned history’s tide. Here, for example, is John F. Kennedy, accepting the Democratic nomination in 1960, stating that voters should not “throw away” their vote because of concern about his religious affiliation. He went on to become the first Catholic president of the United States.

And here is Richard Nixon, in his 1968 nomination speech, talking about the increase in crime and criticizing those who say “law and order” was code for racism. He was speaking to the charged issues surrounding race and policing at the time:

“Time is running out for the merchants of corruption…and to those who say law and order is a code word for racism there and here is the reply. Our goal is justice for every American. If we are to have respect for law in America we must have laws that deserve respect.”

Nixon’s words, however, have a doubly ironic ring today. First, because the debate over policing in the African American community stubbornly persists decades later. And second, because of his own role in covering up the Watergate scandal, which involved dirty tricks against the Democrats during the 1972 campaign. Nixon would eventually resign from the presidency in 1974. Three years later, in 1977, the journalist David Frost asked Nixon under what circumstances a president can do something illegal. Nixon’s famous answer: “Well, when the president does it, that means that it is not illegal.”

For those wanting to plumb the riches of past convention speeches, below is a list, with links, of most major convention speeches by nominees, starting with Harry Truman in 1948 and going to Barack Obama in 2012. The speeches were broadcast on C-Span.

1948: Harry Truman acceptance speech at Democratic National Convention in Philadelphia, PA Part 1.

Harry Truman acceptance speech at Democratic National Convention in Philadelphia, PA Part 2.

1952: Adlai Stevenson acceptance speech at Democratic National Convention in Chicago, IL Part 1.

Adlai Stevenson acceptance speech at Democratic National Convention in Chicago, IL Part 2.

1956: Republican Convention and Eisenhower’s nomination  Universal newsreel.

Dwight D. Eisenhower acceptance speech at Republican National Convention in Daly City, CA Part 1.

Dwight D. Eisenhower acceptance speech at Republican National Convention in Daly City, CA Part 2.

1960: John F. Kennedy acceptance speech at 1960 Democratic National Convention in Los Angeles, CA Part 1.

John F. Kennedy acceptance speech at 1960 Democratic National Convention in Los Angeles, CA Part 2.

Former President Herbert Hoover speech at Republican National Convention Chicago, IL.

Henry Cabot Lodge VP acceptance speech at Republican National Convention Chicago, IL.

1964: Barry Goldwater acceptance speech at Republican National Convention Daly City, CA.

Robert Kennedy speech at Democratic National Convention Atlantic City, NJ.

Lyndon Johnson acceptance speech Atlantic City, NJ Part 1.

Lyndon Johnson acceptance speech Atlantic City, NJ Part 2.

1968: Spiro Agnew VP acceptance speech at Republican National Convention in Miami Beach, FL.

Richard Nixon acceptance speech at Republican National Convention Miami Beach, FL.

Hubert Humphrey acceptance speech at Democratic National Convention Chicago, IL NBC News.

1972: McGovern acceptance speech at Democratic National Convention Miami Beach, FL Part 1.

McGovern acceptance speech at Democratic National Convention Miami Beach, FL Part 2.

Richard Nixon acceptance speech at Republican National Convention Miami Beach, FL.

Richard Nixon acceptance speech at Republican National Convention Miami Beach, Florida NBC News.

1976: Barbara Jordan keynote speech at Democratic Convention New York, NY.

Jimmy Carter acceptance speech at Democratic National Convention New York, NY Part 1.

Jimmy Carter acceptance speech at Democratic National Convention New York, NY Part 2.

August 17, 1976 Republican National Convention Kansas City, MO delegates debating Ronald Reagan rule requiring Ford to name VP before they vote CBS News Part 1.

August 17, 1976 Republican National Convention Kansas City, MO includes delegates debating Ronald Reagan rule C16 requiring Ford to name VP before they vote CBS News Part 2.

Gerald Ford acceptance speech at the Republican National Convention Kansas City, MO Part 1.

Gerald Ford acceptance speech at the Republican National Convention Kansas City, MO Part 2.

Ronald Reagan endorsement speech of Gerald Ford as Presidential Nominee at Republican National Convention Kansas City, MO.

1980: Ronald Reagan acceptance speech  at the Republican National Convention Detroit, MI.

Ted Kennedy speech at Democratic National Convention in New York. Kennedy was a rival for the Democratic presidential nomination.

Jimmy Carter acceptance speech at Democratic National Convention in New York, NY Part 1.

Jimmy Carter acceptance speech at Democratic National Convention in New York, NY Part 2.

1984: Geraldine Ferraro VP acceptance speech at Democratic National Convention San Francisco, CA.

Walter Mondale acceptance speech at Democratic National Convention San Francisco, CA Part 1.

Walter Mondale acceptance speech at Democratic National Convention San Francisco, CA Part 2.

Ronald Reagan acceptance speech at Republican National Convention Dallas, TX.

Mario Cuomo keynote speech at Democratic National Convention San Francisco, CA.

1988: Ann Richards keynote speech at Democratic National Convention Atlanta, GA.

Michael Dukakis acceptance speech at Democratic National Convention Atlanta, GA Part 1.

Michael Dukakis acceptance speech at Democratic National Convention Atlanta, GA Part 2.

Dan Quayle VP acceptance speech at Republican National Convention New Orleans, LA.

George H.W. Bush acceptance speech at Republican National Convention New Orleans, LA.

1992: Barbara Jordan speech at Democratic National Convention New York, NY.

Al Gore VP acceptance speech at Democratic National Convention New York, NY.

Bill Clinton acceptance speech at the Democratic National Convention New York, NY.

Pat Buchanan Keynote speech at Republican National Convention Houston, TX.

Ronald Reagan speech at Republican National Convention  Houston, TX Part 1.

Ronald Reagan speech at Republican National Convention  Houston, TX Part 2.

George H. W. Bush acceptance speech at the Republican National Convention Houston, TX.

1996: Jack Kemp VP acceptance speech at Republican National Convention San Diego, CA.

Bob Dole acceptance speech at Republican National Convention San Diego, CA.

Hillary Clinton speech at the Democratic National Convention Chicago, IL.

Bill Clinton acceptance speech at the Democratic National Convention Chicago, IL. (Currently not available on the TV News Archive.)

2000: Dick Cheney VP 2000 acceptance speech at Republican National Convention in Philadelphia, PA.

George W. Bush acceptance speech at Republican National Convention in Philadelphia, PA Part 1.

George W. Bush acceptance speech at Republican National Convention in Philadelphia, PA Part 2.

Al Gore acceptance speech at Democratic National Convention in Los Angeles, CA.

2004: Barack Obama keynote speech at Democratic National Convention Boston, MA. (Currently not available on the TV News Archive.)

John Edwards speech at Democratic National Convention Boston, MA.

John Kerry acceptance speech at Democratic National Convention Boston, MA.

John McCain speech at Republican National Convention New York, NY.

Laura Bush speech at  Republican National Convention New York, NY.

George W. Bush acceptance speech at Republican National Convention New York, NY.  (Currently not available on the TV News Archive.)

2008: Ted Kennedy speech at Democratic National Convention Denver, CO.

Michelle Obama speech at Democratic National Convention Denver, CO.

Bill Clinton speech at Democratic National Convention Denver, CO.

Joe Biden VP portion of acceptance speech at Democratic National Convention Denver, CO.

Barack Obama acceptance speech at Democratic National Convention Denver, CO.

Sarah Palin VP acceptance speech at Republican National Convention St. Paul, MN.

Cindy McCain speech at Republican National Convention St. Paul, MN.

John McCain acceptance speech at Republican National Convention St. Paul, MN.

2012: Barack Obama acceptance speech at Democratic National Convention Charlotte, NC CSPAN coverage.

Mitt Romney acceptance speech at Republican National Convention Tampa, FL CSPAN coverage.

Posted in Announcements, News | 2 Comments

New Rita Allen Foundation grant fuels political ad tracking through Election Day

As the Democrats and Republicans convene at their national party conventions in coming weeks, the general election kicks into full swing. Thanks to generous support from the Rita Allen Foundation, we are delighted to announce that the Political TV Ad Archive, a project of the Internet Archive, will be ramping up to track political ads airing in eight key battleground states in the lead up to Election Day.

The $110,000 grant will enable Political TV Ad Archive to continue the work begun during the primary months, when the project tracked more than 145,000 airings of ads in 23 markets in key primary states. The project uses audio fingerprinting algorithms to track occurrences of ads backed by candidates, political action committees, “dark money” nonprofit groups and more—all linked to information on where and when ads have aired, sponsors, subjects and messages.




The website provides a searchable database of all the political ads archived, and all ads are embeddable and shareable on social media. In addition, the underlying metadata on frequency of ad airings is available for downloading, and journalists from such outlets as The Washington Post, Fox News, and have used it to inform reporting, visualizations, and other creative uses to put these ads in context for readers. The Political TV Ad Archive also partners with respected journalism and fact checking organizations, such as the Center for Responsive Politics, PolitiFact, and

The Rita Allen Foundation supported the initial development of the Archive’s technology through a pilot project, the Philly Political Media Watch Project, which collected ads aired in the Philadelphia region in the lead-up to the 2014 midterm election. The Rita Allen Foundation also helped to sponsor the primary election phase of the Political TV Ad Archive, which received funding from the Knight News Challenge on Elections.


Posted in News | 4 Comments

Unlocking Books for the Blind and Visually Impaired

The Internet Archive has been making print materials more accessible to the blind and print disabled for years, but now with Canada’s joining the Marrakesh Treaty, our sister organization, the Internet Archive Canada, might be able to serve people in many more countries.

In 2010, we launched the Open Library Accessible Books collection, which now contains nearly 2 million books in accessible formats. Our sister organization, Internet Archive Canada, has also been working on accessibility projects, and has digitized more than 8500 texts in partnership with the Accessible Content E-Portal, which is on track to have over 10,000 items available in accessible formats by the end of the month.

On June 30th, Canada tipped the scales towards broader access to books for all by joining the Marrakesh Treaty. This move will allow the Treaty to go into effect on September 30, 2016 in the nations where it has been ratified, so that print-disabled and visually impaired people can more fully and actively participate in global society.

The goal of the Marrakesh Treaty is to help to end the “book famine” faced by people who are blind, visually impaired, or otherwise print disabled. Currently only 1% to 7% of the world’s published books ever become available in accessible formats. This is partly due to barriers to access created by copyright laws–something the Treaty helps to remove.

The Marrakesh Treaty removes barriers in two ways. First, it requires ratifying nations to have an exception in their domestic copyright laws for the blind, visually impaired, and their organizations to make books and other print resources available in accessible formats, such as Braille, large print, or audio versions, without needing permission from the copyright holder. Second, the Treaty allows for the exchange of accessible versions of books and other copyrighted works across borders, again without copyright holder permission. This will help to avoid the duplication of efforts across different countries, and will allow those with larger collections of accessible books to share them with visually impaired people in countries with fewer resources.

The first 20 countries to ratify or accede to the Marrakesh Treaty were: India, El Salvador, United Arab Emirates, Mali, Uruguay, Paraguay, Singapore, Argentina, Mexico, Mongolia, Republic of Korea, Australia, Brazil, Peru, Democratic People’s Republic of Korea, Israel, Chile, Ecuador, Guatemala and Canada. People in these countries will soon start realizing the tangible benefits of providing access to knowledge to those who have historically been left out.

To date, the material digitized by Internet Archive Canada has only been available to students and scholars within Ontario’s university system. The Marrakesh Treaty now makes it possible for these works to be shared more broadly within Canada, and with the other countries listed above. Hopefully the rest of the world will take note, and join forces to provide universal access to all knowledge.

Posted in Announcements, Books Archive, News | 2 Comments

Those Hilarious Times When Emulations Stop Working

Jason Scott, Software Curator and Your Emulation Buddy, writing in.

With tens of thousands of items in the stacks that are in some way running in-browser emulations, we’ve got a pretty strong library of computing history afoot, with many more joining in the future. On top of that, we have thousands of people playing these different programs, consoles, and arcade games from all over the world.

Therefore, if anything goes slightly amiss, we hear it from every angle: Twitter, item reviews, e-mails, and even the occasional phone call. People expect to come to a software item on the Internet Archive and have it play in their browser! It’s great this expectation is now considered a critical aspect of computer and game history. But it also means we have to go hunting down what the problem might be when stuff goes awry.

Sometimes, it’s something nice and simple, like “I can’t figure out the keys or the commands” or “How do I find the magic sock in the village?”, which puts us in the position of a sort of 1980s Software Company Help Line. Other times, it’s helping fix situations where some emulated software is configured wrong and certain functions don’t work. (The emulation might run too fast, or show the wrong colors, or not work past a certain point in the game.)

But then sometimes it’s something like this:


In this case, a set of programs were all working just fine a while ago, and then suddenly started sending out weird “Runtime” errors. Or this nostalgia-inducing error:


Here’s the interesting thing: The emulated historic machine would continue to run. In other words, we had a still-functioning, emulated broken machine, as if you’d brought home a damaged 486 PC in 1993 from the store and realized it was made of cheaper parts than you expected.

To make things even more strange, this was only happening to emulated DOS programs in the Google Chrome browser. And only Google Chrome version 51.x. And only in the 32-bit version of Google Chrome 51.x. (A huge thanks to the growing number of people who helped track this down.)

This is what people should have been seeing, which I think we can agree looks much better:


The short-term fix is to run Firefox instead of Chrome for the moment if you see a crash, but that’s not really a “fix” per se – Chrome has had the bug reported to them and they’re hard at work on it (and working on a bug can be a lot of work). And there’s no guarantee an update to Firefox (or the Edge Browser, or any of the other browsers working today) won’t cause other weird problems going down the line.
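For anyone wanting to flag likely-affected visitors and point them at the workaround, here is a rough sketch of a user-agent check for the narrow combination that showed the bug (Chrome 51.x, 32-bit). It is not code the Archive runs. The function names are invented for this example, and the 32-bit test is a Windows-only heuristic that assumes 64-bit Chrome builds advertise "Win64" in the user-agent string while 32-bit builds do not. User-agent sniffing is inherently fragile, so treat a match as a hint rather than proof.

```python
import re
from typing import Optional

# Matches the major version in a "Chrome/51.0.2704.103"-style token.
_CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def chrome_major_version(user_agent: str) -> Optional[int]:
    """Return Chrome's major version, or None if this does not look like Chrome."""
    # Other browsers (e.g. Edge, Opera) also include a "Chrome/" token,
    # so skip user agents that carry their own markers.
    if "Edge/" in user_agent or "OPR/" in user_agent:
        return None
    match = _CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def looks_like_affected_browser(user_agent: str) -> bool:
    """Heuristic: Chrome 51 on Windows without the 64-bit build marker."""
    if chrome_major_version(user_agent) != 51:
        return False
    # Assumption: 64-bit Windows builds of Chrome include "Win64" in the
    # user agent; 32-bit builds show "WOW64" or no architecture token.
    return "Windows" in user_agent and "Win64" not in user_agent


if __name__ == "__main__":
    sample = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36")
    if looks_like_affected_browser(sample):
        print("If the emulator crashes, try Firefox for the moment.")
```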

All this, then, can remind people how strange, how interlocking, and even fragile our web ecosystem is at the moment. The “Web” is a web of standards dancing with improvisations, hacks, best guesses and a radically moving target of what needs to be obeyed and discarded. With the automatic downloading of new versions of browsers from a small set of makers, we gain security, but more-obscure bugs might change the functioning of a website overnight. We make sure the newest standards are followed as quickly as possible, but we also wake up to finding out an old trusted standard was deemed no longer worthy of use.

Old standards or features (background music in web pages, the gopher protocol, Flash) give way to new plugins or processes, and the web must be expected, as best it can, to deal with the new and the old and fail gracefully when it can’t quite do it. Since part of the work of the Decentralized Web Summit was to bring forward the strengths of this world (collaboration, transparency, reproducibility) while pulling back from the weaknesses of this shifting landscape (centralization, gatekeeping, utter and total loss of history), it’s obvious that a lot of people recognize this is an ongoing situation, needing vigilance and hard work.

For our part, we’ll do our best to keep an eye on how the latest and greatest browsers deal with the still-fresh world of in-browser emulation, and try to emulate hardware that did come working from the factory.

In the meantime, enjoy some Apple II programs. On us.

Posted in Emulation, Software Archive, Technical | 2 Comments