The Wayback Machine’s Save Page Now is New and Improved

Every day hundreds of millions of web pages are archived to the Internet Archive’s Wayback Machine. Tens of millions of them are submitted by users like you using our Save Page Now service. You can now do that in a way that is easier, faster and better than ever before.

Save Page Now (SPN) just got a major upgrade as a result of a total code rewrite, adding a slew of new and awesome features, with more on the way.  

Let’s explore what’s new with Save Page Now    

You can now save all the “outlinks” of a web page with a single click. By selecting the “save outlinks” checkbox you can save the requested page (and all the embedded resources that make up that page) and also all linked pages (and all the embedded resources that make up those pages). Often, a request to archive a single web page, with outlinks, will cause us to archive hundreds of URLs, each of which is shown in the SPN interface as it is archived.
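To make the idea of “outlinks” concrete, here is a minimal sketch of how the links on a page could be collected. It is an illustration only, not how Save Page Now is implemented: the `OutlinkParser` class and `extract_outlinks` function are hypothetical names, and the sketch looks only at `<a href>` tags using Python’s standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class OutlinkParser(HTMLParser):
    """Collect absolute http(s) URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.outlinks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop #fragments.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                is_web_link = absolute.startswith(("http://", "https://"))
                if is_web_link and absolute not in self.outlinks:
                    self.outlinks.append(absolute)


def extract_outlinks(html, base_url):
    """Return the de-duplicated outlinks of a page, in document order."""
    parser = OutlinkParser(base_url)
    parser.feed(html)
    return parser.outlinks


page = (
    '<a href="/about">About</a> '
    '<a href="https://example.org/x#top">X</a> '
    '<a href="mailto:a@b.c">Mail</a>'
)
links = extract_outlinks(page, "https://example.com/index.html")
print(links)
```

Each URL in the resulting list would then be a candidate for archiving alongside the page itself, which is why one request can fan out into hundreds of captures.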

My Web Archive keeps a record of the pages you personally saved in the Wayback Machine using Save Page Now.

The new and improved SPN is based on the modern, server-side Brozzler software, which is capable of running web page JavaScript when saving a URL. With this new approach, we can replay the original more faithfully than was possible before. And, because this software is actively supported by several developers, bugs are quickly fixed and new features are added at a rapid pace.

When users are logged in with their free account, SPN-generated archives can be saved to that user’s “My Web Archive” public gallery of archived pages.

In addition to capturing more high-quality archives of web page elements (HTML, JavaScript, image files, etc.), SPN can now also produce a screenshot. If a screenshot of an archived page is available, we will display an icon on the corresponding playback page; selecting the icon shows the screenshot.

Have you ever wanted to archive all the web pages linked from an email message?  Well, you are in luck because now you can forward that email to “” and after a few minutes you will get an email back filled with Wayback Machine playback URLs. 

Some of you might like the new “First capture” badge you will see if any of the URLs you submit to be archived (including outlinked URLs and URLs included in emails) have not been archived yet. And, yes, for those of you who are feeling competitive, we are planning to launch a “leader board” soon. Let the games begin!

Maybe you want the URLs embedded in a web-based PDF file, RSS feed, or JSON file archived. The new SPN will parse those files and archive all the URLs they contain.  To use this feature, simply submit PDF/RSS or JSON URLs to SPN, and don’t forget to select the “capture outlinks” checkbox.
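As a rough illustration of what “parsing a file for URLs” can mean, the sketch below walks a JSON document and collects every http(s) URL found in its string values. It is an assumption-laden example rather than SPN’s actual parser: the function name `urls_in_json` and the simple regular expression are choices made for this sketch.

```python
import json
import re

# Deliberately simple pattern: stops at whitespace, quotes and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def urls_in_json(text):
    """Return the unique URLs found in any string value of a JSON document."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str):
            for url in URL_PATTERN.findall(node):
                if url not in found:
                    found.append(url)

    walk(json.loads(text))
    return found


sample = (
    '{"home": "https://example.com/", '
    '"items": [{"link": "see http://example.org/a and more"}, 3]}'
)
print(urls_in_json(sample))
```

An RSS feed or PDF would need its own reader, but the end result is the same: a flat list of URLs to submit for capture.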

This new version of SPN is also being used as the back-end support for a number of Wayback Machine services, including the iOS and Android apps as well as the Chrome, Firefox and Safari browser extensions. And, in case you wondered, those apps and extensions will also be getting major updates very soon.

And, yes, of course SPN has a brand new API that you can use to automate a range of Web archiving projects. Please write to us if you would like to learn more about the API.

We have often gotten requests to archive URLs from a Google Sheet. We now support that feature for authorised users. Please write to us for access to this advanced capability.

We LOVE hearing about ways we can make the Wayback Machine better. In fact, most of these new SPN features started with your user suggestions.

Please let us know what you think. Good, bad, or otherwise. Who knows, the next cool SPN feature might be invented by you!

And remember, “If you see something, save something!”


Unlocking the Potential for Every High School Library: 2019 Internet Archive Hero Award

The Internet Archive announced today that Phillips Academy has received its Hero Award for leadership in adopting controlled digital lending for school libraries. The Hero Award is presented annually to an organization that exhibits leadership in making its holdings available to digital learners all over the world. When Phillips Academy was renovating its Oliver Wendell Holmes Library, librarian Michael Barker wanted to update more than the physical space. This was also an opportunity to bring the private preparatory high school up to speed digitally and, in the process, share its vast book collection with others.

Barker, Director of Academy Research, Information and Library Services, has embraced Controlled Digital Lending (CDL), where a library digitizes a book it owns and lends out one secured digital version to one user at a time. In this case, the Andover, Massachusetts school owns 80,000 books.

Michael Barker

“With the closure of so many high school libraries, this allows us to share the collection we’ve built up over 100 years with all other high schools,” Barker said. “I can’t think of any better way the library could contribute its private resources for a public purpose.”

Phillips, which has roughly 1,100 students in grades 9-12, has been active in the Digital Public Library of America. It has already digitized about 4,000 of its titles published prior to 1923.

With all the books already boxed up for the renovation, the school’s decision to expand its CDL project was clear: “There would never be a better time than now,” Barker said. This summer it shipped most of the remaining volumes to be digitized by Internet Archive at its scanning facility in the Philippines.

Sharing the cost of scanning and shipping with Internet Archive was critical to making the digitization happen, said Barker. The books are expected back early in 2020 and will be placed back on library shelves over spring break.

Rather than most books being on display, the renovated Phillips library includes more open space for collaboration. It was last updated in 1987 and was not wired for a world that included the Internet. Renovations began in early 2018 and the newly updated facility opened to students this fall.

Barker said the library was originally designed like a “book fortress.” The center now has room for students to study together, while some books are on shelves around the periphery. Most books are now in the attic and basement, where they can be called up for lending.

“One local benefit of CDL is that students don’t necessarily need to call the book from the attic. With a digital version there is no delay in getting the book,” Barker said.

As Barker awaits the return of the book collection from the Philippines, he is tracking the shipment (which went on two separate ships and was insured). In the meantime, Phillips is preparing to share the news of its vast collection becoming open to students everywhere. Barker is excited to offer the school’s resources openly and said it’s particularly timely as school library budgets are being cut, making it hard for libraries to fulfill their mission.

“The truth of the matter is that some schools don’t have libraries anymore,” Barker said. “If other schools like us got involved in CDL in the same way and shared their copies, many public schools would not have to worry about their students having access to collections in the same way they might be doing now. I encourage others to explore it and jump in. It seems like it can only get stronger the more libraries that join.”

NOTE: Come meet Mike Barker and learn more about Phillips Academy when he speaks at Internet Archive’s World Night Market, Wednesday 10/23 from 5-10 PM.  Tickets available here.


How the Internet Archive is Digitizing LPs to Preserve Generations of Audio

Albums available in the Boston Public Library Vinyl LP Collection

Imagine if your favorite song or nostalgic recording from childhood was lost forever. This could be the fate of hundreds of thousands of audio files stored on vinyl, except that the Internet Archive is now expanding its digitization project to include LPs. 

Earlier this year, the Internet Archive began working with the Boston Public Library (BPL) to digitize more than 100,000 audio recordings from their sound collection. The recordings exist in a variety of historical formats, including wax cylinders, 78 rpms, and LPs. They span musical genres including classical, pop, rock, and jazz, and contain obscure recordings like this album of music for baton twirlers, and this record of radio’s all-time greatest bloopers.

Unfortunately, many of these audio files were never translated into digital formats and are therefore locked in their physical recording. In order to prevent them from disappearing forever when the vinyl is broken, warped, or lost, the Internet Archive is digitizing these at-risk recordings so that they will remain accessible for future listeners.

“The LP was our primary musical medium for over a generation. From Elvis, to the Beatles, to the Clash, the LP was witness to the birth of both Rock & Roll and Punk Rock. It was integral to our culture from the 1950s to the 1980s and is important for us to preserve for future generations.”

– CR Saikley, Director of Special Projects, Internet Archive

Since all of the information on an LP is printed, the digitization process must begin by cataloging data. High-resolution scans are taken of the cover art, the disc itself and any inserts or accompanying materials. The record label, year recorded, track list and other metadata are supplemented and cross-checked against various external databases. 

High resolution imaging of album cover art. The boxed area is shown at high resolution at right.

“We’re really trying to capture everything about this artifact, this piece of media. As an archivist, that’s what we want to represent, the fullness of this physical object.”

– Derek Fukumori, Internet Archive Engineer

Once cataloged, the LPs are then digitized. The Internet Archive partners with Innodata Knowledge Services, an organization focused on machine learning and digital data transformation, to complete the digitization process at their facilities in Cebu, Philippines. An Innodata worker digitizes 12 LPs at a time, setting turntables to play and record by hand, then turning each record over to the next side. Since each LP is digitized in real time, it takes a full 20 minutes to record an average LP side. By operating 12 turntables simultaneously, the team expects to be able to digitize ten LPs per hour.
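The ten-LPs-per-hour figure can be sanity-checked with a little arithmetic. The sketch below assumes a standard two-sided LP (the figures above give only the 20 minutes per side); the overhead number it derives is an inference from those figures, not something the Internet Archive reports.

```python
TURNTABLES = 12
MINUTES_PER_SIDE = 20
SIDES_PER_LP = 2  # assumption: a standard two-sided LP
EXPECTED_LPS_PER_HOUR = 10

# Pure recording time for one LP on one turntable.
recording_minutes_per_lp = MINUTES_PER_SIDE * SIDES_PER_LP

# Theoretical ceiling if every turntable recorded without pause.
upper_bound_lps_per_hour = TURNTABLES * 60 / recording_minutes_per_lp

# Turntable-minutes actually budgeted per LP at the expected rate.
turntable_minutes_per_lp = TURNTABLES * 60 / EXPECTED_LPS_PER_HOUR

# Whatever is left over is handling time: loading, flipping, checking.
implied_overhead_minutes = turntable_minutes_per_lp - recording_minutes_per_lp

print(f"ceiling: {upper_bound_lps_per_hour:.0f} LPs/hour")
print(f"budget per LP: {turntable_minutes_per_lp:.0f} turntable-minutes")
print(f"implied handling overhead: {implied_overhead_minutes:.0f} minutes per LP")
```

Under these assumptions the expected rate sits well below the theoretical ceiling, which is consistent with a process where every record is set up and turned over by hand.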

Audio stations complete with turntables & recording equipment set up in Cebu, Philippines.

Once recorded, there is a large FLAC file for each side of the LP, which needs to be segmented so listeners can easily begin at the desired song. There are two different algorithms used for segmenting; the first one looks at images of the vinyl disc to locate gaps in its grooves, which usually line up with gaps between songs. A second algorithm listens to the audio file to find the silent spaces between songs. When these two algorithms align, our engineers have a good measure of confidence that the machine has found the proper tracks.
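The two-detector idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Internet Archive’s code: `find_silences` and `agreed_boundaries` are hypothetical names, the amplitude threshold and tolerance are arbitrary, and the groove-gap positions (which in practice would come from image analysis) are simply passed in as a list of timestamps.

```python
def find_silences(samples, sample_rate, threshold=0.02, min_seconds=1.0):
    """Return (start_sec, end_sec) spans where |amplitude| stays below threshold."""
    silences = []
    run_start = None
    for i, value in enumerate(samples):
        if abs(value) < threshold:
            if run_start is None:
                run_start = i  # a quiet run begins here
        elif run_start is not None:
            if (i - run_start) / sample_rate >= min_seconds:
                silences.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle a quiet run that lasts until the end of the recording.
    if run_start is not None:
        if (len(samples) - run_start) / sample_rate >= min_seconds:
            silences.append((run_start / sample_rate, len(samples) / sample_rate))
    return silences


def agreed_boundaries(audio_gaps, groove_gaps_sec, tolerance=2.0):
    """Keep audio-gap midpoints that have a groove gap within `tolerance` seconds."""
    boundaries = []
    for start, end in audio_gaps:
        midpoint = (start + end) / 2
        if any(abs(midpoint - gap) <= tolerance for gap in groove_gaps_sec):
            boundaries.append(midpoint)
    return boundaries


# Toy signal at 10 samples/second: music, a 2 s gap, music, a 0.5 s dip, music.
signal = [0.5] * 30 + [0.0] * 20 + [0.5] * 40 + [0.0] * 5 + [0.5] * 25
gaps = find_silences(signal, sample_rate=10)
tracks = agreed_boundaries(gaps, groove_gaps_sec=[4.5, 9.2])
print(gaps, tracks)
```

Requiring agreement between the two signals trades recall for precision: a short dip in volume or a visual artifact on the disc alone does not produce a track boundary.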

These algorithms currently predict segmenting with about 80% accuracy, but some audio files are more difficult. For example, recordings of live music fill in the spaces between songs with applause, while classical music utilizes silence as part of a song. In order to account for these anomalies, digitized LP files are always checked manually before being added to the online database.

Identifying the empty spaces between songs for segmenting.

Currently, there are more than 900 LPs from the Boston Public Library LP collection available online. The Internet Archive continues to digitize the remainder of the BPL collection in addition to more than 285,000 LPs that have been donated by others. The organization aims to engage a greater community of LP and 78 rpm enthusiasts by welcoming contributions and improvements to the recorded metadata. Many of the audio files online can be listened to in full, but some of the albums are only available in 30 second snippets due to rights issues.

For decades, vinyl records were the dominant storage medium for every type of music and are ingrained in the memories and culture of several generations. Despite the challenges, the Internet Archive is determined to preserve these at-risk records so that they can be heard online by new audiences of scholars, researchers, and music lovers around the world.

ABOUT THE AUTHOR: Faye Lessler is a California-born, Brooklyn-based freelance writer and founder of lifestyle blog, Sustaining Life. She is an expert in mission-driven communications and enjoys writing while sipping black tea in a beam of sunshine.


Everyone Deserves to Learn

The nation’s K-12 school libraries are hurting. Although the student population is rising in many districts, the number of librarians and media specialists dropped by nearly 20 percent from 2000 to 2015. Budgets are being reduced and some schools are no longer able to afford their school librarians, or are simply closing their libraries altogether. The cuts are particularly deep in underserved communities.

Controlled Digital Lending (CDL), the digital equivalent of traditional library lending, holds the promise of broadening access to knowledge for public school libraries, according to Lisa Petrides, PhD, founder of the non-profit Institute for the Study of Knowledge Management in Education. ISKME’s focus is to provide research, tools, and training to help democratize access to education through the practice of continuous learning and collaboration. Research has shown that well-resourced libraries matter, she noted.

Lisa Petrides

“When students don’t have access to school libraries, it impacts learning outcomes. It’s a dire situation in many districts across the country,” said Petrides. “Libraries and the librarians that serve them are intricately connected to pedagogy and curriculum, and are necessary to reinforce the basic tenets of learning, including problem-solving, curiosity, and exposure to new ways of thinking. The school library has been and continues to be a critical link for teaching and learning in our K-12 schools.”

In partnership with Internet Archive, Petrides has amassed a team of librarian partners to create the Universal School Library, a collection of digitized books that can serve as a lending library for those without access to a physical school library. A small grant is funding teams of school librarians working to curate 15,000 book titles. It’s a labor-intensive process selecting fiction and non-fiction titles for the core collection, while ensuring diverse viewpoints and voices are represented and included. A beta version of the Universal School Library is now online, and using CDL, the project team will work with states, districts, and schools to fill in where there are gaps. When the Universal School Library is officially launched in 2020, it will span all genres and reading levels, as well as cultural, college, and career literacy.

“This has the potential to make a high-quality curated collection available for any student,” Petrides said. “It really is a democratization issue. CDL can be transformative for equal access to education in this country.”

NOTE: Come meet Lisa Petrides and learn more about the Universal School Library when she speaks at Internet Archive’s World Night Market, Wednesday 10/23 from 5-10 PM.  Tickets available here.


Calculating the True Value of A Library that is Free

By Omar Rafik El-Sabrout

A new program encourages you to “put your money behind something that matters to you:” sponsoring a book so everyone can read and borrow it online for free.

We live in the era of Venmo and CashApp, when after a nice meal with friends, you no longer have to argue over who will pick up the bill. On the surface, this is an extremely promising way to keep people from accidentally going into debt with each other. But it also reinforces interactions that are extremely transactional. The old idea of “I’ll get you back next time” is part of the give and take that members of a close community engage in. In our transactional present, people don’t have to rely on the idea of trust–trusting the butcher at the farmer’s market won’t price gouge me, trusting my friend will pay me back. People aren’t learning that you can vote by caring, by putting your money behind something that matters to you. At a moment when “you get what you pay for” is the capitalist norm, enter the Internet Archive, which today is asking you to make an investment in community-wide sharing.

The Internet Archive, which runs the project Open Library, is working to create a vast network of online book lending in order to make all books accessible to all people. Open Library cares about the input of its readers. As Open librarian and Internet Archive Software Engineer Mek Karpeles describes, “Open Library’s theory is that readers deserve a say in what’s on their bookshelves,” which is why he and his team have created a new Book Sponsorship feature.

A blue box on the book page lets you know that this is a book you can sponsor. With your donation, we will buy the book, digitize it, store it, and make the ebook available for borrowing–first by you.

Founded on the idea that a library ought to have books that “reflect [a] community’s needs and values,” Book Sponsorship allows any of the more than two and a half million users of Open Library to #saveabook. This is a natural follow-up to the long-standing “Want to Read” functionality, whereby a reader can indicate that a book they wish to read is missing from the Archive.

You can contribute just $11.32 to make sure this book from Marley Dias’ #1000BlackGirlBooks list is available for all.

With our new book sponsorship program, readers are given the option to put money towards directly sponsoring the acquisition of a particular book, after which the Internet Archive will digitize, store, and make the ebook available for lending, for free. Among other possibilities, this would allow people to combat the lack of representation of young black protagonists that Marley Dias, creator of the #1000BlackGirlBooks, found at her school and local library. We currently feature almost 400 of the #1000BlackGirlBooks, and with your support, we can buy and digitize all of them.


When people are given a say and feel represented, they become more invested. The care that comes from the investment of individuals is what eventually creates a community, and our hope is that the Open Library community will use this feature to help disenfranchised patrons gain access to materials that would enrich their education. When people are given the opportunity to be generous in an obligation-free way, we find that typically brings out their desire to do good. It’s relatively easy to put a price on a book, to calculate printing costs and publishing costs, but what’s harder to determine is the value of giving a gift. If you’re interested in sponsoring a book, either for yourself or for someone else, just click on a “Sponsor an eBook” button to learn more.

Go to to learn more about how to #saveabook


Jai Hind! Jai Gyan! India on the Internet Archive

In India, many speeches begin or end with the phrase “Jai Hind.” Jai means “long live” and “Hind” of course is the great Republic of India, the largest democracy in the world. Jawaharlal Nehru popularized this phrase, and it became the battle cry of the fight for liberation. “Gyan” means knowledge, and I have taken to ending my speeches in India with “Jai Hind! Jai Gyan!”

In this blog post, I’d like to tell you a little bit about my work in India, talk about some of the amazing resources having to do with India on the Internet Archive, and recognize some of my colleagues who have shown me so much kindness and given me so much inspiration.

A young Ganesha reading. From reading comes wisdom.

India was not a complete stranger to me. It was one of the stops in my Internet travelogue Exploring the Internet, and I finished one of my first books on a houseboat in Srinagar. India also played an important role in the Internet World’s Fair, and I was honored that His Holiness the Dalai Lama wrote the foreword to my book about the fair and allowed me to present him a copy in Dharamshala.

In recent years, however, I’ve been spending a great deal of my time in this amazing country. I wrote about my passage to India last year in the book Code Swaraj, which is of course available for free, no rights reserved, and has been translated into Hindi, Urdu, Bangla, Punjabi, Gujarati, Kannada, Marathi, Tamil, and Telugu. My fascination with and commitment to India comes from many sources, but I was particularly inspired by the movement for liberation and the lessons we can learn from Mahatma Gandhi for how one can confront authority and change the world.

My work in India has been made possible by a generous grant from Arcadia, to whom I am immensely grateful for allowing me to pursue this work. Arcadia has been instrumental in promoting open access throughout the world, including support for the Internet Archive and many other groups.

Logo of Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin

The Internet Archive now hosts one of the very largest collections of materials by and about India. Let me tell you about a few of these:

  • The Public Library of India is a collection of books mirrored from the Internet in over 100 languages. Many of those books we archived are no longer available in their original locations, so we are thrilled that the Internet Archive became a home for these valuable materials. Some of the scans are old, some of the metadata and quality control is not so good, but the materials are unique and our 425,121 texts have received over 62 million views.
  • Because the metadata on the Public Library of India had been entered in Roman characters, the collection was much less useful for works in non-Roman scripts. A team of Wikipedians led by my friend Arjuna Rao Chavala painstakingly reentered the titles and creators for 17,655 books in Telugu in the original script, making the books findable for those who read the language. That effort is now being replicated for other languages, such as Kannada and Tamil.
  • One of my personal passions has been the Hind Swaraj collection, which is devoted to the fight for Indian independence. The collection features 595 works about Gandhi Ji including all 100 volumes of his Complete Works, as well as the complete works of Nehru, Ambedkar, and substantial collections of texts and audio by figures such as Rabindranath Tagore, Sarvepalli Radhakrishnan, Subhas Chandra Bose, and Sardar Patel.
  • Public Resource has been honored to work closely with the Indian Academy of Sciences, with whom we have a formal Memorandum of Cooperation. As part of that effort, we have digitized all the Indian Academy’s books, and maintain an extensive collection of science resources of India. It has been a great pleasure to work with and become friends with my colleagues at the Academy, including the distinguished Professor Amitabh Joshi, who has spearheaded this effort, and Professor Partha Majumder, the President. We have in place similar Memoranda of Cooperation and have hosted collections for the JC Bose Trust and the National Center for Biological Sciences.

One of the most gratifying things about working in India is the public spirit, technical skills, and enthusiasm of volunteers all over the country. We have banded together and call ourselves the Servants of Knowledge, a hat-tip to Gokhale’s Servants of India society. The Indian Academy kindly allowed us to place a Table Top Scribe in their Bengaluru headquarters, a unit which was donated by the Kahle Austin foundation. It has been a true delight to work with and learn about the Internet Archive digitization framework, which is extensive and incredibly powerful.

The Table Top Scribe located at the Indian Academy of Sciences

The Servants of Knowledge collection and the scanning effort in Bengaluru are managed by my friend Omshivaprakash, a long-time Wikipedian and a passionate advocate for the Kannada language and heritage. Likewise, Shiju Alex has been toiling for years to digitize key works in Malayalam. In Mangaluru, Prashanth Shenoy has led the effort for Konkani texts, and in Chennai the indomitable T. Shrinivasan has long worked to make more Tamil resources available on the Internet.

In Mangaluru at the Unicourt headquarters before heading off to give a talk. On Carl’s left is Prashanth Shenoy.
Carl with Shrinivasan T. in Chennai with the Linux User’s Group.

Since 2011, Public Resource has worked with Dr. Sushant Sinha, the founder of Indian Kanoon, the amazing free site that provides access to all case law and other legal materials in India. Indian Kanoon was recently honored with the prestigious Agami Prize for service to the citizens of India. Sushant and I have been working on a project for a year that we believe is transformational, pulling in the Official Gazettes of India from the central government and 19 states, an archive that is updated daily and has over 455 documents. (The Official Gazettes are the newspapers of government, akin to the Federal Register in the United States.)

Dr. Sushant Sinha being presented the prestigious Agami Prize by Chief Justice Gita Mittal of the Jammu-Kashmir High Court

Particularly impressive has been Sushant’s effort to extend the Internet Archive by doing OCR in Indian languages. He has written code that pulls a document off the Internet Archive and bounces it off Google Vision for the OCR, then recreates the files that the Archive would expect to see if it had done the OCR in the ABBYY software it uses. The code is now working, and he’s been applying it to mixed-language Gazettes in Hindi and English and to the Karnataka Gazette in the Kannada language. The code is totally open source, we are beginning to apply it to books, and we are hoping to supplement Google Vision with Tesseract and other modules.

Public Resource has two other major efforts in India, both of which we believe have the potential to be transformational not only in India but in the rest of the world.

  • In Delhi, we have a formal memorandum of research cooperation with Dr. Andrew Lynn of Jawaharlal Nehru University where we have created the JNU Data Depot, an effort to advance text and data mining on the scientific corpus by researchers. The system is carefully modeled after the Hathi Trust effort in the U.S. and makes carefully secured access to the corpus available to non-commercial university researchers who are able to perform non-consumptive text and data mining. This project was recently featured in Nature, the international journal of science. In addition to the JNU facility, Public Resource has installed a mirror at IIT Delhi, under the direction of Dr. Sanjiva Prasad. We have a distinguished board of advisors from universities throughout India and have received legal advice and counsel from some of the most distinguished intellectual property experts in the country, including Professor Arul George Scaria, Professor N. S. Gopalakrishnan, Professor Feroz Ali, and Professor Lawrence Liang. I have also been grateful for the personal insights and friendship provided to me by Dr. Zakir Thomas, a senior civil servant and the former Registrar of Copyrights for India.
  • One of our initial programs in India has been to make available all Indian Standards, the public safety codes of India. We have made 18,471 such standards available in our Public Safety Codes Collection and the documents have been invaluable for millions of Indian students, government officials, and others who need to consult these valuable government-issued rules and regulations. We have filed a public interest litigation writ petition before the Hon’ble High Court of Delhi after the government objected to our efforts. My co-petitioners are Dr. Sinha and my friend Srinivas Kodali. We are represented before the Hon’ble High Court by senior advocates Jawahar Raja and Salman Khurshid and the law firm of Nishith Desai and Associates.

You can read more about my efforts in India on the Public Resource Docket where I keep a listing of speeches, press, and other public information.

I close on a sad note. India lost a remarkable person this year and I lost a dear friend. Shamnad Basheer passed away at the young age of 43 after a very long illness. In his short time on earth he touched so many lives, mine included.

A recent photograph of Shamnad Basheer.

Shamnad had the finest legal mind of anybody I have ever met, and I have had the privilege of working with many of the best. Shamnad, though, was on a wholly different level. He was considered to be the leading intellectual property expert in India, but he also pursued justice in many other areas of the law. He filed a petition that challenged discrimination in law school admissions, he intervened in the landmark Novartis case, and he played an instrumental role in the Delhi University Photocopy case.

But he did so much more. His greatest accomplishment is IDIA, which he created and to which he devoted his greatest efforts. IDIA’s mission is increasing diversity by increasing access to legal education. They find young students with great potential but living in impoverished circumstances, get them ready to take the exam to get into law school, then stick with them to get them through the program. These young lawyers then go back to their communities to provide justice. It is an immensely inspirational program and you can do nothing better to honor Shamnad’s memory than donate to IDIA.

I read a lot about Gandhi, I speak about him frequently and learned much from his work. Gandhi is an important part of my life, but with Shamnad it was different. Shamnad, more than anybody I ever met, lived his life like Gandhi. He was a public worker, devoted to justice and equality. He knew a vast number of people, and every person he met, he touched their lives deeply.

He had been dreadfully sick for so many years, but he did not let that stop him. I admonished him once when he invited me to speak at the IDIA annual event and he had just flown in the night before from Iran, where he was advising people on intellectual property. You should take it easy, I told him. He replied that he wasn’t going to let his illness rule his life, and he never stopped his public work for one minute.

The night before he passed away, we were exchanging messages on WhatsApp. He had been especially sick of late and had gone on a pilgrimage to Baba Budangiri. His last words to me, which I wish to share, were:

I am in a very special place right now. Even as my body is battered my spirit is strengthening. In Baba Budangiri a site of amazing syncretic spirituality. Where I have found much peace and meaning. And a place that has helped me transcend the body. And in this special place, I am offering prayers for you. And sending you lots of good energy. To continue the good fight. Lots of love. Shams

Wherever Shamnad is now, his star will shine bright for the ages. He will forever be missed and forever remembered. May the gods bless you Shamnad, thank you for all you did for me and for so many others.


The Wayback Machine: Fighting Digital Extinction in New Ways

Extinction isn’t just a biological issue. In the 21st century, it’s a technical, even digital one, too.

The average web page might last three months before it’s altered or deleted forever. You never know when access to the information on these web pages is going to be needed. It might be three months from now; it might be three decades. That’s how the Wayback Machine serves—making history by saving history. Now, the Wayback Machine is fighting digital extinction in brand new ways.

As the Internet Archive prepares for its anniversary celebration on Oct. 23, our Wayback Team is unveiling some new features to make what some call “the memory of the web” even more detailed and responsive.

Try out some of our new Wayback Machine Features:

  • Changes: a new service enabling users to select two different versions of a given URL and compare them side by side. Differences in the text of the content are highlighted in yellow and blue.
Our new Wayback Changes tool highlights how a web page differs through time, comparing two versions of the same page side-by-side.
This is the high level view of “Changes.”

Just click the “Changes” link at the top of the “Calendar View” page to find an index of archives of the selected URL, with a high-level indicator of the degree of change between the available archives. When no content has changed, the page appears in the same color. You can then select any two archived versions of the page so they can be rendered side-by-side with the changes between them highlighted in blue and yellow. Best of all, you can then share this “Changes” URL with others (e.g. via Twitter or embedded in a news story) so they can easily see the changes as well.
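To make the idea of comparing two versions concrete, here is a minimal Python sketch that lists the lines removed from and added to a page's text between two captures. It is only an illustration and makes no claim about how the Changes tool itself works: it uses the standard library's `difflib`, and the function name `changed_lines` and the sample text are made up for this example.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return removed ('- ') and added ('+ ') lines between two text versions.

    Hypothetical helper for illustration; not the Wayback Machine's own code.
    """
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); skip those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Invented sample text standing in for two archived versions of one page.
old_version = "Welcome\nOffice hours: 9 to 5\nContact us"
new_version = "Welcome\nThis office is closed\nContact us"

for line in changed_lines(old_version, new_version):
    print(line)
```

A side-by-side view like the one described above would show the "- " lines in one pane and the "+ " lines in the other.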

  •  Save Page Now: an updated version of perhaps the most popular feature of the Wayback Machine. Of particular import is the new ability to archive all the embedded links and outlinks (connections to external web sites) with just one click.  
The new and improved Save Page Now function allows you to save web pages in various ways and share them in your own archive of favorites.

Also new is the ability for users to save web archives in a public directory of favorite items. It’s essentially a personal but public bookmarking system of pages that others can follow. Imagine how important this might be for future researchers, family members or fans interested in the web pages you chose to personally save for all time.

  • Collections:  A new way to learn about why a given URL has been archived into the Wayback Machine. Start by clicking the “Collections” link at the top of any “Calendar View” page. You will then be shown a list of all the collections that this URL is included in, plus you can select individual playback URLs from any of those Collections. Click on the Collection name to learn more about its provenance. And if it was created as part of the Internet Archive’s Archive-It service, you can execute full-text searches on archived web pages that are part of that collection.
  • Show All Captures: The Wayback Machine archives some URLs many times a day. In some cases hundreds, or even thousands of times a day. While all of those captures have been available for playback, the calendar view would only show a sampling of those captures. The new Show All Captures feature now presents a list of each and every capture available per day, even for captures that are made seconds apart.
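As a small sketch of what listing every capture per day involves, the Python below groups capture timestamps by calendar day while keeping captures that are only seconds apart as separate entries. It assumes timestamps in the 14-digit `YYYYMMDDhhmmss` form; the function name `captures_by_day` and the sample timestamps are invented for illustration and are not real captures.

```python
from collections import defaultdict


def captures_by_day(timestamps: list[str]) -> dict[str, list[str]]:
    """Group 14-digit timestamps (YYYYMMDDhhmmss, assumed) by calendar day."""
    by_day: dict[str, list[str]] = defaultdict(list)
    for ts in sorted(timestamps):
        day = f"{ts[0:4]}-{ts[4:6]}-{ts[6:8]}"
        time_of_day = f"{ts[8:10]}:{ts[10:12]}:{ts[12:14]}"
        by_day[day].append(time_of_day)
    return dict(by_day)


# Hypothetical timestamps, made up for this example only.
sample = ["20191022120003", "20191022120001", "20191023080000"]

for day, times in captures_by_day(sample).items():
    print(day, times)
```

Sorting before grouping keeps each day's captures in chronological order, so two captures made two seconds apart show up as two adjacent entries and are not collapsed into one.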

Who will be using these new features? Earlier this month, Mark Graham, the Wayback Machine’s director, got a request from a TV journalist for help—not just for something Trumpian or Brexit-ish. Instead, the just-married journalist saw that her wedding day web page was about to expire and wanted to be sure it would be preserved. Using the new and improved Save Page Now, she was able to preserve the page (including all outlinks) with one click.

The Environmental Data & Governance Initiative (EDGI) partners with the Internet Archive’s Wayback Machine in order to produce reports that monitor the government.

The Environmental Data & Governance Initiative (EDGI) partners with the Wayback Machine in its work monitoring government websites, with particular emphasis on environmental issues. Our new Changes feature will help them track and publicize how government agencies are deleting and altering information about climate change and environmental protection issues, by comparing and publishing web pages side-by-side.

Graham underscores that the Wayback Machine, which has many scholarly, historical and journalistic uses, “is relevant to how you live in the United States today. Wayback Machine captures are even admissible in many courts.”

 “It can be used for holding people and governments accountable,” he said. “At the same time, it can be used for other things, like a bride’s request to preserve a wedding page.”

Fighting digital extinction, the Wayback Machine way.

NOTE: On October 23rd, come by the Wayback Machine Demo station at our World Night Market event to meet the team who built these new features. You can purchase your tickets here.

Posted in Announcements, News | 37 Comments

Offline Archive Brings Knowledge Anywhere

Three women look at a phone

The Internet Archive’s central mission is establishing “Universal Access to All Knowledge,” and we want to make sure that our library of millions of books, journals, audio files, and video recordings is available to anyone. Since lack of an internet connection is a major obstacle to that goal, we created the Offline Archive project, which works to make online collections available regardless of internet availability.

For many of our readers, the internet seems omnipresent—like electricity and running water, it’s available everywhere from our homes and offices to trains and planes. But for more than half of the world’s population, that access is far from guaranteed. In many developing countries and rural areas, the infrastructure that enables internet access is unreliable, slow, or nonexistent, while natural disasters and conflicts may exacerbate the problem. Additionally, internet access can be too expensive for many people, and some governments limit internet access or censor the content for political reasons. All of these factors can combine to make internet access inconsistent, low-quality, or altogether unavailable for billions of people, which in turn leads to poor educational outcomes and intergenerational poverty. Compounding the challenge, the internet in wealthier countries is growing rapidly, and high-bandwidth videos and graphics are making it harder than ever for people on low-quality networks to participate in the modern web.

As part of a solution to this problem, we have built an offline server that transfers Internet Archive collections to a local server, caches content while browsing, and delivers the Internet Archive UI offline in the browser. The system moves content between servers by “sneakernet”—on disks, USB sticks, and SD cards. This approach should improve access for anything from a Raspberry Pi to an institutional server holding terabytes of data. Right now, we’re working to make it available in a variety of different languages, so that anybody can utilize it—not just English speakers.
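As a rough illustration of the "caches content while browsing" idea, here is a minimal Python sketch of a read-through cache: the first request for an item stores it on local disk, and later requests are served from disk even when no connection is available. This is not the Offline Archive's actual code; the `LocalCache` class, the `fetch_remote` callback, and the item identifiers are hypothetical names chosen for this example.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Optional


class LocalCache:
    """Serve items from local disk, fetching and storing them on first use."""

    def __init__(self, root: Path, fetch_remote: Callable[[str], Optional[bytes]]):
        self.root = root
        self.fetch_remote = fetch_remote  # returns None when offline or not found
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, item_id: str) -> Path:
        # Hash the identifier so any string maps to a safe file name.
        return self.root / hashlib.sha256(item_id.encode("utf-8")).hexdigest()

    def get(self, item_id: str) -> Optional[bytes]:
        path = self._path(item_id)
        if path.exists():
            return path.read_bytes()  # cache hit: no network needed
        data = self.fetch_remote(item_id)
        if data is not None:
            path.write_bytes(data)  # remember it for the next request
        return data


with tempfile.TemporaryDirectory() as tmp:
    # While "online", the stand-in fetcher returns content and it gets cached.
    online = LocalCache(Path(tmp), lambda item_id: b"hello")
    first = online.get("example-item")
    # Later, "offline", the fetcher returns nothing, but the cache still answers.
    offline = LocalCache(Path(tmp), lambda item_id: None)
    second = offline.get("example-item")
    missing = offline.get("never-seen")
```

Because the cache is just a directory of files, the same directory could in principle be copied to a USB stick or SD card and carried to another machine, which is the sneakernet pattern described above.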

An Orange Pi, a Raspberry Pi, and an Australian 20-cent coin for scale. These small devices can serve the media of the Internet Archive in remote off-line locations.

Best of all, the Offline Archive project is open source, so that people around the world can collaborate to make it better. We are currently integrating the Archive’s APIs with those of our partners, to make it easier for them to incorporate Internet Archive content. Together with our collaborators, we can bring the Internet Archive anywhere—ensuring that people everywhere can enjoy our digital library.

If you would like to lend a hand, there are lots of ways to collaborate:

  • Software developers can help us add features, platforms, and internationalization
  • Platform developers can talk to us about integrating the Internet Archive’s content or server
  • Content owners and aggregators can help make more content available, especially educational content and material in other languages.
  • Community networks and internet access practitioners can help by becoming early adopters

See for more information, or contact to collaborate or contribute to this project.

If you would like to see the Offline Archive in action and meet its builder, Mitra Ardron, then come to the Internet Archive World Night Market on October 23rd and look for the Offline Archive demo table!

Posted in Announcements, News | 1 Comment

Adding New Features to the Internet Archive Music Experience

IA Music Player

The recently reconstructed music player has more, much more, to offer in making music accessible.

This is a time of transition, musically speaking, at the Internet Archive.

Our online digital library is best known for its immense archive of web pages and websites in the Wayback Machine. Less well known are the million-plus recordings the site has stored digitally and made available to the general public, mostly from 78s, albums and CDs.

Highlighting the growing importance of music on is the debut this month of our new music player. While you can listen to only a sample of most modern songs, the new player now embeds Spotify and YouTube versions of the full song, so listeners are now able to click right from to those services and listen to the full track. Examples: and

Liner Notes, Santana
Using the Internet Archive’s new music player, album covers and full liner notes are available with just a click.

We’ve digitized at high resolution the album liner notes, including full CD booklets and the paper labels on the discs themselves. And at the bottom of each page are lists of related music tracks – covers, other versions of the same song done by the same artist and compilations where that song has been used.

Related music
Want to find music related to the music you already know? IA’s music player is good at making those matches.

“It’s exploratory; it’s not exact,” said Internet Archive’s Brenton Cheng, who is at the head of the product team engineering the new music player. “The system uses each song’s acoustic ‘thumbprint’ to match it with songs in other services. The goal here is to start engaging with the music.”

“With our related music tracks listed down below, you are going to be exploring and discovering items, covers and versions that you didn’t know existed before. I think now we’re doing a better job of presenting the content that we have, and then helping people discover more.”

As streaming services gain popularity, the rich fountain of information found on album covers and CD liner notes is in danger of being lost. The Internet Archive seeks to fill that void by preserving the entire package that makes for a deeper musical experience. Now exploring those covers is right there in the music player itself.

“I think our presentation experience has until now not been as much of a focus as our gathering of materials from different sources,” Cheng said. “So now we are really trying to take time and check with our users, finding out who’s using the site and what they need. And we’re trying to present better experiences for exploring, consuming and searching for content.”

Posted in Announcements, News | 6 Comments

2,500 More MS-DOS Games Playable at the Archive

Another few thousand DOS Games are playable at the Internet Archive! Since our initial announcement in 2015, we’ve added occasional new games here and there to the collection, but this will be our biggest update yet, ranging from tiny recent independent productions to long-forgotten big-name releases from decades ago.

To browse the latest collection, hit this link and look around.

The usual caveats apply: Sometimes the emulations are slower than they should be, especially on older machines. Not all games are enjoyable to play. And of course, we are linking manuals where we can but not every game has a manual.

If you’ve been enjoying our “emulation in the browser” system over the years, then this is more of that. If you’re new to it or want to hear more about all this, keep reading.

A Recognition of Hard Work, and A Breathtaking View

The update of these MS-DOS games comes from a project called eXoDOS, which has expanded over the years from collecting DOS games for easy playability on modern systems to tracking down and capturing, as best as can be done, the full context of DOS games – from the earliest simple games in the first couple years of the IBM PC to recently created independent productions that still work in the MS-DOS environment.

What makes the collection more than just a pile of old, now-playable games is how it has to take head-on the problems of software preservation and history. Having an old executable and a scanned copy of the manual represents only the first few steps. DOS has remained consistent in some ways over the last (nearly) 40 years, but a lot has changed under the hood and programs were sometimes only written to work on very specific hardware and a very specific setup. They were released, sold some number of copies, and then disappeared off the shelves, if not everyone’s memories.

It is all these extra steps, under the hood, of acquisition and configuration, that represent the hardest work by the eXoDOS project, and I recognize that long-time and Herculean effort. As a result, the eXoDOS project has over 7,000 titles they’ve made work dependably and consistently.

Separately from the eXoDOS project, I’ve been putting a percentage of these games into the Emularity system on the Internet Archive for research, entertainment and quick online access to the programs. The issues that are introduced by this are mine and mine alone, and eXoDOS is not able to help with them. You can always mail me at with questions or technical concerns.

This should be all that needs to be said, but since the Archive is doing things a little strangely, there’s a lot to keep in mind before you really dive in (or to realize, when you come back with questions).

That Hilarious Problem With CD-ROMs

Putting these games into the Internet Archive has, over time, brought into sharp focus particular issues with browser-based emulation. For example, keyboard collision, where the input needs of the emulator are taken over by the browser itself, and the problems of a program needing a lot more horsepower to run in a browser emulator than a user’s system can handle.

Some of these have solutions that aren’t always great (Buy faster hardware!) and in some cases the problem is currently terminal (these programs have been taken offline for a future date). But the most obvious and pressing is that games based off CD-ROMs take a significant, huge amount of time to load.

CD-ROMs were a boon to the early-to-late 1990s, allowing games to have audio and video like never before. Depending on the tricks used, you got full-motion video (FMV), the playing of CD audio tracks for background music, and levels and variation of content for the games far beyond what floppy disks could ever hope.

But it was also a very large amount of data (up to 700 megabytes per CD) and it’s one thing to have the data sitting on a plastic disc in a local machine, and yet another to have a network connection pull the entire contents of the CD-ROM into memory and hold it there as a virtual file resource. This is going to be an enormous lean on the vast majority of Internet users out there – downloading multi-hundred-megabyte files into memory and then keeping them there, and then losing it all when the browser window closes. Network speeds will improve over time, but this is probably the biggest show-stopper of them all for many folks.
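To put a number on that wait, here is a back-of-the-envelope Python sketch. The 700 megabyte figure is the CD size mentioned above; the connection speeds are assumptions picked purely for illustration, and the function name `download_seconds` is made up for this example. It ignores protocol overhead and treats a megabyte as eight megabits.

```python
def download_seconds(size_megabytes: float, megabits_per_second: float) -> float:
    """Idealized transfer time: megabytes * 8 bits per byte / link speed."""
    return size_megabytes * 8 / megabits_per_second


# Assumed link speeds in megabits per second, chosen only as examples.
for speed in (10, 100):
    seconds = download_seconds(700, speed)
    print(f"700 MB at {speed} Mbit/s: about {seconds / 60:.1f} minutes")
```

Under these assumptions, a full CD image takes 560 seconds (a little over nine minutes) at 10 Mbit/s and 56 seconds at 100 Mbit/s, and that is before the emulator has started the game.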

If you find yourself loading up one of these games and facing down a hundred-megabyte download, consider one of the smaller games instead, unless it’s a title you really, really want to try out. Maybe in a few years we’ll look back at cable-modem speeds and laugh at the crawling, but for now, they’re pretty significant.

Some Jewels in the Mix

Luckily, there are some smaller-sized games in this new update that will load relatively quickly and are really enjoyable to look at and to play. Here are some of my recommendations:

First, a game special to me: the IBM DOS version of Adventure, calling itself “Microsoft Adventure”. It’s actually a small rebranding of the original start of the text adventure world, “Colossal Cave” or ADVENT, by Don Woods and Will Crowther. Remixed to be sold by IBM and Microsoft, this is how I first got into these, and it boots up instantly, providing hours of fun if you’ve never tried it before.

Mr. Blobby, a 1994 DOS Platform game, has all the hallmarks of the genre – bonkers physics, bright and lovely graphics, and joyful music. Be sure to redefine the keys before you try to play it, because besides running and jumping, you can spin and take things. The game does not get less weird as you go along.

Super Munchers: The Challenge Continues is a 1991 remix of the original educational game that sent your “muncher” gathering up words representing a given topic or idea. The speed of the game, along with the learning aspect, make this one of the more zesty “edutainment” titles available from the time.

Street Rod is a wonderfully compact 1989 racing game where it’s the 1960s and you’re going to buy your first hot-rod, tune it up, and race it for money to buy better and better rides. It’s a mouse-driven interface and loaded with all sorts of tricks to make the game fit into a “mere” 600 kilobytes compressed. Initially simple and then well worth the effort!

Digger from 1983 is a Dig-Dug-Clone-but-Not that came out right as IBM PCs were starting to take off, and it’s a lovely little game, steering around a mining machine while avoiding enemies and picking up diamonds. The most unintuitive thing is you need to fire using the “F1” key, so hopefully your keyboard has one.

I’m also going to suggest Floppy Frenzy from Windmill Software because it’s so much closer to the beginning of the IBM PC’s reign and you can see the difference in what the authors were comfortable with – the graphics are simpler, the game movement a little more rough, and the theme is geekiness incarnate: You’re a floppy disk avoiding magnets to leave traps for them, so you can gather the magnets up before the time runs out. If you don’t make it, an angel comes down and brings you to Floppy Disk Heaven. Again, F1 is the unusual key to leave traps.

There’s many more and I suggest people browse around and try things out, really soak in that MS-DOS joy. (And feel free to leave comments with suggestions.)

Thanks so much for coming along on this emulation journey!

  • Jason Scott, Internet Archive Software Curator

Posted in Announcements, News | 6 Comments