
Welcome to the Webspace Jam

It stood as either a memorial, embarrassment or in-joke: the promotional website for the 1996 film Space Jam, a comedy-action-sports film starring Michael Jordan and the Warner Brothers Looney Tunes characters.

Created at a time when the exact relevance of websites in the spectrum of mass media promotion was still being worked out, www.spacejam.com held many of the fashionable attributes of a site in 1996: an image map that you could click on, a repeating star background, and a screen resolution that years of advancement have long left in the dust. The limits of HTML coding and computer power were pushed as far as they could go. The intended audience was a group of people primarily using dial-up modems and single-threaded browsers to connect to what was still called The Information Superhighway.

By all rights, the Space Jam site should have died back in the 1990s, lost in the shifting sands of pop culture attention and flashier sites arriving with each passing day.

But it didn’t die, go offline or get replaced with a domain hosting advertisement or a 404.

Unlike a lot of websites from the 1990s, the Space Jam movie site simply didn’t change.

It persisted.

Just as every city seems to have that one bar or restaurant that can trace itself back for over a century, this one website became known, to people who looked for it, as a strange exception – unchanging, unshifting, with someone paying for the hosting and advertising a movie that, while a lot of fun, was not necessarily an Oscar-winning cinematic experience. You could go to the site and be instantly transported back to a World Wide Web that in many ways felt like ancient history, absolutely gone.

Years turned into decades.

For those in the know and who paid close attention to this odd online relic, the real mystery was that the site was not actually static: someone was making modifications to the code of the website, the settings and web hosting, to jump past several notable shifts in how websites work, to ensure that deprecated features and unaccounted-for browser issues were handled. That costs money; that’s the work of people. Somehow, this silly movie site represented the held-out flame that with a small bit of care and dedication, a website could live forever, like we were once promised.

It wasn’t just a clickable brochure – it became a beacon in the dark, a touchstone for some who were just children when the World Wide Web was started, and who grew up with this online world, which has shifted and consolidated and closed and tracked us.

Then the unthinkable happened.

In 2021, the sequel arrived.

It is abundantly clear the abnormally long life of the original 1996 site helped see the sequel through the endless mazes and corridors of Hollywood development turnaround.

Because websites and online presence are the way that movies are now promoted, the very place that spawned this consistent brand through decades had to go. A new Space Jam site was created, using the www.spacejam.com domain.

In a nod to its beginnings, the 1996 website still exists, shoved into a back room; adding /1996 to the URL will give you the old site as it used to appear before this year, and a small note in the corner lets you know you could optionally visit this once-dependable hangout.

But now the site is broken.

Links from around the net to the Space Jam site, to specific sub-pages and specific images, now break. A browser arriving at the spacejam.com page from a link elsewhere will see Just Another Movie Promotion Site, utilizing all the current fads: layered windows to YouTube videos (which will break), JavaScript calls (which will break) and a dedication to being as flashy, generically designed and film-promoting as literally any other movie site currently up. Links that worked for decades have been cast aside for the spotlight of the moment.

The word is disposable.

There’s still one place you can see the old site, as it was once arranged, though.

The same year the Space Jam movie and website arrived, another website started: The Internet Archive.

Unlike Space Jam, the Internet Archive’s site did change constantly. You can use the Wayback Machine to see all the changes as they came and went; over half-a-million captures have been done on archive.org.
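For readers who would rather ask the Wayback Machine programmatically, here is a minimal Python sketch built around the public availability endpoint at archive.org/wayback/available, which returns JSON describing the capture closest to a requested date. The function names and the sample response are our own illustrations, and the sample is a hypothetical payload shaped like the endpoint's JSON, not real capture data.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=""):
    """Build a query asking for the capture of `url` closest to `timestamp`.

    `timestamp` is a prefix of YYYYMMDDhhmmss; leave it empty for the most
    recent capture.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return the URL of the closest capture in a JSON response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hypothetical response for illustration only (not real capture data).
SAMPLE_RESPONSE = """
{"archived_snapshots": {"closest": {
    "available": true,
    "status": "200",
    "timestamp": "20060101000000",
    "url": "http://web.archive.org/web/20060101000000/http://example.com/"
}}}
"""

if __name__ == "__main__":
    print(availability_query("archive.org", "19970126"))
    print(closest_snapshot(SAMPLE_RESPONSE))
```

To run it against the live service, fetch the query URL with urllib.request.urlopen and pass the decoded body to closest_snapshot. The sketch keeps the network call out so that it runs anywhere.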

We have changed across the last 25 years, but we also have not.

The ideas that the Web should keep URLs running, that the interdependent linking and reference cooked into it from day one should be a last-resort change, and that the experience of online should be one of flow and not of constant interruptions, still live here.

Hundreds of webpages that have also survived since the time of Space Jam are inside the stacks of the Wayback Machine, some of them still running, and still looking unchanged since those heady days of promises and online wishes.

And if the unthinkable happens to them, we’ll be ready.




Filecoin Foundation Grants 50,000 FIL to the Internet Archive

Amidst the speculative boom for NFTs and crypto-currencies, one decentralized technology foundation is taking the long view by investing in deep history and the far future. 

Today, the Filecoin Foundation announced a 50,000 FIL grant to the Internet Archive – the largest single donation in the digital library’s 25-year history. 

“Holy Crow! This is a big deal,” said Brewster Kahle, the Internet Archive’s founder. “And what are we going to do with it? We’re going to invest it in making the Internet Archive more decentralized, so that our digital history is available from thousands of computers, not just a few. The idea is to make a robust and private Internet that has a history that will persist over decades and maybe centuries.”

Filecoin is a decentralized storage system designed to preserve humanity’s most important information. The creators of Filecoin envisioned an independent foundation that would serve as the long-term governance body for the Filecoin ecosystem. In awarding the grant to the Internet Archive, Filecoin Foundation board chair, Marta Belcher, stressed the two organizations’ “common goal of preserving the web and fostering its future.”

It was back in 2015 that Protocol Labs’ founder, Juan Benet, first visited the Internet Archive, to share his vision for an academic conference dedicated to preserving “humanity’s greatest treasures using decentralized storage.” Building on these conversations, the Internet Archive organized the Decentralized Web Summit in 2016 in San Francisco, the first gathering of its kind. Back then, a decentralized web was mostly a concept, with little working code.

Decentralized technologists, Trent McConaghy of Ocean and Juan Benet of Protocol Labs at the 2016 Decentralized Web Summit at the Internet Archive in San Francisco.

Since 2016, the Internet Archive has worked with several decentralized tech startups to create a decentralized prototype of the digital library. And when the Filecoin mainnet took off in 2020, public domain audiobooks and films from the Internet Archive were stored on Filecoin servers. Together, the two organizations created the Filecoin Archives, a community-led project to curate, disseminate and preserve important open access information that is often at risk of being lost.

“It’s wonderful to see Filecoin come of age. We started six years ago by putting out a call to make a Decentralized Web, a web that would serve us better than the current web–one that is now starting to be dominated by just a few tech behemoths. Can we make a game with many winners?” asked Kahle. “Filecoin has made a huge step forward by deploying decentralized storage at the exabyte level. That’s very different from AWS (Amazon Web Services). It has many participants, not just one player. And its protocols are open-source. We want to see more technologies like this. This was the original vision of the Decentralized Web that the Internet Archive was hoping for five, six years ago. And it’s starting to come to fruition and Filecoin is a leader in that area.”

Although purveyors of cryptocurrencies are often accused of being driven only by short-term gain, in this group Kahle sees a different motivation. “This donation by the Filecoin Foundation is significant financially for the Internet Archive, but I’d say it’s a more interesting one than that,” said the Internet Hall of Fame engineer. “It’s a donation by a new generation of technologists that are building interesting new technologies…bringing the Archive along with it to make it so that history is preserved – that the Internet Archive makes it into this next generation. That is an interesting thing! You don’t often see that. But the Filecoin Foundation, Filecoin and IPFS, and Juan Benet himself have always been interested in preserving history and how history can be woven into the present and the future of these technologies.”

Author and Open Source Advocate VM Brasseur: Internet Archive ‘Legitimately Useful’ for Lending and Preservation of Her Work

In her 20-year career in the tech industry, VM (Vicky) Brasseur has championed the use of free and open source software (FOSS). She hails it as good for businesses and the community, writing and presenting extensively about its merits.

VM Brasseur, Raleigh, North Carolina, 2018. Credit: Peter Adams Photography

To spread the word, Brasseur has made her book, Forge Your Future with Open Source, available for borrowing through the Internet Archive. She’s also saved all of her blogs, articles, talks and slides in the Wayback Machine for preservation and access by anyone.

“I do it to share the knowledge,” Brasseur said. “Uploading the resources to Internet Archive ensures that more people will be able to see it and will be able to see it forever.”

As soon as her book was published by The Pragmatic Programmers in 2018, Brasseur said she wanted to have it represented in the Internet Archive. She donated a copy so it could be available through Controlled Digital Lending (CDL).

“I think CDL is great. I love libraries,” Brasseur said. “To me, I don’t see how CDL is any different from walking into my local branch of the public library, picking up one of the copies that they have, going up to the circ desk, and taking it home. How is that different from the Internet Archive? They have one copy of my book and check it out one copy at a time. It just happens to be an e-book version. I, frankly, don’t see the material difference.”

A supporter of the Internet Archive since its inception, Brasseur says she’s a regular user of the Wayback Machine. It’s been useful for her to be able to do research and for others to find her body of work. Recently, she revamped her blog and removed some pages—later getting a request from someone who wanted some of the deleted material. Brasseur provided a Wayback Machine link to where she’d stored them, making it easy for that person to find the missing pages. “It’s a gift. It’s legitimately useful,” she said. “Having the Wayback means that other people can still have access” to materials she no longer has on her website.

Borrow the book through the Internet Archive, or purchase a copy for your own library.

Brasseur has led software development departments and teams, providing technical management and strategic consulting for businesses, and helping companies understand and implement FOSS. She wrote her book not just for programmers; she says it is intended to be inclusive, for anyone interested in FOSS, including technical writers, designers, project managers, those involved in security issues, and all other roles in the software development process.

In the book, she walks readers through why they might want to contribute to FOSS and how to best embrace the practices involved. The book has been positively received and was #1 on the BookAuthority list of 18 Best New Software Development Books To Read In 2018. Recently, it has been picked up by people transitioning to telecommuting and looking for resources for doing collaborative work.

“Obviously, I do want people to buy the book, but I’m also strongly pro library, as most intelligent publishers are. My publisher is a big fan of making sure that their books are available in libraries,” Brasseur said. “So the Internet Archive is a library that anyone can access all over the world. And it just makes it a lot easier to make sure that the book gets in the hands of people.”

Brasseur is committed to helping people contribute to open source; for people who can’t afford to buy the book, checking it out from the library is an alternative. “If they can get a copy from Internet Archive, then they can learn how to contribute and they can make a difference from wherever they are in the world. Nigeria, Thailand, Netherlands, or Montana. You don’t have to worry if your local library has it,” she said. “In these times, in particular, it’s very difficult to get to your library. This is a great service that the Internet Archive is providing.”


Forge Your Future with Open Source by VM Brasseur is available for purchase through a variety of retailers and local book stores.

Early Web Datasets & Researcher Opportunities

In July, we announced our partnership with the Archives Unleashed project as part of our ongoing effort to make new services available for scholars and students to study the archived web. Joining the curatorial power of our Archive-It service, our work supporting text and data mining, and Archives Unleashed’s in-browser analysis tools will open up new opportunities for understanding the petabyte-scale volume of historical records in web archives.

As part of our partnership, we are releasing a series of publicly available datasets created from archived web collections. Alongside these efforts, the project is also launching a Cohort Program providing funding and technical support for research teams interested in studying web archive collections. These twin efforts aim to help build the infrastructure and services to allow more researchers to leverage web archives in their scholarly work. More details on the new public datasets and the cohorts program are below. 

Early Web Datasets

The first public datasets in our series from the web collections are oriented around the theme of the early web. These are, of course, datasets intended for data mining and for researchers using computational tools to study large amounts of data, so they lack the informational or nostalgia value of looking at archived webpages in the Wayback Machine. If the latter is more your interest, here is an archived GeoCities page with unicorn GIFs.

GeoCities Collection (1994–2009)

As one of the first platforms for creating web pages without expertise, GeoCities lowered the barrier to entry for a new generation of website creators. There were at least 38 million pages displayed by GeoCities before it was terminated by Yahoo! in 2009. This dataset collection contains a number of individual datasets that include data such as domain counts, image graph and web graph data, and binary file information for a variety of file formats like audio, video, and text and image files. A GraphML file is also available for the domain graph.

GeoCities Dataset Collection: https://archive.org/details/geocitiesdatasets
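As a starting point for working with a domain graph in GraphML, the following Python sketch counts inbound links per domain using only the standard library. It assumes the standard GraphML namespace and plain node and edge elements; the tiny inline sample and its domain names are hypothetical, and the layout of the actual dataset file may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard GraphML XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical three-domain graph for illustration (not taken from the dataset).
SAMPLE_GRAPHML = """
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="domains" edgedefault="directed">
    <node id="a.example"/>
    <node id="b.example"/>
    <node id="c.example"/>
    <edge source="a.example" target="b.example"/>
    <edge source="c.example" target="b.example"/>
    <edge source="b.example" target="a.example"/>
  </graph>
</graphml>
"""


def inbound_link_counts(graphml_text):
    """Count how many edges point at each node id in a GraphML document."""
    root = ET.fromstring(graphml_text.strip())
    counts = Counter()
    for edge in root.iterfind(".//g:edge", GRAPHML_NS):
        counts[edge.get("target")] += 1
    return counts


if __name__ == "__main__":
    for domain, links in inbound_link_counts(SAMPLE_GRAPHML).most_common():
        print(domain, links)
```

For a file too large to hold in memory, the same counting loop can be driven by xml.etree.ElementTree.iterparse, clearing each element after it is processed.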

Friendster (2003–2015)

Friendster was an early and widely used social networking site where users were able to establish and maintain layers of shared connections with other users. This dataset collection contains graph files that allow data-driven research to explore how certain pages within Friendster linked to each other. It also contains a dataset that provides some basic metadata about the individual files within the archival collection.

Friendster Dataset Collection: https://archive.org/details/friendsterdatasets

Early Web Language Datasets (1996–1999)

These two related datasets were generated from the Internet Archive’s global web archive collection. The first dataset, “Parallel Language Records of the Early Web (1996–1999),” provides multilingual records, or URLs of websites that have the same text represented in multiple languages. Such multi-language text from websites is a rich source for parallel language corpora and can be valuable in machine translation. The second dataset, “Language Annotations of the Early Web (1996–1999),” is another metadata set that annotates the language of over four million websites using Compact Language Detector (CLD3).

Early Web Language collection: https://archive.org/details/earlywebdatasets

Archives Unleashed Cohort Program

Applications are now being accepted from research teams interested in performing computational analysis of web archive data. Five cohort teams of up to five members each will be selected to participate in the program from July 2021 to June 2022. Teams will:

  • Participate in cohort events, training, and support, with a closing event held at Internet Archive, in San Francisco, California, USA tentatively in May 2022. Prior events will be virtual or in-person, depending on COVID-19 restrictions
  • Receive bi-monthly mentorship via support meetings with the Archives Unleashed team
  • Work in the Archive-It Research Cloud to generate custom datasets
  • Receive funding of $11,500 CAD to support project work. Additional support will be provided for travel to the Internet Archive event

Applications are due March 31, 2021. Please visit the Archives Unleashed Research Cohorts webpage for more details on the program and instructions on how to apply.

Milton Public Library Reaches Patrons Through Controlled Digital Lending

Leaders at the Milton Public Library (MPL) in Canada say they are continually questioning their operations and looking for ways to better serve their patrons. That’s why the Ontario institution joined the Internet Archive’s Open Libraries program.

“We are always keen to innovate, in meaningful ways,” said Mark Williams, MPL chief executive officer and chief librarian. “Why would we not want to be in this partnership that expands our collection, but also extends assets to other people’s collections in a digital realm? It was a no-brainer.”

In making its decision to become part of Open Libraries in September 2019, Williams said rather than being concerned about publishers, the focus was on the interests of the public. 

Mark Williams, Milton Public Library

“If it challenges the status quo for the benefit of readers, wherever those readers are, then I think we should engage,” Williams said.

As it happens, the timing of its membership was fortuitous. With COVID-19 disrupting access to the print collection at its branches, being part of the Open Libraries meant broader access to digital materials for patrons quarantined at home.

MPL has been a central part of the Milton, Ontario, community since 1855, serving a population of more than 120,000 through three physical libraries and its website (with a bookmobile and four new branches in the pipeline over the next 10 years). Library services were forced to be flexible in the past year as health circumstances changed in the province.

The three MPL locations closed on March 17, 2020, under a state of emergency in Ontario. By May, a phased reopening allowed libraries to begin limited operations. During the state of emergency, librarians pivoted to providing access to services only through virtual interactions and the website was changed to focus on promoting electronic resources. As restrictions eased, MPL provided curbside, contactless pickup. Eventually, 50 to 100 patrons were allowed inside the buildings with safety protocols. The libraries had to close again when COVID-19 cases spiked in the winter, and then reopened in February.

We’ve seen overwhelming demand…Patrons think it’s a fantastic option…

Mark Williams, Milton Public Library

“The staff have been remarkably agile and good at adapting their approach,” Williams said. “We’ve done the best we possibly could to ensure the public library services continued, but the way we deliver it is different than anyone would have expected.”

In addition to joining Open Libraries, MPL donated 30,000 books to the Internet Archive. Williams said the expanded access to content in the larger online library has been a boon to the public. Regardless of the pandemic, MPL would have spread the word about access to Open Libraries, he said, but it was likely accelerated because there was no choice but to focus on digital offerings in the pandemic.

Milton Public Library

“The lockdown highlighted the ability for us to raise awareness about the partnership and introduce it to more patrons,” Williams said. MPL is creating a new portal on its website that will be dedicated to Open Libraries but has been promoting its availability in the meantime and the response has been positive.

“We’ve seen overwhelming demand,” Williams said. “Patrons think it’s a fantastic option for them to have increased materials than we currently have available.”

The transition to becoming part of the Open Libraries program was seamless, said Williams, and he’s encouraging other libraries to consider joining.

“I hope if other libraries sign up, they will be equally inspired by the partnership. The content is amazing,” Williams said. “Our patrons think it’s phenomenal. Our board thinks it’s a great idea, philosophically. Everyone believes this is an important service addition.”

To browse the books now available for lending through Milton Public Library’s participation in the Open Libraries program, please visit: https://archive.org/details/miltonpubliclibrary-ol. Learn how your library can participate in the Open Libraries program.

Internet Archive Expresses Concerns Over Sweeping Copyright Reform Proposal

You may have heard that, in the waning days of 2020, controversial new copyright provisions were slipped into the end-of-year, must-pass COVID relief bill. Many commenters were troubled by this departure from the ordinary legislative process. Unfortunately, there are more controversial copyright revisions waiting in the wings.

Recently, Senator Thom Tillis released draft legislation which would substantially change the copyright landscape for the worse. It’s called the “Digital Copyright Act,” and our friends at the Electronic Frontier Foundation have described it as disastrous. The proposed Digital Copyright Act would change the rules that govern the Internet in a lot of ways, including requiring automated content filtering that would reduce access to knowledge. While the proposal nods towards making the rules better for Internet users, the draft legislation is still far better for Big Content and Big Tech than it is for libraries, non-profits and regular people.

Even small changes to copyright rules can have substantial consequences for the internet information ecosystem. That is why it is so important that sweeping proposals like this one not be passed in the dead of night, but instead be subject to rigorous study and open comment by everyone. We have drafted a short comment on this proposal which you can review here.

Search Scholarly Materials Preserved in the Internet Archive

Looking for a research paper but can’t find a copy in your library’s catalog or popular search engines? Give Internet Archive Scholar a try! We might have a PDF from a “vanished” Open Access publisher in our web archive, an author’s pre-publication manuscript from their archived faculty webpage, or a digitized microfilm version of an older publication.

We hope Internet Archive Scholar will aid researchers and librarians looking for specific open access papers that may not be otherwise available to them. Judith van Stegeren (@jd7g on Twitter), a PhD candidate in the Netherlands, encountered just such a situation recently when sharing a workshop paper on procedural generation in computer games: “Towards Qualitative Procedural Generation” by Mark R. Johnson, originally presented at the Computational Creativity & Games Workshop in 2016. The papers for this particular year of the workshop are not indexed in the usual bibliographic catalogs, and the original workshop website hosting the Open Access papers is no longer accessible. Fortunately, copies of all the 2016 workshop papers were captured in the Wayback Machine, and can be found today by searching IA Scholar by title or conference name.

As another example, dozens of papers from the Open Journal of Hematology are no longer resolvable via DOI. As mentioned in a previous blog post, the publisher’s website vanished and has been replaced with unrelated advertisements. But before that happened, the papers were captured in the Wayback Machine, indexed in our catalog, and can now be searched in full:

IA Scholar Search Results

IA Scholar is a simple, access-oriented interface to content identified across several Internet Archive collections, including web archives, archive.org files, and digitized print materials. The full text of articles is searchable for users that are hunting for particular phrases or keywords. This complements our existing full-text search index of millions of digitized books and other documents on archive.org.

The service builds on Fatcat, an open catalog we have developed to identify at-risk and web-published open scholarly outputs that can benefit from long-term preservation, additional metadata, and perpetual access. Fatcat includes resources that may be useful to librarians and archivists, such as bulk metadata dumps, a read/write API, command-line tool, and file-level archival metadata. If you are interested in collaborating with us, or are a researcher interested in text analysis applications, we have a public chat channel or can be contacted by email at info@archive.org.

IA Scholar marks a milestone in our work initiated in 2018 to leverage the automation and scale of web and API harvesting in providing open infrastructure for the preservation of and perpetual access to scholarly materials from the public web. We particularly want to thank the Mellon Foundation for their original and ongoing support of this work, our many current partners, and the other collaborators, contributors, and volunteers.

All of this is possible because of the incredible open research ecosystem built and collectively maintained by Open Access advocates. Thank you to the DOAJ and other groups for helping catalog open access journals, which has aided preservation. Thank you to the Biodiversity Heritage Library and its supporters for digitizing print journal literature. And thank you to the many other organizations we have worked with, integrated, or whose services we have utilized, including open web indices (Unpaywall, CORE, CiteseerX, Microsoft Academic, Semantic Scholar), directories of open journals (DOAJ, ROAD, SHERPA/ROMEO, JURN, Wikidata), and open bibliographic catalogs (Crossref, Datacite, J-STAGE, Pubmed, dblp).

IA Scholar is built from open source software components, and is itself released as Free Software. The website has been translated into eight languages (so far!) by generous volunteers.

Leveling the Playing Field for Students with Print Disabilities

The Internet Archive is bringing more periodicals and scholarly resources to students directly and by working with disability offices in the United States, Canada and elsewhere.

As more students with disabilities pursue higher education, demand is growing for books, journal articles and other learning materials to be available in accessible formats. This includes digitizing print materials for people who are blind or have low vision, those with dyslexia or attention deficit/hyperactivity disorder (ADHD), and people with limited mobility who might have difficulty holding print documents.

The Internet Archive is part of an expanding effort to make it easier for people with print disabilities to access information by digitizing books, periodicals, and microfilm needed to succeed in school and beyond. Once print materials are converted to machine-readable formats, users can listen with a screenreader, text-to-speech software or other forms of audio delivery, starting, stopping, and slowing down the information flow as needed. They can also change the colors of the text and page background.

With 10 percent or more of students at colleges in the United States requesting accessibility accommodations (Government Accountability Office, 2009, p.37), providing digitized learning materials is critical. Each semester Disability Service Offices (DSOs) on campuses respond to student requests to convert materials into accessible formats—often doing so in silos with limited budgets.

Libraries are being called into action to coordinate the delivery of accessible instructional materials. Doing its part to improve access to knowledge for all, Internet Archive is collaborating with others to share its collection and streamline the search process.

A level playing field

“There is a need for a fast turnaround with materials. Students [with print disabilities] need a level playing field,” said John Unsworth, dean of libraries at the University of Virginia. “The library is not just here for the able-bodied.”

John Unsworth, University of Virginia

UVA is working with the Internet Archive, Bookshare, and the HathiTrust to reduce duplication of efforts across the country to convert text materials to accessible formats. Together, they are participating in the Federating Repositories of Accessible Materials for Higher Education (FRAME) project funded with a $1 million grant from The Andrew W. Mellon Foundation.

Since 2019, the partners have established Educational Materials Made Accessible (EMMA), a hub and repository for digitized materials. The pilot includes the University of Virginia and six other universities: George Mason University, Texas A&M University, University of Illinois at Urbana-Champaign, Northern Arizona University, University of Wisconsin-Madison and Vanderbilt University.


“When looking at the intersection between copyright and civil rights…civil rights win every time”

John Unsworth, university librarian, University of Virginia

EMMA provides DSO staff (on behalf of students) with a central place to retrieve machine-readable texts from the Internet Archive, HathiTrust, and Bookshare, and gives library staff a place to re-deposit them. It provides a searchable database to locate materials requested by students more efficiently. Users can filter by repository, format and accessibility features, which will become more valuable as texts are remediated. The project relies on the Internet Archive as a large digital repository to provide a federated network of storage and delivery, as well as technical expertise.

Unsworth said the goal of EMMA is to speed up access to materials and help DSOs avoid duplication. If faculty tinker with a syllabus and add a book at the last minute, students with print disabilities need to be able to have a copy they can use at the same time their peers do. “It’s the nature of education that what you need to read changes during the semester,” Unsworth said. “[Students with print disabilities] can’t get materials at the last minute when everyone else has had it for two weeks.”

Often, libraries are not involved in collecting, cataloguing, or preserving educational materials for people with disabilities on their own campus, or making them discoverable to others. EMMA is designed to connect DSOs and libraries on the same campus, and with other institutions. Once materials are remediated, DSOs put them in a drop box; the library then validates them with the new metadata and uploads them. “Libraries shoulder the burden of sharing—and by doing that, they help fulfill their mission,” Unsworth said.

Despite publisher warnings about what DSOs can do with their remediated content, Unsworth said concerns are not supported by law. “When looking at the intersection between copyright and civil rights…civil rights win every time,” Unsworth said. “Libraries are used to pushing back on publisher claims. Libraries bring a willingness to stand up to appropriate use rights.”

A coordinating hub for materials was desperately needed and, Unsworth said, something DSOs have been waiting to have for years.


“Everyone should have the same shot at succeeding”

– Angella Anderson, disability specialist, University of Illinois at Urbana-Champaign

Angella Anderson, UIUC

Based on a student’s syllabus, Angella Anderson, a disability specialist at the University of Illinois at Urbana-Champaign, arranges for needed accessible materials for students at all levels—from undergraduates to law students to doctoral students. “We have several students who—without this service—would have had significant challenges being successful in their programs.”

Now, with EMMA, if a book or journal article a student needs is already shared on the hub, the DSO can download it and save time. Anderson estimates it has cut her time searching for learning materials by half. “The problem we’ve all had over the years is that we are converting the same book at the same time. That’s a huge resource drain,” Anderson said, noting the potential benefit of EMMA. “Everyone should have the same shot at succeeding at whatever it is they want to do, so I feel this will be extremely useful to a lot of schools and a lot of students.”

Canadian efforts advance

In Canada, the Internet Archive supports the work of the Accessible Content E-Portal (ACE), a service of the Ontario Council of University Libraries. At the Internet Archive digitization center at the University of Toronto, staff digitize on demand and prioritize requests received by ACE from students who need materials for accessibility. The turnaround used to take weeks, but Andrea Mills, digitization program manager, said the system has been improved and students with print disabilities can now often get materials digitized in less than two days.

Andrea Mills, Internet Archive

Mills said requested materials most often include non-fiction research books and novels, often printed between 1990 and 2010, before e-books were widely available. Elsewhere in Canada, at the University of Alberta, another Internet Archive scanning center provides the same service, through the university’s Accessibility Resources office, to students who have qualifying perceptual challenges.

“Sometimes people not part of the mainstream are forgotten,” Mills said. “It may only be a handful of users who have this need, and not represent a high number of downloads or uses, but these are people who truly need assistance.”

Learn more

Librarians: Join our free program to qualify your patrons to access the Internet Archive’s resources for users with print disabilities. Individuals can gain access by having a qualifying authority, like the Vermont Mutual Aid Society, enroll them in its program.

Howard University Joins Open Libraries, Embraces Digital Access for Students

Howard University’s Founders Library. Image courtesy Tyrone Turner / WAMU

Like campuses across the country, Howard University in Washington, D.C., shut down last March when COVID-19 hit. Most of its nearly 6,000 undergraduate students have been learning remotely ever since.

With students cut off from the physical library, demand for e-books has increased. The university recently joined the Open Libraries program to expand the digital materials that students can borrow. Through the program, users can check out a digital version of a book the library owns using controlled digital lending (CDL).

Amy Phillips, head of technical services for Howard University Libraries, learned about the opportunity last fall through the Washington Research Library Consortium. Howard is one of nine D.C.-area libraries in the nonprofit consortium, which recently collaborated with the Internet Archive to do an overlap analysis of its shared collection. When the digital materials became available to use for free through the consortium, Howard decided to join, too.

After Alisha Strother, metadata librarian, ran an analysis of books in the Howard collection by International Standard Book Number (ISBN), it was discovered that more than 14,000 books matched a copy that the Internet Archive had acquired and digitized. Howard decided to join the Open Libraries program in January. This means that students can now check out these Howard books from across the country as they engage in online instruction.
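The article does not describe how the ISBN analysis was carried out. As a rough illustration only, an overlap check of this kind can be sketched as normalizing both lists of ISBNs to 13-digit form and intersecting them. The function names and sample values below are hypothetical and are not taken from Howard's or the Internet Archive's actual workflow.

```python
import re
from typing import Iterable, Optional, Set


def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit for the first 12 digits."""
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(first12))
    return (10 - total % 10) % 10


def normalize_isbn(raw: str) -> Optional[str]:
    """Return the ISBN as 13 digits, or None if it fails validation."""
    s = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", s):
        # Validate the ISBN-10 checksum (weights 10 down to 1, modulo 11).
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        if total % 11 != 0:
            return None
        core = "978" + s[:9]
        return core + str(isbn13_check_digit(core))
    if re.fullmatch(r"[0-9]{13}", s):
        return s if isbn13_check_digit(s[:12]) == int(s[12]) else None
    return None


def isbn_overlap(library_isbns: Iterable[str], archive_isbns: Iterable[str]) -> Set[str]:
    """Hypothetical helper: ISBNs present in both collections, in 13-digit form."""
    library = {n for n in map(normalize_isbn, library_isbns) if n}
    archive = {n for n in map(normalize_isbn, archive_isbns) if n}
    return library & archive


if __name__ == "__main__":
    # Made-up sample data: the same title recorded as ISBN-10 and ISBN-13.
    held = ["0-306-40615-2", "080442957X", "not-an-isbn"]
    digitized = ["978-0-306-40615-7"]
    print(isbn_overlap(held, digitized))
```

Converting everything to ISBN-13 first means a title catalogued with a 10-digit ISBN in one system still matches its 13-digit record in the other. A real analysis would likely also need to handle records with no ISBN at all and different editions of the same work.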

“I see this as being an important resource for students to be able to access materials from anywhere,” Phillips said. “And I think it will have value and be heavily utilized even when we are back on campus.”

Historic view of Howard University’s Founders Library.

Howard is one of nearly 100 historically black colleges and universities (HBCUs) in the United States. One of Howard’s most important entities is the Moorland-Spingarn Research Center, which is recognized as one of the world’s largest and most comprehensive repositories for the documentation of the history and culture of people of African descent in Africa, the Americas and other parts of the world. Portions of its materials are also now available for digital borrowing through the Open Libraries program. The collection will now have greater exposure, since it had previously been accessible only onsite to researchers who scheduled appointments.

“This opens up a premier collection to public usage. From a scholarly and cultural point of view, this material is very much in demand,” Phillips said. “Looking forward, we think it will get a lot of traffic.”

COVID-19 has disproportionately affected people of color, prompting Howard to be cautious and extend online learning into the spring semester for almost all students, Phillips said. The university is doing all it can to connect students with resources, and its libraries have been investing more in digital items. But budgets are limited and licensing agreements curb the library’s ability to broadly lend e-books.

“The Internet Archive has been an important way to open up more library materials to students,” said Phillips, adding that it’s new and just beginning to be promoted to students and faculty. “We’re excited and we know this will have a positive impact on student success and scholarship.”

Giving “Last Chance Books” New Life Through Digitization

The Dedication of Books by H.B. Wheatley (1887), as presented for scanning. View the digitized book online.

Sometimes they arrive tied up in string because their binding is broken. Others are in envelopes to protect the brittle pages from further damage.

Aging books are sent from libraries to the Internet Archive for preservation. Thanks to the careful work of the nearly 70 people who scan at digitization centers in the United States, United Kingdom and Canada, the books get a second life with a new audience.

Scanners sometimes call these “Last Chance Books” and they take pride in restoring them. As they turn the pages one at a time to be photographed and digitized, they develop a daily cadence, but it must be adjusted with fragile materials.

“We do our best with the flaking or cracking pages,” said Andrea Mills, digitization program manager for the Internet Archive stationed in Toronto, Canada. “You have to be really cautious that the flake doesn’t fall off and cover a word. It’s almost like a puzzle.”

Elizabeth MacLeod, demoing a Scribe in the foyer of the Internet Archive in San Francisco, pre-COVID.

Some books that land at the Internet Archive digitization centers date back to the 1700s. They are fiction and nonfiction, journals and pamphlets covering a range of topics. And it can be surprising to learn what reviving the material means to patrons.

“We chuckled when we digitized a book on sea captains. We thought – who will care? And then a year later, it had hundreds of views,” said Elizabeth MacLeod, senior manager of satellite digitization services who manages remote operations out of Wilmington, North Carolina.

Digitization helps preserve materials that are no longer in circulation at their holding library because they are falling apart. It also gives new exposure to books that are out of print that may otherwise be forgotten.

Both Mills and MacLeod began working for the Internet Archive more than 10 years ago as book scanners – also known as Scribe operators. Mills has an arts degree in jewelry design and teaching; MacLeod studied biology. They were both drawn to the mission of the Internet Archive and share a passion for connecting people with resources.

A cart of “last chance books” awaiting digitization at the University of Toronto.

Over the years, Mills and MacLeod have worked closely with librarians and archivists around the world to digitize their collections, learning more with each project. They now manage digitization and support sites, many of them embedded in libraries, with training and best practices in 10 countries and upwards of 30 locations. Digitizing is a somewhat solitary task and some people “get in the zone” while scanning; others are very chatty or listen to music, Mills said.

Andrea Mills, showing off the Scribe to a tour celebrating the 2020 ALCTS Outstanding Collaboration Citation for digitizing a collection of Tamil materials at University of Toronto.

Many employees have worked together for nearly a decade and there is a friendly, collaborative vibe at the centers. “We have all sorts of people—artists, printers and photographers. They are people who are meticulous and love books,” Mills said. A recent viral video shared on the Internet Archive’s Twitter account features Scribe operator Eliza Zhang, who has worked at the Archive for more than ten years. Book conservators from larger institutional partners also offer additional training for Internet Archive operators on best practices for handling their unique collections.

MacLeod says the scanners are all committed to providing a service to readers and it’s satisfying to help people with disabilities connect with books. “It’s energizing to be part of an organization that is thinking outside the box,” she said. “I want people to be able to have more access to whatever they are trying to find.”

Added Mills: “I’m an information junky. I love the search and the hunt and the finding the answer. The power of the internet and digitization is that you can find that answer faster. It just sort of opens up the possibilities of what you can do.”