Internet Archive’s Modern Book Collection Now Tops 2 Million Volumes

The Internet Archive has reached a new milestone: 2 million. That’s how many modern books are now in its lending collection—available free to the public to borrow at any time, even from home.

“We are going strong,” said Chris Freeland, a librarian at the Internet Archive and director of the Open Libraries program. “We are making books available that people need access to online, and our patrons are really invested. We are doing a library’s work in the digital era.”

The lending collection is an encyclopedic mix of purchased books, ebooks, and donations from individuals, organizations, and institutions. It has been curated by Freeland and other librarians at the Internet Archive according to a prioritized wish list that has guided collection development. The collection has been purpose-built to reach a wide base of both public and academic library patrons, and to contain books that people want to read and access online—titles that are widely held by libraries, cited in Wikipedia and frequently assigned on syllabi and course reading lists.

“The Internet Archive is trying to achieve a collection reflective of great research and public libraries like the Boston Public Library,” said Brewster Kahle, digital librarian and founder of the Internet Archive, who began building the diverse library more than 20 years ago.

“Libraries from around the world have been contributing books so that we can make sure the digital generation has access to the best knowledge ever written,” Kahle said. “These wide ranging collections include books curated by educators, librarians and individuals, that they see are critical to educating an informed populace at a time of massive disinformation and misinformation.”

The 2 million modern books are part of the Archive’s larger collection of 28 million texts that include older books in the public domain, magazines, and documents. Beyond texts, millions of movies, television news programs, images, live music concerts, and other sound recordings are also available, as well as more than 500 billion web pages that have been archived by the Wayback Machine. Nearly 1.5 million unique patrons use the Internet Archive each day, and about 17,000 items are uploaded daily.

Presenting the (representative) 2 millionth book

Every day about 3,500 books are digitized in one of 18 digitization centers operated by the Archive worldwide. While there’s no exact way of identifying a singular 2 millionth book, the Internet Archive has chosen a representative title that helped push past the benchmark to highlight why its collection is so useful to readers and researchers online.

On December 31, The dictionary of costume by R. Turner Wilcox was scanned and added to the Archive, putting the collection over the 2 million mark. The book was first published in 1969 and reprinted throughout the 1990s, but is now no longer in print or widely held by libraries. This particular book was donated to Better World Books via a book bank just outside of London in August 2020, then made its way to the Internet Archive for preservation and digitization. 

“The dictionary of costume” by R. Turner Wilcox, now available for borrowing at archive.org.

As expected from the title, the book is a dictionary of terms associated with costumes, textiles and fashion, and was compiled by an expert, Wilcox, the fashion editor of Women’s Wear Daily from 1910 to 1915. Given its authoritative content, the book made it onto the Archive’s wish list because it is frequently cited in Wikipedia, including on pages like Petticoat and Gown.

Now that the book has been digitized, Wikipedia editors can update citations to the book and include a direct link to the cited page. For example, users reading the Petticoat page can see that page 267 of the book has been used to substantiate the claim that both men and women wore a longer underskirt called a “petticote” in the fourteenth century. Clicking on that reference will take users directly to page 267 in The dictionary of costume, where they can read the dictionary entry for petticoat and verify that information for themselves.

Screenshots showing how Wikipedia users can verify references that cite “The dictionary of costume” with a single click.

This work is also important because there is no commercial ebook available for The dictionary of costume. The book is one of the millions of titles that reached the end of their publishing lifecycle in the 20th century, so there is no electronic version available for purchase. That means that the only way of accessing this book online and verifying these citations in Wikipedia (the kind of research that students of all ages perform in our connected world) is through a scanned copy, such as the one now available at the Internet Archive.

Donations play an important role

Increasingly, the Archive is preserving many books that would otherwise be lost to history or the trash bin.

In recent years, the Internet Archive has received donations of entire library collections. Marygrove College gave more than 70,000 books and nearly 3,000 journal volumes for digitization and preservation in 2019 after the small liberal arts college in Detroit closed. The well-curated collection, known for its social justice, education and humanities holdings, is now available online at https://archive.org/details/marygrovecollege.

Several seminaries have donated substantial or complete collections to the Archive to preserve items or to give them a new life as their libraries were being moved or downsized. Digital access is now available for items from the Claremont School of Theology, Hope International University, Evangelical Seminary, Princeton Theological Seminary, and Anabaptist Mennonite Biblical Seminary.

Just like The dictionary of costume, many of the books supplied for digitization come to the Archive from Better World Books. In its partnership over the past 10 years, the online book seller has donated millions of books to be digitized and preserved by the Archive. Better World Books acquires books from thousands of libraries and book suppliers, and through a network of book donation drop boxes (known as “book banks” in the UK). If a title is not suitable for resale and is on the Archive’s wish list, the book is set aside for donation.

“We view our role as helping maximize the life cycle and value of each and every single book that a library client, book supplier or donor entrusts to us,” said Dustin Holland, president and chief executive officer of Better World Books. “We make every effort to make books available to readers and keep books in the reading cycle and out of the recycle stream. Our partnership with the Internet Archive makes all this possible.”

The Archive provides another channel for customers to find materials, Holland added.

“We view archive.org as a way of discovering and accessing books,” said Holland. “Once a book is discoverable, the more interest you are going to create in that book and the greater the chance it will end up in a reader’s hands as a new or gently used book.”

Impact

Having books freely available for borrowing online serves people with a variety of needs, including those with limited access to libraries because of disabilities or transportation issues, people in rural areas, and those who live in under-resourced parts of the world.

Sean, an author in Oregon, said he goes through older magazines for design ideas, especially from cultures that he wouldn’t be exposed to otherwise: “It gives me a wider understanding of my small place in the global historical context.” One parent from San Francisco said she uses the lending library to learn skills like hand drawing, so that she can draw characters and landscapes and interact more deeply with her child.

The need for information is more urgent than ever.

“We are all homeschoolers now. This pandemic has driven home how important it is to have online access to quality information,” Kahle said. “It’s gratifying to hear from teachers and parents that are now given the tools to work with their children during this difficult time.”

Kahle’s vision is to have every reference in Wikipedia be linked to a book and for every student writing a high school report to have access to the best published research on their subject. He wants the next generation to become both the authors of the books that should be in the library and the most informed electorate possible.

Adds Kahle: “Thank you to all who have made this possible – all the funders, all the donors, the thousands who have sent books to be digitized. If we all work together, we can do another million this year.”

Take action

If you’re interested in making a physical donation to the Internet Archive, there are instructions and an online form that start the process in the Internet Archive’s Help Center: How do I make a physical donation to the Internet Archive?

21 thoughts on “Internet Archive’s Modern Book Collection Now Tops 2 Million Volumes”

  1. Jack El-Hai

    I applaud much of what the Internet Archive accomplishes, and I used to be a donor. Real libraries, however, do not make loans of unlimited duplicates of copyrighted materials, without a license from the copyright holder, as you did earlier in the pandemic. These loans directly hurt literary creators when many were financially vulnerable. That’s why I no longer contribute money to you.

    1. Jennie Rose Halperin

      Hi Jack!

      Thanks for commenting on this post – my organization Library Futures is in a coalition and community with the Internet Archive to champion the right to equitable access to knowledge for all libraries. We’re also working with authors’ rights organizations like Authors Alliance to ensure that authors are fairly compensated and libraries can do what they do – buy and lend materials. Learn more at https://libraryfutures.net

      Without commenting on the specifics of the National Emergency Library, I think we’re both in agreement that libraries not only provide a public good to society, they also buy books (lots of them!) This podcast lays out how library lending and ebook sales were way up this year, and sales are definitely one metric of a healthy publishing ecosystem: https://beyondthebookcast.com/up-for-2020-book-business-braces-for-2021/

      What’s not healthy in publishing, and I am sure you know this, is the way in which both libraries and writers are taken advantage of through restrictive licensing agreements that cut off knowledge and lock down rather than facilitate access. Library budgets have not kept pace with the astronomical price of digital content, and most authors don’t see their fair share of these rising costs. We believe that in order to provide equitable access, the system needs to work for everyone – authors, libraries, publishers of all sizes, schools, and the public sector. We are fighting for digital first sale, which uses technology to mirror the library’s right to loan legally acquired books under controlled conditions while respecting copyright. The Internet Archive’s Open Libraries program is at the cutting edge of this technology, working with hundreds of libraries to facilitate access to content.

      Our principles (https://www.libraryfutures.net/our-principles) lay out what we’re working towards. I’d also encourage you to take a look at this pricing explainer from the Canadian Library Council (https://econtentforlibraries.org/) and this article from Maria Bustillos in the New Republic (https://newrepublic.com/article/160649/book-companies-follett-overcharge-public-schools)

      I appreciate your contribution and look forward to a continuing conversation. As a writer myself, I know how difficult it is to build a career. I believe that it’s time for all of us (particularly creators) to think critically about the power structures at play between large corporations, nonprofits, libraries, and writers, how this interplay is exploitative to knowledge in the service of the public good, and how we can make it better together.

      Feel free to reach out!

      1. Jack El-Hai

        Jennie, thank you for your comment. I wonder, however, if you misunderstand the objections of authors to the actions of the National Emergency Library during the early months of the pandemic. I don’t know any authors who object to the work of libraries in lending books or to fair terms for libraries in the acquisition of books and ebooks to lend.

        Instead, we object that the Internet Archive took advantage of the pandemic to supplant the role of bookstores in providing books to readers who otherwise would have bought their books. I am not talking about students and other library users. By providing unlimited digital duplicates of books to its users for an unlimited period of time, the National Emergency Library deprived bookstores and authors of sales – at an economically precarious time – that would have otherwise been made. And the Internet Archive had no legal license to do so.

        Authors are dependent upon libraries, not the enemies of those institutions. We are the enemies of organizations that violate copyright and pirate our work, contrary to the practices of legitimate library lenders, in the guise of librarianship.

          1. Dean

            Please rethink your comment. Amazon never closed. Deliveries never stopped. Books were available to order to receive a regular print copy or purchase for download on Kindle or other e-devices. Yes. Revenue was lost.

          2. Jack El-Hai

            I don’t know of any online bookstores and online libraries that were closed, even during the pandemic’s earliest weeks. In addition, many brick-and-mortar bookstores and libraries were offering curbside pickup.

        1. Sylvester Wrzesinski

          As someone who does a fair amount of research, I think you are perhaps… overstating how accessible many of these books are. Older books especially. As someone who does research into tall ships (sailing ships like the tea clippers, and smaller and slower coasters), the best and indeed only books that have been published on this were last printed back in the ’70s. And the better and more detailed copies are usually even older.

          They are not available in bookstores, except for a few that have enough older used books that somehow a few copies of R.H. Dana’s works have somehow fallen into their laps. A few books I have only seen in university research libraries or cited elsewhere. And while I don’t know if Dana’s books have been archived like others here have, I certainly hope they will. Otherwise it will be direct knowledge lost, never to be appreciated or available.

          1. Jack El-Hai

            A high percentage of the books available through the National Emergency Library were available for sale during the pandemic’s worst days by online booksellers. These retailers suffered financial harm as a result.

    2. Ken

      Good day Jack,

      I am from the Philippines. We are looking for a machine like this for our school library and records. Please help.

  2. Crawford

    The law might be on your side, but you know very well that some of the people borrowing “duplicates” via the Internet Archive did so because they couldn’t access those works due to the pandemic. Many of the loans wouldn’t have resulted in any additional revenue to authors.

    Being affected by the pandemic, I get the position of (some) creators, but the lack of understanding in the context of a pandemic, where many people couldn’t leave their home and visit the library, lost their jobs, and were also financially vulnerable, left me a bit disappointed.

    Personally, I won’t be supporting some authors from now on.

      1. Joseph

        That’s exactly what you argued, along with accusing the IA of “taking advantage of the pandemic”, as if they were part of some nefarious plot.

        “By providing unlimited digital duplicates of books to its users for an unlimited period of time” – it wasn’t an unlimited period of time; it was until the general lockdown was lifted.

        1. Jack El-Hai

          If you ask the folks at the IA whether they took advantage of the pandemic to challenge U.S. copyright law, and they replied honestly, their answer would be yes. They needed no fellow plotters or conspirators to do so, and I didn’t suggest there were any.

      2. Stephen Plotkin

        Mr. El-Hai:

        You have reiterated many times now that bookstores and authors have been deprived of sales. But I think you need to quantify this deprivation. To what extent have booksellers of all sorts – authors, publishers, and stores – lost sales because of the IA initiative? What methodology is being used to make those estimates? In the absence of some supporting argument, your repeated insistence looks more and more like begging the question. Certainly if you have studies that underwrite your claims, I, for one, would be interested to see them. It would be helpful if they were more or less impartial, but I’ll look at anything I can get.

        I appreciate that you may believe this kind of support is unnecessary because the law obviously is on your side, but first of all, that is precisely one of the things that is being challenged here, and secondly even if it is “the law” that doesn’t mean it is good law. I will freely show my hand here, and say that I am skeptical of the current copyright regime. Although it may be benefiting some – and you may be one of them – I think it is bad for the larger community of creators and their audiences.

        Like anything human, copyright has a history, and anything with a history eventually reaches an end.

  3. Ramsey Nolan

    How can one contribute or donate rare articles and similar materials? A university in Germany has been receiving such archival material. The goal is to keep rare materials from being thrown into the garbage by the inheritors of an estate.
