Most 20th Century Books Unavailable to Internet Users – We Can Fix That

The books of the 20th century are largely not online. They are mostly not available from even the biggest booksellers. And libraries that have collected hard copies of these books have not been able to deliver them in a cost-efficient, simple, digital form to their patrons.

The way libraries could fill that gap is to adopt and deliver a controlled digital lending service. The Internet Archive is trying to do its part but needs others to join in. 

The Internet Archive has worked with 500 libraries over the last 15 years to digitize 3.5 million books. But because of copyright concerns, the selection has often been restricted to pre-1923 books. We need complete libraries and comprehensive access to nurture a well-informed citizenry. The following graph shows the number of books digitized by the Internet Archive, binned by decade:

Up until 1923 the graph shows our collection increasing and mirroring the rise in publications. Then it dips and slows because of concerns and confusion about copyright protections for books published after that date. It picks up again in the 1990s because these books are more readily available and separate funding has helped us digitize some more recent books. Nevertheless, the end result is that the gap is big – the digital world is missing a huge chunk of the 20th Century.

Users can’t even fill that gap by buying the books from that time period. According to a recent paper by Professor Rebecca Giblin, the commercial life of a book is typically exhausted 1.4 to 5 years from publication; some 90% of titles become unavailable in physical form within just two years. Most older books are therefore not available to be purchased in either physical or digital form. The following graph, pulled from a study by Professor Paul Heald, shows books by decade that are available on Amazon.com. It shows that the world’s largest bookseller has the same huge gap – the 20th century is simply missing. 

The 20th Century represents a significant portion of published knowledge – approximately one-third of all books – as shown in the graph below.  These books are largely unavailable commercially, BUT they are not completely lost. Many of these books are on library shelves, accessible only if you physically visit the library that owns those books. Even if you’re willing to visit, those books might still not be accessible. Libraries, pressed to repurpose their buildings, have increasingly moved volumes to off-site storage facilities.

The way to make 20th Century books available to library patrons is to digitize those books and let every library that owns a physical copy lend that book in digital form. This type of service has come to be known as controlled digital lending (CDL). The Internet Archive has been doing this for years. We lend out-of-copyright and in-copyright volumes that we physically own. We’ve reformatted the physical volume, produced a digital version, and lend only that digital version to one user at a time. Our experience shows that this responds to a real demand, fills a genuine need satisfactorily, gives new life to older books, and brings important knowledge to a new audience. Check out this case study for CDL involving the book Wasted, which figured prominently in the Brett Kavanaugh Supreme Court nomination hearings.
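
For readers who think in code, the "one user at a time" rule described above can be sketched roughly as follows. This is only an illustration of the idea, not the Internet Archive's actual system: the names `Title`, `LendingDesk`, `borrow`, and `give_back` are hypothetical, and the 14-day loan period is an assumption made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Title:
    """A digitized title backed by physical copies the library owns."""
    owned_copies: int
    # Maps borrower id -> time the loan expires.
    loans: Dict[str, datetime] = field(default_factory=dict)


class LendingDesk:
    """Hypothetical CDL desk: never more digital loans than owned copies."""

    def __init__(self, loan_period: timedelta = timedelta(days=14)):
        # The 14-day period is an assumption for this sketch.
        self.loan_period = loan_period
        self.titles: Dict[str, Title] = {}

    def add_title(self, title_id: str, owned_copies: int) -> None:
        self.titles[title_id] = Title(owned_copies)

    def borrow(self, title_id: str, borrower: str,
               now: Optional[datetime] = None) -> bool:
        now = now or datetime.now()
        title = self.titles[title_id]
        # Drop loans whose period has run out, freeing those copies.
        title.loans = {b: due for b, due in title.loans.items() if due > now}
        if borrower in title.loans:
            return True  # this borrower already holds a copy
        if len(title.loans) >= title.owned_copies:
            return False  # every owned copy is currently on loan
        title.loans[borrower] = now + self.loan_period
        return True

    def give_back(self, title_id: str, borrower: str) -> None:
        # Returning early frees the copy for the next reader.
        self.titles[title_id].loans.pop(borrower, None)
```

With a single owned copy, a second reader is refused until the first returns the book or the loan period runs out. A library that owns three physical copies would simply pass `owned_copies=3`.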

Our experience has been replicated by other early adopters and providers of a CDL service. Here’s a list of some of them. We believe every library can transform itself into a digital library. If you own the physical book, you can choose to circulate a digital version instead.

We urge more libraries to join Open Libraries and lend digitized versions of their print collections, making more copies of books available for loan and getting more books into the hands of digital readers everywhere.

30 thoughts on “Most 20th Century Books Unavailable to Internet Users – We Can Fix That”

  1. Lost in the 21st century

    [quote]
    This type of service has come to be known as controlled digital lending (CDL). The Internet Archive has been doing this for years.
    [/quote]

    Yes, you have. Unfortunately, your particular implementation of CDL seems to be wholly dependent on my first installing Adobe Reader. Alas, my computer’s 12-year-old OS (OS X 10.5/Leopard) will not allow me to install a version of Adobe Reader that would allow me to borrow books from the Internet Archive (Oh, how I would love to be mistaken about this!).

    If you have ever stood outside a closed bookstore looking in the window, then you know just how I feel. I can visit the Internet Archive all I like via TenFourFox (a browser still being supported for OS X 10.4/5), but nary a book can I hope to borrow.

    Is the IA working on eventually implementing CDL in a way that does not depend on an end user installing Adobe Reader? Would it be possible for the IA to lend books using, say, an encrypted cookie, instead, so that one only needs a web browser to borrow books?

  2. Adam

    Short commercial-life, long copyright lifespan. Same goes with software, called abandonware. Video games you cannot obtain legally have a similar story with books.

    1. Adam S Yik

      Sadly, unlike with books, the law prohibits even personal copies of software. Books can get 3 copies, which is beneficial since most old books use outdated technology that causes them to age faster, and following this exception keeps them from being lost.

  3. Ed Brown

    In the 19th Century, as public education gradually became universal in Europe and North America, Australia, NZ and elsewhere, public libraries were seen as a solution to the problem that although most people could now read and write, many could not afford to own books – not in any quantity – because they were quite expensive to buy. Libraries were a solution to the problem, because one book could sit on a shelf, and over many years be read by thousands of people.

    Today, libraries are no longer the solution: they are now the problem. Terrified of the fact that technology has rendered them obsolete, they now want to prevent one book being read by many people.

    Publishers and authors only sell any one book for maybe a couple of years. Very few (paper) books remain in print 5 years after publication. Very few authors make money on a book after that point, because of that. But libraries and publishing companies want to prevent public access to out of print books, except through them.

    They feel threatened by the new technologies that make digital access available to every one of their customers in their own homes. Copyright used to be their way of locking other commercial publishers out of the marketplace, but now that anyone with a home computer can publish any book that exists as a digital edition, those who were formerly their customers have become their competitors, and they want to oppose that, as it threatens their survival. But it means locking their customers out of the marketplace, which has unexpectedly become a vastly bigger marketplace.

    They are dinosaurs, and they have no future, they just can’t accept it. Publishers will probably survive, because they create new product (books). But libraries, which don’t create anything, probably have no future. The future is online, not in a 19th century mausoleum.

    It would make sense to abolish the obsolete copyright laws, and replace them with a term not exceeding 5 years, which is the longest that 99 percent of (paper) books remain in print. That would help preserve publishers. But libraries, in their 19th century form, look to have no future.

    1. Carl A. Librarian

      I believe your evaluation of libraries as an enemy is misinformed.

      Libraries now provide access to both digital and print versions of published works. They circulate both and concentrate on providing the most service to their communities.

      In fact libraries can purchase access to multiple digital copies of a title, often less expensively than buying an equivalent number of print copies.

      Libraries are coping with some of the same problems of availability as the general public, plus publishers’ restrictive policies. There is also the problem that once a digital book is “purchased” the supplier may be able to discontinue access to it.

      I do agree that we should probably review the copyright laws. But, we should remember that it exists to ensure there is an incentive and a benefit to authors who produce valuable content.

      It is not so simple as we all wish.

      1. CarynW

        Thanks for saying that!

        I think libraries are in the right place at the right time now. Librarians don’t see their patrons as the enemy; if there is an enemy in this, it’s the publishers, with their fears that more than one reader per book will have a negative impact on their sales. Library patrons are justifiably angry when only one “copy” of a digital book can be read at a time, when everyone knows that they could lend millions of copies concurrently if the publisher allowed it.

        But I really don’t want to paint anyone as an enemy. I think it would be difficult to make a case for leaving the copyright laws as they are – we need to fix them, and soon. Then libraries will be able to return to their mission of supplying all patrons with all reading materials, and all books that deserve to be read will be.

  4. آهنگ جدید

    It would make sense to abolish the obsolete copyright laws, and replace them with a term not exceeding 5 years, which is the longest that 99 percent of (paper) books remain in print

  5. Toni

    I can’t count the number of times I have wanted to read an out of print history to use in genealogy research only to hit a new brick wall. The book is still in copyright but not for sale anywhere and the two libraries that have it are in another country. If every library would lend to all the other libraries I would have a chance of seeing that book.

    1. K Manion

      Most libraries participate in an “interlibrary loan” system that allows them to borrow books from other libraries across the country (sometimes a small fee is charged). Outside of the country can be more difficult. However, you could contact the owning library to ask if they would send photocopies or scans of the relevant pages. Most libraries are eager to help with information requests.

  6. آهنگ جدید

    Short commercial-life, long copyright lifespan. Same goes with software, called abandonware. Video games you cannot obtain legally have a similar story with books.

  7. Ed Brown

    The initiative by the Internet Archive to digitise/digitize books is a terrific step forward. It moves libraries on-line, by enabling anyone with internet access to read a book held in the Archive’s online collection from anywhere in the world.

    The only negative aspect of the project is the restriction that a book in the collection can only be accessed by one user at a time. Presumably this is done to comply with the copyright law in certain countries, by providing the same service as in an ordinary library, where only one borrower at a time can take a (printed) book out.

    However, the restriction makes no sense in the new online world: a book no longer need be borrowed only by one user at a time: the technology has now abolished that limitation, since a “book” is no longer a collection of paper pages glued together.

    As the book can only be read online, in the web browser, the notion of restricting viewing to one user at a time is not necessary. It does not matter that two or more users are reading it, since it cannot be taken away (i.e. cannot be downloaded).

    What has been done, by limiting access to reading it only in a browser, is to create, in effect, a Reference library. In that context, there is no copyright issue, because the book never leaves the “building”. So there is no point in restricting access to one reader at a time.

    There is no value in having the new technology but not using it. It is not desirable to retain the old ways just for the sake of it. In a digital library, single-user access is undesirable, and should not be enforced just for the sake of “doing things the way they’ve always been done”.

    1. Nathan

      The last time I borrowed a book via archive.org or openlibrary.org, I was able to download the encrypted ebook and read it offline for 14 days.

  8. licensenod32

    Thank you for the useful information you provided
    Most of the books are copyright
    I hope all books will be available to all people
    It should also respect the rights of the journalists
    thank you

  9. Emilio de Gogorza

    Is it possible for me as a private person and a collector of music books to “open” a library at Archive.org in order to make rare books, which are still under Copyright restrictions, available to other collectors? If yes, how can I do this?

    1. Brewster Kahle Post author

      We have not started working with individuals that lend from their collections through the mechanisms on Archive.org. You can do it yourself, or if you are ever interested, you could donate the books to the Internet Archive.

      1. Elmer Steele

        Some folks might consider their books like their children… not to be parted from until circumstances allow nothing less. However, that raises the question of whether you have a process in place whereby folks who have no one who would appreciate such a ‘gift’ can present you with their ‘library’ of physical specimens. Certainly some sort of list of library contents could be provided by individuals to predetermine whether any items might be worthy of preservation (not a duplicate). Subsequently, arranging a distribution to the archive could be planned. Just a thought. Another in a similar vein is that Crowd-Sourcing donors in larger cities might produce a wealth of similarly adoptable or orphaned books, recordings, etc. The latter logically requires an infrastructure of volunteers to consolidate and intermediate.

  10. Ryan H

    What I don’t understand is why there are so many duplicates of scanned books on the CDL platform. It seems every public library is tasked with scanning their book collection, even if a scanned version of the same book already exists on archive. Why can you not use another scan of a book in CDL? No one is going to want to go through the cost of digitization when it’s already been done somewhere else on the internet, for pete’s sake. This type of redundancy is an impediment to widespread adoption by many public libraries. Is it because of legal reasons? The D.R.Y. (don’t repeat yourself) principle seems particularly relevant here.

    1. Ryan H

      It would also be helpful if, instead of many redundant links to multiple copies of the same book, there was a single page for each book, from which you could “check out” the book from whichever public library in the nation hosts a copy. I maintain many lists of archive.org books available for CDL on my github page, and I’m responsible now for continuously updating the list each time a new scan comes out (I, for example, have six different links for a single book on my custom syllabi). This means I have to continuously check archive.org for new scans if I ever want to add a copy. If there was a single link for each book, with discrete copies available from that single source at various public libraries, I could just link to that single archive.org page for each book on my curriculum. This “multiple links for each book” is somewhat unsustainable, and I end up missing links to new copies of books just because I can’t continuously go through each book on my syllabi and check if there are any new copies (let alone determine if I have already linked to the previous copies already on my syllabi). Just a note from someone who is creating open source curriculum using archive.org CDL books. Obviously I could just add a note to the top of my repos that says something like “If there are no available copies of a book at all of the links provided for each book, be sure to re-search for that book on archive.org as there may be new links to new copies that I haven’t yet included on the list.” But my ideal situation would be something like worldcat, where there’s a single point of entry for each book that I can link to on my syllabi with a listing of each public library on archive that makes the book available for checkout.

      Thank you for the service.

    2. Brewster Kahle Post author

      We try not to digitize the same book twice unless there is some reason, like a special edition. But we have not always succeeded, as you have found.

  11. Preziosia

    My ONLY problem with Archive.org is the fact that you often make patrons “check out” books that are already, at least in the legal sense, no longer valid, time wise, in regards to their publication dates.

    Copyright neutral titles should be screened and put up for public consumption, i.e., direct “Downloading”, rather than making someone “check the book out”.

    I do however, love and support Archive.org, warts (and it has many, but in the name of Free Speech, I [at least try to] understand) and all.

    Keep up the good work, but by the same token, please try and upload tomes that can be downloaded freely.

    BTW…either Microsoft or Google Chrome do NOT like Archive.org. Whenever I stay on site beyond 5-7 minutes, “bubbles” appear on the right upper corner of my PC screen and forcibly mess with my ability to explore your site, due to constant screen shrinkages or even kicking me off, at times, forcing me to wait a few minutes in order to restart the process.

    If I’m Downloading, it will “TRY” the same tactics, but cannot/will not kick me off because of the downloading process itself.

    Have you heard of any similar instances or complaints?
