Why Preserve Books? The New Physical Archive of the Internet Archive

by Brewster Kahle, June 2011. Press on this: NYTimes

Books are being thrown away, or sometimes packed away, as digitized versions become more available. This is an important time to plan carefully for there is much at stake.

Digital technologies are changing both how library materials are accessed and, increasingly, how library materials are preserved. After the Internet Archive digitizes a book from a library in order to provide free public access to people world-wide, the book goes back on the shelves of the library. We noticed an increasing number of these libraries moving books to “off site repositories” (1 2 3 4) to make space in central buildings for more meeting spaces and work spaces. These repositories have filled quickly and sometimes prompt the de-accessioning of books. A library that would prefer not to be named was found to be thinning its collections and throwing out books based on what had been digitized by Google. While we understand the need to manage physical holdings, we believe this should be done thoughtfully and well.

Two of the corporations involved in major book scanning have sawed off the bindings of modern books to speed the digitizing process. Many have a negative visceral reaction to the “butchering” of books, but is this a reasonable reaction?

A reason to preserve the physical book that has been digitized is that it is the authentic and original version that can be used as a reference in the future. If there is ever a controversy about the digital version, the original can be examined. A seed bank such as the Svalbard Global Seed Vault is seen as an authoritative and safe version of crops we are growing. Saving physical copies of digitized books might at least be seen in a similar light, as an authoritative and safe copy that may be called upon in the future.

As the Internet Archive has digitized collections and placed them on our computer disks, we have found that the digital versions have more and more in common with physical versions. The computer hard disks, while holding digital data, are still physical objects. As such we archive them as they retire after their 3-5 year lifetime. Similarly, we also archive microfilm, which was a previous generation’s access format. So hard drives are just another physical format that stores information. This connection showed us that physical archiving is still an important function in a digital era.

There is also a connection between digitized collections and physical collections. The libraries we scan in rarely want more digital books than the digital versions that we scan from their collections. This struck us as strange until we better understood the craftsmanship required in putting together great collections of books, whether physical or digital. As we are archiving the books, we are carefully recording with the physical book the identifier for the virtual version, and attaching to the digital version information about where the physical version resides.

Therefore we have determined that we will keep a copy of the books we digitize if they are not returned to another library. Since we are interested in scanning one copy of every book ever published, we are starting to collect as many books as we can.

We hope that there will be many archives of physical books and other materials, as they will be used and preserved in different ways based on the organizations they reside in. Universities will have different access policies from national libraries, say, and most likely different access policies from the Internet Archive. With many copies in diverse organizations and locations we are more likely to serve different communities over time.

Physical Archive of the Internet Archive

Books are cataloged, and have an acid free paper insert with information about the book and its location

Internet Archive is building a physical archive for the long term preservation of one copy of every book, record, and movie we are able to attract or acquire. Because we expect day-to-day access to these materials to occur through digital means, our physical archive is designed for long-term preservation of materials with only occasional, collection-scale retrieval. Because of this, we can create optimized environments for physical preservation and organizational structures that facilitate appropriate access. A seed bank might be conceptually closest to what we have in mind: storing important objects in safe ways to be used for redundancy, authority, and in case of catastrophe.

The goal is to preserve one copy of every published work. The universe of unique titles has been estimated at close to one hundred million items. Many of these are rare or unique, so we do not expect most of these to come to the Internet Archive; they will instead remain in their current libraries. But preserving over ten million items is possible, so we have designed a system that will expand to this level. Ten million books is approximately the size of a world-class university library or public library, so we see this as a worthwhile goal. If we are successful, then this set of cultural materials will last for centuries and could be beneficial in ways that we cannot predict.

To achieve the goal of long-term preservation we have assumed:

  • Infrequent access,
  • Management of millions of books, records, and movies,
  • Adaptation to the needs of different physical media and collection value,
  • Storage evolution, by monitoring existing systems and introducing new ideas,
  • Adaptation to multiple facilities in different environments, and
  • Sustainability from a financial and maintenance perspective.

Boxes then store approximately 40 books with labeling on the outside

To start this project, the Internet Archive solicited donations of several hundred thousand books in dozens of languages in subjects such as history, literature, science, and engineering. Working with donors of books has been rewarding because an alternative for many of these books was the used book market or being destroyed. We have found everyone involved has a visceral repulsion to destroying books. The Internet Archive staff helped some donors with packing and transportation, which sped projects and decreased wear and tear on the materials.

These books are digitized in Internet Archive scanning centers as funding allows.

To link the digital version of a book to the physical version, care is taken to catalog each book and note its physical location so that future access can be enabled. Most books are cataloged by finding a record in existing library catalogs for the same edition. If no such catalog record can be found, then the book is cataloged briefly in the Open Library. Links are made from the paper version to the digital version by printing identifying and catalog data on a slip of acid free paper that is inserted in the book. Linking from the digital version to the paper version is done by encoding the location into the database records and identifiers into the resulting digital book versions. The digital versions have been replicated and the catalog data has been shared.
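
The two-way link between paper and digital copies can be pictured with a small sketch. Everything in it is hypothetical and for illustration only: the field names, the identifier "exampleBook00", and the container/pallet/box location format are assumptions, not the Archive's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    """Where the paper copy sits (hypothetical fields)."""
    container: str
    pallet: str
    box: str


@dataclass
class DigitalRecord:
    """Database record for the digital version (hypothetical fields)."""
    identifier: str
    title: str
    location: PhysicalLocation  # digital -> paper link


def make_slip(record: DigitalRecord) -> str:
    """Text for the acid free slip placed in the book (paper -> digital link)."""
    loc = record.location
    return (
        f"Identifier: {record.identifier}\n"
        f"Title: {record.title}\n"
        f"Location: {loc.container}/{loc.pallet}/{loc.box}"
    )


# Made-up example values.
record = DigitalRecord(
    identifier="exampleBook00",
    title="An Example Title",
    location=PhysicalLocation(container="C001", pallet="P07", box="B19"),
)
print(make_slip(record))
```

The point of the sketch is only that each side carries a pointer to the other: the slip in the book names the digital identifier, and the digital record names the shelf position.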

Pallets hold 24 boxes each, and are the stable location unit

Most of these first books have been digitized with funding from stimulus money for jobs programs and funding from the Kahle/Austin Foundation. This served to build the core collection of modern books for the blind and dyslexic. Many of these digital books are also available to be digitally borrowed through the Open Library website.

This was a change from our previous mass digitization procedures, in which a library would deliver books to our scanning centers and retrieve them afterwards. Where the libraries would have already done the sorting and de-duplication of books, we now need to do these functions ourselves. The process to identify titles that have not been preserved already is now in place, but is in active development to improve efficiency. The thorough work of libraries in cataloging materials is key in this process because we can leverage it for these books. Identifiers such as ISBN, LCCN, and OCLC ids have helped determine which books are duplicates.
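
As a rough illustration of identifier-based de-duplication, here is a minimal sketch. It is not the Archive's actual process: the record layout, the sample values, and the simple normalization rule are assumptions, and it does not reconcile ISBN-10 with ISBN-13 or handle cases where one identifier covers several editions.

```python
from typing import Dict, Iterable, List, Tuple

ID_FIELDS = ("isbn", "lccn", "oclc")


def normalize(value: str) -> str:
    """Drop hyphens and spaces so '0-306-40615-2' matches '0306406152'."""
    return value.replace("-", "").replace(" ", "").upper()


def find_duplicates(books: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (duplicate title, first-seen title) pairs.

    Two books count as duplicates if they share any one identifier.
    """
    seen: Dict[Tuple[str, str], str] = {}
    duplicates: List[Tuple[str, str]] = []
    for book in books:
        keys = [(f, normalize(book[f])) for f in ID_FIELDS if book.get(f)]
        match = next((seen[k] for k in keys if k in seen), None)
        if match is not None:
            duplicates.append((book["title"], match))
        owner = book["title"] if match is None else match
        for key in keys:
            # Remember every identifier so later copies can match on any of them.
            seen.setdefault(key, owner)
    return duplicates


# Made-up sample records.
books = [
    {"title": "Copy A", "isbn": "0-306-40615-2"},
    {"title": "Copy B", "isbn": "0306406152", "oclc": "12345"},
    {"title": "Copy C", "oclc": "12345"},
    {"title": "Other", "lccn": "99-000001"},
]
print(find_duplicates(books))
```

In this sample, "Copy C" is caught even though it has no ISBN, because "Copy B" tied the OCLC id to the same title first. That is why having several kinds of identifier helps.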

In January of 2009, we started developing the physical preservation systems. Fortunately there is a wealth of literature on book preservation documenting studies on the fibers of paper as well as results from multi-year storage experiments. Based on this technical literature and specifications from depositories around the world, Tom McCarty, the engineer who designed the Internet Archive’s Scribe book-scanning system, began to design, build, and test a modular storage system in Oakland, California. This system uses the infrastructure developed around the most used storage design of the 20th century, the shipping container. Rows of stacked shipping containers are used like 40′ deep shelving units. In this configuration, a single shipping container can hold around 40,000 books, about the same as a standard branch library, and a small building can hold millions of books.
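
For a sense of scale, the approximate figures given in this post (about 40 books per box, 24 boxes per pallet, around 40,000 books per container, and a target of ten million books) can be combined in a back-of-the-envelope calculation. The derived numbers are rough estimates that follow from those rounded figures, not stated design parameters.

```python
import math

BOOKS_PER_BOX = 40             # "approximately 40 books" per box
BOXES_PER_PALLET = 24          # pallets hold 24 boxes each
BOOKS_PER_CONTAINER = 40_000   # "around 40,000 books" per container
TARGET_BOOKS = 10_000_000      # the ten-million-item goal

books_per_pallet = BOOKS_PER_BOX * BOXES_PER_PALLET
# Implied by the rounded figures above; not a number stated in the post.
pallets_per_container = BOOKS_PER_CONTAINER / books_per_pallet
containers_for_goal = math.ceil(TARGET_BOOKS / BOOKS_PER_CONTAINER)

print(f"Books per pallet: {books_per_pallet}")
print(f"Implied pallets per container: about {pallets_per_container:.0f}")
print(f"Containers for ten million books: about {containers_for_goal}")
```

On these figures a pallet holds about 960 books, and ten million books would need on the order of 250 containers.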

Modified 40′ shipping containers are used for secure and individually controllable environments of 50 or 60 degrees Fahrenheit and 30% relative humidity

Based on this success and the increasing availability of physical materials, a production facility leveraging this design will be launched in June of 2011 in Richmond, California. The essence of the design from the book’s point of view is to have several layers of protection, each able to be monitored and periodically inspected:

  • Books are cataloged, and have acid free paper inserts with information about the book and its location,
  • Boxes store approximately 40 books with labeling on the outside,
  • Pallets hold 24 boxes each,
  • Modified 40′ shipping containers are used as secure and individually controllable environments of 50 or 60 degrees Fahrenheit and 30% relative humidity,
  • Buildings contain shipping containers and environmental systems,
  • Non-profit organizations own and protect the property and its contents.

Buildings contain shipping containers and environmental systems

This physical archive is designed to help resist insects and rodents, control temperature and humidity, slow acidification of the paper, protect from fire, water, and intrusion, contain possible contamination, and endure possible uneven maintenance over time. For these reasons the books are stored in isolated environments with a regulated airflow that depends on few active components.

Non-profit organizations own and protect the property and its contents

The Internet Archive is now soliciting further donations of published materials from libraries, collectors, and individuals.

This collection and methodology have already helped in mass digitization and preservation, and we hope that they will offer a wealth of knowledge to future generations.

Thank you to Tom McCarty, Robert Miller, Sean Fagan, Internet Archive staff, San Francisco Public Library leadership, Alibris, HHS of the City of San Francisco, and the Kahle/Austin Foundation for being leaders on this project.

238 Responses to Why Preserve Books? The New Physical Archive of the Internet Archive

  1. Bob says:

    Isn’t this duplicating what the Library of Congress already does…at least for US books?

    • brewster says:

      Yes, the Library of Congress is doing a great job. We believe we have a role by being a very different organization. We believe we will provide access to different groups and different approaches to preservation which can be valuable.

      But it is really just a guess. Still, let’s keep the books we digitize, and let’s try our best to keep copies for the future.

      -brewster

      • Larry Moniz says:

        Sounds like another scam, like Google’s to acquire intellectual property and pass it around WITHOUT conforming to U.S. copyright laws and also paying royalties to authors. Please consider this notification that I prohibit you from copying and retaining any of my work in any form whatsoever.

        • Common sense says:

          How do you feel about the used book store?
          Suppose I come across one of your books at the used book store that you have already been paid royalties for, buy it, and take it home.
          Who are you to decide what I do with it?
          If I choose to keep it in a sealed container, or choose to shred it and use the pulp to line my hamster’s cage, you have absolutely no legal right to dictate what I choose to do with it.
          As an author myself I am delighted to hear that my ideas could be preserved indefinitely at absolutely no cost to me.
          Keep up the good work fellas and maybe I’ll be lucky enough to have one of my works included in your preservation project to ensure that my ideas and name will be known to future generations, long after I’m dead and my precious royalties mean absolutely nothing.

        • Dave says:

          Ha – Larry….

          I just looked at your bibliography and I can say with certainty that the future doesn’t need to know about your work. I hope the archive complies with your selfish request and keeps all of that drivel away from the hands of future historians and scholars. In 50 years when your work is public domain, nobody will remember and nobody will care…. sort of like now.

          Archive.org is doing a great service here and you are a shortsighted fool not to support all this organization does.

        • Troy Truchon says:

          Did you just pick a random article in their blog to post a paranoid right wing rant to? Did you actually read the post? They are only PHYSICALLY archiving works they legally own.

          They have the ultimate goal of digitally archiving every book ever written, but that’s not the primary focus of this article. And for that matter it is neither illegal nor dubious nor a “scam” to digitally archive a book, with or without the author’s permission.

          As near as I’m able to tell from my many spelunking adventures through the Internet Archive they have conformed with copyright law in all media they distribute, and it is within the realm of distribution that Copyright law concerns itself.

          And as an aside, if they legally purchase any of your “works” they can freely archive, copy, or do whatever the hell they wish with them provided they do not distribute them without permission.

        • Jon Gustafson says:

          Dear Larry.
          I cannot see any reason for NOT saving books for the future, given Mr. Kahle’s statement “If we are successful, then this set of cultural materials will last for centuries and could be beneficial in ways that we cannot predict.” That includes your books. Since you are an author, I reckon it will be a cultural and historical statement in your and your descendants’ favor. Regarding royalties, copyright laws are likely to exist in the future! I think of this project as one of the most impressive cultural investments in our future, provided, of course, that the idea is not abused in any economically illegal or selfish form.
          Regards/Jon

      • Larry Moniz says:

        It is not your right under the law to be the guardian of others’ intellectual property. In my opinion as an author, journalist and publisher, what you are doing is pirating intellectual property. Even the Library of Congress doesn’t actively seek works from all sources. It accepts works from the author or publisher as part of copyright law. You try to sound holier-than-thou and well intentioned, but what’s your real agenda? You also say this is being paid for by “funding from stimulus money for jobs programs.” Sounds like a flagrant misuse of stimulus funds.

        • Larry, sorry to be so blunt, but your comments are ill informed. This project is quite different from the Google Book Scanning project. Visit the Archive in San Francisco, and prepare to be amazed.

          Daniel Erasmus
          Amsterdam, The Netherlands

        • Olivia says:

          Stop being such a capitalist. The importance of this project is apparent. As a published author myself, I think it’s an incredible idea. While the authors of said books will not receive royalties–monetary or otherwise–their work instead can be eternalized. A fair trade-off in the name of future historical accuracy, if I do say so myself.

        • Really? says:

          I suppose the best thing for them to do if they come across one of your books is to burn it.

        • Maureen O'Brien says:

          Are you folks above seriously protesting the owning and storing of physical books? Do you also run around torching bookstores, or what?

          • Teresa says:

            Maureen,

            That was my feeling / thought as well. Crazies come in all forms, and paranoia latches onto anything.

            Teresa

        • mas says:

          I work at a major university who worked with the Internet Archive (IA) as an affiliate through the Open Content Alliance a few years back, and unless the IA’s policies have changed they expressly did not digitize any book that was still in copyright – only in the open domain. When it came to copyright IA played by the rules.

        • David Thier says:

          I am an avid reader of books, magazines and comic books, all of which are being digitized in this manner by independent libraries or by privately owned companies such as Marvel Comics and DC, which store originals and keep backup digitized copies of their output. What these people are doing is a great idea. I personally enjoy the original “paper” format and always will. I simply enjoy holding a book in my hands and turning the pages, and the smell of paper is also something I would miss, as well as having immediate access to my bookcases that I love for their appearance. I will not buy an E-book reader, don’t want one. They might be great for students. That being said, if I ever run across any of your “Works”, I will give you the option of buying it back. That’s my offer. If you don’t find this palatable, then I hold the right to do whatever I want with it, including copying and storing it or shredding it or giving it away for free to anyone I want. Will that be fine with you, Larry? Fact is there is nothing you can do about it and that bothers you. Sorry Larry.

        • Keith says:

          If they buy or are given a book, a real book, they get to keep it. Pretty sure that’s what “possession” means. What is your problem with them collecting books?

        • wap-tek says:

          you have the right to first sale ONLY , SHUT UP or you will become as famous as Metallica VS napster

      • Teresa says:

        Hi,
        I think this is an amazing project, and I’m astounded at the paranoia-filled and hateful people who apparently just must be nasty for the heck of it.

        I have valued books since my mother taught me to read long before I started school, and I am so grateful that you are preserving the physical objects that seem to be getting less and less important to so many people.

        Again, great project!

        Teresa

      • All very interesting, but nowhere do I see any more of an address than “Richmond, CA”, so how do people send you books if you aren’t bright enough to assure your address appears everywhere? I’ve one science fiction novel (“Time for Patriots”), two books on astronomy, and a planned anthology of short stories I would send if I knew where.

        • Keith says:

          I likewise have a few books that I would gladly donate to anyone, as long as I knew that it was going to be kept or offered and not simply re-sold or trashed.

          At the same time, this project certainly doesn’t need an influx of duplicate books. They don’t need 5,000 copies of Going Rogue, for example.

          But if it were possible to a) personally donate small sets of or individual books and b) have a system by which it could be determined if you already had the book, I think this project would become very rich in books indeed.

    • p. says:

      I have a recently received MSIS degree. One of the things you learn about digitizing records is that creating a digital copy doesn’t mean you should chuck out the physical copy. Having a few archives that preserve the same material is a good thing. It plans ahead for disaster if something were to happen to one archive and also makes accessibility to physical objects wider as you can spread them out geographically.

    • MaestraR says:

      I would really like it if this article could be proofread by someone with English as a first language, in conversation with the author. It is an incredibly interesting, thoughtful and important discussion I think and yet it simply doesn’t make sense in several crucial parts.

    • Homer Goodall says:

      Many of the ones in the Library Of Congress are falling to dust. Reason? They had been printed on pulp paper. Older books still are printed on hemp paper. Hemp paper will last a long long time.

    • David S. says:

      The Library of Congress throws out a lot more works than you think. They don’t bother to keep all but a very few roleplaying books, and I suspect whole other genres I’m not interested in just get tossed. They tossed the 1890 census, for crying out loud.

  2. Pingback: Why preserve books? The new physical archive of the Internet Archive, by Brewster Kahle | Ebooks on Crack

  3. Marijane says:

    Bob,

    see http://www.loc.gov/about/faqs.html#every_book

    also, from http://www.loc.gov/loc/legacy/colls.html:
    “The Library’s role as a copyright depository has contributed to the popular belief that it contains one copy of every book published in the United States. It does not. Its collections are the most comprehensive in the country, but it is not a library of record in the legal sense; it is not required to retain all copyright deposits and, except for the period 1870-1909, it has never attempted to do so.”

  4. Jonathan says:

    The Library of Congress does not preserve copies of every book published. It is the research library of the Congress.

    From the LOC’s website: “The Library receives some 22,000 items each working day and adds approximately 10,000 items to the collections daily. The majority of the collections are received through the Copyright registration process, as the Library is home to the U.S. Copyright Office. Materials are also acquired through gift, purchase, other government agencies (state, local and federal), Cataloging in Publication (a pre-publication arrangement with publishers) and exchange with libraries in the United States and abroad. Items not selected for the collections or other internal purposes are used in the Library’s national and international exchange programs. Through these exchanges the Library acquires material that would not be available otherwise. The remaining items are made available to other federal agencies and are then available for donation to educational institutions, public bodies and nonprofit tax-exempt organizations in the United States. “

  5. James says:

    If you want to avoid anobium punctatum and xestobium rufovillosum, then why not cool to 40 degrees Fahrenheit and 10% humidity?

    • brewster says:

      Good idea. Reducing the relative humidity that far has been recommended. We are going to be working on cost effective ways of dropping the humidity and will try to get that low. As we understand the Library of Congress does, when we want to access the books again after they have been in such a dry environment, they will need to slowly re-hydrate. Pretty nifty.

      • Library of Congress Preservation Directorate says:

        In regard to the June 7th comments referencing the Library of Congress, we at the Library would like to provide the following point of clarification on our temperature and relative humidity (RH) controls: The Library of Congress has a state-of-the-art, specialized facility in Fort Meade, Maryland, designed to efficiently hold library collections at a controlled temperature and relative humidity optimized for the long-term preservation of the collections. The specific temperature and relative humidity (RH) for the storage of different materials (50 degrees F and 30% RH for books and paper materials; 35 degrees F and 30% RH for black and white photographs, microfilm, and microfiche; 25 degrees F and 25% RH for photographic negatives, transparencies, and color prints) effectively prolong the useful life of the collections (by reducing the rate of chemical degradation) and follow existing ISO standards. Collections are removed from the cool module and from the cold and freezing vaults to a staging area, which allows the materials to acclimate to a warmer temperature without the formation of condensation (http://www.loc.gov/preservation/)

    • As an information scientist, I think preserving resources is very important.

  6. Carlos May says:

    Excellent. A huge amount of data was lost in the last big transfer of info from paper to “the medium of the future” — microfilm. Let’s not make the same mistake twice.

  7. Liz Ardly says:

    There’s a family in Saskatchewan who just took over a neighbour’s 350,000 strong collection of rare and vintage books. They appear to be harrowed at the task of rescuing this collection. You can read about it on CBC here.

    Is there something that can be done to connect these two problems?

  8. Steve says:

    This story might be of interest: 350,000 books in immediate threat of being burned(!) and rescue needed. Some rare and vintage.

    http://www.cbc.ca/news/canada/saskatchewan/story/2011/06/06/sk-book-collection.html

  9. Until a digital system can be designed that isn’t open to file corruption or human error, and that has a long-term shelf life, it is essential to retain paper copies. Otherwise you risk losing masses of data in the migration to the next system (as CarlosMay says).

  10. Pingback: Marilyn Johnson, Brewster Kahle and the risks of leaping into digital with both feet « Librarian of tomorrow

  11. Pingback: Why Preserve Books? The New Physical Archive of the Internet Archive | Internet Archive Blogs « am i?

  12. Pingback: Daily Digest for June 7th « Jason Kucsma

  13. Russ McClay says:

    Kudos to anyone involved in this important work.

  14. Justin Sherrill says:

    Have you considered using other spaces for storage? Underground storage may have some cost advantages. (I work at a salt mine, where it’s always 60 degrees and 20% humidity.) Of course, there may not be the right sort of geology near you for easy access.

    • I have seen pictures of salt mine storage… nifty.

      Are there ones available to buy?

      -brewster

      • Eleanor Cook says:

        I think if you plan to store books archivally at this massive level, you do need to think about the location. Not that there is any one perfectly safe place, but San Francisco does have a high probability of having earthquake damage. Just sayin’…

      • Sandeep says:

        I think this storing of a physical archive is fantastic. Do you accept credit card or cash donations ?

      • I believe both the government and private corporations have bought space in these underground facilities. There are several, but I know of one in particular in Pennsylvania (Iron Mountain is the company, the storage space is in Boyers, PA), and there are more companies: Underground Vaults and Storage and Hunt Midwest SubTropolis, for example.
        It may be beneficial to have such an important depository of texts split in two or more parts. I know that film archives from movie studios, Army documents, and Bill Gates’ Corbis photography collection are stored this way.

        I am very impressed with the scope of the project- and if there is a way to volunteer my time and/or books, I would love to be able to contribute.

      • Kerry Burns says:

        There are abandoned limestone mines in the Bluffs of the Mississippi River not far from St Louis. These would be better than abandoned coal or metal mines (no toxic vapours or waters). They also have lower seismic risk than Richmond.

      • Eric Seale says:

        Brewster — get thee to Hutchinson (Kansas). Lots of space for lease in mined-out salt mines there. There’s a museum to see:

        https://en.wikipedia.org/wiki/Kansas_Underground_Salt_Museum

        …and a company set up “next door” for archival storage:

        http://www.undergroundvaults.com/

  15. Pingback: Internet Archive Turns To Books | WebProNews

  16. magscanner says:

    Wonderful! But … you need to build a second one, identical in content, somewhere far away in a safe location. I know, I know; and I’m sorry to have to say this; but it’s true.

    One really big fire, and everything’s gone. A single copy is never enough.

    • Kent Pitman says:

      Important works could and should be stored redundantly in places that vary in climate and politics, but for the great mass of literature in an archive that large, it might be sufficient to just store parts of it in different locations so risk was statistically distributed. In a sense, there is already a great deal of redundancy built into the stuff mankind has written even without duplicating individual works.

      Also, my mind keeps coming back to a repository, perhaps another copy of the subset of important works, buried on the Moon, perhaps with an attached radio beacon, just in case of catastrophe on Earth, so there’s a record that we were ever here. A one-way robotic mission to the moon to accomplish such a task might be an interesting challenge. (Figuring out what kinds of information to include as a rosetta stone for aliens to help them understand such a repository if they ever encountered it would be a fun problem, too.)

  17. Vacuus says:

    Have you considered preserving software, while you’re about it? I’m thinking of pre-1990 software on original media. It has begun to bother me increasingly that the traditional library system with all the preservation, indexing and legal resources at its disposal has no interest in hundreds of thousands of software titles which are in danger of being lost to the world.

  18. Pingback: Internet Archive: konténerbe kerülnek a nyomtatott könyvek | Gépnarancs

  19. Dr. Q says:

    Yeah, put all the books together.

    It will be, then, easier to destroy as many of them as possible in one hit, or by an accident (the typical self-fulfilling prophecy of a “single point of failure”).

    First step to rewrite history.

  20. Pingback: Digitizing books « Tales From An Open Book

  21. Jim A. says:

    Are you making any attempt to deal with the problem of brittle books? ISTM that the book will slowly acidify over time. After a mere century you’d be left with a collection of pages that you can hardly touch without breaking them. Most of the wood-pulp paper used for the last 100 years or more just isn’t a particularly good archival format. That’s one of the main reasons WHY libraries and archives have been microfilming and scanning books in the first place.

    • christie says:

      According to Nicholson Baker’s book Double Fold, only the edges of bound newspapers (v.v. acidic, lower-quality paper) have been shown to deteriorate in that way — unless they are specifically subjected to the stress test of creasing a corner over and over until it breaks. Even without careful control of temperature and humidity, the middles of the pages were preserved in a largely anaerobic state by the pressure of being stacked together.

      True that papers with a higher rag content (cotton/linen) will hold up better, but under mass storage conditions there isn’t actually a huge problem.

  22. Chris C. says:

    Jim A: The brittle book problem is a result of residual acid in the wood pulp paper left as part of the manufacturing process. The Internet Archive is specifically using acid-free paper to avoid that problem.

    • David says:

      Chris,

      Yes, they use acid-free paper for the cataloging process, but the books themselves are still acid paper. That is why they must keep them in a place with tightly controlled temperature and humidity, to help prevent the paper from breaking down.

  23. Pingback: Internet Archive to preserve print

  24. Pingback: The Internet Archive, Now Preserving Printed Books As Well — NetworqScience Blog

  25. Pingback: The Internet Archive, Now Preserving Printed Books As Well | SEO College

  26. Pingback: The Internet Archive, Now Preserving Printed Books As Well

  27. Pingback: The Internet Archive, Now Preserving Printed Books As Well | Derivations of Thought

  28. Pingback: Internet Archive becomes archive of physical books, too | Barwo Blog

  29. Bibliophile says:

    This is a great project. Thank you for starting to do this.

    Have you read Nicholson Baker’s Double Fold? http://en.wikipedia.org/wiki/Double_Fold . I wonder if any of what he discusses could have a bearing on a physical preservation project of this scale.

    As someone else mentioned, having one copy of all items is a single point of failure. Have you considered having 2 or 3 copies (assuming the items are not so rare as to be unique, or at least not have too many unique items)? And making sure the copies are distributed across, say, continents?

  30. Pingback: links for 2011-06-07 | Kuple

  31. Pingback: buba8 › Internet Archive becomes archive of physical books, too

  32. David says:

    I hope the building is earthquake proof. Though having them in cargo containers would go a long way to protect them. I would think you would do this in a more stable place than San Francisco.

  33. Pingback: The future of books = shipping containers | Ebooks on Crack

  34. Ratz says:

    The Bodleian Library in Oxford often throws out books it discovers duplicates of. I don’t suppose archive.org has a UK chapter which could potentially take them instead of letting them go to waste?

  35. Pingback: The Internet Archive, Now Preserving Printed Books As Well

  36. K says:

    Librarians recognize that the commercial companies (ie. Google books) could at any time simply turn off access to their books. Google is not a public organization working for the public good, they are a private company with the goal of growth and profits. There is nothing wrong with that, but as a library, we cannot (or wish we didn’t have to) count on the accessibility of materials through Google (or other organizations) to replace our own resources to which we can guarantee access, have a set price, and which are not going to go away if our services are not profitable.

    I think this is a great project – I just wish the author hadn’t framed so much of the article as librarians throwing away books. Libraries are not archives – they provide the information resources that are useful and used by their patrons. They are also perpetually short on funds, space, collection budgets, and staff. I don’t think they are tossing books willy-nilly into the trash!

    It is too bad that this project is not more closely tied to libraries – organizations that have been storing, cataloging and providing access to information in all formats, for free to all comers for decades. And which are dedicated to public access – not profits. I wonder who the “non-profit organizations [that] own and protect the property and its contents” are? And what, if any, their access policies for libraries and the public will be in the future.

  37. Pingback: Scanning books but keeping them, too | 4080Records

  38. Pingback: The Internet Archive is Now Going Analog - eBookNewser

  39. Jill says:

    Brewster – two things:
    1-Your comment about microfilm being the previous generation’s access format. Au contraire, mon frere – microfilm is, primarily, a preservation format, and much more stable than any digital format available. You don’t need to migrate it as platforms evolve, and, if you don’t have a microfilm reader, a good magnifier and flashlight will do in a pinch. As with books, considerations of climate control need to be addressed for long-term stability of the film. And, like the book original, it cannot be altered easily. That is why many archives microfilm their holdings for preservation, then scan the film to make the documents digitally accessible.
    2-Fire in the hole, so to speak. It would take a major effort to burn the books in your storage trailers. I don’t think people realize how hard it is to burn a bunch of tightly packed papers. Plus the oxygen in the trailers would be exhausted pretty quickly, so any fire would be short-lived. However, that being said, if you did want to create a “second silo”, microfilm is a great way to preserve a lot of information in multiple locations. Plus, they’re a lot smaller than books, so storage costs are less.
    In any case – I think this is a great project, and I’m happy that you have taken it on. Thank you!

  40. Susan says:

    I wish there was a similar project to collect and preserve film content. Nitrate and celluloid film deteriorates very quickly compared to paper, and we are losing our original cultural, social and historical resources daily. In a few decades, a digital copy may be the only access we have to those resources.

  41. Sean says:

    Are you going to scan all these books, too, or are you just going to seal them up? This could really add to the Archive’s digital collection!

    • The idea is to scan the books as funding and opportunities arise. So far, we are able to mostly keep up with the inflow. And, uh, donations are always welcome.

      -brewster

      • Sean says:

        I donate whenever I can! And it’s good to hear that! I hope you do as well as the Library of Congress!

  42. Jerry Dupont, Content Manager, LLMC says:

    Your stated long-range criteria include: “Sustainable from a financial and maintenance perspective.” Our organization stores materials in a dark-archive plan similar to what you propose at the proven rate of $0.10 per volume per year. Extrapolating our costs to a project of your size, that would come to $1 million of annual storage costs once you hit your 10-million-volume goal. How do the cost projections under your chosen plan compare? Secondly, we follow one commenter’s advice by having our facility in a salt mine 650 feet deep, so that we get near-total security plus perfect storage conditions (humidity and temperature) for free. Is it possible that salt mine storage would address most of the security concerns expressed by your other writers?

    • Thank you for sharing your costs. Are they for renting space or do you own the salt mine? As for our costs, we do not know yet. We are spending on the upfront design and build in order to try to decrease the ongoing costs, but this is still just hypothetical.

      -brewster
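Jerry Dupont's extrapolation above is simple enough to check. The following is a minimal sketch that uses only the two figures he quotes ($0.10 per volume per year, a 10-million-volume goal); the function and constant names are hypothetical, and nothing here reflects the Internet Archive's actual costs, which Brewster says are not yet known.

```python
from decimal import Decimal

# Figures quoted in the comment above; names are illustrative only.
RATE_PER_VOLUME_PER_YEAR = Decimal("0.10")  # dollars, LLMC's stated dark-archive rate
TARGET_VOLUMES = 10_000_000                 # the 10-million-volume goal


def annual_storage_cost(volumes, rate=RATE_PER_VOLUME_PER_YEAR):
    """Return yearly storage cost in dollars for a given number of volumes."""
    return volumes * rate


if __name__ == "__main__":
    cost = annual_storage_cost(TARGET_VOLUMES)
    print(f"Annual cost at 10 million volumes: ${cost:,.2f}")
```

Decimal is used instead of float so that the ten-cent rate is represented exactly.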

  43. sheri israels says:

    What a wonderful idea – there is good in the world still! I agree with magscanner though – you need 2 copies of each! All the best to you and your project – you have restored a bit of my faith in the so-called human race.

  44. Jerry Dupont, Content Manager, LLMC says:

    As to Brewster’s question on salt mines – “can one buy one?” I don’t know if you can, but space in the commercial ones is dirt cheap to rent. That’s where we achieved our $0.10-per-volume-per-year benchmark mentioned earlier. And we’re not alone. The facility where we have our dark archive is also where Hollywood stores the bulk of its past films.

  45. David says:

    What a great operation! Do you collect only books from the US? Only in English?

    All the best with this contribution to world intellectual future.

    David -Givat haviva

    • brewster kahle says:

      we would like books from every country and in every language. If you know of people that are de-accessioning collections abroad, please let us know.

      -brewster

      • Kerry Burns says:

        I’ve just come into possession of a small specialized library, about 100 books, mostly European texts in German, which provided European expertise to colonial workers. Mostly in very good condition considering publication dates back to 1830. It was taken out of Indonesia during the Japanese invasion and resided for 60 years in upstate New York and New Mexico. I would like to submit these to your physical archive. Can you give me an address to which to ship?

        • internetarchive says:

          Hi,
          Thanks so much for thinking of us to preserve and share these texts. Contributions can be sent to:
          Internet Archive
          300 Funston Avenue
          San Francisco, CA 94118

  46. Pingback: Link love: language (31) « Sentence first

  47. Pingback: Physical Archive Launch | Internet Archive Blogs

  48. Pingback: Why Preserve Books? The New Physical Archive of the Internet Archive | Internet Archive Blogs

  49. Pingback: In Defense of the Physical « metamayhem

  50. Gerald Storey says:

    Hasn’t everyone who has posted realized that we are entering another dark age, where people can make up anything and it becomes true if posted or screamed loud enough AND ignorance is worn as a badge of honor (La Palin and crowd) AND religion is confused with spirituality AND tolerance is fading from existence. Excuse me while I go offline to read a book, from my library. Everyone should build themselves a library, a worthwhile endeavor.

  51. Pingback: futureofthebook.com » Blog Archive

  52. Judy Schlosser says:

    I live in Richmond and didn’t know about the open house until now. I also have too many books! How does one donate? Do you take VHS at all?

  53. Pingback: Archive books, seeds, animals and people. « Probaway – Life Hacks

  54. Pingback: Link Love: 6/10/2011 | The Bigger Picture

  55. Peter Wellburn says:

    As a retired librarian from a National Library (National Library of Scotland) I believe libraries deserve thanks for their efforts to digitise collections thereby making them available to the wider public. However, some libraries have passed the task to commercial organisations who then require payment for access to the databases – this defeats the whole purpose of widening access and is to be deplored. In the long term hard-copy must be retained to avoid losing our cultural heritage. Think of the many books of the classical writers, Aristotle, Euripides, Aeschylus and the like, which are known to have existed but no-one thought to preserve a copy. In more recent times we have lost many of Shakespeare’s plays in the same way – and are culturally poorer for this loss.

  56. Pingback: Archive books, seeds, animals and people. « Probaway – Life Hacks

  57. Pingback: Why preserve books? The new physical archive of the Internet Archive, by Brewster Kahle « Evil Corner

  58. Pingback: THIS WEEK IN BOOKISHNESS Vol. 4 & 5 (+ DISPATCHES FROM OLYMPIA) | 8vo

  59. Gordon Fischer says:

    I have a fascination with archival and shipping containers and loved this post. However I do have a question on the math around a 40′ shipping container holding 40,000 books. A 40′ container normally holds 20-21 pallets. Given your statement around 40 books per box, 24 boxes per pallet – I get around 20,000 books. Did I miss something?
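Gordon Fischer's arithmetic can be reproduced directly. This is a minimal sketch using only the numbers in his comment (40 books per box, 24 boxes per pallet, 20 to 21 pallets per 40′ container); the names are hypothetical and the pallet count is his estimate, not a figure confirmed by the Archive.

```python
import math

# Numbers taken from the comment above; names are illustrative only.
BOOKS_PER_BOX = 40
BOXES_PER_PALLET = 24
PALLETS_PER_CONTAINER = (20, 21)  # commenter's estimate for a 40-foot container
CLAIMED_BOOKS = 40_000            # capacity stated in the article


def books_per_container(pallets):
    """Books that fit in one container holding the given number of pallets."""
    return pallets * BOXES_PER_PALLET * BOOKS_PER_BOX


def pallets_needed(books):
    """Whole pallets required to hold the given number of books."""
    return math.ceil(books / (BOXES_PER_PALLET * BOOKS_PER_BOX))


if __name__ == "__main__":
    low, high = (books_per_container(p) for p in PALLETS_PER_CONTAINER)
    print(f"Books per container: {low} to {high}")
    print(f"Pallets needed for {CLAIMED_BOOKS} books: {pallets_needed(CLAIMED_BOOKS)}")
```

Under these assumptions a single layer of pallets gives roughly half the stated capacity, so the 40,000 figure would need about twice as many pallets (for example, two layers) or a denser packing than the comment assumes.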

  60. Pete Warden says:

    Apologies for an off-topic note, but a few days ago Brewster and I chatted after my panel on data journalism at the SemTech conference. I somehow lost his card, despite promising to contact him. I was hoping this might reach him, if it does I’m pete at petewarden dot com.

  61. Bryan says:

    Moving in-house inventory to an offsite location or removing it from the collection altogether has been a popular topic for academic libraries, as you probably know. Some talk about “Managing down collections” or “responsible withdrawal” (see Managing down collections | http://orweblog.oclc.org/archives/002151.html and “What to withdraw…” | http://www.ithaka.org/ithaka-s-r/research/what-to-withdraw). Anyway, if it’s the case that academic libraries are going to be withdrawing print collections en masse, then the obvious recipient of all these materials would be the Internet Archive. For the libraries contemplating mass withdrawal, I imagine that knowing that the materials were going to the Internet Archive would reduce much of their anxiety and many of the political fights with their faculties.

  62. Pingback: Weekbericht #58 « Zeemanspraat

  63. Pingback: » even the digital is physical Sarah Werner

  64. Pingback: Preserving the physical form | The Harvard Library Innovation Laboratory

  65. Pingback: The New Physical Archive of the Internet Archive | Product Sourcing - Industrial News

  66. Pingback: P2PTalk » Internet Archive starts backing up digital books on paper

  67. Gergely Csaba Nagy says:

    Here in the heart of Europe, in the Carpathian Mountains, there are a few old salt mines. The mines are usually used as sanatoriums or mining museums. As Jerry Dupont said above, the mine storage would cost about 1 million/year. I think poorer countries like Ukraine, Poland, Romania, Slovakia, or Hungary could make more money renting out the mines than using them as costly hospitals or museums.
    It would be worth a try!

  68. While I applaud Internet Archive’s dedication to archiving and storage of backup material, I have to point out that you’re taking a step backward here. You don’t preserve backups of microfiche newspaper articles by saving the newspapers. Likewise, storing digital documents on paper is wasteful and energy/storage-demanding (a single hard drive could save everything in those shipping containers you depict above).

    What the archive ought to be doing is working to improve and use digital storage and backup systems. Yes, they are not perfect as-is; but considering how easy it is to back up a single hard drive in multiple redundant systems, all of which can be designed to cross-check each other to eliminate “electron-flipping,” you could accomplish the same thing with just four drives placed in four safe sites. Want to be safer? Try eight drives.

    Let’s face it: Paper is far from the perfect storage medium, as your shipping containers ably illustrate. Let’s be sensible about archiving and storage, and not let romanticism over paper lead us astray.
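The "cross-check" idea in the comment above, several redundant copies that are compared against each other to catch bit flips, can be sketched as a majority vote over checksums. This is an illustration under stated assumptions, not a description of how the Internet Archive or any real storage system works: the function name is hypothetical, each "drive" is modeled as an in-memory byte string, and SHA-256 stands in for whatever integrity check a real system would use.

```python
import hashlib
from collections import Counter


def majority_copy(replicas):
    """Return (good_bytes, bad_indexes) by majority vote over SHA-256 digests.

    Raises ValueError when no digest is held by more than half of the replicas,
    since the copies then cannot settle the disagreement among themselves.
    """
    digests = [hashlib.sha256(data).hexdigest() for data in replicas]
    winner, votes = Counter(digests).most_common(1)[0]
    if votes <= len(replicas) // 2:
        raise ValueError("no majority among replicas")
    good = replicas[digests.index(winner)]
    bad = [i for i, d in enumerate(digests) if d != winner]
    return good, bad


if __name__ == "__main__":
    # Four copies, one of which has suffered a flipped byte.
    drives = [b"page 52", b"page 52", b"page 5X", b"page 52"]
    data, damaged = majority_copy(drives)
    print(data, damaged)
```

Note that with four copies a two-against-two split cannot be resolved by voting alone, which is one reason an independent reference copy can still be useful.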

  69. Pingback: Internet Archive archives digital texts… on paper. WTF. « Tech Catcher

  70. Pingback: Internet Archive archives digital texts… on paper. WTF. | Ebooks on Crack

  71. SK says:

    I’m an academic librarian and it is entirely true that some libraries remove (destroy, etc.) their physical copies when they feel that a stable digital copy is available. But your article makes it sound haphazard and possibly irresponsible. The vast majority of libraries are extremely careful with those decisions.

    In the US, we have good systems that tell us whether we have the last copy or almost-last copy of anything. Things in those categories are never thrown out — they are put in special collections where we can keep an eye on them. Whenever a library removes books, they check them one by one. If a book is valuable or rare, it’s either kept or sent somewhere where it can be kept … forever.

    I’m willing to bet that there are a minimum of 100 existing copies held in other libraries for each title that the library removed.

    So, yes, if every library removes its copy, we have a serious problem. But this is not corporate America where if one person outsources, everyone has to follow. We can afford for there to be 99 copies instead of 100 copies. When you start getting into smaller numbers worldwide, the red flags start waving.

    Space is limited and a system to both *store* and *retrieve* millions of titles (light archive) is more expensive than the system you describe here (dark archive). That said, there is no conflict between a light archive and a dark archive — they serve different purposes.

    Print (and microfilm) are good formats. They will easily last hundreds of years and are more stable and easier to maintain than digital formats. There is no reason not to preserve those formats in archives, even if we stop using them day to day.

    Just be sure to talk to a few preservation librarians rather than reading a few articles online … we have been doing this for a really long time and we can probably help you out.

    It would be wise as well to invest / create a really exact reprint-on-demand device — one that does not just reprint the book onto acid-free high-quality paper but which does so in a way that replicates the physical book as exactly as possible, including size, cover art, etc.

    You never know what is going to be meaningful to the future. Yes, in general, the *content* of books is what we want to preserve. But someday, it may matter whether there was an ink smudge at the top of page 52. We have the technology to create very exact copies and we should do so.

    You can even imagine a future fad, where people read mid-century American pulp novels or comics on the same paper they were originally printed on. These would be print on demand, of course, intentionally onto bad paper (like the originals), so that you could carry it around and feel “retro”.

    • Connor says:

      hey, we do that now, it’s just we have to go out of our way to find all of those old books on their really crappy paper

  72. Pingback: Internet Archive Begins Backing Up Books on Paper. Huh??

  73. Pingback: No, Dave, it's just you » Blog Archive » Digital Archiving: paper is still best

  74. Pingback: Internet Archive Starts Backing Up Digital Books

  75. Pingback: Internet Archive Starts Backing Up Digital Books … on Paper

  76. Pingback: L’Internet Archive i la conservació de llibres en paper « Bloc de la Biblioteca de Matemàtiques

  77. Pingback: Internet Archive Starts Backing Up Digital Books … on Paper | Single Name Server

  78. Pingback: Internet Archive Starts Backing Up Digital Books … on Paper | wifihotspot.za.net

  79. Pingback: archiving every book ever published « everydaythingsetc

  80. Pingback: CIBER NewsLetter » Blog Archive » Il nuovo archivio fisico di Internet Archive

  81. Pingback: Physical Archive of the Internet Archive | Jesse J. Saunders | Librarian

  82. Wanda says:

    This Archive sounds like a great idea. I have wondered what would happen if we only had digital copies of everything and suddenly could not access them. The more ways we can store our precious books data, the more possibility that they will survive longer. Thank you for your great work.

    Have you heard about this, maybe these books need a home:

    SASKATOON, Saskatchewan, June 7 (UPI) — The weight of about 350,000 books a Canadian Prairies couple wanted to save from burning is damaging a second house they bought to shelter them, officials say.

    Shaunna Raycraft and her husband live in the remote town of Pike Lake, Saskatchewan, southwest of Saskatoon. Months ago, a neighbor who collected books died and the man’s widow said she wasn’t interested in the uncatalogued collection and was going to burn it.

    The Raycrafts, both book lovers, told the Canadian Broadcasting Corp. they would take the collection and had a small house towed to their property to store the books.

    “We’re talking 30 tons of books,” Raycraft said. “The weight of the books is pulling the house apart.”

    The couple have tried selling some of the books on the e-Bay online auction site but can’t find any appraisers willing to travel to the remote Canadian town.

    “It took a minimum of three days to pack the baseball books alone into boxes [and] five days for the bibles and religious texts,” she said. “Most of the boxes are still unopened and unsorted.”

    Raycraft said it’s impossible to even speculate on how much the whole collection might be worth, the CBC said.

    Read more: http://www.upi.com/Odd_News/2011/06/07/Weight-of-book-collection-damaging-house/UPI-24131307448095/#ixzz1PjDdT0ps

  83. Pingback: the Wooden Cloud: Archiving the internet… on paper…

  84. Elena Schott says:

    I am getting ready to let go of 20 years of computer gaming magazines. As I bemoaned the thought of that extensive collection lost, a neighbor mentioned your group. Would you be interested in having these? I can bring the boxes to you if so, as I live in the SF Bay Area.

    We are talking 10-20 years each of Computer Gaming world, Strategy Plus, and PC Games.

    I will monitor this for a week or two in hope.

    Elena

    • internetarchive says:

      Hi Elena,
      We gladly accept donations and would love to archive and share them. We are at 300 Funston, San Francisco, CA 94118. You can drop them off during normal working hours or contact us at infoATarchive.org if you need to make other arrangements.
      Jeff Kaplan (Internet Archive)

      • Elena Schott says:

        Contacting you at info at archive.org was fruitless. I tried again today, perhaps this email will be responded to.

        Elena

  85. Pingback: News – 6/20/2011Brooklyn Art Project | Brooklyn Art Project

  86. Pingback: But what if there’s a fire in the building… « The Book is Dead

  87. cursichella says:

    I think it is great and responsible of you to be cataloging the internet over the years and now the world’s books, too. So many are quick to judge on this, but, like the Global Seed Vault, all I can say is lucky for us and our children that someone has the foresight to protect these items from whatever calamities may be in store for our planet. Living authors should be grateful that someone is looking out for the future existence of their works.

    My only concern is that the archive is in Richmond. I live in CA and can’t help but think of quakes and tsunami. Maybe all would be safer stored somewhere in Utah or New Mexico where there’s less chance of book-ravaging natural disasters? If the containers are waterproof, I’ll shut up now…

  88. Adam Whitney says:

    I think this physical archive of printed books is an essential project as we complete our transition into the digital age. I wonder if the concept could be taken a step further by creating a globally-distributed archive that is run by a network of volunteers. The latter would avoid a possible Library of Alexandria outcome.

    The image of the books and photographs that have been stored in my own grandparents’ house for over 60 years in excellent condition is suggestive. In that vein, I am envisioning a network of volunteers who could register the contents and location of their own personal collections, checking in and updating the registry every few years. Such a registry would formalize the informal system of impassioned book collectors that already exists throughout the world. It would be less controlled than the Internet Archive’s centralized archive to be sure, but it would have the advantages of being geographically distributed, containing multiple copies of any single work, and running at little or no cost. Volunteers could also take on the responsibility of acquiring books that are rare or missing from the registry, in an effort to make sure the collection remains complete over the centuries ahead. I think combining this informal, distributed collection with centralized archives like the Internet Archive and major libraries would be a great combination for keeping our written culture safe and sound for posterity.

  89. Pingback: Link Roundup « Books Worth Reading

  90. Pingback: Omeka, E-Book Lending and Google

  91. Michael says:

    I like the seed bank analogy, but am queased out at putting the book equivalent on top of a major earthquake zone. It’s a great idea to try to recreate the library at Alexandria, but not if it similarly ends up on the bottom of an ocean. Note that the Svalbard Global Seed Vault was very careful about their choice of locations, wanting to protect against such a thing.

  92. Homer Otto Goodall, Jr says:

    Most of the ones I scan and upload were resurrected from dumpsters, recycling bins and other places. The movies and films that I upload come from old homemade DVDs, CDs, and old discarded hard drives. I am currently scanning the entire set of the Encyclopedia Britannica 1898 edition, 30 volumes. I have also taken old pictures from glass plates and I will be uploading these later.

  93. Pingback: Internet Archive one step ahead: backing-up digital on paper | Sustainable Libraries

  94. Pingback: WFU | wake the herald | meanwhile, back at the internet archive…

  95. Homer Otto Goodall, Jr says:

    I am also uploading books I have written myself. These were published as underground texts.

  96. Claire says:

    More reasons to keep the originals (as if more is needed):

    1) Readers are at the mercy of the digitizers or of the producers of microfilms/ microfiche. You only get the information that the processors think is important. What’s more, you wouldn’t even know that certain “unimportant” pages have been skipped (like the “blank” front and back pages that give the history of that particular book – who owned it, what they wrote there, etc.). This is more of a problem with older books like the ones on EEBO (Early English Books Online) (covering 1473-1800), which often rely on old, black-and-white microfilms.

    2) Another thing that is not available in an online text: size and quality of the book. Size indicates use. A Kathy Reichs paperback is printed for a different audience than a folio of the King James Bible. In a digital copy, everything is the same size. The same follows for the quality of the paper (cheap woodpulp or acid-free or handmade) and the binding.

    3) I have noticed on GoogleBooks that pages in some digitized books are blurred or illegible at times when the person scanning pulls the volume off the scanner before it is quite finished.

    Basically, there are limitations to what digital imaging can do, and I am delighted to see that you are working to acquire and maintain as many original copies as possible.
    Also, it is encouraging to see that you are working with Gutenberg.org and other organizations dedicated to the preservation and spread of the written word.

    Thank you!

  97. GK4 says:

    Where can we see the catalog of physical books you already have? I wouldn’t want to send you any duplicates.

    Or would you want duplicates so you can choose to keep the better copy?

    Thanks for doing this.

  98. George Oates says:

    Hi GK4 – You can get a pretty good sense of what’s already been digitized (and put into the physical archive) by searching on either http://archive.org/details/texts or http://openlibrary.org. The hard part is that there’s a bit of a gap between books that are donated or acquired and what shows up in these two sites, since books will only appear in these 2 catalogs once they’ve been digitized.

    When in doubt, just send stuff along :)

  99. Pingback: Link Irresponsibly – July 2011 edition | Read Irresponsibly

  100. Jan Eklund says:

    As Director of the History of Art Visual Resources Collection at UC Berkeley, I fought for many years to retain our analog collection of still images from which we were building our digital collection. Time, budget, and space constraints made this an increasingly unpopular argument and during my tenure there I witnessed the de-accessioning of thousands of 35 mm slides and mounted photographs. Because most of these materials were copy rather than original photography, the justification for liberating the space these materials occupied for faculty office space was that the print materials from which these images were made (mostly from materials housed in the library) could easily be recalled and re-photographed if necessary. I shudder to think how many of these print sources may now be unavailable.

    Now I hear that a more aggressive campaign of de-accessioning is currently underway to make way for more classroom space and that the entire research collection of black and white photographic prints, each one cataloged, labelled, and mounted on archival board, is about to be discarded. These photographs represent decades of scholarly research and document, in some cases, works of art and architecture that are not published in any print source. These photographs don’t require the burning of fossil fuels to view or appreciate, but they do require space to house until there is staff time, money and (most importantly) the will to digitize it for use in the modern teaching environment. Sadly, I fear that the current fiscal and administrative climate at UC makes it unlikely that these materials will survive the push to create “active learning spaces.”

  101. David Shapiro says:

    How Do You Send Books to the Archive? David (davsha@jps.net)

    • I am 80 and have been collecting books for over 60 years. I have over 50,000 mostly pre-1900, never seen by anyone. Obviously I cannot afford to give them away nor pay for the shipping. Please reply if interested. I am disabled and unable to travel. I started out trying to collect first editions of any book I purchased.

  102. Pingback: Realidades de ciencia ficción: el proyecto de conservación de libros del Internet Archive | Nisaba

  103. Kelly Benning says:

    I have almost 5,000 books all boxed up (weeded a high school library that has never been weeded). Do you pick up? How can I arrange for shipping? I hate to just donate this collection to the Salvation Army.

    • internetarchive says:

      We are located in San Francisco. If you are in the area you can drop them off at 300 Funston. If you are out of the area please send an email to infoATarchive.org with some description of what you have. Thank you.

  104. Pingback: The “Death” of Print: Plotting a Middle Course | R00td

  105. I have been a collector for over 60 of my 80 yrs. I began collecting first editions of every book I purchased. This collection of over 60,000 books has never been seen by anyone. What with my age I feel I need to dispose of them. What do you suggest? As I am indigent, I cannot afford to give them away and shipping and handling would be buyers expense. Please advise. Most of these volumes are pre-1900 and many are in specialized categories.

    • Paul says:

      I have been told the Library of Congress does buy missing books to fill the gaps in its collection. You might need to start cataloging your books and checking against the LOC online database to see if you have what they don’t. As extensive a collection as the LOC is, I know from experience the LOC is missing many books. Good luck.

  106. Pingback: AP piece on the Physical Archive of the Internet Archive | Brewster Kahle's Blog

  107. J Dyson says:

    Will the archive be selling off duplicate books to help fund itself?

  108. Pingback: Scanning a Braille Playboy | Internet Archive Blogs

  109. Peter Udbjørg says:

    Do you have any “chapter” in Norway yet? I have some books I’d hate see go to the Salvation Army…

  110. yubaraj sharma says:

    Words are too frail to express my appreciation and thankfulness for the great work you are doing. I’ll just say “Thanks”.

  111. We are, as we follow technology, slowly taking a different approach to written words. Books should be archived in some sort of electronic form and saved for eternity. I’m just afraid that soon we will not have time to read them in an old-fashioned way.

  112. Pingback: From Scroll to Screen and Back: Vendor Lock-In and eBooks « nydawg New York Digital Archivists Working Group

  113. The article says you are soliciting books from collectors and individuals. I have some books I’d like to get rid of. Where can I mail them?

    • internetarchive says:

      Thanks for the offer. Our address is:
      Internet Archive
      300 Funston Ave.
      San Francisco, CA 94118

  114. Pingback: Why Preserve Books? The New Physical Archive of the Internet Archive | SLA Information Futurist Caucus

  115. MaestraR says:

    example: this paragraph.

    The libraries we scan in, rarely want more digital books than the digital versions that we scan from their collections. This struck us as strange until we better understood the craftsmanship required in putting together great collections of books, whether physical or digital. As we are archiving the books, we are carefully recording with the physical book what the identifier for the virtual version, and attaching information to the digital version of where the physical version resides.

    ….is incredibly hard to decipher.

  116. Dr.Ameer Badami says:

    This is an incredible project. You might be aware, but there are now high-speed cameras which can photograph a book in minutes. Please see this link.

    http://www.popsci.com/technology/article/2010-03/video-blazing-fast-book-scanner-captures-flipping-pages-high-speed-camera

  117. Pingback: Being-with Books « maphmaticallyyours

  118. Pingback: Volunteer – Help us get 200,000 books on Sunday! | Internet Archive Blogs

  119. Pingback: Tre milioni di libri su Internet Archive | wiBlog

  120. Pingback: Bibliotheken en het Digitale Leven in September 2011 « Dee'tjes

  121. Pingback: Thank you Friends of the SF Public Library for 130,000 books | Internet Archive Blogs

  122. Pingback: What Should You do with Your Books After Crossing the Digital Divide? - The Digital Reader

  123. Pingback: Bibliotheken en het Digitale Leven in September 2011 | toepassingen voor mobiele apparaten

  124. Pingback: 6 Physical Books Sites | Hold Your Future

  125. Pingback: 9 Physical Books Sites | Hold Your Future

  126. Pingback: 10 Physical Books Sites | Hold Your Future

  127. at least for US books?

  128. matt says:

    I am 100% in favor of scanning in all books and making them available to people everywhere. What a beautiful thing that anyone hungry for knowledge… ANYWHERE on our planet can simply download. Some paid, some free. Either way they have access to any book available.

    I disagree, unfortunately, with keeping the old copies. As a CEO, what I would tell you is this. Scan in the books, then resell them on amazon.com to make a profit. Then use the money you would have spent warehousing and the money you made on Amazon to purchase book rights from the authors and publishing companies.

    Just like open source software revolutionized the software industry… by buying out the rights from authors, then making it digital and open source. Then you don’t have to warehouse anything and can focus on your data center and really help even more people with information.

  129. Martha Genser says:

    I have an original set of “The American Peoples Encyclopedia Books” which includes 20 books in the set, published by Spencer Press Inc. in Chicago. I also have ensuing years events books from 1956 through 1984 published by Grolier Inc. I no longer have room to store them and would like to donate them to your Archive collection. I feel it is not right to dispose of them, as they record an important part of History. If you cannot accept them, can you give me some information as to whom or where I can donate these precious books. Thank you

  130. Hey There. I discovered your weblog the use of msn. That is an extremely smartly written article. I will be sure to bookmark it and return to read extra of your helpful information. Thank you for the post. I’ll definitely return.

  131. As a volunteer for Project Gutenberg and Distributed Proofreaders (www.pgdp.net), I have been preparing text and HTML versions of books for years. The earliest books in Project Gutenberg were typed in directly from the paper original, and the earliest books produced by Distributed Proofreaders were from self-made scans, but by now, Archive.org has become the major source of scans.

    One thing I noticed working with scans is that often things get lost. Especially the scans made by Google are very sub-standard. Many details, such as full stops and accents on letters have simply disappeared, although they are present in the print edition. The full color scans at Archive.org are much better in that respect, but still we sometimes need to consult a paper copy to be able to make out what was in the source. This really shows the need to me to preserve the physical

    A more serious issue is with the quality of the illustrations. Those are far more sensitive to improper lighting conditions and calibration, as well as vignetting effects (could you please keep the EXIF data from the camera, or apply de-vignetting software in your digitization process) that are natural when making photographs, as compared with flat-bed scanners. Some old books have marvelous pictures in them, or collotypes, that capture a huge amount of detail, as they are photographic reproductions without half-toning. It would be great if those could be digitized at a better quality than is done today. Again a compelling reason to keep the originals.

  132. Pingback: Thursday Night 5:30pm Books in Browsers in San Francisco | Internet Archive Blogs

  133. Edwin Rivera says:

    I am currently in possession of some very old textbooks that I would like to donate if you do not already have them in digitized format and physically stored.
    -General History for Colleges and High Schools by P.V.N. Myers 1906
    -Introductory American History by Bourne and Benton 1912
    -Essentials of Geography with Oregon Supplement by Brigham and McFarlane 2nd Book 1925
    -New American History with Social Study Skills by James Allison 1950
    -College Book of American Literature Briefer Course edited by Ellis, Pound and Spohn 1940
    I found these at a garage sale years ago in Heppner, Oregon and held on to them due to their age and my regard for old texts. If these are of interest, please notify me by email.

  134. Wow, this is an interesting topic. I run a website on photography books and definitely see the need for physical books. I like your points about needing the original physical books in case of a claim for authenticity. However, I am pessimistic about the future of hard books given the trend and ease of use of ebooks.

  135. Pingback: Thank you Friends of the SF Public Library for 130,000 books | Break The Glass Ceiling

  136. Pingback: The Internet Archive Visit « Sonoma County Digital Library Project

  137. Deborah Kempe says:

    How can libraries and individuals contribute books? Who will be responsible for shipping costs?

  138. Pingback: Internet Archive’s Repository Collects Thousands of Books | News Fringe

  139. Pingback: Internet Archive’s Repository Collects Thousands of Books | Inter News Daily - Daily News Magazine

  140. Pingback: Internet Archive’s Repository Collects Thousands of Books | G7Finance.com - Finance News & Personal Finance Resources

  141. Pingback: Internet Archive’s Repository Collects Thousands of Books - Online AFGHAN

  142. Pingback: In a Flood Tide of Digital Data, an Ark Full of Books | collegefix

  143. Pingback: Internet Archive’s Repository Collects Thousands of Books « Breaking News « Theory Report

  144. Pingback: Internet Archive’s Repository Collects Thousands of Books

  145. Pingback: Internet Archive’s Repository Collects Thousands of Books | Mobile News Plus

  146. Pingback: Internet Archive’s Repository Collects Thousands of Books - Financial news

  147. Pingback: Internet Archive’s Repository Collects Thousands of Books | The RealN3ws Post

  148. Pingback: Atelier Deltos Restauro e Conservazione di opere d'arte su carta | Blog | In a Flood Tide of Digital Data, an Ark Full of Books

  149. Pingback: “We must keep the past even as we’re inventing a new future.” « The Curious Collective

  150. Pingback: Internet Archive’s Repository Collects Thousands of Books | Local Website Solutions

  151. Pingback: USA: Archive-ing EVERY book-NYT « FACT – Freedom Against Censorship Thailand

  152. Pingback: 1internetentrepreneur.com » Blog Archive » Internet Archive’s Repository Collects Thousands of Books

  154. Pingback: Architizer Blog » Preserving the Printed Word in the Age of the Cloud

  155. Pingback: The Physical Archive of the Internet Archive Aims to Collect A Copy of Every Book in Existence

  156. Jamezicarius says:

    I love what you are doing. Thank you.
    I’m glad someone mentioned comic books :)

    I’m reminded of the switch from vinyl recordings to compact disc. Many people ditched their bulky record collections … in the ’80s and ’90s… yet, in 2011, vinyl LPs outsold CDs.

  157. Pingback: Physical archive of the Internet archive « allhomosapienswelcome

  158. Pingback: Every Book Ever Made « Stuff I Google At Work

  159. Pingback: Repository Aims to Preserve One Copy of Every Published Work « The Blog of the ABAA

  160. Great to read this.
    We should have many, redundant, physical and digital copies of these archives with guaranteed public access. I’m coming from the digital side of things though and don’t have a romantic connection to physical books, but can clearly see the dangers of pure-digital archives as well.
    Perhaps there is a way to combine the two somehow?

    Got me thinking of all kinds of different ways the information might be stored… most books made nowadays use cheap paper that will eventually disintegrate – they were not designed to last. Would be cool if there was a way to:
    1. take the physical copies
    2. digitize them
    3. somehow convert that into a *really* durable form.
    Perhaps etch into stone or metal with lasers or something else crazy. Even printing them on huge rolls of (acid free) paper in micro-print (think seismograph with pigment ink) might work much better than keeping all these copies that were printed cheaply, waste space with covers, etc and were intended to last through a few readings.

    Anyhow, great read and good luck!

  161. bonnie hummel says:

    I have some books that I can’t find any information on – a lot of www’s, nothing. I have Grimms Fairy Tales, translated from the German, Walter Crane and E. H. Wehnert, 1898 / Mark Twain (Samuel L. Clemens), 1879, Tramp Aboard / Den Store Strid i den Kriftne tisalder 1890 / 1922 Strazvlasti vojensky kalendar na rok / Festskrift Den norske Synodes Jubilaeum 1853-1903 ? to sell

  162. Pingback: Preserving The Internet… and Everything Else | GeekFreak

  163. Warrenb says:

    Thank you very much …………….
    Warren

  164. Dan Moran says:

    What if we have books we want to donate?

  165. Pingback: Physical archive of the Internet. « Nerd musing

  166. Pingback: Preserving The Internet… and Everything Else | Indoor Digital Billboards

  167. Homer Goodall says:

    I have a real old set of Irish literature books I am currently scanning. These things are so old that I need to sweep off the scanner with each image done. I have several incomplete sets of images. This will take a couple of weeks to complete. Stay well everyone.

  168. Judith says:

    Just wanted to say THANK YOU for your efforts!!! I wish every state would invest in a system like this for both SEEDS & BOOKS! Even some websites should be printed and saved, so that if anything catastrophic ever happened there would be a system in place to get things up and running again.

  169. Jason says:

    I hope these folks realize that the paper that most books are printed on doesn’t tend to last more than a hundred years. The culprit is the stabilizers used to hold the ink; over time they tend to crystallize, destroying the paper.

  170. Pingback: The best archive for digital media is… analog media | doink blog

  171. With technology updates comes change. It is sad to see libraries thinning their stock, and we can only pray that storing books and keeping them in their original condition will become a priority for library historians.

  172. Pingback: WordPros Publications Inc. » A Most Worthy Project: Archive.org Founder Archives Physical Books for Posterity

  173. Pingback: IV. Must-See Video Playlists and Articles | Quinn Class of 2014

  174. Pingback: FLAME GLOVE — Mind Blow #37 | Stahuj cz filmy zdarma

  175. Pingback: Nullary Sources » Why Preserve Books? The New Physical Archive of the Internet Archive

  176. Pingback: El libro no se muere sino todo lo contrario. Marta Peirano. Eldiario.es « Valor de Cambio

  177. Pingback: El libro no se muere sino todo lo contrario | Petite Media

  178. Pingback: El libro no se muere »

  179. Pingback: La doppia vita del libro secondo Brewster Kahle - Infolet – Informatica e letteratura

  180. Pingback: FLAME GLOVE — Mind Blow #37 | Stahuj filmy zdarma

  181. Pingback: Enough digital. Give me some paper, film and/or vinyl. | Robin L Connell – Unique Subset Jewelry Design

  182. Pingback: Spotlight: Brewster Kahle | Book History, Illuminated

  183. Pingback: The majesty of books « 2-Blog

  184. Pingback: Where there is Wifi there can be Sir Francis Drake’s Voyages of the West Indies | Laura Walter

  185. Rick says:

    Very interesting! Talked with my dad about this recently but saw that this article was from 2011. We were debating whether or not libraries would even exist in 10 years. The one thing that may keep physical libraries open would be public access to the internet.

  186. Michael Ward says:

    Has the Physical Archive project been put on the back burner? I can’t find anything about it on the front pages or the FAQ. I have (as always) some books to donate. The donations page just talks about special sets of books (as specialist libraries, I suppose).

Comments are closed.