Ever try to read a physical book passed down in your family from 100 years ago? Probably worked well. Ever try reading an ebook you paid for 10 years ago? Probably a different experience. From the leasing business model of mega publishers to physical device evolution to format obsolescence, digital books are fragile and threatened.
Those of us tending libraries of digitized and born-digital books know that they need constant maintenance (reprocessing, reformatting, re-invigorating) or they will not be readable or read. Fortunately this is what libraries do (if they are not sued to stop it). Publishers try to introduce new ideas into the public sphere. Libraries acquire these and keep them alive for generations to come.
And, to serve users with print disabilities, we have to keep up with the ever-improving tools they use.
Mega-publishers say electronic books do not wear out, but this is not true at all. The Internet Archive processes and reprocesses the books it has digitized as new optical character recognition technologies come around, as new text understanding technologies open new analysis, and as formats change from DjVu to DAISY to EPUB1 to EPUB2 to EPUB3 to PDF/A and on and on. This work takes thousands of computer-months and programmer-years. This is what libraries have signed up for: our long-term custodial roles.
Also, the digital media they reside on change, too: from Digital Linear Tape to PATA hard drives to SATA hard drives to SSDs. If we do not actively tend our digital books, they become unreadable very quickly.
Then there is cataloging and metadata. If we do not keep up with the ever-changing expectations of digital learners, then our books will not be found. This is ongoing and expensive.
Our paper books have lasted hundreds of years on our shelves and are still readable. Without active maintenance, we will be lucky if our digital books last a decade.
Also, how we use books and periodicals in the decades after they are published changes from how they were originally intended to be used. We are seeing researchers use books and periodicals in machine learning investigations to find trends that were never easy to find in a one-by-one world, or in the silos of the publisher databases. Preparing these books for this type of analysis is time consuming and now threatened by publishers’ lawsuits.
If we want future access to our digital heritage we need to make some structural changes: changes to institution and publisher behaviors as well as supportive funding, laws, and enforcement.
The first step is to recognize that preservation of and access to our digital heritage is a big job and one worth doing. Then, find ways that institutions (educational, government, non-profit, and philanthropic) could make preservation a part of our daily responsibility.
Long live books.
Illustration: midjourney AI generated.
Thanks for this article. A tangential point about the longevity of e-books: I had a number of Kindle™ e-books on my Windows 7™ device, and one day Amazon™ informed me that they were no longer supporting that operating system, and so the books were lost. I asked Amazon™ about it, and they suggested I upgrade my devices. In other words, pay more for the ‘privilege’ of reading something I thought I already owned. Now, I refuse to give Kindle™ any business. Thanks for all that you do to keep books available.
You really should be regularly upgrading your operating system just for safety reasons. But that said, you don’t have to pay for it and your books were not lost. You can use a free OS and if there isn’t a Kindle app for it, you can use the “Kindle Cloud Reader”. Not saying you should give them more money, but you can access the books you already gave them money for.
THIS!!! All of it. Don’t use unsupported OSs! The hacks are too well known, freely avail. Unfortunately.
Computer security is endless, tireless, we’ve become indentured servants to our devices. And we can’t live w/o them!
The cloud support does not always work with Kindle or Amazon. If they decide not to keep or sell the work of an author then your access to it can be restricted or completely denied. In other words they can act in the same way as the Nazi regime and suppress their work.
The only safe way, which still requires at least occasional checks, seems to be to download your books, music, films or any similar media onto an external hard drive, but these can degrade over time, it seems.
The writer refers to the frailties in the technology of the reproduction of physical books. If you look into that subject, you may find that his point is valid and that payments to or capabilities of other providers are not the issue. Upgrading one’s own OS is of course a good thing to do, but it does not address the serious and pervasive problem addressed in this article. Your ideas are hopeful, always a good thing, too.
It sounds very true that moving books from one “pub” format or PDF format etc. would be time consuming. But aren’t these formats used to provide for DRM and/or proprietary readers with licenses to read those formats? If the text from books were just stored as data, it would be universally readable and searchable with almost no work. The storage media would still need to be kept up, but we need to do that not just for books but for all data. It is incredible how many books can be stored on a thumb drive that costs $5 in the checkout line at Microcenter. Digital for me please. Also, for the love of Zeus, all textbooks should be digital so my children don’t get the same back problems I did from hauling a backpack full of them for decades.
Research has shown that digital reading is an effective way to learn discrete facts; however, more abstract concepts are much more easily understood when read in print. Children’s backs may be saved by carrying a tablet, but their comprehension will suffer.
Interesting, I spent several months reading Samuel Richardson’s ‘Clarissa’, 15 or so pages a day of the 2500 pages in my digital copy. I switched to digital because the 2 volume edition is large and heavy, and text notes are only at the end of volume 2. I have however been dutifully following the excellent illustrations in the hbk. volume. Recently, reading the printed text alongside an illustration really exemplified some points being made that I’d not responded to digitally, which made me wonder whether it was the illustration or the text that was producing the more pronounced impact; your comment makes me wonder. I think it was a combination.
Where did you find the research that you commented on in your post ? I would like to know more about this in depth.
Thank you Nola.
Good morning Nola. Thank you. As a volunteer for our local community library, I take every opportunity I get to save donated books that are surplus to the library’s requirements, and I store them for sending to underprivileged children. The look in a child’s eye on receiving a book is precious, because families are so poor that they have great difficulty putting food on the table, let alone meeting medical needs. A book is a luxury we often take for granted. Digitization is important, especially in conflict zones where libraries are often attacked, leaving illiteracy to flourish, hence so much conflict. Lots to say but will leave it there for now. Take care. Cheers, Don Campbell
The same thing happened to music – I used to own a mountain of songs that I legitimately bought/paid for, purchased via different platforms (i.e., Rhapsody, Apple, even from the bands’ own websites). They must still be out there…’somewhere’; but, for all intents and purposes, I don’t have in my possession right now even a ‘speck’ of that ‘mountain’.
I seldom paid for digital music, but when I did I made sure I downloaded a file I could keep and use offline. I keep around 50 GB of music in several backups on hard drives and SSDs, and some of it in free cloud services. I still keep the music I got in the Napster days.
This is why I love Programs that allow me to break the DRM protection of my ebooks I purchase and I can then own the file and convert it to whatever I need to read it on. If I buy an ebook I want the file and to do with what I want with it in a non-illegal manner.
That’s why you don’t own Kindle books. Unfortunately. Look it up.
They are “white-collar” swindlers; that is why scams spread more every day…
This doesn’t make sense. The books are in your account; you can read them on your smartphone, PC, tablet or reader. They simply dropped support for your OS. You didn’t lose the books.
So true! I love my ebooks but tend to only buy novels and magazines. Books I want to keep, reference, cookbooks, photo and art books, etc. I still buy in hard copy.
The same problem exists with photos. As the keeper of the family genealogy, I have photographs that are nearly 200 years old. Still very viewable. I’ve been scanning and restoring them by the thousands but keeping up with media and formats is an endless challenge. My biggest fear is that when my toes curl up no one will know where or how to find all the family photos or take on the job of keeping them current.
This is not to mention all the photos that exist only on cell phones or social media by so many people. Down the line, there will be nothing for the generations that follow to find. Our history will largely die with them.
“Down the line, there will be nothing for the generations that follow to find. Our history will largely die with them.” Heartbreaking thought, but that is the whole intent of all those behind the anti-White Western culture drive today, isn’t it?
I now buy hard backs, they will certainly last me a lifetime, I don’t intend playing football with them.
Just because Kindle no longer supports the Windows 7 device you had, does not mean your books are lost. You can read them on any Kindle device. My Kindle library stretches across multiple platforms – Apple, Android, Windows – anything I can put a Kindle app on. The books are tied to your account – not to any specific device.
This is why I remove copy-protection from the electronic media I own. For example, I have taken my Apple //e floppy disks, turned them into a hex-dump disk image (they are only 140K…), and store the resulting plain text file. At any point in the future, I can turn that hex dump back into a binary disk image. And, by using one of the license-free OCR fonts, I can ensure it’s easily machine readable.
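The round trip described above (binary disk image to plain-text hex dump and back) can be sketched in a few lines of Python. This is only an illustration, not the commenter's actual tooling: the function names `to_hex_dump` and `from_hex_dump` and the offset-prefixed line layout are assumptions made for the example.

```python
def to_hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as text lines of the form 'offset: hexdigits'."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append(f"{offset:06x}: {chunk.hex()}")
    return "\n".join(lines)


def from_hex_dump(text: str) -> bytes:
    """Rebuild the original bytes from a dump produced by to_hex_dump."""
    out = bytearray()
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        # Everything after the first colon is the hex payload.
        _, _, hex_part = line.partition(":")
        out += bytes.fromhex(hex_part.strip())
    return bytes(out)


if __name__ == "__main__":
    # Stand-in for a 140 KB floppy image (143,360 bytes); real data would be read from a file.
    image = bytes(range(256)) * 560
    dump = to_hex_dump(image)
    print(from_hex_dump(dump) == image)
```

The text form is several times larger than the binary, which hardly matters at this size, and it needs nothing more exotic than a text viewer to inspect.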
Whether they are books, audio, or computer media, once I spend good money on them, I will continue to do what’s needed to make them transportable to future technology. I used to read books on my Palm Pilot. Those books were converted to EPUB and/or PDF so I could read them on current devices. When the next big tech change arrives, my property will be ready to make the switch.
Jeff – your comment is the exact point of the article. Your technical expertise seems to be higher than your average end-user. I applaud your technical efforts to keep your electronic media readable over generations. We need to have easy-to-use (e.g., a couple of button clicks and use of drop downs) tools that allow the average, non-technical user to be able to accomplish what you have accomplished.
It exists… look up Calibre. You just have to be technical enough to search for it and the instructions on how to set it up (there are many step-by-step walkthroughs).
I have found Calibre to be a versatile program to convert from one format to another. I like just loading a book in then tell Calibre what format I want it to be converted into.
Great article! Even if 100 years vs 10 years isn’t 100% accurate, digital books are NOT your property. You bought a license to read them, but you didn’t buy any property. Don’t believe? Read the contract you digitally signed…
Example: Amazon recently canceled my account. The reason they gave was that my account was accessed by an unauthorized third party overseas… I used a strong password and didn’t write it down or share it with anyone, so that wasn’t my fault, but that’s how I lost access to hundreds of books I’ve paid to read over the years… I guess that could have happened if thieves got into my library and stole all my books too… But I don’t trust Amazon anymore…
The ebooks in question were prepared from physical books. Now ever since the 1960’s, publishers have included words such as “may not be stored in any information retrieval system” on the copyright page of each book they publish, but printing that restriction doesn’t make it enforceable vs. the first sale doctrine or the special library privileges in copyright law. That’s what this lawsuit is about.
Thank-you for clarifying that. Of course, most people realise that the rapid changes in technology, which are embraced by most young people but daunt many older people, make our technologies out of date fairly quickly. The home movies I took of my children reside (if they haven’t failed) on a small format tape that I have no way of reading/watching. None-the-less, I have had an expectation that digital books would last forever. As I prefer hard copy printed books and do not buy e-books I haven’t had experience of the changing technologies. I’m pleased to support your organisation in a small way so that these kinds of things can be preserved and made available to all for free.
Lol, did you try to download a physical book unavailable in stores for decades?
Electronic books are easily available for everyone for free…
Yes, quite easily. It’s called going to the Library.
I still do not understand why people buy Kindles. At least if you have an e-reader, phone or computer you can check out and read the materials for FREE. Also, Amazon will not allow Kindles to download books from library servers because they want to charge for the book.
Going to the library is a lost art. I love my library.
Even if I can’t annotate on the books I get, the upside is when I move, I don’t need to move the books.
When I moved from Florida to Massachusetts, the biggest cost by a long shot was the moving of my books. Three-fourths of the cost. Had I known that, I would have donated those books to. I don’t even have them now. I could weep.
I agree with the use of Calibre. If I pay for an ebook, it’s mine, no matter what Amazon says.
Where would I find a public library that has the number of 100+ year old books that Internet Archive and Project Gutenberg have? Even not so old books are weeded out regularly because libraries curate their collections based on space and what would be interesting or useful for their patrons.
Paper books were the vector of our collective knowledge for centuries, and we would have much to lose if there were not the constant efforts of organizations such as yours.
The same problem occurs with every kind of digital media. People take countless photographs every day. How many will be available in a few years? Most of them will be locked in a proprietary cloud anyway…
The Domesday book (1086) handwritten on vellum. Providing a complete list of the contents of England.
The Domesday book 2nd edition (1986) virtually unreadable due to digital obsolescence
Revenge of analog. This is why I still own tapes and a VCR.
Richard Stallman predicted this in 1997, unfortunately the publishers read his story as an instruction manual.
“This put Dan in a dilemma. He had to help her—but if he lent her his computer, she might read his books. Aside from the fact that you could go to prison for many years for letting someone else read your books, the very idea shocked him at first. Like everyone, he had been taught since elementary school that sharing books was nasty and wrong—something that only pirates would do.”
In truth, most of the things described in this article can easily be automated (running conversions, keeping books in several formats); in 8 hours a computer can perform tens of thousands of conversions… Unfortunately, the one thing that is impossible to automate is metadata management, which inevitably has to be done and kept up to date by people, after which computers will update every single book based on the updated information. Still, I consider that an acceptable solution compared with having to reprint books every year.
That is true, but metadata conversion does not seem simple enough to be completed regularly by the average person. Also, automatic conversion of many books does not seem to be accurate enough: too many errors.
Not going to disagree that digital books will not last longer than physical books. But saying “Our paper books have lasted hundreds of years on our shelves and are still readable. Without active maintenance, we will be lucky if our digital books last a decade.” can be misleading.
Because paper books also need active maintenance. The environment that they’re in has to be controlled to ensure it’s neither too damp nor too dry. It has to be kept clear of insects that like to eat paper. Same with rodents.
Basically, don’t expect papers to last if they’re not kept properly.
What you have described is passive maintenance of paper books, not active per se. There’s a pretty major difference.
Cataloging and metadata also have to be upgraded. As a kid I was taught to find books using a card catalog; my kids have never seen one and wouldn’t be able to find a book in a library where that was the catalog method.
Paper quality can differ massively. I have a few (PAPERBACK) books that are 40 years old and still have white pages.
Hello. In Bulgaria, paper books are available in hardcover and paperback, and in the Bulgarian language only. Unfortunately, English-language books are not available in Bulgarian bookstores, and digital libraries in foreign languages are not approved by Bulgarian institutions (mostly the educational ones), because parents, scholars and teachers do not approve of reading paper or electronic books in foreign languages. Fortunately, in 2022 I found digital libraries in foreign languages (mostly English) and I got foreign literature in English. The Bulgarian language is not required. We can change the world for the better in the new year, 2023.
Hi Vladamir from the US! We here take it for granted, and your story is sad but inspiring. I hope you get to travel and live in an English-speaking country. We have free public libraries. This site is good and all, but look: Brooklyn, New York, has some free services, as does the US Library of Congress. Use a VPN service and you can access countless English resources!
Photographs have similar (parallel) issues.
Photos even more so. Probably in the early era of mobile phones, people let so many go due to crap picture quality, expensive printers, etc. No problem today.
Yes, photos are historical documents which are physically more durable than they are identifiable.
The National Library of Ireland, and the US Library of Congress, however, have Flickr accounts with regular users who are wizards at identifying old photographs. I especially enjoy reading the comments by those sleuths as they uncover the identity of photographs from the early days of photography. Famous Irish photographers are easier to date and geographically identify. I consider the NLI to have the best approach to crowd-identifying photographic material.
As a former public librarian, and as an Amazon digital customer, I’ve always insisted on downloading to my home devices every purchase I make. I keep one Kindle 8″ which has a SD card slot to hold the Audible titles, the ebook titles on the main device. I also have both audiobooks and ebooks on my iPadPro which runs the Amazon apps for those materials. Amazon seems to make the Kindle downloads unnecessarily time consuming to download. But I am persistent. When it comes to my reading material, I still have the reflexes of an old librarian!
Great Post !
If you keep PDF and EPUB files locally on your system (and back them up), then it’s the same as having a ‘real’ or ‘physical’ book. In the open source (Linux/FreeBSD/…) ecosystem there will always be some tool that will read a PDF or convert an EPUB to another format.
Just make sure you keep those files on a filesystem that does not have a bit rot problem and has the ability to heal itself when errors come from the physical disk, like ZFS for example.
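For anyone whose filesystem does not do its own checksumming, a rough way to at least detect (not repair) silent corruption is to keep a manifest of file hashes and re-check it now and then. The sketch below uses only the Python standard library; the function names `build_manifest` and `changed_files` are made up for this example, and it assumes the files are not supposed to change once archived.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large books do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries whose file is now missing or hashes differently."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A mismatch only says that a file changed; restoring it still depends on having a good backup copy somewhere else.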
This is what I wanted to say in response to the article. Simple text files have existed for decades and will probably continue to be easily usable for a long time. The same is true of PDF files, and of HTML (and EPUB ebooks are made up of zipped HTML files). That many commercially sold ebooks (like music, etc.) are likely to become inaccessible sooner due to incompatible software is more the fault of DRM than of ebooks generally.
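To illustrate the point that an EPUB is a zip archive of (X)HTML files, here is a small Python sketch that opens one with the standard `zipfile` module and lists the documents inside. The helper names are invented for the example, and `build_toy_epub` produces only a toy archive for demonstration, not a valid EPUB (a real one also carries a container file and a package manifest). It applies to DRM-free files only.

```python
import io
import zipfile


def list_epub_documents(epub_bytes: bytes) -> list[str]:
    """Return the names of the HTML/XHTML files stored in an EPUB archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        )


def build_toy_epub() -> bytes:
    """Build a tiny EPUB-like zip in memory (illustration only, not a valid EPUB)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr(
            "OEBPS/chapter1.xhtml",
            "<html><body><p>Hello</p></body></html>",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    print(list_epub_documents(build_toy_epub()))
```

Because the contents are ordinary zipped text, any unzip tool and a web browser are enough to get at the words even if no dedicated reader survives.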
The basic premise of this argument seems flawed. Paper books that have “sat on our shelves for hundreds of years and are still readable” are the exception, not the rule.
The vast majority of medieval-to-early-modern period printed material no longer exists, whether through destruction or natural decay of the inks and print media. The reason these books are rare is *because* of the inherent fragility of these print media.
There are no absolutes.
“Our paper books have lasted hundreds of years on our shelves and are still readable. Without active maintenance, we will be lucky if our digital books last a decade.”
I am currently scanning some books that are right at 100 years old. They are flaking and deteriorating in my hands. The kind and quality of the paper used for physical books will have a lot to do with their longevity.
Why not print books by etching them into silicon wafers with microscopic text, like making microprocessors? Then, encase the book wafers in amber, and bury them in the dirt. They will last for millions of years, without any climate control interventions. They can be read with an optical microscope after polishing the amber. Much more durable than paper.
I still remember a Rod Serling TWILIGHT ZONE episode with Burgess Meredith. I recall it was in the 1950s.
Meredith played a very near-sighted man with a love for reading. The END was coming, and he prepared a place in some deep secure area with thousands of books. He was ready to settle down with his first book, everything seemed perfect, but he STEPPED ON HIS GLASSES. There he was, plenty of food, plenty of water, light enough, BOOKS everywhere……..and he did not have another pair of glasses.
Considering a DoomsDay scenario, when would mankind after an apocalypse know to , or perhaps be able to, see what is at a microscopic level, not visible to the naked eye. Other than that, it could be a foolproof set up.
The “devil is always in the details.”
I maintain an off-line archive of every eBook I’ve obtained. I run everything through Calibre once, I strip the DRM if there is any, and that’s it. I’ve had Kindles from the K3 (Kindle Keyboard) to the latest 6.8″ Paperwhite, and never had an issue.
Great article !
I have always been a reader. My book collection was up to 38 boxes last time we moved.
I am visually handicapped. Without the ability to blow up fonts on devices, I would have lost the ability to read books at all.
If you have a Kindle, Amazon will send you samples of pretty much any book. I finish a much higher percentage of books than I previously did.
I said something similar to my eye doctor when he said I didn’t need readers yet. Can just zoom digital fonts, thank goodness.
My eyes wear out faster reading from a screen than from a physical book.
But yea, I have downloaded many books and essays that I never found a physical copy of for purchase or borrowing. The eye strain wins out however.
Get an eink screen! The eye processes it the same way as a printed page.
Thank you! I will look into it.
I’ve spent, literally, decades, sitting in front of a computer screen of one kind or another, and my eyesight remains 20-20 even though I’m 81 years old.
So don’t go telling me that computer screens will destroy my vision.
Is it worth preserving everything if nothing will ever stop evolving or changing? I fear libraries will become outdated with the ever-rising number of free works that can be legally copied and modified without checking into a library. I am writing free works under share-alike licenses, but my efforts have been in vain.
Anyone regret storing data on Zip Drives? I think it’s called data extinction.
I don’t regret using Zip Drives, because I was able to get a USB Zip Drive (you can still get them today) and retrieve the data. It’s been awhile since I tried, but the last one I tried was at least 20 years old and I could still retrieve the data. What I regret using was CDs and DVDs – they deteriorated and the data was irretrievable after just a few years.
That’s very strange!
I have CDs and DVDs which I read from time to time, and many of them are 20 years old or older.
thanks a lot
The pdf existed 10 years ago. Just don’t buy stuff in stupid formats that were created to limit what you can do with a book you buy.
Take a look at the free Calibre program. It processes a variety of book formats, and it’s been around quite a few years.
This is a fascinating topic and one that needs to be discussed.
I have a lot of ebooks, and I have tried several readers. I settled on and bought Sony readers early on, as they were ecosystem-independent and could read several open formats. I still read on my beloved PRS-950 and my PRS-T2, and I hike and travel with my PRS-350, which easily slips into a pocket.
I use Calibre to manage my library, to strip DRM, and to manage my readers. I put every book into ePub format for my reader and I keep several archival copies both locally and in the cloud. I maintain about 5000 books in the Calibre library, and I have several thousand more waiting to catalog and store.
I could not reasonably keep a library of paper books this big in my home. I also couldn’t reasonably carry fifty or sixty books, along with periodicals and notes, in my pocket! Ebooks are remarkably liberating.
As long as I practice the same reasonably prudent data management and security techniques I practice with all of my other data, and I use a secure, open-source operating system like Linux, there’s no reason to expect that my ebooks won’t always be there.
Current versions of Calibre do not “strip DRM” because doing so is a violation of US Federal Law!
You have a few more books than I have, but not by very much.
I don’t see that you need quite so many copies of each file, but there’s little harm in that as disk space gets cheaper and cheaper.
While Linux is open-source, don’t delude yourself into believing it’s all that secure. It isn’t. As long as Linus Torvalds keeps the kernel as his own personal domain, you cannot be sure it is secure.
It is an issue that digital works need to be continually updated as reader software gets updated. But it still allows books to be more accessible by the general public than the few copies of printed books in the local library.
Or you could just get your books and put them on Google and you wouldn’t have any of these problems. Stop messing with bs software, OS and put them in the cloud where you always have access to them via any device. This isn’t rocket science, but maybe meteorology.
Except Google can delete the books if they believe you don’t have permission to make copies. It’s best to keep them in storage that YOU control (But you can set up your own server that allows you to access your books from any device…) But either way, you can’t access them without software and an OS. If you have something that works for you, sure, you as an individual don’t need to convert the books to the latest format. But, say there’s someone that can only access books using features the latest formats offer.
Get rid of copyright laws. Then books can be backed up, copied, etc.
Get rid of copyright laws? And how do you expect authors to be paid for their effort? Or even be recognized as authors.? Without copyright, you can steal someone’s intellectual property and parade it as your work.
I’ve been using calibre for years to manage my ebook library. Development can be slow and it has the occasional frustrating bug but I highly recommend it for drm free books. It allows you to edit metadata and convert to virtually any format your device might need.
I have had Calibre for some years, too, and recently, the makers have been updating it at a surprising rate. Almost monthly.
I have not tried to convert a format. I stopped when I got to the part about “connect your device to your computer” because I don’t use “devices.”
Well, I do have a Kindle Fire, but I have not figured out where the books get stored.
This 100%. Anyone reading e-books, irrespective of the device used, should be using calibre to manage their libraries. I’ve moved (almost) my entire library through several devices, from nook to kindle to kobo, various os tablets/phones and pcs. The ability to strip drm is the icing on the cake, buy from any of the dozens of vendors and use the books you pay for on all your devices.
First of all, thank you for this article. I think the same thing happens with music.
thats great !
I don’t remember where I read or heard this factoid but, apparently, there are millions of early digitized governmental documents inaccessible in stacks of obsolete computers deteriorating in the National Archives. I may not have gotten all the specifics right, but the point is that it’s very unlikely these records can ever be recovered.
Thank you, Brewster. In case no one noticed, we use your archive quite a bit on our online blogs and news/comments, research/advocacy pages:
Gordon Wayne Watts
Editor-in-Chief, The Register
Nat’l Dir, CONTRACT WITH AMERICA: PART II (TM)
physical books, like tp and amazon boxes, are horrible fates for living breathing trees. the ideas and words matter, so let’s get our collective tuchus in gear and standardize format so the ideas aren’t restricted to the privileged few with dough.
My wife and I have a personal library of over 3,000 books, which stretches back more than sixty years (although a handful of the books are well-over a century old). None of our books are collector’s items, nor are any individual books worth much. I guess that our library will be carted off to a charity shop when we die, but it will see us out.
We have also had a wide selection of books and other documents in digital format. One of the problems is not simply that digital formats change, as mentioned in the article, but also that the machines on which the documents can be read become obsolete. My first sub-PC computer used 160KB 5.25″ floppy disks; followed by PC computers that read only 720KB and 1.4 MB 3.5″ floppy disks; followed by computers that read only CD-ROMs; followed by computers that use only USB drives. Most of my digital documents are ‘locked’ into thousands of computer disks (and not forgetting the Amstrad PCW disks) that are now, in practice, unreadable. I do have an intention to try to ‘rescue’ some of the digital documents, but my intention has to compete with my desire to read (and learn), as well as all the other demands on daily life.
I sometimes write about ideas that interest me, for which I now consider a digital copy of relevant material to be all but essential. I am currently considering buying a paper copy of the Penguin ‘Essays’ by Michel de Montaigne. There is no point in buying a paper copy if I am not going to read it. I have already accessed a digital copy of the book and have started to read it online. However, if and when I decide that the essays will continue to be of value to me, I shall buy a paper copy, partly in order to have it easily to hand, but also because I guarantee that in a short handful of years, my digital copy will have been left behind on yesteryear’s computer. Whilst all my most recent documents are backed up in ‘the cloud’, I have lost access to countless photographs and a lot of music that was backed up in ‘the cloud’, that is, until the respective services were closed.
I had a song on a vinyl record. That tech went by the wayside.
So, I bought it on reel-to-reel tape. That tech went by the wayside.
So, I bought it on cassette tape. That tech went by the wayside.
So, I bought it on a CD. That tech went by the wayside.
So, I bought it on a DVD. That tech went by the wayside.
So, I bought it on an MP3 device. That tech is going by the wayside.
And each new recording cost me, which means the publisher got multiple $ for the same song.
Is it any wonder that people take the “illegal” route?
And don’t get me started on phone tech, video tech, cars, etc!
I am an artist and photographer doing a lot of research in antique manuscripts and technical manuals in different languages to help me with my projects. Thanks to the Internet Archive, I have been able to freely access thousands of books, some of which are so precious that they can never be seen in their original form, or only one page at a time behind heavy glass in museums the world over… Archive.org is a TREASURE, and I can’t thank it enough for the amazing work it is doing. As I consider myself technologically inept, I can’t navigate the constant technical upgrades; thankfully there are experts who can do it for me. In the meantime, I just keep my downloads in PDF format on individual hard drives, hoping that they’ll remain usable for some years to come. Even with limitations, the digital era is a blessing, and I am truly grateful for the incredible resources it places right at everyone’s fingertips.
Slightly off topic here. This is indeed an important point of consideration regarding archiving of books or print of the past. Native e-books though, those created without a tangible format, will increasingly become more of a multimedia endeavor with less text as we move forward, potentially with their own evolving formats.
Wow, thanks for putting into words what I have sensed all along. This is why I’ve never owned an ebook and still read good old fashioned books (and value them) today. I value the old books because it is documented information that cannot be manipulated, and it’s history like it or not. I love books, collect and read them, love the library, and archives. Thanks to all who contribute to these things I value.
There is some truth in the article, but only SOME!
Changing from SATA drives to SSD makes absolutely NO difference to ebooks! That’s one false claim.
I’ve been buying Kindle books for considerably longer than 10 years, and they all still work. (Well, sometimes I have to go to Amazon and remove the book from my devices and then download it again, but not very often.)
I did have an ebook reader (name forgotten) that went out of business; they offered a free program that converted all of them to .PDF format, and that’s still readable. The DRM was all removed.
But, I agree that none of that is likely to last 100 years. That’s one reason I keep my several-hundred printed books right where I can find them any time I want.
Brewster Kahle, can I help? I mean, do you need volunteers to update books (formats)?
Yes! We need lots of help.
Music, images, sounds, and texts – all are available in various digital formats. Our eyes and ears can’t make sense of digits without some electronic device. The formats change and new devices are required. I hope we never lose the originals. Someday we will have machines that are analog, as we are.
Just last winter I made a project of handling about 200 3.5″ floppy disks from the Apple 2+ days in the late 1970s. I WAS actually able to locate a USB version of a 3.5″ floppy drive to read them, but I would say probably over 30% of the disks were no longer readable. The original data was lost (lots of old outdated software installs, nothing of any real value), but in most cases I was able to reformat the disks, which would renew the magnetic storage.
We don’t really store any digital books on our systems since we read everything via Kindle devices, but I have 70 thousand tracks of digital music and am digitizing 100 years of b&w family photos, and over 60 years of 35mm slides, along with about 50 thousand photos from digital cameras.
My computer network provides 30 terabytes of online storage, and as the projects progress, I make multiple backup copies on removable drives which will be distributed to younger family members.
For you techies out there, my motto is nested functions: Backup ( backup ( your backup ) )
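For anyone who wants to take the nested-backup motto literally, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names `sha256_of` and `backup_chain` are hypothetical, and the checksum check is an addition the motto itself does not mention. It copies a file to a first backup folder, copies that backup to the next folder, and so on, verifying each copy against the original's SHA-256 digest.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_chain(source: Path, destinations: list[Path]) -> list[Path]:
    """Back up the source, then back up that backup, and so on (hypothetical helper).

    Every copy is compared with the checksum of the original file, so a
    corrupted link in the chain raises an error instead of being copied onward.
    """
    expected = sha256_of(source)
    copies = []
    current = source
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(current, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
        current = target  # the next backup is made from this backup
    return copies
```

Calling `backup_chain(Path("photo.jpg"), [Path("/mnt/drive_a"), Path("/mnt/drive_b")])` (both paths invented for the example) would leave one verified copy on each drive. Copying from the previous backup rather than from the original mirrors the "backup of a backup" idea; copying each one straight from the source would be the more cautious design.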
Which all goes to show how good it is to have the Internet Archive available.
I did not read all of the comments, but everyone should know that when a digital book goes out of print, is censored, or is purchased by another publisher, it disappears not only from the online library but from your personal library as well, and you will not get your money back!
Do you have a specific example of a book this has happened for? Or is it something they have in the Terms and Conditions?
In the same vein, records are easily lost as well, especially if there is a desire on someone’s part to change them. I’ve been in the same university job for 51 years. I wrote the Grad Program. As various other Grad Advisors take the position, I find my grad students telling me that they don’t need this course, or they need that course. I go back to my original paper trail, hike to the advisor’s office, and say, “unless you can show me that this passed the departmental and university curriculum committee, this is the Bible.” Can’t imagine what has happened to millions of departmental “edicts” over the years. If it isn’t on the shelf in the minutes books, it will be superseded by a new bright idea 10 years later.
Well thanks so much for the update
Consider saving the e-books as PDF, which can’t be changed. Is there any chance that future readers or computers will not be able to read PDF? I don’t think so.
To my mind, by the next century the mass of data from the information age will be just a big empty hole, except maybe at the libraries. Downright ludicrous.
Scholarly societies such as the American Institute of Physics have wrestled with this problem for decades. They might have useful advice, and data on how expensive it is to do what they have been doing.
Revenge of analog. This is why I still own a VCR. I also own a number of books (I love old books) that are well over 100 years old, have never had any special maintenance, and are still perfectly readable. The oldest print matter I own is a pamphlet from 1706. As easy to read now as the day it was printed.
Interesting. This is something I started pondering about a year ago and is what first got me to start looking at The Internet Archive more closely.
Thanks for the article.
This only applies to formats that want to prevent excessive sharing or guarantee someone has paid for the ebook. Once it is stored in plain text like HTML, or even PDF, it is easy to read and access. I have documents from the early 90s that I can still read and share, and it has been 32 years and counting.
Problem arises when people want to profit from the ebook itself and try to complicate things.
Anyone who has worked in a used bookstore or used a library pre-tech (card catalogs!) knows paper books are a LOT of maintenance. If you have a large library, you better have a sorting system. You better also have lots of empty shelf space because adding new titles in could force you to shuffle every shelf down trying to make space. As someone else said, books 100+ years old are the exception – while they’re readable, you likely wouldn’t want to. If they’re leather bound, you don’t want to touch them too much. Most of the paper is very fragile. Covers and binding are falling apart (these are usually the only ones we see in the shop – old Bibles for repair.) If there is a scan available, you’ll want to read that one.
I will be pleased to publish an Italian translation of this post in my “Cassandra Crossing” column. However, I was unable to find the license of it.
If you agree, is it possible to have a republication rights statement from you for use under a free license like CC BY-SA?
Keep up the good work! Marco Calamari
I wrote this and you can use it as CC-by. Thank you!
oh great post thanks
Texts, books, music and computers have been among my (academic) interests since the mid-1980s. I live in Greece, but I did my graduate work in the humanities in the US in the early 1980s, where I first encountered library computer terminals. At that time some US friends of mine had just bought and used the first “word processors” (typewriters with very small screens on which you could check the 2-3 typed lines of text before sending them to paper). As Greece has always been part of “West Europe”, books in English, French, German, and Italian have been available in the book market, and there were/are specialized and general non-Greek language libraries and institutions available to academics and the general public.
My first PC (1986) was a British-made Amstrad PCW with a separate printer attached to it, but a few years later I could afford to buy an IBM PC and save my writings in WordPerfect for DOS. In any case, I kept printouts of my Amstrad PCW texts that I later scanned and converted to Word/Writer texts. My WordPerfect DOS texts were later machine-converted to Word/Writer docs. Similarly, when at some point in the early 2000s turntables almost disappeared from the market, I recorded my vinyl records and created music files that I burned onto music CDs. In the 2010s, after my house basement flooded (fortunately with little damage to my stored physical academic books), I scanned a lot of my books to PDF for personal preservation reasons, just in case the originals were destroyed. All this resulted in a waste of time and money just for me to have my work, library and entertainment available when needed.
This personal story of “digital preservation” shows that even individuals who have lived half of their lives in the 20th century cannot rely anymore on just one format in both printed matter and sound if they want to keep using it. Therefore, I fully understand the problem libraries face nowadays, and I am for standardization in formats and reasonable prices, especially for the same content in a different format. However, the cost of converting from one format to another must come not out of the pocket of the user but from the budget of the seller. A friend of mine was furious recently. He told me: I bought the book in paper, but I found it more versatile to have the electronic version, which I must now buy at the same price as the original book that I already own, and I may not even be able to download it to my PC. Also, he said, I first bought the vinyl record of my favorite band, then as a good buyer I bought the cassette version to play on my car player, then the CD version, then a subscription on a music platform, and now I have found out that there is a technologically improved vinyl record on the market again. How many times am I supposed to pay for the same music in my lifetime?
I think it is obvious why technology changes and DRM serve the seller very well but not the buyer. So, something must be done about it, and both libraries and individuals must own what they buy regardless of technology changes.
While you are pretty accurate in your depiction of the potential fragility of electronic documents, I fear you have fallen into the ‘either-or’ logical fallacy. Paper documents and their handwritten predecessors are no less fragile than electronic ones. It is not one or the other.
Have you heard of the burning of the library of Alexandria? How about the scholarly excitement when the Dead Sea Scrolls or the Nag Hammadi manuscripts were discovered? Or the treasures found among the discarded documents in a walled-off Genizah in a Middle Eastern synagogue? These documents had been lost for centuries, even more than a millennium.
I have been hunting for books in my areas of interest and expertise for more than fifty years. In that time I have frequented countless used book stores, thrift stores, church and charity book sales, and recycling centers. My observation is that most books have an average life cycle between purchase and disposal of about twenty years. People move, change their interests, die. They downsize, clean house, make room for new purchases. And that’s just the copies that turn up for resale.
There are many books that don’t make it that far. Burned in coal furnaces and fireplaces for heat. Put between walls for insulation. Used for wiping paper in the outhouse. You get the picture. Or our local recycling center where they have a bin the cubic size of a standard shipping skid. The majority of the books that come in don’t get put on the shelf – and not based on any logical policy I can determine. They get thrown in the bin, which, when full, gets sent off to the pulping plant to reappear as egg cartons. The chances of any individual book making it to 100 years old are slim to none.
I will give you two concrete examples. In a very old book, I saw a reference to an early modern book in English which I expected would be useful in my research. I went on line, including EEBO (Early English Books Online) and could not find a copy. Couldn’t find one in WorldCat either. So I reluctantly accepted that the book was lost. More than a year later, I quickly looked again. This time there was a scholar in Florida who referred to it. Two emails later I discovered there were two surviving copies in the world, both in the library of the Bishop of London. What if that library had been destroyed during the Blitz?
The second example was trying to find all the works of an author of interest. Her books were aimed at a minority interest group which is almost never represented on the shelves of local libraries. Not even scholarly libraries. So I had to buy physical copies one by one on ABE Books. In the back the publishers listed other works, and in one of these lists I discovered a document I had never heard of. Could it be an error? Another author had used the same exact title. Again I reluctantly accepted the apparent unanswerability. Many months later I took a second look. One copy survived in the collection of an obscure and apparently unrelated European Not-for-Profit organization. I was able to purchase PDF scans of the document from them online. And this document had virtually disappeared after only sixty years.
So the preservation of knowledge is not what game theory refers to as a zero-sum game. Neither electronic documents nor paper ones are indestructible, immortal, or eternal. Neither survives at the expense of the other. Neither destroys the other. Nor are they necessarily mortal or evanescent. The real question is this: “What are the conditions of survival?”
As other responders have pointed out, the risks of decay, and therefore the conditions for preservation, are different for the two families of media. But the conditions of preservation are not independent. The two paper documents I just described came to me and are now potentially available to others in electronic PDF format. The smaller one I have printed out and can read in paper format.
What we have here is an ecosystem in which both means of preserving documents are useful and important, and in both of which we need to pursue all reasonable means of preserving knowledge. In any ecosystem no element stands alone. All elements are interdependent in ways we are only beginning to comprehend. And no ecosystem is static. When I studied biology more than fifty years ago, ecology was an infant branch of the science. It was not concerned with pollution. It was concerned with the study of the natural progression of ecosystems – the study of change. Not the study of decay, but the study of life.
Where the Internet Archive is of inestimable value to humanity is in keeping knowledge alive in the midst of the inevitable progression of technology. It is part of the interdependent cooperation of scholars, enthusiasts, and normally curious human beings. Its specialty is electronic means of preservation. But how much of what it preserves would not be in electronic format if it were not for the prior existence of inherently fragile physical documents?
I found the article interesting in trying to compare digital books with analog varieties. It’s a bit like trying to compare chalk and cheese: you can’t. They each serve their own unique position in our existence.
I tend to read only Kindle books, both fiction and non-fiction, and find that I can have several hundred books at my fingertips on a machine that weighs very little. The problem is that cross-referencing between pages is not straightforward, but the advantage is that a digital book can be very easy to update.
This may be surprising to you because I am a bookbinder who manufactures hardback books and also repairs and rebinds old books, sometimes hundreds of years old, but reads on a Kindle!
I can see the advantages and disadvantages of both and love them both.
But for long-term reliability, the only truly sustainable option is the printed word, because if printed on correct paper it will last for many lifetimes.
As many of your readers have commented, digital formats change on a regular basis, so can you guarantee that a digital book published today can still be read in, say, 25 years’ time? Obviously not. Particularly if you take into account cyber attacks, CMEs from the Sun, or any other technologies that destroy digital data.
Sure you can compare digital books with paper books. You can do it when the exact same book is in multiple formats. You can also compare digital books with digital books.
A very rare and important physical book is discovered and scanned to a PDF. It has some missing and damaged pages. It has a number of typographic errors. It is uploaded to Internet Archive. In spite of its incompleteness it is the only copy readily available to researchers and scholars, and proves useful. The original owner’s signature is on the front flyleaf. There are astute marginal notations. Upside down on the back paste-down there is a handwritten verse which gives a new insight into the character of the scholar who previously owned it. All these addenda are visible in the PDF.
Conclusion one: The book, once forgotten in a library store room, is now available to everyone, and useful in spite of its incompleteness and typos.
The PDF is run through an optical character recognition program and made available in epub format. In spite of the increased convenience for those with e-reader tablets, it has many recognition errors, particularly in the small print footnotes. Much of the original copyright and publishing information has been deleted. The pagination does not match the original. The additions and marginal notes by the previous scholar are gone. The frontispiece illustration is not really useful in epub. But it is now downloaded by more people.
Conclusion two: the epub and similar electronic formats are inherently lossy unless carefully edited.
Now a publisher makes the book available in a print-on-demand edition taken straight from the unedited epub, ‘warts and all’. Its incompleteness is covered by a disclaimer absolving the publisher from all the OCR errors. There is no bibliographic information on the original print edition.
Conclusion three: Print on demand is inferior to a good PDF version of the original print book.
Next a pristine copy is discovered in a private library and scanned to PDF, which is uploaded to the Internet Archive and made available to Project Gutenberg. Gutenberg converts it to epub, and then painstakingly corrects the epub by comparing it to the original pages. Their version is superior to the previous epub version and to the print-on-demand version, which is still widely available. But it is still missing the original publication and copyright data, and lacks the original pagination.
Conclusion four: Even the best epub versions lack information which is available in the original book. The PDF versions are superior to the epub versions, but differ in value. One is a good copy of the pristine original book. The other is a good copy of an imperfect book, but contains hand written marginalia which are of great interest to later scholars.
All the available copies in whatever format contain useful information. They can be compared on the basis of the character and quality of information they contain. Pagination may be of little importance when reading the text, but of great importance when quoting a passage in a later work. Marginalia may be unimportant to a casual reader, but of supreme importance to a scholar. Even the quality of a reproduced illustration in one PDF version may be far superior to that in another. The differences are in how the information is processed and handled. The basic rule: All reprocessing, including from original manuscript to first publication, risks introducing random noise. But it is all information. It is all digestible.
Chalk and Cheese both contain Calcium. But Cheese is digestible. Chalk is not.
You are correct on “if printed on correct paper”, but it is a big IF. Rag paper from the eighteenth century may still be durable and safely readable. But surviving copies of a book may be very rare. A lot of early wood pulp paper from the late 19th century seems to have had a very high acid content, and is now very brown and very brittle, even when it was published in attractive bindings. If you turn a page carelessly, the edges break off, or it goes to pieces. A few readings could do irreparable damage.
The most likely solution to both problems is to scan the pages to PDF format. Since individual surviving copies often have weaknesses, scans of more than one copy may be useful. This the Internet Archive does very well.
The problem with printed copy sustainability is when you need to travel from Michigan to a university library in California to use it. As long as one good electronic copy survives, it can still be shared electronically around the planet.
I buy old books at sales, bookstores and flea markets. I found a lot of them dumped as trash outdoors (somebody got rid of a deceased parent’s books). I scan them to PDF and upload them to archive.org for future preservation. Thanks, Brewster Kahle.
I have +24k physical books. I like the smell of old books.
Thousands of scanned photos, books and other e-books are stored on several hard disks.
One hard disk failed but I never lost a single e-book due to regular backup.
Recently, a mouse damaged a pile of old books that I had previously scanned. I would like to scan my entire home library, but that’s impossible. Dozens of encyclopedias, dictionaries… Every non-scanned book is endangered if not stored properly.
Was it an intentional choice to accompany an article praising physical material with an AI-generated illustration? The irony!
I, for one, still prefer the personal physical contact and relationship that I have with an ‘actual’ book to the tiny, unfeeling, diluted space that all your literature has to share. Each book has its own space and identity. Digital books share their covers with every other. It’s no wonder then that they’re easily forgotten.
A few months ago I bought a 2012 NOOK (BNRV300) at a Mexican swap meet for 100 pesos, roughly 5 USD. All the books are intact, and I’ve been downloading epubs onto it. Works perfectly.
People diss on ebook readers because they grew up with books and feel nostalgic for the smell and feel of turning pages. I feel the same for physical books. And also I feel nostalgic for my old Palm pilot that I used to read on back in the early 2000s. Nostalgia is a big factor that nobody wants to address when talking about the usefulness or value of physical books. I guess the scribes who transitioned from scrolls to books also felt the same about their old musty papyrus.
I used to work in a library by the beach and it was insane how many books went to the trash because they had mold. I don’t agree that paper books are superior to digital copies in regard to preservation and durability.
The problem with ebooks today is that the big corporations make restrictions for copying and distribution. In the long run, it’s the PDFs and text files that will survive.
Really Good !
This is true; an online book can easily go missing compared to hard-copy books.
great post thanks a lot
The reason most books will be unusable is copy protection. I have many books that would have been obsolete as MS stopped supporting the reader that used .lit files. Copy protection and the need to continually upgrade the programs is what makes digital books have a shorter shelf life than paper books.
As someone who lives in a smaller senior apartment, I am very happy that I have my library in a digital format. I don’t have the room for many shelves of books here. Being able to carry enough books to last a month on my phone is just too convenient.
It should be necessary for any vendor of ebooks that are marketed for a single specific type of reader to provide a way to break the copy protection at will if the reader type is no longer available.