3 Million Texts for Free

Posted on September 17, 2011 by Brewster Kahle

Hundreds of libraries reached the milestone of offering 3 million freely downloadable texts yesterday through the Internet Archive website. Our 3 millionth text is a Galileo pamphlet from the rare book collection of the University of Toronto.

Internet Archive has been scanning books since 2005. We have made approximately 2 million books from 1,000 libraries in 200 languages available online since that time. Another 1 million texts have been uploaded by others, including everything from original books to court records to scans from other digitization projects and 37,000 books from Project Gutenberg.

More than 100 people digitize books in Internet Archive scanning centers in 27 libraries in 6 countries. At 10 cents a page, we are bringing over 1,000 new books online every day.

Archive.org is visited by more than 1 million different users every day. Books are downloaded or read on archive.org about 10 million times each month, and approximately 2,000 books for the blind and dyslexic (print disabled) are downloaded every day.

Other projects use the texts archive in bulk. Researchers at the University of Massachusetts have used millions of archive.org books to do digital scholarship. OpenLibrary.org integrates these books with many thousands of recent books for the print disabled and library borrowers. All of the public domain books are full text searchable, indexed by multiple search engines, and downloadable individually or in bulk.

Please help us build the library of free books by scanning and uploading, by donating physical books to the Internet Archive, or by sponsoring the digitization of great collections!

43 thoughts on “3 Million Texts for Free”

Gerard Arthus September 17, 2011 at 2:47 pm

Great milestone. And the pace is accelerating. It is wonderful that there are so many different types of items. The Internet Archive is truly becoming a great repository.

Gerry
Antoinette Baranov September 17, 2011 at 4:35 pm

What a great idea! Would love to know how to access books!
The Glove Compartment Isn't Accurately Named September 17, 2011 at 6:54 pm

Congratulations on reaching this milestone, and good luck for the future!
Stian Håklev September 17, 2011 at 9:21 pm

This is absolutely wonderful news. However, when I visit openlibrary.org it says 928,782 ebooks available for online reading. Where are the other two million books? On Archive.org? But it’s almost impossible to browse Archive.org in a meaningful way. I hope you make efforts to make these books available to the great interface at openlibrary.org!
1. David Edwards September 19, 2011 at 5:13 am
  
  Openlibrary.org is a subset of Archive.org.
  
  “Another 1 million texts have been uploaded by others, including everything from original books to court records to scans from other digitization projects and 37,000 books from Project Gutenberg.”
  
  Everything else is what Archive.org has scanned itself. It is the same search engine so I am not sure why you find it harder to search Archive.org.
  
  Anyway, Congrats on the 3 million books.
  1. Charles Redman mailmandeliver@yahoo.com October 2, 2011 at 8:13 pm
    
    I agree, I find http://www.archive.org/ a great tool in tracking down my text, be it word search by title, author, etc. I have an easy time there and can spend hours searching and reading… I would love to be a part of this, I only do not know how a country boy can help. 🙂
    1. Lorri Cornett November 17, 2011 at 3:28 pm
      
      Help out with Project Gutenberg by helping to proof documents online. You can do it anytime, for any amount of time. Look into it if you are interested in being part of this world-wide community effort!
Rodrigo Barba September 19, 2011 at 1:06 pm

Great milestone! Live long and prosper
Dennis Higman September 19, 2011 at 3:36 pm

I welcome the opportunity provided by such organisations to make public thousands of books from the seventeenth, eighteenth and nineteenth centuries. As a student of history, it enables me to access many primary sources referred to in modern publications which would otherwise be difficult to access. However, I have found Google to be a frustrating organisation when it comes to digitising history books, as they frequently fail to publish maps and tables that fold out. While such information may appear ‘dated’, as these are history books, I feel that they are integral to the work as published. I therefore seek to avoid Google e-books and seek others who take great pains to publish works in full. Other than that, I reiterate my applauding of the idea of making these books available to the masses.
Jeroen Hellingman September 20, 2011 at 8:08 pm

Good to hear about the contribution of Project Gutenberg. Although 37.000 books may sound like just a small fraction of 3 million, those books are not just scans, but carefully proofread text versions (often accompanied by illustrated HTML), and downloadable in a range of different formats for all kinds of reading devices from the Project Gutenberg website.

Many of those Project Gutenberg versions have actually been prepared from scans that have been posted in the Internet Archive in the first place, so this is really an enabling project!
Denis September 21, 2011 at 6:52 am

Amazing speed of digitization. But it is a curious fact that not all 19th-century books in English have been digitized, while some books have more than a dozen digital versions.
The reason for the fall of interest in books is that modern books just are not interesting. So the digitization of these jewels is highly useful for the time when people start to think deeply and to read relevant books.
Will books from the British Museum come to the archive?
I’m a foreigner learning English from readers and books of fairy tales, the books of the 19th century, and the pleasure derived from reading is enough to do this all day. And students suffer uselessly without finding interesting ideas in modern books.
Christy September 26, 2011 at 6:37 pm

I like what you people are doing
Paul Richardson October 2, 2011 at 2:33 pm

Congratulations on providing the 3 millionth text! Such a tremendous undertaking.
Sentient October 2, 2011 at 3:15 pm

Congratulations on your 3 millionth text preserved for posterity!
Paul Richardson October 2, 2011 at 5:11 pm

How do you donate physical books to the archive? How should the books be packaged and shipped? Is there a discount for shipping to the archive?

Great work.
1. internetarchive October 3, 2011 at 12:14 am
  
  Books can be sent to:
  Internet Archive
  300 Funston Avenue
  San Francisco, CA 94127
  
  Thank you.
Linux October 10, 2011 at 10:53 pm

You guys are doing a great job, keep it up!
radaca@libero.it October 18, 2011 at 8:38 pm

Thank you !!!
PBS October 19, 2011 at 10:02 pm

Congratulations. Real Great Job.

However, titles of some Telugu Books happen to be mis-spelt and require correction. I can do my bit by helping the Archive in correcting these titles.
Jampa Namgyal October 30, 2011 at 3:06 pm

I’ve been following you for years. I’m glad to see this new success of yours. As for me, I recently began to upload some of the works I think interesting and useful to share. In my blog about digital libraries (myfullresearch.wordpress.com) I frequently cite your site or items. Your text collections indeed satisfy every request.
Thanks again!
Steven Shepard November 6, 2011 at 2:18 pm

I love what you are doing here! I appreciate all of your noble projects, including the Netlabel archive! Books are great!
Chris K November 6, 2011 at 2:43 pm

An amazing achievement..only wish it had happened 20 years ago !

But PLEASE note one BIG annoyance..journal volume numbers and dates are NOT given in the journal volume listing. So necessary to trawl through volume by volume to find the one I’m looking for, as even when in date order there are duplications and sometimes volumes missing !
Jampa Namgyal November 11, 2011 at 10:39 pm

I have just finished the analysis of 1/3 of the IA texts that can be browsed under the tag ‘language=italian’, and to my surprise I found many times a heavy difference between the declared and the real.
In fact, only about 75 per cent of the texts labelled ‘Italian’ are written in Italian. Of the remaining 25 per cent, roughly half are written in Portuguese, then in Latin, then a few in Spanish, French, German and, last, English.

At this moment, the texts declared as ‘Italian’ number 45,974. This means that the ‘real’ Italian texts could be more or less 34,500.

It’d be interesting to know how the language attribution is performed by IA. If it is a software algorithm, it is simple to argue the fact that many words in Portuguese, Spanish, Latin and Italian can easily overlap in appearance if not in meaning.

I hope that the same algorithm may make the same mistake (reversed) in those similar languages – so the general balance is restored. Anyhow it is better to search a text using safe elements like ‘title’ or ‘author’.

My second point here is that the possibility of reviewing the item is not a connection with the ‘system administration’ in order to correct errors or omissions. The review remains a dead letter. I wrote a number of reviews in which I gave notice of the language misattribution, but they are one-way communications. What about a change in this matter?

Anyhow, thanks for the huge work performed. I greatly appreciate it.
1. brewster kahle November 12, 2011 at 5:40 am
  
  Cool! I hope we can work together to fix up the language fields and anything else. onward.
  
  -brewster
  1. Jampa Namgyal November 12, 2011 at 10:17 am
    
    Hi, brewster.
    I’m interested. As you know, the road to hell is paved with good intentions, but we can try.
    You have my e-mail – please use it to establish the correct know-how and the road map of the matter.
    I’ll follow.
    Hear you next time.
smp3lembang November 15, 2011 at 10:20 am

That’s a great source for the history subject in my school, thanks for sharing
Jessica November 23, 2011 at 10:37 am

I have been using Project Gutenberg for several months now and can say that I really like it. There are many inspiring books that really helped with my studies.

Highly recommended from me.
JC November 30, 2011 at 9:54 pm

Truly an amazing achievement. We’re all about supporting better reading/writing and plan to donate books ASAP.

It’s a good idea for everyone to spread the word: don’t toss your books, donate them to the Internet Archive.
Homer Otto Goodall December 8, 2011 at 3:56 am

I have about 3000 books I am now scanning and I will be uploading them in the future. This doesn’t include 10 tb of films recovered from old computer hard drives, old homemade dvd’s, and other sources.
zara james December 10, 2011 at 12:24 am

Thanks for sharing this, but 3 million texts! That is truly amazing, how does one go through all of that?

How many are available for the new Kindle?
Iftekhar Ahmed December 10, 2011 at 11:22 am

Book is the best friend and Archive have a sum of 3000000 friends inside, one who desire to get more friends must peep inside Archive as well!
Great Congratulations!
subbarao December 12, 2011 at 2:31 pm

A useful website for all. You are all doing a good job and a good service for the future.
How can I print from DjVu files, please?
Thanks.
Homer Otto Goodall December 13, 2011 at 2:44 am

I will go and get a thumb drive. The ones I have scanned and assembled are

Powder Valley Payoff a very rare pulp paper book

Fundamentals of Chemistry

Applied Chemistry for Nurses

E.L. Godkin and American Foreign Policy 1865 to 1900

The Great Fog Weird Tales of Terror and Detection

Light in My Window

CRAM Outer Space and World Globe Handbook

Favorite Country Cooking

Ross’s Business English

The Parish School Hymnal

Just Jenifer

This is just a small taste. I’ve become a book hoarder and need to get rid of a few in the paper version.
forever December 24, 2011 at 11:20 am

This project is so great! We can share all we have
ramamohan December 26, 2011 at 5:54 pm

it is nice to join the club
John Roberts-James December 26, 2011 at 5:55 pm

I give up. This site is TOO difficult to manage. I’m going elsewhere!!!
Diana August 14, 2012 at 9:12 am

It’s so amazing, 2 million books. Cool
