3 Million Texts for Free

Hundreds of libraries reached the milestone of offering 3 million freely downloadable texts yesterday through the Internet Archive website.  Our 3 millionth text is a Galileo pamphlet from the rare book collection of the University of Toronto.

Internet Archive has been scanning books since 2005.  We have made approximately 2 million books from 1,000 libraries in 200 languages available online since that time.   Another 1 million texts have been uploaded by others, including everything from original books to court records to scans from other digitization projects and 37,000 books from Project Gutenberg.

More than 100 people digitize books in Internet Archive scanning centers in 27 libraries in 6 countries.  At 10 cents a page, we are bringing over 1,000 new books online every day.

Archive.org is visited by more than 1 million different users every day.  Books are downloaded or read on archive.org about 10 million times each month, and approximately 2,000 books for the blind and dyslexic (print disabled) are downloaded every day.

Other projects use the texts archive in bulk.  Researchers at the University of Massachusetts have used millions of archive.org books to do digital scholarship.  OpenLibrary.org integrates these books with many thousands of recent books for the print disabled and library borrowers.  All of the public domain books are full text searchable, indexed by multiple search engines, and downloadable individually or in bulk.

Please help us build the library of free books by scanning and uploading, by donating physical books to the Internet Archive, or by sponsoring the digitization of great collections!

Posted in Announcements, Books Archive, News | 42 Comments

Hard Drive Archaeology – And Hackerspaces

Two different, but somewhat related additions to the archive you might want to check out.

First, I was contacted earlier this week about a project to recover information off of an old Cray-1 supercomputer hard drive. Unlike, say, trying to get your old floppies to read or pulling an old mix tape off of a cassette, with something as old as a Cray-1 (a computer once called the “World’s Most Expensive Love Seat”), you don’t even have a place to really plug it in: functioning Cray-1 machines are as rare as you can get, and even if you were to get the hard drives spinning and readable – how would you get the data off the Cray?

Researcher Chris Fenton has a thing about Cray supercomputers – he built a tiny homebrew version of one that used emulation to allow you to experience some aspect of Crays, from his desktop. So when he found himself with an 80 Megabyte CDC 9877 disk pack, which was quite a lot for the early 1970s, it wasn’t just a matter of hooking it up to USB. (Actually, we have a brochure for the behemoth you would put this disk pack into to read it.)  Here’s what a nearly-the-same CDC 9987 looks like:

Ultimately, Fenton got the information off of the disk pack using a whole variety of techniques and experiments, as part of a research project this summer. He wrote a paper about the process, entitled “Digital Archeology with Drive-Independent Data Recovery: Now, With More Drive Dependence!” and it’s now mirrored here at the archive. If nothing else, be sure to browse through the paper just to see the customized stepper motor and reader he built to pull the magnetic data off the platters. And I was kind of understating things… ultimately he did hook it up to USB.

From this careful, forensic-quality magnetic scan of the drive, Fenton has produced a large image of the disk, one far larger than the data on it but allowing further experimentation and reading from the image without having to build a robot in your basement. And now, we’re offering this image on the archive. Remember, you won’t be able to pull this data down and go back to the 1970s, instantly – you should be reading up on documentation of disk formats, learning how to pull information off of magnetic flux recordings, and picking up a whole host of other material and knowledge… but hey, weekends are for having fun, right?

Even ten years ago, the idea of offering several gigabytes of something (that expands out to about 20 gigabytes of something) online was beyond crazy – that we’ve come so far in offering this much to so many people speaks to how much the world has changed since the era of this disk pack.

Fenton is associated with the NYC-based hackerspace NYC Resistor, and it was their mailing list that got in contact with me to get this disk image up to the archive.

Coincidentally, this was also the week that two NYC Resistor members released a book, for free, which you might really enjoy. Bre Pettis and Astera Schneeweisz hatched a plan to make a book on hackerspaces at the end of 2008. They wanted to put it together in less than two weeks, and as people submitted photos, essays and other material, the project increased in size, more folks were brought in, and this month the end result was released for free.

Entitled “Hackerspaces: The Beginning”, this photo-filled book is available at the archive to read online or download. A view of hackerspaces throughout the world as of 2008, it also includes memories of spaces past and dreams of spaces future. It’s an excellent snapshot of a beautiful, technological world well worth browsing this weekend (and weekends to come).

So if you’re in the mood for advanced research or just to check out some great photos, the archive’s got something for you!

Posted in Cool items, Software Archive | 13 Comments

Understanding 9/11: A Television News Archive

We are proud to announce the launch of Understanding 9/11: A Television News Archive, a library of news coverage of the events of 9/11/2001 and their aftermath as presented by U.S. and international broadcasters. A resource for scholars, journalists and the public, the library presents one week (3,000 hours from 20 channels over 7 days) of news broadcasts for study, research and analysis, with select analysis by scholars.

Television is our preeminent medium of information, entertainment and persuasion, but until now it has not been a medium of record. Scholars face great challenges in identifying, locating and adequately citing television news broadcasts in their research. This archive attempts to address this gap by making TV news coverage of this critical week in September 2001 available to those studying these events and their treatment in the media.

Background on the Television Archive

Internet Archive is a non-profit library founded in 1996 that started by attempting to collect every webpage from all websites. This is a major task but it is doable even by a non-profit.

Another medium, television, struck us as historically under-appreciated, despite its tremendous importance. Television is pervasive and persuasive, but it is difficult to access programs for research and analysis.  We felt that TV should be a medium of record, a moniker generally reserved for newspaper publishing. As we learned in high school, to effectively understand we need to be able to ‘compare and contrast’. We need to be able to quote.

Talking with the Library of Congress in 2000, we found that they were not systematically recording TV. Talking with the Foreign Broadcast Information Service, which was collecting TV for the US intelligence community, we found it would probably be difficult to get the recordings from them for library use. The notable Vanderbilt TV News archive at that time was struggling financially and only captured several hours of television news each night. As a result, we decided to create the Television Archive to help preserve this culturally important medium.

Starting in late 2000, we began collecting Russian, Chinese, Japanese, Iraqi, French, Mexican, British, American, and other stations… 20 channels of TV in DVD quality.

When the events of September 11, 2001 occurred, we, like most Americans, urgently wanted international perspectives on the United States. Stunned by the attacks, we tried to figure out what we could do to help.  Seventy-one people and organizations worked together to get one week of TV News up on the Internet to be launched on October 11, 2001. (Bear in mind this is 3 years before YouTube started.) Launched at the Newseum in Washington DC, the website allowed anyone to research the collection of 20 channels for the week of September 11th.

Today, we are relaunching this collection with an updated interface at a conference at NYU.

Posted in Announcements, Event, News, Television Archive | 22 Comments

Scanning a Braille Playboy

Hi. I’m Jason Scott, adjunct archivist at archive.org, and I wanted to talk about the time I watched the Internet Archive scan in a Braille issue of Playboy magazine.

Many people might not know there have even been Braille editions of Playboy, but they’ve been printed since 1970, a function of the National Library Service for the Blind and Physically Handicapped (NLS), which puts out a variety of transcribed versions of periodicals and books and other materials, in forms such as digital text, mp3/spoken, and of course Braille. The service has been going on since the 1930s, and makes these materials available to a wider audience than might otherwise get access. (In the 1980s, an attempt was made to drop the magazine, but it was re-instated after a court battle.) A friend of mine, Thomas Dell, had a copy and I brought it to the Archive, where we decided to scan it in.

For many folks, the Internet Archive is the group that houses the Wayback Machine, a miraculous petabytes-large collection of archived web pages that show progression of the World Wide Web since the 1990s, a browsable museum that grants near-instant access to over a decade and a half of information, much of it located nowhere else. To make this marvel work, a lot of engineering, planning and coding has gone into place, much of which is not instantly obvious as you go looking for that special page you saw once but which is now gone.

But the Archive is also the location of much more, including what I want to talk about today: an amazing, globe-spanning book scanner operation that is currently bringing in a fantastic amount of scanned books from a huge variety of sources. What rates as a fantastic amount? Try over one thousand books a day. Checking the internal statistics, I see an average of 47 books added every hour for the last year, which means a book is being added to the archives roughly every 77 seconds.
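For anyone who wants to check that rate, here is a quick back-of-the-envelope sketch in Python. The only input is the 47-books-an-hour average quoted above; the variable names are made up for illustration. It works out to a bit over a thousand books a day and a little over a minute per book.

```python
# Back-of-the-envelope conversion of the scanning rate quoted in the post.
BOOKS_PER_HOUR = 47  # average taken from the post; everything else is derived

books_per_day = BOOKS_PER_HOUR * 24
seconds_per_book = 3600 / BOOKS_PER_HOUR

print(f"{books_per_day} books per day")
print(f"one book every {seconds_per_book:.0f} seconds")
```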

How is this possible? I figured I’d find out.

So, I took my copy of Playboy and brought it over to the scanning center, where I was greeted by an impressive array of equipment and staff who can bring in so much so quickly.

There are lots of these scanning centers affiliated with or run by the Internet Archive worldwide, often in alliance with libraries or academic institutions, bringing in a whole range of materials – not just books. Audio, video, microfiche and a few other mediums are being brought in via a very well engineered combination of machines, processes and trained staff. This link gives a lot of information if you’re a group who has a bunch of books to scan in and want a great open service to work with.

There’s a bunch of incoming material to these scanning centers, and items patiently wait next to the equipment for their big chance. On another side of the room are the items on their way out, where they’re being made available for auditing, quality control and verification. Eventually these books might go back to donating institutions or to the Physical Archive, the recent addition to the Internet Archive’s family of projects. They’ll go into deep storage should the originals be needed again.

Venus, who was running the shift that day, allowed me to jump this queue with my strange little artifact (officially “Playboy: Braille Edition, February 1992, Part 3 of 4, Volume 39 Number 2”) and run it through the scanning process.

It had already been assigned a unique identifier, playboybraile00nlsu, which makes it easy to account for and to find on the web afterwards. A barcode reader at the scanning station ensures it’s in the system.
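To give a feel for why the identifier matters, here is a minimal sketch of turning one into a web address. It assumes the usual archive.org/details/ URL pattern for item pages; the function name and constant are hypothetical, not part of any Archive tooling.

```python
from urllib.parse import quote

# Assumed URL pattern for item pages; not taken from the scanning software.
DETAILS_BASE = "https://archive.org/details/"


def details_url(identifier: str) -> str:
    """Return the presumed details-page URL for an item identifier."""
    # quote() guards against characters that are not safe in a URL path.
    return DETAILS_BASE + quote(identifier, safe="")


print(details_url("playboybraile00nlsu"))
```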

The setup took longer than the majority of books do because these Braille issues are oversized, an odd color (kind of a paper bag consistency) and more like a newspaper stapled through than a standard binding. This was very informative, because the crew has a variety of tests and tools to make sure the scans are as good as they can be, including foam bracing, dowels, and shims. They tried a variety of approaches before settling on one.

Once the arrangement had been decided on, the calibration worked out, the lights adjusted and the process begun, it went very fast – this 98-page book was scanned in less than five minutes, and I only got a shot or two of the process. The Scribe system works very efficiently and someone trained with the system can work smoothly, with no damage or stress to the book or binding. Good thing, too – while the Playboy is only 19 years young, some of the books scanned have been around for centuries and wish to continue to do so.

Personally, this little note on the machines does its best to bring it home for me, reminding operators that the goal is to scan one thousand pages an hour, and to shoot for eight thousand pages a day. Now imagine the multiple stations in this location, and the locations all over the world, and you begin to see how much is being done here.

I took back the Playboy and a few hours later, after a process of deriving the original scans into a whole host of convenient formats such as PDF, DjVu, and EPUB, a Braille version of Playboy can now be seen on archive.org.

Now, I will be the very first to admit – the result is pretty silly. You’ve got something that needs to be read by touching it, which can’t be touched, and the two-sided indentations on the paper mean it all looks pretty darn strange. So on one hand, it all can seem pretty useless.

But what can we learn by clicking on the link? Well, we find out that this sort of thing exists at all, and why, and what it looks like, and how Braille can be printed on both sides, and that it would take four copies to produce the text of a single issue… and that apparently, there’s no centerfold.

If you’ve not given the Archive a chance as a place to check out books, you should head on over to the OpenLibrary or the main Internet Archive site, where there are millions of books waiting for you, your friends, your family, your school... and where it’s not just a scan of Braille, but some truly stunning works, like:

It’s all right there, waiting for you, an endless and amazing supply of information, research, entertainment and learning brought in by this spectacular group. May I suggest a browse?

And remember, if anyone catches you reading that issue of Playboy we’ve been discussing and whose journey we got to witness… just claim you were truly, honestly reading it for the articles.

Posted in Books Archive | 12 Comments

Open Hardware: Inexpensive Enclosures From Junction Boxes.

I had a need for a cheap, standard enclosure for a humidity and temperature monitoring project. While there are many, many options for enclosures out there, few are cheap AND locally available. It occurred to me that electrical junction boxes are widely available, inexpensive, and consistently dimensioned.

So, off to Home Depot I went, wallet and calipers in tow. There were a few attractive junction boxes, each around $1:

Raco 1-Gang Drawn Square Box
Model # 8190 Home Depot SKU # 587799

Raco 1-Gang Welded Square Box
Model # 8189 Home Depot SKU # 201863

Carlon 2-Gang 20 cu. in. Switch and Outlet Box
Model # A521DE-CARR Home Depot SKU # 271612

There was even a blue plastic cover!

But, on closer inspection, the cover turned out to be unsuitable. It’s made of PVC, which cannot be cut or marked on the laser. Etching or cutting PVC on the laser releases hydrogen chloride gas (hydrochloric acid once it meets moisture), which is toxic and corrosive, and doing so voids the warranty on your laser cutter. Don’t cut PVC/Vinyl on the laser if you value your health, safety, and/or warranty. Incidentally, if you are buying a used laser, always look for signs of rust around the optics/cutting area. Rust is a good indicator that the laser was abused in this particular way.

After some iteration on cheap 1/4″ import Baltic Birch plywood…

I came up with this — a simple, Open Hardware cover and liner system for junction boxes. If you have access to a laser cutter, you can now make custom project boxes, suitable for holding Arduino AND a shield, in minutes. It’s as simple as a top plate and a bottom plate – the bottom plate designed to insulate the Arduino or other electronics from the metal box. Of course, as pictured above, you can also use the blue PVC boxes while retaining the laserability of this cover.

Here’s a nice shot showing some of the better features of this setup. First, by knocking out one of the knock-outs on the side, it is possible to feed in ethernet, USB, and sensor cables with room to spare. Second, even with the insulating plate in place, there is enough room for Arduino with a shield and header pins sticking up. Third, the box comes with screws suitable for fixing the cover in place. Pretty slick, and very cheap.

This is Open Hardware.

The Internet Archive is pretty excited about Open Hardware, and most or all of my work here will be released as such. This is release number 1 of many. Here is the artwork. (This link will be updated shortly.)

Posted in Hardware | 14 Comments

LEARNING FROM RECORDED MEMORY: 9/11 TV News Archive Conference

Co-sponsored by Internet Archive and New York University’s Moving Image Archiving and Preservation Program, Tisch School of the Arts

Wednesday, August 24, 4:00-6:00 pm; reception follows

New York University, Tisch School of the Arts, 721 Broadway, 6th Floor, Michelson Theater, New York, NY 10003

This conference highlights work by scholars using television news materials to help us understand how TV news presented the events of 9/11/2001 and the international response. Our collective recollection of 9/11 and the following days has become inseparable from the televised images we have all seen. But while TV news is inarguably the most vivid and pervasive information medium of our time, it has not been a medium of record. As the number of news outlets increases, research and scholarly access to the thousands of hours of TV news aired each day grows increasingly difficult. Scholars face great challenges in identifying, locating and adequately citing television news broadcasts in their research.

The 9/11 Television News Archive (http://archive.org/details/911) contains 3,000 hours of national and international news coverage from 20 channels over the seven days beginning September 11, plus select analysis by scholars. It is designed to assist scholars and journalists researching relationships between news events and coverage, engaging in comparative and longitudinal studies, and investigating “who said what when.” What kinds of research and scholarship will be enabled by access to an online database of TV news broadcasts? How will emerging TV news studies make use of this service? This conference offers contemporary insights and predictions on new directions in television news studies.

SCHEDULE

4:00:  Welcome: Richard Allen, Chair, Department of Cinema Studies, Tisch School of the Arts, NYU
4:05:  Brewster Kahle, Founder and Digital Librarian at the Internet Archive
4:15:  Brian A. Monahan, Iowa State University
4:25:  Deborah Jaramillo, Boston University
4:35:  Marshall Breeding, Vanderbilt Television News Archive
4:45:  Mark J. Williams, Department of Film and Media Studies, Dartmouth College
4:55:  Carolyn Brown, American University
5:05:  Michael Lesk, Rutgers University
5:15:  Beatrice Choi, New York University
5:25:  Scott Blake, Artist
5:35:  Discussion
6:00:  Reception (Remarks by Dennis Swanson, President of Station Operations, Fox Television)

SPEAKERS

Welcome: Richard Allen, Chair, Department of Cinema Studies, Tisch School of the Arts, New York University

 

Brewster Kahle, Internet Archive

“Introducing the 9/11 TV News Archive”

Brewster Kahle founded the Internet Archive in 1996 and serves as its Digital Librarian.  An entrepreneur and Internet pioneer, Brewster invented the first Internet publishing system and helped put newspapers and publishers online in the 1990s.

Brian A. Monahan, Iowa State University

“Mediated Meanings and Symbolic Politics: Exploring the Continued Significance of 9/11 News Coverage”

In-depth analysis of television news coverage of the September 11 attacks and their aftermath reveals how these events were fashioned into “9/11,” the politically and morally charged signifier that has profoundly shaped public perception, policy and practice in the last decade.  The central argument is that patterned representations of 9/11 in news media and other arenas fueled the transformation of September 11 into a morality tale centered on patriotism, victimization and heroes.  The resulting narrow and oversimplified public understanding of 9/11 has dominated public discourse, obscured other interpretations and marginalized debate about the contextual complexities of these events. Understanding how and why the coverage took shape as it did yields new insights into the social, cultural and political consequences of the attacks, while also highlighting the role of news media in the creation, affirmation and dissemination of meanings in modern life.

Brian Monahan has extensively researched news coverage of 9/11, resulting in a number of scholarly presentations and a book, The Shock of the News: Media Coverage and the Making of 9/11 (2010, NYU Press).

 

 

Deborah Jaramillo, Boston University

“Fighting Ephemerality: Seeing TV News through the Lens of the Archive”

The experience of watching the news on TV as events unfold is often complicated by the space of exhibition — typically, the domestic space. When hour upon hour of news is catalogued and archived — placed in a space of focused study — the news and the experience become altogether different. What was meant to be ephemeral acquires permanence, and what is usually a short-term viewing experience becomes a rigorous, frame-by-frame examination. In this presentation I will discuss how the archive challenges researchers to adopt new ways of seeing and explaining TV news.

Deborah L. Jaramillo is Assistant Professor in the Department of Film and Television, Boston University.

 

Marshall Breeding, Vanderbilt Television News Archive

“An Overview of the Vanderbilt Television News Archive”

Marshall Breeding will give a brief overview of the Vanderbilt Television News Archive and how it carries out its mission to preserve and provide access to US national television news.   He will relate the incredibly diverse kinds of use that the archive receives, including: academic scholarly research; individuals seeking coverage of themselves or family members that may have appeared on the news in life-changing events; those needing historic footage for current journalism, documentaries or other creative works; or corporations or non-profits researching news coverage of their vested topics.  Breeding will also outline some of the constraints the archive faces in how it provides access to its collection.

Marshall Breeding is the Executive Director of the Vanderbilt Television News Archive and the Director for Innovative Technology and Research for the Vanderbilt University Library.

 

Mark J. Williams, Department of Film and Media Studies, Dartmouth College

“Media Ecology and Online News Archives”

Online TV news archives are a crucial digital resource to facilitate the awareness of and critical study of Media Ecology.  The 9/11 TV News Archive will fundamentally enhance our capacity for the study of historical TV newscasts. Two significant research and teaching outcomes for this area of study are A) to better understand the role of television news regarding the mediation of society and its popular memory, and B) to underscore the significance of television news to the goal of an informed citizenry.  The 9/11 TV News Archive will enhance and ensure the continued study of the indelible tragic events and aftermath of 9/11, and make possible new interventions within journalism history and media history, via online capacities for access and collaboration.

Mark J. Williams is Associate Professor in the Department of Film and Media Studies, Dartmouth College.

 

Carolyn Brown, American University

“Documentation and Access: A Latino/a Studies Perspective on Using Video Archives”

This talk will explore the possibilities and potential of using accessible video news archives in two areas: immigration research in the field of communication and documentary journalism. I will speak of the significance of video news archives in my current film, The Salinas Project, and discuss my continuing research on Latino/as and immigration in the news.

Carolyn Brown is Assistant Professor in the School of Communication and Journalism at American University. She produced daily news shows for MSNBC News and Fox News Channel, and has worked as a producer and senior producer in local news in San Francisco, Washington, D.C., and Phoenix.

 

Michael Lesk, Rutgers University

“Image Analysis for Media Study”

Focusing on television news coverage of the 9/11 attacks, this talk will outline strategies for automatic quantitative analysis of television news imagery.

After receiving a PhD degree in Chemical Physics in 1969, Michael Lesk joined the computer science research group at Bell Laboratories, where he worked until 1984. From 1984 to 1995 he managed the computer science research group at Bellcore, then joined the National Science Foundation as head of the Division of Information and Intelligent Systems, and since 2003 has been Professor of Library and Information Science at Rutgers University, and chair of that department 2005-2008. He is best known for work in electronic libraries, and his book “Practical Digital Libraries” was published in 1997 by Morgan Kaufmann and the revision “Understanding Digital Libraries” appeared in 2004.  He is a Fellow of the Association for Computing Machinery, received the Flame award from the Usenix association, and in 2005 was elected to the National Academy of Engineering. He chairs the NRC Board on Research Data and Information.

 

Beatrice Choi, New York University

“Live Dispatch: The Ethics of Audio Vision Media Coverage in Trauma and the Legacy of Sound from Shell Shock to 9/11”

What experiential narratives—sensory, aesthetic and political—are invisible to those exposed to traumatic events? Considering September 11, 2001, the media coverage of the event is predominantly visual. People drift in and out of news footage, covered in dust and ash as they exclaim that witnessing the attacks was like watching a movie . In contrast, the wailing of sirens, the staccato thud of feet running from the stricken towers, and the chaotic overlap of voices break through—sometimes even swallow—the visual narratives spun for 9/11. For contemporary American traumatic events, this inquires into how porous the sensory modalities are in experiencing and remembering shock. How, after all, do sensory representations of traumatic events leave in/visible marks on documentation? I address these questions by exploring sound as an alternate modality, evoking a different level of traumatic indexicality. First, I draw attention to the sensory discrepancy between audio and visual content dispersed for American traumatic events, taking 9/11 as the focal event. By investigating the most highly represented media vehicles in the event—television and radio—I delve into a critical visual-acoustic analysis, looking specifically at FDNY radio transmissions and NY1 Aircheck news footage. Finally, I examine the discursive legacy sound imparts in moments of American crisis from shell shock accounts in the late 19th – 20th century to post-9/11 narratives of post-traumatic symptoms. In delineating this legacy, I hope to reveal the ways in which these documented discourses evolve past preconceived sensory boundaries in the experience of trauma.

Beatrice Choi is an NYU MA Graduate from the Media Culture Communication program. She has worked with the 9/11 archives for a year as a Moving Imagery Exhibitions Intern at the National September 11 Memorial & Museum, and recently completed a thesis on Post-Traumatic Landscapes, focusing primarily on post-Katrina New Orleans.

 

Scott Blake, artist

“9/11 Flipbook and Quantitative Media Study”

Scott Blake has created a flipbook consisting of images of United Airlines Flight 175 crashing into the south tower of the World Trade Center. Accompanying the images are essays written by a wide range of participants, each expressing their personal experience of the September 11th attacks. In addition, the authors of the essays were asked to reflect on, and respond to, the flipbook itself. Not surprisingly, the majority of the essayists experienced the events through news network footage. Blake is distributing his 9/11 Flipbooks to encourage a constructive dialog regarding the media’s participation in sensationalizing the tragedy. To further illustrate his point, Blake conducted a media study using the 9/11 TV News Archive to count the number of times major news networks showed the plane crashes, building collapses and people falling from the towers on September 11, 2001.

While best known for his Barcode Art, Scott Blake has created new works that are scandalous, witty, fun, pornographic, humorous and about a thousand other adjectives viewers might use when seeing them for the first time. A self-described “frivolous artist,” he mows over conceptual and visual boundaries to make work that is as thought provoking as it is entertainingly tongue-in-cheek.

RECEPTION

Remarks by Dennis Swanson, President of Station Operations, Fox Television

THANKS TO

We thank the many people at New York University and Internet Archive who have helped to make this conference possible.

Posted in Event, News, Television Archive | 7 Comments

Rosetta Project’s Record-A-Thon at the Internet Archive tomorrow

The Rosetta Project is trying to record 50 or more languages in one day in a Record-A-Thon event.  If you’re in the San Francisco area, stop by the Internet Archive’s offices tomorrow to participate.  The event was covered in the New York Times today.

Posted in News | Leave a comment

In-Library eBook Lending Program Expands to 1,000 Libraries

Internet Archive announces 1,000 Library Partners from 6 countries have joined to build and lend a pool of 100,000+ eBooks; Extending the Traditional In-Library Lending Model.

San Francisco, CA – Today, the Internet Archive announced that its In-Library eBook Lending Program has grown to 1,000 libraries in 6 countries. Through the program, which is led by the Internet Archive, patrons may borrow eBooks from a new, cooperative 100,000+ eBook lending collection of mostly 20th century books on OpenLibrary.org, a site where it’s already possible to read over 1 million eBooks without restriction. During a library visit, patrons with an OpenLibrary.org account can borrow any of these lendable eBooks using laptops, reading devices or library computers. This new twist on the traditional lending model could increase eBook use and revenue for publishers.

“As readers go digital, so are our libraries,” said Brewster Kahle, founder and Digital Librarian of the Internet Archive. “To grow from 150 great, forward-thinking libraries in Feb. 2011 to 1,000 libraries today, suggests that there is a true need for this type of program. We, as libraries,  want to buy eBooks to lend to our patrons.” (See the partial list of participating libraries below.)

This new digital lending system will enable patrons of participating libraries to read books in a web browser. “In Silicon Valley, iPads and other reading devices are hugely popular. Our partnership with the Internet Archive and OpenLibrary.org is crucial to achieving our mission — to meet the reading needs of our library visitors and our community,” said Linda Crowe, Executive Director of the Peninsula Library System.

A recent survey of libraries across North America was conducted by Unisphere Research and Information Today, Inc. (ITI). It reported that of the 1,201 libraries canvassed, 73% are seeing increased demand for digital resources with 67% reporting increased demand for wireless access and 62% seeing a surge in demand for web access.

American libraries spend $3-4 billion each year on publishers’ products. “I’m not suggesting we spend less, I am suggesting we spend smarter by buying and lending more eBooks,” asserted Kahle. He is also encouraging libraries worldwide to join in the expansion of this pool of purchased and digitized eBooks so their patrons can borrow from this larger collection.

How It Works
Any OpenLibrary.org account holder can borrow up to 5 eBooks at a time, for up to 2 weeks. Books can only be borrowed by one person at a time. People can choose to borrow either an in-browser version (viewed using the Internet Archive’s BookReader web application), or a PDF or ePub version, managed by the free Adobe Digital Editions software. This new technology follows the lead of the Google eBookstore, which sells books from many publishers to be read using Google’s books-in-browsers technology. Readers can use laptops, library computers and tablet devices, including the iPad.
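The borrowing rules above are simple enough to sketch in a few lines of Python. This is only an illustration of the stated limits (5 loans per account, 2-week loans, one borrower per book at a time). The `LendingDesk` class, its method names, and the in-memory storage are hypothetical and are not how OpenLibrary.org is actually implemented.

```python
from datetime import date, timedelta

MAX_LOANS_PER_ACCOUNT = 5          # "up to 5 eBooks at a time"
LOAN_PERIOD = timedelta(weeks=2)   # "for up to 2 weeks"


class LendingDesk:
    """Hypothetical in-memory model of the one-book, one-borrower rule."""

    def __init__(self):
        # book id -> (account name, due date)
        self.loans = {}

    def expire(self, today):
        """Drop loans whose due date has passed."""
        self.loans = {
            book: (account, due)
            for book, (account, due) in self.loans.items()
            if due >= today
        }

    def borrow(self, account, book, today):
        """Return the due date if the loan is allowed, otherwise None."""
        self.expire(today)
        if book in self.loans:
            return None  # someone else already has this book
        held = sum(1 for holder, _ in self.loans.values() if holder == account)
        if held >= MAX_LOANS_PER_ACCOUNT:
            return None  # account is at its 5-book limit
        due = today + LOAN_PERIOD
        self.loans[book] = (account, due)
        return due
```

A second patron asking for a book that is already out gets `None` until the first loan lapses, which is the "one person at a time" rule in miniature.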

What Participating Libraries Are Saying
The reasons for joining the initiative vary from library to library. Judy Russell, Dean of University Libraries at the University of Florida, said, “We have hundreds of books that are too brittle to circulate. This digitize-and-lend system allows us to provide access to these older books without endangering the physical copy.”

“Libraries are our allies in creating the best range of discovery mechanisms for writers and readers…”
Richard Nash
Founder of Cursor, Publisher

Digital lending also offers wider access to one-of-a-kind or rare books on specific topics such as family histories — popular with genealogists. This pooled collection will enable libraries like the Boston Public Library and the Allen County Public Library in Indiana to share their materials with genealogists around the state, the country and the world.

“Genealogists are some of our most enthusiastic users, and the Boston Public Library holds some genealogy books that exist nowhere else,” said Amy E. Ryan, President of the Boston Public Library. “This lending system allows our users to search for names in these books for the first time, and allows us to efficiently lend some of these books to visitors at distant libraries.”

“Reciprocal sharing of genealogy resources is crucial to family history research. The Allen County Public Library owns the largest public genealogy collection in the country, and we want to make our resources available to as many people as possible. Our partnership in this initiative offers us a chance to reach a wider audience,” said Jeffrey Krull, Director of the Allen County Public Library.

Publishers selling their eBooks to participating libraries include Cursor and OR Books. Books purchased will be lent to readers as well as being digitally preserved for the long-term. This continues the traditional relationship and services offered by publishers and libraries.

Jo Budler, Kansas State Librarian, comments, “Kansas librarians are very excited about offering this downloadable service to the residents of Kansas.  Historically Kansas librarians have been strong supporters of collaborative endeavors.  This project fits very nicely with projects undertaken in the past, and with the desire to continue to offer excellent customer service and new services into the future.”

“Creating digital structures that support access to content through public libraries is imperative. The Digital In-Library Lending project is a beginning. California is delighted to be involved in a project that will create more online access to content for Californians,” said California State Librarian Stacey Aldrich.

John Oakes, founder of OR Books, said, “We’re always on the lookout for innovative solutions to solve the conundrum of contemporary publishing, and we are excited to learn about the Internet Archive’s latest project. For us, it’s a way to extend our reach to the crucial library market. We look forward to the results.”

For More Information
Here are some eBooks that are only available to people in participating libraries.
Libraries interested in partnering in this program should contact: info@archive.org.
To use this service, please visit a participating library:

###

List of Participating Libraries

Aboite Branch Library, Allen County Public Library

Dupont Branch Library, Allen County Public Library

Georgetown Branch Library, Allen County Public Library

Grabill Branch Library, Allen County Public Library

Hessen Cassel Branch Library, Allen County Public Library

Little Turtle Branch Library, Allen County Public Library

Main Library, Allen County Public Library

Monroeville Branch Library, Allen County Public Library

New Haven Branch Library, Allen County Public Library

Pontiac Branch Library, Allen County Public Library

Shawnee Branch Library, Allen County Public Library

Tecumseh Branch Library, Allen County Public Library

Waynedale Branch Library, Allen County Public Library

Woodburn Branch Library, Allen County Public Library

Adams Street Branch Library, Boston Public Library

Brighton Branch Library, Boston Public Library

Charlestown Branch Library, Boston Public Library

Codman Square Branch Library, Boston Public Library

Connolly Branch Library, Boston Public Library

Dudley Branch Library, Boston Public Library

East Boston Branch Library, Boston Public Library

Egleston Square Branch Library, Boston Public Library

Faneuil Branch Library, Boston Public Library

Fields Corner Branch Library, Boston Public Library

Grove Hall Branch Library, Boston Public Library

Honan-Allston Branch Library, Boston Public Library

Hyde Park Branch Library, Boston Public Library

Jamaica Plain Branch Library, Boston Public Library

Lower Mills Branch Library, Boston Public Library

Mattapan Branch Library, Boston Public Library

North End Branch Library, Boston Public Library

Orient Heights Branch Library, Boston Public Library

Parker Hill Branch Library, Boston Public Library

Roslindale Branch Library, Boston Public Library

South Boston Branch Library, Boston Public Library

South End Branch Library, Boston Public Library

Uphams Corner Branch Library, Boston Public Library

Washington Village Branch Library, Boston Public Library

West End Branch Library, Boston Public Library

West Roxbury Branch Library, Boston Public Library

Internet Archive

MBLWHOI Library, Marine Biological Laboratory, Woods Hole Oceanographic Institution

Atherton Library, Atherton, California

Bay Shore Library, Daly City, California

Belmont Library, Belmont, California

Brisbane Library, Brisbane, California

Burlingame Public Library, Burlingame, California

Burlingame Library Easton Branch, Burlingame, California

Cañada College Library, Redwood City, California

College of San Mateo Library, San Mateo, California

East Palo Alto Library, East Palo Alto, California

Fair Oaks Library, Redwood City, California

Foster City Library, Foster City, California

Grand Avenue Branch Library, South San Francisco, California

Half Moon Bay Library, Half Moon Bay, California

Hillsdale Branch Library, San Mateo, California

John Daly Library, Daly City, California

Marina Public Library, San Mateo, California

Menlo Park Library, Menlo Park, California

Menlo Park Library Belle Haven Branch, Menlo Park, California

Millbrae Library, Millbrae, California

Pacifica Sanchez Library, Pacifica, California

Pacifica Sharp Park Library, Pacifica, California

Portola Valley Library, Portola Valley, California

Redwood City Public Library, Redwood City, California

Redwood Shores Branch Library, Redwood City, California

San Bruno Library, San Bruno, California

San Carlos Library, San Carlos, California

San Mateo Public Library, San Mateo, California

Schaberg Library, Redwood City, California

Serramonte Main Library, Daly City, California

Skyline College Library, San Bruno, California

South San Francisco Public Library, South San Francisco, California

Westlake Library, Daly City, California

Woodside Library, Woodside, California

Anza Branch, San Francisco Public Library

Bayview/Anna E. Waden Branch, San Francisco Public Library

Bernal Heights, San Francisco Public Library

Chinatown/Him Mark Lai Branch, San Francisco Public Library

Eureka Valley/Harvey Milk Memorial Branch, San Francisco Public Library

Excelsior, San Francisco Public Library

Glen Park Branch, San Francisco Public Library

Golden Gate Valley Branch, San Francisco Public Library

Ingleside Branch, San Francisco Public Library

San Francisco Public Library, Main

Marina, San Francisco Public Library

Merced Branch Library, San Francisco Public Library

Mission, San Francisco Public Library

Mission Bay, San Francisco Public Library

Noe Valley/Sally Brunn Branch, San Francisco Public Library

North Beach Branch, San Francisco Public Library

Ocean View, San Francisco Public Library

Ortega, San Francisco Public Library

Park Branch, San Francisco Public Library

Parkside, San Francisco Public Library

Portola Branch, San Francisco Public Library

Potrero Branch, San Francisco Public Library

Presidio Branch, San Francisco Public Library

Richmond/Senator Milton Marks Branch, San Francisco Public Library

Sunset, San Francisco Public Library

Visitacion Valley, San Francisco Public Library

West Portal, San Francisco Public Library

Western Addition, San Francisco Public Library

The Urban School of San Francisco

Augustana Campus Library, University of Alberta

Bibliothèque Saint-Jean (BSJ), University of Alberta

Cameron Library, University of Alberta

Herbert T. Coutts (Education & Physical Education) Library, University of Alberta

Rutherford Library, University of Alberta

John A. Weir Memorial Law Library, University of Alberta

John W. Scott Health Sciences Library, University of Alberta

Winspear Business Reference Library, University of Alberta

Architecture and Fine Arts Library, University of Florida

Education Library, University of Florida

Health Science Center Library, University of Florida

Borland Library, University of Florida

Veterinary Medicine Reading Room, University of Florida

Allen H. Neuharth Journalism and Communications Library, University of Florida

Library West, University of Florida

Marston Science Library, University of Florida

Mead Library, University of Florida

Music Library, University of Florida

Smathers Library (East), University of Florida

Robarts Library, University of Toronto

Gerstein Science Information Centre, University of Toronto

Centre for Reformation and Renaissance Studies, Victoria University

E J Pratt Library, Victoria University

Emmanuel College Library, Victoria University

Posted in Announcements, Books Archive, News | 12 Comments

new audio/video player — safari/IE improvements

below the current audio/video player on archive.org you have probably seen by now the link:

Would you like to try our new audio/video player? (beta!)

We had some known problems in this beta rollout that affected audio MP3 playback.

Specifically, on Safari, some 30-70% of the time (and it varied widely) the MP3 loading/setup would fail.  This has been fixed.   On Internet Explorer, we didn’t have the MP3 “flash based playback” option setup using the new audio player — and the lead developer, Michael Dale, came over today and fixed that for us.   Hooray!

So at this point, I believe the audio/video player is true “beta” — feature complete with a few things to smooth out left but the finish line is close:

1) i need to add back in captions/subtitles (it’s there in the player, just need to feed them through with our playlist)

2) video items with 3+ videos may play the last video 2x.  working on that!  8-)

hopefully, we can all listen to some nice archive music this weekend in peace without issues with this new player!  now grab your headphones or turn up those speakers…

-tracey

Posted in Audio Archive, Live Music Archive | 26 Comments

Our Newest Addition – Film Scanning

Müller Framescanner

We’re pretty excited about the film-to-digital scanner we just received. It is the first Müller Framescanner in the United States. It is the first film scanner in the world that supports all movie formats up to super 16 mm.

It scans Regular 8, Super 8, Pathé 9.5, 16 mm, and Super 16 mm film, and the images are stored as data.

Break out those home movies!

-Jeff Kaplan

Posted in News | 10 Comments

HTTP Archive joins with Internet Archive

It was announced today that HTTP Archive has become part of Internet Archive.

The Internet Archive provides an archive of web site content through the Wayback Machine, but we do not capture data about the performance of web sites.  Steve Souders’s HTTP Archive started capturing and archiving this sort of data in October 2010 and has expanded the number of sites covered to 18,000 with the help of Pat Meenan and WebPagetest.

Steve Souders will continue to run the HTTP Archive project, and we hope to expand its reach to 1 million sites.  To this end, the Internet Archive is accepting donations for the HTTP Archive project to support the growth of the infrastructure necessary to increase coverage.  The following companies have already agreed to support the project: Google, Mozilla, New Relic, O’Reilly Media, Etsy, Strangeloop, and dynaTrace Software. Coders are also invited to participate in the open source project.

Internet Archive is excited about archiving another aspect of the web for both present day and future researchers.

Posted in News, Wayback Machine | 6 Comments

Why Preserve Books? The New Physical Archive of the Internet Archive

by Brewster Kahle, June 2011

Books are being thrown away, or sometimes packed away, as digitized versions become more available. This is an important time to plan carefully for there is much at stake.

Digital technologies are changing both how library materials are accessed and increasingly how library materials are preserved. After the Internet Archive digitizes a book from a library in order to provide free public access to people world-wide, the book goes back on the shelves of the library. We noticed an increasing number of these libraries moving books to “off site repositories” (1 2 3 4) to make space in central buildings for more meeting spaces and work spaces. These repositories have filled quickly and sometimes prompt the de-accessioning of books. A library that would prefer to not be named was found to be thinning its collections and throwing out books based on what had been digitized by Google. While we understand the need to manage physical holdings, we believe this should be done thoughtfully and well.

Two of the corporations involved in major book scanning have sawed off the bindings of modern books to speed the digitizing process. Many have a negative visceral reaction to the “butchering” of books, but is this a reasonable reaction?

A reason to preserve the physical book that has been digitized is that it is the authentic and original version that can be used as a reference in the future. If there is ever a controversy about  the digital version, the original can be examined. A seed bank such as the Svalbard Global Seed Vault is seen as an authoritative and safe version of crops we are growing. Saving physical copies of digitized books might at least be seen in a similar light as an authoritative and safe copy that may be called upon in the future.

As the Internet Archive has digitized collections and placed them on our computer disks, we have found that the digital versions have more and more in common with physical versions. The computer hard disks, while holding digital data, are still physical objects. As such we archive them as they retire after their 3-5 year lifetime. Similarly, we also archive microfilm, which was a previous generation’s access format. So hard drives are just another physical format that stores information. This connection showed us that physical archiving is still an important function in a digital era.

There is also a connection between digitized collections and physical collections. The libraries we scan in rarely want more digital books than the digital versions that we scan from their collections. This struck us as strange until we better understood the craftsmanship required in putting together great collections of books, whether physical or digital. As we are archiving the books, we are carefully recording with the physical book the identifier for the virtual version, and attaching to the digital version information about where the physical version resides.

Therefore we have determined that we will keep a copy of the books we digitize if they are not returned to another library. Since we are interested in scanning one copy of every book ever published, we are starting to collect as many books as we can.

We hope that there will be many archives of physical books and other materials, as they will be used and preserved in different ways based on the organizations they reside in. Universities will have different access policies from national libraries, say, and most likely different access policies from the Internet Archive. With many copies in diverse organizations and locations we are more likely to serve different communities over time.

Physical Archive of the Internet Archive

catalogued book

Books are cataloged, and have an acid free paper insert with information about the book and its location

Internet Archive is building a physical archive for the long term preservation of one copy of every book, record, and movie we are able to attract or acquire.  Because we expect day-to-day access to these materials to occur through digital means, our physical archive is designed for long-term preservation of materials with only occasional, collection-scale retrieval. Because of this, we can create optimized environments for physical preservation and organizational structures that facilitate appropriate access. A seed bank might be conceptually closest to what we have in mind: storing important objects in safe ways to be used for redundancy, authority, and in case of catastrophe.

The goal is to preserve one copy of every published work. The universe of unique titles has been estimated at close to one hundred million items. Many of these are rare or unique, so we do not expect most of these to come to the Internet Archive; they will instead remain in their current libraries. But preserving over ten million items is possible, so we have designed a system that will expand to this level. Ten million books is approximately the size of a world-class university library or public library, so we see this as a worthwhile goal. If we are successful, then this set of cultural materials will last for centuries and could be beneficial in ways that we cannot predict.

To achieve a goal of long-term preservation we have assumed:

  • Infrequent access,
  • Manage millions of books, records, and movies,
  • Adapt to needs of different physical media and collection value,
  • Facilitate storage evolution by monitoring existing systems and introducing new ideas,
  • Adapt to multiple facilities in different environments, and
  • Sustainable from a financial and maintenance perspective.
box of books

Boxes then store approximately 40 books with labeling on the outside

To start this project, the Internet Archive solicited donations of several hundred thousand books in dozens of languages in subjects such as history, literature, science, and engineering. Working with donors of books has been rewarding because an alternative for many of these books was the used book market or being destroyed. We have found everyone involved has a visceral repulsion to destroying books. The Internet Archive staff helped some donors with packing and transportation, which sped projects and decreased wear and tear on the materials.

These books are digitized in Internet Archive scanning centers as funding allows.

To link the digital version of a book to the physical version, care is taken to catalog each book and note its physical location so that future access can be enabled. Most books are cataloged by finding a record in existing library catalogs for the same edition. If no such catalog record can be found, then the book is cataloged briefly in the Open Library. Links are made from the paper version to the digital version by printing identifying and catalog data on a slip of acid free paper that is inserted in the book. Linking from the digital version to the paper version is done by encoding the location into the database records and identifiers into the resulting digital book versions. The digital versions have been replicated and the catalog data has been shared.
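As a rough picture of this two-way link, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the container/pallet/box location scheme, and the slip layout are hypothetical and do not describe the Internet Archive's actual database or slips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    """Hypothetical location scheme: container, then pallet, then box."""
    container: str
    pallet: str
    box: str


@dataclass
class BookRecord:
    identifier: str             # identifier of the digital version
    title: str
    location: PhysicalLocation  # where the paper copy is stored


def slip_text(record):
    """Paper-to-digital link: text printed on the slip placed in the book."""
    loc = record.location
    return "%s | %s | %s/%s/%s" % (
        record.identifier, record.title, loc.container, loc.pallet, loc.box)


def locate(catalog, identifier):
    """Digital-to-paper link: look up the stored location by identifier."""
    record = catalog.get(identifier)
    return record.location if record else None
```

The slip carries the identifier from paper to digital, and the catalog lookup goes the other way, so either copy can be found starting from the other.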

pallet of boxes

Pallets hold 24 boxes each, and are the stable location unit

Most of these first books have been digitized with funding from stimulus money for jobs programs and funding from the Kahle/Austin Foundation. This served to build the core collection of modern books for the blind and dyslexic. Many of these digital books are also available to be digitally borrowed through the Open Library website.

This was a change from our previous mass digitization procedures when a library would deliver and retrieve books from our scanning centers. Where the libraries would have already done the sorting and de-duplication of books, we now need to do these functions ourselves. The process to identify titles that have not been preserved already is now in place, but is in active development to improve efficiency. The thorough work of libraries in cataloging materials is key in this process because we can leverage this for these books. Identifiers such as ISBN, LCCN, and OCLC ids have helped determine which books are duplicates.
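To give a feel for identifier-based de-duplication, here is a small Python sketch. It is an illustration only: the record layout (dictionaries with optional `isbn`, `lccn`, and `oclc` keys) and the function names are hypothetical, and real catalog matching also has to deal with cases this ignores, such as ISBN-10 versus ISBN-13 forms and records that carry no identifiers at all.

```python
def identifier_keys(record):
    """Yield (scheme, value) pairs for each identifier a record carries."""
    for scheme in ("isbn", "lccn", "oclc"):
        value = record.get(scheme)
        if value:
            # Normalise so "0-306-40615-2" and "0306406152" compare equal.
            yield scheme, value.replace("-", "").replace(" ", "").upper()


def split_duplicates(records):
    """Split records into (first copies, likely duplicates), in input order."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        keys = set(identifier_keys(record))
        if keys & seen:
            duplicates.append(record)   # shares an identifier with an earlier book
        else:
            unique.append(record)       # includes records with no identifiers
        seen |= keys
    return unique, duplicates
```

A record with no identifiers is never flagged here, so in practice those books would still need a human or a fuzzier title/author match.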

In January of 2009, we started developing the physical preservation systems. Fortunately there is a wealth of literature on book preservation documenting studies on the fibers of paper as well as results from multi-year storage experiments. Based on this technical literature and specifications from depositories around the world, Tom McCarty, the engineer who designed the Internet Archive’s Scribe book-scanning system, began to design, build, and test a modular storage system in Oakland, California. This system uses the infrastructure developed around the most used storage design of the 20th century, the shipping container. Rows of stacked shipping containers are used like 40′ deep shelving units. In this configuration, a single shipping container can hold around 40,000 books, about the same as a standard branch library, and a small building can hold millions of books.
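The figures in this post (about 40 books per box, 24 boxes per pallet, around 40,000 books per container, and a ten million book target) can be combined into a quick back-of-the-envelope calculation. The pallet and container counts below are derived from those approximate numbers only; they are not stated layouts of the actual facility.

```python
import math

BOOKS_PER_BOX = 40            # approximate figure from the post
BOXES_PER_PALLET = 24
BOOKS_PER_CONTAINER = 40_000  # "around 40,000 books"
TARGET_BOOKS = 10_000_000     # the ten million item design goal

books_per_pallet = BOOKS_PER_BOX * BOXES_PER_PALLET
pallets_per_container = math.ceil(BOOKS_PER_CONTAINER / books_per_pallet)
containers_for_target = math.ceil(TARGET_BOOKS / BOOKS_PER_CONTAINER)

print(books_per_pallet)        # 960
print(pallets_per_container)   # 42
print(containers_for_target)   # 250
```

So, on these rough numbers, a pallet holds just under a thousand books, and the ten million book goal works out to something like 250 containers.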

shipping containers

Modified 40' shipping containers are used for secure and individually controllable environments of 50 or 60 degrees Fahrenheit and 30% relative humidity

Based on this success and the increasing availability of physical materials, a production facility leveraging this design will be launched in June of 2011 in Richmond, California. The essence of the design from the book’s point of view is to have several layers of protection, each able to be monitored and periodically inspected:

  • Books are cataloged, and have acid free paper inserts with information about the book and its location,
  • Boxes store approximately 40 books with labeling on the outside,
  • Pallets hold 24 boxes each,
  • Modified 40′ shipping containers are used as secure and individually controllable environments of 50 or 60 degrees Fahrenheit and 30% relative humidity,
  • Buildings contain shipping containers and environmental systems,
  • Non-profit organizations own and protect the property and its contents.
Internet Archive physical archive building

Buildings contain shipping containers and environmental systems

This physical archive is designed to help resist insects and rodents, control temperature and humidity, slow acidification of the paper, protect against fire, water and intrusion, contain possible contamination, and endure possible uneven maintenance over time. For these reasons the books are stored in isolated environments with a regulated airflow that depends on few active components.

Internet Archive logo

Non-profit organizations own and protect the property and its contents

The Internet Archive is now soliciting further donations of published materials from libraries, collectors, and individuals.

This collection and methodology have already helped in mass digitization and preservation, and we hope that they will offer a wealth of knowledge to future generations.

Thank you to Tom McCarty, Robert Miller, Sean Fagan, Internet Archive staff, San Francisco Public Library leadership, Alibris, HHS of the City of San Francisco, and the Kahle/Austin Foundation for being leaders on this project.

Posted in Announcements, Books Archive, News | 215 Comments

improved h.264 derivatives!

We have thoroughly tested a newer and simpler way to create h.264 derivatives!

Changes you’ll notice:

  • More pixels!  Previously 320 x 240, now 640 x 480 pixels
  • Slightly higher video bitrate: from about 512kb/s to about 700kb/s
  • Switching from mp4creator container maker to ffmpeg container + qt-faststart
  • Fewer back-end commands to make a high-quality derivative

Nice things about this derivative (similar to prior derivative):

  • Plays in adobe flash plugin
  • Plays on all versions of iphone and ipad
  • Starts quickly, nearly instant seeking even to unbuffered areas of the video

Here’s a sample of how we do it with just 3 simple commands.  (We do/you should adjust the “-r” argument appropriately to your video’s frames-per-second.  We also adjust the “640” in the “-vf scale” argument to be appropriate for the video’s *actual* aspect ratio, etc.  So for example, the 640 might become 852 for 16:9 widescreen video.  Although for our .mp4 specific derivative and playback ability on iPhone (1st gen and thus all versions), we would actually downrez that to 640×360.)

ffmpeg -deinterlace -y -i 'camels.avi' -vcodec libx264 -fpre libx264-IA.ffpreset -vf scale=640:480 -r 20 -threads 2 -map_meta_data -1:0 -pass 1 -an tmp.mp4


ffmpeg -deinterlace -y -i 'camels.avi' -vcodec libx264 -fpre libx264-IA.ffpreset -vf scale=640:480 -r 20 -threads 2 -map_meta_data -1:0 -pass 2 -acodec aac -strict experimental -ab 128k -ac 2 -ar 44100 -metadata title='Camels at a Zoo - http://www.archive.org/details/camels' -metadata year='2004' -metadata comment=license:'http://creativecommons.org/licenses/by-nc/3.0/' tmp.mp4

qt-faststart tmp.mp4 'camels.mp4'

our preset file:
http://www.archive.org/~tracey/downloads/libx264-IA.ffpreset
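The aspect-ratio adjustment described above can be sketched as a small Python helper that builds the “-vf scale” value. This is not our actual derive code, just an illustration: the function names are made up, and rounding down to an even number is an assumption (libx264 with 4:2:0 chroma needs even dimensions, and rounding down happens to reproduce the 852 mentioned above).

```python
def even_floor(n):
    """Round down to an even number (assumed rounding rule, see lead-in)."""
    return n // 2 * 2


def width_for_height(height, aspect_w, aspect_h):
    """Width that keeps the aspect ratio at a fixed height."""
    return even_floor(height * aspect_w // aspect_h)


def height_for_width(width, aspect_w, aspect_h):
    """Height that keeps the aspect ratio at a fixed width."""
    return even_floor(width * aspect_h // aspect_w)


def scale_arg(aspect_w, aspect_h, height=480, max_width=None):
    """Build a value for ffmpeg's -vf option, e.g. 'scale=640:480'.

    max_width (assumed even) caps the width, shrinking the height to match.
    """
    width = width_for_height(height, aspect_w, aspect_h)
    if max_width is not None and width > max_width:
        width = max_width
        height = height_for_width(width, aspect_w, aspect_h)
    return "scale=%d:%d" % (width, height)


print(scale_arg(4, 3))                  # scale=640:480
print(scale_arg(16, 9))                 # scale=852:480
print(scale_arg(16, 9, max_width=640))  # scale=640:360
```

The last call is the iPhone-friendly case: capping the width at 640 turns 16:9 video into 640×360.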

For the adventurous out there, you can create this same setup by building ffmpeg on mac, linux, or windows.  Linux is easy, but personally, I’m a mac gal.  So here’s some ffmpeg build tips on the mac.

Happy viewing!

Posted in Technical, Video Archive | 18 Comments

Memorial Day-More than BBQ’s and Fireworks

WWII Newsreels, Vintage Defense Department Videos, Soldiers’ Field Guides, Classic/Contemporary War Books

From Hillary Rodham Clinton’s visit with WWII’s Monuments Men (they recovered and saved Europe’s greatest art and cultural treasures) and Walter Cronkite’s reportage of the Vietnam War to an actual World War I soldier’s field service guide and collections of war poetry, the Internet Archive and Open Library are great resources to get into the true spirit of Memorial Day.  Learn about the heroes and villains who made military history.  Read and hear recollections from the people and families whose lives were affected by conflicts from the Revolutionary and Civil Wars to the current ones in Iraq and Afghanistan.

Vintage and Documentary Videos

http://www.archive.org/details/RadioatW1944 – Radio at War, the role radio played in World War II

Books

Picture Books

http://openlibrary.org/works/OL8506767W/World_War_I_In_Photographs – World War I Photography

Youth Books

http://openlibrary.org/works/OL8116673W/Nation_at_War – Scholastic Books – Nation at War

Time Life Series

http://openlibrary.org/works/OL5909519W/Life_Goes_to_War – Life Magazine Goes to War

http://openlibrary.org/works/OL5909605W/The_Past_Is_Myself_(Classics_of_World_War_II_the_Secret_War) – Part of the Time Life Series

Primary Sources

Must Reads

http://openlibrary.org/works/OL9396288W/American_Heritage_Chronicles_of_the_Great_Wars_(Boxed_Set) – Chronicles of the Great Wars

http://openlibrary.org/works/OL278851W/The_Art_of_War – Classic Book

http://openlibrary.org/works/OL2423057W/The_Oxford_Book_of_War_Poetry – War Poetry

Posted in News | 1 Comment

Why Publishers Support E-book Lending with OpenLibrary.org: A Q&A with Smashwords’ Mark Coker

Photo of Mark Coker

Mark Coker Founder, CEO Smashwords

This Q&A kicks off a series of conversations with visionary publishers who support e-book digital library lending with OpenLibrary.org.

Mark Coker, Founder, CEO and Chief Author Advocate, founded Smashwords  to change the way books are published, marketed and sold.  In just three years it has become the leading ebook publishing and distribution platform for independent authors and small publishers.  The Wall Street Journal named Mark Coker one of the “Eight Stars of Self-Publishing” in 2010. He is a contributing columnist for the Huffington Post, where he writes about ebooks and the future of publishing. For Smashwords updates, follow Mark on Twitter at @markcoker.

Q. What is the relationship between publishers and Open Library?

A: There is an intersection of common interest between publishers and Open Library: the passionate desire to get books to readers. The innovators at Open Library understand that the way people access books is an ongoing evolution, and they are at the forefront of finding solutions to support all the key stakeholders: publishers and distributors, authors and, most of all, readers.

Q: How do Libraries help to support book distribution?

old man reading computer

“It’s simple – the more readers have a chance to engage with a book, the more likely they are to recommend it, or purchase it.”


A: Open Library purchases your books and shares them with readers by creating a web page for each book, with a cover photo and descriptive information. There are prompts to read, borrow and buy. Open Library has more than 4,600,000 unique visitors a month.

Q: What makes Smashwords different from other publishing organizations?

A: Smashwords represents 19,000 indie authors and small presses who handle the writing, editing and pricing of their books. We distribute these titles to major retailers such as Apple, Barnes & Noble, Sony, Kobo and Diesel. We believe that authors should maintain the creative and financial control of their work and receive the lion’s share of income. Our authors keep upwards of 85% of the profits on the books we distribute.

Q. Why are some publishers and authors excited about e-books accessed via public libraries?

“If you build it, they will come.”

A: Our authors and publishers rely on Smashwords to open up new opportunities to reach readers. We’re working with most of the biggest indie authors, and many of them are excited about libraries. Open Library and its partners believe, “if you build it, they will come,” and I agree.  As demand for ebooks through digital public library systems increases, publishers will better understand the value of partnering with Open Library. We hope they utilize Smashwords to reach these new distribution venues.

Posted in Cool items, News, Open Library | 2 Comments

Physical Archive Launch

Update:   We Launched!

Everyone is welcome at the open house and launch of the new Physical Archive of the Internet Archive in Richmond, California, on Sunday, June 5th, from 4-8pm.


After 2 years of prototyping and testing a new design for sustainable long-term preservation of physical books, records, and movies, we are starting with over 300,000 books and gearing up for millions.

Who should come:

  • if you love books, records, or movies
  • if you are concerned about the future of open access and preservation
  • if you want to have something fun to talk about over the water cooler on Monday….

Then, invest an hour with us on a Sunday – Drinks, food, good people.

What you will see:

  • A high-density, modular system for storing books, video and audio
  • A temperature-controlled environment for long-term preservation
  • Our new logistics facility that will catalog and coordinate large collections of books, records, and movies.

Who you will meet:

  • The Internet Archive Board, Founder, Management Team
  • Friends and supporters of the Internet Archive
  • Colleagues and leaders from the Library community

Please come!  Bring friends and family.

Secure free parking
2512 Florida Avenue, Richmond, California, 30 minutes north of San Francisco and Berkeley, 415 561 6767.

RSVP to rsvp@archive.org, or just come.

Posted in News | 15 Comments

Lost Landscapes of San Francisco

The Embarcadero

On April 10, 2011, the Internet Archive kicked off the Internet Archive Presents public events series with a hometown favorite, Rick Prelinger’s Lost Landscapes of San Francisco.

Culled from thousands of hours of home, commercial and institutional movies, Lost Landscapes presents San Francisco the way it was. From gliders in the Outlands to Joe DiMaggio’s wedding to cityscapes of long-gone people and places, the movie offers both vitality and nostalgia. What makes the event especially vibrant is that the audience participates by shouting out places, events and people they recognize. Occasionally this results in a conversation about the backstory of these clips.

San Francisco Airport

With over 400 people in attendance and a suggested donation of “5 bucks or 5 books,” it was deemed a huge success. More public events are planned.

Check out video of the event at http://www.archive.org/details/lostlandscapes2011

-Jeff Kaplan

Posted in Video Archive | 1 Comment

Buying E-Books from Smashwords

Young Adult e-Books by Amanda Hocking available on OpenLibrary.org

Smashwords’ best-selling authors contribute to OpenLibrary.org

Smashwords, the largest distributor of independently published literature, recently provided the Internet Archive and OpenLibrary.org with its first installment of e-Books from its best-known, best-selling authors, including Young Adult sensation Amanda Hocking, fantasy author Brian Pratt, romance novelist Ruth Ann Nordin, and business expert Gerald Weinberg.

Mark Coker, CEO of Smashwords, believes that libraries are crucial to every publisher’s survival because they provide the face-to-face connection between readers, authors and books.

“We see tremendous value in partnering with the Internet Archive. Their visionary leadership is helping to create a worldwide digital public library.”
Mark Coker, CEO, Smashwords

The deposit by Smashwords was a first attempt at demonstrating the feasibility of making modern books more globally accessible through OpenLibrary.org. Next up – the creation of a new model that supports the on-going purchase of e-Books by participating libraries.

“The publishing world is rapidly changing,” asserts Coker. “There’s plenty of room for numerous distribution models and, in my opinion, publishers should be bending over backwards to support these initiatives.”

Posted in Books Archive, Cool items, Open Library

Open Library Buying e-Books from Publishers

The Internet Archive is on a campaign to buy e-Books from publishers and authors, making more digital books available to readers who prefer using laptops, reading devices or library computers. Publishers such as Smashwords, Cursor and A Book Apart have already contributed e-Books to OpenLibrary.org, offering niche titles and the works of best-selling “indy” authors including Amanda Hocking and J.A. Konrath.

“Libraries are our allies in creating the best range of discovery mechanisms for writers and readers—enabling open and browser-based lending through the OpenLibrary.org means more books for more readers, and we’re thrilled to do our part in achieving that.” – Richard Nash, founder of Cursor.

American libraries spend $3-4 billion a year on publishers’ materials. OpenLibrary.org and its more than 150 partnering libraries around the US and the world are leading the charge to increase their combined digital book catalog of 80,000+ (mostly 20th century) and 2 million+ older titles.

“As demand for e-Books increases, libraries are looking to purchase more titles to provide better access for their readers.” – Digital Librarian Brewster Kahle, Founder of the Internet Archive.

This new twist on the traditional lending model promises to increase e-book use and revenue for publishers. OpenLibrary.org offers an e-Book lending library and digitized copies of classics and older books, as well as books in audio and DAISY formats for qualified readers.

Posted in Cool items, News, Open Library | 2 Comments

Digitizing Balinese Lontars

With the help of the Internet Archive and Ron Jenkins, a theater professor at Wesleyan University, the Balinese are leading the world as the first culture to have their entire literature go online. The documents are centuries-old lontar palm leaves incised on both sides with a sharp knife and then blackened with soot. As of today, 477 lontars have been scanned and uploaded to the Internet Archive.

The writings range from ordinary texts to sacred documents on religion, holy formulas, rituals, family genealogies, law codes, treatises on medicine (usadha), arts and architecture, calendars, prose, poems and even magic. The estimated 50,000 lontars are kept by everyone from members of the Puri (palace) family and high priests to ordinary families. Some are carefully kept as family heirlooms while others are left in dirty and dusty corners of houses. Digitizing the lontars makes them available to scholars and students and saves the documents from being destroyed by insects or humidity, as many already have been.

Very few Balinese have actually read any lontar due to language obstacles and the view that it is sacrilegious. Traditionally, the lontars are read and performed by priests. Forty-one of such performances have been uploaded to the Internet Archive.


Gatutkaca Pralaya Nyoman Catra

Visit the Balinese Digital Library at The Internet Archive:

Balinese Digital Library collection
Collection of Lontars
Collection of Videos

Read more about this project and Balinese lontars at The Jakarta Post:

Ancient ‘lontar’ manuscripts go digital | Rita A. Widiadana and Ni Komang Erviani, Bali
US scholar brings ancient Balinese scripts to digital age | Ni Komang Erviani, Denpasar

-Grace Neveu and Jake Johnson

Edited on May 9, 2011: “Very few Balinese have actually read any lontar due to language obstacles and the view that it is sacrilegious. Traditionally, the lontars are read and performed by priests. Forty-one of such performances have been uploaded to the Internet Archive.”
Posted in Books Archive, Video Archive | 5 Comments