Category Archives: Cool items

Mirroring the Stone Oakvalley Music Collection

The Internet Archive has begun mirroring a fantastic collection of music called the “Stone Oakvalley Music Collection”. When you visit one of their websites, the archive.org mirror is one of the choices for download. Going forward, the Archive will offer a full backup of the entire site (over a terabyte) for permanent storage.

Why the Stone Oakvalley Collection is important

Manufactured from the early 1980s to the mid-1990s, the Commodore 64 computer was a revolutionary piece of hardware and a critical introduction to programming for generations. It also had, within its design, a very well-regarded sound chip: the 6581/8580 SID (Sound Interface Device), whose unique properties in wave generation and effects gave a special sound in the hands of the right developers and musicians.

This successful piece of hardware was manufactured in the millions across the life of the C64, and in the mid-1980s, the introduction of the Commodore Amiga computer brought to life an improved chipset for generating sound: the 8364, or PAULA. With a range of improvements to what sounds and music could come out of this chip, the Amiga soared with capabilities that took years to match in other machines.

The Archive hosts many examples of music generated by these chips: our C64 Games Archive has hundreds of videos of games played on a Commodore 64, and searching for terms like “Amiga Music”, “Chiptunes” and “C64 Music” will yield a good amount of sound to enjoy.

But nothing comes close to the Stone Oakvalley Collection in terms of breadth, dedication, and craft in ensuring the unique sound of these chips can be enjoyed in the future.

The process, which is documented here, involved setting up a large amount of Commodore hardware connected to servers which would reboot the machines, over and over, playing thousands of pieces of music in different configurations, and automatically cataloging and saving the resulting waveforms. Considerations for modifications of the chipset over the years, of stereo versus mono recordings, and verification of the resulting 400,000 files have provided the highest quality of snapshots of this period.

Browsing the Collection

Currently, there are two websites for Stone Oakvalley’s collection – one based around the C64, and the other based around the Amiga.  Impeccable work has been done to catalog the music, so if there are songs or games you remember, they are likely to be saved on the site (and powered from Archive.org’s servers). Otherwise, browse the stacks of the sites and enjoy a soundscape of computer history.

The Internet Archive strives to provide universal access to the world’s knowledge. Through mirroring, hosting and gathering of data, our mission allows millions to gain ad-free, fast access to information and materials. Be sure to check our many collections on our main site.

Millions of historic images posted to Flickr

by Robert Miller, Global Director of Books, Internet Archive

“Reading a book from the inside out!” Well, not quite, but a new way to read our eBooks has just been launched. Check out this great BBC article:
http://www.bbc.co.uk/news/technology-28976849

Here is the fabulous Flickr commons collection:
https://www.flickr.com/photos/internetarchivebookimages

And here is Flickr’s post welcoming us to the Commons:
http://blog.flickr.net/en/2014/08/29/welcome-the-internet-archive-to-the-commons/

What is it and how did it get done?
A Yahoo research fellow at Georgetown University, Kalev Leetaru, extracted over 14 million images from 2 million Internet Archive public domain eBooks that span over 500 years of content.  Because we have OCR’d the books, we have now been able to attach about 500 words before and after each image. This means you can now see, click and read about each image in the collection. Think full-text search of images!
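
The idea of attaching surrounding text to each image can be sketched in a few lines of Python. This is only an illustration, not the code used for the project: the function name `context_window`, the assumption that the OCR text is a flat list of words, and the assumption that each image's position is known as an index into that list are all hypothetical.

```python
def context_window(words, image_index, radius=500):
    """Return up to `radius` words on each side of an image.

    `words` is the OCR text split into words; `image_index` is the
    (assumed) position in that list where the image appears.
    """
    start = max(0, image_index - radius)
    before = words[start:image_index]
    after = words[image_index:image_index + radius]
    return " ".join(before), " ".join(after)


# Hypothetical OCR text, with a small radius so the result is easy to read.
ocr_text = "the new telephone exchange shown in the figure opened in 1892"
words = ocr_text.split()
before, after = context_window(words, image_index=5, radius=3)
print(before)  # telephone exchange shown
print(after)   # in the figure
```

Indexing the two strings alongside each image is what would let a search for a word such as “telephone” return pictures rather than pages.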

How many images are there?
As of today, 2.6 million of the 14 million images have been uploaded to Flickr Commons. Soon we will be able to add continuously to this collection from the 1,000+ new eBooks we scan each day. Dr. Simon Chaplin, Head of the Wellcome Library, says, “This way of discovering and reading a book will help transform our medical heritage collection as it goes up online. This is a big step forward and will bring digitized book collections to new audiences.”

What is fun to do with this collection?
Try typing in the word “telephone” and enjoy the images that appear. Curious about how death has been characterized over 500 years of images? Type in “mordis”. Feeling good about health care? Type in “medicine” and prepare to be amazed. Remember, all of these images are in the public domain!

Future plans?
We will be working with our wonderful friends at Flickr and our great Library partners to make this collection even more interesting –  more images, more sub-collections and some very interesting ideas of how to use some image recognition tools to help us learn more about, well, anything!

Questions about this collection, projects or things to come?
Email me at robert@archive.org

Open Call for tumblr Collaborators

When it comes to collaborative culture, tumblr is where it’s at – and we’re ready to jump in. We’re not going to just redirect this blog, though; we’re opening up our tumblr URL to anyone interested in messing around with our content.

We’re looking at this as an opportunity to show the world some of the amazing stuff we’ve collected – over 10 petabytes of information just waiting to be juxtaposed, made into macros, remixed, glitched, written on, moshed, analyzed, sequenced and combined in ways we haven’t dreamed of.

We will be accepting 52 people. We’ll be here to offer support and guide them in their exploration with content and code, then we’ll feature their finished work for a week on the official tumblr. Each person’s residency will also be archived, of course. That’s what we do!

Check out http://internetarchive.tumblr.com for more details and an application form.

Lost Landscapes SF6: huge success – Next: Lost Landscapes of Detroit, February 22

Standing room only for Rick Prelinger’s Lost Landscapes of San Francisco 6 at the Internet Archive last night. New films including “process plates” from studios brought a new sharpness to many of the films presented. Suggested donation was 5 bucks or 5 books, and people brought lots of great books for the Archive.

Next is Lost Landscapes of Detroit on February 22, 2012 – this is Detroit without the narratives being imposed on it. Doors open at 6:30, show at 7:30.

Thank you all!

Hard Drive Archaeology – And Hackerspaces

Two different, but somewhat related additions to the archive you might want to check out.

First, I was contacted earlier this week about a project to recover information off of an old Cray-1 supercomputer hard drive. Unlike, say, trying to get your old floppies to read or pulling an old mix tape off of a cassette, with something as old as a Cray-1 (a computer once called the “World’s Most Expensive Love Seat”), you don’t even have a place to really plug it in: functioning Cray-1 machines are as rare as you can get, and even if you were to get the hard drives spinning and readable, how would you get the data off the Cray?

Researcher Chris Fenton has a thing about Cray supercomputers – he built a tiny homebrew version of one that used emulation to allow you to experience some aspect of Crays, from his desktop. So when he found himself with an 80-megabyte CDC 9877 disk pack, which was quite a lot for the early 1970s, it wasn’t just a matter of hooking it up to USB. (Actually, we have a brochure for the behemoth you would put this disk pack into to read it.) Here’s what a nearly-the-same CDC 9987 looks like:

Ultimately, Fenton got the information off of the disk pack using a whole variety of techniques and experiments, as part of a research project this summer. He wrote a paper about the process, entitled “Digital Archeology with Drive-Independent Data Recovery: Now, With More Drive Dependence!” and it’s now mirrored here at the archive. If nothing else, be sure to browse through the paper just to see the customized stepper motor and reader he built to pull the magnetic data off the platters. And I was kind of understating things… ultimately he did hook it up to USB.

From this careful, forensic-quality magnetic scan of the drive, Fenton has produced a large image of the disk, one far larger than the data on it but allowing further experimentation and reading from the image without having to build a robot in your basement. And now, we’re offering this image on the archive. Remember, you won’t be able to pull this data down and go back to the 1970s, instantly – you should be reading up on documentation of disk formats, learning how to pull information off of magnetic flux recordings, and taking in a whole host of other material and knowledge…. but hey, weekends are for having fun, right?

Even ten years ago, the idea of offering several gigabytes of something (that expands out to about 20 gigabytes of something) online was beyond crazy – that we’ve come so far in offering this much to so many people speaks to how much the world has changed since the era of this disk pack.

Fenton is associated with the NYC-based hackerspace NYC Resistor, and it was their mailing list that got in contact with me to get this disk image up to the archive.

Coincidentally, this was also the week that two NYC Resistor members released a book, for free, which you might really enjoy. Bre Pettis and Astera Schneeweisz hatched a plan to make a book on hackerspaces at the end of 2008. They wanted to put it together in less than two weeks, and as people submitted photos, essays and other material, the project increased in size, more folks were brought in, and this month the end result was released for free.

Entitled “Hackerspaces: The Beginning”, this photo-filled book is available at the archive to read online or download. A view of hackerspaces throughout the world as of 2008, it also includes memories of spaces past and dreams of spaces future. It’s an excellent snapshot of a beautiful, technological world well worth browsing this weekend (and weekends to come).

So if you’re in the mood for advanced research or just to check out some great photos, the archive’s got something for you!

Why Publishers Support E-book Lending with OpenLibrary.org: A Q&A with Smashwords’ Mark Coker

Mark Coker, Founder and CEO, Smashwords

This Q&A kicks off a series of conversations with visionary publishers who support e-book digital library lending with OpenLibrary.org.

Mark Coker, Founder, CEO and Chief Author Advocate, founded Smashwords  to change the way books are published, marketed and sold.  In just three years it has become the leading ebook publishing and distribution platform for independent authors and small publishers.  The Wall Street Journal named Mark Coker one of the “Eight Stars of Self-Publishing” in 2010. He is a contributing columnist for the Huffington Post, where he writes about ebooks and the future of publishing. For Smashwords updates, follow Mark on Twitter at @markcoker.

Q: What is the relationship between publishers and Open Library?

A: There is an intersection of common interest with publishers and Open Library – the passionate desire to get books to readers. The innovators at Open Library understand that the way people access books is an ongoing evolution and they are at the forefront of finding solutions to support all the key stakeholders – publishers and distributors, authors and most of all, readers.

Q: How do Libraries help to support book distribution?

“It’s simple – the more readers have a chance to engage with a book, the more likely they are to recommend it, or purchase it.”


A: Open Library purchases your books and shares them with readers by creating a web page for each book, with a cover photo and descriptive information. There are prompts to read, borrow and buy. Open Library has more than 4,600,000 unique visitors a month.

Q: What makes Smashwords different from other publishing organizations?

A: Smashwords represents 19,000 indie authors and small presses who handle the writing, editing and pricing of their books. We distribute these titles to major retailers such as Apple, Barnes & Noble, Sony, Kobo and Diesel. We believe that authors should maintain the creative and financial control of their work and receive the lion’s share of income. Our authors keep upwards of 85% of the profits on the books we distribute.

Q: Why are some publishers and authors excited about e-books accessed via public libraries?

“If you build it, they will come.”

A: Our authors and publishers rely on Smashwords to open up new opportunities to reach readers. We’re working with most of the biggest indie authors, and many of them are excited about libraries. Open Library and its partners believe, “if you build it, they will come,” and I agree. As demand for ebooks through digital public library systems increases, publishers will better understand the value of partnering with Open Library. We hope they utilize Smashwords to reach these new distribution venues.

Buying E-Books from Smashwords

Young Adult e-Books by Amanda Hocking available on OpenLibrary.org

Smashwords’ best-selling authors contribute to OpenLibrary.org

Smashwords, the largest distributor of independently published literature, recently provided the Internet Archive and OpenLibrary.org with its first installment of e-Books from its best-known, best-selling e-Book authors, including Young Adult sensation Amanda Hocking; fantasy author Brian Pratt; romance novelist Ruth Ann Nordin; and business expert Gerald Weinberg.

Mark Coker, CEO of Smashwords, believes that libraries are crucial to every publisher’s survival because they provide the face-to-face connection between readers, authors and books.

“We see tremendous value in partnering with the Internet Archive. Their visionary leadership is helping to create a worldwide digital public library.”
Mark Coker, CEO, Smashwords

The deposit by Smashwords was a first attempt at demonstrating the feasibility of making modern books more globally accessible through OpenLibrary.org. Next up – the creation of a new model that supports the on-going purchase of e-Books by participating libraries.

“The publishing world is rapidly changing,” asserts Coker. “There’s plenty of room for numerous distribution models and in my opinion, publishers should be bending over backwards to support these initiatives.”

Open Library Buying e-Books from Publishers

The Internet Archive is on a campaign to buy e-Books from publishers and authors, making more digital books available to readers who prefer using laptops, reading devices or library computers. Publishers such as Smashwords, Cursor and A Book Apart have already contributed e-Books to OpenLibrary.org – offering niche titles and the works of best-selling “indy” authors including Amanda Hocking and J.A. Konrath.

“Libraries are our allies in creating the best range of discovery mechanisms for writers and readers—enabling open and browser-based lending through the OpenLibrary.org means more books for more readers, and we’re thrilled to do our part in achieving that.” – Richard Nash, founder of Cursor.

American libraries spend $3-4 billion a year on publishers’ materials. OpenLibrary.org and its more than 150 partnering libraries around the US and the world are leading the charge to increase their combined digital book catalog of 80,000+ (mostly 20th century) and 2 million+ older titles.

“As demand for e-Books increases, libraries are looking to purchase more titles to provide better access for their readers.” – Digital Librarian Brewster Kahle, Founder of the Internet Archive.

This new twist on the traditional lending model promises to increase e-book use and revenue for publishers. OpenLibrary.org offers an e-Book lending library and digitized copies of classics and older books, as well as books in audio and DAISY formats for qualified readers.

1790-1930 U.S. Census Records Available Free

With the U.S. Census Bureau beginning to release statistics from the 2010 census, it seems a good time to mention that Internet Archive has a complete set of the available U.S. Census records back to the first one in 1790.

From the press release of the completion of the most recent census:
_________________________________________________

San Francisco, CA – Internet Archive has announced that a publicly accessible digital copy of the complete 1930 United States Census – the largest, most detailed census released to date – is available free of charge at www.archive.org/details/1930_census. Previously, 1930 Census records were accessible only through microfilm, or subscription services in which select portions of data are provided for a fee.

The 1930 Census records are being made available online through a collaboration with the Allen County Public Library Genealogy Center in Fort Wayne, Indiana. In the coming months, complete census records from 1790 through 1920 will be made available as part of Internet Archive’s growing Genealogy Collection.

“Internet Archive is pleased to be working on this important collection with the renowned Genealogy Center of the Allen County Public Library,” said Robert Miller, Internet Archive’s Director of Books. “There is tremendous value in seeing the original census source documents without filtering and third-party interpretation of the information. For historical researchers as well as those individuals who are simply passionate about history and genealogy, access to these materials is critical to understanding the past and assessing how the past impacts the present, and how it can shape our future.”

Taken just five months after the Wall Street crash of October 29, 1929, the 1930 Census was the fifteenth census of the United States and includes 2,667 microfilmed rolls of population schedules with names and statistics of more than 137 million individuals. The 1930 Census became available to the public on April 1, 2002. By law, census records are restricted for 72 years.

Information contained about individuals in the 1930 Census includes:

•    Address of home
•    Date and location of birth
•    Occupation
•    Marital status
•    Year of immigration
•    Ability to speak English
•    Ability to read and write
•    Property ownership
•    Military participation

“The 1930 Census represents the zenith of data collected by federal enumerators,” said Curt B. Witcher, Allen County Public Library Genealogy Center Manager. “Having it online for free will allow access for anyone at any time – the classroom teacher who wants to show interested students what an older census looks like, the local historian wanting to study everyone who lives in a particular township or village, the genealogist wanting to search for families missed by indexers. Millions of individuals will benefit from this resource. What a fortunate circumstance to have this historic census widely available in this census year of 2010.”
_______________________________________________________

Note: There is an interesting backstory to the missing 1890 census:

http://en.wikipedia.org/wiki/1890_U.S._Census
“The Eleventh United States Census was taken June 2, 1890. Most of the 1890 census was destroyed in 1921 during a fire in the basement of the Commerce Building in Washington, D.C. In December 1932, according to standard Federal record keeping procedure at the time, the Chief Clerk of the Bureau of the Census sent the Librarian of Congress a list of papers to be destroyed, including the original 1890 census schedules. The Librarian was asked by the Bureau to identify any records which should be retained for historical purposes but the Librarian did not accept the census records. Congress authorized destruction of that list of records on February 21, 1933 and thus the 1890 census remains were destroyed by government order by 1934 or 1935.”

-Jeff Kaplan and Kathy Dalle-Molle

Ted Nelson and Zigzag

One of the great things about the Internet Archive is the sense of adventure. There are always creative ideas bouncing around. Many of them come to fruition and occasionally some fail. In the spirit of innovation, the inimitable Ted Nelson just finished up a month-long code sprint with some guest programmers to bring to life one of his visionary concepts, Zigzag.

About Zigzag (from Mr. Nelson’s Xanadu website):

“We believe the computer world can be simplified and unified. Today, ordinary people must deal with an appalling variety of programs and mechanisms to maintain their information. We have discovered a new simplification based on one simple concept: a new, liberated form of data that shows itself in wild new ways.

Conventional data structures– especially tables and arrays– are confined structures created from a rigid top-down specification that enforces regularity and rectangularity. But this structure (our trademark is ZigZag®) is created from individual relations, bottom-up; it can be irregular and unlimited. Its uses range from database and spreadsheet to unifying the internals of large-scale software.”
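
To make the contrast with tables and arrays concrete, here is a minimal sketch in Python of a structure built from individual relations. It is an illustration only, not the prototype's code: the `Cell` class, the `link` and `walk` helpers, and the dimension names `d.name` and `d.project` are all hypothetical, and the sketch assumes that a cell has at most one neighbor in each direction along any given dimension.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Cell:
    """One unit of data; its links are added individually, bottom-up."""
    value: str
    # dimension name -> neighboring cell, one dict per direction
    pos: dict = field(default_factory=dict)
    neg: dict = field(default_factory=dict)


def link(a: Cell, b: Cell, dim: str) -> None:
    """Connect a -> b along `dim` (assumed rule: one neighbor per direction)."""
    if dim in a.pos or dim in b.neg:
        raise ValueError(f"cell already linked along {dim!r}")
    a.pos[dim] = b
    b.neg[dim] = a


def walk(start: Cell, dim: str) -> list:
    """Follow positive links along `dim` until the chain ends or loops."""
    seen, out, cell = set(), [], start
    while cell is not None and id(cell) not in seen:
        seen.add(id(cell))
        out.append(cell.value)
        cell = cell.pos.get(dim)
    return out


# Hypothetical data: no fixed rows or columns, only the relations that exist.
ted = Cell("Ted Nelson")
art = Cell("Art Medlar")
jeff = Cell("Jeffrey Ventrella")
zigzag = Cell("Zigzag")

link(ted, art, "d.name")
link(art, jeff, "d.name")
link(zigzag, ted, "d.project")  # the same cell takes part in a second dimension

print(walk(ted, "d.name"))        # ['Ted Nelson', 'Art Medlar', 'Jeffrey Ventrella']
print(walk(zigzag, "d.project"))  # ['Zigzag', 'Ted Nelson']
```

Nothing here forces every cell to have the same set of links, which is the irregularity the quote describes: a new dimension is just another key, added only to the cells that need it.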

At the end of the month, on November 24, Ted Nelson and Team Zigzag (Edward Betts, Ted Nelson, Marlene Mallicoat, Art Medlar, Andrew Pam, Jeffrey Ventrella) presented a working prototype of Zigzag. You can see the presentation at http://www.archive.org/details/zigzagpresentation (a hi-res version can be seen here).

Team Zigzag: Ted Nelson, Art Medlar, Jeffrey Ventrella, Edward Betts. (Marlene Mallicoat and Andrew Pam)

-Jeff Kaplan