Author Archives: Jason Scott

How Can You Help the Internet Archive?

With the Internet Archive being mentioned prominently in the news for the past couple of weeks, we’ve had thousands of people discuss us on social media and contact us directly with strong concerns and worries.

Above all, many want, in some way, to “help” and have asked us what they can do, if anything.

While your donations during this time have been appreciated, there are actually many things you can do beyond that, which will have a lasting effect.

Use The Internet Archive Site

It may sound simple, but just using the Internet Archive for what it exists to do in the first place is a fulfillment of the dream of the many who have worked on it, past and present. An extraordinary number of hours of continuing support are behind the simple address and website. Some of you are already enjoying the archive to its full potential, but many use it just for the Wayback Machine, or for a favorite set of media that you listen to or watch.

Take a walk through our stacks, browse, meander… enter a search term of something that interests you and see what pops up and what collections it’s part of. You’ll find it endlessly rewarding. Tens of millions of items await you.

The collections themselves vary wildly; a driven group will create a collection, or collaborations and partnerships worldwide will lead to a breathtaking amount of material you can enjoy. And, as always, billions of URLs have been mirrored to bring the unique miracle of the Wayback Machine to you for 20 years. We back up every link Wikipedia links out to at the time it’s added, to make sure the web doesn’t forget its citations and relevant information anytime soon.

Speaking of the Wayback Machine… the Wayback is our crown jewel, and we also encourage people who see something to save a copy of it.

To do so, visit the main Wayback page and enter a URL in the Save Page Now form on the lower right. We’ll do the rest (de-duplication, archiving, and so on). It’s how we become aware of to-the-minute URLs that either don’t have a long shelf life or which we would not normally be aware of for a significant amount of time.
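If you would rather script this than fill in the form, here is a minimal Python sketch. It assumes the Wayback Machine's public https://web.archive.org/save/ address, which this post does not mention, and the function names are made up for illustration. Captures requested this way may be rate limited, so treat it as a starting point rather than the official method.

```python
from urllib.request import Request, urlopen

# Assumption: this address is not named in the post itself.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_url(url: str) -> str:
    """Build the address that asks the Wayback Machine to capture `url`."""
    url = url.strip()
    # Add a scheme when the caller left it off.
    if not url.startswith(("http://", "https://")):
        url = "http://" + url
    return SAVE_PREFIX + url


def request_capture(url: str, timeout: float = 60.0) -> int:
    """Ask for a capture and return the HTTP status code (needs network access)."""
    request = Request(
        save_page_now_url(url),
        headers={"User-Agent": "save-page-sketch"},  # hypothetical client name
    )
    with urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the address; call request_capture() to actually ask for a capture.
    print(save_page_now_url("example.com/news"))
```

Building the address is kept separate from sending the request, so you can check what would be asked for before anything goes over the network.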

Become a Patron

If you haven’t registered with us, it’s incredibly easy to do so and absolutely free, and always will be. Having a virtual library card lets you build lists of favorites, write reviews for any items you have opinions on, and upload your own items into our collections. During signup, you can also register for our newsletter, which is really great for keeping track of news and events related to the Archive.

You can always browse anonymously, from anywhere, of course; that’s what a library is about. But consider being a member of the archive as well.

Curate and Upload to the Archive

As a member of the Archive, you can upload items into our stacks instantly. Texts, Images, Movies, Audio. Thousands of new items enter into the collection every day. Our Upload Page has helpful information about what you’re uploading to allow you to describe and verify the items you wish for us to store.

A lot of our strength as a collection comes from individuals uploading items they or their community have created, and who need a hosting space that will provide access to the item continually, without limits. Artists upload their music albums, podcasters upload their episodes, and hundreds of organizations upload their media and meetings to us, to ensure they’re kept safe.

Tell People That the Internet Archive Exists

It’s always a surprise to us to find out that people don’t know about the Wayback Machine or the Internet Archive, but we live here. Buried among hundreds of tweets have been the excited responses of people discovering us for the first time. What a shame if your friends and family don’t know about us and all they need is for you to tell them we’re a few clicks away. Take a little time to spread the word we’re here and waiting for them. (Just link them to or – the site is pretty self explanatory).

We have a collection of images and logos from our years of work if you wanted to illustrate or link to examples of who we are and what we do.

And really, nothing makes us happier than others writing about what they discover in expeditions into the stacks; essays and posts have been written about unusual magazines or articles discovered there, citing 18th- and 19th-century predecessors of technology and schools of thought that are flourishing in the present. Our system allows you to bookmark printed items down to the individual page or music track and link to them.

Browse Our Many, Many Collections.

Our petabytes of data have a lifetime’s worth of things to see; here are a few highlights of our tens of thousands of collections.

For decades, a group of tapers and fans have created the Live Music Archive, a collection of over 225,000 live performances of music, including the vast majority of all live performances of The Grateful Dead, as well as thousands of other bands.

The Bay Area Reporter, the oldest continuously published lesbian, gay, bisexual, transgender and queer weekly newspaper in the United States, made it a mission to scan and upload their entire back catalog of issues from their first to the present day. About 50 years of issues are represented, and are a fascinating deep dive. Other examples of broadsheets and bulletin history that have come to be hosted include the Sparrows’ Nest Library of radical zines and newspapers, as well as the cultural-remix and art potential of thousands of supermarket circulars.

The Netlabels area contains music and performances from “Netlabels”, online-only music groups, “record companies” and communities that have uploaded fully-produced albums with open licenses for years. For example, the Curses from Past Times LP is at 800,000 views and counting. (Be sure to click on the Llama on the right, too.)

The Building Technology Heritage Library is an 11,000-item-strong collection of catalogs, layouts and information about all sorts of architecture and aspects of building. Maintained by the Association for Preservation Technology, these scanned, readable and downloadable works are a trove of artwork and design, including, as you’ll soon discover, works that are tangential to building but also offer massive insights into long-lost items, like this 1,000 page Montgomery Ward Catalog.

Speaking of which.. we’ve partnered with many other libraries, archives, and collectors to mirror or host millions of individual items. Our space and bandwidth are at their service to ensure the maximum audience is ready to interact with them, as needed.

Public Resource hosts 18,000+ Safety and Law Codes with us, allowing individuals to view the laws that affect their lives and functions within society without paying expensive rates to do so. An attempt to prevent this service by the State of Georgia ended up in a legal battle that made its way to the Supreme Court, which found in favor of Public Resource, allowing you to view these laws immediately. Over 22 million views of these laws have happened over the years.

The Media History Digital Library has a collection in our stacks of film theory, cinema periodicals, and related documents and writings, which can be viewed from the Media History Project site. These scans of industry trade magazines, announcements and advertising related to the film and television industries are instantly available and accessible by students, researchers and writers, as are all our collections.

And we don’t just host music and texts. Among our most storied and referenced items are the uploads of the Prelinger Library, which include government public health films, commercials, instructional movies, and a growing set of home movies, which allow us to see parts of visual history that didn’t have a commercial aspect. This work is done, among other ways, by a large-scale digitizing process hosted in the Archive’s Physical Archive.

In our software collections, we have brought back thousands of HyperCard stacks that used to be easily available for Macintosh computers in the 1980s and 1990s – they will boot in your browser and let you enjoy them near-instantly.

Just go in any direction in the Archive and you will spend weekends, days and nights finding and sharing what you discover.

However… if passively consuming media doesn’t feel like it’s “helping” us (although it is), there’s an even more active set of roles you can take:

Get Involved In Our Many Projects, Including The Wayback Machine

We’ve made an effort to work with many volunteers and collaborators over the years to ensure the Wayback Machine is capable of playing back as much of the now-lost and forgotten World Wide Web as possible. As you can imagine, the web is a moving target, and the terabytes a day of shifting websites present one of the hardest technical challenges out there.

We have hundreds of guests in our Slack and other communication channels, working on open-source code and helping us improve the software that drives us.

We have also moved into the real world where we can (even if we, like many others, are taking a break right now). We have co-hosted events like DWebCamp, provided space for book readings, and engaged in a variety of Artist-in-Residency programs; we expect to do more in the future and would love for you to be involved.

You can write us if you have an interest in participating in any of these many and ongoing efforts.

But Most of All, Please Help Yourself First.

We’re touched by everyone who has spoken of their love and support of the Archive and its many missions, but this is also a time of much general uncertainty: economic and health concerns, and upheaval in society.

The Internet Archive is our job and mission. Your job and mission is to take care of yourself and those closest to you. Without you, we’re a bunch of hard drives on the Internet.

We’ll be here when you’re ready.

A Boot Camp for Booting Up: Education and Games at the Internet Archive

Greetings to everyone getting by at home, especially those looking to teach remotely, entertain your family, or find connections to your own past that used to live in programs from the 1980s and 1990s.

Perhaps, locked at home with your computer, you’ve finally got enough time to try out our collections of games for classic Sega and Atari consoles for the very first time. Or you find yourself longing for a simpler time in your history, when Prince of Persia, Pac Man and SimCity could make your troubles disappear, if only for a few hours. Or more likely, your kids have been assigned to set off on the Oregon Trail, but you can’t figure out how to begin that “long, difficult journey…that often resulted in failure and death.”

For all of you first-time video game explorers, here’s a little boot camp for booting up Oregon Trail, and much, much more!

Jason Scott (aka “Textfiles”) is the Free Range Archivist at the Internet Archive.

I’m Jason Scott, software curator at the Internet Archive, and I want to help introduce (or re-introduce) you to one of our more unique features—the ability to boot computers in your browser!

What Is Emulation in The Browser?

For more than five years, the ability to boot software, including video game consoles, computers and even handheld plastic games, has brought millions through the Internet Archive’s doors. It’s been so extensive and even routine that it hardly gets mentioned now – it’s just something that happens. But if you suddenly find yourself online a lot more than you used to be, this feature may not be something you’re aware of.

In the same way that music can be listened to inside your web browser and movies can be watched or books read, it is possible at the Internet Archive to “start” up a computer inside your browser that is running software, and that computer can be something completely different than what you’re running.

Imagine an Apple II (1980s) or a Windows 3.1 (1990s) or any of a number of long-gone systems, and the software they used to run that can’t be found or easily used anymore, coming back very quickly and easily, simply by clicking on a screenshot of the program and then playing it. That’s what people do by the thousands every day at the Archive.

Bringing back hundreds of thousands of old software packages brings multiple advantages: the games are generally simpler and extremely clear in goals and play, and have very little connection to later phenomena like free-to-play, advertising or social media; they’re self-contained worlds, which is ideal for setting up games for children and education.

How to Use Emulation at the Archive

Any software item that can be activated will have a “Play” icon over the screenshot at the top. It looks kind of like a big power button – it’s green and transparent. This means that emulation is available for that program, and you just have to click in the general area of the button for the system to start “booting up” in the browser.

Try it out now! Visit this URL: using a desktop or laptop computer (mobile devices will work, but there’s no keyboard to do anything). Click on the icon as it appears above, and watch as the system starts up an Apple II computer running a simple educational program called Children’s Carrousel from 1982.

There are a number of messages being shown while the system is booting, meant to indicate if something has gone wrong or how long it will take for data to be loaded to your browser. If you see red error messages, or the loading seems to stop, don’t worry – move on to other titles or contact me for tech support.

Where to Start Finding Software?

The hardest part of bringing materials to your students or family is finding appropriate or useful software packages out of a field of potential titles that can range from outdated or broken to simply difficult to use.

There are over 150,000 software items at the Archive that can theoretically run in the browser, and naturally the collection runs from timeless classics to best-forgotten past experiments.

To help, I’ve assembled a collection called the Software Kids Zone, which has a bunch of chosen software packages that I have found are generally easy to use, self-explanatory, fun to play and in some way educational.

Things to Keep in Mind

Here are some tips and advice for bringing your classes or family into programs at the Archive:

  • Always test the software yourself, clicking on items or using keyboard commands, to make sure everything works like you expect it to. There’s always a chance a program has issues past “it runs”, and taking a quick tour beforehand helps remove a lot of chance and issues later.
  • Try to find several versions of software you want to share, or have a list that can be switched to, in case the first doesn’t work. Being able to go from one geography game to another quickly is much better than having to start another search from scratch.
  • The best games and programs feel contemporary, even when 20 years old – it means they weren’t chasing graphics or trends and were trying to build something from the ground up. If you find a program hard to use or not intuitive, skip it, and move on.
  • If you are working with a collection of kids or families, have videos they can watch or music they can listen to instead of using these programs, in case they run into technical issues.

And speaking of technical issues:

Some Common Problems with Emulation in the Browser and Potential Solutions

The most common issue related to emulation in the browser is that it requires a more recent desktop or laptop – the older the machine, the more likely you are to run into slowdowns, crackling in audio, and other issues.

Another issue is that some software is just not intuitive or needs a lengthy introduction to get working, which is not what kids or really anyone is looking for. The simpler the program, the better it generally is, so don’t hesitate to switch to other titles if playing a game or package is not enjoyable or easy. There are many, many to choose from!

As above, if you need support or information or even want to ask some general questions, I’m available via e-mail at

Some Suggested Titles

Finally, here are some programs at the Archive that are really fun and easy, and which can tell you if this great feature is right for your purposes.

Bobby Fischer Teaches Chess is an educational chess program that can not only play the game of chess, but also teach all of its rules and give historical information. It’s perfect for a family to learn the game utterly from a true master. It’s located here.

MathOSaurus is a dinosaur-themed mathematics educational game that quizzes you on a range of math concepts with a colorful and fun cartoon dinosaur theme. It was originally made for the Apple II, has great thematic elements, and is easy to play. It’s located here.

The Oregon Trail is still our most popular emulated title on the Archive, and with good reason—it’s well designed, a challenging and interesting experience, and kids love it. I suggest The Oregon Trail Deluxe, which has a more engaging graphical style while maintaining all the fun of the original works, which date back to the 1970s. You can find it here.

2,500 More MS-DOS Games Playable at the Archive

Another few thousand DOS Games are playable at the Internet Archive! Since our initial announcement in 2015, we’ve added occasional new games here and there to the collection, but this will be our biggest update yet, ranging from tiny recent independent productions to long-forgotten big-name releases from decades ago.

To browse the latest collection, hit this link and look around.

The usual caveats apply: Sometimes the emulations are slower than they should be, especially on older machines. Not all games are enjoyable to play. And of course, we are linking manuals where we can but not every game has a manual.

If you’ve been enjoying our “emulation in the browser” system over the years, then this is more of that. If you’re new to it or want to hear more about all this, keep reading.

A Recognition of Hard Work, and A Breathtaking View

The update of these MS-DOS games comes from a project called eXoDOS, which has expanded over the years from collecting DOS games for easy playability on modern systems to tracking down and capturing, as best as can be done, the full context of DOS games – from the earliest simple games in the first couple of years of the IBM PC to recently created independent productions that still work in the MS-DOS environment.

What makes the collection more than just a pile of old, now-playable games is how it has to take head-on the problems of software preservation and history. Having an old executable and a scanned copy of the manual represents only the first few steps. DOS has remained consistent in some ways over the last (nearly) 40 years, but a lot has changed under the hood and programs were sometimes only written to work on very specific hardware and a very specific setup. They were released, sold some number of copies, and then disappeared off the shelves, if not everyone’s memories.

It is all these extra steps, under the hood, of acquisition and configuration, that represent the hardest work by the eXoDOS project, and I recognize that long-time and Herculean effort. As a result, the eXoDOS project has over 7,000 titles they’ve made work dependably and consistently.

Separately from the eXoDOS project, I’ve been putting a percentage of these games into the Emularity system on the Internet Archive for research, entertainment and quick online access to the programs. The issues that are introduced by this are mine and mine alone, and eXoDOS is not able to help with them. You can always mail me with questions or technical concerns.

This should be all that needs to be said, but since the Archive is doing things a little strangely, there’s a lot to keep in mind before you really dive in (or to realize, when you come back with questions).

That Hilarious Problem With CD-ROMs

Putting these games into the Internet Archive has, over time, brought into sharp focus particular issues with browser-based emulation. Examples include keyboard collision, where the input needs of the emulator are taken over by the browser itself, and programs needing a lot more horsepower to run in a browser emulator than a user’s system can provide.

Some of these have solutions that aren’t always great (Buy faster hardware!) and in some cases the problem is currently terminal (these programs have been taken offline for a future date). But the most obvious and pressing is that games based off CD-ROMs take a significant, huge amount of time to load.

CD-ROMs were a boon to the early-to-late 1990s, allowing games to have audio and video like never before. Depending on the tricks used, you got full-motion video (FMV), the playing of CD audio tracks for background music, and levels and variation of content for the games far beyond what floppy disks could ever hope.

But it was also a very large amount of data (up to 700 megabytes per CD) and it’s one thing to have the data sitting on a plastic disc in a local machine, and yet another to have a network connection pull the entire contents of the CD-ROM into memory and hold it there as a virtual file resource. This is going to be an enormous strain on the vast majority of Internet users out there – downloading multi-hundred-megabyte files into memory and then keeping them there, and then losing it all when the browser window closes. Network speeds will improve over time, but this is probably the biggest show-stopper of them all for many folks.

If you find yourself loading up one of these games and facing down a hundred-megabyte download, consider one of the smaller games instead, unless it’s a title you really, really want to try out. Maybe in a few years we’ll look back at cable-modem speeds and laugh at the crawling, but for now, they’re pretty significant.
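To put rough numbers on that wait, here is a small back-of-the-envelope sketch in Python. The 700 megabyte figure comes from above; the connection speeds are hypothetical examples I picked for illustration, and the estimate ignores latency, protocol overhead and the time the emulator needs to unpack the data.

```python
def download_seconds(size_megabytes: float, speed_megabits_per_second: float) -> float:
    """Estimate transfer time, ignoring latency and protocol overhead."""
    megabits = size_megabytes * 8  # 8 bits per byte
    return megabits / speed_megabits_per_second


if __name__ == "__main__":
    # Hypothetical connection speeds, in megabits per second.
    for speed in (5, 25, 100):
        minutes = download_seconds(700, speed) / 60
        print(f"{speed:>3} Mbit/s: about {minutes:.1f} minutes for a full 700 MB disc")
```

And since the data is lost when the browser window closes, that wait is paid again on every visit.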

Some Jewels in the Mix

Luckily, there are some smaller-sized games in this new update that will load relatively quickly and are really enjoyable to look at and to play. Here are some of my recommendations:

First, a game special to me: the IBM DOS version of Adventure, calling itself “Microsoft Adventure”. It’s actually a small rebranding of the original start of the text adventure world, “Colossal Cave” or ADVENT, by Don Woods and Will Crowther. Remixed to be sold by IBM and Microsoft, this is how I first got into these, and it boots up instantly, providing hours of fun if you’ve never tried it before.

Mr. Blobby, a 1994 DOS Platform game, has all the hallmarks of the genre – bonkers physics, bright and lovely graphics, and joyful music. Be sure to redefine the keys before you try to play it, because besides running and jumping, you can spin and take things. The game does not get less weird as you go along.

Super Munchers: The Challenge Continues is a 1991 remix of the original educational game that sent your “muncher” gathering up words representing a given topic or idea. The speed of the game, along with the learning aspect, make this one of the more zesty “edutainment” titles available from the time.

Street Rod is a wonderfully compact 1989 racing game where it’s the 1960s and you’re going to buy your first hot-rod, tune it up, and race it for money to buy better and better rides. It’s a mouse-driven interface and loaded with all sorts of tricks to make the game fit into a “mere” 600 kilobytes compressed. Initially simple and then well worth the effort!

Digger from 1983 is a Dig-Dug-Clone-but-Not that came out right as IBM PCs were starting to take off, and it’s a lovely little game, steering around a mining machine while avoiding enemies and picking up diamonds. The most unintuitive thing is you need to fire using the “F1” key, so hopefully your keyboard has one.

I’m also going to suggest Floppy Frenzy from Windmill Software because it’s so much closer to the beginning of the IBM PC’s reign and you can see the difference in what the authors were comfortable with – the graphics are simpler, the game movement a little more rough, and the theme is geekiness incarnate: You’re a floppy disk avoiding magnets to leave traps for them, so you can gather the magnets up before the time runs out. If you don’t make it, an angel comes down and brings you to Floppy Disk Heaven. Again, F1 is the unusual key to leave traps.

There are many more, and I suggest people browse around and try things out, really soak in that MS-DOS joy. (And feel free to leave comments with suggestions.)

Thanks so much for coming along on this emulation journey!

  • Jason Scott, Internet Archive Software Curator

Summertime in the Internet Archive Stacks

Around the Internet Archive headquarters (and most of the United States), it’s summertime, meaning high temperatures, a lot of kids out of school, and a sense of taking it easy and being up for some relaxing and fun walks through the Internet Archive’s collection of material. Here’s a light, hopefully interesting set of materials that you might want to make part of your hot days and nights.

DJ Jazzy Jeff and Mick Boogie: The Summertime Mixtapes

Jazz and Boogie have been putting out free mixtapes every year for almost a decade with the idea of being played out on a radio during summer. Called simply the “Summertime Mixtapes,” they’re a lovely platter of good tunes for a good time.

Wellesley Recreation Summer Concert

Five videos shot during the Wellesley Recreation Summer Concert in 2018 are a perfect blend of good fun and community spirit. Stretching into the hours are all sorts of bands, announcements and performances.

Eaton’s Spring and Summer Catalogue 1917

It’s too late to order (over 100 years too late) but the Eaton’s Catalogue for 1917 had all manner of summer fashions for sale and you can look over some lovely scanned images from that time on our in-browser reader. At the very least, you should check out some of the excellent choices in hats for beachwear.

Cooking With Gelatin

For cooking with gelatin it’s hard to beat this 1907 cookbook for the variety of jellies and gelatins you can make, called the “Cox’s Manual of Gelatine Cookery,” but unfortunately there are no photographs or illustrations, and it’s all about the unique sights and colors of gelatin culinary delight, so illustration from Cox’s is getting pushed aside for this Jello ad:

And Now… from You

That’s what I’ve found in a short stroll through the Archive’s millions of items… maybe you’ve stumbled on some great movies, hot music, and fantastic books that bring you back through summers past or which will be just as great in the present day. Feel free to leave comments with your finds!

  • Jason Scott, Free Range Archivist

The IA Client – The Swiss Army Knife of Internet Archive

As someone who’s uploaded hundreds of thousands of items to the Internet Archive’s stacks and who has probably done a few million transactions with the materials over the years, I just “know” about the Internet Archive Python client, and if you’re someone who wants to interact with the site as a power user (or were looking for an excuse to), it’ll help you to know about it too.

You might even be the kind of power user who is elbowing me out of the way saying “show me the code and show me the documentation”. Well, the documentation is here and the code is here. Have a great time.

Boy, they run fast.

So, for everyone still around, a little history about how this client came along and how, if you have a certain set of tasks and interactions you want to conduct with the massive treasures of the Archive, it might enable you to do some amazing things indeed. If you’ve never done command-line scripting before, here’s a great excuse to learn.

Started in 2012 and overseen primarily by Archive employee Jake Johnson, the internetarchive client (which is generally just called “ia”) is both a set of libraries and a command-line program for doing a wide range of activities and actions with the archive without having to come in through the website. There’s a range of advantages and differences from using the web interface, mostly that it can be called as a command-line request, and return the results (success, failure, other information) right into your scripts. It is coded to be in lock-step with our APIs and system, and does its best to respect capacity as well as return informative messages about success or errors.

The command comes in the form of ia [command], where command is one of a variety of functions:

  • It is possible to do an ia search command and return the item identifiers of every item that matches your query, which can then be fed to other scripts or utilized as a checklist for your own research.
  • The ia metadata command will return as much metadata as possible, including file sizes, metadata pairs, content type, and other useful information baked into every object in the collections.
  • The ia list command will tell you all the different files within an item identifier, to see which you might specifically want.
  • The ia download and ia upload commands let you pull down and upload items to the archive, setting all the attributes for uploads and adding conditions and specific matches for downloads.
  • The ia tasks command lets your scripts know how the addition of your items went into the archive’s sets, as well as where they stand in terms of post-processing.

These are, in fact, all the commands a user might desperately need when the size or complexity of a task means that clicking endlessly in a browser is just not going to cut it.
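Since the client is a set of libraries as well as a command, the search and list steps can also be called from Python. What follows is a minimal sketch, not the client's official recipe: it assumes the internetarchive package is installed (pip install internetarchive), and the helper names and the search and fetch parameters are my own inventions, there only so the functions can be tried without touching the network.

```python
def matching_identifiers(query, search=None):
    """Return the identifier of every item matching `query`.

    `search` defaults to internetarchive.search_items; it is a parameter
    only so a stand-in can be supplied when experimenting offline.
    """
    if search is None:
        from internetarchive import search_items  # needs the internetarchive package
        search = search_items
    # Each search result is a dict that carries the item's identifier.
    return [result["identifier"] for result in search(query)]


def file_names(identifier, fetch=None):
    """Return the names of the files inside one item.

    `fetch` defaults to internetarchive.get_item.
    """
    if fetch is None:
        from internetarchive import get_item  # needs the internetarchive package
        fetch = get_item
    return [f.name for f in fetch(identifier).get_files()]


if __name__ == "__main__":
    # Stand-in search function, so this demo runs without network access.
    def fake_search(query):
        return iter([{"identifier": "example-item-1"}, {"identifier": "example-item-2"}])

    print(matching_identifiers("collection:example", search=fake_search))
```

Called without the stand-ins, the same functions go through the real library, and the resulting list of identifiers can be fed to whatever script comes next.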

The client was originally created for the Archive to do many different processes itself, via scripts, that would provide clear error messages, give accurate status updates, and allow the scripts to understand what was working or what needed modification. Many internal teams either use this client or depend on its output for information to do their tasks. With over six years of development on it, the tool is very mature and is used thousands of times a day internally.

In my case, here are some automated or semi-automated tasks I use the ia client command set to do, often daily:

  • Analyze the text of a set of documents to provide me with best guesses as to their publication date, which I then sign off on
  • Take a donation of several hundred PDF files and turn them into individual items in a collection, including taking metadata from a .CSV sheet
  • Compare and contrast screenshots within an item to find the best one and make that a thumbnail for the item
  • Maintain “Pipelines” that pull from content located elsewhere (like the Bitsavers documentation project or the DNA Lounge) and place the resulting items into the Archive with no human intervention
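As an illustration of the second task, here is a rough sketch of turning a CSV sheet into one upload per row. It is not my actual script: the column names (identifier, file, title, date), the collection name and the helper names are all hypothetical, and it assumes the internetarchive package is installed and credentials have been set up with ia configure.

```python
import csv
import io


def upload_plan(csv_text, collection):
    """Turn CSV rows into (identifier, file path, metadata) tuples.

    Assumes hypothetical columns: identifier, file, title, date.
    """
    plan = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        metadata = {
            "collection": collection,
            "mediatype": "texts",
            "title": row["title"],
            "date": row["date"],
        }
        plan.append((row["identifier"], row["file"], metadata))
    return plan


def run_plan(plan, uploader=None):
    """Upload each planned item; `uploader` defaults to internetarchive.upload."""
    if uploader is None:
        from internetarchive import upload  # needs the internetarchive package
        uploader = upload
    for identifier, path, metadata in plan:
        uploader(identifier, files=[path], metadata=metadata)


if __name__ == "__main__":
    sample = (
        "identifier,file,title,date\n"
        "manual-001,scans/manual-001.pdf,Widget Manual,1984\n"
    )
    # Print the plan instead of uploading, so nothing is sent anywhere.
    for entry in upload_plan(sample, "example-collection"):
        print(entry)
```

Building the plan first and uploading second means the whole batch can be read over before any item is created.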

For people who are using the Archive to simply play with and enjoy its many different materials, be they website histories, movies, music, or books – this tool is probably not what you need.

But for the scripting-comfortable folks.. for people who want to become scripting comfortable folks… for people who are maintaining collections or working hard with multiple uploads and doing a lot of manual work to enter metadata.. this multi-tool of Internet Archive access is exactly what you need.

As mentioned above, the documentation is here and the code is here. Have a great time.

Google Plus (or Minus) and the Ephemerality of Community

At the end of this month, on April 2nd, Google will shut down what they called the “consumer version” of Google Plus, their fourth major foray into building a Social Network. The deadline had been the end of the year but was moved up due to a number of cited factors, including data breaches.

When a seismic event like this happens in the online world, especially involving one of the “Tech Giants”, there’s a lot of e-Ink spilled about the money involved, the comparison of markets and post-mortems of performance. However, only a sliver of that coverage tends to mention the social and cultural costs involved.

In fact, to hear it often stated, also-ran social networks are almost like the embarrassing outfit you wore in school or a bad hair day – something we all experienced, but don’t want to talk about.

However, recording and preserving The Web has been our mission for 20 years, and if there’s one thing we’ve learned – it’s that it’s never as simple as “old is terrible, new is good”. In fact, some of the oldest materials of the Web, in all their lower-resolution, lacking-fidelity forms, are also our most emotionally connected and meaningful, due to the passage of time.

On Google+, and before it, on Geocities, FortuneCity, and many others, there’s always been a question of who exactly the services are for. Are they meant to be general purpose shared albums of notes, photos and birthday announcements? Or are they places of assembly, where like-minded folks or families gather to communicate and debate, argue and reconcile? The answer, it seems, can often be whatever advertisers want, but in fact it often ends up being a little bit of everything to everyone, and the longer a given service or network exists, the more drift of purpose it will experience.

The biggest difference between “then” and “now” in the eyeblink of Web History is primarily storage and speed. Geocities, at its peak, may not have exceeded 10 or 15 terabytes of data at any one time. Google Plus, however, probably runs into the petabytes. Choosing to “back up” or make a Wayback Machine-compatible snapshot of these places turns into a choice of how much of the Internet Archive’s budget should go towards holding them. Ideally, the answer would always be “all of it”. But sites are getting larger, the shutdown time frames smaller. It’s a constant concern.

Also, when spending this much time and effort to mirror a site, another consideration is how “unique” the material is on it. Were these sites used to share already-available media we could get at other services? Or were special conversations and creations living on the closing site that we will never see again?

Throughout the history of our online times, experts and keepers of special knowledge will share what they know – be it on mailing lists, image boards, ‘groups’ or ‘clubs’. For many, from 2011 to this shutdown year, Google Plus worked to make it easy to be one of those destinations. Time will tell how much might be lost, and how much efforts to mirror it have saved.

ZIP is Broken, Except it’s Not, Except it Is

With many thousands of software items up at the archive, we’re both very useful and also very intimidating, depending on how exactly you know what you’re looking for. While it’s great when your search query gives you exactly what you need (like, say, a manual for the greatest elevator simulator of all time or a lovely flip-album of floppy disk sleeves), it’s not so great when it doesn’t.

Our rather expansive approach to acquisition of items means that if you have a long-hazy memory of something you want to see again or want to do a query in a generalized “show me all the shooters that came out for this platform”, you’ve got a lot of digging ahead of you. I’ve had many lovely conversations with people who are looking for something specific software or game-wise, that have ended with being able to point them to an emulated version of it. Other times, I have to hand them a way to look inside a CD-ROM image from nearly 20 years ago, like this URL inside a GIF CD-ROM from 1992, which was a lovely rendered image of the Apple Logo and semi-transparent balls.

Here’s the image, which is just nice to look at:

Beyond the findability problem, there’s also the deeper problem that computer history has a lot of buried bodies. There were conflicts and issues related to interoperability, who ran what standards, and which programs actually did what they were supposed to. These problems persist in the modern world, but they have rapidly become the province of several abstract layers away: “my Playstation 4 doesn’t play every Playstation 3 game”, or “I can’t paste this image into my twitter post with a simple copy-paste, I have to put it in a paint program and copy-paste that.”

It used to be a lot, lot worse.

Which brings us to .ZIP.


Since computers have come onto the scene, connections between them (and to the user) have always suffered for lack of bandwidth. Sending text, data, images and sounds between different locations has always been some level of slow or undependable. There have been lots of innovations across the decades to deal with it; one of them is compression techniques.

This is where the computer takes a file or sets of files, combines them, finds similar parts, and replaces those similar parts with one-off references to them. The algorithms to do these have become more complicated over time and require more computing power on the compressing end, and in some cases the decompressing end.
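To make that a little more concrete, here is a small Python sketch using the standard zlib module, which implements DEFLATE, the method most .ZIP files use. The sample data is made up for the example: one buffer repeats the same phrase over and over, the other is random bytes with no similar parts to find. It is only a demonstration of the general idea, not of any particular historical utility.

```python
import os
import zlib

repetitive = b"INTERNET ARCHIVE " * 1000   # the same 17 bytes, over and over
noisy = os.urandom(len(repetitive))        # random bytes with no repeated patterns

# Level 9 asks for the most thorough (and slowest) search for similar parts.
packed_repetitive = zlib.compress(repetitive, 9)
packed_noisy = zlib.compress(noisy, 9)

# Decompressing restores the original bytes exactly.
assert zlib.decompress(packed_repetitive) == repetitive
assert zlib.decompress(packed_noisy) == noisy

if __name__ == "__main__":
    print("repetitive:", len(repetitive), "->", len(packed_repetitive), "bytes")
    print("noisy:     ", len(noisy), "->", len(packed_noisy), "bytes")
```

The repetitive buffer shrinks dramatically, while the random one barely changes size at all, because there is nothing in it to replace with a reference.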

And here’s the thing: There have been a lot of file compression formats.

So many of them, in fact, that there’s some legitimate concern that there are compressed files out there for which no decompression program exists anymore. That’s certainly the case for a lot of proprietary file storage formats that were meant to run with one specific program (think a game data file, or a word processing program), but we’re sticking to generalized “File Compression Utility” formats in this essay.

Just in the IBM/DOS world, here are some file compression format extensions that have been created for a variety of reasons and which have been considered as in use:


Some of these were made for other machines, but were made available via utility to the DOS world. They’ve got great names, reflected in the filename but just barely; names like Hamnersoft HAP/ Knowledge Dynamics, Voof, Zoo, Novosielski, ShrinkIt, and ReeveSoft Freeze. Pretty much all have fallen by the wayside in terms of usage (as has DOS itself), so we don’t generally see new versions of these show up.

Except .ZIP. ZIP won the battle, and is the dominant compression scheme for “files” (as opposed to video/audio compression).

But what is .ZIP?

ZIP is ZIP, except Not ZIP

Co-created by Phil Katz and Gary Conway in 1989, .ZIP was a reaction to a lawsuit. In the growing realm of file compression utilities, one format, .ARC, created by System Enhancement Associates, had started to rise, and PKWARE (Katz’ company) made a competing product, PKARC, that used original .ARC source code but rewrote it in faster routines, making it speedier. System Enhancement Associates sued PKWARE and won in a settlement, resulting in abandoning .ARC and a new format being created. The bad blood and publicity from the lawsuit helped drive adoption/conversion to the replacement format, .ZIP.

(I actually made a documentary about this part of the story.)

ZIP’s wide adoption and easy, clear documentation of the format meant support for it started expanding over time. Besides compressing the files themselves, a format like .ZIP preserves timestamps, has integrity checks, and maintains directory structure. (Many others do this as well.) If you uncompress a .ZIP file from 1992, you’ll be able to see when it was created and compressed, and other important data from a historical perspective. Also, if the file is from the early 1990s, chances of unpacking it successfully with any of a large range of current methods are really, really high. Drag it to your Windows, OS X or *nix environment, and chances are you’ll do fine.
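For the curious, here is a minimal Python sketch of what that extra information looks like, using the standard zipfile module. Rather than assume any particular file from the Archive, it builds a tiny .ZIP in memory with a made-up 1992 timestamp and a made-up docs/readme.txt entry, then reads back the stored path, date, CRC-32 checksum and size after running the built-in integrity check.

```python
import io
import zipfile


def build_sample_zip():
    """Build a small .ZIP in memory with a 1992 timestamp and a subdirectory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        info = zipfile.ZipInfo("docs/readme.txt", date_time=(1992, 6, 15, 12, 30, 0))
        info.compress_type = zipfile.ZIP_DEFLATED
        archive.writestr(info, "Hello from 1992\n" * 50)
    return buffer.getvalue()


def describe(zip_bytes):
    """Return (name, timestamp, CRC-32, size) for every entry, after an integrity check."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        bad = archive.testzip()  # name of the first entry failing its CRC check, or None
        if bad is not None:
            raise ValueError(f"corrupt entry: {bad}")
        return [
            (entry.filename, entry.date_time, entry.CRC, entry.file_size)
            for entry in archive.infolist()
        ]


if __name__ == "__main__":
    for name, stamp, crc, size in describe(build_sample_zip()):
        print(name, stamp, hex(crc), size)
```

The same describe function could be pointed at the bytes of a real vintage .ZIP to see when its contents were packed, without unpacking anything to disk.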

The closer you get to now, though, the more problems arise.

The most damning issue is that different operating system versions approach .ZIP slightly differently, which mostly works, and lets you even treat a .ZIP file like a little disk drive or folder, adding and removing files within it while preserving the compression. Why unpack 800 megabytes of files when you only need this single 5 megabyte one? Similarly, you can construct a new .ZIP file on your desktop, adjust a bunch of parameters within it, and poof, a .ZIP file you can attach to e-mail or pass along via other ways.

But from 1989 to now, with ZIP being 30 years old, there have been expansions to the format, small changes that keep it backwards compatible, but with nothing to easily tell a user that they’re using an out-of-date or different uncompression program.

The current cross-platform king is Info-ZIP, which has a homepage that credits the many people who have worked on it and access to the versions from over the years. It has been continually maintained to handle new issues, and is generally excellent at backwards compatibility. It’s probably your best bet to getting the information back out of a .ZIP file.

But that’s not what everyone uses.

“It Doesn’t Work”

On dozens of software items at the Internet Archive are reviews where a strange phenomenon happens:

  • Some reviews indicate the contents were just what they were looking for.
  • Some declare it broken, and terrible and truncated.

They’re both right.

One of the most problematic day-to-day technical issues with computers is bit limits. When you hear discussions of “8-bit”, “16-bit”, “32-bit” and “64-bit”, it usually reflects some resource within the system (graphics, filesystem, pipeline) being limited to a certain amount of addressing. If your daily job is computer development, this is probably old news to you; but not everyone’s daily job is computer development.

In general, a modern system will be some amount of 64-bit, with some 32-bit addressing thrown in a few corners simply because it’s not thought there’ll be a use for more. A 32-bit address covers, very roughly, about 4 gigabytes of information.

This means that when someone on the Archive uploads a .ZIP file that is larger than 4 gigabytes, there’s a somewhat good chance that a patron who downloads that file will not have the ability to uncompress/unpack that file using the tools on their specific desktop. If they use the internal tools (or a downloaded tool) to go through that .ZIP, the program (or even the operating system itself) won’t know what to do with this very large file, and will begin throwing out errors.

However, since the nature of .ZIP files is to be somewhat resilient, some files will make it out. The tool will start to unpack them, then declare a corruption or a bug and stop working. So it looks like some of the contents are there, but not what the user was expecting or needed.
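Both halves of that story can be sketched in a few lines of Python with the standard zipfile module. The classic .ZIP layout stores sizes and offsets in 32-bit fields, so the largest value they can hold is 2^32 - 1 bytes, just under 4 gigabytes; anything bigger needs the later ZIP64 extension, which older tools may not understand. The needs_zip64 check below is a rough rule of thumb, not the exact logic of any real unpacker, and the “damage” is simulated by flipping one byte in a tiny made-up archive rather than by an actual multi-gigabyte download.

```python
import io
import zipfile
import zlib

CLASSIC_MAX_BYTES = 2**32 - 1     # largest value a 32-bit size or offset field can hold
CLASSIC_MAX_ENTRIES = 2**16 - 1   # largest entry count a 16-bit field can hold


def needs_zip64(entry_sizes, archive_size):
    """Rough check: is this archive too big for the classic 32-bit ZIP fields?"""
    return (
        archive_size > CLASSIC_MAX_BYTES
        or len(entry_sizes) > CLASSIC_MAX_ENTRIES
        or any(size > CLASSIC_MAX_BYTES for size in entry_sizes)
    )


def salvage(zip_bytes):
    """Read entries one at a time, keeping the good ones and noting the failures."""
    recovered, failed = {}, []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for entry in archive.infolist():
            try:
                recovered[entry.filename] = archive.read(entry)
            except (zipfile.BadZipFile, zlib.error, OSError, EOFError) as error:
                failed.append((entry.filename, str(error)))
    return recovered, failed


def make_damaged_zip():
    """Build a two-file archive, then flip one byte inside the second file's data."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr("good.txt", b"A" * 100)
        archive.writestr("bad.txt", b"B" * 100)
    raw = bytearray(buffer.getvalue())
    raw[raw.index(b"B" * 100)] ^= 0xFF  # simulated corruption
    return bytes(raw)


if __name__ == "__main__":
    print("5 GiB archive needs ZIP64:", needs_zip64([5 * 2**30], 5 * 2**30))
    recovered, failed = salvage(make_damaged_zip())
    print("recovered:", sorted(recovered))
    print("failed:", failed)
```

Reading entry by entry, instead of stopping at the first error, is what lets some files “make it out” while others are reported as broken.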

What Is The Lesson Here?

As the Internet Archive continues growing in acquiring software and files, our propensity for easily searchable and accessible programs means that people will rush in, encounter a file like a .ZIP file, and not know about this 30 year+ history with that format and issues that could arise. How could they be expected to?

In earlier eras of computer history, the user was expected to be able to build and pilot the ship as comfortably as ride in it as a passenger. Thankfully, those days are mostly behind us and picking up a piece of technology and using it runs into issues like placement of buttons or lacking a headphone jack, instead of concerns of header information or data formats.

But under this surface of ease and frictionless experience is the occasional roiling current of decisions, movements and changes. It reflects how truly unsettled our computer world is, and how, every once in a while, we get a glimpse into it in ways that are not obvious.

It’s a privilege to be able to hold and present these vintage programs and documents from technology and time long past. But these items lived in an environment and support structure now truly gone, and it is sometimes a period of rediscovery for researchers professional, academic and hobbyist to re-learn what we’ve forgotten.

Hopefully the archive can help remember that too.


The 12 Games of Christmas (And Nearby Holidays)

The Internet Archive has had thousands of games available to play in your browser for over five years now, but the joy of booting up these items immediately never seems to grow old. In fact, the main issue is there’s so many, and they’re from all different eras and times, that it might be worth it to point out 12 Christmas (and general Holiday Season) themed games just to try out.

(Most of these should work fine in most modern browsers, including Chrome, Firefox, and Edge, along with browsers that use the same engines. Safari and Internet Explorer, as well as others, might have issues here and there. Always give Jason Scott, our Software Curator, a heads-up as to what problems you might have.)


The Daze Before Christmas is a platformer for the Sega Genesis. 

This is a pretty wild game, made in 1994 by a Norwegian game studio and featuring a very santa-like character who fights a huge range of enemies across a wide range of levels. Your command buttons are ARROW KEYS for movement, the CTRL key for the A button, ALT/OPTION key for B button, and the SPACE bar for C. The manual for this game is located here.


A conversion mod was done for an earlier id Software creation, Commander Keen; again, all the usual sprites and graphics have been totally redone to give us holiday cheer. You can play the redone Commander Keen here.

The commands are the usual ARROW KEYS to move and CTRL to take actions. After a top-down view, it switches to a fast-paced platformer for everyone’s favorite kid, wearing a Santa hat.


This 1993 platformer game has it all – stunning MS-DOS graphics, slick and easy controls, and a sense of real craft put into every frame. Complete all seven levels and Christmas will be saved.

When you start the game, there’s a small selection screen. Be sure to hit the F key, so you get that rocking Christmas music in the background. Use ARROW KEYS to move and SPACE to… throw snowballs.


Nightmare Before Christmas ~ Handheld Electronic Game

Trust me, this sounds a lot better than it looks. Part of our larger handheld collection, this license of the original Burton-Selick movie has Jack walking, minding his own business while avoiding snowballs and other creatures. You use the ARROW KEYS as well as the CTRL key to take action, although you’ll be hard pressed to enjoy it! Unless the Pumpkin King holds such a sway with you that you’ll take the effort…


This ZX game has a lovely set of colors and graphics as you guide santa through finding pieces of his sleigh, then riding through the night. If you’ve never played a game on the ZX Spectrum (a fascinating machine in its own right) then the controls are going to seem a little bit odd. Be sure to select 1. KEYBOARD at the selection screen, and then check out these controls:

Use the O KEY for left, the P KEY for right, A KEY for down and Q KEY for up. Press SPACE for action and fire. Trust me, the keyboard was very small and your hands would have thanked you, back then.


If you ever played text adventures in decades past, you’ll have feelings about the fact they’re still around, still accessible to play, and still text-based interactive stories that allow you to play them one sentence at a time. In this case, you can play THE ELF’S CHRISTMAS ADVENTURE, an Adventure Game Toolkit story of a hapless elf pulled back into an emergency back at the North Pole.

Just curl up near a crackling fire, boot the game up, and start typing commands – you’ll fall into the old fun and frustrations of text adventures in no time.


The groundbreaking Wolfenstein 3D by id Software (1992) got a holiday makeover in the late 1990s, with the WWII imagery replaced by trees, wreaths, nutcrackers, banners of holiday cheer – you name it. Just click here to try this version out.

It’s still a first-person shooter, however, so you’re armed and causing mortal damage, although maybe tell yourself it’s evil people wearing Santa suits at the annual Dungeon Holiday Party. The standard keys work: ARROW KEYS to move and CTRL to fire, with SPACE to open doors and secret wall entrances.


This Commodore 64 game is rather slow in places (you can wait a long time for it to load), but a parent playing with a child can enjoy the music and graphics a lot. This 1986 interactive christmas card came from American Greetings. There’s even a singalong! 

(Not kidding about how long it takes to load – but the music and graphics make it worth the wait.)


When Lemmings, an incredibly popular game of the early 1990s, decided to release a holiday version with Christmas themes including graphics and sound, it too was an enormous hit. Some people even preferred it to the original, since it was so incredibly festive and the music was a beautiful Amiga soundtrack of holiday hits. Click here to play.

After a grey bootup screen, the game will come up, with you clicking your mouse into the window to activate the little lemming hand/mouse pointer. Choose PLAY and enjoy the game: You’re guiding dozens of little lemmings dropping out of a trap door to send them into an exit. Assign them different duties (building, digging, blocking) by clicking on the tiles at the bottom. (There are numbers to indicate how many times you can assign the lemmings a job). If you get stuck, there’s a little nuclear option to choose too. 

(If you’ve never played Lemmings before, you’ll be in love with the little guys in minutes.)


This revamping of the classic platformer JAZZ JACKRABBIT came out as a holiday gift, with a green bunny fighting to save the world while dressed for handing out presents. Use the ARROW KEYS to move around, ALT/OPTION to jump, and SPACE to shoot.

This game is fast, an obvious nod to Sonic the Hedgehog, and so once you get going you’ll be hard-pressed to keep track of everything going on the screen. But the festive graphics and sound will keep you coming back. Click here to play it.


JETPACK CHRISTMAS SPECIAL! is a platformer with a small santa running around collecting presents and causing havoc trying to save Christmas. When starting up the game, press I for an excellent included instruction manual about the backstory and how to play the game. Otherwise:

Press S to start, and then the ARROW KEYS to move, SPACE for your status, ALT/OPTION to thrust, and CTRL to “Phase”. Note that this game is all about the Jetpack, allowing you, Santa, to fly all over the place.

Fun fact: If you leave the title/credits screen going, the snow will start to pile up. 25 years ago, this was a big deal, computer graphics-wise. 

Another fun fact: This game has one of the legendary BOSS KEYS that were a staple of videogames of the time – pressing F10 during the game will kick it over to look like just a regular MS-DOS prompt, complete with blinking cursor. Press F10 again to bring the game right back!


Finally, a simple 1993 platformer with lovely music, “Santa is Back!” has Santa running between all manner of platforms, collecting snow globes and presents and all sorts of different holiday items to save Christmas. Just use the ARROW KEYS to move around and the SPACE bar to kneel. There are multiple screens and a few short levels.

Have a delightful holiday season, enjoy these many strange and fun games, and thanks for being a user at the Internet Archive!

Don’t Click on the Llama


Clicking on the Llama will release Webamp, a JavaScript-based player that mimics, down to individual strangeness and bugs, the operations of the once-dominant Winamp, a media player considered to be one of the classic software creations of the 1990s.

To help you avoid this llama, we’ll tell you it’s in the upper right corner of any Internet Archive item that has a music player in it. This means the Grateful Dead recordings, radio airchecks, network record labels like monotonik, and all manner of podcasts now have the capability to be turned into a Winamp-like player that becomes your new default.

(If, by mistake, you click on the Llama, clicking on it again will turn off the Webamp player and restore the default player.)

This all got started because of the skins.

As part of our celebration of all things Internet, the Archive now has a large collection of Winamp Skins, which were artistic re-imaginings of the Winamp interface, that allowed all sorts of neat creative works on what could have been a basic media player. These “skins” were contributed to over the years (and new ones are still created!) and now number in the thousands. In the collection you’ll see examples of superheroes, video games, surreal images and a pretty wide array of pop stars and celebrities.

We have added over 5,000 skins (with many more coming), and then someone had the bright idea to make the Webamp player work within the Internet Archive to show off these skins, and here we are.

Thanks to Jordan Eldredge and the Webamp programming community for this new and strange periscope into the 1990s internet past.


Over 1,100 New Arcade Machines Added to the Internet Arcade

The Internet Arcade, our collection of working arcade machines that run in the browser, has gotten a new upgrade in its 4th year. Advancements by both the MAME emulator team and the Emscripten conversion process allowed our team to go through many more potential arcade machines and add them to the site.

The majority of these newly-available games date to the 1990s and early 2000s, when arcade machines became significantly more complicated and graphically rich while also losing ground to the ever-present home video game consoles that would come to dominate gaming to the present day. Even fervent gamers might have missed some of these arcade machines when they were in the physical world, due to lower distribution numbers and shorter times on the floor.

A somewhat beefy machine and very modern browser will be required to run these games. In general, pressing the 5 key will insert coins, 1 and 2 will start 1 or 2 player games, and the arrow and spacebar keys will control the games themselves.

Let the games… continue!

To visit the new 1,100 additions, click here.

Thanks, as always, to Dan Brooks, for maintaining the Emularity system to allow near-instant upgrading of emulators and additions of new platforms to the Internet Archive collections.