Author Archives: Jason Scott

The Internet Arcade becomes an Archive Reality

Posted on November 17, 2016 by Jason Scott

A couple years back, we introduced the Internet Arcade, which enabled people around the world to play a number of Arcade titles from the last 40 years in their browsers, instantly. We’ve also had collections of console games, and a general library of tens of thousands of software programs which has also proven very popular.

The work continues to expand the emulated systems and refresh what titles are available, but a project we’ve had going on the side for a while just came to fruition.

Among the organizations that turned out to benefit from having our browser-based emulations was X-Arcade, manufacturers of high-quality joysticks and control panels for use with computers and software. Meant to have the original Arcade feel, a few examples of these controllers were gifted to the Archive and we’ve used them pretty extensively in demonstration days and special events.

Last year, X-Arcade announced an old-school full-sized arcade machine case for sale, and generously offered to send one to the Archive as well. We contacted a wonderful artist, Mar Williams of Sudux.com, who has done excellent art for the DEFCON hacking conference and many other events, and she put together custom Internet Archive-themed arcade side art for the machine. Here’s what she came up with:

The machine has made its way through shipping and moving companies and arrived at the Internet Archive’s 300 Funston Avenue headquarters in great shape, along with all the electronics and parts to make it go soon.

It’s one thing to see a mockup, and another to see the actual machine in your lobby:

Over the next few weeks, the system will be set up to run with the Internet Archive systems and provide a really nice demonstration station for the many guests and visitors we see. It really jazzes up the place!

In the meantime, we’re now providing you with links to download the artwork files, in case you want to use them yourself.

Thanks again to X-Arcade for the lovely addition to our lobby, and to Mar Williams for such fantastic art!

I CAN HAZ MEME HISTORY??

Posted on October 25, 2016 by Jason Scott

Jason Scott presents Internet Memes of the last 20 Years at the Internet Archive’s 20th anniversary celebration.

——–

It’s always going to be an open question as to what parts of culture will survive beyond each generation, but there’s very little doubt that one of them is going to be memes.

Memes are, after all, their own successful transmission of entertainment. A photo, an image that you might have seen before, comes to you with a new context. A turn of phrase, used by a politician or celebrity and in some way ridiculous or unique, comes back to you in all sorts of new ways (Imma let you finish…) and ultimately gets put back into your emails, instant messages, or even back into mass media itself.

However, there are some pretty obvious questions as to what memes even are or what qualifies as a meme. Everyone has an opinion (and a meme) to back up their position.

One can say that image macros, those combinations of an expressive image with big bold text, are memes; but it’s best to think of them as one (very prominent) kind of a whole spectrum of Meme.

Image Macros rule the roost because they’re platform independent. They slip into our lives from e-mails, texts, websites and even from walls and doors. The chosen image (in this example, from the Baz Luhrmann-directed Great Gatsby) portrays an independent idea (Here’s to you) and the text complements or contrasts it. The smallest, atomic level of an idea. And it gets into your mind, like a piece of candy (or a piece of grit).

It can get way more complicated, however. This 1980s “Internet Archive” logo was automatically generated by an online script which does the hard work of layout, fonts and blending for you. When news of this tool broke in September of 2016 (it had been around a long time before that), this exact template showed up everywhere, from nightclub flyers to endless tweets. Within a short time, the ideas of both “using a computer to do art” and “the 1980s” became part of the payload of this image, as well as the inevitable feeling it was even more cliche and tired as hundreds piled on to using it. The long-term prospects of this “1980s art” meme are unknown.

And let’s not forget that “memes” (a term coined by Richard Dawkins in his 1976 book The Selfish Gene) themselves go back decades before the internet made its first carefully engineered cross-continental connections. Office photocopiers ran rampant with passed-along motivational (or de-motivational) posters, telling you that you didn’t need to be crazy to work here… but it helps! Suffering the pains of analog transfer, the endless remixing and hand touchups of these posters gave them a weathered look, as if aged by their very (relative) longevity. To many others, this whole grandparent of the internet meme had a more familiar name: Folklore.

Memes are therefore rich in history and a fundamental part of the online experience, passed along by the thousands every single day as a part of communicating with each other. They deserve study, and they’ve gotten it.

Websites have been created to describe both the contributing factors and the available examples of memes throughout the years. The most prominent has been Know Your Meme, which through several rounds of ownership and contributors has consistently provided access to the surprisingly deep dive of research a supposedly shallow “meme” has behind it.

But the very fluidity and flexibility of memes can be a huge weakness: a single webpage or a single version of an image will be the main reference point for knowing why a meme came to be, and the lifespan of these references is short indeed. Even when hosted at prominent hosting sites or as part of a larger established site, one good housecleaning or consolidation will shut off access to the information, possibly forever.

This is where the Internet Archive comes in. With our hundreds of billions of saved URLs from 20 years stored in the Wayback Machine, a neutral storehouse of not just the inspirations for memes but examples of the memes themselves is kept safe for retrieval beyond the fleeting fads and whims of the present.

The metaphor of “the web” turns out to be more and more apt as time goes on: like a spider web, it is surprisingly strong, but it can also be unexpectedly lost in an instant. Connections that seemed immutable and everlasting will drop off the face of the earth at the drop of a hat (or a server, or an unpaid hosting bill).

Memes are, as I said, compressed culture. And when you lose culture, you lose context and meaning to the words and thoughts that came before. The Wayback Machine will be a part of ensuring they stick around for a long time to come.

Guest Post: Preserving Digital Music – Why Netlabel Archive Matters

Posted on September 30, 2016 by Jason Scott

The following entry is by Simon Carless, who worked for the Internet Archive in the early 2000s before moving on to work in media and conferences, while simultaneously maintaining collections at the Internet Archive and running the for-free game information site MobyGames.

It’s fascinating that the early Internet era (digital) data can sometimes be trickier to preserve & access than pre-Internet (analog) data. A prime example is the amazing work of the Netlabel Archive, which I wanted to both laud and highlight as ‘digital archiving done right’.

Created in 2016 by the amazing Zach Bridier, the Netlabel Archive has preserved the catalogs of 11 early ‘netlabels’ and counting, a number of which involve music that was either completely unavailable online, or difficult to listen to online. One of these netlabels is the one that I ran from 1996 to 2009, Mono/Monotonik. So obviously, I’m particularly delighted by that project. But a number of the other netlabels are also great and previously tricky to access, and I’m even more excited for those. (Reminder: all these netlabels freely distributed their music at the time, which makes it a great thing to archive and bring back.)

The nub of the problem around early netlabels – particularly from 1996 to 2003 – is due to PCs & the Internet (& pre-Internet BBSes!) just not being fast enough or having enough storage to support MP3 downloads at that time.

So this early netlabel music – on PCs and even other computers like Commodore Amigas – was composed in smaller (in kB!) module files, which were composed and played on computers by using sample data and MIDI-style ‘note triggering’ with rudimentary real-time effects. This allows 5-minute long songs to be just 30kB-300kB in size, versus the 5MB or more that an MP3 takes.
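To put rough numbers on that size gap, here is a back-of-the-envelope sketch. The 128 kbit/s constant bitrate is an assumption made for illustration (the post does not name one), and the function name is made up for this example.

```python
# Rough size comparison for a 5-minute song.
# ASSUMPTION: a constant 128 kbit/s MP3 bitrate; the post does not state one.

def mp3_size_bytes(seconds: int, kbit_per_s: int = 128) -> int:
    """Size of a constant-bitrate MP3 stream, ignoring tags and headers."""
    return seconds * kbit_per_s * 1000 // 8

SONG_SECONDS = 5 * 60
MODULE_SIZES_KB = (30, 300)  # the range quoted in the post

mp3_bytes = mp3_size_bytes(SONG_SECONDS)
print(f"MP3: about {mp3_bytes / 1_000_000:.1f} MB")
for kb in MODULE_SIZES_KB:
    ratio = mp3_bytes / (kb * 1000)
    print(f"A {kb} kB module is roughly {ratio:.0f}x smaller")
```

Under that assumption the MP3 lands a little under 5MB, and higher bitrates only widen the gap, which is consistent with the "5MB or more" figure above.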

For the more recent history of netlabels, I founded the Netlabels collection at the Internet Archive back in 2003, and that’s grown to hold over 65,000 individual music releases – and hundreds of thousands of tracks – by 2016. But the Internet Archive’s collection was largely designed to hold MP3 and OGG files, and so the early .MODs, .XMs and .ITs were not always preserved as part of this collection – and they were certainly not listenable to in-browser.

Additionally, there were a number of netlabels that used their own storage instead of the Internet Archive’s, even after 2003. But if that storage disappeared, their data disappeared with it, and music files are generally large enough not to be archived by the saintly Wayback Machine.

So if early netlabel archives existed, it was as ZIP/LHA archives on Scene.org or other relevant demoscene FTP sites. (Netlabels were spawned from the demoscene to some extent, since demo soundtracks use the same format of .MODs and .XMs.) And tracker music is annoyingly hard to play on today’s PCs and Macs – there are programs (such as VLC & more specialist apps) which do it, but it’s not remotely mainstream & not web browser-streamable.

So what Zach has done is keep the original .ZIP/.LHA files, which often had additional ASCII art & release info in them, save the .MODs and .XMs, convert everything to .MP3, painstakingly catalog all of the releases, and then upload the entire caboodle (both original and converted files) to both the Internet Archive and additionally to YouTube, where there are gigantic playlists for each label. So there are now multiple opportunities for in-browser listening & the original files are also properly preserved.
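As an illustration only, the first step of a workflow like this (sorting a release archive into module files and the accompanying text or art files) might be sketched as below. This is not Zach's actual tooling: the function name and the extension list are assumptions, it handles ZIP only (Python's standard library has no LHA reader), and the conversion to MP3 is left out because it needs an external player or converter.

```python
import zipfile
from pathlib import PurePosixPath

# ASSUMPTION: the tracker formats named in the post (.MOD, .XM, .IT);
# real release archives may well contain other module formats.
MODULE_EXTENSIONS = {".mod", ".xm", ".it"}

def inventory_release(zip_path):
    """Split a release ZIP's members into module files and everything else."""
    modules, extras = [], []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            # Amiga-era modules were often named "mod.title" (prefix, not suffix).
            is_module = (path.suffix.lower() in MODULE_EXTENSIONS
                         or path.name.lower().startswith("mod."))
            (modules if is_module else extras).append(info.filename)
    return sorted(modules), sorted(extras)
```

Keeping the "extras" list separate reflects the point made above: the ASCII art and release notes travel with the original archive and are worth preserving alongside the music.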

This means we can now all easily browse and listen to the complete catalog of Five Musicians, a seminal early global PC tracker group/netlabel, as well as the super-neat Finnish electronic music netlabel Milk, the aggressive chiptune/noise label mp3death, and a host of others. And I recently uploaded a rare FTP backup from 1998 which allowed him to put up the 10 releases (that we know about!) from funky electronic netlabel Cutoff. These may have been partially online in databases like Modland, but certainly weren’t this accessible, complete, or well-collected.

What’s somewhat crazy about this is that we’re not even talking about ancient history here – at most, these digital files are 20 years old. And they’re already becoming difficult to access, listen to, or in a few cases even find.

For example, I had to dig deep into backup CD-ROMs to find some of the secret bootleg No’Mo releases that we deliberately _didn’t_ put on the Mono website back in 1996 – opting to distribute them via BBSes instead. These files literally didn’t exist on the Internet any more, despite being small and digital-native.

I think that’s – hopefully – the exception rather than the rule. But without diligent work by Zach (much kudos to him!) & similar work by other citizen digital activists like the 4am Apple II archiver, Jason Scott (obviously!) and a host of others, we’d have issues. And we may need more help still – some of these digital-first materials may disappear permanently, as the CD-ROMs or other media they are on become unreadable.

But we’re still doing a PRETTY good job on preservation, especially with CD-ROMs being ingested in massive amounts onto the Internet Archive regularly. (I’m working with MobyGames & another to-be-announced organization on preserving video game press CD-ROMs on Archive.org, for example, and Jason Scott’s CD-ROM work is many orders of magnitude larger than mine.)

Yet I actually think contextualization and access to these materials is just as big a problem, if not bigger. Once we’ve got this raw data, who’s available to look through it, pick out the relevant stuff, and make it easily viewable or streamable to anyone who wants to see it? That’s why the game art/screenshots on those press CD-ROMs are also being extracted and uploaded to MobyGames for easy Google Images access, and why Netlabel Archive’s work to put streamable versions of the music on Archive.org and YouTube is so vital. (And why playable-in-browser emulation work is SO very important!)

In the end, you can preserve as much data as you want, but if nobody can find it or understand it, well – it’s not for naught, but it’s also not the reason you went to all the trouble of archiving it in the first place. And the fact the Netlabel Archive does both – the preserving AND the accessibility – makes it a gem worth celebrating. Thanks again for all your work, Zach.

The Hidden Shifting Lens of Browsers

Posted on August 16, 2016 by Jason Scott

Some time ago, I wrote about the interesting situation we had with emulation and Version 51 of the Chrome browser – that is, our emulations stopped working in a very strange way and many people came to the Archive’s inboxes asking what had broken. The resulting fix took a lot of effort and collaboration with groups and volunteers to track down, but it was successful and ever since, every version of Chrome has worked as expected.

But besides the interesting situation with this bug (it actually made us perfectly emulate a broken machine!), it also brought into a very sharp focus the hidden, fundamental aspect of Browsers that can easily be forgotten: Each browser is an opinion, a lens of design and construction that allows its user a very specific facet of how to address the Internet and the Web. And these lenses are something that can shift and turn on a dime, and change the nature of this online world in doing so.

An eternal debate rages on what the Web is “for” and how the Internet should function in providing information and connectivity. For the now-quite-embedded millions of users around the world who have only known a world with this Internet and WWW-provided landscape, the nature of existence centers around the interconnected world we have, and the browsers that we use to communicate with it.

Avoiding too much of a history lesson at this point, let’s instead just say that when Browsers entered the landscape of computer usage in a big way circa 1995, after being one of several resource-intensive experimental programs, the effect on computing experience and acceptance was unparalleled since the plastic-and-dreams home computer revolution of the 1980s. Suddenly, in one program came basically all the functions of what a computer might possibly do for an end user, all of it linked and described and seemingly infinite. The more technically-oriented among us can point out the gaps in the dream and the real-world efforts behind the scenes to make things do what they promised, of course. But the fundamental message was: Get a Browser, Get the Universe. Throughout the late 1990s, access came in the form of mailed CD-ROMs, or built-in packaging, or Internet Service Providers sending along the details on how to get your machine connected, and get that browser up and running.

As I’ve hinted at, though, this shellac of a browser interface was the rectangular window to a very deep, almost Brazil–like series of ad-hoc infrastructure, clumsily-cobbled standards and almost-standards, and ever-shifting priorities in what this whole “WWW” experience could even possibly be. It’s absolutely great, but it’s also been absolutely arbitrary.

With web anniversaries aplenty now coming into the news, it’ll be very easy to forget how utterly arbitrary a lot of what we think the “Web” is, happens to be.

There’s no question that commercial interests have driven a lot of browser features – the ability to transact financially, to ensure the prices or offers you are being shown, are of primary interest to vendors. Encryption, password protection, multi-factor authentication and so on are sometimes given lip service for private communications, but they’ve historically been presented for the store to ensure the cash register works. From the early days of a small padlock icon being shown locked or unlocked to indicate “safe”, to official “badges” or “certifications” being part of a webpage, the browsers have frequently shifted their character to promise commercial continuity. (The addition of “black box” code to browsers to satisfy the ability to stream entertainment is a subject for another time.)

Flowing from this same thinking has been the overriding need for design control, where the visual or interactive aspects of webpages are the same for everyone, no matter what browser they happen to be using. Since this was fundamentally impossible in the early days (different browsers have different “looks” no matter what), the solutions became more and more involved:

Use very large image-based mapping to control every visual aspect
Add a variety of specific binary “plugins” or “runtimes” by third parties
Insist on adoption of a number of extra-web standards to control the look/action
Demand all users use the same browser to access the site

Evidence of all these methods pops up across the years, with varying success.

Some of the more well-adopted methods include the Flash runtime for visuals and interactivity, and the use of Java plugins for running programs within the confines of the browser’s rectangle. Others, such as the wide use of Rich Text Format (.RTF) for reading documents, or the RealAudio/video plugins, gained followers or critics along the way, and ultimately faded into obscurity.

And as for demanding all users use the same browser… well, that still happens, but not with the same panache as the old Netscape Now! buttons.

This puts the Internet Archive into a very interesting position.

With 20 years of the World Wide Web saved in the Wayback Machine, and URLs by the billions, we’ve seen the moving targets move, and how fast they move. Where a site previously might be a simple set of documents and instructions that could be arranged however one might like, there is now a whole family of sites with much more complicated inner workings than any external party can capture, in the same way you cannot really capture a museum by photographing its paintings through a window from the courtyard.

When you visit the Wayback and pull up that old site and find things look different, or are rendered oddly, that’s a lot of what’s going on: weird internal requirements, experimental programming, or tricks and traps that only worked in one brand of browser and one version of that browser from 1998. The lens shifted; the mirror has cracked since then.

This is a lot of philosophy and stray thoughts, but what am I bringing this up for?

The browsers that we use today, the Firefoxes and the Chromes and the Edges and the Braves and the mobile white-label affairs, are ever-shifting in their own right, more than ever before, and should be recognized as such.

It was inevitable that constant-update paradigms would become dominant on the Web: you start a program and it does something and suddenly you’re using version 54.01 instead of version 53.85. If you’re lucky, there might be a “changes” list, but that luck varies, because many simply write “bug fixes”. In these updates are the closing of serious performance or security issues – and as someone who remembers the days when you might have to mail in for a floppy disk to be sent in a few weeks to make your program work, I can totally get behind the new “we fixed it before you knew it was broken” world we live in. Everything does this: phones, game consoles, laptops, even routers and medical equipment.

But along with this shifting of versions comes the occasional fundamental change in what browsers do, along with making some aspect of the Web obsolete in a very hard-lined way.

Take, for example, Gopher, a (for lack of an easier description) proto-web that allowed machines to be “browsed” for information that would be easy for users to find. The ability to search, to grab files or writings, and to share your own pools of knowledge were all part of the “Gopherspace”. It was also rather non-graphical by nature and technically oriented at the time, and the graphical “WWW” utterly flattened it when the time came.

But since Gopher had been a not-insignificant part of the Internet when web browsers were new, many of them would wrap in support for Gopher as an option. You’d use the gopher:// URI, and much like the ftp:// or file:// URIs, it co-existed with http:// as a method for reaching the world.

Until it didn’t.

Microsoft, citing security concerns, dropped Gopher support out of its Internet Explorer browser in 2002. Mozilla, after a years-long debate, did so in 2010. Here’s the Mozilla Firefox debate that raged over Gopher Protocol removal. The functionality was later brought back externally in the form of a Gopher plugin. Chrome never had Gopher support. (Many other browsers have Gopher support, even today, but they have very, very small audiences.)

The Archive has an assembled collection of Gopherspace material here. From this material, as well as other sources, there are web-enabled versions of Gopherspace (basically, http:// versions of the gopher:// experience) that bring back some aspects of Gopher, if only to allow for a nostalgic stroll. But nobody would dream of making something brand new in that protocol, except to prove a point or for the technical exercise. The lens has refocused.
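For a sense of how small the protocol behind the gopher:// scheme is, here is a minimal sketch of its two basic pieces as described in RFC 1436: a request is a selector string followed by CRLF, and each menu line in a response is a type character plus display text, then selector, host and port separated by tabs. The helper names and the sample host are made up for illustration; a real client would also need a socket connection to port 70 and handling for the terminating "." line.

```python
from typing import NamedTuple

class GopherItem(NamedTuple):
    item_type: str   # one character, e.g. "0" = text file, "1" = menu
    display: str
    selector: str
    host: str
    port: int

def build_request(selector: str) -> bytes:
    """A Gopher request is just the selector followed by CRLF."""
    return selector.encode("ascii") + b"\r\n"

def parse_menu_line(line: str) -> GopherItem:
    """Parse one tab-separated menu line from a Gopher server response."""
    fields = line.rstrip("\r\n").split("\t")
    type_and_display, selector, host, port = fields[:4]
    return GopherItem(type_and_display[0], type_and_display[1:],
                      selector, host, int(port))

def to_gopher_uri(item: GopherItem) -> str:
    """Build the gopher:// URI a browser would have used for this item."""
    return f"gopher://{item.host}:{item.port}/{item.item_type}{item.selector}"

# Hypothetical menu line; gopher.example.org is not a real server.
sample = "0About this server\t/about.txt\tgopher.example.org\t70\r\n"
item = parse_menu_line(sample)
print(to_gopher_uri(item))
```

A web-enabled view of Gopherspace is, in essence, a program doing this parsing on a server and re-emitting each item as an http:// link.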

In the present, Flash is beginning a slow, harsh exile into the web pages of history – browser support dropping, and even Adobe whittling away support and upkeep of all of Flash’s forward-facing projects. Flash was a very big deal in its heyday – animation, menu interface, games, and a whole other host of what we think of as “The Web” depended utterly on Flash, and even specific versions and variations of Flash. As the sun sets on this technology, attempts to keep it viewable, like the Shumway project, will hopefully allow the lens a few more years to be capable of seeing this body of work.

As we move forward in this business of “saving the web”, we’re going to experience “save the browsers”, “save the network”, and “save the experience” as well. Browsers themselves drop or add entire components or functions, and being able to touch older material becomes successively more difficult, especially when you might have to use an older browser with security issues. Our in-browser emulation might be a solution, or special “filters” on the Wayback for seeing items as they were back then, but it’s not an easy task at all – and it’s a lot of effort to see information that is just a decade or two old. It’s going to be very, very difficult.

But maybe recognizing these browsers for what they are, and coming up with ways to keep these lenses polished and flexible, is a good way to start.

Microphone Check: Thousands of Hip-Hop Mixtapes at the Archive

Posted on August 7, 2016 by Jason Scott

The Internet Archive has been growing an interesting sub-collection of music for the past few months: Hip-Hop Mixtapes. The resulting collection still has a way to go before it’s anywhere near what is out there (limited by bandwidth and a few other technical factors), but now that it’s past 150 solid days of music on there, it’s quite enough to browse and “get the idea”, should you be so inclined.

Note: Hip-Hop tends to be for a mature audience, both in subject matter and language.

I’m sure this is entirely old knowledge for some people, but it was new to me, so I’ll describe the situation and the thinking.

There are some excellent introductions and writeups about mixtapes in Hip-Hop culture at these external articles:

So, in quick summary, there have been mixtapes of many varieties for many years, going back to the 1970s and the dawn of what we call Hip-Hop, and throughout the time since the “tapes” have become CDs and ZIP files and are now still being released out into “the internet” to be spread around. The goal is to gain traction and attention for your musical act, or for your skills as a DJ, or any of a dozen reasons related to getting music to the masses.

There is an entire ecosystem of mixtape distribution and access. There are easily tens of thousands of known mixtapes that have existed. This is a huge, already-extant environment out there, that was established, culturally critical, and born-digital.

It only made sense for a library like the Internet Archive to provide it as well.

There’s a lot coded into the covers of these mixtapes (not to even mention the stuff coded into the lyrics themselves) – there’s stressing of riches, drug use, power, and oppression. There’s commentary on government, on social issues, and on the meaning of entertainment and celebrity. There’s parody, there’s aggrandizement, and there’s every attempt to draw in the listeners in what is a pretty large pile of material floating around. It’s not about this song or that grandiose portrait, though – it’s about the fact this whole set of material has meaning, reality and relevance to many, many people.

How do I know this has relevance? Within 24 hours of the first set of mixtapes going onto the Archive, many of the albums already had hundreds of listeners, and one of them broke a thousand views. Since then, a good number have had tens of thousands of listens. Somebody wants this stuff, that’s for sure. And that’s fundamentally what the Archive is about – bringing access to the world.

The end goal here is simple: Providing free access to huge amounts of culture, so people can reference, contextualize, enjoy and delight over material in an easy-to-reach, linkable, usable manner. Apparently it’s already taken off, but here you go too.

Get your drank on here.

Those Hilarious Times When Emulations Stop Working

Posted on June 27, 2016 by Jason Scott

Jason Scott, Software Curator and Your Emulation Buddy, writing in.

With tens of thousands of items in the archive.org stacks that are in some way running in-browser emulations, we’ve got a pretty strong library of computing history afoot, with many more joining in the future. On top of that, we have thousands of people playing these different programs, consoles, and arcade games from all over the world.

Therefore, if anything goes slightly amiss, we hear it from every angle: Twitter, item reviews, e-mails, and even the occasional phone call. People expect to come to a software item on the Internet Archive and have it play in their browser! It’s great this expectation is now considered a critical aspect of computer and game history. But it also means we have to go hunting down what the problem might be when stuff goes awry.

Sometimes, it’s something nice and simple, like “I can’t figure out the keys or the commands” or “How do I find the magic sock in the village.”, which puts us in the position of a sort of 1980s Software Company Help Line. Other times, it’s helping fix situations where some emulated software is configured wrong and certain functions don’t work. (The emulation might run too fast, or show the wrong colors, or not work past a certain point in the game.)

But then sometimes it’s something like this:

In this case, a set of programs were all working just fine a while ago, and then suddenly started sending out weird “Runtime” errors. Or this nostalgia-inducing error:

Here’s the interesting thing: The emulated historic machine would continue to run. In other words, we had a still-functioning, emulated broken machine, as if you’d brought home a damaged 486 PC in 1993 from the store and realized it was made of cheaper parts than you expected.

To make things even more strange, this was only happening to emulated DOS programs in the Google Chrome browser. And only Google Chrome version 51.x. And only in the 32-bit version of Google Chrome 51.x. (A huge thanks to the growing number of people who helped this get tracked down.)

This is what people should have been seeing, which I think we can agree looks much better:

The short-term fix is to run Firefox instead of Chrome for the moment if you see a crash, but that’s not really a “fix” per se – Chrome has had the bug reported to them and they’re hard at work on it (and working on a bug can be a lot of work). And there’s no guarantee an update to Firefox (or the Edge Browser, or any of the other browsers working today) won’t cause other weird problems going down the line.
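The narrow conditions described above (Chrome, and only version 51.x) are the kind of thing a site could check from the browser's user-agent string in order to show a warning. The sketch below is purely illustrative and is not how the Archive handled it; the function names are made up, and the 32-bit versus 64-bit distinction is deliberately left out because it cannot be read reliably from the user-agent string alone.

```python
import re

_CHROME_RE = re.compile(r"Chrome/(\d+)\.")

def chrome_major_version(user_agent: str):
    """Return Chrome's major version, or None if this does not look like Chrome."""
    # Some other browsers also advertise "Chrome/" in their user agent;
    # skip the ones marked as Edge or Opera.
    if "Edge/" in user_agent or "OPR/" in user_agent:
        return None
    match = _CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None

def is_affected(user_agent: str) -> bool:
    """True for the Chrome 51.x releases described in the post."""
    return chrome_major_version(user_agent) == 51

ua = ("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36")
print(is_affected(ua))
```

User-agent sniffing is itself one of the fragile improvisations discussed here, so a check like this is best treated as a temporary courtesy message rather than a fix.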

All this, then, can remind people how strange, how interlocking, and even fragile our web ecosystem is at the moment. The “Web” is a web of standards dancing with improvisations, hacks, best guesses and a radically moving target of what needs to be obeyed and discarded. With the automatic downloading of new versions of browsers from a small set of makers, we gain security, but more-obscure bugs might change the functioning of a website overnight. We make sure the newest standards are followed as quickly as possible, but we also wake up to finding out an old trusted standard was deemed no longer worthy of use.

Old standards or features (background music in web pages, the gopher protocol, Flash) give way to new plugins or processes, and the web must be expected, as best it can, to deal with the new and the old and fail gracefully when it can’t quite do it. Since part of the work of the Decentralized Web Summit was to bring forward the strengths of this world (collaboration, transparency, reproducibility) while pulling back from the weaknesses of this shifting landscape (centralization, gatekeeping, utter and total loss of history), it’s obvious a lot of people recognize this is an ongoing situation, needing vigilance and hard work.

In the meantime, we’ll do our best to keep up on how the latest and greatest browsers deal with the still-fresh world of in-browser emulation, and try to emulate hardware that did come working from the factory.

In the meantime, enjoy some Apple II programs. On us.

Truck and Back Again: The Internet Archive Truck Takes a Detour

Posted on April 20, 2016 by Jason Scott

When one of our employees came out of his home over the weekend, he saw an empty parking space. Granted, in San Francisco, that’s a pretty precious thing, but since this empty parking space had held the Internet Archive Truck for the previous two days, he was not feeling particularly lucky.

A staff conversation then ensued, the city was called to see if the truck had been towed, and after a short time, it became obvious that no, somebody had stolen the Truck.

This in itself is not news: thousands of vehicles are stolen in the Bay Area every year. But what made this unusual was the nature of the vehicle stolen… the Truck is a pretty unique looking vehicle.

Once the report was filed with the police and a few more checks were made to ensure that the truck was absolutely, positively missing and presumed stolen, the truck’s theft was announced on Twitter, which garnered tens of thousands of views and spread the news very far. Thanks to everyone who got the word out.

What was not expected, besides the initial theft, was that a lot of people wondered why the Internet Archive, essentially a website, would have a truck. So, here’s a little bit about why.

Besides providing older websites, books, movies, music, software and other materials to millions of visitors a day, the Internet Archive also has buildings for physical storage located in Richmond, just outside the limits of San Francisco. In these buildings, we hold copies of books we’ve scanned, audio recordings, software boxes, films, and a variety of other materials that we are either turning digital or holding for the future. It turns out you can’t be a 100% online experience – physical life just gets in the way. We also have multiple data centers and the need to transport equipment between them.

Therefore, we’ve had a hard-working vehicle for getting these materials around: a 2003 GMC Savana Cutaway G3500, often parked out front of the Archive’s 300 Funston Avenue address and making up to several trips a week between our various locations.

In a touch of whimsy, the truck has had a unique paint job for most of its life with the Archive. Notably, this isn’t even the first mural it had on its sides; here is a shot with the previous mural:

We’re not sure of the motivation in stealing this rather unique and noticeable vehicle, and there seems to be some evidence it was driven around the city for a while after it was taken. But yesterday, we were contacted by the San Francisco Police Department with really great news:

The Truck has been recovered!

Left abandoned by the side of the road, the truck was found and is about to be returned to the Archive and, with good luck, will soon be back in service helping us prepare and transport materials related to our mission: to bring the world’s knowledge to everyone.

Again, thanks to everyone who sounded the original call for the truck’s return, and to the SFPD for getting a hold of the truck so quickly after it was gone.

Saving 500 Apple II Programs from Oblivion

Posted on March 4, 2016 by Jason Scott

Among the tens of thousands of computer programs now emulated in the browser at the Internet Archive, a long-growing special collection has hit a milestone: the 4am Collection is now past 500 available Apple II programs preserved for the first time.

To understand this achievement, it’s best to explain what 4am (an anonymous person or persons) has described as their motivations: to track down Apple II programs, especially ones that have never been duplicated or widely distributed, and remove the copy protection that prevents them from being digitized. After this, the now playable floppy disk is uploaded to the Internet Archive along with extensive documentation about what was done to the original program to make it bootable. Finally, the Internet Archive’s play-in-a-browser emulator, called JSMESS (a Javascript port of the MAME/MESS emulator), allows users to click on the screenshot and begin experiencing the Apple II programs immediately, without requiring installation of emulators or the original software.

In fact, all the screenshots in this entry link to playable programs!

If you’re not familiar with the Apple II software library that has existed over the past few decades, a very common situation for the most groundbreaking and famous programs produced for this early home computer is that only the “cracked” versions persist. Off the shelf, the programs would include copy protection routines that went so far as to modify the performance of the floppy drive, or force the Apple II’s operating system to rewrite itself to behave in strange ways.

Because hackers (in the “hyper-talented computer programmers” sense) would take the time to walk through the acquired floppy disks and remove copy protection, those programs are still available to use and transfer, play and learn from.

One side effect, however, was that these hackers, young or proud of the work they’d done, would modify the graphics of the programs to announce the effort they’d put behind it, or remove/cleave away particularly troublesome or thorny routines that they couldn’t easily decode, meaning that modern access to these programs was to incomplete or modified versions. For examples of the many ways these “crack screens” might appear, I created an extensive gallery of them a number of years ago. (Note that there are both monochrome and color versions of the same screen, and these are just screen captures, not playable versions.) They would also focus almost exclusively on games, especially arcade games, meaning any programs that didn’t fall into the “arcade entertainment” section of the spectrum of Apple II programs were left by the wayside entirely.

With an agnostic approach to the disks being preserved, 4am has brought to light many programs that fall almost into the realm of lore and legend, only existing as advertisements in old computer magazines or in catalog listings of computer stores long past.

It gets better.

Easily missed if you’re not looking for it are the brilliant and humorous write-ups done by 4am to explain, completely, the process of removing the copy protection routines. The techniques used by software companies to prevent an Apple II floppy drive from making a duplicate while also allowing the program to boot itself were extensive, challenging, and intense. Some examples of these write-ups include this one for “Cause and Effect”, a 1988 education program, as well as this excellent one for “The Quarter Mile”, another educational program. (To find the write-up for a given 4am item in the collection, click on the “TEXT” link on the right side of the item’s web page.)

These extensive write-ups shine a light on one of the core situations about these restored computer programs.

As 4am has wryly said over the years, “Copy Protection Works!” – if the copy protection of a floppy disk-based Apple II program was strong and the program did not have the attention of obsessed fans or fall into the hands of collectors, its disappearance and loss was almost guaranteed. Because many educational and productivity software programs were specialized and not as intensely pursued/wanted as “games” in all their forms, those less-popular genres suffer from huge gaps in recovered history. Sold in small numbers, these floppy disks are subject to bit rot, neglect, and being tossed out with the inevitable turning of the wheels of time.

This collection upends that situation: by focusing on acquiring as many different unduplicated Apple II programs as possible, 4am are using their skills to ensure an extended life and documented reference materials for what would otherwise disappear.

Already, the collection has garnered some attention – the “Classifying Animals With Backbones” educational program linked above has a guest review from one of the creators describing the process of the application coming to life. And the write-up of a particularly thorny copy protection scheme on the 1982 game Burger Time went viral (in a good way) and was read 25,000 times when it was uploaded to the Archive.

In a few cases, the amount of effort behind the copy protection schemes and the concerned engineering involved in removing the copy protection are epics in themselves.

As an example, the educational program Speed Reader II contains extensive copy protection routines, using tricks and traps to resist any attempts to understand its inner workings and to mislead anyone trying to duplicate it. 4am do their best to walk the user through what’s going on, and even if you might not understand the exact code and engineering involved, it leaves the reader smarter for having browsed through it.

This project has been underway for years and is now at the 500 newly-preserved program mark – that’s 500 different obscure programs preserved for the first time, which you can play and experience on the archive.

Get cracking!

(The usual notes: The “Play in Browser” technology used at the Internet Archive is still relatively new, and works best on modern machines running the newest versions of browsers, especially Firefox, Chrome and Brave. Javascript (not Java) needs to be enabled on the machine to work. (By default on all browsers, it is.) The manuals for many of the programs are not directly available, so some experimentation is required, although educational programs were often designed to be understood by their audiences without any manuals. Thanks to 4am for housing their collection at the Internet Archive and the many individuals on the MAME and JSMESS teams who have made this emulation possible.)

Internet Archive Does Windows: Hundreds of Windows 3.1 Programs Join the Collection

Posted on February 11, 2016 by Jason Scott

Microsoft Windows was, to some people, too little, too late.

When it was released as Version 1.0 in 1985, the graphic revolution was already happening elsewhere, with other computer operating systems – but Microsoft was determined to catch up, no matter what it cost or took. Version 1.0 of their new multi-tasking navigation program (it was not quite an “Operating System”) appeared and immediately got marks for being a step in the right direction, but not quite a leap. Later versions, including versions 2.0 and 2.1, finished out the late 1980s with a set of graphics-oriented programs that could be run from DOS and allow the use of a mouse/keyboard combination (still new at the time) and a chance for Microsoft to be one of the dominant players in graphical interfaces. It also got them a lawsuit from Apple, which ultimately resulted in a years-long court case and a settlement in 1997 that possibly saved Apple.

Meanwhile, the Windows shell started to become more and more like an operating system, and the introduction of Windows 3.0 and 3.1 brought stability, flexibility, and ease-of-programming to a very wide audience, and cemented the still-dominant desktop paradigms in use today.

In 2015, the Internet Archive started the year with the arrival of the DOS Collection, where thousands of games, applications and utilities for DOS became playable in the browser with a single click. The result has been many hundreds of thousands of visitors to the programs, and many hours of research and entertainment.

This year, it’s time to upgrade to Windows.

We’ve now added over 1,000 programs that run, in your browser, in a Windows 3.1 environment. This includes many games, lots of utilities and business software, and what would best be called “Apps” of the 1990s – programs that did something simple, like provide a calculator or a looping animation, that could be done by an individual or small company to great success.

Indeed, the colorful and unique look of Windows 3/3.1 is a 16-bit window into what programs used to be like, and depending on the graphical whims of the programmers, could look futuristic or incredibly basic. For many who might remember working in that environment, the view of the screenshots of some of the hosted programs will bring back long-forgotten memories. And clicking on these screenshots will make them come alive in your browser.

When they focused on it, a developer could produce something truly unique and beautiful within the Windows 3.x environment. Observe this Role-Playing Game “Merlin”:

But on the whole, the simple libraries for generating clickable boxes and rendering fonts, and an intent to “get the job done” meant that a lot of the programs would look like this instead:

(Then again, how complicated and arty does a program to calculate amortization amounts have to be?)

Windows 3.1 continues to be in use in a few corners of the world – those easily-written buttons-and-boxes programs drive companies, restaurants, and individual businesses with a dogged determination and extremely low hardware requirements (a recent news story revealed at least one French airport that depended on one).

Many people, though, moved on to Microsoft’s later operating systems, like Windows 95, ME, Vista, 7, and so on. Microsoft itself stopped officially supporting Windows 3.1 in 2001, 15 years ago.

But Windows 3.1 still holds a special place in computer history, and we’re pleased to give you a bridge back to this lost trove of software.

If you need a place to start without being overwhelmed, come visit the Windows Showcase, where we have curated a sample set of particularly interesting software programs from 20 years ago.

As is often the case with projects like this, volunteers contributed significant time to help bring this new library of software online. Justin Kerk did the critical scripting and engineering work to require only 2 megabytes to run the programs, as well as ensure that the maximum number of Windows 3.1 applications work in the browser-based emulator. (Justin thanks Eric Phelps, who in 1994 wrote the SETINI.EXE configuration program). db48x did loader programming to ensure we could save lots of space. James Baicoianu did critical metadata and technical support. As always, the emulation for Windows and DOS-based programs comes via EM-DOSBOX, which is a project by Boris Gjenero to port DOSBOX into Javascript; his optimization work has been world-class. And, of course, a huge thanks to the many contributing parties of the original DOSBOX project.

The Internet Archive Telethon Pt. 3

Posted on December 28, 2015 by Jason Scott

See also Parts 1 and 2.

We dreamed up the idea of an Internet Archive Telethon, and due to the work of employees, volunteers, and performers, we put together an (almost) 24-hour show. We had an amazing time doing it.

But what were the results?

In total, including the 2-to-1 matching grant we had going on, we raised $131,134 across the 24-hour telethon period. Many donations were $50 or $100, with some lower and a few higher. Based on the funding trends from the previous year and this month, we would have expected to raise roughly $30,000 if we hadn’t done anything unusual beyond the fundraising banner and the usual contacting of folks to donate. So that means, unscientifically, that the Internet Archive Telethon brought in more than four times the expected donations, which makes it a wild success!
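For the curious, here is that back-of-the-envelope comparison as a quick Python sketch. The function name is made up for illustration, and the $30,000 baseline is only the rough estimate mentioned above, not a measured figure. It reports both the multiple of the baseline and the percent increase over it, since the two are easy to mix up.

```python
def fundraising_lift(raised: float, expected: float) -> tuple[float, float]:
    """Return (multiple of the baseline, percent increase over the baseline)."""
    multiple = raised / expected
    percent_increase = (raised - expected) / expected * 100
    return multiple, percent_increase


# Figures from the post: total raised (matching grant included) and the
# rough amount expected for the same period without a telethon.
multiple, percent_increase = fundraising_lift(raised=131_134, expected=30_000)
print(f"{multiple:.2f}x the expected total, a {percent_increase:.0f}% increase")
```

Either way you slice it, the day brought in several times what an ordinary day of banner fundraising would have.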

A shout-out to Doug Kaye of IT Conversations, who donated $10,000 to the event towards the end, as well as Kevin Savetz, who contributed $1,500 in the name of the vintage computing history he and others have been uploading. Limor Fried of Adafruit donated $500, and many, many others contributed other amounts throughout the day and night.

Not only was money raised, but awareness was raised: people were being told about the show and were checking out the Internet Archive for the first time. We got a chance to see everyone excited and happy at the end of the year about this place we work in, and to talk about what brings us there. And the performance acts, all volunteering time and effort, provided us with amazing entertainment and spectacle. It was a resounding success on many other levels as well.

Will it happen again next year? Who knows. What we do know is how incredibly wonderful the experience was, even through all the hard work and intense effort, and how great it is that a mission like the Archive’s can inspire so much.

Thank you so much for being a part of this.

There are so many people to thank for this event. We’ll start with Eddie Codel for livestreaming equipment and Jasmine/Chris/Alex at Support Class for their on-screen reactive graphics – you all made us seem much more professional. On the internal side of Internet Archive employees, June Goldsmith handled administration concerns with the hosting of the event and worked out logistics. The front office (Katherine, Laurel and Michelle) made the calls and did the reaching out for security, scheduling and logistics. Michelle invited many of our acts and made logistical arrangements for their media, as well as recruited and organized our team of non-staff volunteers. Wendy Hanamura provided advice, booking, and contacts for multiple acts, as well as being onsite for portions of the event. A lot of employees and volunteers came onsite to help run the Cortex, including Sam, Davide, Jake, Kevin, Laurel, Trevor, Jackie, Carolyn, and Jeff. Rachel Lovinger was a tireless producer for the majority of the Cortex’s existence. Carolyn did the Telethon landing page graphics and web design. Will Fitzgerald provided coding for the banner linkage as well as a major assist to near-realtime automatic updating of telethon fundraising totals. Ralf, Tracey, Tim, Trevor, Brewster, and others helped during the Great Network Confusion of December 2015, getting the entire network infrastructure whipped into shape. And, of course, our many acts, including Conspiracy of Beards, Diva Marisa Lendhart, Craig Baldwin, Andy Isaacson, Chris Gray, Justin Hall, Lauren Taylor, Jeff Kaplan, Odd Salon, Gary Gach, Trevor von Stein, the Balkan Brass Band, Alexis Rossi and Dwalu Khasu, Rick and Megan Prelinger, John Perry Barlow and John Gilmore, and John Law.

We are no doubt missing many more people who contributed to the Telethon both behind and in front of the camera – it’s a testimony to how many hands came forward to lift this dream up into reality. Thanks to everyone who was a part of it.
