Author Archives: Jason Scott

The Lardine Tapes: Celebrity, History, Conversation

Bob Lardine (1924-2019) asked great questions.

As an interviewer, he knew how to keep things light, conversational. He got the information he needed, and wrote articles based on what answers the subject provided, but he did it in a way that never felt like he was prodding, or intending to catch someone out.

He held a number of positions in journalism but one of the most memorable was as a Hollywood correspondent for the NY Post, where he would write up interviews with on-the-rise celebrities or long-established actors and directors about their current project and what they’d learned. If you’ve ever read a typical Sunday newspaper magazine with a couple pages of interview with a contemporary star of stage or screen, you’ve settled in with Bob’s bread and butter for decades.

Bob would share his interview tapes with his family, scrawled with all sorts of markings and ranging with dates from the 1960s to the 1980s. Ultimately, they came to the Internet Archive as a physical donation with the intention of being digitized and put up for all to enjoy.

A selection of Lardine cassettes from the original physical donation

For a number of years, after being donated, classified, and assigned an inventory number, the tapes were stored waiting to join a digitization queue. In 2025, the box was opened to be digitized using a tape setup and converted to .WAV sound files.

Tape Digitizing Setup – TASCAM 122mkIII deck to MOTU M4 USB Interface to Audacity

The box of audio cassettes, excepting a few in need of repair, is now digitized into the Interview Tapes by Bob Lardine collection at the Archive. It holds 57 separate recorded interviews with celebrities, and two compilations of tapes, discussed further below.

Most people will be naturally drawn to the celebrity interviews. With names like George Peppard, Sharon Gless, Ricardo Montalban and more, they represent a killer lineup of recognizable names, especially if you experienced television in the 1970s and 1980s. Many of these tapes were recorded during the height of their careers (Peppard in the middle of A-Team, Gless while appearing in House Calls, and Kate Jackson just starting out on Charlie’s Angels) and they are more than happy to talk through their biographies and thoughts while in the salad days of nationwide celebrity status.

Which is fine, but you should know – the tape quality is spectacularly terrible.

Recorded, as they were, on the tables of restaurants, in dressing rooms or sitting on set between scenes, these recordings were clearly meant for Bob to use as backups to notes he was taking on paper. Unlike the podcast microphones and post-processing software of the modern era, which can record next to moving vehicles with no problem, the tape recorder in use was likely a simple affair, and one left in the same place even as people shifted around or looked in the wrong direction while talking.

But as muddy as the interview tapes can be, they still do the job. In her 1978 interview, Olivia Newton-John talks about her accent and sketches out her plans for her future career, and the listener can follow with little trouble. Erik Estrada talks about his health regimen and his plans to support his extended family, recorded in what sounds like a small room. And Robert Urich talks about feeling betrayed by various press interviews, showing how trustable Bob Lardine is in conducting his.

Ultimately, the tapes are legible. And, once your ear adjusts to the situation, wonderfully personal. These are workers, craftspeople, artists, taking time out from their day to share their current worries, considerations and plans. They speak, not so much as a performer providing entertainment at a microphone for a “personal moment” during a concert or appearance, but people with a job sharing how they got there, and where they are going.

No interview shows this better than the 1975 interview tape with Henry Winkler.

With Happy Days now in its third season, Winkler has been given co-starring status in the series with Ron Howard. Fonz-mania, years away from famously “jumping the shark”, has him in stadiums with 25,000 people cheering for him. Under any measurement, he is experiencing super-stardom, with the sky the limit.

But in this tape, Winkler is the picture of humility. He talks about how nobody keeps the throne for long, how it can all disappear overnight, and what steps he takes to mentally prepare for that change. He fears typecasting (which turned out to be a legitimate concern in the 1980s) and opens his sketched-out plans for what to do about that. Through it all, he’s an artist who cares about his art, and is doing his best to keep a level head through a gauntlet of hyperbolic fame.

It’s worth noting that our obsession with celebrity means that many of the basic facts about these interviewees are known – where they were in May of 1975, or what the actual name of a production they were working on became. We have a literal deluge of knowledge about their marriages, divorces, places of residence. From these known facts, we can surmise a lot about what these tapes are talking about. If only this were the case with so many other cultures, now-lost places or people.

This collection would already be hours of insight and materials, but there’s just a little bit more.

Alongside these celebrity interviews, Bob also had tapes from the 1960s for a radio program called The Jewish Hour. Broadcast out of Phoenix, Arizona, and syndicated elsewhere, this radio show contains a variety of interviews, appearances and performances aimed from a Jewish perspective. There appears to be very little information about this show online – and while there might be a library or archive that has records of this show, there is nothing currently obvious to find. Until now: Lardine’s tapes have recordings, as well as related taped-off-radio recordings of interviews and shows covering historical people and events of the time. Without these tapes, there seems to be very scant recorded evidence of them available.

We’re always happy to take donations of audio cassettes like this, and look forward to continuing the process of bringing them online. Who knows what other lost treasures lurk in the world?

A very large thank you to Bob Lardine’s family for their donation of these tapes, as well as friends of the Internet Archive who helped fund purchase of the tape decks used for playback and digitization.

Staring into the Void

First, let’s get one myth out of the way: The Internet Archive has not been up, rock-steady and with no loss of service or connection, for twenty-eight years.

Starting out as a project to archive online materials, with a lot of speculative ideas of how to handle data at scale, the archive.org website was hosted at a shifting set of locations across its early years. It ran at razor-thin margins while rubbing hardware and software elbows with all sorts of then-famous sites; it directed its staff towards nebulous and aspirational goals while trying not to burn through its resources.

Stand back, we’re not sure how big this Archive is going to get.

A lot changed in October of 2001, when the Wayback Machine was introduced to the world at a ceremony at the Bancroft Library in Berkeley, and the Web spontaneously developed something it hadn’t really had before: a memory.

That Memory went from a feature to a core utility for the internet.

Collections such as the Prelinger Library and the Live Music Archive were also coming along for the ride, providing a way for people to just get to the good stuff and not face down web banners and pop-up ads just to listen and watch culture from a growing set of sources and reaching back farther in time, to before the web itself.

Serving a massively-enlarging set of data to a massively-increasing audience became an engineering and cost problem, and ultimately the problem – how do you retrieve and provide terabytes, then hundreds of terabytes, then petabytes, then dozens of petabytes of data to your patrons without, again, falling to a thousand potential problems?

Photo by Ben Margot of Associated Press, 2006.

The short answer is that you work very hard with a very dedicated crew with a shared vision, but the longer answer is that sometimes, issues arise.

Many issues.

Network equipment crashes, power strip failures, unexpected configurations and firmware upgrades gone wrong. Unaccounted growth in files, surprise operating system limits, and countless other snags and roadbumps have hit the archive over nearly three decades. These problems are definitely not unique to the archive’s existence – many other websites and computers in the world experience the same snags.

Some of the snags have been localized – an item stops loading, or a filetype renders wrong in some browsers. Others will take out a rack of machines, a fleet of drives, and late nights or long days bring them back to service.

Further issues are even more generalized: Power outages due to weather or fire, or a cable (power or network) is sliced through by a misinformed construction crew. A solid heatwave takes some of the machines out for hours at a time.

Across the years, the Archive has had outages lasting minutes, hours, and even days.

In 2024, for the first time in recent history, it was weeks.

The Archive staff was now spending long days and nights auditing, assessing, and improving the entire infrastructure of the Archive, top to bottom. To the public, we looked completely down, and to some, waiting patiently and then less-patiently for the return of the site, they came to a conclusion: this was it.

For some people, the era of Internet Archive was over. The Wayback Machine, Open Library and the Internet Archive were, in one shocking stroke, gone.

This was, it turns out, not true. And it was also something surprising: an opportunity.

Among the things it is very difficult to do is attend your own funeral. You don’t get to stand among the mourners and hear their thoughts, and to find out what about you mattered to them, and what difference you made over the course of them knowing you.

You don’t hear the proclamations, the dedications, the thoughts about what inspirations and warnings your life held.

But in October, we did.

There is, naturally, an entire ecosystem dedicated to taking news about sites like the Archive being down and stretching it into 30-minute presentations, and there are articles and editorials about any events of note online.

But during this period of weeks, we also got to see the conversations, statements and posts of long-time users, who otherwise would not have communicated about their relationship with the holdings and offerings they’d used for so long.

For many people, the Archive is a standard part of their browsing life – a vast and complex shelf of media and pieces of culture that they reach out to in the process of their day.

For others, it’s a critical tool in their toolbox of research, be it verifying a source for an assignment or tracking down long-otherwise-removed sources that would be near impossible if not for the Wayback Machine or the stacks within the main site.

And the number of people who spend their days and nights walking the collections, browsing idly and finding inspiration or entertainment or relief flipping through the items, is very significant.

The inherent invisibility of the Archive, however, can’t be ignored.

It’s clear that, for many patrons, when they look for something, they search for “SOMETHING internet archive” in their search engine or go directly to archive.org to search, but the existence of a “there” related to the archive had drifted into the background. The outage had brought the bulk of our collection and presence, the depth of it, into the foreground.

In this new attention came bewilderment at the downtime, and then a protective anger.

The Archive represents a shrinking population of sites on the web – it is not “for” a company or “for” shareholders, but is run and available “for” everyone, as much as it can afford, and facing down all the challenges that come with a constantly growing site being visited by millions of patrons, daily.

As time has passed and the years have progressed, it feels like the air you breathe and the water you drink: the place you walk through on your way to knowledge.

Staring into the void of a lost Internet Archive, people took to social media and communities to be scared, bothered, worried, and angry – and for many to recognize what part it plays in many people’s lives.

At the end of 2024, after a pretty tough year, with often-unsung employees within the Archive working incredibly long and stressful hours to minimize the outages and downtime, it’s the comments from donors, posts on social media, and supportive communications (e-mail and otherwise) that have helped make everyone excited to face 2025 and beyond.

Usually, the tidal wave of users that pass through our machines remain as blinking lights on servers, and the Archive is simply a website that many people use. In this period of darkness and loss of access, everyone was reminded of the many other parts the archive plays in life, and that, at least, is a precious knowledge.

We’re glad to be back, and to be back with you. Here’s to the next year and the years to follow.

The Backrooms of the Internet Archive

Like many bits of Internet Culture, this simple image of an empty series of rooms represents a deep-repressed or recently-remembered memory of a common Internet Legend, or it’s just a shot of nothing.

If the answer is that it’s a shot of nothing, let’s get you up to speed.

This image floated around message boards in the 2010s, posted with commentary or as a general use for a slightly off-putting photograph of a less-than-well-maintained location, and was, by most standards, rather indistinct. The internet, after all, is filled with odd images and weird drawings that cause a reaction, often after many different attempts to achieve the effect. Survivorship Bias for memes, one might say. So if one more image of an indistinct indoor landscape was out there, not much was going to come of it.

That changed in 2019, when the image was given a legend and history, made up out of the air, that it was a rare photograph of The Backrooms. The phrasing of the original declaration speaks for itself:

“If you’re not careful and you noclip out of reality in the wrong areas, you’ll end up in the Backrooms, where it’s nothing but the stink of old moist carpet, the madness of mono-yellow, the endless background noise of fluorescent lights at maximum hum-buzz, and approximately six hundred million square miles of randomly segmented empty rooms to be trapped in
God save you if you hear something wandering around nearby, because it sure as hell has heard you”

If this writing strikes you as some sort of odd, rather dramatic addition to the image of a room, then you’re being introduced to creepypasta, or as some might call them, urban legends and campfire stories. It’s part of the overwhelming need for humans to tell tales that excite and frighten, to compose meaning or horror out of the darkness, and even the mundane.

The concept of the “Backrooms” also touches on a very frequent theme of many different horror and science-fiction movies – that there are service tunnels and hard to access areas woven throughout life, known only to a special few. Movies such as The Matrix, The Adjustment Bureau, Us, Beyond the Walls, Dark City, The Cube, and many more have explored this theme – or used it as a jumping off point to tell another story.

The difference, here, is nobody really knew where that very first image came from. For a very long time.

This extended period of not finding the original source of the image left an unfinished tune, a half-written poem, about where it came from and what it meant. And the lack of information in the image as it showed up on these image boards seemed to ensure the mystery would never be solved.

So people filled in the blanks.

A Subreddit called /r/backrooms, an extended web video series called Backrooms, and endless CGI models and creations meant to extend the legend and the origin story became years of effort by thousands to draw the missing pieces of a puzzle that was never a puzzle.

A constantly shifting set of games with titles based off The Backrooms were created and presented for a willing and happy audience; it’d be unfair to choose one or even a few to highlight – there are dozens.

All of them represent the efforts to bring you into a state of heightened fear or paranoia as you lurked through a series of dark hallways, overlit carpeted spaces, and a growing dread. There’s no question there was a huge audience for this, and it is sometimes thought that this entire legend brought mainstream attention to liminal spaces, a perception of the in-between geographies of less unsettling locations. It is now enjoying life as an aesthetic movement.

Supporting this explosion of creativity and storytelling was the continued fact that nobody knew where the photograph came from. This situation, of a core image having a completely shadowy and unexplained origin, is arguably the foundation of its power.

That changed, recently.

This appears to be the origin of the Backrooms Photograph.

In March 2003, there was a former furniture store called Rohner’s Home Furnishings in Oshkosh, Wisconsin whose second floor was being renovated by the (somewhat) new tenants, HobbyTown.

Renovating the space from the sale of furniture to a new remote-controlled racing car track (among other aspects) meant pulling down partitions and ripping out carpet. This inspired taking photographs of the process, one of which, DSC001561.JPG, was the legendary “Back Rooms” image.

18 times in the last 20 years, crawlers affiliated with the Internet Archive moved through this page and grabbed portions of it, speculatively, to store for future research and reference. As the whole image was grabbed, reading the metadata of the original image reveals the date it was taken (June 12, 2002), and the camera used (a Sony Cyber-Shot model). The great unknown image, the unsettling photo of a mysterious place and time, was revealed.

However the original, anonymous user stumbled onto this photograph, it appears either that it was taken from the Wayback Machine directly, or that the Wayback Machine crawled the same site the user had found and has preserved that webpage for over 20 years.

Emerging, Blinking, Into the Light

Naturally, as news of the Backrooms being “found” travels throughout the world, responses have wildly ranged.

For some, this is a proof that “with enough eyeballs, all problems are shallow”. While we might argue about the relative worth of a given effort, the fact that it is possible for word to travel about a mystery to the point of being solved means that the world is a hair less intimidating and scary. Our shared efforts and cooperation can find the answer to a seemingly impossible-to-answer question. The fact that an image with basically no information and a blurry set of components could be tracked down and revealed is a miracle.

For others, the mystery being solved removes a little bit of magic and wonder from the world. It says that there’s no kayfabe, no holding of mystery in our hands without peeking further to tear out the secret. In this perspective, something special has been lost.

But there’s another lesson as well.

The Internet Archive’s crawlers moved through the pages of a hobby store multiple times over the years, capturing HTML, photographs, and time-stamping the process, with the equivalent care of an at-risk website, a politician on the national stage, or a legendary and obvious moment in history provided via a PDF file.

This agnostic, wide-ranging crawl likely represented both the original source of the image, and a persistent, dependable URL to reference back to it, as thousands are doing at this very moment.

This is the mission of the Wayback Machine – be the dependable, accessible connection to web history, and therefore all history. Give the Internet its Memory, which would otherwise be lost.

If you mourn the loss of legend and mystery in our quest to keep the truth transparent, available and persistent, don’t worry – the process of internalizing and analyzing the image to give the Backrooms history its full and complete story has already begun:

Here’s to the next mystery, and the next unsettling information being brought into the light and presented for the education, research and entertainment of the Internet, courtesy of the Wayback Machine.

Taking the Words Out of My Mouth (With AI)

Everyone has a different idea of what they’d do with a time machine. Mine’s pretty simple: Head back to 2012, find myself working on a side project to film a documentary, and grab my younger self by the lapels and shout, “A 1099 IS NOT A REIMBURSEMENT! GET AN ACCOUNTANT!” before whatever energy sending me into the past gave out.

That simple mistake on my part had truly stunning financial consequences. When the tax bills and penalties started hitting my mailbox around 2014, it became a mass of stress. The IRS is undefeated in the award for Most Intense Collection Letters, and they were coming on a regular basis, even as I started directing more and more of my paycheck towards paying the debt.

A few friends generously sent me money to help beat back the interest and costs, but the combination of this and other debts had me backed into a corner, so I decided on a simple plan: Run a Patreon campaign where I told stories and opinions in short episodes, which were then supported by the audience, with all the proceeds going into debt repayment. The resulting podcast, Jason Scott Talks His Way Out of It, helped dig me out of that hole.

But it had an interesting side effect – coming up with new topics of discussion and consideration, once a week, meant that I had to mine deeper and deeper into my own outlook and relevant stories. After the first fifty episodes, I turned further inward. After one hundred, it became equal parts emotional and technical. And as I’m heading towards three hundred episodes, I’m surprised I have anything left to say at all. But I apparently do, and having now been doing this podcast weekly for six years, its episodes span a truly panoramic spectrum of topics.

(This is what a standing recording booth looks like – putting your head in an Audio Mailbox to maintain quiet background noise while speaking. It gets very hot in there.)

The Patreon gave patrons early access to the episodes, but the episodes are all open and uploaded some months later to the general world, including this collection at Internet Archive. Download, listen, remix, whatever you’d like – you have my complete permission and blessing.

Episodes have been uploaded to the Internet Archive since 2019, but in 2022, an opportunity to try out new technology came up – the Whisper project, open sourced and instantly downloadable, could be implemented for transcription, either as part of video or just a basic audio file. And thanks to the project, I had many audio files, and began experimenting with using Whisper against them.

Speech recognition, the process of turning spoken words in a microphone or pre-recorded audio files into written words or issued commands, has been around a very long time – decades and decades. The Internet Archive is excellent for doing a dive into historical citations; a fast “text contents” search found these points of discussion in a 1979 issue of the Silicon Gulch Gazette newsletter:

What has changed is the combination of much faster computers, much more analysis of speech, and advances in cross-referencing the resulting training to make chips and, in this case, a program that is using other disciplines within computer science to pattern-match audio, to the point of adding capitalization and punctuation from the implications in the words. Turning this against my growing collection of podcasts, it wasn’t long before I’d say what has continued to be a theme: when it works, it’s shockingly good, and when it doesn’t, it’s shockingly bad.

As an experiment and exploration, it was very useful to let the program run, shoot out a block of text, and generate the resulting timing blocks for the purposes of subtitles or transcription:

…but it would have been foolhardy to 100% walk away and let it do transcription without a second human-driven scan through the results to find mistakes. I’ve been that human, and I’ve seen things.

I’ve seen the resulting transcriptions do great jobs with proper name capitalization, odd and challenging punctuation, and paragraph breaks. I’ve also seen it knock itself silly on my New York accent and non-obscure phrasing, and definitely make a poor guess on my made-up word “Cowicature”. The algorithm works great, except when it doesn’t.
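For readers curious what those timing blocks look like in practice, here is a minimal Python sketch that turns transcription segments into SubRip (SRT) subtitle blocks. It is an illustration, not the actual pipeline used for the podcast: the helper names and the sample segments are hypothetical, and it assumes each segment is a dictionary with `start` and `end` times in seconds plus a `text` string, which is the general shape Whisper-style tools report.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from segments shaped like
    {"start": float, "end": float, "text": str} (an assumed layout)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        text = seg["text"].strip()
        # Each block: a counter, the time range, then the spoken text.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hypothetical sample data, standing in for real transcription output.
example_segments = [
    {"start": 0.0, "end": 4.2, "text": " Hello, and welcome to the podcast."},
    {"start": 4.2, "end": 9.75, "text": " This week: tape recorders."},
]

if __name__ == "__main__":
    print(segments_to_srt(example_segments))
```

The human review pass still applies: a script like this only reshapes whatever the transcription produced, mistakes included.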

And here we get to a turn of phrase I’ve come to adopt, which is an alternate term for AI: “Algorithmic Intensity”. The human need to give life and will to machinery is a very long-lived one; but most who look at the code behind this mechanism would agree – it’s just code. The only difference is that the amount of computing power and data to derive the outcome dwarfs numbers considered unattainable a decade or two ago.

And the speed can’t be beat – 5 years of weekly podcasts took roughly 4 hours for Whisper to completely transcribe, and that amount of time was simply because it was set as a lower-priority action in the queueing system. Working out the total running time of all the uploaded episodes sitting in that collection is a little involved, but my back-of-the-napkin estimate is that it’s two and a half solid days of me talking about technology and emotion, with subjects ranging from programming and compilers to summer camp and family. I can assure you – I was not going to transcribe these podcasts anytime soon, and I was not going to ask someone to do it. While I’m proud of my work, I’m in no position to be able to record four podcasts in a month and create paragraphs of text from them.

Or, for that matter, descriptive summaries.

Which brings me to the latest, continual tinkering with the tools and environment available to analyze materials with algorithms. I began asking a large language model to look at the generated transcriptions and create a summary of a given podcast episode.

Two-plus years into generalized algorithmic intensity access, it’s still very much a lumpy and oddly spell-casting endeavor. Instead of asking it to “summarize this transcription”, my request reads like a headmaster at a school or a Dungeons and Dragons game text:

You read transcripts of a podcast and carefully write out descriptions, in the form of narrative paragraphs, to accurately describe the content of the podcast. Longer and more complete descriptions are better, and encouraged. You describe the main subjects, conclusions by the participants, and provide helpful context for the subjects. The podcast you listen to is:

…followed by the transcription of the podcast, time-codes and all.
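As a rough illustration of that request, the sketch below assembles the instructions and a transcript into one block of text in Python. The function name and the sample transcript are hypothetical, and nothing here calls a particular model or service; how the finished text gets sent to a large language model is left open, since the post does not say which one was used.

```python
# The instruction text quoted above, kept word for word.
SUMMARY_INSTRUCTIONS = (
    "You read transcripts of a podcast and carefully write out descriptions, "
    "in the form of narrative paragraphs, to accurately describe the content "
    "of the podcast. Longer and more complete descriptions are better, and "
    "encouraged. You describe the main subjects, conclusions by the "
    "participants, and provide helpful context for the subjects. "
    "The podcast you listen to is:"
)


def build_summary_prompt(transcript):
    """Return the instructions followed by the transcript, time-codes and all."""
    # The transcript is passed through untouched apart from trimming the edges.
    return SUMMARY_INSTRUCTIONS + "\n\n" + transcript.strip()


# Hypothetical transcript snippet, standing in for a real episode.
sample_transcript = (
    "[00:00:00] Hello, and welcome to the podcast.\n"
    "[00:00:04] This week: tape recorders."
)

if __name__ == "__main__":
    print(build_summary_prompt(sample_transcript))
```

Keeping the instructions in one constant makes it easy to tinker with the wording and rerun the same transcripts, which is most of what this kind of spell-casting amounts to.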

A matrix of calculation, fast beyond my reckoning but not less mysterious-and-not-mysterious as CPUs and networking itself, begins analyzing the language in the transcriptions, cross-connecting ideas mentioned, occasionally volunteering more information based on matches to terms, and within a few minutes, language comes out.

…in this language, it’s easy to find places where there’s an algorithm in the mix, a machine putting out syllables and phrases based on what I said. Like a kaleidoscope or a magnifying glass, there are occasional strange distortions and what approaches funhouse-mirror reflections of what I put in.

And oddly enough, it brings up memories.

It reminds me of being in my first neighborhood and all the kids circled around a fun and weird toy, a tape recorder, where you pressed two buttons and the … cassette, it was called? Would turn slowly and you would shout whatever came to the top of your head, press STOP, and then rewind and hear your own voice. It was distorted and weird, but it was my own voice, and I’d not heard it before from outside my own head. The world shifted, a little bit.

It reminds me of being in a computer lab at MIT in my twenties, playing games on a lunch break during a temp job I had on campus, and finding out that if I went to this obscure IP number and downloaded this very large binary file and ran it, this gargantuan program would pop up and begin asking me what “address” to go to, and if I knew some of the addresses, they would turn into full-color, fully-laid out, beautiful print-like presentations along a grey background. And, to my utter disbelief, looking at a menu option called “view source” would let me see the bizarre “code” behind these lovely looking pages. It was called Hypertext Mark-Up Language, HTML, and messing with that odd and ever-changing set of written directives inside textfiles would be a significant part of my next quarter century.

There have been all these times that computers (on my desk or lab or out in some nebulous “net” that girdled the Earth) provided me with a new set of buttons and knobs to press. To play around. To see what hums and clicks, and to do my best to understand what is happening underneath. In many ways, these computers have done very well by me; and they have also brought stress, sleepless nights and confusion; even anger.

Put another way, these newest concepts and tools, these Intense Algorithms, sweeping the news and the discussion around virtual watercoolers, are the latest in a long line of topics about what computers will bring to human lives. It’s a deep and varied discussion and it deserves consideration and time, and respect of the opinions of others.

And while (hopefully) the world processes this newest use of computers, we’ll see where, if anywhere, they’ll help with the mass of information and digital materials that sites like the Internet Archive provide.

Asking an LLM to look at the full text of a nearly 200-page book about the raising of ducks, such as the legendary DUCKS; AND HOW TO MAKE THEM PAY (1924), provides a paradise of questions you can ask the book as if it was able to answer in full sentences:

Q: How, exactly, do you make ducks pay?

A: The book “Ducks: and how to make them pay” by William Cook outlines various methods for making duck farming profitable. It covers the selection of suitable duck breeds, breeding techniques, and the proper management of feeding and housing. Additionally, it provides insights into marketing strategies for selling ducks and their eggs. The goal is to guide readers on how to efficiently manage duck farms to maximize income from their operations.

Q: What is, in your opinion, the best duck?

A: Choosing the “best” duck depends on what you’re looking for! For farming purposes, the Pekin duck is often considered ideal because it grows quickly and has a high yield of meat. If egg production is your priority, the Khaki Campbell duck is highly recommended, as it is one of the most prolific egg layers. For those interested in ornamental ducks, breeds like the Mandarin or Wood Ducks are known for their striking appearances. Each breed has its own advantages depending on your goals and the specific requirements of your duck farming or hobbyist endeavors.

(…as an aside, the book has one of the best bookplates inside the front cover, one which only a human being would truly appreciate.)

And that’s how I see the near future with this newest use of computers – an implementation of these tools to make materials easier to find, to summarize them and help point researchers and students in the right direction, and to allow new ways to work with a staggering amount of information in the online stacks. Hiding from it won’t be a solution; but asking hard questions of it might be.

Meanwhile, six years of my spoken-word memories of the ways computers have affected my life are sitting in a collection, open to all, and anyone is welcome to tinker with them. Have a great time.

A Happy Ending For Some Manual Labor (And a Call for Support)

A great saga of rescue and preservation is coming towards its end, and there’s a chance to bask in the victory, and help push towards its conclusion.

I got word in 2015 of a collection of manuals inside a business that was getting out of the manuals business. While a lot of well-meaning people talked a good game, they wanted to cherry-pick (people getting rid of stuff hate cherry-pickers), so I drove down to show I was serious. After a week of work with MANY volunteers and contributors, we ended up with pallets of documentation inside boxes: tens of thousands of unique manuals, many available nowhere else.

Then they were stored in a storage unit. Then they were stored in a closed coffee house. Then they were transported to Internet Archive’s Physical Archive. Then they were stored until last year, 2023.

Last year, a group called DLARC, doing digitizing and indexing projects around ham radio and radio technology, worked with me and the archive to sort four pallets of the manuals for products related to the history of radio/network technology, and off they went overseas to be scanned. And as of this month, the evaluated, professionally-scanned and available-to-the-world manuals are finished, except for a few stragglers. The loop has closed!

You can browse the collection of thousands of scanned manuals here:

The Manuals Plus Collection

And now, the pitch.

The company doing the digitizing does lots of digitizing for the Internet Archive. They are well-paid and legitimate professional contractors who are sent the items, and who do careful scanning to the best of the materials’ ability to provide access to the information, and then do quality checks, and then upload them. When they’re humming, they’re processing a pallet every couple of weeks (with lots of mitigating factors).

I’ve negotiated a situation where, if money is sent in, the remaining pallets that should be scanned can just be sent along without sorting them for DLARC funds; DLARC will fund any that happen to overlap with their mission, and the rest will just be done.

That’s if money is sent in.

How much money? The number approaches hundreds of thousands of dollars. So I’m looking for both big-ticket supporters (who can mail me at jscott@archive.org) and individuals.

Here is a specialized donation link: https://archive.org/donate/?origin=jssm-dlarc

If we make less than we need to scan them all, then we’ll only scan up to where it’s paid for. I believe we can close it out, but if the interest/money isn’t there, then it isn’t there – fair enough. Browse the collection as it is, already grown into thousands of manuals, and consider whether you want to be part of all that. That much has definitely happened.

But what a happy ending it would be to push all these manuals through the process, and close it up. That’s why I’m popping up to talk about it, and why I hope you would consider contributing towards it, for a non-profit that deserves your support generally.

A Visit From (And Conversation With) Jordan Mechner

If you’ve ever taken a tour of the Internet Archive headquarters with Brewster Kahle, you’ve likely watched him play a minute or two of the game “Prince of Persia” on our in-browser emulator. While talking through the technology involved, Brewster will press the keys to make the main character run through the dungeons of a kingdom, often dying rather quickly.

Over the years, the area around the “Prince of Persia” station has gained additional decorations, including a print drawn by the creator of Prince of Persia, Jordan Mechner. Entitled A Faithful Friend, the print depicts a moment in the Prince of Persia game where a small mouse visits the captive princess.

Worlds collided recently when Jordan Mechner, in town for the Game Developers Conference 2024 and doing some readings of his new graphic novel memoir Replay, stopped by the Internet Archive for a tour and discussion with Brewster.

This provided a unique opportunity for the creator of a game that Brewster had been playing for years to give him tips to learn how to do a better running jump and get farther along than he had in his many demonstrations on the tour. It can be reported that Brewster was a fast learner and took Jordan’s suggestions to heart.

Jordan was also kind enough to gift a signed copy of Replay to the Internet Archive.

Conversation turned to the Internet Archive’s help in Jordan’s work creating Replay, including images and research for the historical parts of the novel.

During the conversation, Jordan had this to say:

“I appreciate [The Internet Archive] as a graphic novelist and as a game developer. Everything I’ve done throughout my life has been based on inspiration that I get from other things and on research that I’m able to do. When I went online to write and draw this 320-page book about game development and about my life and my family’s history, I looked for visual references of everything from old postcards and photographs to video game consoles.”

“I wanted to draw the floppy disk caddies and 1970s movie posters I had in my office in Brøderbund when I was making the first Prince of Persia on the Apple II. And where could I find a 1983 April issue of Softalk magazine, which is how I learned 6502 assembly language programming? So many times, when I searched online, it was the Internet Archive that came through.”

Brewster agreed:

“Well, I’m glad we’ve been useful to you, but also thank you for going and being a model for taking something that’s very, very popular in the past and making sure that it makes it to a generation that is going to download it from GitHub and play with it and mod it and do something else with it. And you’re welcoming of that next generation, living and growing with your work.”

And Jordan couldn’t have been clearer:

“And I will say that I don’t feel harmed by that. A few years ago somebody took the time to port Prince of Persia to the Commodore 64, which the publisher had no interest in doing in 1989, because the Commodore 64 was already outdated as a platform. Even the Apple II was on its way out. But somebody has done it now just out of love, out of its challenge, and the fact that the source code was available made that easier, I hope.

“Making things available to this generation. They’re going to do weird different things with it, especially if it’s not a permission-based society. But that’s what creativity has always been based on.”

Jordan acknowledged: “Copyright law exists and was created to protect the incentive of creators to work really hard at making something. So that if someone makes something great against all odds and it gets out there and sells a lot of copies, they can make money from it. But at a certain point, things that have been created need to then be used by other people to make their versions of it. The games and movies that we love, operas, films made of the works of Shakespeare, are building on creations of the past.”

There was one last reunion in the visit: Years ago, the Archive was donated a travel case (for trade shows) used by Jordan’s game publisher, Brøderbund Software. It currently lives in one of the Internet Archive’s guest rooms, and Jordan got a quick selfie with a piece of his own history.

DISCMASTER Rises Again

In October of 2022, the DISCMASTER site arrived, providing amazing semantic search of thousands of shareware and compilation CD-ROMs at the Internet Archive. In the entry written on the blog back then, the advantages and features of this site were pretty well enumerated.

Unfortunately, the site went down in June of 2023, due to a number of factors, the most pressing of which was a need to switch hosting and administration duties. (It is not run by Internet Archive and is not hosted at Internet Archive’s datacenters.)

However, DISCMASTER HAS RETURNED!

Thanks to a set of generous donors and the efforts of multiple volunteers, the site is back running with all the data and functionality it had in its previous incarnation.

The website is at http://discmaster.textfiles.com.

The previous blog entry has fuller details on the meaning of this site and the many uses it has for computer and internet history. All hail DISCMASTER!

Mickey’s Bad Day, or, The Ecosystem

One of the Internet Archive’s most viral tweets/toots/skeets happened at the start of 2024, with the announcement/reminder that the Disney short “Steamboat Willie” had entered the public domain just moments before. We have a copy of the film online for everyone to play or download.


Within a short time, even as the hour of midnight of January 1st moved across the earth, countless creations based off the Steamboat Willie character, ranging from the sublime to the profane, rocketed into the Internet.

Along with the flood of images has come a flood of articles and overviews of the legal and other ramifications of a public-domain Mickey Mouse. These are written by very smart people who have spent a lot of time considering these issues.

There’s no point in restating what these and many others are describing (only Steamboat Willie’s design is public domain, Disney may utilize trademark law like a large hammer to enforce as firmly as they did their copyrights, etc.).

Instead, a few words about the creative ecosystem.

As a variety of slasher movies, costumes, crypto tokens, fan-fiction creations and general meme images of Steamboat Willie cascade into the first parts of 2024, it’s worth noting how the entire situation will feel unusual or controversial to a number of folks.

What it is, however, is a too-long-delayed part of a natural process of works and copyright. The implementation of universal involuntary copyright that then lasts longer than the vast majority of human lifetimes means a disconnect, a vast gulf between the life of creative works and when they become a part of culture at large in anything other than a consumption relationship.

Copyright in the US (and via the Berne Convention and other lobbying, worldwide) has been increasingly extended over the years, often following the impending expiration of the Steamboat Willie copyright, and it has done so in the face of a 20th century that knew much shorter terms (and which led to works such as Pinocchio being used by companies such as Disney after they expired into the public domain). As a result of this, we’ve lost the rich ecosystem that creative works grew from, the back-and-forth, parody and reference and re-imagining that existed in previous generations.

The time extension of copyright, from 14 to 28 to “75 years or life of the author plus 50 years” to the current “95 years or life of author plus 70 years”, has been a rapid expansion that has swallowed many creative works, and, combined with automatic copyright, has effectively ended a rich and long-held system of creations that could reference near-contemporaries in their works beyond the scope of parody or (often disputed) fair use. What was a rich environment is now a rather dry landscape.

The ramifications of this have been many, but one of the most striking has been preservation – with works whose corporate or anonymous creators are undetermined, there is very little incentive to invest in their upkeep and maintenance, meaning that many early works tend to disappear in percentages that are heartbreaking for their size: half of all American films made before 1950 and over 90% of films made before 1929 are lost forever [cite].

That excellent copies of Steamboat Willie still exist is owed mostly to Disney’s own efforts to keep their materials under control and locked down for nearly a century. Steamboat’s fellow members of the Class of 1928 will not, ultimately, be so lucky. Each successive year of items released into the public domain will have a few “stars” to make the news and receive the artistic references that Mickey is getting this month – but hundreds, maybe thousands of works from the same year may never again see the light of day.

So, let us celebrate this temporary oasis in a truly barren landscape, and work, through preservation and protection for libraries and archives, to ensure each year is a more exquisitely complete and maintained ecosystem.

A Quarter In, A Quarter-Million Out: 10 Years of Emulation at Internet Archive

10 years ago, the Internet Archive made an announcement: It was possible for anyone with a reasonably powerful computer running a modern browser to have software emulated, running as it did back when it was fresh and new, with a single click. Now, a decade later, we have surpassed 250,000 pieces of software running at the Archive and it might be a great time to reflect on how different the landscape has become since then.

Anyone can come up with an idea, and the idea of taking the then-quite-mature Javascript language, universally present in all major browsers, and having it run complicated programs was not new.

With the rise of a cross-compiler named Emscripten, the idea of taking rather-complicated programs written in other languages and putting them into Javascript was kind of new.

That all being the case, the idea of taking a by-then 20-year-old super-emulator called MAME, using Emscripten to cross-compile it into Javascript, and then running the resulting code in the browser at Internet Archive to make computers and consoles run, was very new.

It was also, objectively, madness.

Well over a thousand hours of work went into the project from a very wide range of volunteers who poured galactic amounts of time into making the project a reality. Along the way, changes were made to Emscripten, the Firefox, Internet Explorer, and Chrome Browsers, MAME, and the Internet Archive’s codebase to accommodate this dream.

It was announced in the Fall of 2013, well over a year after the project started.

Additional announcements came with each expansion of the types of software being emulated, and it became huge news, leading to millions of visitors coming to try it out.

By any measure, a quarter of a million items later, it has been a huge, huge success.

The rest of this blog entry is pretty pictures and beautiful links, but before we move on, it’s once again important to highlight people who provided major contributions, including Justin Kerk, Daniel Brooks, Vitorio Miliano, James Baicoianu, John Vilk, Tracey Jaquith, Jim Nelson, and Hank Bromley. Dozens more developers spent evenings, weekends, and months to make this system happen. Thank you to everyone involved.

The joy of watching a computer boot up in the browser was (and is) a miraculous feeling. And after that feeling, comes a quick comfort with the situation: Of course we can run computers inside our browsers. Of course we can make most anything we want run in these browser-based computers. What’s next?

Within a short time after our 2013 announcement, the archive was running hundreds, then thousands of individual programs, floppy disks and even cassette-based software from computing’s past.

As emulators besides MAME were added, it became necessary to create a framework for a versatile and understandable method to load emulators. This framework eventually got a name: THE EMULARITY.

In the decade of the Emularity’s existence, the Archive’s software emulation has expanded into directions nobody could have fully expected to work when the project started.

Here are some highlights:

HyperCard Stacks for the Apple Macintosh, a critical period in content creation and computer information architecture, have been restored to easy access, with thousands of stacks to try instantly.

Plastic Electronic Handheld Games, once a staple of toys in the 1970s through the 1990s, have been able to live once again, including the original housing that these simple (and not so simple) machines relied on instead of graphics.

As the uploads veered into the many thousands, it became more and more difficult for new adventurous users to figure out what, if any, software was at the archive to check out. This has led to specialized collections focused on one type of program, like the Computer Chess Club. People can use these collections as gateways to quickly testing the waters of now-decades of computer and software history, seeing the turns and twists of countless lost companies and individuals who squeezed every last bit of wonder and spectacle out of these underpowered boxes.

The Calculator Drawer took things to a new level when entire calculators could be emulated, including their unique looks, accompanied by a “drawer of manuals” to browse through if you had to learn (or re-learn) how to make these machines run.


The Woz-a-Day Collection, in many ways, represents the logical end for the role that the Internet Archive’s Emularity can provide for software history. The project is the effort of the software historian 4am, who has spent years on its maintenance. Methodically preserving Apple II software from the original floppy disks, incorporating every last bit and track of the disks with no modifications, and allowing the best fidelity of these programs as they originally were offered, 4am allows some of these programs to be playable for the first time in decades.

With each new batch of added emulated systems and machines has come a greater and greater pool of users, toying with historical software or playing long-forgotten or never-remembered games with a new level of convenience and willingness to try them out.

At this milestone of a decade into this experimental adventure, Internet Archive continues to grow its collection, to test and automate the functioning of both uploaded and self-maintained collections of software, and to provide a vast and necessary service in the preservation of historical software.

And, of course, we all get to enjoy some really great games.

Here’s to what another ten years will bring us!

CRASH! BARK! BOOM! The USC Sound Effects Library

For a simple overview of the collection being presented, read Craig Smith’s original blog entry over at the Freesound site.

While there are plenty of items at the Internet Archive that have no obvious home elsewhere online, there are also cases where we hold a copy of a frequently-available set of material, but we can provide it for much easier distribution and preview, including the ability to download the entire original set of files in one fell swoop.

Such it is with the USC SOUND EFFECTS LIBRARY, a collection of .WAV files taken from rapidly crumbling magnetic tape and presented for reference, enjoyment and even projects.


The world of sound effects is two-fold interesting:

There’s the interesting way we use recorded sound, cut together from various sources and even spliced from organic and generated sources, to provide the audio soundtrack for visual experiences in a way the audience thinks sounds “natural”.

And there’s the actual process of sound effects, of engineers going into the field or into a studio and generating sound after speculative sound, trying to find just the right combination of noise and speech to create just what they might need in the future.

For as long as there has been performance on the radio and in mediums beyond, the generating of sound effects, live and recorded, has been a fascinating skill, shared among many different people, and is rightly considered an awards-worthy occupation. While not everyone is fascinated by this sort of work, many people are, and there’s a childlike delight in going through a “sound library” of effects and noises, getting ideas of how they might be used later.

As explained in a blog entry written by Craig Smith, a variety of tapes called the “Red” and “Gold” libraries of recorded sound effects were joined by a third set from a sound company called Sunset Editorial, who worked on hundreds of films over the years.


This collection has now been mirrored at the Internet Archive.

In the USC Sound Effects Library are over 1,000 digitized tapes of sound effects, including not just the sounds themselves but the voices of many different engineers bracketing them with explanations, cajoling and call-outs while they’re being made. We hear not just a dog panting, but an engineer telling the dog that they’re doing a good job. Some recordings clearly have a crew sitting around while recordings are being made, and they hush with the sound of professionals knowing they can’t just edit the noise out if they talk over it.

There are machines: Planes, Cars and Weapons. There are explosions, fire and footsteps. There are effects just called SCIFI or MAGIC, where the shared culture of Hollywood’s take on what things “sounded like” makes itself known.

The pleasant stroll of “just playing” the effects in our browser-based player belies the fact that at one time, these were magnetic reels, sliced with razors and joined with tape, used to remix and reconstitute environments of sound for entertainment. The push to digital allows for much more experimentation and mixing without generational loss or the spending of huge amounts of precious time, but in these versions we can hear how much work went into the foundational soundscape of entertainment in the 20th century.


Craig Smith, who made this collection available, goes into great detail in his blog entry about how fragile these tapes had become before being transferred, and how some were lost along the way. Folks unfamiliar with “Sticky Shed Syndrome” and the process of “baking tapes” will be surprised to know how quickly and dramatically tapes can fall apart after a passage of time. With large efforts by a number of people, the amount that was saved is now available at the Archive.

There is extensive metadata in each item, captured as spreadsheets and documents about the assumed sources or credits of the sound. They’re important to bring along with these noises if a patron wants to maintain a local copy.

Speaking of which.

In this collection is a massive compilation of all the data related to the project. It’s located in an item called “Sound Effect Libraries (Red, Gold, Sunset Editorial)”. Patrons whose immediate urge is to grab their own private set of the data to keep “safe” will want to go to this item, either using the direct download of the three .ZIP files inside, or clicking on the TORRENT link to download the 20+ gigabytes of files. Depending on your bandwidth, it will take some time to download, but you can be assured that you got “all” the data from this amazing collection. This, in some ways, is the Internet Archive’s greatest strength – direct access to the original files for others to have, instead of adding a layer of processing and change as the presentation mediums of the day require modification for “ease”.
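
For patrons who would rather script the grab than click through it, here is a rough sketch in Python of how the direct-download links for an item can be put together. It leans on the Archive's usual URL patterns (archive.org/metadata/ followed by the item identifier for a JSON file listing, and archive.org/download/ followed by the identifier and a filename for the files themselves). The identifier "example-item" and the filenames in the sample listing are made-up placeholders, not the real names inside the sound effects item, so look those up on the item page before trying this for real.

```python
from urllib.parse import quote

ARCHIVE = "https://archive.org"


def metadata_url(identifier: str) -> str:
    """URL of the JSON metadata record (including the file list) for an item."""
    return f"{ARCHIVE}/metadata/{quote(identifier)}"


def download_url(identifier: str, filename: str) -> str:
    """Direct-download URL for one file in an item (spaces etc. are escaped)."""
    return f"{ARCHIVE}/download/{quote(identifier)}/{quote(filename)}"


def files_with_extension(metadata: dict, extension: str) -> list[str]:
    """Pick filenames ending in the given extension out of a metadata record."""
    wanted = extension.lower()
    return [
        entry["name"]
        for entry in metadata.get("files", [])
        if entry["name"].lower().endswith(wanted)
    ]


# A hand-written stand-in for what the metadata endpoint returns.
# Both the identifier and the filenames are hypothetical placeholders.
sample_identifier = "example-item"
sample_metadata = {
    "files": [
        {"name": "Red Library.zip"},
        {"name": "Gold Library.zip"},
        {"name": "notes.txt"},
    ]
}

zip_links = [
    download_url(sample_identifier, name)
    for name in files_with_extension(sample_metadata, ".zip")
]

for link in zip_links:
    print(link)
```

In practice you would fetch the JSON from metadata_url() first, pass the parsed result to files_with_extension(), and then hand the resulting links to whatever downloader you trust with 20+ gigabytes. For the whole set in one go, the TORRENT link on the item page is still the gentler route.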

Enjoy the universe of sounds in this collection!

And as one final note – if your immediate thought when you hear the term “sound effects” is to request or wonder about the legendary “Wilhelm”, we’ve got you covered: The recording session is right here.