
Patrons Speak Out: The Impact of Losing Access to More Than 500,000 Books

Earlier this week, we asked readers across social media to tell us the impact of losing access to more than 500,000 books removed from our library as a result of the publishers’ lawsuit.

The response was overwhelming, and the stories shared were powerful and heartfelt. It wasn’t just titles that disappeared—it was countless memories, research materials, and sources of inspiration for readers around the world. Below, we share some of the most impactful testimonials, highlighting the profound effect these removals have had on readers and researchers everywhere.

If you haven’t already done so, please share your story!


Tran D. A., Ha Tinh, Vietnam: It hampers my ability to look up data sources. Books in Vietnam are significantly less accessible and my economic background doesn’t allow me to afford these things.

R.F., Surrey, Canada: As a Wikipedia editor, the Internet Archive is one of the most useful tools to find citations and verify facts. By removing books from the Internet Archive, it hinders the ability to find sources for an open encyclopedia.

Meilan S., Washington, DC, USA: As the online history editor at a national magazine, I use the Internet Archive on an almost daily basis. It’s an invaluable tool for accessing books cited by my writers, conducting research for articles I’m writing, and fact-checking quotes and other information. I regularly link to the Internet Archive in our published content, as I believe we should be as transparent as possible regarding sourcing, in addition to offering readers links to sites where they can learn more about a given topic. It has been disheartening to find the majority of books I need to access for work now listed as “removed.” The removal of this content makes it more difficult for me to include diverse, in-depth and reliable sources in my writing and editing.

Tamia T., Montreal, Canada: Internet Archive gives me access to scholarly information that is not afforded to those outside of the post-secondary education system. The Internet Archive helps bridge the gap when it comes to literacy, comprehension of history, and the discovery of new works that are otherwise gate-kept from the average person.

Olga A., Moscow, Russia: I can’t proceed with my research on bioanthropology, regarding both the current state of this science and the history of this field. None of the books I’m looking for are available for purchase in my country, even if I, by some miracle, managed to find them in second-hand bookshops abroad and had great amounts of money to buy them.

Jason V. M., Tucson, AZ, USA: The Internet Archive has allowed me and my family to access books quickly, conveniently and safely. I’m afraid that without the Archive, access to teaching material for my daughter and studying material for myself has now become significantly limited at my income level and in my area.

Poppy, Indonesia: Most of the literature I’ve been using from IA is material I couldn’t find in my city’s libraries, either public or academic. Without IA, my academic progress would be halted.

Lyria V.W., Middle River, MD, USA: My school in the past wanted me to read books that were considered banned (like The Great Gatsby and To Kill A Mockingbird) to learn about the culture and history at the time. I did not always have physical access to these books.

Zachary C., PA, USA: Without archive.org’s availability, I would have not been able to further my education on historical architecture and fashion.

Samson W., Omaha, NE, USA: It has made it more difficult to find quotes, to read quotes in full context, and to research language.

Nathan W., Portland, OR, USA: I purchase dozens of books every year, and check out even more from my local library — Internet Archive is an invaluable resource to explore books I’m interested in and quickly search for remembered passages or quotes from books I have already read.

Jefferson C., Managua, Nicaragua: Internet Archive had everything I needed to go through college, whilst not having ANY library available in my home country and with college books costing hundreds of dollars on top of import fee and taxes (which alone could be the salary of a person here).

Marina K., Minneapolis, MN, USA: I am an award-winning artist and writer for video games. I often need to research many diverse topics as an independent artist without institutional backing or studio resources. The Internet Archive is a valuable resource that allows me to create work that interacts more deeply with the world.

Harry S., UK: I’m a student studying Ancient History and having 500,000 books removed will undoubtedly remove my access to some sources I can’t get my hands on otherwise.

Carlos R., Aguascalientes, Mexico: I was reading Story : substance, structure, style and the principles of screenwriting (1997) and I no longer have access to it.

Alicia P., MD, USA: I organize Wikipedia editing events to improve Wikipedia articles about historical topics. We rely heavily on Internet Archive books as sources, since they are publicly available. This is essential for transparency in Wikipedia articles: every factual claim has a footnote, and the reader can click the hyperlink in the footnote to go directly to the source of the information in an Internet Archive book (often an older academic book that is no longer in print or at public libraries anyway).

Renard, Osasco, Brazil: The Internet Archive allowed me to expand my boundaries and access materials that do NOT exist here, or would be incredibly expensive to import, much of the price going to shipping and a reseller’s pockets.

Ethan S., Ottawa, Canada: I have been working on a project to document the history of social democratic governments in Canadian provinces and territories. These governments (by the NDP and CCF) are not well researched and the resources that are available at public libraries don’t always include older books, often written by members of cabinet or caucus. The Internet Archive has had some of the relevant books removed due to the lawsuit.

Berry J., Boston, MA, USA: I understand that publishers and authors have to make a profit but most of the material I am trying to access is written by people who are dead and whose publishers have stopped printing the material.

Chloe, London, UK: Internet Archive allows me to search a large number of books by keyword/name and it triggered my buying a lot of hard copies of books I would have never even known existed. I am so distressed that this has been taken away from me, as I research the history of lesbianism and it is already an extremely difficult niche field to research.

Camila N., Mexico City, Mexico: Cultural heritage, including documentary heritage, is essential for forging identities, offering knowledge, telling human history and promoting the progress of societies accompanied by cultural development.

Mary S., Rochester, NY, USA: It’s an access issue. It’s substantially harder to find the books I’m interested in reading. Heck, even for more common books, the libraries in my area are not practical to get to except by car, and I have a lot of friends who don’t have easy access to a car.

Robin L., Sydney, Australia: Having decreased access to books such as books on collage artists during certain parts of history affects my research, since I have limited to no access to such books in Australian libraries or bookshops, either physically or digitally.

Samuel R., Chicago, IL, USA: In many cases there are not physical lending copies of titles I am looking for within 200 miles of my location, and no legal methods available to purchase e-versions. The Internet Archive is far and away the best solution for reading and preserving niche books across a variety of genres.

Zulma P., Covina, CA, USA: The Internet Archive has lots of books my local library doesn’t own and books that are very hard to find.

Thomas R., Manningham, Australia: These books being available on archive.org is a vital resource for me and many like me. A large amount of the Archive was never released in my corner of the globe, meaning I have few if any options for reading on niche subjects.

Juan V., Medellin, Colombia: I am a dance artist and require a big selection of options for my artistic research. Some of the books that I was using on my research are no longer available.

Sage L., Grand Rapids, MI, USA: I am an illustrator and character designer with a passion for science fiction. I use the Internet Archive to research projects that I don’t have enough background knowledge on. I frequently find that books I need are missing.

Oguz Alp K., Antalya, Turkey: In one word I can say: “devastation”. It is very difficult for people like me who live and do research in third world countries to access the books and documents in your archive.

Zachary B., Lockport, NY, USA: As someone who is working to understand the evolution of society through literature, reduced access to many classic works makes gathering information much more difficult.

Andrea T., Canada: I did not go to a university with a giant archive in the library for medieval texts, so to research these topics, free resources like Internet Archive really came into play. Not everyone will have an opportunity to read these books available at libraries. Not everyone can even afford to attend university, where many of these now removed texts are available for free in libraries and archives. Why should other students, and other people interested in these topics, be deprived of this free resource? Going into my Master’s degree, I have now lost a resource I relied on heavily through my post secondary education up until this point, hindering what sort of research I will be able to accomplish as I enter higher education.

Isa B., Lelystad, Netherlands: I was working on several papers for my education and I had to change sources because the literature was inaccessible despite it being of great importance to my research.

Mrittika D. S., Kolkata, India: Resources I had previously found on the Internet Archive site were all of a sudden no longer available when I searched for them. Hence, I faced a huge problem in completing my papers, as I had already formed a plan on what sources I wanted to refer to, and my plan was completely disrupted.

Schuyler V., Troy, NY, USA: While I am lucky to be near many physical libraries, none are as convenient and complete as the Internet Archive. Nearly all the books I’ve purchased in the last decade were ones I saw on the Internet Archive first.

Samantha F., Providence, RI, USA: Honestly? Without these books, my job becomes that much harder. Publishers aren’t going to put out a new run of, say, a 40-year-old book on specific aspects of animation history, because it’s not profitable. So, to remove them limits the number of folks like me, who are trying to tell a cohesive and factual story, who can actually work to do so as these materials get rarer and more expensive.

Kerry L., Boston, MA, USA: I had used copies of books a few months back when doing research for my master’s thesis—when I came back to them in April and May, I was surprised to find many of my more crucial secondary sources were gone. These books specifically are not as prevalent in public libraries, being older and region-specific. I was fortunate that I had taken detailed notes and quotes, but I was unable to check my references for books that were physically located miles and miles away from me.

Nicolas T., Paris, France: This gray zone of books still under copyright but that have disappeared from bookstores and libraries can be so useful… and the DRM on digital copies was very clever and fair.

Lola, Poland: On a personal level, this has severely limited the potential for both me and my partner to read books, we don’t have the money or ability to purchase actual books or E books and while there is a library near by, they usually don’t have the books we are looking for, it has in turn likely limited us from reading so many books.

John P., Menlo Park, CA, USA: In 2016 a fire in my home office left my personal library (about 700 books) smoke damaged, but still readable. Rather than let all these books go to waste, I donated them to the Internet Archive, so books in my collection they hadn’t already scanned would be available to the rest of the world. I had hoped I would be able to refer to the collection there. Unfortunately, many of these books are no longer available due to the lawsuit restrictions.

Andrew M., Easton, CT, USA: Prior to the removal of books on the IA I was able to access works on niche topics like La Terra in Piazza (1984) to review and promote reading about all sorts of interesting things to a wider audience. Since the removal, I’ve already struggled to finance a project translating a book on the causes of the fall of Rome, which would not have happened if I’d had access to materials that had been on IA at an earlier date.

Stephano L., Peru: The links I used for citations in university works are now dead, so I will have to correct that in many papers I wrote.


Editorial note: Statements have been edited for clarity.

Join Our Monthly Giving Circle and Support the Internet Archive!

Monthly donors sustain our work and ensure that the Internet Archive will always be free for all.

Our supporters have joined us arm-in-arm for decades against corporate interests, censorship, and digital erasure. Your commitment to preserving information and cultural heritage fuels our mission to provide Universal Access to All Knowledge. We want to celebrate you and thank you for being the foundation of our support system. 

Today, we invite you to join the Internet Archive Monthly Giving Circle.

Why Join?

Preservation and Accessibility are at the heart of who we are and what we do. When you begin a monthly donation at any level, you become a member of our Monthly Giving Circle and unlock exclusive benefits designed just for you, such as:

  • Exclusive Webcasts and Virtual Learning Opportunities: Dive deeper into topics you love with access to virtual events and learning sessions.  
  • Giving Circle Discount at Better World Books: Enjoy special savings on your used book purchases with our literacy partner, Better World Books.
  • Access to the Monthly Giving Circle Newsletter: Read our curated monthly newsletter filled with interesting finds from the Archive, surveys to get your input, and information about donor perks.
  • Submit Priority Questions at Book Talks: Have burning questions for featured authors or event presenters? You can submit priority questions for consideration, even if you cannot attend the event live.
  • Celebrations and Thank-You’s: We believe in recognizing and celebrating our supporters. Expect special thank-yous and celebrations at your donation anniversaries.
  • Powering Preservation: The knowledge that your recurring contributions sustain our efforts in digitizing, preserving, and providing access to millions of books, web pages, videos, and more.
  • Additional Petabyte Donor Perk ($250 USD+): We have something special just for our most generous supporters with a monthly contribution of $250 or more. Each month, receive a handpicked book from our staff delivered straight to your doorstep. Enjoy a recommended read directly from our bookshelves to yours!

Joining our Monthly Giving Circle is simple:

  • Head over to our Donate page
    • Ensure you have selected “monthly” as your preferred donation frequency.
  • Enter the amount you wish to contribute each month.
  • Input your payment information and click Donate.

Already a monthly donor? Thank you for being an integral part of our mission! You do not have to take any action to enjoy Monthly Giving Circle benefits. Email donations@archive.org to ensure you receive the Monthly Giving Circle newsletter, which shares upcoming events, the latest discount codes, and more.

Together, we can continue expanding access to knowledge for generations.

Gratefully, 
Tom Mayer
Development Coordinator

Have a question about our monthly giving circle? Reach us at donations@archive.org.

The Backrooms of the Internet Archive

Like many bits of Internet Culture, this simple image of an empty series of rooms represents a deeply repressed or recently remembered memory of a common Internet Legend, or it’s just a shot of nothing.

If the answer is that it’s a shot of nothing, let’s get you up to speed.

This image floated around message boards in the 2010s, posted with commentary or used as a general-purpose, slightly off-putting photograph of a less-than-well-maintained location, and was, by most standards, rather indistinct. The internet, after all, is filled with odd images and weird drawings that cause a reaction, often after many different attempts to achieve the effect. Survivorship Bias for memes, one might say. So if one more image of an indistinct indoor landscape was out there, not much was going to come of it.

That changed in 2019, when the image was given a legend and history, made up out of the air, that it was a rare photograph of The Backrooms. The phrasing of the original declaration speaks for itself:

“If you’re not careful and you noclip out of reality in the wrong areas, you’ll end up in the Backrooms, where it’s nothing but the stink of old moist carpet, the madness of mono-yellow, the endless background noise of fluorescent lights at maximum hum-buzz, and approximately six hundred million square miles of randomly segmented empty rooms to be trapped in
God save you if you hear something wandering around nearby, because it sure as hell has heard you”

If this writing strikes you as some sort of odd, rather dramatic addition to the image of a room, then you’re being introduced to creepypasta, or as some might call them, urban legends and campfire stories. It’s part of the overwhelming need for humans to tell tales that excite and frighten, to compose meaning or horror out of the darkness, and even the mundane.

The concept of the “Backrooms” also touches on a very frequent theme of many different horror and science-fiction movies – that there are service tunnels and hard-to-access areas woven throughout life, known only to a special few. Movies such as The Matrix, The Adjustment Bureau, Us, Beyond the Walls, Dark City, Cube, and many more have explored this theme – or used it as a jumping-off point to tell another story.

The difference, here, is nobody really knew where that very first image came from. For a very long time.

This extended period of not finding the original source of the image left an unfinished tune, a half-written poem, about where it came from and what it meant. And the lack of information in the image as it showed up on these image boards seemed to ensure the mystery would never be solved.

So people filled in the blanks.

A Subreddit called /r/backrooms, an extended web video series called Backrooms, and endless CGI models and creations meant to extend the legend and the origin story became years of effort by thousands to draw the missing pieces of a puzzle that was never a puzzle.

A constantly shifting set of games with titles based off The Backrooms were created and presented for a willing and happy audience; it’d be unfair to choose one or even a few to highlight – there are dozens.

All of them represent efforts to bring you into a state of heightened fear or paranoia as you lurk through a series of dark hallways, overlit carpeted spaces, and a growing dread. There’s no question there was a huge audience for this, and it is sometimes thought that this entire legend brought mainstream attention to liminal spaces, a perception of the in-between geographies of less unsettling locations. It is now enjoying life as an aesthetic movement.

Supporting this explosion of creativity and storytelling was the continued fact that nobody knew where the photograph came from. This situation, of a core image having a completely shadowy and unexplained origin, is arguably the foundation of its power.

That changed, recently.

This appears to be the origin of the Backrooms Photograph.

In March 2003, there was a former furniture store called Rohner’s Home Furnishings in Oshkosh, Wisconsin whose second floor was being renovated by the (somewhat) new tenants, HobbyTown.

Renovating the space from the sale of furniture to a new remote-controlled racing car track (among other aspects) meant pulling down partitions and ripping out carpet. This inspired taking photographs of the process, one of which, DSC001561.JPG, was the legendary “Back Rooms” image.

18 times in the last 20 years, crawlers affiliated with the Internet Archive moved through this page and grabbed portions of it, speculatively, to store for future research and reference. As the whole image was grabbed, reading the metadata of the original image reveals the date it was taken (June 12, 2002), and the camera used (a Sony Cyber-Shot model). The great unknown image, the unsettling photo of a mysterious place and time, was revealed.
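For readers curious how a list of captures like that can be examined, here is a minimal sketch in Python. It assumes capture records shaped like the JSON output of the Wayback Machine's CDX index (a header row followed by one row per capture, with 14-digit timestamps). The sample rows, the example.com address, and the helper names are hypothetical and exist only for illustration; they are not the actual records for the hobby store's page.

```python
from datetime import datetime

# Hypothetical sample rows, shaped like CDX JSON output: the first row is a
# header, each later row is one capture. None of this is real capture data.
SAMPLE_ROWS = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,example)/photos/room.jpg", "20030302101500",
     "http://example.com/photos/room.jpg", "image/jpeg", "200", "AAAA", "123456"],
    ["com,example)/photos/room.jpg", "20110718090000",
     "http://example.com/photos/room.jpg", "image/jpeg", "200", "AAAA", "123456"],
    ["com,example)/photos/room.jpg", "20200105120000",
     "http://example.com/photos/room.jpg", "text/html", "404", "BBBB", "512"],
]


def parse_captures(rows):
    """Turn header-plus-rows data into a list of dicts keyed by column name."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def capture_time(capture):
    """Parse the 14-digit capture timestamp (YYYYMMDDhhmmss) into a datetime."""
    return datetime.strptime(capture["timestamp"], "%Y%m%d%H%M%S")


def playback_url(capture):
    """Build the Wayback Machine playback address for one capture."""
    return f"https://web.archive.org/web/{capture['timestamp']}/{capture['original']}"


def successful_image_captures(rows):
    """Keep only captures that returned HTTP 200 with an image MIME type."""
    return [
        capture
        for capture in parse_captures(rows)
        if capture["statuscode"] == "200" and capture["mimetype"].startswith("image/")
    ]


if __name__ == "__main__":
    for capture in successful_image_captures(SAMPLE_ROWS):
        print(capture_time(capture).date(), playback_url(capture))
```

Filtering on status code and MIME type is one simple way to separate captures that saved the whole image from captures that only recorded an error page. Reading the camera metadata itself would be a separate step on the downloaded file and is not shown here.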

However the original, anonymous user stumbled onto this photograph, it appears either that it was taken directly from the Wayback Machine, or that the Wayback Machine crawled the same site the user had found and has preserved that webpage for over 20 years.

Emerging, Blinking, Into the Light

Naturally, as news of the Backrooms being “found” travels throughout the world, responses have ranged widely.

For some, this is proof that “with enough eyeballs, all problems are shallow”. While we might argue about the relative worth of a given effort, the fact that it is possible for word to travel about a mystery to the point of being solved means that the world is a hair less intimidating and scary. Our shared efforts and cooperation can find the answer to a seemingly impossible-to-answer question. The fact that an image with basically no information and a blurry set of components could be tracked down and revealed is a miracle.

For others, the mystery being solved removes a little bit of magic and wonder from the world. It says that there’s no kayfabe, no holding of mystery in our hands without peeking further to tear out the secret. In this perspective, something special has been lost.

But there’s another lesson as well.

The Internet Archive’s crawlers moved through the pages of a hobby store multiple times over the years, capturing HTML, photographs, and time-stamping the process, with the equivalent care of an at-risk website, a politician on the national stage, or a legendary and obvious moment in history provided via a PDF file.

This agnostic, wide-ranging crawl likely represented both the original source of the image, and a persistent, dependable URL to reference back to it, as thousands are doing at this very moment.

This is the mission of the Wayback Machine – be the dependable, accessible connection to web history, and therefore all history. Give the Internet its Memory, which would otherwise be lost.

If you mourn the loss of legend and mystery in our quest to keep the truth transparent, available and persistent, don’t worry – the process of internalizing and analyzing the image to give the Backrooms history its full and complete story has already begun.

Here’s to the next mystery, and the next unsettling information being brought into the light and presented for the education, research and entertainment of the Internet, courtesy of the Wayback Machine.

Book Talk: Attack from Within by Barbara McQuade

Join us for a VIRTUAL book talk with legal scholar BARBARA McQUADE on her New York Times bestseller, ATTACK FROM WITHIN, about disinformation’s impact on democracy. NYU professor and author CHARLTON McILWAIN will facilitate our discussion.

REGISTER NOW

“A comprehensive guide to the dynamics of disinformation and a necessary call to the ethical commitment to truth that all democracies require.”

Timothy Snyder, author of the New York Times bestseller On Tyranny

American society is more polarized than ever before. We are strategically being pushed apart by disinformation—the deliberate spreading of lies disguised as truth—and it comes at us from all sides: opportunists on the far right, Russian misinformed social media influencers, among others. It’s endangering our democracy and causing havoc in our electoral system, schools, hospitals, workplaces, and in our Capitol. Advances in technology including rapid developments in artificial intelligence threaten to make the problems even worse by amplifying false claims and manufacturing credibility.

In Attack from Within, legal scholar and analyst Barbara McQuade shows us how to identify the ways disinformation is seeping into all facets of our society and how we can fight against it. The book includes:

  • The authoritarian playbook: a brief history of disinformation from Mussolini and Hitler to Bolsonaro and Trump, chronicling the ways in which authoritarians have used disinformation to seize and retain power.
  • Disinformation tactics (like demonizing the other, seducing with nostalgia, silencing critics, muzzling the media, condemning the courts, and stoking violence) and why they work.
  • An explanation of why America is particularly vulnerable to disinformation and how it exploits our First Amendment Freedoms, sparks threats and violence, and destabilizes social structures.
  • Real, accessible solutions for countering disinformation and maintaining the rule of law such as making domestic terrorism a federal crime, increasing media literacy in schools, criminalizing doxxing, and much more.

Disinformation is designed to evoke a strong emotional response to push us toward more extreme views, unable to find common ground with others. The false claims that led to the breathtaking attack on our Capitol in 2021 may have been only a dress rehearsal. Attack from Within shows us how to prevent it from happening again, thus preserving our country’s hard-won democracy.

ABOUT OUR SPEAKERS

BARBARA McQUADE is a professor at the University of Michigan Law School, where she teaches criminal law and national security law. She is also a legal analyst for NBC News and MSNBC. From 2010 to 2017, McQuade served as the U.S. Attorney for the Eastern District of Michigan. She was appointed by President Barack Obama, and was the first woman to serve in her position. McQuade also served as vice chair of the Attorney General’s Advisory Committee and co-chaired its Terrorism and National Security Subcommittee.

Before her appointment as U.S. Attorney, McQuade served as an Assistant U.S. Attorney in Detroit for 12 years, including service as Deputy Chief of the National Security Unit. In that role, she prosecuted cases involving terrorism financing, foreign agents, threats, and export violations. McQuade serves on a number of non-profit boards, and served on the Biden-Harris Transition Team in 2020-2021. She has been recognized by The Detroit News with the Michiganian of the Year Award, the Detroit Free Press with the Neal Shine Award for Exemplary Regional Leadership, Crain’s Detroit Business as a Newsmaker of the Year and one of Detroit’s Most Influential Women, and the Detroit Branch NAACP and Arab American Civil Rights League with their Tribute to Justice Award. McQuade is a graduate of the University of Michigan and its law school. She and her husband live in Ann Arbor, Michigan, and have four children.

CHARLTON McILWAIN
Author of the recent book, Black Software: The Internet & Racial Justice, From the Afronet to Black Lives Matter, Dr. Charlton McIlwain is Vice Provost for Faculty Development, Pathways & Public Interest Technology at New York University, where he is also Professor of Media, Culture, and Communication at NYU Steinhardt. He works at the intersections of computing technology, race, inequality, and racial justice activism. He has served as an expert witness in landmark U.S. Federal Court cases on reverse redlining/racial targeting in mortgage lending and recently testified before the U.S. House Committee on Financial Services about the impacts of automation and artificial intelligence on the financial services sector. He is the author of the recent PolicyLink report Algorithmic Discrimination: A Framework and Approach to Auditing & Measuring the Impact of Race-Targeted Digital Advertising. He writes regularly for outlets such as The Guardian, Slate’s Future Tense, and MIT Technology Review about the intersection of race and technology. McIlwain is the founder of the Center for Critical Race & Digital Studies, and is Board President at Data & Society Research Institute. He leads NYU’s Alliance for Public Interest Technology, is NYU’s Designee to the Public Interest Technology University Network, and serves on the executive committee as co-chair of the ethics panel for the International Panel on the Information Environment.

Book Talk: Attack from Within by Barbara McQuade
Thursday, June 6 @ 10am PT / 1pm ET
Register now for the virtual event!

DWeb Fellows: Where Are They Now? (Part 1)

Guest blog by ngọc triệu from the DWeb Camp Core Organizing team. 

Since the program kicked off in 2019, the DWeb Fellowship has welcomed 62 fellows from more than 20 countries across six continents, spanning North America, South America, Asia, Europe, Africa, and Oceania.

Recently, I had the opportunity to reconnect with some of the DWeb Fellows from previous cohorts. We caught up on how we’ve been since our last encounter, delved into our current projects, and reminisced about our shared experiences at DWeb Camp.

In this post, let’s join Stacco (Fellow 2023), Remy (Fellow 2022), and me in our conversations below!

*Please note that the conversations have been edited for length and clarity.

____

Q1 ngọc: Thank you for taking the time to chat with me today! Can you start by introducing yourself and sharing what you’re working on right now?

Stacco: Hi, yes! I’m Stacco from DisCO.coop, which is the project I represented at DWeb Camp 2023. DisCO is a cooperative intersection of feminist and anticapitalist responses to a lot of things like DAOs, what we want to do in the workplace, and how we want to spend our time productively. And it’s also a critical approach to designing technology. DisCO was founded in 2018, but it came out of the experience of a cooperative that we founded on distributed principles, called Guerrilla Media Collective. And a lot of the stuff that we prototype in Guerrilla Media Collective with economics and governance have led into DisCO.

Remy: Hey there, I’m Remy, from the 2022 cohort. I’m currently working at the Open Technology Fund as a programme specialist. I’ve worked on an array of projects, but we get a series of applications that focus on circumventing Internet censorship in authoritarian contexts, whether that be research projects, community convenings, and so on.

Q2 ngọc: What’s one thing from DWeb Camp that you’ve taken with you into your current work?

Remy: When reflecting on my experience at DWeb Camp, I find two significant takeaways:

The first one being that, at that time, I was coming from a very academic space. So, most of the work that I was doing was really focused and consolidated within academia, which was a fairly small realm of people. It was mainly archivists that I was working with, so we had kind of a narrow lane and scope. 

However, upon engaging with the broader web community, I was exposed to a diverse array of individuals working on directly related projects, each with intersecting identities and roles. I remember meeting a speculative fiction author, and I was like: “Wow, this is kind of an interesting addition to this cohort of people that I don’t think I would have naturally included in a conference!”. I also got to unlearn what a conference is and looks like, you know, I’m going to show up with my little briefcase and give my presentation, because that’s what academia looks like. And then coming in, it was a much more kind of relaxed vibe and open conversation with an array of different people. So I thought that was really interesting and opened my eyes like, wow, we do need to include more people in these conferences that we’re at, because designers play just as big a role as researchers and developers. 

And then I would say, the second thing that I learned was really what it takes from the ground up to develop a mesh network. I always kind of come back and think about that — all the love and time that it takes, and the patience to care for these systems. It really got me on a whole journey about thinking of systems of care, and what those look like in technical spaces.

Stacco: Following DWeb Camp 2023, I invited brandon (Fellow 2022, 2023) and mai (DWeb Fellowship Director 2019-2023) to Spain for a meeting called “DisCO Remastered”, which mai covered in an article. From this experience, we developed two prototypes, including one called “community supported digital commons,” inspired by the principles of community-supported agriculture. We have people who are more conscious about the food they eat and where it comes from. So how about we have that type of consciousness for the digital tools that mediate our daily lives? Having community funding and accessibility for digital commons is very important to ensure fair compensation for labor and improve accessibility to technology. 

Additionally, collaborating with brandon, we aim to explore cooperative alternatives to platforms like Spotify, but going much further. What if the musicians could develop their own technology with torrents? What if they could take full control of their work and earnings? 

Also for me, I really love the diverse age ranges, genders, sexual orientations, and provenances of DWeb Camp, especially among the Fellows. The Fellows were a super varied group and it was really fascinating to engage with people whose experiences differed from mine. I’m like, “Oh, your background is totally different from mine, let me find out about it!” There was like this commonly held space, and that really inspired me. When I was writing the introduction for our newly released website, I was actually thinking of the Fellows!

Q3 ngọc: We’re gonna get a little bit retrospective here, what motivated you to apply (or reapply) for the Fellowship? Did the program meet your expectations and were you able to accomplish what you set out to do as a Fellow? 

Stacco: Yeah, absolutely. So the first time I couldn’t go. The second time, I applied again because I wanted to get a taste of what the decentralized community is like. More than the projects, I wanted to see what the humans behind them are like. There were a lot of contradictions which I also saw at Camp that were very interesting: There were projects which I had no interest in whatsoever, and there were other projects that I found really interesting. There’s also humans that I wanted to meet. I had been collaborating with brandon from Resonate Coop for four years and it was a great chance to meet him in person. It really was maybe like the best week I had last year. I was really, really happy. And I was really happy because of the human connections. 

With brandon king, I did a presentation that was quite successful. It was very great because we spoke about technology in a critical way and we mixed it with music, the audio, and the video. Then we left all the devices behind and we walked into the forest. That was really special. Some of the human connections that were fostered have carried on. That’s the quality time that you can only get, especially post-pandemic, by sharing a physical space. 

We were also really privileged. If you think about it, at least for the Fellows, for a week, we didn’t have to think about money or anything. We ate, we slept, we walked, we rested, we played guitar, and we danced. And that took money to do. That money didn’t come from Mars, and it is like a pittance compared to some of the budgets that are being handled. So it makes you think about the distribution of value: what would life be like if it was more like DWeb all over?

Remy: I remember, I found the Fellowship through a mutual colleague who worked at the Internet Archive. And at the time, I was really interested in the Internet Archive because I was working at a small human rights organization. We were using the Internet Archive all the time and I thought it was a really cool project. I was interested in finding out who these people that run it are and what does it look like?

And then the Fellowship popped up. At that time, I had been inhabiting a tiny little bubble that no one else really understood: I was a master’s student caught between an archivist school and public policy and people were kind of looking at me cross-eyed for talking about distributed archives or decentralised archives. So when I found out about the Fellowship, I was like, wow, here’s a group that I really like and admire, and they are talking about the same thing I’ve been talking about. That’s kind of what motivated me: maybe I can learn from a lot of these people who are probably much more developed in the work than I am, and I can share this small use case that I’ve been doing and working on.

When I read the blurb about DWeb Camp, I was like, it’s a group of people going to the woods and talking about tech. I thought it didn’t even seem real. I was wondering, like, is this real? I didn’t have much of an expectation other than a feeling that I was going to meet really interesting people that are really smart and working on interesting projects. And then I was pleasantly surprised by how many projects I had been aware of. There were projects I’d written about in my papers as things to look at, and then I was able to meet the people behind them at DWeb Camp naturally.

I mean, you’re just chatting, and then you were like: “Your project sounds really familiar. What’s it called?” And then you were like, “Whoa, that’s crazy. I was writing about your project!” I was just shocked that I was naturally coming across those people in the space, it felt like a very surreal moment. I got to meet Mark, who’s the director of the Wayback Machine. And subsequently, I’ve seen him so many times at other conferences that we’ve been to. And it’s always like, I just get so excited and happy and like, want to give him a big hug. It takes me back to that special time that we all spent together. 

Q4 ngọc: What’s one piece of advice or recommendation you’d like to share with the future cohort?

Remy: Well, that’s a good question! I’d say, be confident in your ability and skills that you’re bringing and know that it’s a space of people that want to collaborate and work with you. It can be incredibly intimidating, walking into a space where you don’t know anyone and sometimes it feels like maybe there’s pre-existing communities of people that already know each other, but have the confidence to just walk up and start talking to them and know that it’s a very open community and everyone is really welcoming.

It just sometimes takes the courage within you to make that first step forward and just walk into a circle of people and say: “Hi, this is who I am.” I know it’s always easier said than done, but I have found that that is when the most natural conversations happen. And you know, be kind to yourself. A lot of these conferences can feel like a marathon sometimes and it can feel like you’re missing out on this or that, but the experience is always there so if you’re feeling a little bit overwhelmed or burnt out, just step out. Some of my favourite moments from Camp were sitting with the Fellows and making buttons and just giving ourselves a second to breathe outside of everything else that was going on.

So be confident, be courageous, and be kind to yourself when you’re there experiencing it. Another thing that I found really helpful was journalling. I journalled two or three times a day to help remember how I was feeling and what I was doing. And that was a really interesting experience to look back and read on. Those would be my tips for people going to DWeb.

Stacco: I’d advise people to not go crazy and try to join every talk or session at Camp. Just be where you are and you’ll find interesting people to talk to and interesting projects to collaborate on. While you’re there, make yourself known. In addition, don’t be shy and don’t be afraid to challenge people in a friendly way. The most special part is the Build Days, when you’re setting up camp and getting to know people. Don’t miss them, that’s my recommendation!

___

Thanks to Remy and Stacco for joining the conversation and sharing their experiences as DWeb Fellows.

We’re currently at the final stage of reviewing all 2024 Fellowship applications. Stay tuned to meet our new cohort in June! 

End of Term Web Archive – Preserving the Transition of a Nation

It’s that time again. The 2024 End of Term crawl has officially begun! The End of Term Web Archive #EOTArchive hosts an initiative named the End of Term crawl to archive U.S. government websites in the .gov and .mil web domains — as well as those harder-to-find government websites hosted on .org, .edu, and other top level domains (TLDs) — as one administrative term ends and a new term begins. 

End of Term crawls have been completed for term transitions in 2004, 2008, 2012, 2016, and 2020. The results of these efforts are preserved in the End of Term Web Archive. In total, over 500 terabytes of government websites and data have been archived through the End of Term Web Archive efforts. These archives can be searched full-text via the Internet Archive’s collections search and also downloaded as bulk data for machine-assisted analysis.

The purpose of the End of Term Web Archive is to preserve a record of government websites for historical and research purposes. It is important to capture these websites because they can provide a snapshot of government messaging before and after the transition of terms. The End of Term Web Archive preserves information that may no longer be available on the live web for open access.

The End of Term Archive is a collaborative effort by the Internet Archive along with the University of North Texas (UNT), Stanford University, Library of Congress (LC), U.S. Government Publishing Office (GPO), and National Archives and Records Administration (NARA). Past partners include the University of California’s California Digital Library (CDL), George Washington University, and the Environmental Data and Governance Initiative (EDGI).

Four images of Whitehouse.gov captured between 2008 and 2020
Whitehouse.gov captures from: 2008 Sept. 15; 2013 Mar. 21; 2017 Feb. 3; and 2021 Feb. 25

We are committed to preserving a record of U.S. government websites. But we need your help to complete the 2024 End of Term crawl. 

How can you help?! 

We have a list of top level domains from the General Services Administration (GSA) and from previous End of Term crawls. But we need volunteers to help us out. We are currently accepting nominations for websites to be included in the 2024 End of Term Web Archive.

Submit a URL nomination by going to digital2.library.unt.edu/nomination/eth2024/.
We encourage you to nominate any and all U.S. federal government websites that you want to make sure get captured. Nominating URLs deep within .gov/.mil websites helps to make our web crawls as thorough and complete as possible.

Individuals and institutions nominating seed URLs are recognized on the individual contributors leaderboard and the institutions leaderboard!

Explore the End of Term Web Archive with full text search and download the data!

Taking the Words Out of My Mouth (With AI)

Everyone has a different idea of what they’d do with a time machine. Mine’s pretty simple: Head back to 2012, find myself working on a side project to film a documentary, and grab my younger self by the lapels and shout, “A 1099 IS NOT A REIMBURSEMENT! GET AN ACCOUNTANT!” before whatever energy sending me into the past gave out.

That simple mistake on my part had truly stunning financial consequences. When the tax bills and penalties started hitting my mailbox around 2014, it became a mass of stress. The IRS is undefeated in the award for Most Intense Collection Letters, and they were coming on a regular basis, even as I started directing more and more of my paycheck towards paying the debt.

A few friends generously sent me money to help beat back the interest and costs, but the combination of this and other debts had me backed into a corner, so I decided on a simple plan: Run a Patreon campaign where I told stories and opinions in short episodes, which were then supported by the audience, with all the proceeds going into debt repayment. The resulting podcast, Jason Scott Talks His Way Out of It, helped dig me out of that hole.

But it had an interesting side effect – coming up with new topics of discussion and consideration, once a week, meant that I had to mine deeper and deeper into my own outlook and relevant stories. After the first fifty episodes, I turned further inward. After one hundred, it became equal parts emotional and technical. And as I’m heading towards three hundred episodes, I’m surprised I have anything left to say at all. But I apparently do, and having now been doing this podcast weekly for six years, its episodes span a truly panoramic spectrum of topics.

(This is what a standing recording booth looks like – putting your head in an Audio Mailbox to maintain quiet background noise while speaking. It gets very hot in there.)

The Patreon gave patrons early access to the episodes, but the episodes are all open and uploaded some months later to the general world, including this collection at Internet Archive. Download, listen, remix, whatever you’d like – you have my complete permission and blessing.

Episodes have been uploaded to the Internet Archive since 2019, but in 2022, an opportunity to try out new technology came up – the Whisper project, open sourced and instantly downloadable, could be implemented for transcription, either as part of video or just a basic audio file. And thanks to the podcast, I had many audio files, and began experimenting with using Whisper against them.

Speech recognition, the process of turning spoken words in a microphone or pre-recorded audio files into written words or issued commands, has been around a very long time – decades and decades. The Internet Archive is excellent for doing a dive into historical citations; a fast “text contents” search found these points of discussion in a 1979 issue of the Silicon Gulch Gazette newsletter:

What has changed is the combination of much faster computers, much more analysis of speech, and advances in cross-referencing the resulting training to make chips and, in this case, a program that is using other disciplines within computer science to pattern-match audio, to the point of adding capitalization and punctuation from the implications in the words. Turning this against my growing collection of podcasts, it wasn’t long before I’d say what has continued to be a theme: when it works, it’s shockingly good, and when it doesn’t, it’s shockingly bad.

As an experiment and exploration, it was very useful to let the program run, shoot out a block of text, and generate the resulting timing blocks for the purposes of subtitles or transcription:

…but it would have been foolhardy to 100% walk away and let it do transcription without a second human-driven scan through the results to find mistakes. I’ve been that human, and I’ve seen things.

I’ve seen the resulting transcriptions do great jobs with proper name capitalization, odd and challenging punctuation, and paragraph breaks. I’ve also seen it knock itself silly on my New York accent and non-obscure phrasing, and definitely make a poor guess on my made-up word “Cowicature”. The algorithm works great, except when it doesn’t.
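For anyone who wants to tinker along, here is a minimal sketch of what turning transcription output into subtitle timing blocks might look like. It assumes the transcriber hands back a list of segments, each with a start time and end time in seconds plus the spoken text; that shape, and the function names srt_timestamp and to_srt, are illustrative assumptions rather than a description of the exact pipeline used for the podcast. The output follows the common SubRip (.srt) layout.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, total_ms = divmod(total_ms, 3_600_000)
    minutes, total_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(total_ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Turn a list of {'start', 'end', 'text'} segments into SRT blocks.

    The segment shape is an assumption made for this sketch.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hypothetical segments, standing in for real transcription output.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 6.0, "text": " This week: compilers and summer camp."},
]
print(to_srt(example_segments))
```

A human pass over the resulting file is still the step that catches the shockingly bad parts; the code only handles the formatting.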

And here we get to a turn of phrase I’ve come to adopt, which is an alternate term for AI: “Algorithmic Intensity”. The human need to give life and will to machinery is a very long-lived one; but most who look at the code behind this mechanism would agree – it’s just code. The only difference is that the amount of computing power and data to derive the outcome dwarfs numbers considered unattainable a decade or two ago.

And the speed can’t be beat – 5 years of weekly podcasts took roughly 4 hours to completely transcribe by Whisper, and that amount of time was simply because it was set as a lower-priority action in the queueing system. Working out the total running time of all the uploaded episodes of my podcast sitting in that collection is a little involved, but my back-of-napkin estimate is that it’s two and a half solid days of me talking about technology and emotion, with subjects ranging from programming and compilers to summer camp and family. I can assure you – I was not going to transcribe these podcasts anytime soon, and I was not going to ask someone to do it. While I’m proud of my work, I’m in no position to be able to record four podcasts in a month and create paragraphs of text from them.

Or, for that matter, descriptive summaries.

Which brings me to the latest, continual tinkering with the tools and environment available to analyze materials with algorithms. I began asking a large language model to look at the generated transcriptions and create a summary of a given podcast episode.

Two-plus years into generalized algorithmic intensity access, it’s still very much a lumpy and oddly spell-casting endeavor. Instead of asking it to “summarize this transcription”, my request reads like a headmaster at a school or a Dungeons and Dragons game text:

You read transcripts of a podcast and carefully write out descriptions, in the form of narrative paragraphs, to accurately describe the content of the podcast. Longer and more complete descriptions are better, and encouraged. You describe the main subjects, conclusions by the participants, and provide helpful context for the subjects. The podcast you listen to is:

…followed by the transcription of the podcast, time-codes and all.
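As a rough illustration of that request, a sketch of gluing the instructions and a transcript into one block of text might look like the following. The function name build_summary_request and the sample transcript are hypothetical, and how the resulting text is actually sent to a language model is left out, since that depends on whichever tool is being used.

```python
# The instruction text quoted above, kept as a single constant.
INSTRUCTIONS = (
    "You read transcripts of a podcast and carefully write out descriptions, "
    "in the form of narrative paragraphs, to accurately describe the content "
    "of the podcast. Longer and more complete descriptions are better, and "
    "encouraged. You describe the main subjects, conclusions by the "
    "participants, and provide helpful context for the subjects. "
    "The podcast you listen to is:"
)


def build_summary_request(transcript: str) -> str:
    """Return the instructions followed by the transcript, time-codes and all.

    Hypothetical helper: it only assembles text and calls no model.
    """
    return f"{INSTRUCTIONS}\n\n{transcript.strip()}\n"


# Hypothetical transcript snippet with time-codes, for illustration only.
sample_transcript = """
[00:00.000 --> 00:04.000] Hello and welcome to the show.
[00:04.000 --> 00:09.500] Today I want to talk about tape recorders.
"""
print(build_summary_request(sample_transcript))
```

Keeping the instructions in one constant makes it easy to adjust the wording and rerun the same transcripts to compare results.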

A matrix of calculation, fast beyond my reckoning but no less mysterious-and-not-mysterious than CPUs and networking itself, begins analyzing the language in the transcriptions, cross-connecting ideas mentioned, occasionally volunteering more information based on matches to terms, and within a few minutes, language comes out.

…in this language, it’s easy to find places where there’s an algorithm in the mix, a machine putting out syllables and phrases based on what I said. Like a kaleidoscope or a magnifying glass, there are occasional strange distortions and what approaches funhouse-mirror reflections of what I put in.

And oddly enough, it brings up memories.

It reminds me of being in my first neighborhood and all the kids circled around a fun and weird toy, a tape recorder, where you pressed two buttons and the … cassette, it was called? Would turn slowly and you would shout whatever came to the top of your head, press STOP, and then rewind and hear your own voice. It was distorted and weird, but it was my own voice, and I’d not heard it before from outside my own head. The world shifted, a little bit.

It reminds me of being in a computer lab at MIT in my twenties, playing games on a lunch break during a temp job I had on campus, and finding out that if I went to this obscure IP number and downloaded this very large binary file and ran it, this gargantuan program would pop up and begin asking me what “address” to go to, and if I knew some of the addresses, they would turn into full-color, fully-laid out, beautiful print-like presentations along a grey background. And, to my utter disbelief, looking at a menu option called “view source” would let me see the bizarre “code” behind these lovely looking pages. It was called Hypertext Mark-Up Language, HTML, and messing with that odd and ever-changing set of written directives inside textfiles would be a significant part of my next quarter century.

There have been all these times that computers (on my desk or lab or out in some nebulous “net” that girdled the Earth) provided me with a new set of buttons and knobs to press. To play around. To see what hums and clicks, and to do my best to understand what is happening underneath. In many ways, these computers have done very well by me; and they have also brought stress, sleepless nights and confusion; even anger.

Put another way, these newest concepts and tools, these Intense Algorithms, sweeping the news and the discussion around virtual watercoolers, are the latest in a long line of topics about what computers will bring to human lives. It’s a deep and varied discussion and it deserves consideration and time, and respect of the opinions of others.

And while (hopefully) the world processes this newest use of computers, we’ll see where, if anywhere, they’ll help with the mass of information and digital materials that sites like the Internet Archive provide.

Asking an LLM to look at the full text of a nearly 200-page book about the raising of ducks, such as the legendary DUCKS; AND HOW TO MAKE THEM PAY (1924), provides a paradise of questions you can ask the book as if it was able to answer in full sentences:

Q: How, exactly, do you make ducks pay?

A: The book “Ducks: and how to make them pay” by William Cook outlines various methods for making duck farming profitable. It covers the selection of suitable duck breeds, breeding techniques, and the proper management of feeding and housing. Additionally, it provides insights into marketing strategies for selling ducks and their eggs. The goal is to guide readers on how to efficiently manage duck farms to maximize income from their operations.

Q: What is, in your opinion, the best duck?

A: Choosing the “best” duck depends on what you’re looking for! For farming purposes, the Pekin duck is often considered ideal because it grows quickly and has a high yield of meat. If egg production is your priority, the Khaki Campbell duck is highly recommended, as it is one of the most prolific egg layers. For those interested in ornamental ducks, breeds like the Mandarin or Wood Ducks are known for their striking appearances. Each breed has its own advantages depending on your goals and the specific requirements of your duck farming or hobbyist endeavors.

(…as an aside, the book has one of the best bookplates inside the front cover, one which only a human being would truly appreciate.)

And that’s how I see the near future with this newest use of computers – an implementation of these tools to make materials easier to find, to summarize and help point researchers and students, and to allow new ways to work with a staggering amount of information in the online stacks. Hiding from it won’t be a solution; but asking hard questions of it might be.

Meanwhile, six years of my spoken-word memories of the ways computers have affected my life are sitting in a collection, open to all and welcome for anyone to use to tinker with. Have a great time.

A Happy Ending For Some Manual Labor (And a Call for Support)

A great saga of rescue and preservation is coming towards its end, and there’s a chance to bask in the victory, and help push towards its conclusion.

I got word in 2015 of a collection of manuals inside a business that was getting out of the manuals business, and while a lot of well-meaning people talked a good game, they wanted to cherry-pick (people getting rid of stuff hate cherry-pickers), and I drove down to show I was serious, and after a week of work with MANY volunteers and contributors, we ended up with pallets of documentation inside boxes, tens of thousands of unique manuals, many nowhere else.

Then they were stored in a storage unit. Then they were stored in a closed coffee house. Then they were transported to Internet Archive’s Physical Archive. Then they were stored until last year, 2023.

Last year, a group called DLARC, doing digitizing and indexing projects around ham radio and radio technology, worked with me and the archive to sort four pallets of the manuals for products related to the history of radio/network technology, and off they went overseas to be scanned. And as of this month, the evaluated, professionally-scanned and available-to-the-world manuals are finished, except for a few stragglers. The loop has closed!

You can browse the collection of thousands of scanned manuals here:

The Manuals Plus Collection

And now, the pitch.

The company doing the digitizing does lots of digitizing for the Internet Archive. They are well-paid and legitimate professional contractors who are sent the items, and who do careful scanning to the best of the materials’ ability to provide access to the information, and then do quality checks, and then upload them. When they’re humming, they’re processing a pallet every couple of weeks (with lots of mitigating factors).

I’ve negotiated a situation where, if money is sent in, the remaining pallets that should be scanned can just be sent along without sorting them for DLARC funds, DLARC will fund any that happen to overlap with their mission, and the rest will just be done.

That’s if money is sent in.

How much money? The number approaches hundreds of thousands of dollars. So I’m looking for both big-ticket supporters (who can mail me at jscott@archive.org) and individuals.

Here is a specialized donation link: https://archive.org/donate/?origin=jssm-dlarc

If we make less than we need to scan them all, then we’ll only scan up to where it’s paid for. I believe we can close it out, but if the interest/money isn’t there, then it isn’t there – fair enough. Browse the collection as it grows into thousands of manuals as it is and consider if you want to be part of all that. That’s definitely happened.

But what a happy ending it would be to push all these manuals through the process, and close it up. That’s why I’m popping up to talk about it, and why I hope you would consider contributing towards it, for a non-profit that deserves your support generally.

Internet Archive Stands Firm on Library Digital Rights in Final Brief of Hachette v. Internet Archive Lawsuit

Today, the Internet Archive has taken a decisive final step in our ongoing battle for libraries’ digital rights by submitting the final appellate reply brief [PDF] in Hachette v. Internet Archive, the publishers’ lawsuit against our library. This move reaffirms Internet Archive’s unwavering commitment to fulfilling our mission of providing universal access to all knowledge, even in the face of steep legal challenges.

READ THE FINAL APPELLATE REPLY BRIEF

Statement from Brewster Kahle, founder and digital librarian of the Internet Archive:
“Resolving this should be easy—just sell ebooks to libraries so we can own, preserve and lend them to one person at a time. This is a battle for the soul of libraries in the digital age.”

This process has taken nearly four years to work through the legal system, and in that time we’ve often fielded the question, “Why should I care about this lawsuit?” By restricting libraries’ ability to lend the books they own digitally, the publishers’ license-only business model and litigation strategies perpetuate inequality in access to knowledge.

Throughout this legal battle, Internet Archive has remained steadfast in our mission to defend the core values of libraries—preservation, access, and education. This fight is not just about protecting the Internet Archive’s digital lending program; it’s about standing up for the digital rights of all libraries and ensuring that future generations have equal access to the wealth of knowledge contained within them.

Aruba’s Bold Support of Library Digital Rights, by Brewster Kahle

Aruba’s Prime Minister, Evelyn Wever-Croes: “Give them the opportunity to search for the truth.”

Last week Aruba launched the island nation’s digital heritage portal online: Coleccion Aruba. As trumpeted in Wired: “The Internet Archive Just Backed Up an Entire Caribbean Island,” but really the credit goes to Aruba. Digitizing their national cultural heritage (100k items) and putting it online for free public access is a huge achievement.

I met with the Prime Minister (pictured above), the Minister of Culture, and the Minister of Education who backed the efforts made by the National Librarian, National Archivist, and their digital strategist. Never have I seen such unified support for cultural preservation and access. They brought together people from the Dutch islands and the Internet Archive to share the news and to inspire and to lead.

Aruba was the first to sign onto the Four Digital Rights of Memory Institutions: right to Collect, Preserve, provide Access, and interlibrary Collaboration. These are bad times when we have to reclaim these rights that are being taken from all libraries, but Aruba is making a stand. Go Aruba!

Aruba’s National Librarian, Astrid Britten, signs the Four Rights, as the National Archivist, Raymond Hernandez, and Brewster Kahle look on.

If libraries are reduced to only subscribing to commercial database products rather than owning and curating collections, we will be beholden to external corporations and subject to their whims over what’s in licensed collections, and how patrons can access them. The “Spotify for Books” model is not the way we want our libraries to go. 

To top it off, the Prime Minister, Evelyn Wever-Croes, inspired us when she told us that for the next generation, we need to “Give them the opportunity to search for the truth.” Yes.

Inspiring to see a country lead so well. I hope we have the honor of working with other nations that will also assert Digital Rights for Libraries, and live by those principles.

– Brewster Kahle