Category Archives: Wayback Machine – Web Archive

Without Access to a Local Library, Freelance Translator Turns to Internet Archive

Graeme Currie, Freelance translator & editor. Photo: gcurrie.de

When Graeme Currie was working at a university, he went to the campus library for research and often lingered in the stacks just to enjoy the collection.

Now, as a freelance translator and editor operating remotely from a small town near Hamburg, Germany, Currie doesn’t have that same access. Without an institutional affiliation, he relies on materials in the Internet Archive for his work.

“It’s been vital for me because, at times, it’s the only way I can find what I need,” says Currie, 51, who is originally from Scotland. “For freelancers who are working from home without a library nearby and using obscure sources and out-of-print books, there’s nothing to replace the Internet Archive.”

Currie first heard about the Wayback Machine in the early 2000s as a means to check changes in websites. He then discovered other services that the Internet Archive provides, including its audio and book libraries.

As he edits and translates academic books from German to English, Currie says he often has to check book citations—looking up page numbers and verifying passages. The virtual collection has been helpful as he researches a range of topics in the arts, social sciences and the humanities. Currie says he’s borrowed titles related to philosophy, criminality and global urban history, including the early history of tourism in Sicily.

Many of the books are not only hard to find but also logistically difficult to obtain, Currie says. Without the Internet Archive, he would have to wait weeks for interlibrary loans or try to contact the book authors, who are often unavailable.

“I simply could not do my job without access to a virtual library,” says Currie, who has been freelancing for about five years. “The Internet Archive is like having a university library on your desktop.”

Learn more about Currie at https://www.gcurrie.de/.

Citizen Journalist Traces the Science to Debunk Public Health Misinformation

Sarah Barry wanted to become a fighter for something—but she didn’t know exactly what.

Citizen journalist Sarah Barry

“I was frustrated with all that was going on in the world. I knew I couldn’t wave a magic wand and fix everything, but I wanted to help in some small way,” said the 28-year-old who lives in Columbus, Ohio, and works in IT.

She decided to leverage her research skills to help correct misinformation about vaccines and public health.

For Barry, the Wayback Machine has been critical in tracking the science and sharing what she’s discovered. Without the Internet Archive, she said, valuable internet history that she needs to do effective research would have been completely lost.

“I use the Internet Archive to look up old links and resources that have since gone defunct,” said Barry. “I also use the Archive to actively input web pages that need to be saved or saved again to ensure that any resources I’m currently using are saved for mine or other’s future reference.”

She has become a citizen journalist and independent activist, volunteering for nonprofit organizations to better inform the public. Barry has given public presentations on her findings and provided reporters with materials that have appeared in a variety of news outlets.

As a millennial, Barry said she grew up being active online and has long used the Internet Archive as a tool. “It’s a common language among people like me who do research,” she said. “We all know the Internet Archive is legit.”

Empowering Anthropological Research in the Digital Age

As a doctoral student in anthropology at Yale University, Spencer Kaplan often relies on the Internet Archive for his research. He is an anthropologist of technology who studies virtual communities. Kaplan said he uses the Wayback Machine to create a living archive of data that he can analyze.

Doctoral student Spencer Kaplan

Last summer, Kaplan studied the blockchain community, which is active on Twitter and constantly changing. As people were sharing their views of the market and helping one another, he needed a way to save the data before their accounts disappeared. A failed project might have prompted the users to take down the information, but Kaplan used the Wayback Machine to preserve the social media exchanges.

In his research, Kaplan said he discovered an environment of mistrust online in the blockchain community and an abundance of scams. He followed how people were navigating the scams, warning one another online to be careful, and actually building trust in some cases. While blockchain is trying to build technologies that avoid trust in social interaction, Kaplan said it was interesting to observe blockchain enthusiasts engaging in trusting connections. He uses the text of tweets to build a corpus that he can then code and analyze to track or show trends.

The Wayback Machine can be helpful, Kaplan said, in finding preserved discussions on Twitter, early versions of company websites or pages that have been taken down altogether—a start-up company that went out of business, for example. “It’s important to be able to hold on to that [information] because our research takes place at a very specific moment in time and we want to be able to capture that specific moment,” Kaplan said.

The Internet Archive’s Open Library has also been essential in Kaplan’s work. When he was recently researching the invention of the “corporate culture” concept, he had trouble finding the first editions of many business books written in the late 80s and early 90s. His campus library often bought updated volumes, but Kaplan needed the originals. “I needed the first edition because I needed to know exactly what they said first and I was able to find that on the Internet Archive,” Kaplan said.

Preserving the Past, Empowering the Future: Unveiling the Wayback Machine’s Vital Role in Investigative Work

A precious tool. That’s how Laura Ranca describes the Wayback Machine in her work.

As a researcher at the Berlin-based organization Tactical Tech and its Exposing the Invisible Project, she helps people use technology to inform, educate and advance causes. Ranca trains journalists, human rights activists, scholars and everyday citizens to use the internet to investigate and gather evidence.

The Wayback Machine has been particularly useful in finding and retrieving lost websites, said Ranca. She also makes sure materials she produces are preserved online so future researchers can build on her work. As people try to document how the public is interacting with technology, the material stored by the Internet Archive has been essential to investigators, Ranca said.

“We face the challenge of websites and webpages being modified, altered or intentionally taken down. Sometimes it’s to hide something that was previously published, but is no longer relevant, or it now has maybe a different connotation than was intended,” Ranca said. “For us, this is very valuable to access historical records and to save different web pages and resources online using the Wayback Machine.”

When researching environmental issues, Ranca has discovered material that reflects missed early warning signs. Twenty-year-old mining reports, video footage or other documentation of activity affecting the climate can be important evidence in making the case for climate action. These items need to be protected, Ranca said, and the Wayback Machine provides that security. Ranca and the team at Exposing the Invisible conduct workshops on how to navigate the Wayback Machine, as well as train-the-trainer sessions on investigative skills more broadly. She also created guides on how to use Internet Archive content, available as open source through Creative Commons.

Canadian Musician Relies on Wayback Machine for Immigration Documentation

This post is part of our ongoing series highlighting how our patrons and partners use the Internet Archive to further their own research and programs.

David Samuel, a Canadian-born viola player, has lived all over the world working as a professional musician. A graduate of The Juilliard School, he lived in Europe and New Zealand before settling in San Francisco two years ago.

As Samuel works through the U.S. immigration process to get his permanent residence (green) card, he has turned to the Internet Archive for help in gathering documentation. He’s applying for residency under the “extraordinary ability” category. To make the case, he needs to put together an extensive resume of his accomplishments, awards and reviews in the arts.

Samuel performs and teaches in the Bay Area, as a member of the Alexander String Quartet and a lecturer at San Francisco State University. Using the Wayback Machine, he was able to track down website postings and programs about his past concerts to use in his application. “It was quite remarkable to find the exact dates and times of past performances,” said Samuel. “It would have been really tough otherwise, because I only have a limited number of actual physical documents with me.”

The application process is grueling, Samuel said, but being able to freely search for supporting evidence on the Wayback Machine has made it easier. “It’s been an important tool for me,” said Samuel, who heard about the Internet Archive years ago. “It’s like an encyclopedia for the history of the internet.”

David Samuel
http://violistdavidsamuel.com

The Power of Preservation: How the Internet Archive Empowers Digital Investigations and Research

A part of a series: The Internet Archive as Research Library

Written by Caralee Adams

When gathering evidence for a court case or researching human rights violations, Lili Siri Spira often found that the material she needed was preserved by the Internet Archive.

Spira is the Social Media and Campaign Marketing Manager for TechEquity Collaborative, as well as the co-manager of RatedResilient.com, a platform that promotes psycho-social resilience for digital activists. She has interned at the Center for Justice & Accountability and was an open-source investigator at the Human Rights Center at UC Berkeley during college.

In Spira’s work, the Wayback Machine has played an integral role in providing stamped artifacts and metadata.

For example, when researching the Bolivian coup in 2019, she wanted to learn more about the sentiment of indigenous people toward political leadership. Spira used the Wayback Machine to examine how indigenous Bolivian websites had changed since 2009. She discovered that, after initial criticism, some websites seemed to have disappeared.

“The great thing about the Internet Archive is that it really protects the chain of custody,” Spira said. “It’s not only that you look back, but you can even find a website now and capture it in time with the metadata.”

In 2020, the Berkeley Protocol on Digital Open Source Investigations provided global guidelines for using public digital information as evidence in international criminal and human rights investigations. Spira said this allows preserved website data to be used in court proceedings to hold parties accountable.

On other occasions, Spira has investigated companies suspected of unethical practices. Sometimes executives openly admitted to certain behaviors, only to later deny their actions. Companies may attempt to erase past communication, but Spira said she can uncover the previous versions of websites through the Wayback Machine.

“Our knowledge is not being held sacred by many people in this country and around the world,” Spira said. “It’s incredibly important for research work in any field to have access to preserved [digital] information—especially when that research is making certain allegations against powerful entities and corporations.”

We thank Lili and her colleagues for sharing the story of how they use the Internet Archive’s collections in their work.

2022 Empowering Libraries Year in Review

The Internet Archive launched the Empowering Libraries campaign in 2020 to defend equal access to library services for all. Since then, threats to libraries have only grown, so our fight continues. As 2022 draws to a close, here’s a look back through some of our library’s milestones and accomplishments over the year.

In the news

  • When the war in Ukraine started, volunteers began using the Wayback Machine and other online tools to preserve Ukrainian websites and digital collections. The effort, Saving Ukrainian Cultural Heritage Online (SUCHO), now has more than 1,500 volunteers working to preserve more than 5,000 websites and 50 TB of data.
    • Watch a compelling story about SUCHO from CBS News featuring Quinn Dombrowski, one of the project leaders from Stanford University, and Mark Graham, director of the Wayback Machine.
    • In May, we partnered with Better World Books on a book drive supporting Ukrainian scholars. BWB customers were able to donate $1 at checkout to acquire books cited in the Ukrainian-language Wikipedia for the Internet Archive to preserve, digitize, and link to citations in Wikipedia.
  • In October, we introduced Democracy’s Library, a free, open, online compendium of government research and publications from around the world. We hosted an in-person celebration that highlighted the critical importance of free and open access to government publications, and have continued framing out what Democracy’s Library is and why it’s necessary.
  • Internet Archive Canada opened its new headquarters in Vancouver, BC, alongside the Association of Canadian Archivists 2022 Conference.
  • More than 1,000 authors have spoken out on behalf of libraries, demanding that publishers and trade associations put the digital rights of librarians, readers, and authors ahead of shareholder profits. 
  • In a tumultuous year on social media, Internet Archive has added a Mastodon server. Why? We need a game with many winners, not just a few powerful players.
  • In an OpEd for TIME, Brewster Kahle, founder and digital librarian of the Internet Archive, warned, “the instability occasioned by Twitter’s change in ownership has revealed an underlying instability in our digital information ecosystem.”

The internet reacts to the lawsuit against our library

  • On July 7, 2022, the Internet Archive filed a motion for summary judgment, asking a federal judge to rule in our favor and end a radical lawsuit, filed by four major publishing companies, that aims to criminalize library lending. Check out the Hachette v. Internet Archive page at EFF for all filings and resources.
  • We hosted a press conference on July 8 about the lawsuit featuring statements from Brewster Kahle (Internet Archive) and Corynne McSherry (EFF), plus powerful impact statements from medical school librarian Benjamin Saracco and author and editor Tom Scocca.
  • Interest in the lawsuit crossed over into mainstream channels following a viral tweet about the filing, which kicked off a lengthy online conversation about library rights, digital lending and digital ownership.
  • After a series of standard filings across the summer and early fall, on October 8, Internet Archive filed the final brief in support of our motion for summary judgment, asking the Court to dismiss the lawsuit because our lending program is a fair use.
  • What does the lawsuit mean for the future of libraries? Internet Archive’s policy counsel, Peter Routhier, considers how the publishers view libraries based on their filings.
  • One message really resonated online—people were surprised to learn that the Internet Archive has a physical archive that preserves all the physical books we’ve acquired and digitized. 

eBooks, #OwnBooks & digital ownership

  • 2022 might go down as the year that people started to really understand what it means when libraries & individuals can no longer own content, like when streaming-only content vanishes from media platforms.
  • Musician Max Collins wrote in Popula how “owning media is now an act of countercultural defiance,” walking readers through his first-hand example of how the streaming model doesn’t work for artists, only corporations.
  • Brewster Kahle published, “Digital Books wear out faster than Physical Books,” countering the notion put forward by publishers that ebooks don’t wear out. In fact, Brewster notes that ebooks require “constant maintenance—reprocessing, reformatting, re-invigorating or they will not be readable or read.”
  • Brewster’s post sparked the interest of LA Times business columnist Michael Hiltzik, who expanded on the issues around digital ownership in “Here’s why you can’t ‘own’ your ebooks.”
  • To celebrate why it’s important to own books, and to help bring visibility to issues around digital ownership, we launched the participatory #OwnBooks campaign, which invited people to share photos with the oldest book, or most treasured volume, from their personal collection, like this signed copy of The Phantom Tollbooth.
  • Author Glyn Moody published his latest book, Walled Culture, as a free ebook that you can download and own, or as a physical book that you can purchase in print.
  • More publishers joined the movement to sell—not license—ebooks to libraries, including independent publisher 11:11 Press.

The future of libraries

  • In February, we launched Library as Laboratory, a new series exploring the computational use of Internet Archive collections. The series included segments from digital humanities scholars, computational scientists, web archiving professionals and other researchers.
  • To help librarians and other information professionals better understand the decentralized web, Internet Archive partnered with the Metropolitan New York Library Council, DWeb, and Library Futures for a six-part series, Imagining a Better Online World: Exploring the Decentralized Web.
  • During this year’s National Library Week, we invited readers to Meet the Librarians who work at the Internet Archive, highlighting the new roles our librarians lead in support of our mission, “Universal Access to All Knowledge.”
  • Internet Archive joined with Creative Commons, Wikimedia Foundation and others in the Movement for a Better Internet, a collaborative effort to ensure that the internet’s evolution is guided by public interest values.
  • Lila Bailey, Internet Archive’s senior policy fellow, and Michael Menna, policy fellow from Stanford University, released their report, “Securing Digital Rights for Libraries: Towards an Affirmative Policy Agenda for a Better Internet,” regarding libraries’ role in shaping the next iteration of the internet.

Milestones

  • Dave Hansen, one of the authors of the white paper on controlled digital lending, was named the new executive director of Authors Alliance.
  • Carl Malamud received this year’s Internet Archive Hero Award for his lifelong mission to make public information freely available to the public.
  • We hosted the first in-person Library Leaders Forum in three years, preceded by a virtual Forum that brought together hundreds of digital library enthusiasts to explore issues related to digital ownership and the future of library collections.
  • We hosted a joint webinar with OCLC about our resource sharing pilots, including how to request articles from the Internet Archive via interlibrary loan.
  • The Music Library Association made its publications openly available at Internet Archive.
  • We began gathering content to support the newly announced Digital Library of Amateur Radio and Communications (DLARC), and then quickly surpassed 25,000 items in the collection.
  • DISCMASTER, a new software tool, allows users to search across the contents of the tens of thousands of archived CD-ROMs at the Internet Archive.
  • In August we celebrated the 20th anniversary of the Live Music Archive with a historical tour of the effort, which has resulted in hundreds of thousands of live sets available for listening at archive.org.

Donations

  • Colgate University donated more than 1.5 million microfiche cards for preservation and digitization, covering topics including Census data, documents from the Department of Education, Congressional testimony, CIA documents, and foreign news translated into English.
  • Facing an uncertain future, Hong Kong bookstore owner Albert Wan closed his pro-democracy, independent bookstore and donated the books to the Internet Archive for preservation and digitization.
  • Do you have physical collections you’d like to donate to the Internet Archive? Check out our help document.

Digital Library of Amateur Radio & Communications Surpasses 25,000 Items

In the six weeks since announcing that Internet Archive has begun gathering content for the Digital Library of Amateur Radio and Communications (DLARC), the project has quickly grown to more than 25,000 items, including ham radio newsletters, podcasts, videos, books, and catalogs. The project seeks additional contributions of material for the free online library.

You are welcome to explore the content currently in the library and watch the primary collection as it grows at https://archive.org/details/dlarc.

The new material includes historical and modern newsletters from diverse amateur radio groups including the National Radio Club (of Aurora, CO); the Telford & District Amateur Radio Society, based in the United Kingdom; the Malta Amateur Radio League; and the South African Radio League. The Tri-State Amateur Radio Society contributed more than 200 items of historical correspondence, newspaper clippings, ham festival flyers, and newsletters. Other publications include Selvamar Noticias, a multilingual digital ham radio magazine; and Florida Skip, an amateur radio newspaper published from 1957 through 1994. The library also includes the complete run of 73 Magazine, more than 500 issues, which is freely and openly available.

More than 300 radio-related books are available in DLARC via controlled digital lending. These materials may be checked out by anyone with a free Internet Archive account for a period of one hour to two weeks. Radio and communications books donated to Internet Archive are scanned and added to the DLARC lending library.

Amateur radio podcasts and video channels are also among the first batch of material in the DLARC collection. These include Ham Nation, Foundations of Amateur Radio, and the ICQ Amateur/Ham Radio Podcast, with many more to come. Providing a mirror and archive for “born digital” content such as video and podcasts is one of the core goals of DLARC.

Additions to DLARC also include presentations recorded at radio communications conferences, including GRCon, the GNU Radio Conference; and the QSO Today Virtual Ham Expo. A growing reference library of past radio product catalogs includes catalogs from Ham Radio Outlet and C. Crane.

DLARC is growing to be a massive online library of materials and collections related to amateur radio and early digital communications. It is funded by a significant grant from Amateur Radio Digital Communications (ARDC) to create a digital library that documents, preserves, and provides open access to the history of this community. 

If you have material to contribute to the DLARC library, questions about the project, or interest in similar digital library building projects for other professional communities, please contact:

Kay Savetz, K6KJN
Program Manager, Special Collections
kay@archive.org
Mastodon: dlarc@mastodon.radio

Internet Archive Seeks Donations of Materials to Build a Digital Library of Amateur Radio and Communications

Internet Archive has begun gathering content for the Digital Library of Amateur Radio and Communications (DLARC), which will be a massive online library of materials and collections related to amateur radio and early digital communications. The DLARC is funded by a significant grant from the Amateur Radio Digital Communications (ARDC), a private foundation, to create a digital library that documents, preserves, and provides open access to the history of this community.

The library will be a free online resource that combines archived digitized print materials, born-digital content, websites, oral histories, personal collections, and other related records and publications. The goals of the DLARC are to document the history of amateur radio and to provide freely available educational resources for researchers, students, and the general public. This innovative project includes:

  • A program to digitize print materials, such as newsletters, journals, books, pamphlets, physical ephemera, and other records from institutions, groups, and individuals.
  • A digital archiving program to archive, curate, and provide access to “born-digital” materials, such as digital photos, websites, videos, and podcasts.
  • A personal archiving campaign to ensure the preservation and future access of both print and digital archives of notable individuals and stakeholders in the amateur radio community.
  • Oral history interviews with key members of the community.
  • Preservation of all physical and print collections donated to the Internet Archive.

The DLARC project is looking for partners and contributors with troves of books, magazines, documents, catalogs, manuals, videos, software, personal archives, and other historical records collections related to ham radio, amateur radio, and early digital communications, no matter how big or small. In addition to physical material to digitize, we are looking for podcasts, newsletters, video channels, and other digital content that can enrich the DLARC collections. Internet Archive will work directly with groups, publishers, clubs, individuals, and others to ensure the archiving and perpetual access of contributed collections, their physical preservation, their digitization, and their online availability and promotion for use in research, education, and historical documentation. All collections in this digital library will be universally accessible to any user and there will be a customized access and discovery portal with special features for research and educational uses.

We are extremely grateful to ARDC for funding this project and are very excited to work with this community to explore a multi-format digital library that documents and ensures access to the history of a specific, noteworthy community. If you have material to contribute to the DLARC library, questions about the project, or interest in similar digital library building projects for other professional communities, please contact:

Kay Savetz, K6KJN
Program Manager, Special Collections
kay@archive.org
Twitter: @KaySavetz 

Meet the Librarians: Sawood Alam, Wayback Machine

To celebrate National Library Week 2022, we are taking readers behind the scenes to Meet the Librarians who work at the Internet Archive and in associated programs.

Sawood Alam was born and raised on a farm in a remote village of India with no smartphones, television or electricity.

Sawood Alam

“Books were one of the only means of learning and entertainment for us,” said Alam, who checked out as many books as he could from his school library every Thursday. “I had to take my buffalo out every afternoon. It was a boring task out in the field with no one to talk to, so books were my companions.”

When he was 10 years old, Alam helped at his school library, which was all run by children. He said he learned a lot about sorting, indexing and categorizing books—the beginning of a lifelong passion.  

Nearly two decades later, Alam completed his PhD in computer science with a specialty in web archiving from Old Dominion University. He was part of the Web Science and Digital Libraries Research Group at the university. 

Alam joined the staff of the Internet Archive as a web and data scientist in 2020. Working with the Wayback Machine team, Alam supports researchers from all around the world conducting analyses with Internet Archive collections. When someone has a research question that involves interaction with Wayback Machine APIs or downloading a large number of archived web pages, he helps prepare the data and provides technical assistance. Alam tries to improve the discoverability of items in massive web collections. His data insights and quality assurance efforts enhance web crawling and Wayback Machine operations.

Alam also collaborates with partners from academia, industry, and organizations on various research, development and standardization efforts. His own research has focused on archive profiling, interoperability and cooperation among archives, which are all topics the data scientist writes about and shares on Twitter.

Formal academic training in the field of web archiving is uncommon, said Alam. With his background, he’s able to understand the data scientists’ research needs, he said, making his skills a perfect match for his position at the Internet Archive. 

“‘Universal Access to All Knowledge’ is something that certainly resonates for me,” Alam said of the Internet Archive’s mission. “I would like to focus on making it more global.”

In recognition of his contribution to the library community with digital preservation, Alam received the NDSA 2020 Future Stewards Innovation Award.

Beyond his work at the Internet Archive, Alam serves the digital library and web archiving communities by peer-reviewing research papers and chairing sessions for journals and conferences in his fields of interest, and by participating in International Internet Preservation Consortium (IIPC) conversations focused on interoperability, collaboration, and related topics.

Favorite items in the Internet Archive for Alam? “I established a volunteer-driven online Unicode Urdu books library, UrduWeb Digital Library, during my graduation years. My first language is Urdu so when I see books and materials in Urdu in the Internet Archive it brings me joy. Thanks to the Wayback Machine, I was able to narrate the lost story of the evolution of Urdu blogging on the 20th anniversary of the Internet Archive.”