Author Archives: chrisfreeland

About chrisfreeland

Chris Freeland is the Director of Open Libraries at Internet Archive.

Library as Laboratory: Lightning Talks

In this final session of the Internet Archive’s digital humanities expo, Library as Laboratory, attendees heard from scholars in a series of short presentations about their research and how they’re using collections and infrastructure from the Internet Archive for their work.

Speakers:

  • Forgotten Histories of the Mid-Century Coding Bootcamp, [watch] Kate Miltner (University of Edinburgh)
  • Japan As They Saw It, [watch] Tom Gally (University of Tokyo)
  • The Bibliography of Life, [watch] Rod Page (University of Glasgow)
  • Q&A #1 [watch]
  • More Than Words: Fed Chairs’ Communication During Congressional Testimonies, [watch] Michelle Alexopoulos (University of Toronto)
  • WARC Collection Summarization, [watch] Sawood Alam (Internet Archive)
  • Automatic scanning with an Internet Archive TT scanner, [watch] Art Rhyno (University of Windsor)
  • Q&A #2 [watch]
  • Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index, [watch] Spencer Torene (Thomson Reuters Special Services, LLC)
  • My Internet Archive Enabled Journey As A Digital Humanities Citizen Scientist, [watch] Jim Salmons
  • Web and cities: (early internet) geographies through the lenses of the Internet Archive, [watch] Emmanouil Tranos (University of Bristol)
  • Forgotten Novels of the 19th Century, [watch] Tom Gally (University of Tokyo)
  • Q&A #3 [watch]

Links shared during the session are available in the series Resource Guide.


WARC Collection Summarization

Sawood Alam (Internet Archive)

Items in the Internet Archive’s Petabox collections of various media types (image, video, audio, book, etc.) have rich metadata, representative thumbnails, and interactive hero elements. However, web collections, which primarily contain WARC files and their corresponding CDX files, often look opaque. We created an open-source CLI tool called “CDX Summary” [1] to process sorted CDX files and generate reports. These summary reports give insights into various dimensions of CDX records/captures, such as the total number of mementos, the number of unique original resources, the distribution of media types and their HTTP status codes, path and query segment counts, temporal spread, and capture frequencies of top TLDs, hosts, and URIs. We also implemented a uniform sampling algorithm to select a given number of random memento URIs (i.e., URI-Ms) with 200 OK HTML responses that can be used for quality assurance purposes or as a representative sample for the collection of WARC files. Our tool can generate both comprehensive and brief reports in JSON format as well as in a human-readable textual representation. We ran our tool on a selected set of public web collections in Petabox, stored the resulting JSON files in their corresponding collections, and made them publicly accessible (with the hope that they might be useful for researchers). Furthermore, we implemented a custom Web Component that can load CDX Summary report JSON files and render them as interactive HTML representations. Finally, we integrated this Web Component into the collection/item views of the main site of the Internet Archive, so that patrons can access rich and interactive information when they visit a web collection/item in Petabox. We also found our tool useful for crawl operators, as it helped us identify numerous issues in some of our crawls that would otherwise have gone unnoticed.
[1] https://github.com/internetarchive/cdx-summary/ 
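
To make the kind of report described above more concrete, here is a minimal Python sketch. It is not the CDX Summary implementation. It assumes the common 11-column, space-separated CDX layout (urlkey, timestamp, original URI, media type, status code, digest, redirect, meta tags, length, offset, filename), uses reservoir sampling as one possible uniform sampling method, and builds URI-Ms with the Wayback Machine URL pattern. The names `parse_cdx_line` and `summarize` are hypothetical.

```python
import random
from collections import Counter
from urllib.parse import urlsplit

# Assumed column order of an 11-field CDX line; real files may differ.
FIELDS = ("urlkey", "timestamp", "original", "mimetype", "statuscode",
          "digest", "redirect", "metatags", "length", "offset", "filename")


def parse_cdx_line(line):
    """Return a dict for a well-formed CDX line, or None for anything else."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None  # header lines and malformed rows are skipped
    return dict(zip(FIELDS, parts))


def summarize(lines, sample_size=10, seed=None):
    """Count basic dimensions of the captures and draw a uniform sample
    of URI-Ms among captures with a 200 status and an HTML media type."""
    rng = random.Random(seed)
    captures = 0
    urlkeys = set()
    mimetypes, statuses, hosts = Counter(), Counter(), Counter()
    sample, eligible = [], 0

    for line in lines:
        rec = parse_cdx_line(line)
        if rec is None:
            continue
        captures += 1
        urlkeys.add(rec["urlkey"])
        mimetypes[rec["mimetype"]] += 1
        statuses[rec["statuscode"]] += 1
        hosts[urlsplit(rec["original"]).hostname or ""] += 1

        if rec["statuscode"] == "200" and rec["mimetype"] == "text/html":
            eligible += 1
            # Assumed URI-M form, following the Wayback Machine URL pattern.
            urim = f"https://web.archive.org/web/{rec['timestamp']}/{rec['original']}"
            if len(sample) < sample_size:
                sample.append(urim)
            else:
                # Reservoir sampling: keep the new item with
                # probability sample_size / eligible.
                j = rng.randrange(eligible)
                if j < sample_size:
                    sample[j] = urim

    return {
        "captures": captures,
        "unique_urls": len(urlkeys),
        "media_types": dict(mimetypes),
        "status_codes": dict(statuses),
        "top_hosts": hosts.most_common(5),
        "sample": sample,
    }
```

Because the reservoir is updated one line at a time, the sketch reads the CDX file in a single pass and never holds more than `sample_size` candidate URI-Ms in memory, which matters for collections with many millions of captures.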


More Than Words: Fed Chairs’ Communication During Congressional Testimonies

Michelle Alexopoulos (University of Toronto)

 Economic policies enacted by the government and its agencies have large impacts on the welfare of businesses and individuals—especially those related to fiscal and monetary policy. Communicating the details of the policies to the public is an important and complex undertaking. Policymakers tasked with the communication not only need to present complicated information in simple and relatable terms, but they also need to be credible and convincing—all the while being at the center of the media’s spotlight. In this briefing, I will discuss recent research on the applications of AI to monetary policy communications, and lessons learned to date. In particular, I will report on my recent ongoing project with researchers at the Bank of Canada that analyzes the effects of emotional cues by the Chairs of the U.S. Federal Reserve on financial markets during congressional testimonies.  

While most previous work has focused on the effects of a central bank’s highly scripted messages about its rate decisions delivered by its leader, we use resources from the Internet Archive, C-SPAN, and copies of testimony transcripts, and apply a variety of tools and techniques to study both the messages and the messengers’ delivery of them. I will review how we apply recent advances in machine learning and big data to construct measures of the Federal Reserve Chair’s emotions, expressed via his or her words, voice, and face, and will discuss the challenges encountered and our findings to date. In all, our initial results highlight the salience of the Fed Chair’s emotional cues for shaping market responses to Fed communications. Understanding the effects of non-verbal communication and responses to verbal cues may help policymakers improve their communication strategies going forward.


Digging into the (Internet) Archive: Examining the NSFW Model Responsible for the 2018 Tumblr Purge

Renata Barreto (University of California Berkeley)

In December 2018, Tumblr took down massive amounts of LGBTQ content from its platform. Motivated in part by increasing pressure from financial institutions and a newly passed law (SESTA/FOSTA, which made companies liable for sex trafficking online), Tumblr implemented a strict “not safe for work” or NSFW model, whose false positives included images of fully clothed women, handmade and digital art, and other innocuous objects, such as vases. The Archive Team, in conjunction with the Internet Archive, jumped into high gear and began to scrape self-tagged NSFW blogs in the 2 weeks between Tumblr’s announcement of its new policy and its algorithmic operationalization. At the time, Tumblr was considered a safe haven for the LGBTQ community, and in 2013 Yahoo! had bought Tumblr for $1.1 billion. In the aftermath of the so-called “Tumblr purge,” Tumblr lost its main user base and, as of 2019, was valued at $3 million. This paper digs into a slice of the 90 TB of data saved by the Archive Team. This is a unique opportunity to peek under the hood of Yahoo’s open_nsfw model, which experts believe was used in the Tumblr purge, and examine the distribution of false positives on the Archive Team dataset. Specifically, we run the open_nsfw model on our dataset and use the t-SNE algorithm to project the similarities across images into 3D space.


Japan As They Saw It (video)

Tom Gally (University of Tokyo)

“Japan As They Saw It” is a collection of descriptions of Japan by American and British visitors in the 1850s and later. Japan had been closed to outsiders for more than two centuries, and there was much curiosity in the West about this newly accessible country. The excerpts are grouped by category—Land, People, Culture, etc.—and each excerpt is linked to the book where it first appeared at the Internet Archive. “Japan As They Saw It” can be read online, or it can be downloaded as a free ebook.


Forgotten Novels of the 19th Century (video)

Tom Gally (University of Tokyo)

Novels were the binge-watched television, the hit podcasts of the 19th century—immersive, addictive, commercial—and they were produced and consumed in huge numbers. But many novels of that era have slipped through the cracks of literary memory. “Forgotten Novels of the 19th Century” is a list of fifty of those neglected novels, all waiting to be discovered and read for free at the Internet Archive.


Forgotten Histories of the Mid-Century Coding Bootcamp

Kate Miltner (University of Edinburgh)

Over the past 10 years, Americans have been exhorted to “learn to code” in order to solve a series of entrenched social issues: the tech “skills gap”, the looming threat of AI and automation, social mobility, and the underrepresentation of women and people of color in the tech industry. In response to this widespread discourse, an entire industry of short-term intensive training courses, otherwise known as coding bootcamps, has sprung up across the US, bringing in hundreds of millions of dollars in revenue a year and training tens of thousands of people. Coding bootcamps have been framed as a novel kind of institution that is equipped to solve contemporary problems. However, materials from the Internet Archive show us that, in fact, a similar discourse about computer programming and similar organizations called EDP schools existed over 70 years ago. This talk will draw on materials from the Ted Nelson Archive and the Computerworld archive to show how lessons from the past can inform the present.


The Bibliography of Life

Roderic Page (University of Glasgow)

The “bibliography of life” is the aspiration of making all the taxonomic literature available so that for every species on the planet we can find its original description, as well as track how our knowledge of that species has changed over time. By combining content from the Internet Archive and the Wayback Machine with information in Wikidata, we can make hundreds of thousands of taxonomic publications discoverable, and many of these can also be freely read via the Internet Archive. This presentation will outline this project, explain how it relates to efforts such as the Biodiversity Heritage Library, and highlight some tools such as Wikicite Search and ALEC to help export this content.


Automatic scanning with an Internet Archive TT scanner (video)

Art Rhyno (University of Windsor)

The University of Windsor has set up a mechanism for automatic scanning with an Internet Archive TT scanner, used for the library’s Major Papers collection.


Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index

Spencer Torene (Thomson Reuters Special Services, LLC)

Developing semantic hierarchies from user-created hashtags in social media can provide useful organizational structure to large volumes of data. However, construction of these hierarchies is difficult using established ontologies (e.g. WordNet) due to the differences in the semantic and pragmatic use of words vs. hashtags in social media. While alternative construction methods based on hashtag frequency are relatively straightforward, these methods can be susceptible to the dynamic nature of social media, such as hashtags associated with surges in popularity. We drew inspiration from the ecologically-based Shannon Diversity Index (SDI) to create a more representative and resilient method of semantic hierarchy construction that relies upon graph-based community detection and a novel, entropy-based ensemble diversity index (EDI) score. The EDI quantifies the contextual diversity of each hashtag, resulting in thousands of semantically-related groups of hashtags organized along a general-to-specific spectrum. Through an application of EDI to social media data (Twitter) and a comparison of our results to prior approaches, we demonstrate our method’s ability to create semantically consistent hierarchies that can be flexibly applied and adapted to a range of use cases.
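
As a rough illustration of the idea of contextual diversity, the sketch below computes the classic Shannon Diversity Index, H = -Σ p_i ln p_i, over the hashtags that co-occur with each hashtag. This is not the EDI score from the talk, whose definition is not given here, and it omits the community detection step. The function names and the toy posts are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(counts):
    """Classic Shannon index H = -sum(p_i * ln p_i) over positive counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def context_diversity(posts):
    """Map each hashtag to the Shannon index of the hashtags it co-occurs with.

    `posts` is an iterable of hashtag lists, one list per post.
    """
    contexts = {}
    for tags in posts:
        unique = set(tags)
        for tag in unique:
            # Count every other hashtag that appears in the same post.
            contexts.setdefault(tag, Counter()).update(unique - {tag})
    return {tag: shannon_diversity(list(c.values()))
            for tag, c in contexts.items()}


# Toy data: "news" appears in two different contexts, "football" in one.
posts = [
    ["news", "sports"],
    ["news", "politics"],
    ["sports", "football"],
]
scores = context_diversity(posts)
for tag in sorted(scores, key=scores.get, reverse=True):
    print(tag, round(scores[tag], 3))
```

Under this simplified measure, a hashtag that appears alongside many different hashtags scores higher (more general) than one that always appears in the same context (more specific), which is the general-to-specific ordering the abstract describes.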


Web and cities: (early internet) geographies through the lenses of the Internet Archive

Emmanouil Tranos (University of Bristol)

While geographers first turned their focus to the internet 25 years ago, the wealth of data that the Internet Archive preserves and offers remains largely unexplored, especially for projects that are large in scope and geographical scale. However, there is hardly any other data source that depicts the evolution of our interaction with the digital and, importantly, the spatial footprint of this interaction better than the Internet Archive. Therefore, over the last few years we have made extensive use of data from the Internet Archive in order to understand the geography and the evolution of the creation of online content and its interrelation with cities and spatial structure. Specifically, we have worked with The British Library and utilised the JISC UK Web Domain Dataset (1996-2013) [1] for a number of projects in order to (i) explore whether the availability of online content of local interest can attract individuals online, (ii) assess how early engagement with web tools can affect future productivity, (iii) map the evolution of economic clusters, and (iv) predict interregional trade flows. The Internet Archive helps us not only to map the evolution and the geography of engagement with the internet, especially at its early stages, and therefore draw important lessons regarding new future technologies, but also to understand economic activities that take place within and between cities.
[1] http://data.webarchive.org.uk/opendata/ukwa.ds.2/

Helping Ukrainian Scholars, One Book at a Time

The Internet Archive is proud to partner with Better World Books to support Ukrainian students and scholars. With a $1 donation at checkout during your purchase at betterworldbooks.com, you will help provide verifiable information to Ukrainian scholars all over the world through Wikipedia.

Since 2019, the Internet Archive has worked with the Wikipedia community to strengthen citations to published literature. Working in collaboration with Wikipedians and data scientists, Internet Archive has linked hundreds of thousands of citations in Wikipedia to books in our collection, offering Wikipedia editors and readers single-click access to the verifiable facts contained within libraries. 

Recently, our engineers analyzed the citations in the Ukrainian-language Wikipedia, and were able to connect citations to more than 17,000 books that have already been digitized by the Internet Archive, such as the page for Геноміка (English translation: Genomics), which links to a science textbook published in 2002. Through this work, we discovered that there are more than 25,000 additional books that we don’t have in our collection—and that’s where you can help! 

Now through the end of June, when you make a $1 donation at checkout during your purchase at betterworldbooks.com, your donation will go to acquire books that are cited in the Ukrainian-language Wikipedia. Books acquired will be donated to Internet Archive for digitization and preservation. Once digitized, the books will be linked from their citations in Wikipedia, offering readers the ability to check facts in published literature. Books will be available for borrowing by one person at a time at archive.org, and will also be available for scholars to request via interlibrary loan. With your help, we can ensure that Ukrainian scholars and people studying Ukraine have access to authoritative, factual information about Ukrainian history and culture. 

Thank you for making a difference by buying books from Better World Books and helping Ukrainian students and scholars with your donation.

Meet the Librarians of the Internet Archive

In celebration of National Library Week, we’d like to introduce you to some of the professional librarians who work at the Internet Archive and in projects closely associated with our programs. Over the next two weeks, you’ll hear from librarians and other information professionals who are using their education and training in library science and related fields to support the Internet Archive’s patrons.

What draws librarians to work at the Internet Archive? From patron services to collection management to web archiving, the answers are as varied as the departments in which these professionals work. But a common theme emerges from the profiles—that of professionals wanting to use their skills and knowledge in support of the Internet Archive’s mission: “Universal Access to All Knowledge.”

We hope that over these next two weeks you’ll learn something about the librarians working behind the scenes at the Internet Archive, and you’ll come to appreciate the training and dedication that influence their daily work. We’re pleased to help you “Meet the Librarians” during this National Library Week and beyond:

Join us April 5 for WHOLE EARTH: A Conversation with John Markoff

Join us on Tuesday, April 5 at 11am PT / 2pm ET for a book talk with John Markoff in conversation with journalist Steven Levy (Facebook: The Inside Story), on the occasion of Markoff’s new biography, WHOLE EARTH: The Many Lives of Stewart Brand.

Watch the session recording now:

For decades Pulitzer Prize-winning New York Times reporter John Markoff has chronicled how technology has shaped our society. In his latest book, WHOLE EARTH: The Many Lives of Stewart Brand (on-sale now), Markoff delivers the definitive biography of one of the most influential visionaries to inspire the technological, environmental, and cultural revolutions of the last six decades.

Purchase your copy today

Today Stewart Brand is largely known as the creator of The Whole Earth Catalog, a compendium of tools, books, and other intriguing ephemera that became a counterculture bible for a generation of young Americans during the 1960s. He was labeled a “techno-utopian” and a “hippie prince”, but Markoff’s WHOLE EARTH shows that Brand’s life’s work is far more. In 1966, Brand asked a simple question: why had we not yet seen a photograph of the whole earth? The whole earth image became an optimistic symbol for environmentalists and replaced the 1950s’ mushroom cloud with the ideal of a unified planetary consciousness. But after the catalog, Brand went on to greatly influence the ’70s environmental movement and the computing world of the ’80s. Steve Jobs adopted Brand’s famous mantra, “Stay Hungry, Stay Foolish” as his code to live by, and to this day Brand epitomizes what Markoff calls “that California state of mind.”

Watch now

Brand has always had an “eerie knack for showing up first at the onset of some social movement or technological inflection point,” Markoff writes, “and then moving on just when everyone else catches up.” Brand’s uncanny ahead-of-the-curveness is what makes John Markoff his ideal biographer. Markoff has covered Silicon Valley since 1977, and his reporting has always been at the cutting edge of tech revolutions—he wrote the first account of the World Wide Web in 1993 and broke the story of Google’s self-driving car in 2010. Stewart Brand gave Markoff carte blanche access in interviews for the book, so Markoff gets a clearer story than has ever been set down before, ranging across Brand’s time with the Merry Pranksters and his generation-defining Whole Earth Catalog, to his fostering of the marriage of environmental consciousness with hacker capitalism and the rise of a new planetary culture.

Above all, John Markoff’s WHOLE EARTH reminds us how today, amid the growing backlash against Big Tech, Stewart Brand’s original technological optimism might offer a roadmap for Silicon Valley to find its way back to its early, most promising vision.

Purchase your copy of WHOLE EARTH: The Many Lives of Stewart Brand via the Booksmith, our local bookstore.

EVENT DETAILS
WHOLE EARTH: A conversation with John Markoff
April 5 @ 11am PT / 2pm ET
Watch the event recording

Community Update: Controlled Digital Lending

From the hundreds of libraries using Controlled Digital Lending (CDL) to meet the needs of their communities to the many working groups and vendors investigating its potential, it’s clear that this innovative library practice is on the rise.

Want to learn more about what’s going on across the community? Join us for a public webinar at 11am PT on March 10 to hear from active projects, including:

  • Controlled Digital Lending Implementers group;
  • NISO’s grant from The Mellon Foundation to support the development of a consensus standards framework for implementing CDL;
  • Boston Library Consortium’s efforts around CDL for interlibrary loan;
  • CDL Co-Op (ILL & resource sharing);
  • Internet Archive, with an update on the publishers’ lawsuit against CDL & libraries;
  • CDL vendors;
  • and more!

Watch session recording now:

Presentations will be followed by a facilitated Q&A. Whether you are new to Controlled Digital Lending or have already implemented it in your library, this session will give everyone an update on where the community is today & where it’s going.

Community Update: Controlled Digital Lending
March 10 @ 11am PT / 2pm ET
Watch the session recording now

Library as Laboratory: A New Series Exploring the Computational Use of Internet Archive Collections

From web archives to television news to digitized books & periodicals, dozens of projects rely on the collections available at archive.org for computational & bibliographic research across a large digital corpus. This series will feature six sessions highlighting the innovative scholars that are using Internet Archive collections, services and APIs to support data-driven projects in the humanities and beyond.

Many thanks to the program advisory group:

  • Dan Cohen, Vice Provost for Information Collaboration and Dean, University Library and Professor of History, Northeastern University
  • Makiba Foster, Library Regional Manager for the African American Research Library and Cultural Center, Broward County Library
  • Mike Furlough, Executive Director, HathiTrust
  • Harriett Green, Associate University Librarian for Digital Scholarship and Technology Services, Washington University Libraries

Session Details:

March 2 @ 11am PT / 2pm ET

Supporting Computational Use of Web Collections
Jefferson Bailey, Internet Archive
Helge Holzmann, Internet Archive

What can you do with billions of archived web pages? In our kickoff session, Jefferson Bailey, Internet Archive’s Director of Web Archiving & Data Services, and Helge Holzmann, Web Data Engineer, will take attendees on a tour of the methods and techniques available for analyzing web archives at scale. 

Read the session recap & watch the video:


March 16 @ 11am PT / 2pm ET

Applications of Web Archive Research with the Archives Unleashed Cohort Program

Launched in 2020, the Cohort program is engaging with researchers in a year-long collaboration and mentorship with the Archives Unleashed Project and the Internet Archive, to support web archival research. 

Web archives provide a rich resource for exploration and discovery! As such, this session will feature the program’s inaugural research teams, who will discuss the innovative ways they are exploring web archival collections to tackle interdisciplinary topics and methodologies. Projects from the Cohort program include:

  • AWAC2 — Analysing Web Archives of the COVID Crisis through the IIPC Novel Coronavirus dataset—Valérie Schafer (University of Luxembourg)
  • Everything Old is New Again: A Comparative Analysis of Feminist Media Tactics between the 2nd- to 4th Waves—Shana MacDonald (University of Waterloo)
  • Mapping and tracking the development of online commenting systems on news websites between 1996–2021—Robert Jansma (University of Siegen)
  • Crisis Communication in the Niagara Region during the COVID-19 Pandemic—Tim Ribaric (Brock University)
  • Viral health misinformation from Geocities to COVID-19—Shawn Walker (Arizona State University)

UPDATE: Quinn Dombrowski from Saving Ukrainian Cultural Heritage Online (SUCHO) will give an introductory presentation about the team of volunteers racing to archive Ukrainian digital cultural heritage.

Read the session recap & watch the video:


March 30 @ 11am PT / 2pm ET

Hundreds of Books, Thousands of Stories: A Guide to the Internet Archive’s African Folktales
Laura Gibbs, Educator, writer & bibliographer
Helen Nde, Historian & writer

Join educator & bibliographer Laura Gibbs and researcher, writer & artist Helen Nde as they give attendees a guided tour of the African folktales in the Internet Archive’s collection. Laura will share her favorite search tips for exploring the treasure trove of books at the Internet Archive, and how to share the treasures you find with colleagues, students, and fellow readers in the form of a digital bibliography guide. Helen will share how she uses the Internet Archive’s collections to tell the stories of individuals and cultures that aren’t often represented online through her work at Mythological Africans (@MythicAfricans). Helen will explore how she uses technology to continue the African storytelling tradition in spoken form, and she will discuss the impacts on the online communities that she is able to reach.

Read the session recap & watch the video:


April 13 @ 11am PT / 2pm ET

Television as Data: Opening TV News for Deep Analysis and New Forms of Interactive Search
Roger Macdonald, Founder, TV News Archive
Kalev Leetaru, Data Scientist, GDELT

How can treating television news as data create fundamentally new kinds of opportunities for both computational analysis of influential societal narratives and the creation of new kinds of interactive search tools? How could derived (non-consumptive) metadata be open-access and respectful of content creator concerns? How might specific segments be contextualized by linking them to related analysis, like professional journalist fact checking? How can tools like OCR, AI language analysis and knowledge graphs generate terabytes of annotations making it possible to search television news in powerful new ways?

For nearly a decade, the Internet Archive’s TV News Archive has enabled closed captioning keyword search of a growing archive that today spans nearly three million hours of U.S. local and national TV news (2,239,000+ individual shows) from mid-2009 to the present. This public interest library is dedicated to helping journalists, scholars, and the public compare, contrast, cite, and borrow specific portions of the collection. Using a range of algorithmic approaches, users are moving beyond simple captioning search towards rich analysis of the visual side of television news.

In this session, Roger Macdonald, founder of the TV News Archive, and Kalev Leetaru, collaborating data scientist and GDELT Project founder, will report on experiments applying full-screen OCR, machine vision, speech-to-text and natural language processing to assist exploration, analysis and data visualization of this vast television repository. They will survey the resulting open metadata datasets, demonstrate the public search tools and APIs they’ve created that enable powerful new forms of interactive search of television news, and show what it looks like to ask questions of more than a decade of television news.

Read the session recap & watch the video:


April 27 @ 11am PT / 2pm ET

Analyzing Biodiversity Literature at Scale
Martin R. Kalfatovic, Smithsonian Library & Archives
JJ Dearborn, Biodiversity Heritage Library Data Manager

Imagine the great library of life, the library that Charles Darwin said was necessary for the “cultivation of natural science” (1847). And imagine that this library is not just hundreds of thousands of books printed from 1500 to the present, but also the data contained in those books that represents all that we know about life on our planet. That library is the Biodiversity Heritage Library (BHL). The Internet Archive has provided an invaluable platform for the BHL to liberate taxonomic names, species descriptions, habitat descriptions and much more. Connecting and harnessing the disparate data from more than five centuries is now BHL’s grand challenge. The unstructured textual data generated at the point of digitization holds immense untapped potential. Tim Berners-Lee provided the world with a semantic roadmap to address this global deluge of dark data, and Wikidata is now executing on his vision. As we speak, BHL’s data is undergoing rapid transformation from legacy formats into linked open data, fulfilling the promise to evaporate data silos and foster bioliteracy for all humankind.

Martin R. Kalfatovic (BHL Program Director and Associate Director, Smithsonian Library and Archives) and JJ Dearborn (BHL Data Manager) will explore how books in BHL become data for the larger biodiversity community.

Watch the video:


May 11 @ 11am PT / 2pm ET

Lightning Talks
In this final session of the Internet Archive’s digital humanities expo, Library as Laboratory, you’ll hear from scholars in a series of short presentations about their research and how they’re using collections and infrastructure from the Internet Archive for their work.

Watch the session recording:

Talks include:

  • Forgotten Histories of the Mid-Century Coding Bootcamp, [watch] Kate Miltner (University of Edinburgh)
  • Japan As They Saw It, [watch] Tom Gally (University of Tokyo)
  • The Bibliography of Life, [watch] Rod Page (University of Glasgow)
  • Q&A #1 [watch]
  • More Than Words: Fed Chairs’ Communication During Congressional Testimonies, [watch] Michelle Alexopoulos (University of Toronto)
  • WARC Collection Summarization, [watch] Sawood Alam (Internet Archive)
  • Automatic scanning with an Internet Archive TT scanner, [watch] Art Rhyno (University of Windsor)
  • Q&A #2 [watch]
  • Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index, [watch] Spencer Torene (Thomson Reuters Special Services, LLC)
  • My Internet Archive Enabled Journey As A Digital Humanities Citizen Scientist, [watch] Jim Salmons
  • Web and cities: (early internet) geographies through the lenses of the Internet Archive, [watch] Emmanouil Tranos (University of Bristol)
  • Forgotten Novels of the 19th Century, [watch] Tom Gally (University of Tokyo)
  • Q&A #3 [watch]

Join Us For A Celebration of Sound: Public Domain Day

1/21/22: Registration is now closed for the event. Watch the recording above or at https://archive.org/details/a-celebration-of-sound-public-domain-day-2022

Join us today for a virtual party at 1pm Pacific/4pm Eastern time with a keynote from Senator Ron Wyden, champion of the Music Modernization Act, and a host of musical acts, dancers, historians, librarians, academics, activists and other leaders from the Open world! This event will explore the rich historical context of recorded sound from its earliest days, including early jazz and blues, classical, and spoken word recordings reflecting important political and social issues of the era.

Additional sponsoring organizations include: Library Futures, SPARC, Authors Alliance, the Biodiversity Heritage Library, Public Knowledge, ARSC, the Duke Center for the Study of the Public Domain, and the Music Library Association.

REGISTER FOR THE VIRTUAL EVENT HERE!

Scanning periodicals for patrons with print disabilities

The Internet Archive has increased periodical digitization of purchased and donated print and microfilm resources to enhance our services for our patrons with print disabilities. Those patrons can receive priority access to the collections, bypassing waitlists and borrowing materials for longer circulation periods. These periodicals will also be made available to the EMMA and ACE projects to support student success. Some of these materials are also available to researchers via interlibrary loan, digital humanities research, and other channels.

The Internet Archive has a longstanding program serving patrons with print disabilities. The modern library materials that we digitize are first made available to qualified patrons, including affiliated users from the National Library Service, Bookshare, and ACE Portal. For more than ten years, thousands of patrons have signed up through our qualifying program to receive special access to the digital books available in our collection. 

Organizations can sign up for free as a Qualifying Authority to authorize patrons, and individual patrons can sign up as well.

Our patrons share inspiring stories with us about the impacts of the service. Pastor Doug Wilson said it’s been a “profound gift” to discover books in our digital theology collections. The breadth of materials is also compelling. “You never know what you will come across. You can search for something specific, but also just wander the virtual shelves,” said musician and graduate student Matthew Shifrin. In addition to serving our own patrons, we partner with the EMMA and ACE projects, which support students with print disabilities at schools across the US and Canada.

We have resources online to help you learn more about the Internet Archive’s program for patrons with print disabilities, including how to qualify. Please contact our Patron Services team with additional inquiries.

Thank you to the Mellon Foundation, the Institute of Museum & Library Services, the Arcadia Fund, the Kahle/Austin Foundation, and donors for their support of these services.

International patrons speak out: “Access to knowledge shouldn’t be for the rich and privileged.”

Last fall, we invited our patrons to share how they use the Internet Archive. The response was overwhelming, and gave us exactly the kinds of testimonials and messages of support we were hoping to gather.

As we worked through the responses, we were struck by the number of patrons from all over the world who use our collection. Here now, we’d like to share some of the powerful stories we received from our international users.

If you haven’t already done so, please share your story.

Editorial note: Statements have been edited for clarity.


Lisa M., Educator, England – “Internet Archive helped me help a student! I have students in one class that attend from around the globe. One student was unable to find the required texts and our university did not have digital copies that could be lent. If she were to order the book – not carried in any local stores – it could take up to 3 months for them to arrive, long after the course was over!”

Claudia G., Researcher, Romania – “Even before the pandemic, depending on the topic of my essay and thesis, it was difficult to find books on certain topics in local libraries or bookstores…Access to knowledge shouldn’t be for the rich and privileged.”

Ana S., Communications assistant, Brazil – “I borrowed a book about Stephen Sondheim. Sondheim’s story and body of work is definitely an inspiration for me as someone always trying to learn ways to exercise my creativity. I just wanted to browse one section, and it was really amazing. I’m really thankful you had it available, for anyone in the world, and the borrowing process was really easy to follow through.”

Mike D., Librarian, New Zealand – “I’m a Digital Librarian in a public library in the small town of Hokitika, New Zealand, whose job is making local history more accessible to the community – many of the New Zealand history works in our public library collection are rare or reference-only. It turns out many works of New Zealand history have been digitised by the Internet Archive from US collections.”

Callum H., Yard operative, Scotland – “As a non-academic with interests in literature, history, and philosophy, the IA gives me access to books I can’t otherwise afford or access.”

Yuri L., Educator, Brazil – “I spent months of 2020 bed-ridden, and was able to view items from your digitized collection. I would not have been able to go to any physical place for my books, and the titles I was looking for were sometimes available only on the Internet Archive. There are no other means for me, in my part of South America, to have access to limited-circulation ancient newspapers of other continents without digitizing and digital libraries. Without the Internet Archive and other libraries like it, I would have no alternatives.”

Simay K., Researcher, Turkey – “Living in a developing country with so many political and economic turmoils, I believe that the Internet Archive provides a huge service and a unique platform for dissolving the injustice and inequality of [access] to knowledge between disadvantaged countries and classes.”

Lydia S., Student, Canada – “I’ve used materials from the Internet Archive many times throughout my time as an undergrad studying history…There are many primary and secondary sources on the IA that I was unable to find anywhere else online or in physical copies through my university’s library. Many of the books I’ve accessed through the IA have been out of print for many years, so it’s incredibly helpful to have [access] to titles that would otherwise be nearly impossible to track down.”

Kim C., Librarian, Canada – “I use the materials on the Internet Archive often on a personal and a professional level. I have been able to help patrons access books that we have not been able to procure for them in other ways, for reference material for every school level from primary to masters degree research. I have used the collection on many occasions to access local history or genealogical material unavailable elsewhere.”

Richard G., Poet, Canada – Richard used books within the Internet Archive’s library “to reference other authors’ prose and poetry for quotations and references.”

Chloe J., Student, Canada – “It has given me access to material that I would not otherwise have access to.”

Shehroze A., Educator, Pakistan – “I am surprised that books pertaining to learning the Urdu language are available on archive.org, and those which were used for preparation in the civil services. These books are just not available in the country anymore and are immeasurably useful as far as the history of the colonized area is concerned. These are not published anymore, and finding a copy is exceedingly rare. This is why archive.org is important and we should endorse and support it.”

Stephen C., Graduate student, Canada – “The Internet Archive has been an invaluable resource for a research project I am involved in. We have been able to access numerous historical travel narratives that are essential for our project. We have been able to view books that we could not access in archives due to travel restrictions and lending policies during the pandemic.”

Simon H., Printing press operator, Switzerland – “I often find interest in old and niche books, sometimes from parts of the world far away from me. In those cases, I have two options for accessing such a book:
1.   I order a physical copy of the work and let it ship to my home. That is incredibly expensive, harmful to the environment and occasionally damaging to an old and fragile book, conserved for such a long time with care and passion.
2.   I’m lucky enough to find a digital reproduction of a work, which can be accessed for free and “shipped” eco-friendly through wires and antennas.
The difference between those two possibilities is so pronounced, that the latter almost seems like an utopian fairy tale. But it is not! It is 21st century’s technology at work.”