At this year’s annual celebration in San Francisco, the Internet Archive team showcased its innovative projects and rallied supporters around its mission of “Universal Access to All Knowledge.”
“People need libraries more than ever,” said Brewster Kahle, founder of the Internet Archive, at the October 12 event. “We have a set of forces that are making libraries harder and harder to happen—so we have to do something more about it.”
Efforts to ban books and defund libraries are worrisome trends, Kahle said, but there are hopeful signs and emerging champions.
Watch the full live stream of the celebration
Among the headliners of the program was Connie Chan, Supervisor of San Francisco’s District 1, who was honored with the 2023 Internet Archive Hero Award. In April, she authored and unanimously passed a resolution at the San Francisco Board of Supervisors, backing the Internet Archive and the digital rights of all libraries.
Chan spoke at the event about her experience as a first-generation, low-income immigrant who relied on books in Chinese and English at the public library in Chinatown.
Watch Supervisor Chan’s acceptance speech
“Having free access to information was a critical part of my education—and I know I was not alone,” said Chan, who is a supporter of the Internet Archive’s role as a digital, online library. “The Internet Archive is a hidden gem…It is very critical to humanity, to freedom of information, diversity of information and access to truth…We aren’t just fighting for libraries, we are fighting for our humanity.”
Several users shared testimonials about how resources from the Internet Archive have enabled them to advance their research, fact-check politicians’ claims, and inspire their creative works. Content in the collection is helping improve machine translation of languages. It is preserving international television news coverage and Ukrainian memes on social media during the war with Russia.
Technology is changing things—some for the worse, but a lot for the better, said David McRaney, speaking via video to the audience in the auditorium at 300 Funston Ave. “And when [technology] changes things for the better, it’s going to expand the limited capabilities of human beings. It’s going to extend the reach of those capabilities, both in speed and scope,” he said. “It’s about a newfound freedom of mind, and time, and democratizing that freedom so everyone has access to it.”
Open Library developer Drini Cami explained how the Internet Archive is using artificial intelligence to improve access to its collections.
When a book is digitized, it used to be that photographs of pages had to be manually cropped by scanning operators. The Internet Archive recently trained a custom machine learning model to automatically suggest page boundaries—allowing staff to double the rate of process. Also, an open-source machine learning tool converts images into text, making it possible for books to be searchable, and for the collection to be available for bulk research, cross-referencing, text analysis, as well as read aloud to people with print disabilities.
“Since 2021, we’ve made 14 million books, documents, microfiche, records—you name it—discoverable and accessible in over 100 languages,” Cami said.
As AI technology advanced this year, Internet Archive engineers piloted a metadata extractor, a tool that automatically pulls key data elements from digitized books. This extra information helps librarians match the digitized book to other cataloged records, beginning to resolve the backlog of books with limited metadata in the Archive’s collection. AI is also being leveraged to assist in writing descriptions of magazines and newspapers—reducing the time from 40 to 10 minutes per item.
“Because of AI, we’ve been able to create new tools to streamline the workflows of our librarians and the data staff, and make our materials easier to discover, and work with patrons and researchers, Cami said. “With new AI capabilities being announced and made available at a breakneck rate, new ideas of projects are constantly being added.”
A recent Internet Archive hackathon explored the risks and opportunities of AI by using the technology itself to generate content, said Jamie Joyce, project lead with the organization’s Democracy’s Library project. One of the hackathon volunteers created an autonomous research agent to crawl the web and identify claims related to AI. With a prompt-based model, the machine was able to generate nearly 23,000 claims from 500 references. The information could be the basis for creating economic, environmental and other arguments about the use of AI technology. Joyce invited others to get involved in future hackathons as the Internet Archive continues to expand its AI potential.
Peter Wang, CEO and co-founder at Anaconda, said interesting kinds of people and communities have emerged around cultures of sharing. For example, those who participate in the DWeb community are often both humanists and technologists, he said, with an understanding about the importance of reducing barriers to information for the future of humanity. Wang said rather than a scarcity mindset, he embraces an abundant approach to knowledge sharing and applying community values to technology solutions.
“With information, knowledge and open-source software, if I make a project, I share it with someone else, they’re more likely to find a bug,” he said. “They might improve the documentation a little bit. They might adapt it for a novel use case that I can then benefit from. Sharing increases value.”
The Internet Archive’s Joy Chesbrough, director of philanthropy, closed the program by expressing appreciation for those who have supported the digital library, especially in these precarious times.
“We are one community tied together by the internet, this connected web of knowledge sharing. We have a commitment to an inclusive and open internet, where there are many winners, and where ethical approaches to genuine AI research are supported,” she said. “The real solution lies in our deep human connection. It inspires the most amazing acts of generosity and humanity.”