Category Archives: Books Archive

Book Talks Draw More Than 2,000 Attendees in 2023

Internet Archive drew more than 2,000 attendees to its popular book talk series in 2023, held in collaboration with Authors Alliance. The books and authors represented in this year’s series covered topics as varied as digital copyright, the persistence of history and culture through preservation, early personal computing history, and the harms of political control and corporate surveillance. Browse the full collection.

WATCH NOW:

January 12, 2023 – Ben Tarnoff, “Internet for the People”

March 9, 2023 – Jason Steinhauer, “History, Disrupted”

March 28, 2023 – Peter Baldwin, “Athena Unbound”

April 20, 2023 – Jessica Litman, “Digital Copyright”

May 9, 2023 – Jessica Silbey, “Against Progress”

July 13, 2023 – Laine Nooney, “The Apple II Age”

August 24, 2023 – Oya Y. Rieger, “Moving Theory Into Practice”

September 20, 2023 – Abby Smith Rumsey, “Memory, Edited”

October 19, 2023 – Ian Johnson, “Sparks”

October 31, 2023 – Cory Doctorow, “The Internet Con”

November 16, 2023 – Howie Singer & Bill Rosenblatt, “Key Changes”

December 6, 2023 – David G. Stork, “Pixels & Paintings”

Genealogist uncovers family histories with help of Internet Archive

In tracing her family history, Taneya Koonce discovered stories about her African American ancestors in records going back to the late 1700s. Many were enslaved. She followed the path of some descendants from North Carolina to New York in the Great Migration. 

Taneya Koonce

The Internet Archive is among the many sources that Koonce has relied on in her research. From her home in Tampa, Florida, she regularly accesses the collection’s online yearbooks, newspapers, location histories, and government records to piece together her family’s story—and has also contributed material in hopes of helping others.

“As a genealogist and family historian, the breadth of digitized materials in the Internet Archive is essential to my research and an invaluable source of information in my family history quest,” said Koonce, who works as an information scientist at an academic medical center.

Koonce began recording her family’s stories by interviewing her grandmothers nearly 30 years ago. She learned about several siblings of her maternal grandmother who died in infancy and about the hardships the family faced. After her grandmothers died, Koonce rediscovered her notes from those conversations and began to dive into genealogy in earnest in 2005.

Her interest turned from a hobby to a passion in recent years. Koonce maintains a family genealogy website, has created a web database for researching Koonce surnames from all over the country, publishes on her genealogy blog, and runs a collaborative genealogy-focused online community, the Academy of Legacy Leaders.

Having found so many historical items on the Internet Archive, Koonce teaches others how to use the collection in their own research. She’s active in genealogy societies, frequently presenting to others about the wealth of materials online.

Koonce applauded the Archive for preserving New York voter lists that helped her find one of her ancestors. After researching slaveholders by the name of Koonce, she connected with a man in Wisconsin who had published a “Koonce to Koonce” newsletter on the family’s history. With his cooperation, Taneya digitized and uploaded the newsletter to the Archive to preserve it for others. She always documents her findings, should they be of interest to others pursuing their family history.

“I specialize in helping family historians be very cognizant about planning for the future and leaving a legacy,” said Koonce, who has presented about the importance of saving family history research for the next generation. “One strategy is sharing material on the Internet Archive. I want to help educate people that it is a library. It’s dedicated to preserving content for the future. If we can contribute information to the collection, we can spread the word about what we’re doing and make sure it’s long lasting.”

Public Domain Day 2024 Remix Contest: The Internet Archive is Looking For Creative Short Films Made By You!

The Cameraman – 1928 – Buster Keaton

We are looking for filmmakers and artists of all levels to create and upload short films of 2–3 minutes to the Internet Archive to help us celebrate Public Domain Day at our celebrations on January 24 (in-person screening & party) & January 25 (virtual celebration), 2024!

Our short film contest serves as a platform for filmmakers to explore, remix, and breathe new life into the timeless gems that have entered the public domain. From classic literature and silent films to musical compositions and visual art, the contest winners draw inspiration from the vast archive of cultural heritage from 1928. We want artists to use this newly available content to create short films using resources from the Internet Archive’s collections from 1928. The uploaded videos will be judged and prizes of up to $1500 awarded!! (see details below)

Winners will be announced and shown at the in-person Public Domain Day Celebration at the Internet Archive headquarters in San Francisco on January 24, 2024, as well as our virtual celebration on January 25. All other participating videos will be added to a Public Domain Day Collection on archive.org and featured in a blog entry in January of 2024.

Here are a few examples of some of the materials that will become public domain on January 1, 2024:

Possible themes include, but are not limited to:  

  • Weird Tales of 1928
  • Sleuthing the Public Domain
  • What can 1928 teach us about 2024?
  • Steamboat Willie re-imagined

Guidelines

  • Make a 2–3 minute movie using at least one work published in 1928 that will become Public Domain on January 1, 2024. This could be a poem, book, film, musical composition, painting, photograph or any other work that will become Public Domain next year. The more different PD materials you use, the better!
    • Note: If you have a resource from 1928 that is not available on archive.org, you may upload it and then use it in your submission. (Here is how to do that). 
  • Your submission must have a soundtrack. It can be your own voiceover or performance of a public domain musical composition, or you may use public domain or CC0 sound recordings from sources like Openverse and the Free Music Archive.
    • Note: Music copyright is TRICKY! Currently sound recordings published up to December 31, 1922 are public domain; on the upcoming January 1 that will change to sound recordings published up to December 31, 1923.  Sound recordings published later than that are NOT public domain, even if the underlying musical composition is, so watch out for this!
  • Mix and Mash content however you like, but note that ALL of your sources must be from the public domain. They do not all have to be from 1928. Remember, U.S. government works are public domain no matter when they are published. So feel free to use those NASA images! You may include your own original work if you put a CC0 license on it.
  • Add a personal touch, make it yours!
  • Keep the videos light hearted and fun! (It is a celebration after all!)
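
The guidelines above come down to a few checks on length, soundtrack, and source dates. As a rough illustration only, here is a minimal Python sketch of those checks as this post states them. The `Source` class, the `kind` labels, and the function names are all hypothetical, and the sketch assumes that works published before 1928 are also public domain. It is not an official contest tool.

```python
from dataclasses import dataclass

# Cutoff years taken from the guidelines, effective January 1, 2024.
WORK_CUTOFF_YEAR = 1928       # works published in 1928 enter the public domain
RECORDING_CUTOFF_YEAR = 1923  # sound recordings published through Dec 31, 1923


@dataclass
class Source:
    title: str
    year: int
    kind: str  # hypothetical labels: "work", "recording", "us_gov", "own_cc0"


def is_allowed(source: Source) -> bool:
    """Return True if the source passes the simplified rules in this sketch."""
    if source.kind in ("us_gov", "own_cc0"):
        return True  # allowed regardless of year, per the guidelines
    if source.kind == "recording":
        return source.year <= RECORDING_CUTOFF_YEAR
    # Assumption: anything published in or before 1928 counts as public domain.
    return source.year <= WORK_CUTOFF_YEAR


def check_submission(duration_seconds: int, has_soundtrack: bool,
                     sources: list[Source]) -> list[str]:
    """Return a list of problems; an empty list means no rule was tripped."""
    problems = []
    if not 120 <= duration_seconds <= 180:
        problems.append("film must run 2-3 minutes")
    if not has_soundtrack:
        problems.append("film must have a soundtrack")
    if not any(s.kind == "work" and s.year == 1928 for s in sources):
        problems.append("use at least one work published in 1928")
    for s in sources:
        if not is_allowed(s):
            problems.append(f"not public domain under these rules: {s.title}")
    return problems
```

For example, a 150-second film with a soundtrack that uses The Cameraman (1928) returns no problems, while adding a sound recording published in 1928 would be flagged, since recordings follow the earlier cutoff.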

Submission Deadline

All submissions must be uploaded to this collection on the Internet Archive by midnight (PST) on January 19, 2024.

How to Submit

Prizes

  • 1st prize: $1500
  • 2nd prize: $1000
  • 3rd prize: $500

*All prizes sponsored by the Kahle/Austin Foundation

Judges:

Judges will be looking for videos that are fun, interesting and use public domain materials, especially those from 1928. Winning videos will be shown at the in-person Public Domain Day party in San Francisco and should highlight the value of having cultural materials that can be reused, remixed, and re-contextualized for a new day. Winners’ pieces will be purchased with the prize money and will be viewable on the Internet Archive under a Creative Commons license.

  • Amir Saber Esfahani (Director of Special Arts Projects, Internet Archive)
  • Rick Prelinger (Board Member, Internet Archive; Founder, Prelinger Archives)
  • BZ Petroff (Director of Admin & HR, Internet Archive)
  • Special guest judges

For reference, check out the 2023 Entrants

Using the Wayback Machine to Understand the Cultural Roots of New Technologies

As an academic librarian helping connect students and faculty with the research materials they need, Sanjeet Mann has turned to the Internet Archive many times.

“I really value having the Wayback Machine as an additional tool in my librarian’s toolbox,” Mann said. “Information preservation is an essential, but often overlooked, part of the infrastructure for teaching and learning.”

Mann, currently working as the Systems & Discovery Librarian at California State University, San Bernardino (CSUSB), said he first learned about the value of the Internet Archive in 2006 during his library science master’s program.

Over his career, Mann has worked at various libraries, tapping into the Archive on the job.

Assisting budding writers, composers and artists as Arts Librarian at University of Redlands, Mann found that the vast amount of free information online, including biographies, can shape students’ projects.

“We can draw on the Archive whenever we need inspiration for creative work, or when we need to understand how current scholarship and the issues that we’re facing now aren’t completely new—they’re based on this history of work by scholars, by politicians, by citizens active in the public interest,” he said. “These issues tend to recur over time. As a society, we need to know where we have been in order to meet the challenges of the future.”

At CSUSB, Mann also helps computer science and business students use the Archive’s collections to better understand the cultural roots of new technologies—the historical context for their innovations.

“It is the only entity I’m aware of that preserves the Internet’s scholarly and historical record at this scale,” Mann said.

On a practical note, Mann leveraged information through the Wayback Machine when he was researching how to set up a campus laptop loaner program for University of Redlands. This can be an essential service that libraries provide students who have trouble with their computers.

Mann wanted to understand policies at other universities, such as how they handled the return of damaged laptops. Looking at archived versions of university library websites through the Wayback Machine, Mann was able to learn about other approaches and find contacts to follow up for additional details.
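
For readers who want to try this kind of lookup programmatically, here is a minimal Python sketch that asks the Wayback Machine's public availability endpoint for the capture closest to a given date. The page address `library.example.edu/laptops` is a made-up placeholder, the function names are hypothetical, and the response fields shown are an assumption about the JSON shape rather than a description of how Mann did his research.

```python
import json
import urllib.parse
import urllib.request

# Public availability endpoint of the Wayback Machine.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url: str, timestamp: str) -> str:
    """Build a query for the capture closest to timestamp (YYYYMMDD)."""
    params = urllib.parse.urlencode({"url": page_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def closest_snapshot(response: dict):
    """Return the archived URL from a parsed response, or None if absent."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url: str, timestamp: str):
    """Query the endpoint over the network and return the closest capture."""
    with urllib.request.urlopen(build_query(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling `lookup("library.example.edu/laptops", "20150901")` would return the address of the nearest archived copy, or `None` when the page was never captured. Keeping the URL building and response parsing separate from the network call makes both easy to check offline.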

The Internet Archive is a source to verify information that is no longer listed on websites, he said.

“Companies themselves don’t have any incentive to archive the history of their website. New products get launched. The platform gets migrated from one platform to another,” Mann said. “An organization like the Internet Archive, being a library, is uniquely positioned to meet the need in society of ensuring some kind of continuity of memory and having a public record. Especially with the government being very partisan these days, I think there’s value in the Internet Archive being an independent, not-for-profit that operates in the public interest.”

Mann added: “Without the Archive, we would lose decades of information about our society at a crucial turning point in its development, eroding trust in online systems and requiring educators, students and researchers to reconsider the way we do our work and share it with others.”

Academic Librarian: End to Controlled Digital Lending Would be ‘Detrimental’ to Community 

Libraries around the world were forced to shut their doors in the spring of 2020 during the start of the COVID-19 pandemic. Temple University Libraries was no exception. While the Philadelphia institution’s physical buildings were closed, librarians got creative about how to remain open to students, faculty and staff.

Olivia Given Castello, social science librarian, Temple University

It was all about getting users connected with digital material. Library staff worked together to develop a simple new service—they added a “Get Help Finding a Digital Copy” button to their library catalog. When searching for resources in the library catalog, users can click on the button to request assistance finding a physical item in digital form, which creates a help ticket for library staff to field.

Within the first week of the button launch in April 2020, there were about 350 requests. Since then, the requests have surpassed 9,000.

“Our popular service helps users get access to resources they need quickly without economic hardship, and without having to travel to campus,” said Olivia Given Castello, a social science librarian and unit head in Temple Libraries’ Learning & Research Services department, who helped create the new service.

Temple relies on a variety of sources for its digital requests—including the Internet Archive. “It’s a valuable resource through which we help Temple library users find digital copies of inaccessible or inconveniently accessible items in our physical collection,” Given Castello said of the Internet Archive’s ebooks available through controlled digital lending (CDL).

Charles Library at Temple University

For a large research university, Temple’s library collections budget is modest, and it has been challenging to keep up with the rapidly rising costs of journals and monographs given the static library budget in recent years. Additionally, there are ebooks that the libraries are unable to provide. Commercial publishers want to maximize profits gained from ebook sales to individual students, so unlike with print books, there are many ebook titles they refuse to sell to libraries, or refuse to sell with adequate user licensing. Based on past requests, Given Castello estimated that just under 20% of the digital items that Temple finds through its new service are in the Internet Archive collection.

“Our library serves a diverse user community that is socio-economically disadvantaged relative to those at many other R1 U.S. research universities,” she said. The R1 designation indicates a university that grants doctoral degrees and has very high research activity; the list of 146 institutions so designated includes the wealthiest private universities in the U.S. “Our users’ ability to access ebooks through the Internet Archive’s controlled digital lending eases financial strain on them.”

“The actions of commercial publishers have put the academic publishing model at risk, pushing the boundaries in ways that prevent libraries from serving the role in society that they need to,” Given Castello said. “We’re trying to cope with that. Services like the one we set up, and controlled digital lending for borrowing ebooks from Internet Archive are important in this challenging landscape.”

Given Castello wrote about the Temple experience in “Supporting Online Learning and Research: Assessing our Virtual Reference Activities” and “Get help finding a digital copy: A pandemic response becomes the new normal.”

“For any university that has a student body with significant economic challenges, organizations like the Internet Archive are just so important in helping make knowledge and information accessible to everyone, regardless of their economic privilege,” Given Castello said. “Libraries exist, in part, so that getting access to the information you need is not dependent on your personal wealth. Inequity of information access is bad for individuals and for society as a whole.”

If legal action were to diminish or shut down CDL, Given Castello said it would be “detrimental” to the university’s service.

She added: “We can’t let commercial publishers’ short-term shareholder profits take such precedence that they get in the way of equitable access to information. Eventually, that will have a long-term negative impact on knowledge creation, which hurts our society, companies, and the economy as well. Sometimes you have to think of the greater good.”

Grad Student Finds Nostalgic ‘Treasure Trove of Goodies’ Through the Internet Archive

As Elena Rowan researches the ways that activist archivers gather and make sense of data, she often relies on the Internet Archive. She is a graduate student in sociology at Concordia University in Montreal, Canada, with an interest in the debate around copyright and e-books in public libraries. 

Elena Rowan

“I look at why archives and libraries are important to society and culture as a whole,” said Rowan, who uses materials preserved in the Wayback Machine and the Internet Archive. “Without the Internet Archive, so much of the knowledge and information on the Internet would be lost, and most of my research would be impossible.”

Rowan is in her second year of her master’s program and works as a research assistant at the Data Justice Hub. It is a collaborative research project that pursues data-related skills development for social activists, critical researchers and the general public, and aims to understand how data activists gather and make sense of data.

The Internet Archive has been valuable, she said, in providing information for the project and its podcast, Data Decoded.  

For a recent class on sociology theory, Rowan said she’s found it useful to search for work by early researchers such as W.E.B. Du Bois in the Internet Archive’s collection. Her university library has a wealth of materials, but she says there are times when she can only find an older book through the Archive and, being digital, it’s easier to locate.

With an event sponsored by the Milieux Institute, which offers programs at the intersection of fine arts, digital culture, and information technology, Rowan leveraged the Internet Archive in another way. She created a one-hour Curating Nostalgia workshop where participants could explore resources in the digital collection to create their own personal nostalgia archive.

Listen to the Data Decoded podcast

Logging into the Internet Archive, Rowan taught people how to search for historical documents and pop culture items. For example, she found a beloved video game that came in a cereal box from her childhood, as well as an audio walking tour of her neighborhood from a decade earlier before gentrification changed the landscape. Other workshop participants found books they read as kids, Club Penguin memorabilia and a Nancy Drew game. 

“For scholarly work and nostalgia researchers, it’s a treasure trove of goodies,” Rowan says of the Internet Archive.

In her personal life, Rowan said she’s enjoyed perusing old magazines and obscure cookbooks. She’s found recipes for ambitious cakes, sewing patterns and vintage designs that give her ideas for how to pull together her eclectic mix of old furniture. 

“The colors, writing and patterns of the past offer infinite inspiration for creative hobbies and help cultivate domestic bliss,” she said. “I am grateful to everyone at the Internet Archive for creating, maintaining and continuing to expand and fight for this truly amazing public resource!”

Internet Archive Celebrates Research and Research Libraries at Annual Gathering

At this year’s annual celebration in San Francisco, the Internet Archive team showcased its innovative projects and rallied supporters around its mission of “Universal Access to All Knowledge.”

Brewster Kahle, Internet Archive’s founder and digital librarian, welcomes hundreds of guests to the annual celebration on October 12, 2023.

“People need libraries more than ever,” said Brewster Kahle, founder of the Internet Archive, at the October 12 event. “We have a set of forces that are making libraries harder and harder to happen—so we have to do something more about it.”

Efforts to ban books and defund libraries are worrisome trends, Kahle said, but there are hopeful signs and emerging champions.

Watch the full live stream of the celebration

Among the headliners of the program was Connie Chan, Supervisor of San Francisco’s District 1, who was honored with the 2023 Internet Archive Hero Award. In April, she authored a resolution backing the Internet Archive and the digital rights of all libraries, which the San Francisco Board of Supervisors passed unanimously.

Chan spoke at the event about her experience as a first-generation, low-income immigrant who relied on books in Chinese and English at the public library in Chinatown.  

Watch Supervisor Chan’s acceptance speech

“Having free access to information was a critical part of my education—and I know I was not alone,” said Chan, who is a supporter of the Internet Archive’s role as a digital, online library. “The Internet Archive is a hidden gem…It is very critical to humanity, to freedom of information, diversity of information and access to truth…We aren’t just fighting for libraries, we are fighting for our humanity.”

Several users shared testimonials about how resources from the Internet Archive have enabled them to advance their research, fact-check politicians’ claims, and inspire their creative works. Content in the collection is helping improve machine translation of languages. It is preserving international television news coverage and Ukrainian memes on social media during the war with Russia.  

Quinn Dombrowski, of the Saving Ukrainian Cultural Heritage Online project, shows off Ukrainian memes preserved by the project.

Technology is changing things—some for the worse, but a lot for the better, said David McRaney, speaking via video to the audience in the auditorium at 300 Funston Ave. “And when [technology] changes things for the better, it’s going to expand the limited capabilities of human beings. It’s going to extend the reach of those capabilities, both in speed and scope,” he said. “It’s about a newfound freedom of mind, and time, and democratizing that freedom so everyone has access to it.”

Open Library developer Drini Cami explained how the Internet Archive is using artificial intelligence to improve access to its collections.

Photographs of pages from digitized books used to be cropped manually by scanning operators. The Internet Archive recently trained a custom machine learning model to automatically suggest page boundaries, allowing staff to double their processing rate. Also, an open-source machine learning tool converts images into text, making books searchable and making the collection available for bulk research, cross-referencing, and text analysis, as well as for reading aloud to people with print disabilities.

Open Library developer Drini Cami.

“Since 2021, we’ve made 14 million books, documents, microfiche, records—you name it—discoverable and accessible in over 100 languages,” Cami said.

As AI technology advanced this year, Internet Archive engineers piloted a metadata extractor, a tool that automatically pulls key data elements from digitized books. This extra information helps librarians match the digitized book to other cataloged records, beginning to resolve the backlog of books with limited metadata in the Archive’s collection. AI is also being leveraged to assist in writing descriptions of magazines and newspapers, reducing the time from 40 to 10 minutes per item.

“Because of AI, we’ve been able to create new tools to streamline the workflows of our librarians and the data staff, and make our materials easier to discover, and work with patrons and researchers,” Cami said. “With new AI capabilities being announced and made available at a breakneck rate, new ideas of projects are constantly being added.”

Jamie Joyce & AI hackathon participants.

A recent Internet Archive hackathon explored the risks and opportunities of AI by using the technology itself to generate content, said Jamie Joyce, project lead with the organization’s Democracy’s Library project. One of the hackathon volunteers created an autonomous research agent to crawl the web and identify claims related to AI. With a prompt-based model, the machine was able to generate nearly 23,000 claims from 500 references. The information could be the basis for creating economic, environmental and other arguments about the use of AI technology. Joyce invited others to get involved in future hackathons as the Internet Archive continues to expand its AI potential.

Peter Wang, CEO and co-founder at Anaconda, said interesting kinds of people and communities have emerged around cultures of sharing. For example, those who participate in the DWeb community are often both humanists and technologists, he said, with an understanding about the importance of reducing barriers to information for the future of humanity. Wang said rather than a scarcity mindset, he embraces an abundant approach to knowledge sharing and applying community values to technology solutions.

Peter Wang, CEO and co-founder at Anaconda.

“With information, knowledge and open-source software, if I make a project, I share it with someone else, they’re more likely to find a bug,” he said. “They might improve the documentation a little bit. They might adapt it for a novel use case that I can then benefit from. Sharing increases value.”

The Internet Archive’s Joy Chesbrough, director of philanthropy, closed the program by expressing appreciation for those who have supported the digital library, especially in these precarious times.

“We are one community tied together by the internet, this connected web of knowledge sharing. We have a commitment to an inclusive and open internet, where there are many winners, and where ethical approaches to genuine AI research are supported,” she said. “The real solution lies in our deep human connection. It inspires the most amazing acts of generosity and humanity.”

***

If you value the Internet Archive and our mission to provide “Universal Access to All Knowledge,” please consider making a donation today.

Doors Open to Richmond Facility for Behind-the-Scenes Look at the Donation, Digitization and Preservation Process

The Physical Archive in Richmond, California, was buzzing with activity the evening of October 11 as people gathered for a peek at how donations of books, film, and media of all kinds are preserved.

Some guests were long-time fans and others had recently donated or were considering giving their treasured items. Many shared a curiosity about how the Internet Archive operates the digital side of the research library.

“I’m a big believer in libraries—and this is one of the weirdest, coolest libraries,” said Jeremy Guillory of Oakland, California, as he toured the buildings and listened to stories behind the many donations on display.

Brewster Kahle, founder and digital librarian of the Internet Archive, gives a tour of the Physical Archive.

Curated collections from individuals included books from Stevanne “Dr. Toy” Auerbach, a pioneering mass media toy reviewer and early childhood studies author. There was also a set of rare dinosaur books and years of the Laugh Makers, a journal about magic and clowning.

Some large institutions, such as the Claremont School of Theology, donated papyrus fragments from ancient Egypt. Among the eight shipping containers of items from the Graduate Theological Union was a children’s hymnal written in Chinese from 1950.

“We get to explore and make available things that may not be able to be seen otherwise,” said Caslon Kahle, a donation coordinator, speaking to visitors at the event. “It’s important to have this historical record preserved for the public.”

Caslon Kahle gives a tour of the Physical Archive.

As they toured the facility, guests learned about the meticulous steps taken to sort materials (avoiding duplication), scan books (by people, turning one page at a time) and preserve fragile film (in a high-tech lab). Many expressed an appreciation for the vast and eclectic collections.

“I think it’s super awesome—all the knowledge in one place,” said Rachel Katz of Berkeley, California, who uses the Wayback Machine in their work at a nonprofit organization, researching the historic record of health equity, racial justice and environmental issues. “I don’t think I had thought about the political aspect—that when people want power they destroy knowledge, and library preservation is a hedge against that.”

Daniel Toman came to the event after he’d contributed items when his grandfather, a big amateur radio enthusiast, passed away a few years ago. “He had a bunch of equipment, catalogs and books around the house that nobody knew what to do with,” said Toman, who lives in San Francisco. “I told my family about [the Internet Archive] and they were all interested in donating some of his materials.”

Digitization manager Elizabeth MacLeod shows off an image captured from the Internet Archive’s Scribe digitization equipment.

Larry and Ann Byler drove from Sunnyvale, California, to get a first-hand look at the physical archive as they decide what to do with their books, records (78s, LPs, 45s), cassette tapes and home movies that they’ve accumulated over the years.

Ann, 81, said some of their film collection includes black-and-white images of trains that go back to the 1940s. She likes the idea that the Internet Archive could digitize the films at a high resolution.

“I want to get them out of the house—somewhere besides the trash bin,” said Larry, a retired computer programmer, of his wall of media items. “I have this ingrained abhorrence for throwing stuff away.”

At the event, noted film archivist Rick Prelinger provided guests with an inside look at preserving vintage film. “The process is not simple, but it’s achievable when you have resources, and we’re fortunate with the generosity of the Internet Archive that we have resources,” he said.

Kate Dollenmayer demos film digitization and preservation.

Linda Brettlen, an architect from Los Angeles, said she became familiar with the Archive through her daughter, who uses the collection when looking for primary sources in her documentary filmmaking. Brettlen has become a fan herself, particularly of the collection of old postcards of L.A. buildings that no longer exist.

“I love that it’s the best use of the Internet,” she said of the Internet Archive at the event. “This is a positive beacon.”

What Happened at the Virtual Library Leaders Forum?

The Internet Archive team, its partners, and enthusiasts recently shared updates on how the organization is empowering research, ensuring preservation of vital materials, and extending access to knowledge to a growing number of grateful users.

The 2023 Library Leaders Forum, held virtually Oct. 4, featured snapshots of the many activities the organization is supporting on a global scale. Together, the efforts are making a difference in the lives of students, scholars, educators, entrepreneurs, journalists, public servants — anyone who needs trusted information without barriers.

“It’s important for us to recognize that the Internet Archive is a library. It’s a research library in the role that it plays, in the way that it works,” said Brewster Kahle, founder of the Internet Archive.

Watch the 2023 Library Leaders Forum:

With the rise of misinformation and new artificial intelligence technologies, reliable, digital information is needed more than ever, he said.  

“This is going to be a challenging time in the United States when all of our institutions — the press, the election system, and libraries — are going to be tested,” Kahle said. “It’s time for us to make sure we stand up tall and be as useful to people in the United States and to people around the world who are having some of the same issues.”

To provide citizens everywhere with free access to government data, documents, and records, the Archive launched Democracy’s Library last year. The collection now has 889,000 government publications, with many more items donated but yet to be organized, said the Archive’s Jamie Joyce at the forum. The goal is to digitize municipal, provincial, state and federal documents, along with datasets, research, records, publications, and microfiche so they are searchable and accessible.

The Archive is taking a leadership role in harnessing the power of AI to make its information easier for users to find, Kahle added. It is also preserving state television newscasts from Russia and Iran, along with translations, to allow researchers to track trends in coverage.

Collections as data

Thomas Padilla, deputy director of data archiving and data services at the Internet Archive, reported on a project that examines how libraries can support responsible use of collections as data. Working in partnership with Iowa State University, University of Pennsylvania, and James Madison University, it is a community development effort for libraries, archives, museums and galleries to help researchers use new technology (text and data mining, machine learning) while also mitigating potential harm that can be generated by the process.

Through the effort, the Archive gave grants to 12 research libraries and cultural heritage organizations to explore questions around collections as data, Padilla said. As it became apparent that others around the world were grappling with similar issues, the project convened representatives of 60 organizations from 18 countries earlier this year in Canada. The group agreed on core principles (The Vancouver Statement on Collections-As-Data) to use when providing machine-actionable collection data to researchers. Next, the project expects to issue a roadmap for the broader international community in this space, Padilla said.

Helping libraries help publishers

The recent forum also featured digitization managers from the Internet Archive who are collaborating with partner libraries, including Tim Bigelow, Sophie Flynn-Piercy, Elizabeth MacLead, Andrea Mills and Jeff Sharpe. These librarians are at institutions big and small, from the University of North Carolina at Chapel Hill to the Wellcome Trust in London, working with teams of professionally trained technicians to digitize collections.

One of those partnerships is taking an exciting new direction. The Boston Public Library’s partnership with the Archive began in 2007. Over the years, the team has completed digitization of the John Adams presidential library, Shakespeare’s First Folio (his 36 plays published in 1623), more than 17,000 government documents and the Houghton Mifflin trade book archival collection, according to Bigelow, the Northeast Regional digitization manager for the Archive.

The Houghton Mifflin collection comprises 20,000 titles dating back to 1832, including some of the best known works in American fiction and children’s literature, such as books by Ralph Waldo Emerson and the Curious George series. The publisher gave BPL the entire physical collection for preservation (90% of which was out of print) and continues to add new titles as they are published. With the formal agreement of Houghton Mifflin, BPL and the Archive have been working together since 2017 to digitize every book. Those in the public domain are completely readable and downloadable; those still in copyright are available through controlled digital lending (CDL).

Lawsuit updates

As in Boston, many libraries have embraced CDL. However, commercial publishers have challenged the practice.

Lila Bailey, senior policy counsel for the Archive, provided an update at the forum on the Hachette v. Internet Archive lawsuit, in which the court ruled in favor of the publishers in limiting the use of CDL. The Archive filed an appeal in September. Bailey encouraged supporters to consider filing amicus briefs when the appellate court reviews the Archive’s case.

For the Internet Archive—and libraries everywhere—to continue their work, the Archive is advocating for a legal infrastructure that ensures libraries can collect digital materials, preserve those materials in different formats, lend digital materials, and cooperate with other libraries.

“In our evolving digital society, will new technologies serve the public good, or only corporate interests?” Bailey asked in her remarks at the forum. “Libraries are on the front line of the fight to decide this question in favor of the public good. In order to maintain our age-old role as guardians of knowledge, we need our rights to own, lend and preserve books, as we all live more and more of our lives online.”

Book Talk: The Internet Con by Cory Doctorow

Join us for a virtual book talk with author Cory Doctorow about THE INTERNET CON, the disassembly manual we need to take back our internet.

REGISTER NOW

When the tech platforms promised a future of “connection,” they were lying. They said their “walled gardens” would keep us safe, but those were prison walls.

The platforms locked us into their systems and made us easy pickings, ripe for extraction. Twitter, Facebook and other Big Tech platforms are hard to leave by design. They hold hostage the people we love, the communities that matter to us, the audiences and customers we rely on. The impossibility of staying connected to these people after you delete your account has nothing to do with technological limitations: it’s a business strategy in service to commodifying your personal life and relationships.

We can – we must – dismantle the tech platforms. In The Internet Con, Cory Doctorow explains how to seize the means of computation, by forcing Silicon Valley to do the thing it fears most: interoperate. Interoperability will tear down the walls between technologies, allowing users to leave platforms, remix their media, and reconfigure their devices without corporate permission.

Interoperability is the only route to the rapid and enduring annihilation of the platforms. The Internet Con is the disassembly manual we need to take back our internet.

ABOUT THE AUTHOR
CORY DOCTOROW is a science fiction author, activist and journalist. He is the author of many books, most recently RADICALIZED and WALKAWAY, science fiction for adults; HOW TO DESTROY SURVEILLANCE CAPITALISM, nonfiction about monopoly and conspiracy; IN REAL LIFE, a graphic novel; and the picture book POESY THE MONSTER SLAYER. His latest book is ATTACK SURFACE, a standalone adult sequel to LITTLE BROTHER. In 2020, he was inducted into the Canadian Science Fiction and Fantasy Hall of Fame. He works for the Electronic Frontier Foundation, is an MIT Media Lab Research Affiliate, a Visiting Professor of Computer Science at Open University, and a Visiting Professor of Practice at the University of North Carolina’s School of Library and Information Science, and he co-founded the UK Open Rights Group.

Book Talk: The Internet Con by Cory Doctorow
Tuesday, October 31 @ 10am PT / 1pm ET
Register now for the virtual discussion!