The Rise of DISCMASTER

A developer came to me a week ago with a project they’d been working on for over a year. The proposition of what they offered and the importance of what it would mean to historical software at Internet Archive was so compelling that within 48 hours, we’d announced it to the world.

The site is DISCMASTER.TEXTFILES.COM, and within its stacks lie multitudes of previously hidden software treasure, and a directed search engine that makes it a top-notch research tool.

More than a fascinating site, though, it represents some philosophies regarding the Archive’s stacks that are worth exploring as well.

The first thing that strikes a visitor to the site is either how strange, or how nostalgic it looks. The site is strikingly simple and references the first few years of the world wide web, when backgrounds were grey by default, and the width of the screen was almost always under 640 pixels. Same with the link colors, and use of (to the modern era) small icons next to the words and links. This is a version of the world wide web long gone.

However, underneath this simple exterior beats the heart of a powerful search engine and an astounding amount of processing that has analyzed millions of files to make them easy to interact with. If your area of research or interest is vintage/historical software, you’ve been handed a top-class tool to discover long-lost files and bring them back instantly.

A Quick Reminder about CD-ROMs

From (very roughly) 1989 through to the early 2000s, CD-ROMs (and later DVD-ROMs) were one of the primary ways to transfer heaps of software or large programs to end users. Instead of spending hours or literal days transferring software you may or may not have wanted after you received it, you could go to stores or online and purchase a plastic disc that held between 600 and 700 megabytes of information.

The potential of this, in fact, was so strong that there was an entire industry of providing databases, news summaries, and even all-digital magazines using this format. Booklets of CD-ROMs became abundant, and libraries could allow patrons to check out these discs to do research with them.

Besides these more institutional compilations, an industry rose up of companies compiling software, artwork, music and more and selling them to end users. Companies with names like Walnut Creek, Wayzata, Valusoft, and Imagemagic would have catalogs of CD-ROMs to buy. Starting out with software from bulletin board systems and gathered from FTP sites, these CD-ROMs quickly ran out of easy-to-find material to fill, and an era of “shovelware” began, allowing these products to claim “thousands of files, gigabytes of materials” while pulling from more and more out-of-date sources.

As websites, torrents and other means of transport brought the era of physical media for software to a close, the world was left with a finite, contained pile of titles that had come out on CDs. And, as luck would have it, people have been uploading those out-of-date files to the Internet Archive for years.

The Final Piece

Therefore, sitting on the Archive are tens of thousands of these CD-ROMs of the past. And for a very long time, it’s been possible to download a disc image, analyze its contents, search for useful or potentially interesting items, and then find a way to make them work again.

That last piece, in fact, is the hardest – not just knowing where the files you’re looking for are located, but to be able to browse them without a massive host of helper applications scattered to the four winds. There are dozens of archive types, dozens and maybe hundreds of multimedia formats, and, even more frustrating, archives within archives – making everything that much harder to find.

DiscMaster has fixed this.

Within the search engine is the ability to find millions of files, categorized by type or size or date or extension, and then have them presented instantly. Three decades of computer software with layers upon layers of obfuscation are brought immediately to the top.

The developer wrote applications to grind through the contents of a CD-ROM and present them with previews that wouldn’t require anything but a browser to see. This can take hours to pull out of a single CD-ROM, but the results are breathtaking.

Audio and music files play in the browser. Flash, IFF, Bitmaps, Fonts and more display in preview. Macintosh, PC, Commodore, Atari and more are presented simply, without a mandate to track down the proper utility to figure out what they are.

In other words, vintage and historical software is back from the obfuscated darkness.

In the short time that Discmaster has been online, success stories are appearing. Authors are finding shareware programs they lost track of decades ago. Original versions of software that were thought impossible to track down just pop up in the search engine. And organizations dedicated to creating catalogs of now-dormant formats are suddenly handed a thousands-of-items to-do list on a silver platter.

The Philosophy of the Support Site

The ramifications and discoveries from Discmaster are going to be coming for a very long time. Even if a researcher has only a faint memory of something they’re looking for, the search results will guide them in the right direction faster than ever before.

But beyond that, this site shows a different approach to the Internet Archive’s materials that’s worth seeing more of.

With over 100 petabytes of data, representing a mass of materials with all sorts of containers, metadata, and approaches by contributors, the Internet Archive has to be as general as possible. This generality extends to the presentation, search engine, and storage of the items.

It is a major effort to ensure the data stays secure, the metadata is searchable, and the ability to upload nearly anything results in a usable item details page.

But that’s kind of where it has to stop.

It’s asking an awful lot to both maintain an entity like this, and also design, say, a specifically-geared site for a relatively smaller set of people and needs. It can be done, but when energy and funding are limited, it’s sometimes best to stick to basics.

Discmaster shows one way it could be done. After working hard on its specific set (software from CD-ROMs), the entire site is constructed with its singular goal in mind. If it’s not obvious, the simple, almost-no-javascript and straightforward design lends itself to an entire family of browsers that run on those original machines. You’ll be able to download Amiga software through your Amiga, your Atari software to your Atari and so on. A thousand little touches and flourishes live easily on this custom experience – because it has the freedom to allow them.

Perhaps seeing Discmaster in action will encourage others to interact with the Internet Archive as a pool, a container of resources that could receive some of the powerful analysis along specific lines. If they can then be fed back to the Archive at the end, even better; but let a hundred supporting sites bloom.

Meanwhile, enjoy the history of software – it just got a lot easier to find.

A Small Addendum Regarding Emulation

After this announcement came out, a not-insignificant number of people came forward to ask some form of:

You’re the Emulation In The Browser People – will DISCMASTER allow you to emulate the programs that are found on these floppies and CD-ROMs?

The short answer is no, there are no current plans to do emulated previews.

The longer answer is that the wonderful emulation in the browser that the Internet Archive offers conceals the amount of work that needs to be done in selecting, refining, and in some cases modifying original programs to make them work. If a program requires all of Windows 3.1 installed, for example, someone went through the process of determining that, configuring the item to know to load Windows 3.1, and then adding custom settings in the item to ensure it would all boot up correctly. Often this work can be automated to a degree, but the time involved is considerable.

Multiply these issues by the dozens of platforms that are emulated, and you can see why it would be more trouble than it would be worth. Additionally, some programs just don’t make sense to be emulated – running a printer utility “in the browser” will probably just show a prompt and nothing else, as it is loaded in the background – many, many programs of the past don’t make sense without additional context.

A much more likely scenario will be DISCMASTER revealing long-lost vintage software that is so interesting and/or fun that it will get uploaded to Internet Archive separately and those configurations done to allow it to be played in the browser.

If you find interesting items among DISCMASTER’s millions, feel free to contact me, Jason Scott, or take a shot at uploading the program yourself and doing the configurations.

Community Turns Out to Celebrate Promise of Democracy’s Library

Friends and supporters of the Internet Archive gathered October 19 at the organization’s headquarters in San Francisco to celebrate the launch of Democracy’s Library.

Plans to collect government documents from around the world and make them easily accessible online were met with enthusiasm and endorsements. Speakers at the event expressed an urgency to preserve the public record, make valuable research discoverable, and keep the citizenry informed—all potential benefits of Democracy’s Library. 

“If we really succeed — and we have to succeed — then Democracy’s Library might become an inspiration for openness in areas that are becoming more and more closed,” said Internet Archive founder Brewster Kahle. 

The 10-year project aims to make freely available the massive volume of government publications (from the U.S. and other democracies), including books, guides, reports, surveys, laws and academic research results, which are all funded with taxpayer money, but often difficult to find. 

To kick off the project, Kahle announced the Internet Archive’s initial contributions to Democracy’s Library:

  • United States .gov websites collected since 2008; 
  • Crawls of the U.S. state government websites;
  • Digitized microfilm and microfiche from the U.S. Government Publishing Office, NASA and other government entities;
  • Crawls of government domains from 200 other countries;
  • 50 million government PDF documents made into text searchable information.

It will be a collaborative effort, said Kahle, calling upon others to join in the ambitious undertaking to contribute to the online collection.

The need for Democracy’s Library

“We need Democracy’s Library. The Internet Archive’s work leading this project represents a critical step in the evolution of democracy,” said Jamie Joyce, executive director of The Society Library and emcee of the program. “Archives and libraries, as they’ve always done in the past, will continue to change in their scope, scale, and capabilities to be of critical use to society, especially democratic societies. Tonight is about witnessing another transformation.”

Although there is more data available than ever before, Joyce said, society’s knowledge management system is badly broken. Misinformation is rampant, while high quality government data is buried and scattered across different federal, state and local agencies. 

Having public material consolidated, digitized and machine readable will allow journalists, activists, and others to be better informed. It will also make democracy more transparent and accountable, as well as protect the historical documents. “We will not be able to compute in the future what we do not save today,” Joyce said.

At a time when polarized politics can put information at risk, the event highlighted the need to safeguard public data.

Gretchen Gehrke, co-founder of the Environmental Data and Governance Initiative, has been working in partnership with the Internet Archive to track changes in federal environmental websites. 

“People should be able to know about environmental issues and have a say in environmental decisions,” she said. “For the last 20 years, the majority of this information has been delivered through the web, but the right to access that information through the web is not protected.”

Gehrke described how public resources and tools related to the federal Clean Power Plan, a hallmark environmental regulation of the Obama administration, were taken down from the Environmental Protection Agency’s website under President Trump’s tenure. 

“There are no policies protecting federal website information from suppression or outright censorship,” Gehrke said. “This case serves as an example of why we need Democracy’s Library to preserve and provide continued access to these critical government documents.”

When statistics are being cited in policy debates, citizens need to be able to have access to sources of claims. For example, Sharon Hammond, chief operating officer of The Society Library, said documents related to the environmental impact of California’s Diablo Canyon power plant should be easily available. There are nearly 5 different government bodies that have some role in monitoring the plant’s ecological impact, but the agencies house the reports on their own websites. 

“Finding governmental records about public policy matters should not be a barrier to becoming an informed participant in these collective decisions,” Hammond said. “When we connect evidence directly to the claims and make that information publicly accessible as a resource, we can improve the public discourse.”

Hammond said a searchable, machine readable repository of government documents, with active links and a register of relevant government agencies, will dramatically increase meaningful access to the public’s information.

An international vision

The effort is an international one, and Canada has stepped forward as an early partner.

Canada has contributed crawls by the Library and Archives Canada of all the country’s government websites, as well as digitized microfilm and books from the Canadian Research Knowledge Network, Canadiana, and the University of Toronto.

Leslie Weir, librarian and archivist of Canada, spoke in support of the initiative. 

“We know by making our collection and work of government openly accessible, we will create a more engaged community, a community that participates in elections, school board meetings, in public consultations, and yes, even and especially in protests,” Weir said. “Access is the key to understanding. And understanding is the underpinning of democracy.”

Celebrating heroes

The festivities concluded with a tribute to Carl Malamud, recipient of the 2022 Internet Archive Hero Award. Corynne McSherry, legal director of the Electronic Frontier Foundation, presented the award. “Carl has always seen what the internet could be. He has dedicated his life to building that internet,” she said. “He is a true hero.”

Malamud said government information is more than just a good idea. “It is about the law. It is about our rulebook. It is the manual on how we, as citizens, choose to run our society. We own this manual,” he said. “We cannot honor our obligations to future generations if we cannot freely read and speak and even change that rulebook.”

Malamud urged the audience to get involved to realize the vision of Democracy’s Library and guarantee universal access to human knowledge. 

“This is our moment. We must build a distributed and interoperable internet for our global village. We must make the increase and diffusion of knowledge our mutual and everlasting mission,” Malamud said. “We must seize the means of computation and share their fruits with all the people. Let us all swim together in the ocean of knowledge.”

For more on Malamud’s career and contributions, read his profile here.

Introducing Democracy’s Library

Democracies need an educated citizenry to thrive. In the 21st century, that means easy access to reliable information online for all. 

To meet that need, the Internet Archive is building Democracy’s Library—a free, open, online compendium of government research and publications from around the world.

“Governments have created an abundance of information and put it in the public domain, but it turns out the public can’t easily access it,” said Internet Archive founder Brewster Kahle, who is spearheading the effort to collect materials for the digital library. 

By having a wealth of public documents curated and searchable through a single interface, citizens will be able to leverage useful research, learn about the workings of their government, hold officials accountable, and be more informed voters. 

Too often, the best information on the internet is locked behind paywalls, said Kahle, who has helped create the world’s largest digital library.

“It’s time to turn that scarcity model upside down and build an internet based on abundance,” Kahle said. There is a need for equitable access to objective, historical information to balance the onslaught of misinformation online.  

Libraries have long played a vital role in collecting and preserving materials that can educate the public. This mission continues, but the collections need to include digital items to meet the needs of patrons of the internet generation today.

Over the next decade, the Internet Archive is committing to work with libraries, universities, and agencies everywhere to bring the government’s historical information online. It is inviting citizens, libraries, colleges, companies, and the Wikipedians of the world to unlock good information and weave it back into the Internet.

Democracy’s Library will be celebrated at the October 19 event, Building Democracy’s Library, in San Francisco and online. 

The project is part of Kahle’s vision to build a better Internet—one that keeps the public interest above private profit. It is based on an abundance model, in which data can be uncovered, unlocked and reused in new and different ways. 

“We know there’s an information flood, but it’s not necessarily all that good,” Kahle said. “It turns out the information on the Internet is not very deep. If you know a subject well, you find that the best information is buried or not even online.”

Democracy’s Library is a move to make governments’ massive investment in research and publications open to all. 

Kahle added: “Democracy’s Library is a stepping stone toward citizens who are more empowered and more engaged.”

The first steps of Democracy’s Library are available online at https://archive.org/details/democracys-library.

2022 Internet Archive Hero Award: Carl Malamud

Photo by Kirk Walter.

Carl Malamud is a man with a mission: To make public information freely available to the public.

For more than three decades, Malamud has not just talked in theory about why government materials should be online—he has taken action to digitize and upload massive amounts of data himself. He is the reason many laws and judicial opinions, corporate filings and patents, Congressional hearings and government films are at the fingertips of the American people. 

“Our democracy, particularly today, depends on an informed citizenry, with so much misinformation and disinformation,” said Malamud, 63, founder of the nonprofit organization Public.Resource.Org. “We have to learn how our government works, what our fundamental values are, and we have to communicate that with our fellow citizens.”

Malamud is a disrupter for the public good.

His effort to unleash government data behind paywalls has put him at odds with many trying to profit from dispensing public records. Yet in case after case, Malamud is winning and adding to the body of open knowledge freely available online.

In recognition of his relentless work on behalf of the public interest, Malamud has been honored with the 2022 Internet Archive Hero Award.

The annual award is given to those who have exhibited leadership in making information available for digital learners all over the world. Previous recipients have included copyright expert Michelle Wu, librarians Kanta Kapoor and Lisa Radha Vohra, the Biodiversity Heritage Library, and the Grateful Dead. Malamud’s contributions will be celebrated the evening of October 19 at the Internet Archive’s Building Democracy’s Library event.

“Carl has spent his career getting public access to the public domain, bringing government information to everyone with no restrictions,” said Internet Archive founder Brewster Kahle. “He’s been unwavering in his vision, seeing how the works of governments can be leveraged by everyone using this digital technology.”

Although Malamud is not in the civil service, Kahle said, he acts as a civil servant. He sides with advancing the public interest over corporate profits, and has been a pioneer in how to operate a nonprofit in the internet space. Malamud’s tenacity and drive are the essence of what it means to be a hero, said Kahle: “Somebody who puts themselves at risk or in harm’s way to get their vision built.”

Early work

After studying the convergence of computers and communication in college, Malamud went to Washington, D.C., to work in public policy. Malamud developed an expertise in databases, networking, and technology to broadcast audio and video over the internet. In 1993, he started the nonprofit Internet Multicasting Service and ran the first radio station on the internet out of an office in the National Press Building. (An archive of his broadcasts from 1993-95 is available here.)

One of Malamud’s early projects was putting corporate information from the U.S. Securities and Exchange Commission’s Electronic Data Gathering, Analysis, and Retrieval system (EDGAR) online. This allowed investors, journalists and citizens to download information about SEC filings for free, rather than pay a fee to a private company.

The work was funded with a grant from the National Science Foundation, and with money left over from the project, Malamud put the databases from the U.S. Patent Office online. In each instance, Malamud had to first purchase the database from the government.

“I got a grant from the American people, to buy the data from the American people, so I could give it back to the American people,” said Malamud, who often uses such plain language in his arguments for unlocking information into the public realm.

Demonstrating by doing

Getting the SEC data online was a seminal event, said Tim O’Reilly, founder of O’Reilly Media, noting he and others were inspired by Malamud’s fearless “hacktivism” approach. “It was the beginning of the open government data movement,” he said. “I’ve always called Carl an unsung hero ever since that, because he’s the guy who started it all in motion.”

Faced with pushback from entities that say it’s too hard or it will take too long to put information online, Malamud moves forward and demonstrates it can be done affordably—and the public will use it. It was Malamud who set up the first internet demonstration in the White House during Bill Clinton’s presidency. He advised the administration, and others that followed, on technology policy and identified opportunities to make government records available online—and demonstrated it’s possible.

“Carl has an unwavering commitment to the core principle that citizens should have access to the law and to government documents … and he’s establishing an important legal precedent,” said Tom Kalil, former White House aide to President Clinton and President Obama. “He’s not just a public intellectual writing op-eds, but actually getting things done.”

A passion for changing systems

Malamud has also been a prolific writer. He is the author of nine books, including “Exploring the Internet,” all composed in longhand on paper.

In the early 1990s, his writing caught the eye of John Podesta, who was working for President Clinton and figuring out how to move from paper to digital archiving.

“Carl and I had a passion for [the idea] that public records should be public and electronic records should be preserved,” said Podesta. “Carl was both a pioneer and advocate for the power of the net as a democratic tool.”

Podesta said Malamud was a force on Capitol Hill trying to shape legislation, and when Podesta started the Center for American Progress in 2003, he hired Malamud to be chief technology officer of the progressive think tank. “Carl is friendly and funny, but what really makes him effective is that he’s dogged and passionate. He wears that on his sleeve,” Podesta said. “He just gets right to the point, and I really admire that in him.”

From pushing for access to material from the Smithsonian Institution to the House of Representatives, Podesta said, Malamud’s single-handed influence is clear. “He’s really changed systems,” Podesta said. “He just won’t accept the status quo.”

Podesta said Malamud has had the most impact going right to the source of the data, trying to convince the entities to put information in the public domain.

“It’s extremely valuable in a democracy to make sure that people have not just theoretical access, but real access” to information, Podesta said. “Oftentimes, the burdens are either bureaucracy or ridiculous charges to get public documents. No one challenges that, but Carl does.”

A battle for the ages 

In 2007, Malamud started Public.Resource.Org, based in Sonoma County, California. He has 18 people on contract and numerous collaborators, and works with a dozen pro bono law firms to advance the mission of the nonprofit. The organization operates with a grant from Arcadia (a charitable trust of Lisbet Rausing and Peter Baldwin) and donations from individuals. He appeals to players across the political spectrum with a variety of tactics: writing letters, making speeches, talking to officials in person, and, when necessary, filing lawsuits to challenge claims of copyright.

Recently, Malamud had a big win with a U.S. Supreme Court case (Georgia v. Public.Resource.Org) after he posted the Official Code of Georgia and was sued for copyright violations—a decision that has had a ripple effect across the country. For nearly a decade, he’s been embroiled in a legal fight to put building, electrical and other public safety codes with the force of law online.

“I look for things that should be available and are not,” Malamud said. He then simply lays out why information should be free, with clear, defensible reasons. “You have to have a story that makes sense.”

Malamud has worked at this cause like no one else, determined to make sure the public realizes what’s at stake when powerful people are concealing the world of knowledge, said David Halperin, a Washington, D.C., attorney. Halperin was with the Clinton administration and has been counsel to Public.Resource.Org since 2012. “He puts it on them to have to explain why their special interests are more important than global progress and democracy,” he said.

Halperin said Malamud is effective because he is relentless and shares his infectious love of democracy. “And, he is willing to be the person who, when everyone else says, ‘Shut up and get along,’ says: ‘No, this still isn’t right. I’m not going to be cuddly here. It’s time for me to be the moral voice, to be the energy in the room that says, Okay, everyone else may now feel it’s time to be collegial. I feel like it’s time to be just.’”

Corynne McSherry, legal director at the Electronic Frontier Foundation, has represented Malamud in several cases and said he knows how to adjust his strategy to persuade others and be creative in his messaging.

“He tries to help people understand what it is he’s up to, because it’s not always clear to everybody,” McSherry said. “When you can’t see the world that the person is building towards, that person has to imagine it for you—and that’s the thing he does.”

Since Malamud was involved in the early days of the internet, he embraced the potential promise of the technology to open up knowledge, McSherry said.

“We live in a nation of rules, and we should have the ability to actually know what they are,” McSherry said, although for a long time those rules were only available to experts with special access. “That changed. Pulling our governmental structures and all our laws into the 21st century is not a small task, but that’s what Malamud took on.”

Drawing inspiration from history

To make his case in the court of public opinion, Malamud has used humor and tapped into his artistic side. He produced a video about making building and electrical codes open, “Show Me The Manual,” and a short movie about his philosophy, “Open Access Ninja.” He speaks at conferences and universities, tailoring his message to attorneys, government workers, students, or fellow open advocates to advance his cause. The Internet Archive hosts a collection of his videos, texts and other materials online, as well as FedFlix, which includes government films Malamud uploaded and curated.

Malamud has expanded his efforts internationally, working with organizations in India to scan government and cultural information. His Public Library of India collections on the Internet Archive are some of the most popular India resources on the net. He’s become an Indian food expert, of sorts, too, said McSherry, and often expresses his gratitude to her and other attorneys working on his behalf by gifting them with Indian spices.

Since he began working in this space, Malamud said he’s encouraged to see more forward thinking about open data. Still, barriers exist. Most often, he said, he’s up against money and control. While Malamud said he’s making inroads in the power struggle, he said it’s “sort of Whack-A-Mole” with every win followed by another challenge popping up.

When he needs a little inspiration himself, Malamud said he reads from his library of writings from early American feminists and civil rights leaders. Sometimes he quotes Martin Luther King Jr. (“Change only comes with continuous struggle”) or Gandhi (“A public worker has to learn to endure with fortitude.”)

A recurring lesson he’s gleaned from others in history who fought against the establishment: “You can, in fact, change the way the world works—but you have to be patient. It takes time.”

An Update from Hugh Halpern, Director of the U.S. Government Publishing Office

What are some of the new initiatives from the U.S. Government Publishing Office? Director Hugh Halpern offers an update, which has been incorporated into our program for tonight’s Building Democracy’s Library event.

Many thanks to Director Halpern and the U.S. Government Publishing Office for sharing this update!

The CDL Lawsuit and the Future of Libraries

https://www.loc.gov/item/2019642586/

It’s been over two years since a group of large book publishers sued the Internet Archive over our lending programs. After an expensive and lengthy discovery phase, arguments have now been fully briefed in the district court. What might we learn from the proceedings so far about how publishers see the future of libraries?

The first thing we might learn is that the publishers want controlled digital lending declared illegal. At the time the lawsuit against us was filed, much of the commentary and analysis suggested that the case was really about the National Emergency Library–our emergency pandemic lending program. But while the NEL is certainly a part of the lawsuit, it did not take center stage in the briefing. In the publishers’ request for summary judgment, for example, only a few short paragraphs–out of about forty pages of argument–were devoted to the NEL. Of all the submissions, about 99% have concerned CDL. So it seems clear that the publishers view this lawsuit as a referendum on CDL, which they claim will cause “catastrophic harm” to the publishing industry.

A second thing this lawsuit has demonstrated is that publishers will continue to sue libraries over digital practices that were long considered fair uses in the physical world–even if they are done on a non-profit basis with no measurable economic harm. In the case against us, the publishers argue that digital lending harms markets they claim to own–and that it therefore is not a fair use under copyright law–under “the common sense economic principle that users are drawn to free goods as a substitute for paid goods.” Put another way, in the digital realm, every non-fee-paying library practice harms the publishers’ economic interests as a matter of principle–regardless of libraries’ historic practices and their previously-accepted roles, let alone what tangible economic evidence shows. In the digital world, where publishers have newfound abilities to surveil and control libraries and their patrons, the publishers argue that the economic opportunities these abilities open to them trump longstanding library practices and the public interest. Thus, they sued over digital course reserves, and are now suing over digital lending, notwithstanding a “thriving” and profitable industry. What library practice will they challenge next?

For many of us, the internet promised a world where libraries and their patrons would have more and better access to high quality information. For these publishers, it’s simply an opportunity to charge more while providing less. In the CDL lawsuit, they have admitted that of the millions of books we have digitized, they themselves have only made about 33,000 available to libraries; only about 1% of what we have done, and only under restrictive and expensive license agreements. This is, they claim, the essence of their copyright rights: the ability to restrict access to information as they see fit, to further their theoretical economic interests, without regard to libraries’ traditional functions and the greater public good.

The good news is that many in the library community and beyond–including authors, small publishers, and patrons themselves–are seeing with clear eyes what is truly at stake. And they are seeing that, unfortunately, libraries and their supporters cannot just sit idly by–they will have to fight back. Indeed, that work has long since begun. In an extraordinary show of support–and recognition of what’s at stake–groups of librarians, scholars, and many others submitted friend of the court briefs in the publishers’ lawsuit against us. In these briefs, they demonstrated (among other things) the importance of libraries in the digital world. As the brief of Kenneth Crews, Kevin Smith, and the Harvard Law School Cyberlaw Clinic explained:

“To remain relevant and to continue to democratize information access, libraries must meet patrons where they are; in the present day, that means the Internet. Libraries have nurtured our democracy from its inception and have changed alongside our society–evolving from private subscription models serving only the elite to free institutions that enrich citizens without regard to race, creed, gender, or socioeconomic status. As a cornerstone of democracies, libraries will always be the site of cultural struggle and ‘a crucible for a society that is constantly moving toward a more perfect union.’”

Library Leaders Forum Recap

This year’s Library Leaders Forum kicked off on October 12 with news of promising research, digitization projects and advocacy efforts designed to best shape the library of the future.

The virtual gathering also called on participants to take action in sharing resources and promoting a variety of public interest initiatives underway in the library community.

Chris Freeland, director of Open Libraries, moderated the first event of the 2022 forum with librarians, policy experts, publishers and authors. (A complete recording of the virtual session is available here.) The second session will take place Oct. 19, live in San Francisco and via Zoom starting at 7 p.m. PT. (Registration is still open.)

Libraries have a vital role to play in educating citizens, combating misinformation and preserving materials that the public can use to hold officials accountable. To help meet those challenges, Internet Archive Founder Brewster Kahle gave a preview of a new project: Democracy’s Library. The vision is to establish a free, open, online compendium of government research and publications from around the world.

“We have the big opportunity to help inform users of the internet and bring as good information to them as possible to help them understand their world,” said Kahle, who will launch the initiative next week and invited others to join in the effort. “We need your input and partnership.”

The virtual forum covered the latest on Controlled Digital Lending (CDL), the library practice that has grown in popularity in the wake of pandemic closures, when physical collections were unavailable to the public. Freeland announced that the 90th library recently joined the Open Libraries program, which embraces CDL as the digital equivalent of traditional library lending, allowing patrons to borrow one copy at a time of a title the library owns.

As librarians look for ways of safeguarding digital books, Readium LCP was highlighted as a promising open source technology that is gaining popularity. Participants were also encouraged to spread the word about the advocacy work of the nonprofit Library Futures, and to recognize the many authors who have recently offered public support for libraries, CDL and digital ownership of books.

Lila Bailey reported on an emerging coalition of nonprofits working on a policy agenda to build a better internet centered on public interest values. A forthcoming paper will outline four digital library rights without which libraries could not function in the 21st century: the rights to collect, preserve, lend and access material. This encouraging collaboration is the result of two convenings earlier this year, including one in Washington, D.C. in July.

CDL Community of Practice

A panel at the forum discussed projects within the CDL community of practice.

Nettie Lagace of the National Information Standards Organization gave an update on an initiative, funded by the Mellon Foundation, to create a consensus framework and recommendations on CDL. Working groups are focused now on considering digital objects, circulation and reserves, interlibrary loans and asset sharing. Public comments on the draft will be welcome in the coming months, with a final document likely released next summer.

Amanda Wakaruk, a copyright and scholarly communications librarian at the University of Alberta, announced a new paper exploring the legal considerations of CDL for Canadian libraries. She is one of the co-authors on the research, along with others in the Canadian Federation of Library Associations. The preprint is available now and the final paper will be published soon in the journal Partnership: The Canadian Journal of Library and Information Practice and Research.

Working with Project ReShare, the Boston Library Consortium is leveraging CDL as a mechanism for interlibrary loan. “BLC really believes that CDL is an extension of existing resource sharing practices, both in the legal sense–the same protections and opportunities afforded to interlibrary loan also apply to CDL,” said Charlie Bartow, executive director, “but, also in a services sense–that existing resource sharing systems and practices can be readily adapted to include CDL.”

Also speaking in the session was Caltech’s Mike Hucka. He described efforts on his campus to provide students with learning materials when the pandemic hit by creating a simple model they named the Digital Borrowing System (DIBS).

In Canada, a large digitization project is underway at the University of Toronto, where 40,000 titles in the library’s government collection are being scanned and made available online for easier public access.

Take action

In the final segment, Freeland announced that Carl Malamud is the recipient of the 2022 Internet Archive Hero Award for his dedication in making government information accessible to all. Malamud will receive the Hero Award onstage at next week’s evening celebration, “Building Democracy’s Library.”

Freeland concluded the event with a final call to action: to join the #OwnBooks campaign. People are encouraged to take a photo of themselves holding a book they own that has special meaning, perhaps something that has influenced their career path or has sentimental value. As the Internet Archive fights for the right for libraries to own books, this is a chance to bring attention to the issue and build public support.

Internet Archive to Honor Carl Malamud with 2022 Hero Award

Carl Malamud, founder of Public.Resource.Org and a champion for making government information accessible to all, will receive the 2022 Internet Archive Hero Award. He will be presented the award at next week’s evening celebration, “Building Democracy’s Library.”

The Internet Archive Hero Award is an annual award that recognizes those who have exhibited leadership in making information available for digital learners all over the world. Previous recipients have included librarians Kanta Kapoor and Lisa Radha Vohra, copyright expert Michelle Wu, the Biodiversity Heritage Library, and the Grateful Dead.

This year, the Internet Archive is honoring Carl as a tireless advocate for free access to government information. Some highlights of his work include: 

  • In the early days of the internet, Carl was a pioneer in pushing for public materials to be available online. Over three decades, he has digitized and uploaded thousands of documents, including Congressional hearings and government films, and has worked with the executive branch to shape public policy on information sharing.
  • He is to thank for EDGAR (Electronic Data Gathering, Analysis, and Retrieval system) Online, the free Securities and Exchange Commission database of corporate information, and for putting the database of U.S. patents on the internet.
  • Carl is relentless in his ongoing quest to have detailed codes for buildings, product safety, and infrastructure available to the public on the internet.
  • He founded Public.Resource.Org, a nonprofit based in California, in 2007. Several contractors and pro-bono attorneys work with him to unleash public information from behind paywalls, which sometimes lands him in court to defend his actions, all done in the name of the public good.
  • Carl is known as a dedicated, passionate, principled individual whose creative strategies—and, at times, dose of humor and flair—have fueled his success in opening up access to public knowledge.

Carl has been a supporter of the Internet Archive since its inception. Much of his work appears in the Internet Archive collection, including his book, “Exploring the Internet,” a movie, “Open Access Ninja,” about his philosophy with Public.Resource.Org, and a video, “Show Me the Manual,” about making building and electrical codes available.

Join with us in celebrating Carl at Building Democracy’s Library on October 19.  Register now

Stay tuned for a full profile on Carl’s work and impact next week here on the Internet Archive blog.

Introducing the COVID-19 Web Archive

We are pleased to announce that the COVID-19 Web Archive is now available! As the COVID-19 pandemic emerged in early 2020, librarians, archivists, and others with interest in preserving cultural heritage began documenting the personal, cultural, and societal impact of the global pandemic on their communities. These efforts included creating archival collections preserving physical, digital, and web-based records and information for use by students, scholars, and citizens. In response to this immediate need for archiving resources by both libraries and memory institutions, the Internet Archive’s Archive-It service launched a COVID-19 Web Archiving Special Campaign in April 2020 providing free and subsidized tools, training, and community support to institutions and local efforts to preserve web-published materials documenting the COVID-19 pandemic.

The COVID-19 Web Archive builds on this curatorial work to gather together more than 160 web archive collections created by more than 125 libraries, archives, and cultural heritage organizations into a shared access portal built and maintained by the Internet Archive. The COVID-19 Web Archive currently totals nearly 90 terabytes of archived data composed of over 1.5 billion webpages and allows for full text, metadata, and media search within individual collections and across the entire archive. The archive will be continuously updated over time. If you have a collection you’d like to include in the portal, please contact us at covidwebarchive@archive.org.

Collections document the pandemic from a number of different perspectives, including:

  • Athens Regional Library System’s Athens, Georgia Area COVID-19 Response collection, which highlights “the local response to the coronavirus (COVID-19) pandemic in Athens, Georgia. Included are communications from Athens-Clarke County government, communications from Clarke County School District, fundraisers for local businesses, ‘Band Together’ showcases, and various other items that are related to the local response.”
  • University of British Columbia’s COVID-19, Racism, and Asian Communities collection, which documents incidents of racism against the Asian communities in Canada, related to the COVID-19 pandemic.
  • New York University’s Tamiment Wagner: NYC COVID-19 Web Activism collection, which “documents activists’ use of social media and the internet to create content, online campaigns, online actions, virtual mutual aid networks and funds to highlight, resist, and call attention to ways in which COVID-19 has impacted New York City physically, emotionally, politically, and economically.”
  • Pennsylvania Horticultural Society’s COVID-19 Collection, “focus[ed] on the Pennsylvania Horticultural Society’s programmatic COVID-19 response via #GrowTogetherPHS, a campaign to engage our audiences in gardening at home.”

The browsing and searching capabilities available on the COVID-19 Web Archive website will soon be augmented by the availability of public datasets, as well as a series of in-person and virtual data analysis workshops which will facilitate a myriad of potential avenues for research use of web archives. A number of research projects and use cases for COVID-19-related web archives have already emerged from the work of ARCH (Archives Research Compute Hub) cohort program members in 2021-2022.

If you are interested in learning more about the COVID-19 Web Archive and associated research opportunities, we are holding an informational webinar on Thursday, October 27 at 11am PT. The session will be recorded and made publicly available, but we encourage you to register here to attend the live webinar.

The COVID-19 Web Archive was made possible with generous support from the Institute of Museum and Library Services (IMLS) as part of their American Rescue Plan grant program.

New eBook Protection Software Gaining Popularity Among Publishers and Libraries

A new digital rights management (DRM) technology that is open source—and embraced by publishers—is gaining traction in the library eBook world. 

Readium LCP was developed five years ago to protect digital files from unauthorized distribution. Unlike proprietary platforms, the technology is open to anyone who wants to look inside the codebase and make improvements. It is a promising alternative for libraries and users wanting to avoid the limitations of traditional DRM. 

“It’s important to have a decentralized, open source system for lending and vending eBooks,” said Brewster Kahle, Internet Archive founder. “LCP is a new generation of software protection that is proving popular with both libraries and publishers.” 

LCP is a flexible, vendor-neutral, low-cost solution against over-sharing of content for eBooks, as well as audiobooks. The codebase is open source with the exception of an algorithm that protects the files.

“LCP was developed in conjunction with publishers to make sure it would meet their criteria to safeguard the content of their books,” said Brenton Cheng, senior engineer at the Internet Archive. “Yet, it’s an open format, and not tied to one particular company or commercial entity. In that spirit of openness, it’s available to anyone who wants to protect their content.” 


A number of leading publishers, libraries and book distributors have adopted LCP, including:

  • HarperCollins integrated LCP into its Harlequin Plus subscription service. 
  • Academic publisher John Libbey Eurotext has adopted LCP for its 2022 publications.
  • Stockholm Public Library has incorporated LCP into its Bibblix mobile app for young readers.
  • Numilog has deployed LCP for more than 500,000 eBooks in French & English.
  • BiblioVault adopted LCP in 2021, serving more than 90 scholarly presses & 40,000 books.
  • The Palace Project has integrated LCP into its mobile apps.

Source: LCP adopters


It’s a simple system that allows readers to access eBooks and audiobooks—and does not limit the selection of titles from a single source (as with Amazon or Apple). 

It offers great freedom in the choice of a reading solution, keeps the accessibility of digital publications intact and does not leak personal data, says Laurent Le Meur, chief technology officer at EDRLab, the open source software development laboratory that develops LCP and receives funding from publishers, eBook distributors, libraries and public bodies.

With LCP’s structure, there is no need to go through a third-party source to be authorized to download a protected book. Therefore, there is no threat of personal information being compromised. LCP is interoperable by design and socially engineered to be a sustainable, nonprofit DRM solution. 

“Open source technologies like LCP protect authors and their works,” said Maria Bustillos, editor at The Brick House Cooperative, a publishing platform designed, owned and operated by journalists. “As a publisher committed to preserving traditional library rights, The Brick House looks forward to exploring the integration of LCP into our forthcoming projects.”

As a new technology, LCP is being used around the world, with Europe and Canada leading the way. For organizations working on accessibility, LCP is the natural solution they have been waiting for, said Le Meur. In 2025, the EU Accessibility Act will require all distributors of digital publications to offer accessible services, and LCP is a DRM format that complies with the mandate.

“LCP is appealing because it’s not locked,” Cheng said. “There’s a greater sense that it might last. It has more transparency and accountability because the source code is out there and available for anyone to see.”
