In our digital world, data is power. Information-hoarding businesses reign supreme, using intimidation, aggression, and force to maintain influence and control. SARAH LAMDAN brings us into the unregulated underworld of these “data cartels,” demonstrating how the entities mining, commodifying, and selling our data and informational resources perpetuate social inequalities and threaten the democratic sharing of knowledge.
What do libraries have to do with building a better internet? How would securing certain digital rights for these traditional public interest institutions help make the internet work better for everyone?
Join Public Knowledge President CHRIS LEWIS as he facilitates a conversation on these issues and the emerging Movement for a Better Internet with library and internet policy experts LILA BAILEY (Internet Archive), KATHERINE KLOSEK (Association of Research Libraries) and BRIGITTE VÉZINA (Creative Commons).
They will discuss Internet Archive’s forthcoming report “Securing Digital Rights for Libraries: Towards an Affirmative Policy Agenda for a Better Internet” along with ongoing copyright reform projects from Creative Commons and ARL.
Today’s copyright wars can seem unprecedented. Sparked by the digital revolution that has made copyright—and its violation—a part of everyday life, fights over intellectual property have pitted creators, Hollywood, and governments against consumers, pirates, Silicon Valley, and open-access advocates. But while the digital generation can be forgiven for thinking the dispute between, for example, the publishing industry and libraries is completely new, the copyright wars in fact stretch back three centuries—and their history is essential to understanding today’s battles. THE COPYRIGHT WARS—the first major trans-Atlantic history of copyright from its origins to today—tells this important story.
Democracies need an educated citizenry to thrive. In the 21st century, that means easy access to reliable information online for all.
To meet that need, the Internet Archive is building Democracy’s Library—a free, open, online compendium of government research and publications from around the world.
“Governments have created an abundance of information and put it in the public domain, but it turns out the public can’t easily access it,” said Internet Archive founder Brewster Kahle, who is spearheading the effort to collect materials for the digital library.
By having a wealth of public documents curated and searchable through a single interface, citizens will be able to leverage useful research, learn about the workings of their government, hold officials accountable, and be more informed voters.
Too often, the best information on the internet is locked behind paywalls, said Kahle, who has helped create the world’s largest digital library.
“It’s time to turn that scarcity model upside down and build an internet based on abundance,” Kahle said. There is a need for equitable access to objective, historical information to balance the onslaught of misinformation online.
Libraries have long played a vital role in collecting and preserving materials that can educate the public. This mission continues, but the collections need to include digital items to meet the needs of patrons of the internet generation today.
Over the next decade, the Internet Archive is committing to work with libraries, universities, and agencies everywhere to bring the government’s historical information online. It is inviting citizens, libraries, colleges, companies, and the Wikipedians of the world to unlock good information and weave it back into the Internet.
The project is part of Kahle’s vision to build a better Internet—one that keeps the public interest above private profit. It is based on an abundance model, in which data can be uncovered, unlocked and reused in new and different ways.
“We know there’s an information flood, but it’s not necessarily all that good,” Kahle said. “It turns out the information on the Internet is not very deep. If you know a subject well, you find that the best information is buried or not even online.”
Democracy’s Library is a move to make governments’ massive investment in research and publications open to all.
Kahle added: “Democracy’s Library is a stepping stone toward citizens who are more empowered and more engaged.”
Carl Malamud is a man with a mission: To make public information freely available to the public.
For more than three decades, Malamud has not just talked in theory about why government materials should be online—he has taken action to digitize and upload massive amounts of data himself. He is the reason many laws and judicial opinions, corporate filings and patents, Congressional hearings and government films are at the fingertips of the American people.
“Our democracy, particularly today, depends on an informed citizenry, with so much misinformation and disinformation,” said Malamud, 63, founder of the nonprofit organization Public.Resource.Org. “We have to learn how our government works, what our fundamental values are, and we have to communicate that with our fellow citizens.”
Malamud is a disrupter for the public good.
His effort to unleash government data behind paywalls has put him at odds with many trying to profit from dispensing public records. Yet in case after case, Malamud is winning and adding to the body of open knowledge freely available online.
In recognition of his relentless work on behalf of the public interest, Malamud has been honored with the 2022 Internet Archive Hero Award.
“Carl has spent his career getting public access to the public domain, bringing government information to everyone with no restrictions,” said Internet Archive founder Brewster Kahle. “He’s been unwavering in his vision, seeing how the works of governments can be leveraged by everyone using this digital technology.”
Although Malamud is not in the civil service, Kahle said, he acts as a civil servant. He sides with advancing the public interest over corporate profits, and has been a pioneer in how to operate a nonprofit in the internet space. Malamud’s tenacity and drive are at the essence of what it means to be a hero, said Kahle: “Somebody who puts themselves at risk or in harm’s way to get their vision built.”
After studying the convergence of computers and communication in college, Malamud went to Washington, D.C., to work in public policy. Malamud developed an expertise in databases, networking, and technology to broadcast audio and video over the internet. In 1993, he started the nonprofit Internet Multicasting Service and ran the first radio station on the internet out of an office in the National Press Building. (An archive of his broadcasts from 1993-95 is available here.)
One of Malamud’s early projects was putting corporate information from the U.S. Securities and Exchange Commission’s Electronic Data Gathering, Analysis, and Retrieval system (EDGAR) online. This allowed investors, journalists and citizens to download information about SEC filings for free, rather than pay a fee to a private company.
The work was funded with a grant from the National Science Foundation, and with money left over from the project, Malamud put the databases from the U.S. Patent Office online. In each instance, Malamud first had to purchase the database.
“I got a grant from the American people, to buy the data from the American people, so I could give it back to the American people,” said Malamud, who often uses such plain language in his arguments for unlocking information into the public realm.
Demonstrating by doing
Getting the SEC data online was a seminal event, said Tim O’Reilly, founder of O’Reilly Media, noting he and others were inspired by Malamud’s fearless “hacktivism” approach. “It was the beginning of the open government data movement,” he said. “I’ve always called Carl an unsung hero ever since that, because he’s the guy who started it all in motion.”
Faced with pushback from entities that say it’s too hard or it will take too long to put information online, Malamud moves forward and demonstrates it can be done affordably—and the public will use it. It was Malamud who set up the first internet demonstration in the White House during Bill Clinton’s presidency. He advised the administration, and others that followed, on technology policy and identified opportunities to make government records available online—and demonstrated it’s possible.
“Carl has an unwavering commitment to the core principle that citizens should have access to the law and to government documents … and he’s establishing an important legal precedent,” said Tom Kalil, former White House aide to President Clinton and President Obama. “He’s not just a public intellectual writing op-eds, but actually getting things done.”
A passion for changing systems
Malamud has also been a prolific writer. He is the author of nine books, including “Exploring the Internet,” all composed in longhand on paper.
In the early 1990s, his writing caught the eye of John Podesta, who was working for President Clinton and figuring out how to move from paper to digital archiving.
“Carl and I had a passion for [the idea] that public records should be public and electronic records should be preserved,” said Podesta. “Carl was both a pioneer and advocate for the power of the net as a democratic tool.”
Podesta said Malamud was a force on Capitol Hill trying to shape legislation. When Podesta started the Center for American Progress in 2003, he hired Malamud to be chief technology officer of the progressive think tank. “Carl is friendly and funny, but what really makes him effective is that he’s dogged and passionate. He wears that on his sleeve,” Podesta said. “He just gets right to the point, and I really admire that in him.”
From pushing for access to material from the Smithsonian Institution to the House of Representatives, Podesta said Malamud’s single-handed influence is clear. “He’s really changed systems,” Podesta said. “He just won’t accept the status quo.”
Podesta said Malamud has had the most impact going right to the source of the data, trying to convince the entities to put information in the public domain.
“It’s extremely valuable in a democracy to make sure that people have not just theoretical access, but real access,” to information, Podesta said. “Oftentimes, the burdens are either bureaucracy or ridiculous charges to get public documents. No one challenges that, but Carl does.”
A battle for the ages
In 2007, Malamud started Public.Resource.Org, based in Sonoma County, California. He has 18 people on contract and numerous collaborators, and works with a dozen pro bono law firms to advance the mission of the nonprofit. The organization operates with a grant from Arcadia (a charitable trust of Lisbet Rausing and Peter Baldwin) and donations from individuals. He appeals to players across the political spectrum with a variety of tactics: writing letters, making speeches, talking to officials in person, and, when necessary, filing lawsuits to challenge claims of copyright.
Recently, Malamud had a big win with a U.S. Supreme Court case (Georgia v. Public.Resource.Org) after he posted the Official Code of Georgia and was sued for copyright violations—a decision that has had a ripple effect across the country. For nearly a decade, he’s been embroiled in a legal fight to put building, electrical and other public safety codes with the force of law online.
“I look for things that should be available and are not,” Malamud said. He then simply lays out, with clear, defensible reasons, why information should be free. “You have to have a story that makes sense.”
Malamud has worked at this cause like no one else, determined to make sure the public realizes what’s at stake when powerful people are concealing the world of knowledge, said David Halperin, a Washington, D.C., attorney. Halperin was with the Clinton administration and has been counsel to Public.Resource.Org since 2012. “He puts it on them to have to explain why their special interests are more important than global progress and democracy,” he said.
Halperin said Malamud is effective because he is relentless and shares his infectious love of democracy. “And, he is willing to be the person who, when everyone else says, ‘Shut up and get along,’ says: ‘No, this still isn’t right. I’m not going to be cuddly here. It’s time for me to be the moral voice, to be the energy in the room that says, Okay, everyone else may now feel it’s time to be collegial. I feel like it’s time to be just.’”
Corynne McSherry, legal director at the Electronic Frontier Foundation, has represented Malamud in several cases and said he knows how to adjust his strategy to persuade others and be creative in his messaging.
“He tries to help people understand what it is he’s up to, because it’s not always clear to everybody,” McSherry said. “When you can’t see the world that the person is building towards, that person has to imagine it for you—and that’s the thing he does.”
Since Malamud was involved in the early days of the internet, he embraced the potential promise of the technology to open up knowledge, McSherry said.
“We live in a nation of rules, and we should have the ability to actually know what they are,” McSherry said, although for a long time those rules were only available to experts with special access. “That changed. Pulling our governmental structures and all our laws into the 21st century is not a small task, but that’s what Malamud took on.”
Drawing inspiration from history
To make his case in the court of public opinion, Malamud has used humor and tapped into his artistic side. He produced a video about making building and electrical codes open, “Show Me The Manual,” and a short movie about his philosophy, “Open Access Ninja.” He speaks at conferences and universities, tailoring his message to attorneys, government workers, students, or fellow open advocates to advance his cause. The Internet Archive hosts a collection of his videos, texts and other materials online, as well as FedFlix, which includes government films Malamud uploaded and curated.
Malamud has expanded his efforts internationally, working with organizations in India to scan government and cultural information. His Public Library of India collections on the Internet Archive are some of the most popular India resources on the net. He’s become an Indian food expert, of sorts, too, said McSherry, and often expresses his gratitude to her and other attorneys working on his behalf by gifting them with Indian spices.
Since he began working in this space, Malamud said he’s encouraged to see more forward thinking about open data. Still, barriers exist. Most often, he said, he’s up against money and control. While Malamud said he’s making inroads in the power struggle, he said it’s “sort of Whack-A-Mole” with every win followed by another challenge popping up.
When he needs a little inspiration himself, Malamud said he reads from his library of writings from early American feminists and civil rights leaders. Sometimes he quotes Martin Luther King Jr. (“Change only comes with continuous struggle”) or Gandhi (“A public worker has to learn to endure with fortitude”).
A recurring lesson he’s gleaned from others in history who had fought against the establishment: “You can, in fact, change the way the world works—but you have to be patient. It takes time.”
Carl Malamud, founder of Public.Resource.Org and a champion for making government information accessible to all, will receive the 2022 Internet Archive Hero Award. He will be presented the award at next week’s evening celebration, “Building Democracy’s Library.”
This year, the Internet Archive is honoring Carl as a tireless advocate for free access to government information. Some highlights of his work include:
In the early days of the internet, Carl was a pioneer in pushing for public materials to be available online. Over three decades, he has digitized and uploaded thousands of documents, Congressional hearings, and government films, and worked with the executive branch to shape public policy on information sharing.
He is to thank for putting EDGAR (Electronic Data Gathering, Analysis, and Retrieval system), the free Securities and Exchange Commission database of corporate information, online, and for putting the database of U.S. patents on the internet.
Carl is relentless in his ongoing quest to have detailed codes for buildings, product safety, and infrastructure available to the public on the internet.
He founded Public.Resource.Org, a California-based nonprofit, in 2007. Several contractors and pro bono attorneys work with him to unleash public information from behind paywalls, which sometimes lands him in court to defend his actions, all done in the name of the public good.
Carl is known as a dedicated, passionate, principled individual whose creative strategies—and, at times, dose of humor and flair—have fueled his success in opening up access to public knowledge.
Carl has been a supporter of the Internet Archive since its inception. Much of his work appears in the Internet Archive collection, including his book, “Exploring the Internet”; a movie about his philosophy with Public.Resource.Org, “Open Access Ninja”; and a video about making building and electrical codes available, “Show Me the Manual.”
Internet Archive’s Community Webs program is excited to announce that metadata for more than 4,800 archived websites and web collections created by 23 Community Webs member organizations is now available in the Digital Public Library of America (DPLA). This marks the first of many metadata ingests that will come over the next months and years, as additional web and digital archives are created and described by members of the program. To access Community Webs web content in DPLA, click here.
The Community Webs program was launched in 2017, and currently provides web and digital archiving training, infrastructure, services, and professional community cultivation for more than 150 public libraries and cultural heritage organizations across the country and around the world. The participating organizations have shared goals of documenting local history and community archiving, especially documenting communities and populaces traditionally excluded from the historical record. These goals dovetail nicely with DPLA’s recently launched Digital Equity Project, which aims to provide support to libraries and archives as they shift toward greater inclusion of diverse stories and voices.
Community Webs collections now available in DPLA include:
The #Syllabus collection, created by the Schomburg Center for Research in Black Culture in New York City, which “aims to web archive Black-authored and Black-related educational resources to document Black studies, movements, and experiences in the twenty-first century.”
The D.C. Punk (Web) Archive, created by People’s Archive, DC Public Library, which documents the punk and hardcore music scenes in Washington, DC.
The Covid-19 in Hennepin County collection, created by Hennepin County Library, which documents the pandemic’s impact on Minneapolis, Minnesota and the surrounding areas, is one of a dozen web collections on local impacts of the Covid-19 pandemic which are now available in DPLA.
The Internet Archive has been a DPLA content provider since 2015, primarily contributing digital materials from our many print digitizing partnerships. However, this is the first time our partners’ web collections have appeared in the DPLA. We are excited for this opportunity to add community-focused born-digital and web collections from our program partners to the already unparalleled breadth of cultural heritage collections accessible via DPLA’s portal. We think these hyperlocal archived web resources will add depth and context to DPLA’s existing national collections. Meanwhile, the Community Webs collections’ inclusion in the portal will put these materials alongside other types of digital objects and in front of a broader audience of researchers, steps that are vital to dismantling the silos that often enclose web archives.
We are grateful to be partnering with DPLA to increase access to these vital community history collections and look forward to building more integrations and furthering this collaboration in the years to come. We would like to extend special thanks to the team at DPLA for all their work making this integration possible and to the 23 Community Webs member organizations who have both built and shared their local history web content for posterity.
Since the 18th century and pre-Constitution America, libraries have been a public space, a central repository where books could be borrowed, read and returned: the long-defended democratic ideal of the public library. But new challenges like book bans and lawsuits against libraries threaten that historic role. Join Brewster Kahle for a discussion about the future of libraries at The Commonwealth Club of California, October 6 @ 5:30pm PT.
Public Library Lending: An Endangered Core Value of American Democracy?
October 6 @ 5:30pm PT
The Commonwealth Club of California
110 The Embarcadero, Toni Rembe Rock Auditorium
Register now for the in-person event (virtual attendance available)
Join us on October 19 to help inaugurate Democracy’s Library and celebrate all the different efforts happening at the Internet Archive!
Why is it that on the internet the best information is often locked behind paywalls? Brewster Kahle, founder of the Internet Archive, believes it’s time to turn that scarcity model upside down and build an internet based on abundance. Join us for an evening event where he’ll share a new project—Democracy’s Library—a free, open, online compendium of government research and publications from around the world. Why? Because democracies need an educated citizenry to thrive.
This year’s event is hybrid. We will be celebrating in-person at our main library in San Francisco, and will be livestreaming the event itself from 7pm-8pm PT so that everyone who cares about democracy around the world can join in.
We are excited to announce that the National Endowment for the Humanities (NEH) has awarded nearly $50,000 through its Digital Humanities Advancement Grant program to UC Berkeley Library and Internet Archive to study legal and ethical issues in cross-border text data mining research. NEH funding for the project, entitled Legal Literacies for Text Data Mining – Cross Border (LLTDM-X), will support research and analysis that addresses law and policy issues faced by U.S. digital humanities practitioners whose text data mining research and practice intersects with foreign-held or licensed content, or involves international research collaborations. LLTDM-X builds upon the Building Legal Literacies for Text Data Mining Institute (Building LLTDM), previously funded by NEH. UC Berkeley Library directed Building LLTDM, bringing together expert faculty from across the country to train 32 digital humanities researchers on how to navigate law, policy, ethics, and risk within text data mining projects (results and impacts are summarized in the white paper here).
Why is LLTDM-X needed?
Text data mining, or TDM, is an increasingly essential and widespread research approach. TDM relies on automated techniques and algorithms to extract revelatory information from large sets of unstructured or thinly-structured digital content. These methodologies allow scholars to identify and analyze critical social, scientific, and literary patterns, trends, and relationships across volumes of data that would otherwise be impossible to sift through. While TDM methodologies offer great potential, they also present scholars with nettlesome law and policy challenges that can prevent them from understanding how to move forward with their research. Building LLTDM trained TDM researchers and professionals on essential principles of licensing and privacy law, as well as ethics and other legal literacies, thereby helping them move forward with impactful digital humanities research. Further, digital humanities research in particular is marked by collaboration across institutions and geographical boundaries. Yet U.S. practitioners encounter increasingly complex cross-border problems and must accordingly consider how they work with internationally-held materials and international collaborators.
How will LLTDM-X help?
Our long-term goal is to design instructional materials and institutes to support digital humanities TDM scholars facing cross-border issues. Through a series of virtual roundtable discussions, and accompanying legal research and analyses, LLTDM-X will surface these cross-border issues and begin to distill preliminary guidance to help scholars in navigating them. After the roundtables, we will work with the law and ethics experts to create instructive case studies that reflect the types of cross-border TDM issues practitioners encountered. Case studies, guidance, and recommendations will be widely disseminated via an open access report to be published at the completion of the project. Most importantly, these resources will be used to inform our future educational offerings.
The LLTDM-X team is eager to get started. The project is co-directed by Thomas Padilla, Deputy Director, Archiving and Data Services at Internet Archive and Rachael Samberg, who leads UC Berkeley Library’s Office of Scholarly Communication Services. Stacy Reardon, Literatures and Digital Humanities Librarian, and Timothy Vollmer, Scholarly Communication and Copyright Librarian, both at UC Berkeley Library, round out the team.
We would like to thank NEH’s Office of Digital Humanities again for funding this important work. The full press release is available at UC Berkeley Library’s website. We invite you to contact us with any questions.
The Internet Archive has asked a federal judge to rule in our favor and end a radical lawsuit, filed by four major publishing companies, that aims to criminalize library lending.
The motion for summary judgment, filed Thursday in the U.S. District Court for the Southern District of New York by the Electronic Frontier Foundation (EFF) and Durie Tangri LLP, explains that our Controlled Digital Lending (CDL) program is a lawful fair use that preserves traditional library lending in the digital world.
The brief explains how the Internet Archive is advancing the purposes of copyright law by furthering public access to knowledge and facilitating the creation of new creative and scholarly works. The Internet Archive’s digital lending hasn’t cost the publishers one penny in revenues; in fact, concrete evidence shows that the Archive’s digital lending does not and will not harm the market for books.
Earlier today, we hosted a press conference with stakeholders in the lawsuit and the librarians and creators who will be affected by its outcome.
“Should we stop libraries from owning and lending books? No,” said Brewster Kahle, the Internet Archive’s founder and digital librarian. “We need libraries to be independent and strong, now more than ever, in a time of misinformation and challenges to democracy. That’s why we are defending the rights of libraries to serve our patrons where they are, online.”
Through CDL, the Internet Archive and other libraries make and lend out digital scans of print books in our collections, subject to strict technical controls. Each book loaned via CDL has already been bought and paid for, so authors and publishers have already been fully compensated for those books. Nonetheless, publishers Hachette, HarperCollins, Wiley, and Penguin Random House sued the Archive in 2020, claiming incorrectly that CDL violates their copyrights.
“The publishers are not seeking protection from harm to their existing rights. They are seeking a new right foreign to American copyright law: the right to control how libraries may lend the books they own,” said EFF Legal Director Corynne McSherry. “They should not succeed. The Internet Archive and the hundreds of libraries and archives that support it are not pirates or thieves. They are librarians, striving to serve their patrons online just as they have done for centuries in the brick-and-mortar world. Copyright law does not stand in the way of a library’s right to lend its books to its patrons, one at a time.”
Authors and librarians speak out in support of the Internet Archive
“In the all-consuming tide of entropy, the Internet Archive brings some measure of order and permanence to knowledge,” said author Tom Scocca. “Out past the normal circulating lifespan of a piece of writing—or past the lifespan of entire publications—the Archive preserves and maintains it.”
“The library’s practice of controlled digital lending was a lifeline at the start of the pandemic and has become an essential service and a public good since,” said Benjamin Saracco, a research and digital services faculty librarian at an academic medical and hospital library in New Jersey. “If the publishers are successful in their pursuit to shut down the Internet Archive’s lending library and stop all libraries from practicing controlled digital lending, libraries of all varieties and the communities they serve will suffer.”
Last week, Knowledge Rights 21 released a strong call to action to ensure that libraries can continue serving their centuries-old role in society of providing access to knowledge to the public. Knowledge Rights 21 is an Arcadia-funded project advocating for copyright and open access reform across Europe.
In their Position Statement on eBooks and eLending, Knowledge Rights 21 explains that government action is urgently needed because the market for eBooks now operates outside of the current copyright law that permits libraries to acquire, lend and preserve physical books. Monopolistic behavior by commercial publishers, including refusals to sell, embargoes, high prices, and restrictive licensing terms, has frustrated libraries’ ability to undertake collection development, hurting those who rely on libraries for education, research, and cultural participation.
The Position Statement demands that “governments must wake up and act now before the rights of citizens to access information and learning through libraries are eroded any further.” The Statement proposes the following clarifications in EU law:
1. The right for libraries to acquire, preserve and make a digital reproduction of an analogue and / or an electronic book / audiobook that has been made available in the market under sale or licence;
2. No more copies than have been acquired under 1 above, shall be loaned to members of the public at any one time. Libraries should have the right to lend directly to users, as well as via other libraries as part of interlibrary loan;
3. Neither contracts nor technical protection measures shall be enforceable to prevent this;
4. Any loans made under this shall require the payment of [Public Lending Right] monies by public libraries in line with existing practice with paper and or audiobooks.
The Internet Archive agrees that action on this issue is important and necessary. We are defending these principles in US court, in the lawsuit brought by four of the world’s largest publishers over our controlled digital lending program. We look forward to working with Knowledge Rights 21 and the library community “to help libraries not only to survive, but also to flourish” as the EU Court of Justice said in its landmark case supporting eBook lending by libraries.