
What Does the Blockbuster Antitrust Trial Against Penguin Random House Mean for the Future of Libraries?

The publishing industry is large and powerful: by some accounts, it generates nearly $100 billion in revenue worldwide. The United States Department of Justice has accused big publishers of abusing that power in the past by conspiring with each other to raise the price of e-books. More recently, Penguin Random House has been in the legal crosshairs for an alleged abuse of power, as the Justice Department sues to stop its proposed (and allegedly anticompetitive) acquisition of Simon & Schuster.

What would more concentrated power in the publishing industry mean for libraries? In recent years, publishers have blamed libraries for all manner of ills (claiming that they unfairly cannibalize sales, among other things) to justify the imposition of increasingly expensive licensing models. But as testimony in the Justice Department lawsuit has confirmed, the publishing industry isn’t the least bit ill: it’s “thriving,” with years of double-digit growth. And although the economics of the publishing industry was examined at trial in excruciating detail, the supposed threat of library lending was nowhere to be found; libraries weren’t mentioned at all.

What of the authors? The publishing industry often claims that its actions are necessary for the good of authors, but this case does not support such a claim. The Authors Guild has publicly opposed the merger, expressing its own concern about the extraordinary concentration of power in the publishing industry and how it could harm emerging and mid-list authors. Meanwhile, at trial, we learned that the vast majority of all published books are of this sort, selling very few copies. Of course, libraries are one of the few markets for such titles: buying them, preserving them, and ensuring they remain publicly available after their commercial life is over. Unfortunately, as the trial made abundantly clear (featuring, as it did, the CEO of Penguin Random House bragging about cutting author compensation for e-books), such matters are not high on the publisher’s priority list.

So what does this portend for the future of libraries? While the outcome of the trial remains unclear, the Association of American Publishers’ view of libraries could not be clearer: “Libraries are an important part of the copyright ecosystem as authorized distributors,” they recently said. That is the world the AAP hopes for: one where our public interest institutions, and our library professionals, are little more than “authorized distributors” of whatever is most profitable for the publishers. It should be no surprise, then, that libraries remain deeply concerned that the future envisioned by these publishers is in nobody’s interest but their own.

Internet Archive Opposes Publishers in Federal Lawsuit

On Friday, September 2, we filed a brief in opposition to the four publishers that sued Internet Archive in June 2020: Hachette Book Group, HarperCollins Publishers, John Wiley & Sons, and Penguin Random House. This is the second of three briefs from us that will help the Court decide the case.

Read: Hachette v. Internet Archive – Internet Archive’s Opposition to Motion for Summary Judgment

As many of you know, these four publishers sued the Internet Archive to try to shut down our digital lending program. The lawsuit has been ongoing for over two years now. In addition to the papers that have gone in so far, there will be one more opportunity, later this fall, for the parties to file arguments with the court. These will be the “reply” briefs. At that point, the filing of papers tends to cease. The Court will then decide whether or not it wants to hear from the parties in person–through “oral argument.” After that, the Court will make a decision on this set of briefs. That could resolve the case in its entirety, or it could lead to a trial and/or appeal. In the end, the lawsuit could take some years to resolve.

Our opposition brief responds to the arguments raised in the publishers’ motion for summary judgment. There, some of the world’s largest and most profitable publishers complained that sometimes “Americans who read an ebook use free library copies, rather than purchasing a commercial ebook.” They believe that copyright law gives them the right to control how libraries lend the books they own, and demand that libraries implement the restrictive terms and conditions that publishers prefer.

Our opposition brief explains that “[p]ublishers do not have a right to limit libraries only to inefficient lending methods, in hopes that those inefficiencies will lead frustrated library patrons to buy their own copies.” The record in this case shows that publishers have suffered no economic harm as a result of our controlled digital lending–indeed, publishers have earned record profits in recent years. “[D]igital lending of physical books costs rightsholders no more or less than, for example, lending books via a bookmobile or interlibrary loan. In each case, the books the library lends are bought and paid for, ensuring that rightsholders receive all of the financial benefits to which they are entitled.”

The future of library lending is at stake in this lawsuit. We will keep fighting to prove that copyright does not stand in the way of a library’s right to do what libraries have always done: lend the books it owns to one patron at a time.

Alexis Rossi announced as RFC Series Consulting Editor

Alexis Rossi, the Director of Media & Access at the Internet Archive, was announced yesterday as the new RFC Series Consulting Editor for the Internet Engineering Task Force (IETF).

The RFC Series contains documents that define how the Internet functions. The first RFC was published in 1969, when just a few organizations were trying to figure out how to communicate digitally. Now, 53 years later, more than 9,200 RFCs have been written by thousands of volunteers and these documents and protocols are the underpinnings of the Internet systems we use every day.

Alexis joins the IETF team to help maintain the archival quality of the RFC Series, and to provide guidance on the policies and processes for publishing these important documents. She will also continue in her role with the Internet Archive, managing the organization’s millions of digital items.

The Internet Archive’s founder, Brewster Kahle, who has his own informational RFC (RFC 1625) published in 1994 for WAIS (Wide Area Information Servers), said of Alexis’s new role, “From my own days working on WAIS, I know how important these documents have been to the development of today’s Web. I’m glad to know that someone with so much experience will be helping to keep this Series preserved.”

We wish Alexis well in her new endeavor!

Book Talk: Surveillance State, Sep 14 (in-person)

“Josh Chin and Liza Lin have given us a truly groundbreaking investigation of China’s embrace of digital surveillance. The global scope and deep detail of their account retires the notion of an ‘all-seeing’ surveillance as some future scenario; it is happening already. They will open your eyes to the astonishing intersection of data, politics, and the human body. Anyone who cares about the future of technology, of China, or of free will cannot afford to miss this.”
—Evan Osnos, The New Yorker

Join authors Josh Chin & Liza Lin for an in-person discussion on life in China’s burgeoning surveillance state. They will be joined in conversation by Xiao Qiang (Berkeley).
September 14 @ Internet Archive, 300 Funston Avenue, San Francisco
Doors open at 6:30pm, discussion starts at 7pm.

People living in democracies have for decades drawn comfort from the notion that their form of government, for all its flaws, is the best history has managed to produce. In SURVEILLANCE STATE: Inside China’s Quest to Launch a New Era of Social Control (St. Martin’s Press; September 6, 2022), award-winning journalists Josh Chin and Liza Lin (Wall Street Journal) document with startling detail how China’s Communist Party is striving for something new: a political model that shapes the will of the people not through the ballot box but through the sophisticated—and often brutal—harnessing of data.


Registration is free for the in-person event.

Purchase a copy of Surveillance State at registration to be signed by the authors at the event. You can also purchase unsigned copies from The Booksmith, our local bookshop in the historic Haight-Ashbury neighborhood, to be delivered to you, or from your own local bookstore.

Book Talk: Surveillance State
Authors Josh Chin and Liza Lin
September 14 @ 7pm PT
IN-PERSON @ the Internet Archive, 300 Funston Avenue, San Francisco
Registration is required! Register now

Introducing the 2022 DWeb Fellows

A discussion session from DWeb Camp 2019 led by Fellows.

How do we ensure that the decentralized web fulfills its potential to create a better web for all? That the technologies, organizations, and approaches that gain traction and succeed (by any measure) uphold the security, privacy, and self-determination of everyone, especially those of marginalized populations who have the most to gain? 

The first step is to recognize that there are many people around the world who are already doing this work. They’re not only imagining and theorizing about a better web, but are actually creating and employing digital tools to uplift communities facing systemic inequities. They bring about justice and enable individual and collective agency, both through network technologies and by also creating and maintaining communities of care.

As the Decentralized Web (DWeb) San Francisco team, we help grow networks of solidarity among these individuals and organizations by creating opportunities for them to build relationships with each other and the DWeb community. Our Fellows from DWeb Camp 2019 strongly influenced our thinking as we defined a set of shared Principles and continued to hold virtual and in-person convenings in the three years since. 

As the Director of this year’s Fellowship program, one of my strongest hopes is that the DWeb Fellows are able to build lasting, fruitful relationships with each other and other DWeb Campers. My other hope is that the Fellows’ projects and approaches continue to shape the DWeb community overall – to connect and empower the most under-resourced, and ensure that the decentralized web we’re building truly addresses the needs of all.

The 2022 DWeb Fellowship program was made possible with generous support from the Ford Foundation, Filecoin Foundation for the Decentralized Web, Mysterium Network, donations through the Gitcoin grant challenge, and others.

2022 DWeb Fellows

Alice Yuan Zhang, Media Artist/Researcher

Andrew Chou, Digital Democracy

brandon king, Resonate.Coop

Cody Harris, Seattle Community Network

Dana Beltrán, Colnodo

Esther Jang, Seattle Community Network

Hiure Queiroz, Portal Sem Porteiras

Jaime Villarreal, May First Movement Technology

Johan Michalove, Cornell University

Kemly Camacho Jiménez, Sulá Batsú Coop

Kola Heyward-Rotimi, COMPOST Magazine

Luisa Bagope, Portal Sem Porteiras

María Alvarez Malvido, Redes por la Diversidad, Equidad y Sustentabilidad A.C

Michael Abraha, Tigray Art Collective

Ngọc Triệu, Simply Secure | Decentralization Off the Shelf

Nicolás Pace, Association for Progressive Communication

Remy Hellstern, Xinjiang Documentation Project, University of British Columbia

riley wong, Independent Researcher

Rudo Kemper, Digital Democracy

Sanketh Kumar P, COWDe.Net | Janastu Servelots | GramSevaSangh

Shafali Jain, COWDe.Net | Janastu Servelots

Tania Silva, Coolab

T B Dinesh, Janastu Servelots

Vaipunu Ian Tairea, Project Sunrise | Tai Collective

Ying Tong Lai, Halo2 | ZCash

Launching Legal Literacies for Text Data Mining – Cross Border (LLTDM-X)

We are excited to announce that the National Endowment for the Humanities (NEH) has awarded nearly $50,000 through its Digital Humanities Advancement Grant program to UC Berkeley Library and Internet Archive to study legal and ethical issues in cross-border text data mining research. NEH funding for the project, entitled Legal Literacies for Text Data Mining – Cross Border (LLTDM-X), will support research and analysis that addresses law and policy issues faced by U.S. digital humanities practitioners whose text data mining research and practice intersects with foreign-held or licensed content, or involves international research collaborations. LLTDM-X builds upon the Building Legal Literacies for Text Data Mining Institute (Building LLTDM), previously funded by NEH. UC Berkeley Library directed Building LLTDM, bringing together expert faculty from across the country to train 32 digital humanities researchers on how to navigate law, policy, ethics, and risk within text data mining projects (results and impacts are summarized in the white paper here).

Why is LLTDM-X needed?

Text data mining, or TDM, is an increasingly essential and widespread research approach. TDM relies on automated techniques and algorithms to extract revelatory information from large sets of unstructured or thinly structured digital content. These methodologies allow scholars to identify and analyze critical social, scientific, and literary patterns, trends, and relationships across volumes of data that would otherwise be impossible to sift through. While TDM methodologies offer great potential, they also present scholars with nettlesome law and policy challenges that can prevent them from understanding how to move forward with their research. Building LLTDM trained TDM researchers and professionals on essential principles of licensing, privacy law, ethics, and other legal literacies, thereby helping them move forward with impactful digital humanities research. Further, digital humanities research in particular is marked by collaboration across institutions and geographical boundaries. Yet U.S. practitioners encounter increasingly complex cross-border problems and must accordingly consider how they work with internationally held materials and international collaborators.
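To make the idea of automated extraction concrete, here is a minimal Python sketch of one of the simplest TDM techniques: counting term frequencies across a small corpus. The function name, the `min_length` parameter, and the three sample sentences are assumptions made for illustration; they are not part of the LLTDM-X project or any tool it uses.

```python
import re
from collections import Counter


def term_frequencies(documents, min_length=4):
    """Count how often each word appears across a collection of texts."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        # Skip very short words, which are mostly articles and prepositions.
        counts.update(w for w in words if len(w) >= min_length)
    return counts


# Hypothetical three-document corpus, for illustration only.
corpus = [
    "Libraries lend books to patrons.",
    "Patrons borrow books from libraries every day.",
    "Scholars mine books for patterns.",
]

counts = term_frequencies(corpus)
top_terms = counts.most_common(3)
```

Real TDM projects work at a far larger scale and with richer methods, but even this toy example touches the questions raised above: where the texts come from, and under what terms they may be copied and analyzed.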

How will LLTDM-X help? 

Our long-term goal is to design instructional materials and institutes to support digital humanities TDM scholars facing cross-border issues. Through a series of virtual roundtable discussions, and accompanying legal research and analyses, LLTDM-X will surface these cross-border issues and begin to distill preliminary guidance to help scholars in navigating them. After the roundtables, we will work with the law and ethics experts to create instructive case studies that reflect the types of cross-border TDM issues practitioners encounter. Case studies, guidance, and recommendations will be widely disseminated via an open access report to be published at the completion of the project. And most importantly, these resources will be used to inform our future educational offerings.

The LLTDM-X team is eager to get started. The project is co-directed by Thomas Padilla, Deputy Director, Archiving and Data Services at Internet Archive and Rachael Samberg, who leads UC Berkeley Library’s Office of Scholarly Communication Services. Stacy Reardon, Literatures and Digital Humanities Librarian, and Timothy Vollmer, Scholarly Communication and Copyright Librarian, both at UC Berkeley Library, round out the team.

We would like to thank NEH’s Office of Digital Humanities again for funding this important work. The full press release is available at UC Berkeley Library’s website. We invite you to contact us with any questions.

Why it’s Important to #OwnBooks

Here’s Max Collins, lead singer of legendary alt-rock band Eve 6, reading a book that he owns.

As you know, the Internet Archive is currently being sued by four corporate publishers. The publishers want to stop libraries from owning books. In the age of Netflix and Spotify, ownership of culture is increasingly in the hands of large corporations rather than people, artists and public institutions.

We’re fighting back by celebrating book ownership with the #OwnBooks campaign. 

It’s very easy to take part. Choose a book that you’ve owned for a long time – ideally the oldest book you own! You can also choose another media piece, such as a record, CD, or DVD. Take a photo with the book and share it on social media. Tell us how long you’ve owned your book and use the #OwnBooks hashtag.

Check out why other readers like to #OwnBooks.

You could also tell us the story of your relationship with the book – what were the circumstances in which you acquired it? Does it spark any special memories for you? If you prefer, you could make a selfie video and record yourself telling the story of the book. 

We’ll retweet your posts. Make sure to use the #OwnBooks hashtag and mention @internetarchive to help us find them.

Celebrating 20 Years of the Live Music Archive

This week, the Live Music Archive collection at the Internet Archive reaches a milestone – 20 years since the collection was started. The roots of the Live Music Archive collection are visible right in the URL – etree. Did you ever wonder what the “etree” in the URL references? In 1998, the etree music community was created to promote the online trading of lossless audio recordings of live music performances. With the advent of more widely available broadband (by 1990s standards, mind you) internet connections and the creation of lossless file compression formats (Shorten at first, followed by FLAC), the community established protocols to ensure the preservation and archiving of these original audio recordings. Preservation and archiving. The very ethos of the Internet Archive.

Early Live Music Archive logo

In July 2002, Jon Aizen, a software engineer at the Internet Archive and live music enthusiast, proposed to Brewster Kahle the idea of archiving live music recordings. Brewster was enthusiastic, and so on July 23, 2002, Jon reached out to the etree community via their email list to make an offer. The Internet Archive was offering to provide “unlimited storage, unlimited bandwidth, forever, for free” to ensure the preservation and easy distribution of these live music recordings. The reply came back: “We don’t believe you. But if you could, that would be our dream.” And we were off to the races to create the first library archive of lossless, legal, live audio recordings. The first order of business was to get explicit permission from the artists not only to preserve their recordings but also to make them easily accessible. Aizen and others started emailing bands and documenting their responses. It would be a great story if the first item in the collection were some rare Grateful Dead recording from 1968, but it is actually an unassuming Rusted Root audience recording from August 24, 2001, uploaded to the new Live Music Archive collection by Aizen on August 12, 2002. You can listen to it here. Of course, there has to be a Grateful Dead connection: the show features a guest appearance by Mark Karan, guitarist (at the time) for Ratdog, one of Bob Weir’s side projects. Perhaps the fact that it is unassuming is more in line with the goal of preservation and archiving. Preservation of all, not just the shiny fancy gem. Permission from the Grateful Dead came a little while later, through Brewster’s connection to John Perry Barlow; the two worked together on the board of the Electronic Frontier Foundation.

As the Live Music Archive was established, the etree community jumped in to help get things rolling – dedicating hundreds of hours to cataloging, uploading, and verifying recordings of shows. In those days it would take 6-12 hours to upload a show via FTP. Jon Aizen describes grabbing shows off etree’s FTP server network as well as from hard drives and other sources and uploading them to the Live Music Archive. Aizen also worked in the early days to create the curation process that enables volunteers to ensure that uploads are permitted by the artists. The Internet Archive team also worked on the “deriver” software, which would convert the lossless recordings to MP3 and other more accessible formats (which came after heated debate amongst the etree community, for many of whom the notion of lossy distribution of recordings was anathema). Today, uploading a show through the web interface takes most folks 10-20 minutes, and the show is available to the world almost immediately. There were many people involved in the early days and I’m sure we will miss some, but we’d like to thank the following notable contributors:

Alexis Rossi
Brad Leblanc
Bram Cohen
Caleb Epstein
Diana Hamilton
Greg Pope
John Dailey
Jon Aizen
Lauren Gelman
Marc Pujol
Mark Goldey
Matt Vernon
Parker Thompson
Peter Hedeman
Ryan Brase
Tom Anderson
Tom Horton
Tracey Jaquith
Tyler Huff

Brad Leblanc recalls doing all the tasks manually – validating checksums, moving files to public download areas, running derivation routines to create mp3/ogg files for streaming. Brad, Jon, and all the others were curating this new collection, bit by bit, as well as building software to automate the process. The Live Music Archive volunteers today still refer to themselves as curators. An amazing task with incredible results.
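For readers curious what validating checksums involves, here is a minimal Python sketch. It assumes a manifest in the common `md5sum` text format (one `<hash>  <filename>` pair per line) sitting next to the audio files; the function names and the manifest layout are assumptions for illustration, not a description of the tooling the Archive or etree actually used.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Return the names of files that are missing or fail their checksum."""
    manifest_path = Path(manifest_path)
    failures = []
    for line in manifest_path.read_text().splitlines():
        if not line.strip():
            continue  # ignore blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
        target = manifest_path.parent / name
        if not target.is_file() or md5_of_file(target) != expected.lower():
            failures.append(name)
    return failures
```

An empty list from `verify_manifest` means every listed file is present and matches its recorded hash, which is the kind of check that is tedious by hand and straightforward to automate.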

A grand offer followed by a positive, yet skeptical, response. And then a lot of hard work by Internet Archive staff and engineers as well as volunteers from the live music taping and trading community. For 20 years, we have kept curating, uploading about 1,000 recordings per month to the Live Music Archive, with the total now at 240,000 recordings – by far the largest collection of live music recordings in the world. We should reach 250,000 by next summer. More than 8,000 artists have given permission to have recordings of their shows archived on the Live Music Archive. Those recordings have been listened to more than 600,000,000 (yes, 600 million) times. And many of those are not even the Grateful Dead, giving visibility to artists that might otherwise have less exposure. The Grateful Dead remains the cornerstone artist of the Live Music Archive, but there are many other options on the Live Music Archive – jambands, folk singers, bluegrass, rock, pop, jazz, classical, experimental, mainstream artists, and every combination you can think of.

Beyond listening to the music, what impact has the Live Music Archive had on the artists? The recordings allow their fans to hear the shows they attended, the ones they couldn’t make it to, or the one across the country that happened yesterday. Building and fortifying a fanbase through the community of live music recordings. Not just for the fans, but the appreciation from the artists as well. One of our curators was having a conversation backstage before a show with a musician friend. It was an “in the round” type show featuring four songwriters alternating to perform their songs with the others playing or singing along. One of the other artists was on the couch trying to take a nap before the show. As soon as the conversation turned to the Live Music Archive, he popped off the couch to say, “I love the Live Music Archive! That place is great. I go there to check out music all the time.” From a nap to excitement in a second. The Live Music Archive is a resource both personally and professionally for musicians. A new musician joins the band? Send them to the Live Music Archive to check out some shows to learn how the songs are played live and how the segues occur. One of our curators recently received a text from an artist looking for a recommendation: “What is a good recent recording I can send to some musicians? I love the Tahoe and Eugene recordings from earlier this year but need something more recent.” It was certainly enough to put a smile on a curator’s face.

From trading tapes (reel to reel, cassettes, DATs) by mail months/years after the show occurred to CDs to FTP server networks to hearing the show hours after it ended on your mobile device – a transformation of a community. No longer hundreds and thousands hearing the show, but hundreds of thousands.

The Live Music Archive curators are not just archivists, but tapers and music fans themselves. Here are some suggestions from curators past and present.

From Jon Aizen:

“It’s hard to pick one, but I think Sim and Uniit at the State Theater in Ithaca in 2002 is an amazing example of the power of the Archive. If it weren’t for the Archive, this recording would be sitting on a tape somewhere, probably lost forever. This small act, never to be repeated (Sim and Uniit are friends, but not a regular act) is a moment in time perfectly captured.”

Sim Redmond and Uniit Carruyo Live at State Theater on 2002-09-14

From vanark:

“Some of my favorite recordings are from the most intimate settings – especially house concerts and in store performances. Close to the performer in a more informal environment, without a big PA or sound system. Musician, instrument (usually acoustic), microphones, and a couple dozen fans. In this recording, the in store occurred in the afternoon prior to the evening performance at a local club. JJ Grey walks to the small area in the corner of the store, sees the microphones set up in front of him and asks, ‘Whose are these?’ I raise my hand and a big smile rises across his face and I get a ‘That’s great!’ A short 5-song set promoting the newest CD. There were still hours before the evening show, so I head home and upload the show to the Archive before heading to the club. I think I had more fun at the free in store than the main event.”

JJ Grey Live at Newbury Comics – Faneuil Hall Marketplace on 2008-10-25

If you want to listen to some of the most popular recordings of all time on the Live Music Archive, here are some selections.

The most listened to item of all time. OAR has quite a following, and this one might have been embedded on their Myspace page to get 2.7 million listens:

Of A Revolution Live at Madison Square Garden on 2006-01-14

The most listened to Grateful Dead recording (no, not Cornell 1977, although that is second):

Grateful Dead Live at Robert F. Kennedy Stadium on 1973-06-10

An interesting show in the top 20 of all time, from a pizza/brewery in Asheville, NC, recorded by curator Gordon:

Patterson Hood Live at Asheville Pizza & Brewing Company on 2006-01-07

Whichever show you choose to listen to, whether it has been listened to 500,000 times or is a backyard show from last weekend listened to 50 times, it has value to someone, and that value is not measured by the number of listens. The tapers are still out there capturing the moments from artists, new and established, doing covers or originals. Capturing, archiving, preserving.

From all the listeners, artists, and tapers, thank you to the Internet Archive and etree for taking that leap of faith in 2002 and pushing it forward. Who knows where we can take it from here? Let’s keep it going! Let’s start planning that party for the 25th anniversary – who’s in?

Book Signing with Congressman Adam Schiff at the Internet Archive

Please join us for a conversation and book signing sponsored by Booksmith, Berkeley Arts & Letters and the Internet Archive.

Congressman Schiff is celebrating the paperback launch of his #1 New York Times bestselling “Midnight in Washington: How We Almost Lost Our Democracy and Still Could”.

Tuesday 8/16/22 7:30 pm

300 Funston Ave.
San Francisco, CA 94118

Adam Schiff is the United States Representative for California’s 28th Congressional District. In his role as Chairman of the House Permanent Select Committee on Intelligence, Schiff led the first impeachment of Donald J. Trump. Before he served in Congress, he worked as an Assistant U.S. Attorney in Los Angeles and as a California State Senator.

RSVP here

New additions to the Internet Archive for July 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons: 

Books – 78,091 New Items in July

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Audio Archive – 91,636 New Items in July

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

LibriVox Audiobooks – 119 New Items in July

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

78 RPMs and Cylinder Recordings – 8,888 New Items in July

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Live Music Archive – 965 New Items in July

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Movies – 135 New Items in July

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.