Knowledge Rights 21 Calls for Action on Library Rights

Last week, Knowledge Rights 21 released a strong call to action to ensure that libraries can continue serving their centuries-old role in society of providing the public with access to knowledge. Knowledge Rights 21 is an Arcadia-funded project advocating for copyright and open access reform across Europe.

In their Position Statement on eBooks and eLending, Knowledge Rights 21 explains that government action is urgently needed because the market for eBooks now operates outside of the current copyright law that permits libraries to acquire, lend and preserve physical books. Monopolistic behavior by commercial publishers, including refusals to sell, embargoes, high prices, and restrictive licensing terms, has frustrated libraries’ ability to undertake collection development, hurting those who rely on libraries for education, research, and cultural participation.

The Position Statement demands that “governments must wake up and act now before the rights of citizens to access information and learning through libraries are eroded any further.” The Statement proposes the following clarifications in EU law:

1. The right for libraries to acquire, preserve and make a digital reproduction of an analogue and / or an electronic book / audiobook that has been made available in the market under sale or licence;
2. No more copies than have been acquired under 1 above, shall be loaned to members of the public at any one time. Libraries should have the right to lend directly to users, as well as via other libraries as part of interlibrary loan;
3. Neither contracts nor technical protection measures shall be enforceable to prevent this;
4. Any loans made under this shall require the payment of [Public Lending Right] monies by public libraries in line with existing practice with paper and or audiobooks.
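
The cap in point 2 can be illustrated with a minimal Python sketch. The `Title` class, its field names, and the patron IDs are hypothetical and are not drawn from the Position Statement or from any library system. The sketch only shows the idea that the number of concurrent loans never exceeds the number of copies acquired.

```python
from dataclasses import dataclass, field


@dataclass
class Title:
    """One title in a hypothetical library catalogue."""
    copies_owned: int                           # copies acquired by sale or licence
    on_loan: set = field(default_factory=set)   # patron IDs currently borrowing

    def lend(self, patron_id: str) -> bool:
        """Lend one copy only if fewer copies are out than the library owns."""
        if patron_id in self.on_loan:
            return False                        # this patron already has a copy
        if len(self.on_loan) >= self.copies_owned:
            return False                        # every owned copy is already out
        self.on_loan.add(patron_id)
        return True

    def give_back(self, patron_id: str) -> None:
        """Return a copy so that another patron can borrow it."""
        self.on_loan.discard(patron_id)


book = Title(copies_owned=2)
results = [book.lend("ana"), book.lend("ben"), book.lend("cy")]
book.give_back("ana")
results.append(book.lend("cy"))
print(results)  # [True, True, False, True]
```

The third request is refused because both owned copies are out, and it succeeds only after one copy has been returned.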

The Internet Archive agrees that action on this issue is important and necessary. We are defending these principles in US court, in the lawsuit brought by four of the world’s largest publishers over our controlled digital lending program. We look forward to working with Knowledge Rights 21 and the library community “to help libraries not only to survive, but also to flourish” as the EU Court of Justice said in its landmark case supporting eBook lending by libraries.

Recent Report from IFLA: How well did copyright laws serve libraries during COVID-19?

The short answer to this question from a report recently published by IFLA appears to be: not very well at all. The report documents a worldwide survey of 114 libraries, 83% of which said they had copyright-related challenges providing materials during pandemic-related facility closures. The report also provides direct quotes from a series of interviews of library professionals, discussing the challenges they faced and often how difficult digital access to necessary materials such as textbooks has been throughout the last two years. As one librarian from the United States explains:

Times were tough. We were scrambling and worried about so many things – including the health and safety of our students, faculty and colleagues – and trying to spin up as much as possible in the way of service. There are certainly some vendors that we personally like the interactions with, but it felt to me like the publishers saw this as an opportunity just to make more money and not really an opportunity to build stronger connections with us and our library. They offered free things for a very limited period of time.

The report is well worth reading in its 22-page entirety. You can find it here.

Preserving Pro-Democracy Books From Shuttered Hong Kong Bookstore

Albert Wan ran Bleak House Books, an independent bookstore in Hong Kong, for nearly five years, before closing it in late 2021. The changing political climate and crackdown on dissent within Hong Kong made life too uncertain for Wan, his wife and two children. 

As they were preparing to move, Wan packed a box of books at risk of being purged by the government. He brought them on a plane back to the United States in January and donated them to the Internet Archive for preservation. 

The collection includes books about the pro-democracy protests of 2019, among them some photography books and a limited-edition book of essays by young journalists who covered the events. There was also a book about the Tiananmen Square massacre, along with volumes about Hong Kong politics, culture and history, most written in Chinese.

“In Hong Kong, because the government is restricting and policing speech in a way that is even causing libraries to remove books from shelves, I thought that it would be good to digitize books about Hong Kong that might be in danger of disappearing entirely,” Wan said.

Hearing that Bleak House Books would be shutting its doors, the Internet Archive reached out and offered to digitize its remaining books. As it happens, Wan said his inventory was dwindling quickly. So, he gathered contributions from others, and along with some from his own collection, donated about thirty books and some periodicals to the Internet Archive for preservation and digitization. Wan said he was amazed at how flexible and open the Archive was in the process, assisting with shipping and scanning the materials at no cost to him. (See Hong Kong Community Collection.)

Now, Wan wants others to do the same.

“There are still titles out there that have never been digitized and might be on the radar for being purged or sort of hidden from public view,” Wan said. “The hope is that more people would contribute and donate those kinds of books to the Archive and have them digitized so that people still have access to them.”

Do you have books you’d like to donate to the Internet Archive? Learn more.

Wan said he likes how the Internet Archive operates using controlled digital lending (CDL), in which items can be borrowed one at a time, providing broad public access without infringing on the rights of the authors.

Before his family moved to Hong Kong for his wife’s university teaching job, Wan was a civil rights and criminal defense attorney in private practice. Now, they are all getting settled in Rochester, New York, where Wan plans to open another bookshop.

Memorial Day BBQ, Live Music and Lost Landscapes at the Internet Archive – Monday May 30, 2022

Calling all SF Cineastes and Archivists!

LOST LANDSCAPES is BACK!! With BBQ and live music too!

Come join the fun this Memorial Day and hang out with us at the Internet Archive.

$1 hotdogs, live music by the Traveling Wilburys Revue, then on to a screening of Prelinger Archives’ “Lost Landscapes: Earth, Fire, Air, Water: California Infrastructures”.

Date: Monday, May 30, 2022
When:
5:30 PM BBQ – 6:30 PM Live Music – 8:15 PM Film Screening
Where:
300 Funston Ave., San Francisco, CA
Cost:
$15.00

GET YOUR TICKETS HERE

Goodbye Facebook. Hello Decentralized Social Media?

The pending sale of Twitter to Elon Musk has generated a buzz about the future of social media and just who should control our data.

Wendy Hanamura, director of partnerships at the Internet Archive, moderated an online discussion on April 28, “Goodbye Facebook, Hello Decentralized Social Media?”, about the opportunities and dangers ahead. The webinar is part of a series of six workshops, “Imagining a Better Online World: Exploring the Decentralized Web.”

The session featured founders of some of the top decentralized social media networks including Jay Graber, chief executive officer of R&D project Bluesky, Matthew Hodgson, technical co-founder of Matrix, and Andre Staltz, creator of Manyverse. Unlike Twitter, Facebook or Slack, Matrix and Manyverse have no central controlling entity. Instead the peer-to-peer networks shift power to the users and protect privacy. 

If Twitter is indeed bought and people are disappointed with the changes, the speakers expressed hope that the public will consider other social networks. “A crisis of this type means that people start installing Manyverse and other alternatives,” Staltz said. “The opportunity side is clear.” Still, if other platforms are not ready during the transition period, there is some risk that users will feel stuck and not switch, he added.

Hodgson said there are reasons to be both optimistic and pessimistic about Musk purchasing Twitter. The hope is that he will use his powers for good, making it available to everybody and empowering people to block the content they don’t want to see. The risk, Hodgson said, is that with no moderation and without sufficient controls to filter content, people will be obnoxious to one another and the system will melt down. “It’s certainly got potential to be an experiment. I’m cautiously optimistic on it,” he said.

People who work in decentralized tech recognize the risk that comes when one person can control a network and act for good or bad, Graber said. “This turn of events demonstrates that social networks that are centralized can change very quickly,” she said. “Those changes can potentially disrupt or drastically alter people’s identity, relationships, and the content that they put on there over the years. This highlights the necessity for transition to a protocol-based ecosystem.” 

When a platform is user-controlled, it is resilient to disruptive change, Graber said. Decentralization enables immutability, so change is hard: it is a slow process that requires a lot of people to agree, added Staltz.

The three leaders spoke about how decentralized networks provide a sustainable alternative and are gaining traction. Unlike major players that own user data and monetize personal information, decentralized networks are controlled by users and information lives in many different places.

“Society as a whole is facing a lot of crises,” Graber said. “We have the ability to, as a collective intelligence, to investigate a lot of directions at once. But we don’t actually have the free ability to fully do this in our current social architecture…if you decentralize, you get the ability to innovate and explore many more directions at once. And all the parts get more freedom and autonomy.”

Decentralized social media is structured to change the balance of power, added Hanamura: “In this moment, we want you to know that you have the power. You can take back the power, but you have to understand it and understand your responsibility.”

The webinar was co-sponsored by DWeb and Library Futures, and presented by the Metropolitan New York Library Council (METRO).

The next event in the series, Decentralized Apps, the Metaverse and the “Next Big Thing,” will be held Thursday, May 26, from 4 to 5 p.m. EST. Register here.

Congressman Ro Khanna in conversation with Larry Lessig

Could Ro Khanna be the first Asian American President of the United States?

California Congressman Ro Khanna is a political rising star, one that some Democrats see as the future of the Party. Known both for his progressive leadership and his ability to work across the aisle, Khanna – who represents Silicon Valley – is one of the most important figures setting tech policy in our nation today.

The Internet Archive invites you to come hear Khanna speak about his vision for the future. In Dignity in the Digital Age: Making Tech Work for All of Us, Khanna offers a vision for democratizing digital innovation to build economically vibrant and inclusive communities. Instead of being subject to tech’s reshaping of our economy, Khanna offers that we must channel those powerful forces toward creating a more healthy, equal, and democratic society.

On Tuesday, May 31st, 6pm PT/9pm ET, Representative Khanna will be interviewed by professor Larry Lessig, a digital access visionary and co-founder of Creative Commons and the Free Culture movement. Lessig himself ran for President in the Democratic primaries in 2016. The Internet Archive is honored to have these two great thinkers sharing our stage, for one night only! Please join us for this exciting political conversation either virtually or in-person at the Internet Archive, 300 Funston Ave, San Francisco. 

REGISTER NOW!

A note about safety for our in-person audience: The Internet Archive is taking COVID precautions very seriously. We will be requiring proof of vaccination and masks indoors. There will be no food or beverages served (though there will be a water station). We are limiting seating in our huge, thousand-seat Great Room to only 200 people. And of course we will have our large windows and doors open to ensure good airflow. We are working hard to make sure that this event is as safe as can be! Please reserve your seats ASAP.


New additions to the Internet Archive for April 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a round up of some of the new media you might want to check out. Logging in might be required to borrow certain items. 

Notable new collections from our patrons: 

  • Chris Cromwell Rare Reel to Reel Tapes – Rare and recovered reel-to-reel tapes from a variety of sources and preserved by Chris Cromwell. 
  • 1940s Classic TV – Television from the 1940s.
  • Game Shows Archive – A collection of game shows throughout television history, involving chance, skill and luck, usually presided over by a host and providing in-show commercials.
  • Dutch Television – Television programs and videos in the Dutch language, or from the Netherlands.

Books – 50,109 New Items in April

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore.

Audio Archive – 150,224 New Items in April

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

LibriVox Audiobooks – 99 New Items in April

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

78 RPMs and Cylinder Recordings – 6,745 New Items in April

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Live Music Archive – 909 New Items in April

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Netlabels – 111 New Items in April

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Explore.

Movies – 55 New Items in April

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

Library as Laboratory Recap: Opening Television News for Deep Analysis and New Forms of Interactive Search

Watching a single episode of the evening news can be informative. Tracking trends in broadcasts over time can be fascinating. 

The Internet Archive has preserved nearly 3 million hours of U.S. local and national TV news shows and made the material open to researchers for exploration and non-consumptive computational analysis. At a webinar April 13, TV News Archive experts shared how they’ve curated the massive collection and leveraged technology so scholars, journalists and the general public can make use of the vast repository.

Roger Macdonald, founder of the TV News Archive, and Kalev Leetaru, collaborating data scientist and GDELT Project founder, spoke at the session. Chris Freeland, director of Open Libraries, served as moderator and Internet Archive founder Brewster Kahle offered opening remarks.

“Growing up in the television age, [television] is such an influential, important medium—persuasive, yet not something you can really quote,” Kahle said. “We wanted to make it so that you could quote, compare and contrast.” 

The Internet Archive built on the work of the Vanderbilt Television Archive, and the UCLA Library Broadcast NewsScape to give the public a broader “macro view,” said Kahle. The trends seen in at-scale computational analyses of news broadcasts can be used to understand the bigger picture of what is happening in the world and the lenses through which we see the world around us.

In 2012, with donations from individuals and philanthropies such as the Knight Foundation, the Archive started repurposing the closed captioning data stream required of all U.S. broadcasters into a search index. “This simple approach transformed the antiquated experience of searching for specific topics within video,” said Macdonald, who helped lead the effort. “The TV caption search enabled discovery at internet speed with the ability to simultaneously search millions of programs and have your results plotted over time, down to individual broadcasters and programs.”

Scholars and journalists were quick to embrace this opportunity, but the team kept experimenting with deeper indexing. Techniques like audio fingerprinting, Optical Character Recognition (OCR) and Computer Vision made it possible to capture visual elements of the news and improve access, Macdonald said. 

Sub-collections of political leaders’ speeches and interviews have been created, including an extensive Donald Trump Archive. Some of the Archive’s most productive advances have come from collaborating with outsiders who have requested more access to the collection than is available through the public interface, Macdonald said. With appropriate restrictions to maintain respect for broadcasters and distribution platforms, the Archive has worked with select scientists and journalists as partners to use data in the collection for more complex analyses.

Treating television as data

Treating television news as data creates vast opportunities for computational analysis, said Leetaru. Researchers can track how often words are used in the news and how that has changed over time. For instance, it’s possible to look at mentions of COVID-related words across selected news programs and see when they surged and leveled off with each wave before plummeting.
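
As a rough illustration of this kind of word-frequency tracking, here is a small Python sketch. The caption snippets, the month labels, and the `mentions_by_month` helper are invented for illustration. They do not reflect the TV News Archive’s actual data formats or tools, which are not described in detail here.

```python
import re
from collections import Counter

# Hypothetical caption snippets keyed by broadcast month (illustrative data only).
captions = [
    ("2020-02", "Officials monitor a new virus overseas"),
    ("2020-04", "COVID cases rise as covid testing expands"),
    ("2020-04", "Hospitals brace for more COVID patients"),
    ("2021-06", "Vaccinations climb while covid cases fall"),
]


def mentions_by_month(rows, term):
    """Count case-insensitive whole-word mentions of `term` per month."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    counts = Counter()
    for month, text in rows:
        counts[month] += len(pattern.findall(text))
    return dict(sorted(counts.items()))


trend = mentions_by_month(captions, "covid")
print(trend)  # {'2020-02': 0, '2020-04': 3, '2021-06': 1}
```

Plotting these counts over time would show the rise and fall of a term.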

The newly computed metadata can help provide context and assist with fact-checking efforts to combat misinformation. It can allow researchers to map the geography of television news, showing how certain parts of the world are covered more than others, Leetaru said. Through the collections, researchers have explored which presidential tweets challenging election integrity got the most exposure on the news. OCR of every frame has been used to create models of how to identify names of every “Dr.” depicted on cable TV after the outbreak of COVID-19 and calculate air time devoted to the medical doctors commenting on one of the virus variants. Reverse image lookup of images in TV news has been used to determine the source of photos and videos. Visual entity search tools can even reveal the increasing prevalence of bookshelves as backdrops during home interviews in the pandemic, as well as appearances of books by specific authors or titles. Open datasets of computed TV news metadata are available that include all visual entity and OCR detections, 10-minute interval captioning ngrams and second-by-second inventories of each broadcast cataloging whether it was “News” programming, “Advertising” programming or “Uncaptioned” (in the case of television news this is almost exclusively advertising).

From television news to digitized books and periodicals, dozens of projects rely on the collections available at archive.org for computational and bibliographic research across a large digital corpus. Data scientists, or anyone with questions about the TV News Archive, can contact info@archive.org.

Up Next

This webinar was the fourth in a series of six sessions highlighting how researchers in the humanities use the Internet Archive. The next will be about Analyzing Biodiversity Literature at Scale on April 27. Register here.

What’s in Your Smart Wallet? Keeping your Personal Data Personal

“How Decentralized Identity Drives Privacy” with Internet Archive, Metro Library Council, and Library Futures

How many passwords do you have saved, and how many of them are controlled by a large, corporate platform instead of by you? Last month’s “Keeping your Personal Data Personal: How Decentralized Identity Drives Privacy” session started with that provocative question in order to illustrate the potential of this emerging technology.

Self-sovereign identity (SSI), defined as “an idea, a movement, and a decentralized approach for establishing trust online,” sits in the middle of the stack of technologies that makes up the decentralized internet. In the words of the Decentralized Identity Resource Guide written specifically for this session, “self-sovereign identity is a system where users themselves–and not centralized platforms or services like Google, Facebook, or LinkedIn–are in control and maintain ownership of their personal information.”

Research shows that the average American has more than 150 different accounts and passwords – a number that has likely skyrocketed since the start of the pandemic. In her presentation, Wendy Hanamura, Director of Partnerships at the Internet Archive, discussed the implications of “trading privacy and security for convenience.” Hanamura drew on her recent experience at SXSW, which bundled her personal data, including medical and vaccine data, into an insecure QR code used by a corporate sponsor to verify her as a participant. In contrast, Hanamura says that the twenty-year-old concept of self-sovereign identity can disaggregate these services from corporations, empowering people to be in better control of their own data and identity through principles like control, access, transparency, and consent. While self-sovereign identity presents incredible promise as a concept, it also raises fascinating technical questions around verification and management.

For Kaliya “Identity Woman” Young, her interest in identity comes from networks of global ecology and information technology, which she has been part of for more than twenty years. In 2000, when the Internet was still nascent, she joined with a community to ask: “How can this technology best serve people, organizations, and the planet?” Underlying her work is the strong belief that people should have the right to control their own online identity with the maximum amount of flexibility and access. Using a real-life example, Young compared self-sovereign identity to a physical wallet. Like a wallet, self-sovereign identity puts users in control of what they share, and when, with no centralized ability for an issuer to tell when the pieces of information within the wallet are presented.

In contrast, the modern internet operates with a series of centralized identifier systems, such as those overseen by ICANN and IANA for domain names and IP addresses, and corporate private namespaces like Google and Facebook. Young’s research and work decentralizes this way of transmitting information through “signed portable proofs,” which come from a variety of sources rather than one centralized source. These proofs are also called verifiable credentials and have metadata, the claim itself, and a digital signature embedded for validation. All of these pieces come together in a digital wallet, verified by a digital identifier that is unique to a person. Utilizing cryptography, these identifiers would be validated by digital identity documents and registries. In this scenario, organizations like InCommon, an access management service, or even a professional licensing organization like the American Library Association can maintain lists of institutions that would be able to verify the identity or organizational affiliation of an identifier. In the end, Young emphasized a message of empowerment – in her work, self-sovereign identity is about “innovating protocols to represent people in the digital realm in ways that empower them and that they control.”
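
The three parts of a verifiable credential named above (metadata, claim, and signature) can be sketched in a few lines of Python. This is a simplified illustration and does not follow any particular standard or Young’s own tooling. The issuer name, identifier, and key are hypothetical. An HMAC with a shared secret stands in for the public-key signature that real credential systems would use, because the Python standard library has no asymmetric signing.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret for the demo. Real issuers sign with a private key,
# and verifiers check the signature with the matching public key.
ISSUER_KEY = b"demo-issuer-secret"


def _payload(metadata, claim):
    """Serialize the signed parts deterministically."""
    return json.dumps({"metadata": metadata, "claim": claim}, sort_keys=True).encode()


def issue_credential(metadata, claim):
    """Bundle metadata, claim, and a proof computed over both."""
    proof = hmac.new(ISSUER_KEY, _payload(metadata, claim), hashlib.sha256).hexdigest()
    return {"metadata": metadata, "claim": claim, "proof": proof}


def verify_credential(credential):
    """Recompute the proof and compare it in constant time."""
    expected = hmac.new(
        ISSUER_KEY,
        _payload(credential["metadata"], credential["claim"]),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(expected, credential["proof"])


credential = issue_credential(
    metadata={"issuer": "example-library-association", "type": "MembershipCredential"},
    claim={"holder": "did:example:alice", "member": True},
)
tampered = {**credential, "claim": {"holder": "did:example:mallory", "member": True}}

print(verify_credential(credential))  # True
print(verify_credential(tampered))    # False
```

Because the proof covers both the metadata and the claim, changing either one after issuance makes verification fail.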

Next, librarian Lambert Heller of Technische Informationsbibliothek (TIB) and Irene Adamski of the Berlin-based SSI firm Jolocom discussed and demonstrated their work in creating self-sovereign identity for academic conferences on a new platform called Condidi. This tool allows people running academic events to have a platform that issues digital credentials of attendance in a decentralized system. Utilizing open source and decentralized software, this system minimizes the amount of personal information that attendees need to give over to organizers while still allowing participants to track and log records of their attendance. For libraries, this kind of system is crucial – new systems like Condidi help libraries protect user privacy and open up platform innovation.

Self-sovereign identity also utilizes a new tool called a “smart wallet,” which holds one’s credentials and is controlled by the user. For example, at a conference, a user might want to tell the organizer that she is of age, but not share any other information about herself. A demo of Jolocom’s system demonstrated how this could work. In the demo, Adamski showed how a wallet could allow a person to share just the information she wants through encrypted keys in a conference situation. Jolocom also allows people to verify credentials using an encrypted wallet. According to Adamski, the best part of self-sovereign identity is that “you don’t have to share if you don’t want to.” This way, “I am in control of my data.”
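
One way to picture this kind of selective sharing is with salted hashes, sketched below in Python. This is an assumption chosen for illustration and is not a description of how Jolocom’s wallet works. The attribute names and values are hypothetical, and the issuer’s signature over the digest set is left out to keep the example short.

```python
import hashlib
import secrets


def digest(name, value, salt):
    """Salted hash committing to one attribute without revealing it."""
    return hashlib.sha256(f"{salt}|{name}|{value}".encode()).hexdigest()


# Issuer side: commit to every attribute. Only the digests would be signed
# and shared with verifiers (the signing step is omitted here).
attributes = {"name": "Ada Example", "over_18": True, "email": "ada@example.org"}
salts = {name: secrets.token_hex(16) for name in attributes}
signed_digests = {digest(n, v, salts[n]) for n, v in attributes.items()}


def disclose(names):
    """Holder side: reveal only the chosen attributes with their salts."""
    return [(n, attributes[n], salts[n]) for n in names]


def check(disclosures, digests):
    """Verifier side: accept only values that match an issued digest."""
    return all(digest(n, v, s) in digests for n, v, s in disclosures)


presentation = disclose(["over_18"])
print([name for name, _, _ in presentation])  # ['over_18']
print(check(presentation, signed_digests))    # True
```

The organizer learns that the holder is of age and can confirm that the value was issued, while the name and email stay in the wallet.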

To conclude, Heller discussed a recent movement in Europe called “Stop Tracking Science.” To combat publishing oligopolies and data analytics companies, a group of academics have come together to create scholar-led infrastructure. As Heller says, in the current environment, “Your journal is reading you,” which is a terrifying thought about scholarly communications.

These academics are hoping to move toward shared responsibility and open, decentralized infrastructure using the major building blocks that already exist. One example of how academia is already decentralized is through PIDs, or persistent identifiers, which are already widely used through systems like ORCID. According to Heller, these PIDs are “part of the commons” and can be shared in a consistent, open manner across systems, which could be used in a decentralized manner for personal identity rather than a centralized one. To conclude, Heller said, “There is no technical fix for social issues. We need to come up with a model for how trust works in research infrastructure.”

It is clear that self-sovereign identity holds great promise as part of a movement for technology that is privacy-respecting, open, transparent, and empowering. In this future, it will be possible to have a verified identity that is held by you, not by a big corporation – the vision that we are setting out to achieve. Want to help us get there? 

Join us at the next events hosted by METRO Library Council, Internet Archive, and Library Futures. https://metro.org/decentralizedweb

Links Shared

Resource guide for this session: https://archive.org/details/resource-guide-session-03-decentralized-identity
All resource guides: https://metro.org/DWebResourceGuides
Decentralized ORCID: https://whoisthis.wtf
Internet Identity Workshop: https://internetidentityworkshop.com/
Jolocom: https://jolocom.io/
Condidi: https://labs.tib.eu/info/en/project/condidi/
TruAge: https://www.convenience.org/TruAge/Home
DIACC Trust Framework: https://diacc.ca/trust-framework/
PCTF-CCP: https://canada-ca.github.io/PCTF-CCP
TruAge Digital ID Verification Solution: https://www.convenience.org/Media/Daily/2021/May/11/2-TruAgeTM-Digital-ID-Verification-Solution_NACS
NuData Security: https://nudatasecurity.com/passive-biometrics/
Kaliya Young’s Book, Domains of Identity: https://identitywoman.net/wp-content/uploads/Domains-of-Identity-Highlights.pdf