Category Archives: Web & Data Services

Goodbye Facebook. Hello Decentralized Social Media?

The pending sale of Twitter to Elon Musk has generated a buzz about the future of social media and just who should control our data.

Wendy Hanamura, director of partnerships at the Internet Archive, moderated an online discussion on April 28, “Goodbye Facebook, Hello Decentralized Social Media?”, about the opportunities and dangers ahead. The webinar is part of a series of six workshops, “Imagining a Better Online World: Exploring the Decentralized Web.”

The session featured founders of some of the top decentralized social media networks, including Jay Graber, chief executive officer of R&D project Bluesky; Matthew Hodgson, technical co-founder of Matrix; and Andre Staltz, creator of Manyverse. Unlike Twitter, Facebook or Slack, Matrix and Manyverse have no central controlling entity. Instead, the peer-to-peer networks shift power to the users and protect privacy.

If Twitter is indeed bought and people are disappointed with the changes, the speakers expressed hope that the public will consider other social networks. “A crisis of this type means that people start installing Manyverse and other alternatives,” Staltz said. “The opportunity side is clear.” Still, during the transition period, if other platforms are not ready, there is some risk that users will feel stuck and not switch, he added.

Hodgson said there are reasons to be both optimistic and pessimistic about Musk purchasing Twitter. The hope is that he will use his powers for good, making it available to everybody and empowering people to block the content they don’t want to see. The risk is that, with no moderation, people will be obnoxious to one another without sufficient controls to filter, and the system will melt down, Hodgson said. “It’s certainly got potential to be an experiment. I’m cautiously optimistic on it,” he said.

People who work in decentralized tech recognize the risk that comes when one person can control a network and act for good or bad, Graber said. “This turn of events demonstrates that social networks that are centralized can change very quickly,” she said. “Those changes can potentially disrupt or drastically alter people’s identity, relationships, and the content that they put on there over the years. This highlights the necessity for transition to a protocol-based ecosystem.” 

When a platform is user-controlled, it is resilient to disruptive change, Graber said. Decentralization enables immutability, so change is hard: it is a slow process that requires a lot of people to agree, Staltz added.

The three leaders spoke about how decentralized networks provide a sustainable alternative and are gaining traction. Unlike major players that own user data and monetize personal information, decentralized networks are controlled by users and information lives in many different places.

“Society as a whole is facing a lot of crises,” Graber said. “We have the ability to, as a collective intelligence, to investigate a lot of directions at once. But we don’t actually have the free ability to fully do this in our current social architecture…if you decentralize, you get the ability to innovate and explore many more directions at once. And all the parts get more freedom and autonomy.”

Decentralized social media is structured to change the balance of power, added Hanamura: “In this moment, we want you to know that you have the power. You can take back the power, but you have to understand it and understand your responsibility.”

The webinar was co-sponsored by DWeb and Library Futures, and presented by the Metropolitan New York Library Council (METRO).

The next event in the series, Decentralized Apps, the Metaverse and the “Next Big Thing,” will be held Thursday, May 26, from 4 to 5 p.m. EST. Register here.

Library as Laboratory Recap: Analyzing Biodiversity Literature at Scale

At a recent webinar hosted by the Internet Archive, leaders from the Biodiversity Heritage Library (BHL) shared how its massive open access digital collection documenting life on the planet is an invaluable resource for scientists and ordinary citizens.

“The BHL is a global consortium of the leading natural history museums, botanical gardens, and research institutions, big and small, from all over the world. Working together and in partnership with the Internet Archive, these libraries have digitized more than 60 million pages of scientific literature available to the public,” said Chris Freeland, director of Open Libraries and moderator of the event.

Established in 2006 with a commitment to inspiring discovery through free access to biodiversity knowledge, BHL has 19 members and 22 affiliates, plus 100 worldwide partners contributing data. The BHL has content dating back nearly 600 years alongside current literature that, when liberated from the print page, holds immense promise for advancing science and solving today’s pressing problems of climate change and the loss of biodiversity.

Martin Kalfatovic, BHL program director and associate director of the Smithsonian Libraries and Archives, noted in his presentation that Charles Darwin and colleagues famously said “the cultivation of natural science cannot be efficiently carried on without reference to an extensive library.”

“Today, the Biodiversity Heritage Library is creating this global, accessible open library of literature that will help scientists, taxonomists, environmentalists, a host of people working with our planet, to actually have ready access to these collections,” Kalfatovic said. BHL’s mission is to improve research methodology by working with its partner libraries and the broader biodiversity and bioinformatics community. Each month, BHL draws about 142,000 visitors, and it has served 12 million users overall.

Most of the BHL’s materials are from collections in the global north, primarily in large, well-funded institutions. Digitizing these collections helps level the playing field, providing researchers in all parts of the world equal access to vital content.

The vast collection includes species descriptions, distribution records showing where species live, climate records, the history of scientific discovery, and information on extinct species. To date, BHL has made over 176,000 titles and 281,000 volumes available. Through a partnership with the Global Names Architecture project, more than 243 million instances of taxonomic (Latin) names have been found in BHL content.

Kalfatovic underscored the value of BHL content in understanding the environment in the wake of recent troubling news from the Sixth Assessment Report (AR6) published by the Intergovernmental Panel on Climate Change about the impact of the earth’s warming.

Biodiversity Heritage Library by the numbers.

“The outlook for the planet is challenging,” he said. “By unlocking this historic data, we can find out where we’ve been over time to find out more about where we need to be in the future.”

JJ Dearborn, BHL data manager, discussed how digitization transforms physical books into digital objects that can be shared with “anyone, at any time, anywhere.” She described the Wikimedia ecosystem as “fertile ground for open access experimentation,” crediting the organization with giving BHL the ability to reach new audiences and transform its data into 5-star linked open data. “Dark data” locked up in legacy formats, JP2s, and OCR text is a source of valuable checklist, species occurrence, and event sampling data that the larger biodiversity community can use to improve humanity’s collective ability to monitor biodiversity loss and the destructive impacts of climate change at scale.

The majority of the world’s data today is siloed, unstructured, and unused, Dearborn explained. This “dark data” “represents an untapped resource that could really transform human understanding if it could be truly utilized,” she said. “It might represent a gestalt leap for humanity.” 

The event was the fifth in a series of six sessions highlighting how researchers in the humanities use the Internet Archive. The final session of the Library as Laboratory series will be a series of lightning talks on May 11 at 11am PT / 2pm ET—register now!

What’s in Your Smart Wallet? Keeping your Personal Data Personal

“How Decentralized Identity Drives Privacy” with Internet Archive, METRO Library Council, and Library Futures

How many passwords do you have saved, and how many of them are controlled by a large, corporate platform instead of by you? Last month’s “Keeping your Personal Data Personal: How Decentralized Identity Drives Privacy” session started with that provocative question in order to illustrate the potential of this emerging technology.

Self-sovereign identity (SSI), defined as “an idea, a movement, and a decentralized approach for establishing trust online,” sits in the middle of the stack of technologies that makes up the decentralized internet. In the words of the Decentralized Identity Resource Guide written specifically for this session, “self-sovereign identity is a system where users themselves–and not centralized platforms or services like Google, Facebook, or LinkedIn–are in control and maintain ownership of their personal information.”

Research shows that the average American has more than 150 different accounts and passwords, a number that has likely skyrocketed since the start of the pandemic. In her presentation, Wendy Hanamura, Director of Partnerships at the Internet Archive, discussed the implications of “trading privacy and security for convenience.” Hanamura drew on her recent experience at SXSW, which bundled her personal data, including medical and vaccine data, into an insecure QR code used by a corporate sponsor to verify her as a participant. In contrast, Hanamura said, the twenty-year-old concept of self-sovereign identity can disaggregate these services from corporations, giving people better control of their own data and identity through principles like control, access, transparency, and consent. While self-sovereign identity holds great promise as a concept, it also raises technical questions around verification and management.

Kaliya “Identity Woman” Young’s interest in identity comes from the global ecology and information technology networks she has been part of for more than twenty years. In 2000, when the Internet was still nascent, she joined with a community to ask: “How can this technology best serve people, organizations, and the planet?” Underlying her work is the strong belief that people should have the right to control their own online identity with the maximum amount of flexibility and access. Using a real-life example, Young compared self-sovereign identity to a physical wallet. Like a wallet, self-sovereign identity puts users in control of what they share, and when, with no centralized ability for an issuer to tell when the pieces of information within the wallet are presented.

In contrast, the modern internet operates with a series of centralized identifier authorities, like ICANN or IANA for domain names and IP addresses, and corporate private namespaces like Google and Facebook. Young’s research and work decentralizes this way of transmitting information through “signed portable proofs,” which come from a variety of sources rather than one centralized source. These proofs are also called verifiable credentials, and each one carries metadata, the claim itself, and an embedded digital signature for validation. All of these pieces come together in a digital wallet, verified by a digital identifier that is unique to a person. Utilizing cryptography, these identifiers would be validated by digital identity documents and registries. In this scenario, organizations like InCommon, an access management service, or even a professional licensing organization like the American Library Association can maintain lists of institutions that would be able to verify the identity or organizational affiliation of an identifier. In the end, Young emphasized a message of empowerment: in her work, self-sovereign identity is about “innovating protocols to represent people in the digital realm in ways that empower them and that they control.”
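To make the three parts of a verifiable credential concrete, here is a minimal Python sketch; it is not drawn from Young’s presentation. It bundles metadata and a claim, then attaches a proof computed over both. The `did:example:...` identifiers, the `MembershipCredential` type, and the function names are hypothetical. The HMAC proof is a symmetric stand-in chosen only so the sketch runs on the standard library; real verifiable credentials rely on public-key digital signatures, so a verifier never needs the issuer’s secret.

```python
import hashlib
import hmac
import json

# Hypothetical issuer key. HMAC is a symmetric stand-in used only so this
# sketch runs on the standard library; real verifiable credentials use
# public-key signatures so verifiers never hold the issuer's secret.
ISSUER_KEY = b"demo-issuer-secret"


def canonical(data: dict) -> bytes:
    """Serialize deterministically so issuer and verifier hash the same bytes."""
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_credential(issuer: str, subject: str, claim: dict) -> dict:
    """Bundle metadata and a claim, then attach a proof computed over both."""
    body = {
        "metadata": {"issuer": issuer, "subject": subject, "type": "MembershipCredential"},
        "claim": claim,
    }
    proof = hmac.new(ISSUER_KEY, canonical(body), hashlib.sha256).hexdigest()
    return {**body, "proof": proof}


def verify_credential(credential: dict) -> bool:
    """Recompute the proof over metadata + claim and compare in constant time."""
    body = {"metadata": credential["metadata"], "claim": credential["claim"]}
    expected = hmac.new(ISSUER_KEY, canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["proof"])


credential = issue_credential(
    issuer="did:example:library-association",  # hypothetical identifier
    subject="did:example:alice",               # hypothetical identifier
    claim={"member": True},
)
# Changing the claim without re-issuing leaves the old proof in place.
tampered = {**credential, "claim": {"member": False}}
```

Here `verify_credential(credential)` succeeds, while the same check on `tampered` fails because the altered claim no longer matches the proof.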

Next, librarian Lambert Heller of Technische Informationsbibliothek (TIB) and Irene Adamski of the Berlin-based SSI firm Jolocom discussed and demonstrated their work in creating self-sovereign identity for academic conferences on a new platform called Condidi. This tool gives people running academic events a platform that issues digital credentials of attendance in a decentralized system. Utilizing open source and decentralized software, this system minimizes the amount of personal information that attendees need to hand over to organizers while still allowing participants to track and log records of their attendance. For libraries, this kind of system is crucial: new systems like Condidi help libraries protect user privacy and open up platform innovation.

Self-sovereign identity also utilizes a new tool called a “smart wallet,” which holds one’s credentials and is controlled by the user. For example, at a conference, a user might want to tell the organizer that she is of age, but not share any other information about herself. A demo of Jolocom’s system showed how this could work. In the demo, Adamski showed how a wallet could allow a person to share just the information she wants through encrypted keys in a conference situation. Jolocom also allows people to verify credentials using an encrypted wallet. According to Adamski, the best part of self-sovereign identity is that “you don’t have to share if you don’t want to.” This way, “I am in control of my data.”
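The “of age, but nothing else” scenario can be illustrated with salted hash commitments, one common way to support selective disclosure. The Python sketch below is an assumption for illustration and does not describe how Jolocom’s wallet works. The attribute names and the functions `issue`, `present`, and `check` are hypothetical, and in a real system the issuer would also sign the digests so the verifier can trust them.

```python
import hashlib
import secrets


def commit(name: str, value: str, salt: str) -> str:
    """Hash one attribute together with a random salt."""
    return hashlib.sha256(f"{salt}|{name}|{value}".encode("utf-8")).hexdigest()


def issue(attributes: dict) -> tuple:
    """Issuer side: return (digests the verifier may see, salts kept in the wallet)."""
    salts = {name: secrets.token_hex(16) for name in attributes}
    digests = {name: commit(name, value, salts[name]) for name, value in attributes.items()}
    return digests, salts


def present(attributes: dict, salts: dict, name: str) -> dict:
    """Holder side: disclose exactly one attribute and its salt, nothing else."""
    return {"name": name, "value": attributes[name], "salt": salts[name]}


def check(digests: dict, disclosure: dict) -> bool:
    """Verifier side: confirm the disclosed value matches the issued digest."""
    expected = digests.get(disclosure["name"])
    actual = commit(disclosure["name"], disclosure["value"], disclosure["salt"])
    return expected is not None and expected == actual


# Hypothetical wallet contents for a conference attendee.
wallet = {"over_18": "true", "name": "Alice Example", "birthdate": "1990-01-01"}
digests, salts = issue(wallet)
disclosure = present(wallet, salts, "over_18")
```

The organizer can run `check(digests, disclosure)` to confirm the age claim, while the attendee’s name and birthdate stay in the wallet; the random salts keep the undisclosed values from being guessed from their digests.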

To conclude, Heller discussed a recent movement in Europe called “Stop Tracking Science.” To combat publishing oligopolies and data analytics companies, a group of academics have come together to create scholar-led infrastructure. As Heller says, in the current environment, “Your journal is reading you,” which is a terrifying thought about scholarly communications.

These academics are hoping to move toward shared responsibility and open, decentralized infrastructure using the major building blocks that already exist. One example of how academia is already decentralized is PIDs, or persistent identifiers, which are already widely used through systems like ORCID. According to Heller, these PIDs are “part of the commons” and can be shared in a consistent, open manner across systems, so they could be used for personal identity in a decentralized manner rather than a centralized one. In closing, Heller said, “There is no technical fix for social issues. We need to come up with a model for how trust works in research infrastructure.”

It is clear that self-sovereign identity holds great promise as part of a movement for technology that is privacy-respecting, open, transparent, and empowering. In this future, it will be possible to have a verified identity that is held by you, not by a big corporation – the vision that we are setting out to achieve. Want to help us get there? 

Join us at the next events hosted by METRO Library Council, Internet Archive, and Library Futures. https://metro.org/decentralizedweb

Links Shared

Resource guide for this session: https://archive.org/details/resource-guide-session-03-decentralized-identity
All resource guides: https://metro.org/DWebResourceGuides
Decentralized ORCID: https://whoisthis.wtf
Internet Identity Workshop: https://internetidentityworkshop.com/
Jolocom: https://jolocom.io/
Condidi: https://labs.tib.eu/info/en/project/condidi/
TruAge: https://www.convenience.org/TruAge/Home
DIACC Trust Framework: https://diacc.ca/trust-framework/
PCTF-CCP: https://canada-ca.github.io/PCTF-CCP
TruAge Digital ID Verification Solution: https://www.convenience.org/Media/Daily/2021/May/11/2-TruAgeTM-Digital-ID-Verification-Solution_NACS
NuData Security: https://nudatasecurity.com/passive-biometrics/
Kaliya Young’s Book, Domains of Identity: https://identitywoman.net/wp-content/uploads/Domains-of-Identity-Highlights.pdf

Building the Collective COVID-19 Web Archive

The COVID-19 pandemic has been life-changing for people around the globe. As efforts to slow the progress of the virus unfolded in early 2020, librarians, archivists and others with interest in preserving cultural heritage began considering ways to document the personal, societal, and systemic impacts of the global pandemic. These efforts included preserving physical, digital and web-based information and artifacts for posterity and future research use.

Clockwise from top left: blog post about local artists making masks from Kansas City Public Library’s “COVID-19 Outbreak” collection; youth vaccination campaign website from American Academy of Pediatrics’ “AAP COVID” collection; COVID-19 case dashboard from Carnegie Mellon University’s “COVID-19” collection; and COVID-19 FAQs from Library of Michigan’s “COVID-19 in Michigan” collection.

In response, the Internet Archive’s Archive-It service launched a COVID-19 Web Archiving Special Campaign starting in April 2020 to allow existing Archive-It partners to increase their web archiving capacity or new partners to join to collect COVID-19 related content. In all, more than 100 organizations took advantage of the COVID-19 Web Archiving Special Campaign and more than 200 Archive-It partner organizations built more than 300 new collections specifically about the global pandemic and its effects on their regions, institutions, and local communities. From colleges, universities, and governments documenting their own responses to community-driven initiatives like Sonoma County Library’s Sonoma Responds Community Memory Archive, a variety of information has been preserved and made available. These collections are critical historical records in and of themselves, and when taken in aggregate will allow researchers a comprehensive view into life during the pandemic.

Sonoma County Library’s Sonoma Responds: A Community Memory Archive encouraged community members to contribute content documenting their lives during the COVID-19 pandemic.

We have been exploring with partners ways to provide unified access to hundreds of individual COVID-related web collections created by Archive-It users. When the Institute of Museum and Library Services launched its American Rescue Plan grant program, part of the broader American Rescue Plan, a $1.9 trillion stimulus package signed into law on March 11, 2021, we applied and were awarded funding to build a COVID-19 Web Archive access portal, a dedicated search and discovery platform for COVID-19 web collections from hundreds of institutions.

The COVID-19 Web Archive will allow for browsing and full text search across diverse institutional collections and enable other access methods, including making datasets and code notebooks available for data analysis of the aggregate collections by scholars. This work will support scholars, public health officials, and the general public in fully understanding the scope and magnitude of our historical moment now and into the future. The COVID-19 Web Archive is unique in that it will provide a unified discovery mechanism to hundreds of aggregated web archive collections built by a diverse group of over 200 libraries from over 40 US states and several other nations, from large research libraries to small public libraries to government agencies.

If you would like your Archive-It collection or a portion of it included in the COVID-19 Web Archive, please fill out this interest form by Friday, April 29, 2022. If you are an institution in the United States that has COVID-related web archives collected outside of Archive-It or Internet Archive services that you are interested in having included in the COVID-19 Web Archive, please contact covidwebarchive@archive.org.

Meet the Librarians of the Internet Archive

In celebration of National Library Week, we’d like to introduce you to some of the professional librarians who work at the Internet Archive and in projects closely associated with our programs. Over the next two weeks, you’ll hear from librarians and other information professionals who are using their education and training in library science and related fields to support the Internet Archive’s patrons.

What draws librarians to work at the Internet Archive? From patron services to collection management to web archiving, the answers are as varied as the departments in which these professionals work. But a common theme emerges from the profiles—that of professionals wanting to use their skills and knowledge in support of the Internet Archive’s mission: “Universal Access to All Knowledge.”

We hope that over these next two weeks you’ll learn something about the librarians working behind the scenes at the Internet Archive, and you’ll come to appreciate the training and dedication that influence their daily work. We’re pleased to help you “Meet the Librarians” during this National Library Week and beyond.

Library as Laboratory Recap: Supporting Computational Use of Web Collections

For scholars, especially those in the humanities, the library is their laboratory. Published works and manuscripts are their materials of science. Today, doing meaningful research also means having access to modern datasets that facilitate data mining and machine learning.

On March 2, the Internet Archive launched a new series of webinars highlighting its efforts to support data-intensive scholarship and digital humanities projects. The first session focused on the methods and techniques available for analyzing web archives at scale.

“If we can have collections of cultural materials that are useful in ways that are easy to use — still respectful of rights holders — then we can start to get a bigger idea of what’s going on in the media ecosystem,” said Internet Archive Founder Brewster Kahle.

Just what can be done with billions of archived web pages? The possibilities are endless. 

Jefferson Bailey, Internet Archive’s Director of Web Archiving & Data Services, and Helge Holzmann, Web Data Engineer, shared some of the technical issues libraries should consider and tools available to make large amounts of digital content available to the public.

The Internet Archive gathers information from the web through different methods including global and domain crawling, data partnerships and curation services. It preserves different types of content (text, code, audio-visual) in a variety of formats.

Learn more about the Library as Laboratory series & register for upcoming sessions.

Social scientists, data analysts, historians and literary scholars make requests for data from the web archive for computational use in their research. Institutions use its service to build small and large collections for a range of purposes. Sometimes the projects can be complex and it can be a challenge to wrangle the volume of data, said Bailey.

The Internet Archive has worked on a project reviewing changes to the content of 800,000 corporate home pages since 1996. It has also mined data for a language analysis project, producing custom extractions for Icelandic, Norwegian and Irish translation.

Transforming data into useful information requires data engineering. As librarians consider how to respond to inquiries for data, they should look at their tech resources, workflow and capacity. While such datasets are more complicated to produce, their potential has expanded given the size, scale and longitudinal analysis now possible.

“We are getting more and more computational use data requests each year,” Bailey said. “If librarians, archivists, cultural heritage custodians haven’t gotten these requests yet, they will be getting them soon.”

Up next in the Library as Laboratory series:

The next webinar in the series will be held March 16, and will highlight five innovative web archiving research projects from the Archives Unleashed Cohort Program. Register now.

Internet Archive Releases Refcat, the IA Scholar Index of over 1.3 Billion Scholarly Citations

As part of our ongoing efforts to archive and provide perpetual access to at-risk, open-access scholarship, we have released Refcat (“reference” + “catalog”), the citation index culled from the catalog that underpins our IA Scholar service for discovering the scholarly literature and research outputs within Internet Archive. This first release of the Refcat dataset contains over 1.3 billion citations extracted from over 60 million metadata records and over 120 million scholarly artifacts (articles, books, datasets, proceedings, code, etc) that IA Scholar has archived through web harvesting, digitization, integrations with other open knowledge services, and through partnerships and joint initiatives.

Refcat represents one of the larger citation graph datasets of scholarly literature, and it uniquely contains a notable portion of citations from works that do not have a DOI or persistent identifier. We hope this dataset will be a valuable community resource alongside other critical knowledge graph projects, including those with which we are collaborating, such as OpenCitations and Wikicite.

The Refcat dataset is released under a CC0 license and is available for download from archive.org. The related software created for the extraction and matching process, including tools for exact and fuzzy citation matching (refcat and fuzzycat), is also released as open source. For those interested in technical details about the project, a white paper authored by IA engineers, including Martin Czygan, who led work on Refcat, is available on arxiv.org, and the project is also described in our catalog user guide.
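For readers unfamiliar with the idea, the difference between exact and fuzzy citation matching can be sketched in a few lines of Python. This is an illustration only and is not the algorithm used by refcat or fuzzycat: the `normalize` and `match_citation` functions, the similarity threshold, and the two-record catalog are all hypothetical, and the similarity score comes from the standard library’s `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivial differences vanish."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match_citation(cited_title: str, catalog: dict, threshold: float = 0.9):
    """Return (record_id, kind) for the best match, or None below the threshold."""
    target = normalize(cited_title)
    best_id, best_score = None, 0.0
    for record_id, title in catalog.items():
        candidate = normalize(title)
        if candidate == target:
            # Identical after normalization: treat as an exact match.
            return record_id, "exact"
        score = SequenceMatcher(None, target, candidate).ratio()
        if score > best_score:
            best_id, best_score = record_id, score
    if best_score >= threshold:
        return best_id, "fuzzy"
    return None


# Hypothetical two-record catalog used only for illustration.
catalog = {
    "rec-1": "On the Origin of Species",
    "rec-2": "The Structure of Scientific Revolutions",
}
```

A citation that differs only in case and punctuation matches exactly, a citation with a small typo can still be matched fuzzily, and an unrelated title returns no match. A production system would compare more fields than the title, such as authors, year, and identifiers.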

What does Refcat mean for regular users of IA Scholar? Refcat results from work to ensure the interconnection between material within IA Scholar and other resources archived in Internet Archive, in order to make browsing and lookups easier and to ensure overall citation integrity and persistence. For example, there are over 25 million web links in the citations in Refcat. We were able to match ~14 million of these to archived web pages in Wayback Machine, and found that ~18% of these matched web citations are no longer available on the live web. Web links in citations not in Wayback Machine have been added to ongoing web harvests. We also matched over 20 million citations to books that are available for lending in our Open Library service and matched over 1 million citations to Wikipedia entries.
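Anyone can check whether a single cited web link has an archived copy by using the Wayback Machine’s public availability endpoint. The Python sketch below is not how the Refcat matching was performed; it only shows the lookup for one link. The helper names are hypothetical, and the `sample` payload is an illustrative response in the shape that endpoint returns, included so the parsing can be tried without a network connection.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(link: str) -> str:
    """Build the availability lookup URL for one cited link."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': link})}"


def closest_snapshot(payload: dict):
    """Return the archived URL from an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(link: str, timeout: float = 10.0):
    """Perform the live lookup (needs network access; not run in this example)."""
    with urlopen(availability_query(link), timeout=timeout) as response:
        return closest_snapshot(json.load(response))


# Illustrative response only, not real lookup output.
sample = {
    "url": "example.com",
    "archived_snapshots": {
        "closest": {
            "status": "200",
            "available": True,
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
            "timestamp": "20130919044612",
        }
    },
}
```

Calling `closest_snapshot(sample)` returns the archived URL, and a response with an empty `archived_snapshots` object returns `None`, which is the case where a link would need to be queued for harvesting.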

Besides interconnection, Refcat will allow users to see which works have cited a specific scholarly resource (i.e. “cited by” or “inbound citations”), which will support improved discovery features. Finally, knowing the full “knowledge graph” of IA Scholar helps us better identify important scholarly material that we have not yet archived, thus improving the overall quality and extent of the collection. This, in turn, aids scholars by ensuring their open-access work is archived and accessible forever, especially for those whose publisher may not have the resources for long-term preservation, and it ensures that related outputs like research registrations or datasets are also archived, matched to the article of record, and available into the future.

The Refcat release is a milestone of Phase Two of our project, “Ensuring the Persistent Access of Long Tail Open Access Journal Literature,” first announced in 2018 and supported by funding from the Andrew W. Mellon Foundation. Current work focuses on citation integrity within the IA Scholar archive, partnerships and services, such as our role in the multi-institutional Project Jasper and our partnership with Center for Open Science, and the addition of secondary scholarly outputs to IA Scholar, including datasets, software, and other non-article/book scholarly materials. Look out for a plethora of announcements about other IA Scholar milestones in the coming months!

The Internet Archive’s Community Webs Program Welcomes 60+ New Members from the US, Canada and Internationally

Community Webs, the Internet Archive’s community history web and digital archiving program, is welcoming over 60 new members from across the US, Canada, and internationally. This new cohort is the first expansion of the Community Webs program outside of the United States and we are thrilled to be supporting the development of diverse, community-based web collections on an international scale. 

Community Webs empowers cultural heritage organizations to collaborate with their communities to build web and digital archives of primary sources documenting local history and culture, especially collections inclusive of voices typically underrepresented in traditional memory collections. The program achieves this mission by providing its members with free access to the Archive-It web archiving service, digital preservation and digitization services, and technical support and training in topics such as web archiving, community outreach, and digital preservation. The program also offers resources to support a local history archiving community of practice and to facilitate scholarly research.

New Community Webs member Karen Ng, Archivist at Sḵwx̱wú7mesh Úxwumixw (Squamish Nation), BC, Canada, notes that the program offers a way to capture community-generated online content in a context where many of the Nation’s records are held by other institutions. “The Squamish Nation community is active in creating and documenting language, traditional knowledge, and histories. Now more than ever in the digital age, it is imperative that these stories and histories be captured and stored in accessible ways for future generations.” 

Similarly, for Maryna Chernyavska, Archivist at the Kule Folklore Centre in Edmonton, Canada, the program will allow the Centre to continue building relationships with community members and organizations. “Being able to assist local heritage organizations with web archiving will help us empower these communities to preserve their heritage based on their values and priorities, but also according to professional standards.”

The current expansion of the program was made possible in part by generous funding from the Andrew W. Mellon Foundation, which supports the growth of Community Webs to new public libraries in the US. Additional funding provided by the Internet Archive allows the program to reach cultural heritage organizations in Canada and beyond. This newest cohort brings the total number of participants in Community Webs to over 150 organizations, a tenfold increase since the program’s inception in 2017. For a full list of new participants, see below. The program continues to add members. If your institution is interested in joining, please view our open calls for applications, and please make your favorite local memory organization aware of the opportunity.

Programming for the new cohort is underway and these members are already diving into the program’s educational resources and familiarizing themselves with the technical aspects of web archiving and digital preservation. We kicked things off recently with introductory Zoom sessions, where participants met one another and shared their organizations’ missions, communities served and goals for membership in the program. Online training modules, developed by staff at the Internet Archive and the Educopia Institute, went live for new members at the beginning of September. And our new cohort joined our existing Community Webs partners at our virtual Partner Meeting on September 22nd. 

We are thrilled to see the program continuing to grow and we look forward to working with our newest cohort. A warm welcome to the following new Community Webs members!

Canada:

  • Aanischaaukamikw Cree Cultural Institute
  • Age of Sail Museum and Archives
  • Ajax Public Library
  • Blue Mountains Public Library – Craigleith Heritage Depot
  • Canadian Friends Historical Association
  • Charlotte County Archives
  • City of Kawartha Lakes Public Library
  • Community Archives of Belleville and Hastings County
  • Confluence Concerts | Toronto Performing Arts Archives
  • Edson and District Historical Society – Galloway Station Museum & Archives
  • Essex-Kent Mennonite Historical Association
  • Ex Libris Association
  • Fishing Lake Métis Settlement Public Library
  • Frog Lake First Nations Library
  • Goulbourn Museum
  • Grimsby Public Library
  • Hamilton Public Library
  • Kule Folklore Centre
  • Maskwacis Cultural College
  • Meaford Museum
  • Milton Public Library
  • Mission Folk Music Festival
  • Nipissing Nation Kendaaswin
  • North Lanark Regional Museum
  • Northern Ontario Railroad Museum and Heritage Centre
  • Parkwood National Historic Site
  • Regina Public Library
  • Sḵwx̱wú7mesh Úxwumixw (Squamish Nation) Archives
  • Société historique du Madawaska Inc.
  • St. Clair West Oral History Project
  • Temagami First Nation Public Library
  • The ArQuives: Canada’s LGBTQ2+ Archives
  • The Historical Society of Ottawa
  • Thunder Bay Museum
  • Tk’emlups te Secwepemc

International:

  • Biblioteca Nacional Aruba
  • Institute of Information Science, Academia Sinica (Taiwan)
  • Mbube Cultural Preservation Foundation (Nigeria)
  • National Library and Information System Authority (NALIS) (Republic of Trinidad and Tobago)

United States:

  • Abilene Public Library
  • Ashland City Library
  • Auburn Avenue Research Library on African American Culture and History
  • Charlotte County Libraries & History
  • Choctaw Cultural Center
  • Cultura Local ABI
  • DC History Center
  • Forsyth County Public Library
  • Fort Worth Public Library
  • Inuit Circumpolar Council – Alaska
  • Menominee Tribal Archives
  • Mineral Point Library Archives
  • Obama Hawaiian Africana Museum
  • Scott County Library System
  • South Sioux City Public Library
  • St. Louis Media History Foundation
  • Tacoma Public Library
  • The History Project
  • The Seattle Public Library
  • Tipp City Public Library
  • University of Hawaiʻi – West Oʻahu
  • Wilmington Public Library District

Congrats to these new partners! We are excited to have you on board.

Internet Archive Launches Collaborative, Web-Based Art Resources Preservation and Access Initiative

Much of the art gallery, artist, and arts organization material that was once published in print form is now available primarily or solely on the web. These groups, like many in the cultural sector, have also been hit especially hard by the global pandemic, putting their web presences particularly at risk of being lost if they are not proactively collected and preserved. The creation of reference and research resources that promote streamlined access and enable new types of scholarly use will ensure that the art historical record of the 21st century, and especially of our current global pandemic, is readily accessible far into the future.

For this reason, the Internet Archive and the New York Art Resources Consortium (NYARC) are pleased to announce our project Consortial Action to Preserve Born-Digital, Web-Based Art History & Culture. The project recently received a two-year, $305,343 Humanities Collections and Reference Resources grant from the Division of Preservation and Access at the National Endowment for the Humanities. This award will support the formation of a cooperative group of 30+ art and museum libraries from across the United States to collaborate on the preservation of, and access to, vital arts content from the web.

The Internet Archive has a long history of building and supporting collaborative communities and providing non-profit web, preservation, and access services to cultural heritage organizations. The multi-institutional initiative between Internet Archive, NYARC, and other arts and museum organizations will build on similar community-based archiving and professional cultivation projects in the Community Programs group, especially our Community Webs program, currently expanding nationally and internationally. Community Webs has received funding from The Andrew W. Mellon Foundation and IMLS to provide public libraries and cultural heritage organizations with services, training, and professional development opportunities to document their diverse local history. 

NYARC are pioneers in collaborative web archiving and shared services among art and museum libraries. NYARC’s robust web archive collections encompass art resources, artists’ websites, auction catalogs, catalogues raisonnés, and hundreds of New York City gallery websites. The Internet Archive and NYARC have partnered on work to build born-digital collecting capacity among arts organizations in the past, most recently in the IMLS-funded Advancing Art Libraries and Curated Web Archives National Forum and related events. Through discussions, workshops, and roadmapping sessions with leaders in art and museum libraries, the partners developed a strategy and plan for an inclusive, sustainable, cooperative approach to collecting and stewarding born-digital, locally focused art history collections, which forms the basis of this broader cooperative effort.

Members in the project’s preliminary group of art and museum libraries will select topics and specific web content that is relevant to their expertise, will provide metadata to facilitate access to archived content, and will participate in planning and evaluation meetings, all while curating a valuable reference resource that will enhance their traditional collecting areas. The Internet Archive will coordinate communications, facilitate governance and collective curatorial activities, provide technical digital library and archive services, and help enable members to build and maintain discovery and access platforms, as well as facilitate researcher use of the collections resulting from the group’s work.

If your art or museum library is interested in joining this collaborative effort, please fill out this participation form by July 31 to join us! 

Introducing 50+ New Public Library Members of the Internet Archive’s Community Webs Program

The Internet Archive’s Community Webs Program provides training and education, infrastructure and services, and professional community cultivation for public librarians across the country to document their local history and the lives of their patrons. Following our recent announcement of the program’s national expansion, with support from the Andrew W. Mellon Foundation, we are excited to welcome the first class of 50+ new public libraries to the program. This brings the current number of new and returning Community Webs participants to 90+ libraries from 33 states and 3 US territories. This diverse group of organizations includes multiple state libraries representing their regions, as well as a mix of large metropolitan library systems, small libraries in rural areas, and libraries like the Feleti Barstow Public Library in American Samoa. All will be working to document their communities, with a particular focus on archiving materials from traditionally underrepresented groups.

The new cohort kicked off with virtual introductory events in mid-March, where participants met one another and shared stories about their communities and their goals for preserving and providing access to local history materials. Member libraries are currently receiving training in topics such as collection development and are starting to build digital collections that reflect local diversity, events, and culture.

Program participant Kathleen Pickering, Director of the Belen Public Library and Harvey House Museum in Belen, New Mexico notes that their library “is committed to free and open-source electronic resources for our patrons, especially given the low-income status of many of our residents” and Community Webs will help further that goal. Similarly, new cohort member Aaron Ramirez of Pueblo City-County Library District (PCCLD) found Community Webs to be a great fit for existing institutional goals and initiatives. “PCCLD’s five-year strategic plan directs us to embrace local cultures, to include individuals of all skill levels and physical abilities, and to enrich established partnerships and collaborations. The groups that have not seen themselves in our archives will find through this project PCCLD’s intention and means to listen and go forward as allies and as a resource of support, rather than an institution serving only the affluent.”

Makiba J. Foster, Manager of The African American Research Library and Cultural Center of Broward County, Florida, pointed out that “as content becomes increasingly digital, we need this opportunity to document the digital life and content of our community which includes a diverse representation of the Black Diaspora.” Makiba was a member of the original Community Webs cohort in a previous position at the Schomburg Center for Research in Black Culture at New York Public Library, and recently presented on her work archiving the Black Diaspora to a group of more than 200 attendees.

The Community Webs Program is continuing to grow towards the milestone of over 150 participating libraries across the United States and will soon announce another call for applicants for a U.S. cohort starting in late summer. The program is also beginning to expand internationally, starting in Canada, and is exploring the addition of other types of libraries and cultural heritage organizations while expanding the suite of training and services available to participants. Expect more news on these initiatives soon.

Welcome to our new cohort of Community Webs libraries! The full list of new members: 

  • Alamogordo Public Library (New Mexico)
  • Amelia Island Museum of History (Florida)
  • ART | library deco (Texas)
  • Asbury Park Public Library (New Jersey)
  • Atlanta History Center (Georgia)
  • Bartholomew County Public Library (Indiana)
  • Bedford Public Library System (Virginia)
  • Belen Public Library and Harvey House Museum (New Mexico)
  • Bensenville Community Public Library (Illinois)
  • Biblioteca Municipal Aurea M. Pérez (Puerto Rico)
  • Carbondale Public Library (Illinois)
  • Cedar Mill & Bethany Community Libraries (Oregon)
  • Charlotte Mecklenburg Library (North Carolina)
  • Chicago Public Library (Illinois)
  • City Archives & Special Collections, New Orleans Public Library (Louisiana)
  • Dayton Metro Library (Ohio)
  • Elba Public Library (Alabama)
  • Essex Library Association (Connecticut)
  • Everett Public Library (Washington)
  • Feleti Barstow Public Library (American Samoa)
  • Forsyth County Public Library (North Carolina)
  • Hartford History Center, Hartford Public Library (Connecticut)
  • Heritage Public Library (Virginia)
  • Huntsville-Madison County Public Library (Alabama)
  • James Blackstone Memorial Library (Connecticut)
  • Jefferson Parish Library (Louisiana)
  • Jefferson-Madison Regional Library (Virginia)
  • Laramie County Library System (Wyoming)
  • Lawrence Public Library (Massachusetts)
  • Los Angeles Public Library (California)
  • Mill Valley Public Library, Lucretia Little History Room (California)
  • Missoula Public Library (Montana)
  • Niagara Falls Public Library (New York)
  • Pueblo City-County Library District (Colorado)
  • Rochester Public Library (New York)
  • Santa Cruz Public Libraries (California)
  • South Pasadena Public Library (California)
  • State Library of Pennsylvania (Pennsylvania)
  • Tangipahoa Parish Library (Louisiana)
  • The African American Research Library and Cultural Center (Florida)
  • The Ferguson Library (Connecticut)
  • Three Rivers Public Library District (Illinois)
  • Virginia Beach Public Library (Virginia)
  • Waltham Public Library (Massachusetts)
  • Watsonville Public Library (California)
  • West Virginia Library Commission (West Virginia)
  • William B Harlan Memorial Library (Kentucky)
  • Worcester Public Library (Massachusetts)
  • Your Heritage Matters (North Carolina)