Tag Archives: web archiving

Web Archiving to the Rescue: One Library’s Quest to Fill an Information Gap

Guest post by: Dana Hamlin, Archivist at Waltham Public Library

This post is part of a series written by members of the Internet Archive’s Community Webs program. Community Webs advances the capacity for community-focused memory organizations to build web and digital archives documenting local histories and underrepresented voices. For more information, visit communitywebs.archive-it.org/

What is an archivist to do when items of public record, which have been systematically added to publicly accessible collections for over a century, suddenly turn from paper into bits and bytes that disappear from the web, or even get stuck behind paywalls? Like many in my profession, I’ve been grappling with this question for a while. Having no real training in digital archiving and facing this quandary as a lone arranger, it’s sometimes hard to keep that grappling from turning into low-key panicking that my inaction has been causing information to be lost forever.

Imagine my excitement, then, when I learned about the Community Webs program – access to and training for Archive-It, collaboration with the Internet Archive, and a network of others like me to bounce ideas off and get inspiration from? Yes please! With the blessing of my boss, I applied right away and my library joined the program in April 2021.

The outside of the Waltham Public Library. Photo by C. Sowa.

(This might be a good point for a quick introduction. I work as the archivist/local history librarian at the Waltham Public Library (WPL) in Waltham, Massachusetts. Waltham is a city about 10 miles west of Boston, and is home to an ethnically and economically diverse population of just over 62,000 people. The WPL is a fully-funded community hub, fostering a healthy democratic society by providing a wealth of current informational, educational, and recreational resources free of charge to all members of the community. The library is known throughout the area for its knowledgeable and friendly staff, welcoming and safe environment, accessibility, convenience, current technology, and helpful assistance.)

I eagerly dove into the program and used our first web-archive collection – Waltham Public Library – as a testing ground, a place to gain familiarity with both Archive-It and the whole process of web archiving. I’ve been trying to capture content that aligns with the material found in the library’s analog records – annual reports, policies, announcements, event flyers, records from our Friends group, etc. – by doing a weekly crawl of the library website, our Friends website, and the library’s Twitter feed. For the most part this collection has been thankfully pretty straightforward.

Our largest collection so far is COVID-19 in Waltham, which makes up a portion of the library’s very first born-digital archival collection. That collection began in April 2020, when the WPL (like most other places) was closed to help “flatten the curve.” A month or two prior, as the pandemic was building steam, I had become fascinated with the 1918 influenza. A poke through our archives for the topic had been disappointing, as there wasn’t too much beyond a couple of newspaper clippings, brief mentions in the library trustees’ minutes, and a few pages in the records of the local nurses’ association. I was hoping to put together a better picture of what it was like to live in Waltham during the flu, perhaps to give myself a glimpse of what I could expect in the coming weeks (heh… how naïve I was).

Scrapbook page showing newspaper clippings from the early days of the 1918 flu. Scrapbook is part of the records of the Waltham Public Library. Photo by D. Hamlin.

I put out a call via the library’s social media for those who lived, worked, and/or went to school in Waltham to share their stories, hoping to build the kind of collection I wanted and failed to find from 1918. There was an initial rush of Google Form submissions, a handful of photos, and one video, and then nothing. I was pleased we had received some materials, but still wanted to paint a broader picture of Waltham under Covid. Enter Community Webs! For the past several months I’ve been working to collect retroactively what I was hoping to capture at the time – news articles, videos, the city website, information from the schools, and so on. While it’s not as comprehensive as it might have been if I’d been able to gather it all as it happened, I’ve been able to save over 500 GB of data that will help those in the future to better imagine what it was like to live in Waltham during Covid.

Screenshot from a WPL Instagram post sharing a patron’s submission to our COVID-19 in Waltham collection.
Screenshot examples of Covid-related content captured retroactively with Archive-It.

Finally, related to the quandary in the first paragraph of this post, our most complicated collection is the Waltham News Tribune. The WPL has microfilm copies of the paper going back to its earliest iteration in the 1860s, and part of my job has been to collect each issue and send yearly batches to a vendor for microfilming. However, as of this past May, the publisher has moved the paper entirely online, with some content requiring a paid subscription to view. The WPL has a subscription so that we can continue to provide free access to our patrons, but what happens to our archive of back issues? Does it just stop abruptly in May 2022, even as time and local news continue to march on? As it is, our microfilm is heavily used, especially since the paper’s offices burned down in 1999, making ours the only existing archive. 

Drawers full of microfilmed newspapers at the WPL. Photo by D. Hamlin.

Thanks to web archiving, we’re able to continue to fulfill our unofficial role as the repository for the city newspaper, at least in theory. In practice, I look at the daily crawls of the digital edition of the paper and can’t help but see that it is no longer the type of local news we’ve been archiving for over a century. The corporate publisher of the paper has consolidated ours with those from several other local cities and towns, and has sacrificed true local news coverage for more generic topics, many of which aren’t even related specifically to Massachusetts. This is a problem that sits well outside of my archives wheelhouse, but at least I feel I can do my due diligence by capturing what local news does trickle through. 

I’ve had a slower go of web archiving than I’d like so far, thanks to several months of parental leave in 2021 and a very packed part-time work schedule. Nevertheless, I’ve been chipping away at our collections and planning for more, with an eye to adding more diverse voices than those that make up much of our analog collections. I’m grateful for the encouragement and help I’ve received from Community Webs staff and peers, and want to give a special shout-out to the Archive-It folks who hold office hours to assist us with technical issues! This really is a fantastic program, and I’m so glad my library is part of it.

Preserving Wilmington History on the Web

Guest Post by: Tricia Dean, Tech Services Manager at Wilmington Public Library District (IL)

Wilmington Public Library. Photo: T. Dean 4/21/22

I was excited when I saw the call for participants in Community Webs. While Wilmington, Illinois is a small, rural town (5,664 people), the thought was that we still had something to contribute. Most Archive-It partners are universities, museums and large libraries, and being in their company was a little daunting to me initially. Other institutions have someone who opens the project, and then it develops into a larger team project. Wilmington Public Library District (WPLD) has a much smaller staff; the project has been wholly mine, which has been both thrilling and terrifying. 

Wilmington is a small rural town, falling on the lower end of the economic scale. Because we are isolated, the library plays a vital part in the community. We offer the usual storytimes and adult programs, but also loan out hotspots and Chromebooks. We have 45 hotspots and these are almost always checked out; some people are using them for vacations, but by usage it is apparent that others are using them as their primary means of connecting to the Internet. Internet access has become more and more important, but after Covid-19 broke out, more governmental services went strictly online, making access even more critical – and to many who had not been regular patrons. WPLD is a hub for the community, offering computers, information, tax forms, and a place to come in and chat – even more important when we are trying to stay close and limit outside contact.

Main Street in Wilmington, circa 1900

I am a Chicago native who went to Champaign-Urbana for grad school. I was a scanner for the Internet Archive for several years where I was privileged to handle some incunabula (pre-1500 items). I am the Technical Services Supervisor at Wilmington; primarily I catalog our materials, but I also tend toward Projects, from adding series labels to re-orienting all the calls in the juvenile non-fiction section. I am currently going through our attic to help determine what we have (it’s a Mystery!). I’m making lists, and hoping to have items to scan which would be available online, in multiple places. I applied for the Community Webs program (with my director’s blessing) because I felt that it’s important for small towns to be represented in the collection of history. Only 20% of the population still lives outside major metro areas, but it is every bit as important to capture that life as it is to retain the history of large cities.

Wilmington Library joined Community Webs in the summer of 2021. After some technical clarifications with the Archive-It staff, WPLD was set up. In considering what made Wilmington unique, the first link was to our library and social media pages. Social media has grown in importance in the last twenty years, but it became a vital link during Covid when services were otherwise unavailable. Wilmington Library’s YouTube videos (how-tos, crafts, and storytime) stand as a reminder of how we responded and as a continuing reference for parents who can’t get to the library. But since social media, specifically, is known for ‘right now,’ it lacks the kind of reflection over time that we can create through the Community Webs project.

We may be small, but we have a number of historical articles and sites which needed to be brought together. We want to reflect events that have been impactful to our community, from the explosion of the Joliet Armory in the 1940s to the continuing issues with the Wilmington Dam, which has proved dangerous but has complicated ownership issues. I still have a long way to go; the projects (attic/local history/web archive) are all intertwined. Wilmington has the usual Community Resources and City Government collections in Archive-It. Going forward, we want to continue to develop our Wilmington History collection. We are working on local history and will establish a collection of materials from our attic and public donations. Our local paper has vertical files which could be a goldmine of information – again, on my to-do list. We will be kicking off an Oral History Project, which will begin with a series of simple gatherings/coffee hours for our seniors, providing a place for them to gather, and a space to share their stories. I am hoping these will be in our Community Webs archive. Who better to speak to where we’ve been and where we are than some of our oldest residents?

Wilmington Dam (present). Photo: T. Dean 4/21/22
[Photo by John Irvine – Chicago Tribune – August 29, 1992]. The shallow-appearing dam is still quite hazardous, partially because it doesn’t ‘look’ dangerous. This photo was taken long before warning signs went up.

Why is Community Webs important? Because it will help us to remember when we cannot keep up with the information overload. Because there is so much happening that we miss a good deal of what is around us – or can’t bear to face it for long. Because so very, very much of our lives is now online – and can be erased with a keystroke. Because we are seeing, painfully, that those who do not learn from the past will be/are condemned to re-live it. And, for Wilmington, I think it is important because so many of the voices and sites being captured are from museums, universities and large public libraries. It is important that we remember that we used to be far less urban than we are today. It is important to remember the smaller places, those that are too easily lost in the maelstrom of modern life, because to be forgotten is to be erased.

Sharing Inuit Voices Across Time: Inuit Circumpolar Council Alaska’s Web and Digital Archive

Guest post by: Inuit Circumpolar Council Alaska

Can you describe your community and the services and role of your organization within the community?

Inuit Circumpolar Council (ICC) Alaska works on behalf of the Inupiat of the North Slope, Northwest and Bering Straits Regions; St. Lawrence Island Yupik; and the Central Yup’ik and Cup’ik of the Yukon-Kuskokwim Region in Southwest Alaska. ICC Alaska is a national member of ICC International. Since inception in 1977, ICC has gained consultative status II with the United Nations, and is a Permanent Participant of the Arctic Council.

For example, ICC has provisional status with the International Maritime Organization (IMO), is an active member at the Arctic Council senior level and within the working groups, and is a prominent voice at the UN Framework Convention on Climate Change (UNFCCC). Work and engagement occur in many ways at these different fora. Within the UNFCCC, ICC has taken a leadership role in putting forward Indigenous Knowledge and establishing a platform for providing equitable space for multiple knowledge systems. Additionally, at the UNFCCC COP 26, ICC Chair, Dr. Dalee Sambo Dorough, led an ICC delegation made up of Inuit representatives from across the Arctic.

ICC COP26 position paper, available at https://iccalaska.org/wp-icc/wp-content/uploads/2021/10/20211028-en-ICC-COP26-Position-Paper.pdf

An immense amount of work occurs in direct partnership with Inuit communities to inform work at international fora. For example, ICC is facilitating the development of international protocols for Equitable and Ethical Engagement. These protocols will provide a pathway to success for all that want to work within Inuit homelands and whose work impacts the Arctic. The protocols will aid in a paradigm shift in how work, decisions, and policies are currently created and carried out. The paradigm shift will lead toward greater equity and recognition of Inuit sovereignty and Self-determination.

Why was your organization interested in participating in Community Webs? 

The Community Webs program was attractive to ICC because it provided the training and the storage to effectively preserve ICC’s digitized & born-digital archival materials. We were pleased to see this offering as a solution for an ongoing desire to archive the prolific organization’s digital materials & products. This work dovetails nicely with ICC Alaska’s efforts to digitize 47 boxes, or around 80 linear feet of material that span 6 decades, including audio, film, photographic media, and paper documents.

ICC Jam – part 2 – Greenland

Cultural programming as part of the 1983 General Assembly. In this clip, view performances from Greenland’s Tuktak Theater and a Greenlandic choir.

ICC advocates for Inuit and the Inuit way of life, highlighted by ICC’s General Assembly meetings. The ICC receives its mandate from a General Assembly held every four years. The General Assembly is the heart of the organization, providing an opportunity for sharing information, discussing common concerns, debating issues, and strengthening the unity between all Inuit across our homelands. Through the Community Webs project, ICC Alaska has been able to preserve archival video of the ICC General Assemblies going back 30 years using Archive-It and the Internet Archive, as well as all newsletters, press releases, resolutions, social media campaigns, and reports published on its website. These are a significant record of ICC advocacy, but more importantly, Inuit political and cultural heritage.

Moses Wassillie’s Oral History of first ICC General Assembly in 1977, available at: http://oralhistory.library.uaf.edu/88/88-49-114_T01.pdf

Why do you think it is important for public libraries, community archives, and other local and community-based organizations to do this work?

Community-based organizations are uniquely positioned as both a part of and apart from the community. This vantage point allows for the self-reflection and observation needed for web archiving, as well as the relationships within the community to create the space and dialogue needed for community archiving projects. By building more capacity within community-based organizations for web archiving and digital preservation efforts, we can expand the recorded historical narrative and humanities-based inquiries in a multitude of directions, to truly reflect the diversity of our world & time.

Where do you hope to see your web archiving program going?

While the core goal of this work is to make ICC documents and its historical narrative more accessible and discoverable within ICC and to ICC’s member organizations, international bodies, and researchers, our aspirations are much bigger. Our hope is that this web archive goes beyond the core goal to inspire, delight, hearten, inform, and add depth to the conversations Inuit are having about cultural identity, relationship to the land, hunting, advocacy, self-determination, and self-governance.

We are curious about the intangible outcomes: What new work does the archive inspire? How does the archive add depth & historical weight to existing projects, discussions, and advocacy? What stories and knowledge get re-remembered, or re-investigated after viewing archival materials? What advocacy, ethics, and philosophical works come from Inuit leaders informed by the legacy that the archive shared? Are youth leaders interested in adding to the archive?

Is there anything you would like your organization to contribute back to the broader community of web archiving and/or local history in the form of documentation, workflows, policy drafts or other resources?

We have several aspirations. Firstly, it is the telling of Inuit stories. The archive is another manifestation of that mission – to record and share Inuit voices across time. To increase access to those voices, information, knowledge, and history. The ICC Archival holdings are a historically unique & culturally significant telling of Inuit cultural heritage, history (including political history), educational pedagogy, philosophy, self-determination, values, ethics, environmental stewardship, and Indigenous Knowledge. It is important to create a way for Inuit to discover and interact with this work. Community Webs has offered a new tool in our toolkit.

Secondly, the goal is to move forward conversations about categorization and information management for indigenous communities. What does that look like in best practice? Can we, together with other Inuit archives, improve on existing practices to create a more equitable and ethical engagement with Inuit-produced information, the management of that information, and the discovery and access of that information?

What are you most excited to learn through your participation in Community Webs?

It was exciting to discover that many Inuit and Alaska Native resources have already been preserved using the Internet Archive. These resources are often affected by insufficient financial support. Being able to have a preserved and accessible copy of these resources is an important step towards creating the bigger picture of the historical record of Inuit advocacy. As part of the Community Webs meetings, it was exciting to hear from other tribal librarians and community archivists across the country & world. Additionally, it was exciting to hear from speakers whose work informs our community archival work at ICC Alaska – such as Chaitra Powell who created (among other amazing things) the “Archive in a Backpack” project.

What impact do you think web archiving could have within your community?

Hopefully this work inspires other organizations to also preserve their digital assets, creating a richer narrative of Inuit political and cultural heritage.

What do you foresee as some of the challenges you may face?

We are eager to preserve our social media channels that have replaced the DRUM newsletter as a vehicle for keeping our community up-to-date on ICC’s work. Ongoing challenges with Facebook and Instagram archiving are preventing us from doing that. Hopefully these issues are resolved in favor of the communities who created the content and who bring their community and connections to these software platforms.

Building the Collective COVID-19 Web Archive

The COVID-19 pandemic has been life-changing for people around the globe. As efforts to slow the progress of the virus unfolded in early 2020, librarians, archivists and others with interest in preserving cultural heritage began considering ways to document the personal, societal, and systemic impacts of the global pandemic. These efforts included preserving physical, digital and web-based information and artifacts for posterity and future research use.

Clockwise from top left: blog post about local artists making masks from Kansas City Public Library’s “COVID-19 Outbreak” collection; youth vaccination campaign website from American Academy of Pediatrics’ “AAP COVID” collection; COVID-19 case dashboard from Carnegie Mellon University’s “COVID-19” collection; and COVID-19 FAQs from Library of Michigan’s “COVID-19 in Michigan” collection.

In response, the Internet Archive’s Archive-It service launched a COVID-19 Web Archiving Special Campaign starting in April 2020 to allow existing Archive-It partners to increase their web archiving capacity or new partners to join to collect COVID-19 related content. In all, more than 100 organizations took advantage of the COVID-19 Web Archiving Special Campaign and more than 200 Archive-It partner organizations built more than 300 new collections specifically about the global pandemic and its effects on their regions, institutions, and local communities. From colleges, universities, and governments documenting their own responses to community-driven initiatives like Sonoma County Library’s Sonoma Responds Community Memory Archive, a variety of information has been preserved and made available. These collections are critical historical records in and of themselves, and when taken in aggregate will allow researchers a comprehensive view into life during the pandemic.

Sonoma County Library’s Sonoma Responds: A Community Memory Archive encouraged community members to contribute content documenting their lives during the COVID-19 pandemic.

We have been exploring with partners ways to provide unified access to hundreds of individual COVID-related web collections created by Archive-It users. When the Institute of Museum and Library Services launched the American Rescue Plan grant program, which was part of the broader American Rescue Plan, a $1.9 trillion stimulus package signed into law on March 11, 2021, we applied and were awarded funding to build a COVID-19 Web Archive access portal – a dedicated search and discovery access platform for COVID-19 web collections from hundreds of institutions. The COVID-19 Web Archive will allow for browsing and full text search across diverse institutional collections and enable other access methods, including making datasets and code notebooks available for data analysis of the aggregate collections by scholars. This work will support scholars, public health officials, and the general public in fully understanding the scope and magnitude of our historical moment now and into the future. The COVID-19 Web Archive is unique in that it will provide a unified discovery mechanism to hundreds of aggregated web archive collections built by a diverse group of over 200 libraries from over 40 US states and several other nations, from large research libraries to small public libraries to government agencies. If you would like your Archive-It collection or a portion of it included in the COVID-19 Web Archive, please fill out this interest form by Friday, April 29, 2022. If you are an institution in the United States that has COVID-related web archives collected outside of Archive-It or Internet Archive services that you are interested in having included in the COVID-19 Web Archive, please contact covidwebarchive@archive.org.

Volunteers Rally to Archive Ukrainian Web Sites

As the war intensifies in Ukraine, volunteers from around the world are working to archive digital content at risk of destruction or manipulation. The Internet Archive is supporting several preservation efforts including the Saving Ukrainian Cultural Heritage Online (SUCHO) initiative launched in early March. 

“When we think about the internet, we think the data is always going to be there. But all this data exists on physical servers and they can get destroyed just like buildings and monuments,” said Quinn Dombrowski, academic technology specialist at Stanford University and co-founder of SUCHO. “A tremendous amount of effort and energy has gone into the development of these websites and digitized collections. The people of Ukraine put them together for a reason. They wanted to share their history, culture, language and literature with the world.”

More than 1,200 volunteers with SUCHO have saved 10 terabytes of data including 14,000 uploaded items (images and PDFs) and captured parts of 2,300 websites so far. This includes material from Ukrainian museums, library websites, digital exhibits, open access publications and elsewhere. 

The initiative is using a combination of technologies to crawl and archive sites and content. Some of the information is stored at the Internet Archive, where it can be discovered and accessed using open-source software.

Staff at the Internet Archive are committed to assisting with the effort, which aligns with the organization’s mission of universal access to knowledge and its aim to make the web more useful and reliable, said Mark Graham, director of the Wayback Machine.

“This is a pivotal time in history,” he said. “We’re seeing major powers engaged in a war and it’s happening in the internet age where the platforms for information sharing and access we have built, and rely on, the Internet and the Web, are at risk.”

The Internet Archive is documenting and making information accessible that might not otherwise be available, Graham said. For years, the Wayback Machine has been archiving about 950 Russian news sites and 350 Ukrainian news sites. Stories that are deleted or altered are being archived for the historical record. 

Recognizing the urgency of this moment, Dombrowski has been stunned by the offers of help from archivists, scholars, librarians involved in cultural heritage, and the general public. Volunteers need not have technical expertise or special language skills to be of value in the project.

“Many people were spending the days before they got involved with SUCHO scrolling the news and feeling helpless and wishing they could do something to contribute more directly towards helping out with the situation,” Dombrowski said. “It’s been really inspiring hearing the stories that people have told about what it’s meant to them to be able to be part of something like this.”

Gudrun Wirtz, head of the East European Department of the Bavarian State Library (Bayerische Staatsbibliothek) in Munich, was archiving on a smaller scale when she and other colleagues began to collaborate with SUCHO.

“We are committed to Ukraine’s heritage and horrified by this war against the people and their rich culture and the distorting of history going on,” Wirtz said. “As Germans we are especially shocked and reminded of our historical responsibility, because last time Ukraine was invaded it was 1941 by Nazi-Germany. We try to do everything we can at the moment.”

Anna Kijas, Tufts University

The invasion of Ukraine hits particularly close to home for Anna Kijas, a librarian at Tufts University and co-founder of SUCHO, who is a Polish immigrant with family members who lived through Soviet occupation following WWII.

“Contributing to the SUCHO effort is something tangible that I can do and bring my expertise as a librarian and digital humanist in order to help preserve as much of the cultural heritage of the Ukrainian people as is possible,” said Kijas. 

The third co-founder of SUCHO, Sebastian Majstorovic, is with the Austrian Centre for Digital Humanities and Cultural Heritage.

The Internet Archive is providing technical support, tools and training to assist volunteers, including those with SUCHO, who are giving of their time.

Through Archive-It, a customizable self-service web archiving platform that captures, stores, and provides access to web-based content, free online accounts have been offered to volunteer archivists. Mirage Berry, business development manager for Archive-It, has coordinated support with other preservation partners including the Harvard Ukrainian Research Institute, the Center for Urban History of East Central Europe, and East European & Central Asian Studies Collections librarian Liladhar Pendse at University of California, Berkeley.

“It’s so incredible how quickly all of these archivists have pulled together to do this,” Berry said. “Everyone wants to do something. You don’t need to have a ton of technical experience. For anyone who is willing to learn, it’s a great jumping off point for web archiving.”

SUCHO organizers anticipate that after the immediate emergency of website archiving is over, there will be an ongoing need to stay vigilant with data curation of Ukrainian material. To learn more and get involved, visit http://www.sucho.org.

Integrating Web-Based Content into a Vibrant Local History Collection: South Pasadena Public Library and the Community Webs Program

Guest post by: Olivia Radbill, Adult Services/Local History Librarian, South Pasadena Public Library

The South Pasadena Public Library (SPPL) is a single branch library system located in the small city of South Pasadena, California, just fifteen minutes from downtown Los Angeles. SPPL serves a population of approximately 25,000 residents, many of whom are very dedicated to preservation and local history. As the Adult Services/Local History Librarian at SPPL, I regularly interact with local organizations, City staff, City commissioners, and residents in search of the many little-known details of South Pasadena’s history. My role not only entails organizing, processing, and making accessible local history, but also archiving current events that will inevitably be the subject of future research. 

At the onset of the COVID-19 pandemic, when the SPPL building was first shut down and our physical Local History Collection made inaccessible, Library staff sought ways to bring local history to digital platforms in a consistent and manageable way. Our first means of public outreach was the creation of digital exhibits using ArcGIS StoryMaps, an interactive web-mapping tool used to host narrative multimedia displays. Exhibits in the series include Ray Bradbury: Celebrating 100 Years, South Pasadena Public Library: Twelve Decades and Counting, City of Trees: Our Urban Tree Canopy, Summers in SoPas: Highlights of Summers Past, and COVID-19: Living History Project. To date, this series has garnered thousands of views.

Screenshot of “Summers in SoPas: Highlights of Summers Past” online exhibit.

While this series did satisfy some of the community's desire to interact with the Local History Collection, it did not address the community's needs with regard to born-digital content. The COVID-19 pandemic highlighted certain gaps in our collection. One of the most notable was the lack of any born-digital or web-archived content. Previously, SPPL had relied primarily on physical donations and physical City documentation. However, once these objects became inaccessible to both Library staff and patrons during our initial COVID-19 closure in March 2020, we sought means of preserving documentation that has increasingly moved to exclusively web-based platforms. For example, in April 2020 the City of South Pasadena launched “City Hall Scoop”, an online blog intended to provide quick, reliable news updates to local residents. It became imperative for Library staff to actively seek out and ensure preservation of this kind of content.

The South Pasadena Public Library homepage on Archive-It.

At the onset of our involvement in the Community Webs program, I strove to ensure that the objective of our internet archiving was specific, consistent, and attainable. After careful consideration, the following categories were determined to be priorities for the SPPL Local History Collection: City Government, Local Newspapers, and Nonprofit Organizations. Based on these categories we identified many relevant websites, but chose to focus primarily on official websites and social media pages when adding content to the Archive-It platform. The Community Webs project has been an invaluable resource for addressing the needs of both the SPPL staff and the community. Online trainings have aided significantly in overcoming learning curves, helped us determine the scope of our archiving project, and allowed SPPL to create a system in which web-based content is an integral part of our Local History Collection. SPPL, as of March 2022, has archived eleven websites, either once or on a recurring basis. We are hoping to archive 22 new sites by the end of the year, doubling the number we reached last year.

Library as Laboratory Recap: Supporting Computational Use of Web Collections

For scholars, especially those in the humanities, the library is their laboratory. Published works and manuscripts are their materials of science. Today, to do meaningful research, that also means having access to modern datasets that facilitate data mining and machine learning.

On March 2, the Internet Archive launched a new series of webinars highlighting its efforts to support data-intensive scholarship and digital humanities projects. The first session focused on the methods and techniques available for analyzing web archives at scale.

“If we can have collections of cultural materials that are useful in ways that are easy to use — still respectful of rights holders — then we can start to get a bigger idea of what’s going on in the media ecosystem,” said Internet Archive Founder Brewster Kahle.

Just what can be done with billions of archived web pages? The possibilities are endless. 

Jefferson Bailey, Internet Archive’s Director of Web Archiving & Data Services, and Helge Holzmann, Web Data Engineer, shared some of the technical issues libraries should consider and tools available to make large amounts of digital content available to the public.

The Internet Archive gathers information from the web through different methods including global and domain crawling, data partnerships and curation services. It preserves different types of content (text, code, audio-visual) in a variety of formats.

Learn more about the Library as Laboratory series & register for upcoming sessions.

Social scientists, data analysts, historians and literary scholars make requests for data from the web archive for computational use in their research. Institutions use its service to build small and large collections for a range of purposes. Sometimes the projects can be complex and it can be a challenge to wrangle the volume of data, said Bailey.

The Internet Archive has worked on a project reviewing changes to the content of 800,000 corporate home pages since 1996. It has also done data mining for a language analysis project that involved custom extractions for Icelandic, Norwegian and Irish translation.

Transforming data into useful information requires data engineering. As librarians consider how to respond to inquiries for data, they should look at their tech resources, workflow and capacity. Although such datasets are more complicated to produce, their potential has expanded given the size and scale of the collections and the longitudinal analysis they make possible.

“We are getting more and more computational use data requests each year,” Bailey said. “If librarians, archivists, cultural heritage custodians haven’t gotten these requests yet, they will be getting them soon.”

Up next in the Library as Laboratory series:

The next webinar in the series will be held March 16, and will highlight five innovative web archiving research projects from the Archives Unleashed Cohort Program. Register now.

24 Arts Organizations join the Collaborative ART Archive (CARTA)

Earlier this summer, the Internet Archive announced its partnership with the New York Art Resources Consortium (NYARC) to form a collaborative, web-based art resources preservation and access initiative. We are now thrilled to announce that the initiative has kicked off with a diverse roster of 24 participating member institutions throughout the United States and Canada.

The Collaborative ART Archive (CARTA) project has a mission to collect, preserve, and provide access to vital arts content from the web by supporting a vibrant, growing collaboration of art and museum libraries. With funding from federal agencies and foundations, the Internet Archive is able to expand CARTA to a diverse set of museums and art libraries worldwide and to broaden the ways the resulting collections can be discovered and used by both scholars and patrons.

The arts institutions actively participating in this program so far include:

  • American Craft Council
  • American Folk Art Museum
  • ART | library deco
  • Art Gallery of Ontario
  • Art Institute of Chicago
  • Fashion Institute of Technology
  • Getty Research Institute (Getty Library)
  • Harvard University – Fine Arts Library
  • Harvard University – Graduate School of Design
  • Indianapolis Museum of Art at Newfields
  • Leonardo/ISAST
  • Maryland Institute College of Art
  • Museum of Contemporary Art of Georgia
  • National Gallery of Art Library
  • National Gallery of Canada
  • New York Art Resources Consortium
  • Philadelphia Museum of Art
  • San Francisco Museum of Modern Art
  • Sterling and Francine Clark Art Institute Library
  • The Corning Museum of Glass
  • The Menil Collection
  • The Metropolitan Museum of Art
  • The Nelson-Atkins Museum of Art, Spencer Reference Library
  • University of Hawaii at Manoa, Hamilton Library

Membership in the program includes national and regional art and museum libraries throughout the United States and Canada committed to the preservation of 21st century art historical resources on the web. One of our early supporters and current CARTA member Amelia Nelson, Director of Library and Archives at The Nelson-Atkins Museum of Art, noted the increased risk of losing art history on the web in comparison to earlier generations of artists: “Websites are the letters, exhibition postcards, exhibition reviews and newspaper articles of today’s artists and artistic communities, but they aren’t resources that scholars can find in archives like the physical materials that document the careers of earlier generations of artists. I worry that as we lose these sites, we are also losing the potential for scholars to place this moment in the canon of art history and culture broadly. This initiative will build a collaborative and sustainable way for art libraries to pool their limited resources, with the technical, administrative, and organizational expertise of the Internet Archive, to ensure that this content is available for future generations.”

The first group of member institutions has identified an initial set of more than 150 valuable and at-risk websites, articles, and other materials on five primary collection topics: Local Arts Organizations; Artists Websites; Art Galleries; Auction Houses (Catalogs/Price Lists); and Art Criticism. These collections will continue to grow and evolve over the course of the project, capturing thousands of websites and many terabytes of data.

Untitled Art website, nominated by NYARC for inclusion in the CARTA Art Fairs and Events collection.

We’re actively seeking more US-based arts institutions to participate in the project as we continue to grow our collections of web-based art history resources. Collaborative members attend meetings every two months to coordinate curation and other group activities as well as participate in subcommittees focused on collection development, metadata, end-user/researcher engagement, and outreach. If you are involved with an art and/or museum library interested in joining this collaborative project, please complete this form.

The Internet Archive’s Community Webs Program Welcomes 60+ New Members from the US, Canada and Internationally

Community Webs, the Internet Archive’s community history web and digital archiving program, is welcoming over 60 new members from across the US, Canada, and internationally. This new cohort is the first expansion of the Community Webs program outside of the United States and we are thrilled to be supporting the development of diverse, community-based web collections on an international scale. 

Community Webs empowers cultural heritage organizations to collaborate with their communities to build web and digital archives of primary sources documenting local history and culture, especially collections inclusive of voices typically underrepresented in traditional memory collections. The program achieves this mission by providing its members with free access to the Archive-It web archiving service, digital preservation and digitization services, and technical support and training in topics such as web archiving, community outreach, and digital preservation. The program also offers resources to support a local history archiving community of practice and to facilitate scholarly research.

New Community Webs member Karen Ng, Archivist at Sḵwx̱wú7mesh Úxwumixw (Squamish Nation), BC, Canada, notes that the program offers a way to capture community-generated online content in a context where many of the Nation’s records are held by other institutions. “The Squamish Nation community is active in creating and documenting language, traditional knowledge, and histories. Now more than ever in the digital age, it is imperative that these stories and histories be captured and stored in accessible ways for future generations.” 

Similarly, for Maryna Chernyavska, Archivist at the Kule Folklore Centre in Edmonton, Canada, the program will allow the Centre to continue building relationships with community members and organizations. “Being able to assist local heritage organizations with web archiving will help us empower these communities to preserve their heritage based on their values and priorities, but also according to professional standards.”

The current expansion of the program was made possible in part by generous funding from the Andrew W. Mellon Foundation, which supports the growth of Community Webs to new public libraries in the US. Additional funding provided by the Internet Archive allows the program to reach cultural heritage organizations in Canada and beyond. This newest cohort brings the total number of participants in Community Webs to over 150 organizations, a ten-fold increase since the program’s inception in 2017. For a full list of new participants, see below. The program continues to add members. If your institution is interested in joining, please view our open calls for applications and please make your favorite local memory organization aware of the opportunity.

Programming for the new cohort is underway and these members are already diving into the program’s educational resources and familiarizing themselves with the technical aspects of web archiving and digital preservation. We kicked things off recently with introductory Zoom sessions, where participants met one another and shared their organizations’ missions, communities served and goals for membership in the program. Online training modules, developed by staff at the Internet Archive and the Educopia Institute, went live for new members at the beginning of September. And our new cohort joined our existing Community Webs partners at our virtual Partner Meeting on September 22nd. 

We are thrilled to see the program continuing to grow and we look forward to working with our newest cohort. A warm welcome to the following new Community Webs members!

Canada:

  • Aanischaaukamikw Cree Cultural Institute
  • Age of Sail Museum and Archives
  • Ajax Public Library
  • Blue Mountains Public Library – Craigleith Heritage Depot
  • Canadian Friends Historical Association
  • Charlotte County Archives
  • City of Kawartha Lakes Public Library
  • Community Archives of Belleville and Hastings County
  • Confluence Concerts | Toronto Performing Arts Archives
  • Edson and District Historical Society – Galloway Station Museum & Archives
  • Essex-Kent Mennonite Historical Association
  • Ex Libris Association
  • Fishing Lake Métis Settlement Public Library
  • Frog Lake First Nations Library
  • Goulbourn Museum
  • Grimsby Public Library
  • Hamilton Public Library
  • Kule Folklore Centre
  • Maskwacis Cultural College
  • Meaford Museum
  • Milton Public Library
  • Mission Folk Music Festival
  • Nipissing Nation Kendaaswin
  • North Lanark Regional Museum
  • Northern Ontario Railroad Museum and Heritage Centre
  • Parkwood National Historic Site
  • Regina Public Library
  • Sḵwx̱wú7mesh Úxwumixw (Squamish Nation) Archives
  • Société historique du Madawaska Inc.
  • St. Clair West Oral History Project
  • Temagami First Nation Public Library
  • The ArQuives: Canada’s LGBTQ2+ Archives
  • The Historical Society of Ottawa
  • Thunder Bay Museum
  • Tk’emlups te Secwepemc

International:

  • Biblioteca Nacional Aruba
  • Institute of Information Science, Academia Sinica (Taiwan)
  • Mbube Cultural Preservation Foundation (Nigeria)
  • National Library and Information System Authority (NALIS) (Republic of Trinidad and Tobago)

United States:

  • Abilene Public Library
  • Ashland City Library
  • Auburn Avenue Research Library on African American Culture and History
  • Charlotte County Libraries & History
  • Choctaw Cultural Center
  • Cultura Local ABI
  • DC History Center
  • Forsyth County Public Library
  • Fort Worth Public Library
  • Inuit Circumpolar Council – Alaska
  • Menominee Tribal Archives
  • Mineral Point Library Archives
  • Obama Hawaiian Africana Museum
  • Scott County Library System
  • South Sioux City Public Library
  • St. Louis Media History Foundation
  • Tacoma Public Library
  • The History Project
  • The Seattle Public Library
  • Tipp City Public Library
  • University of Hawaiʻi – West Oʻahu
  • Wilmington Public Library District

Congrats to these new partners! We are excited to have you on board.

Internet Archive Launches Collaborative, Web-Based Art Resources Preservation and Access Initiative

Much of the art gallery, artist, and arts organization material that was once published in print form is now available primarily or solely on the web. These groups, like many in the cultural sector, have also been hit especially hard by the global pandemic, putting their web presences at particular risk of being lost if they are not proactively collected and preserved. The creation of reference and research resources that promote streamlined access and enable new types of scholarly use will ensure that the art historical record of the 21st century, and especially of our current global pandemic, is readily accessible far into the future.

For this reason, the Internet Archive, along with the New York Art Resources Consortium (NYARC), is pleased to announce our project Consortial Action to Preserve Born-Digital, Web-Based Art History & Culture. The project recently received a two-year, $305,343 Humanities Collections and Reference Resources grant from the Division of Preservation and Access at the National Endowment for the Humanities. This award will support the formation of a cooperative group of 30+ art and museum libraries from across the United States to collaborate on the preservation of, and access to, vital arts content from the web.

The Internet Archive has a long history of building and supporting collaborative communities and providing non-profit web, preservation, and access services to cultural heritage organizations. The multi-institutional initiative between Internet Archive, NYARC, and other arts and museum organizations will build on similar community-based archiving and professional cultivation projects in the Community Programs group, especially our Community Webs program, currently expanding nationally and internationally. Community Webs has received funding from The Andrew W. Mellon Foundation and IMLS to provide public libraries and cultural heritage organizations with services, training, and professional development opportunities to document their diverse local history. 

NYARC are pioneers in collaborative web archiving and shared services among art and museum libraries. NYARC’s robust web archive collections encompass art resources, artists’ websites, auction catalogs, catalogues raisonnés, and hundreds of New York City gallery websites. The Internet Archive and NYARC have partnered on work to build born-digital collecting capacity among arts organizations in the past, most recently in the IMLS-funded Advancing Art Libraries and Curated Web Archives National Forum and related events. Through discussions, workshops and roadmapping sessions with leaders in art and museum libraries, a strategy and plan for an inclusive, sustainable, cooperative approach to collecting and stewarding born-digital, locally-focused art history collections was developed, forming the basis of this broader cooperative effort.

Members in the project’s preliminary group of art and museum libraries will select topics and specific web content that is relevant to their expertise, will provide metadata to facilitate access to archived content, and will participate in planning and evaluation meetings, all while curating a valuable reference resource that will enhance their traditional collecting areas. The Internet Archive will coordinate communications, facilitate governance and collective curatorial activities, provide technical digital library and archive services, and help enable members to build and maintain discovery and access platforms, as well as facilitate researcher use of the collections resulting from the group’s work.

If your art or museum library is interested in joining this collaborative effort, please fill out this participation form by July 31 to join us!