Internet Archive Welcomes Digital Humanists and Cultural Heritage Professionals to “Humanities and the Web: Introduction to Web Archive Data Analysis”

By The Community Programs Team

On November 14, 2022, the Internet Archive hosted Humanities and the Web: Introduction to Web Archive Data Analysis, a one-day introductory workshop for humanities scholars and cultural heritage professionals. The group included disciplinary scholars and information professionals with research interests ranging from Chinese feminist movements, to Indigenous language revitalization, to the effects of digital platforms on discourses of sexuality and more. The workshop was held at the Central Branch of the Los Angeles Public Library and coincided with the National Humanities Conference.

Attendees and Facilitators at Humanities and the Web: Introduction to Web Archive Data Analysis, November 14, 2022, Los Angeles Public Library

The goals of the workshop were to introduce web archives as primary sources and to provide a sampling of tools and methodologies that could support computational analysis of web archive collections. Internet Archive staff shared web archive research use cases and provided participants with hands-on experience building web archives and analyzing web archive collections as data.

Senior Program Manager Lori Donovan guiding attendees in using Voyant to analyze text datasets extracted from an Archive-It collection using ARCH.

The workshop’s central feature was an introduction to ARCH (Archives Research Compute Hub). ARCH transforms web archives into datasets tuned for computational research, allowing researchers to, for example, extract all text, spreadsheets, PDFs, images, audio, named entities and more from collections. During the workshop, participants worked directly with text, network, and image file datasets generated from web archive collections. With access to datasets derived from these collections, the group explored a range of analyses using Palladio, RAWGraphs, and Voyant.

Visualization of the image files contained in the Chicago Architecture Biennial collection, created using Palladio based on an Image File dataset extracted from the collection using ARCH.

The high level of interest and participation in this event reflects the appetite within the humanities for workshops on computational research. Participants said the workshop gave them concrete language for describing the challenges of working with large-scale data, and that it offered strategies they could apply to their own research or use to support their research communities. For those who were not able to make it to Humanities and the Web, we will be hosting a series of virtual and in-person workshops in 2023. Keep your eye on this space for upcoming announcements.

3 thoughts on “Internet Archive Welcomes Digital Humanists and Cultural Heritage Professionals to ‘Humanities and the Web: Introduction to Web Archive Data Analysis’”

  1. Maxwell Bogie

    Because it was great for Humanities and the Web: Introduction to Web Archive Data Analysis on Monday, November 14th, 2022 from one-day introductory workshop to humanities scholars and cultural heritage scholarships and I’m having Indigenous language revitalization about my the effects of digital platforms on discourses to sexuality and everything! That is going to be jealous from Internet Archive Blogs. I’ve enjoyed it too! Thanks. ARCH (Archives Research Compute Hub) is fabulous with Palladio, RAWGraphs, and Voyant! The one will be hosting a series of virtual and in-person workshops in 2023! Goodbye.

  2. Patrick

    I’m retired, but if I were still teaching cultural history, the archive for Arts and Humanities would have been so valuable for my community college students. In retirement, I’ve taken the job of Archivist for Alcoholics Anonymous in southern New Mexico; your work is so appreciated.
