Tag Archives: archive-it

Collective Web-Based Art Preservation and Access at Scale 

Art historians, critics, curators, humanities scholars and many others rely on the records of artists, galleries, museums, and arts organizations to conduct historical research and to understand and contextualize contemporary artistic practice. Yet many of the art-related materials that were once published in print are now available primarily or solely on the web and are thus ephemeral by nature. In response to this challenge, more than 40 art libraries spent the last 3 years developing a collective approach to preserving web-based art materials at scale.

Supported by the Institute of Museum and Library Services and the National Endowment for the Humanities, the Collaborative ART Archive (CARTA) community has successfully aligned effort across libraries large and small, from Manoa, Hawaii to Toronto, Ontario and back, resulting in preservation of and access to 800 web-based art resources, organized into 8 collections (art criticism, art fairs and events, art galleries, art history and scholarship, artists’ websites, arts education, arts organizations, auction houses), totalling nearly 9 TB of data with continued growth. All collections are preserved in perpetuity by the Internet Archive.

Today, CARTA is excited to launch the CARTA portal – providing unified access to CARTA collections.

🎨 CARTA portal 🎨

The CARTA portal includes web archive collections developed jointly by CARTA members, as well as preexisting art-related collections from CARTA institutions, and non-CARTA member collections. CARTA portal development builds on the Internet Archive’s experience creating the COVID-19 Web Archive and Community Webs portal. 

CARTA collections are searchable by contributing organization, collection, site, and page text. Advanced search supports more granular exploration by host, results per host, file types, and beginning and end dates.

🔭 CARTA search 🔭
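
To make the advanced search options concrete, here is a minimal sketch of how filters for host, file type, date range, and results per host might combine. The `Capture` record and the `advanced_search` function are hypothetical names invented for illustration; they are assumptions, not the portal's actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class Capture:
    """Hypothetical record for one archived page (not the portal's real schema)."""
    url: str
    mime_type: str
    captured: date


def advanced_search(
    captures: Iterable[Capture],
    host: Optional[str] = None,
    file_types: Optional[set] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
    results_per_host: Optional[int] = None,
) -> list:
    """Apply each filter only when it is set, then cap the results per host."""
    kept = []
    per_host = defaultdict(int)  # how many results each host has contributed
    for capture in captures:
        capture_host = urlparse(capture.url).hostname or ""
        if host is not None and capture_host != host:
            continue
        if file_types is not None and capture.mime_type not in file_types:
            continue
        if start is not None and capture.captured < start:
            continue
        if end is not None and capture.captured > end:
            continue
        if results_per_host is not None and per_host[capture_host] >= results_per_host:
            continue
        per_host[capture_host] += 1
        kept.append(capture)
    return kept
```

Leaving a filter unset (`None`) means it is not applied, so a plain search and a narrowly scoped one can share the same code path.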

In addition to the CARTA portal, CARTA has worked to promote research use of collections through a series of day-long computational research workshops – Working to Advance Library Support for Web Archive Research, backed by ARCH (Archives Research Compute Hub). A call for applications for the next workshop, held concurrently with the annual Society of American Archivists meeting, is now open.

Moving forward, CARTA aims to grow and diversify its membership in order to increase collective ability to preserve web-based art materials. If your art library would like to join CARTA, please express interest here.

The Internet Archive’s Community Webs Program Welcomes 60+ New Members from the US, Canada and Internationally

Community Webs, the Internet Archive’s community history web and digital archiving program, is welcoming over 60 new members from across the US, Canada, and internationally. This new cohort is the first expansion of the Community Webs program outside of the United States and we are thrilled to be supporting the development of diverse, community-based web collections on an international scale. 

Community Webs empowers cultural heritage organizations to collaborate with their communities to build web and digital archives of primary sources documenting local history and culture, especially collections inclusive of voices typically underrepresented in traditional memory collections. The program achieves this mission by providing its members with free access to the Archive-It web archiving service, digital preservation and digitization services, and technical support and training in topics such as web archiving, community outreach, and digital preservation. The program also offers resources to support a local history archiving community of practice and to facilitate scholarly research.

New Community Webs member Karen Ng, Archivist at Sḵwx̱wú7mesh Úxwumixw (Squamish Nation), BC, Canada, notes that the program offers a way to capture community-generated online content in a context where many of the Nation’s records are held by other institutions. “The Squamish Nation community is active in creating and documenting language, traditional knowledge, and histories. Now more than ever in the digital age, it is imperative that these stories and histories be captured and stored in accessible ways for future generations.” 

Similarly, for Maryna Chernyavska, Archivist at the Kule Folklore Centre in Edmonton, Canada, the program will allow the Centre to continue building relationships with community members and organizations. “Being able to assist local heritage organizations with web archiving will help us empower these communities to preserve their heritage based on their values and priorities, but also according to professional standards.”

The current expansion of the program was made possible in part by generous funding from The Andrew W. Mellon Foundation, which supports the growth of Community Webs to new public libraries in the US. Additional funding provided by the Internet Archive allows the program to reach cultural heritage organizations in Canada and beyond. This newest cohort brings the total number of participants in Community Webs to over 150 organizations, a ten-fold increase since the program’s inception in 2017. For a full list of new participants, see below. The program continues to add members – if your institution is interested in joining, please view our open calls for applications and make your favorite local memory organization aware of the opportunity.

Programming for the new cohort is underway and these members are already diving into the program’s educational resources and familiarizing themselves with the technical aspects of web archiving and digital preservation. We kicked things off recently with introductory Zoom sessions, where participants met one another and shared their organizations’ missions, communities served and goals for membership in the program. Online training modules, developed by staff at the Internet Archive and the Educopia Institute, went live for new members at the beginning of September. And our new cohort joined our existing Community Webs partners at our virtual Partner Meeting on September 22nd. 

We are thrilled to see the program continuing to grow and we look forward to working with our newest cohort. A warm welcome to the following new Community Webs members!

Canada:

  • Aanischaaukamikw Cree Cultural Institute
  • Age of Sail Museum and Archives
  • Ajax Public Library
  • Blue Mountains Public Library – Craigleith Heritage Depot
  • Canadian Friends Historical Association
  • Charlotte County Archives
  • City of Kawartha Lakes Public Library
  • Community Archives of Belleville and Hastings County
  • Confluence Concerts | Toronto Performing Arts Archives
  • Edson and District Historical Society – Galloway Station Museum & Archives
  • Essex-Kent Mennonite Historical Association
  • Ex Libris Association
  • Fishing Lake Métis Settlement Public Library
  • Frog Lake First Nations Library
  • Goulbourn Museum
  • Grimsby Public Library
  • Hamilton Public Library
  • Kule Folklore Centre
  • Maskwacis Cultural College
  • Meaford Museum
  • Milton Public Library
  • Mission Folk Music Festival
  • Nipissing Nation Kendaaswin
  • North Lanark Regional Museum
  • Northern Ontario Railroad Museum and Heritage Centre
  • Parkwood National Historic Site
  • Regina Public Library
  • Sḵwx̱wú7mesh Úxwumixw (Squamish Nation) Archives
  • Société historique du Madawaska Inc.
  • St. Clair West Oral History Project
  • Temagami First Nation Public Library
  • The ArQuives: Canada’s LGBTQ2+ Archives
  • The Historical Society of Ottawa
  • Thunder Bay Museum
  • Tk’emlups te Secwepemc

International:

  • Biblioteca Nacional Aruba
  • Institute of Information Science, Academia Sinica (Taiwan)
  • Mbube Cultural Preservation Foundation (Nigeria)
  • National Library and Information System Authority (NALIS) (Republic of Trinidad and Tobago)

United States:

  • Abilene Public Library
  • Ashland City Library
  • Auburn Avenue Research Library on African American Culture and History
  • Charlotte County Libraries & History
  • Choctaw Cultural Center
  • Cultura Local ABI
  • DC History Center
  • Forsyth County Public Library
  • Fort Worth Public Library
  • Inuit Circumpolar Council – Alaska
  • Menominee Tribal Archives
  • Mineral Point Library Archives
  • Obama Hawaiian Africana Museum
  • Scott County Library System
  • South Sioux City Public Library
  • St. Louis Media History Foundation
  • Tacoma Public Library
  • The History Project
  • The Seattle Public Library
  • Tipp City Public Library
  • University of Hawaiʻi – West Oʻahu
  • Wilmington Public Library District

Congrats to these new partners! We are excited to have you on board.

Reflecting on 9/11: Twenty Years of Archived TV News – Special Event and Resources

On Thursday, September 9, the Internet Archive will host an online webinar, “Reflecting on 9/11: Twenty Years of Archived TV News.” Learn from scholars, journalists, archivists, and data scientists about the importance of archived television for gaining insights into our evolving understanding of history and society.

Participants include the Internet Archive, The American Archive of Public Broadcasting, The Vanderbilt Television News Archive and UCLA Library’s NewsScape TV News Archive. Speakers will include Roger Macdonald (Founder, Internet Archive’s TV News Archive), Jim Duran (Director, Vanderbilt Television News Archive), Karen Cariani (David O. Ives Executive Director, GBH Archives and GBH Project Director, American Archive of Public Broadcasting), Todd Grappone (UCLA Associate University Librarian for Digital Initiatives and Information Technology), Kalev Leetaru (Founder, Global Database of Events, Language and Tone Project), and Philip Bump (Washington Post national correspondent focused largely on the numbers behind politics).

Please register in advance for the September 9 webinar (11:00 AM – 12:30 PM PDT)

Journalists and scholars: as you prepare 20th anniversary 9/11 reporting and analysis, these unique resources are available:

  • Internet Archive’s 9/11 Television News Archive – a browsable library of TV news from U.S. and international broadcasters from 19 networks, over seven days, from the morning of September 11 through September 17, 2001. Contact: Josh Baran 917-797-1799
  • The Vanderbilt Television News Archive (VTNA) – Founded in 1968, the Archive’s collection includes TV news coverage of the attacks on 9/11/2001 and of the following weeks, broadcast by ABC, NBC, CBS and CNN. Over 270 hours of footage are available for viewing and research. The VTNA records and preserves national television broadcasts of the evening news on ABC, CBS, and NBC, with the addition of the primetime news program on CNN in 1995 and the Fox News Channel in 2004. In addition to these nightly recordings, the VTNA also monitors television news networks for breaking live events. Contact: Jim Duran – 615-936-4019
  • The American Archive of Public Broadcasting (AAPB) marks the 20th anniversary of the 9/11 terrorist attacks by releasing a new 9/11 Special Coverage Collection of 68 public television and radio programs from stations across the country covering the events of the attacks and the aftermath. Featured programs include coverage of 9/11 and its anniversaries by The NewsHour with Jim Lehrer, the PBS NewsHour, and much more. The AAPB is a collaboration between Boston public media producer GBH and the Library of Congress to preserve and make accessible culturally significant public media programs from across the country. Contact: Emily Balk, GBH External Communications Manager – 617-300-5317
  • UCLA Library’s NewsScape TV News Archive contains digitized television news programs collected from cable and broadcast sources in the Los Angeles area from 2005 to the present, as well as a smaller number of news programs from other domestic, international, and online sources collected from 2004 to the present. The archive includes hundreds of thousands of hours of news programs, which are indexed and time-referenced via their closed captions and other associated metadata to enable full-text searching and interactive streaming playback.
Interface for browsing TV news on 19 networks – September 11, 2001 through September 17th – Internet Archive

Background:

  • 500+ archived 9/11-related websites curated by The National September 11 Memorial Museum using the Internet Archive’s Archive-It service
  • Internet Archive’s Open Library offers a list of 2,630 published works about the 9/11 attack
  • A decade ago, on the 10th anniversary of 9/11, NYU’s Department of Cinema Studies hosted a conference, “Learning from Recorded Memory,” that featured work by scholars using television news materials to help us understand how TV news presented the events of 9/11 and the international response.
  • This fall, the Internet Archive celebrates its 25th anniversary.
  • The Internet Archive’s TV News Archive repurposes closed captioning as a search index for nearly three million hours of U.S. local and national TV news (2,239,000+ individual shows) from mid-2009 to the present. The public interest library is dedicated to helping journalists, scholars, and the public compare, contrast, cite, and borrow specific portions of the collection. Advanced quantitative analysis opportunities and data visualizations are available via the collaborating GDELT Project’s Television Explorer and AI Television Explorer.
  • Roger Macdonald, founder of the Internet Archive’s TV News Archive, is available for background interviews and to help journalists access the archive.
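
The idea of repurposing closed captions as a search index, mentioned in the list above, can be illustrated with a small inverted index. This is only a sketch under stated assumptions: the `(show_id, start_seconds, text)` tuple shape and the function names are hypothetical, and nothing here describes how the TV News Archive is actually implemented.

```python
import re
from collections import defaultdict

# Lowercase words, digits, and apostrophes count as searchable tokens.
WORD = re.compile(r"[a-z0-9']+")


def build_caption_index(captions):
    """Map each lowercased word to the (show_id, start_seconds) segments containing it.

    `captions` is an iterable of (show_id, start_seconds, text) tuples,
    a hypothetical shape chosen for this sketch.
    """
    index = defaultdict(set)
    for show_id, start_seconds, text in captions:
        for word in WORD.findall(text.lower()):
            index[word].add((show_id, start_seconds))
    return index


def search(index, query):
    """Return the caption segments that contain every word of the query, sorted."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

Because each hit carries a start time, a result can point to the moment in a broadcast where the words were spoken rather than to the program as a whole.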

Internet Archive 9/11 Event and Resources Media Contact:  pressinfo@archive.org

Community Webs joins the Digital Public Library of America

Internet Archive’s Community Webs program is delighted to announce a partnership with the Digital Public Library of America (DPLA) to ingest metadata from the over 700 publicly available Community Webs web archive collections into DPLA. These collections include thousands of archived websites and millions of individual web-published resources that document local history and underrepresented groups. The Internet Archive has been a DPLA content provider since 2015, primarily contributing content from our many print digitizing partnerships. Community Webs will also join DPLA as a member and we are excited for this opportunity to add hyperlocal born-digital and web collections from public libraries nationwide into DPLA’s national portal to cultural heritage collections.

The Community Webs program was launched in 2017 to provide training, infrastructure, services, and professional community cultivation for public librarians across the country for the purpose of documenting local history and community archiving, especially documenting communities and populaces traditionally excluded from the historical record. The program is in the midst of nationwide expansion and currently includes more than 100 member public libraries who are collaborating with local organizations, movements, and groups to document the lives and accomplishments of their citizens. The program continues to add new public libraries and cultural heritage organizations to support and scale their community archiving and has an open call for applications in the US, Canada, and internationally for additional public libraries and local heritage organizations to join the program. Examples of Community Webs collections include:

  • Community Webs members have created more than 30 collections documenting local responses to the COVID-19 pandemic, including COVID-19 Coronavirus East Baton Rouge Parish from East Baton Rouge Parish Library and Schomburg Center for Research in Black Culture’s “Novel Coronavirus COVID-19” collection which focuses on “the African diasporan experiences of COVID-19 including racial disparities in health outcomes and access, the impact on Black-owned businesses, and cultural production.” 
  • Community Webs members have created a number of collections documenting LGBTQ groups, events and other resources, including LGBTQIA/Hormel Resources from San Francisco Public Library and Birmingham Public Library’s “LGBTQ in Alabama” collection.
  • Members are also actively archiving materials on their local or regional culture, such as Kansas City Public Library’s Arts & Culture collection, which “documents Kansas City’s thriving arts community, including galleries, museums, nonprofits, advocacy organizations, criticism and art spaces.”
  • Many members have focused on documenting local social services or advocacy groups, such as Madison Public Library’s Racial Equity and Social Justice, Madison, WI collection of “organizations and non-profits that engage in public discourse on issues of racial equity and social justice.”

DPLA is a mission-aligned organization, and our shared values of collaboration, open access, and community empowerment made it an obvious fit for Community Webs member collections to also be available in DPLA. Some public libraries that are part of the Community Webs program are also members of local or statewide DPLA content hubs and already have digitized content available in DPLA. The partnership between DPLA and Community Webs will ensure that archived web and born-digital collections are accessible alongside similar digitized materials for seamless discovery and access for users. Pairing Community Webs’ free archiving, infrastructure, education, and other services with DPLA’s aggregation tools, hubs networks, and its advocacy role will help expand national access and capacity for making primary sources, and a more diverse archival record, accessible to any online user.

“DPLA’s new partnership with the Community Webs program will help further our mission to provide free digital access to cultural heritage artifacts that inform a truly representative history of our nation,” said Shaneé Yvette Murrain, director of community engagement for DPLA. “We are thrilled to be deepening our work with Internet Archive through a program so perfectly aligned with our organizations’ shared values.”

“Pairing the community web archives of 100+ public libraries and the cohort cultivation that are part of Community Webs with the national scope and professional networks native to DPLA is a perfect match. We are excited to expand access to these amazing grassroots digital collections,” said Jefferson Bailey, Director of Web Archiving & Data Services at Internet Archive.

We are excited to be partnering with DPLA to increase access to these vital community history collections and look forward to building more integrations and furthering this collaboration in the years to come.

Community Webs Seeks Applicants from the US, Canada and Around the World

The Internet Archive is seeking applicants for its next cohort of Community Webs! We are thrilled to announce that the program is now open to additional cultural heritage organizations in the US, as well as any public library or local memory organization in Canada and internationally.

Community Webs provides infrastructure and services, training and education, and professional community cultivation for public libraries and cultural heritage organizations to document local history and the lives of their communities. Launched in the US in 2017 with kickoff funding from the Institute of Museum and Library Services (IMLS), Community Webs began expanding nationally in 2020 with generous support from The Andrew W. Mellon Foundation. Building on the program’s success and continued growth, Internet Archive is now supporting expansion of the program into Canada and to the international community, and is accepting applications for our next cohort kicking off in late summer 2021. The deadline for applications is August 2, 2021.

The program offers a unique opportunity for participating organizations to build capacity in digital collecting. Community Webs participants work alongside peer organizations and with their local communities to document the lives of their citizens, marginalized voices, and groups often absent from the historical record. All Community Webs participants receive: 

  • A guaranteed multi-year free subscription to the Archive-It web archiving service, which includes perpetual storage and access provided by the Internet Archive.
  • Access to additional Internet Archive non-profit services, such as digitization and digital preservation, either for free (as funding allows) or at or below actual cost.
  • Training and educational resources related to digital collections, web archiving, digital preservation, and other topics, as well as access to a cohort community pursuing similar work and to networking spaces, events, and knowledge sharing platforms.
  • The option to leverage program partnerships and integrations to include community web archives in other aggregators or access platforms beyond Internet Archive.

The program currently includes over 100 public libraries from across the United States. These organizations have collectively archived over 70 terabytes of web-based community heritage materials. Some highlights include:

Archived web page: Reporte Hispano, April 6, 2021. New Brunswick Free Public Library. Spanish Newspapers collection.
Archived web page: KC Friends of Alvin Ailey, January 10, 2021. Kansas City Public Library, Arts & Culture collection.

The benefits of the program are wide-ranging and impactful for both participants and their communities. As Community Webs member Makiba J. Foster of the African American Research Library and Cultural Center in Broward County, Florida stated during a recent Community Webs event, Archiving the Black Diaspora, “Community Webs provided me with the training, they provided me with the cohort support, […] provided me with services, and particularly it helped to develop an expertise for me in terms of creating collections of historically significant web materials documenting our local communities.” The program “allowed me to start a project of recovery and documentation of digitally born content related to the Black experience.” More information about what Foster and other Community Webs members are up to can be found by viewing our recent program announcements.

Find out more about the program and keep up to date by visiting the Community Webs website. Apply online today and spread the word! 

Archive-It and Archives Unleashed Join Forces to Scale Research Use of Web Archives

Archived web data and collections are increasingly important to scholarly practice, especially to those scholars interested in data mining and computational approaches to analyzing large sets of data, text, and records from the web. For over a decade Internet Archive has worked to support computational use of its web collections through a variety of services, from making raw crawl data available to researchers and performing customized extraction and analytic services supporting network or language analysis, to hosting web data hackathons and offering dataset download features in our popular suite of web archiving services in Archive-It. Since 2016, we have also collaborated with the Archives Unleashed project to support their efforts to build tools, platforms, and learning materials for social science and humanities scholars to study web collections, including those curated by the 700+ institutions using Archive-It.

We are excited to announce a significant expansion of our partnership. With a generous award of $800,000 (USD) to the University of Waterloo from The Andrew W. Mellon Foundation, Archives Unleashed and Archive-It will broaden our collaboration and further integrate our services to provide easy-to-use, scalable tools to scholars, researchers, librarians, and archivists studying and stewarding web archives.  Further integration of Archives Unleashed and Archive-It’s Research Services (and IA’s Web & Data Services more broadly) will simplify the ability of scholars to analyze archived web data and give digital archivists and librarians expanded tools for making their collections available as data, as pre-packaged datasets, and as archives that can be analyzed computationally. It will also offer researchers a best-of-class, end-to-end service for collecting, preserving, and analyzing web-published materials.

The Archives Unleashed team brings together several co-investigators. Professor Ian Milligan, from the University of Waterloo’s Department of History, Jimmy Lin, Professor and Cheriton Chair at Waterloo’s Cheriton School of Computer Science, and Nick Ruest, Digital Assets Librarian in the Digital Scholarship Infrastructure department of York University Libraries, along with Jefferson Bailey, Director of Web Archiving & Data Services at the Internet Archive, will all serve as co-Principal Investigators on the “Integrating Archives Unleashed Cloud with Archive-It” project. This project represents a follow-on to the Archives Unleashed project that began in 2017, also funded by The Andrew W. Mellon Foundation.

“Our first stage of the Archives Unleashed Project,” explains Professor Milligan, “built a stand-alone service that turns web archive data into a format that scholars could easily use. We developed several tools, methods and cloud-based platforms that allow researchers to download a large web archive from which they can analyze all sorts of information, from text and network data to statistical information. The next logical step is to integrate our service with the Internet Archive, which will allow a scholar to run the full cycle of collecting and analyzing web archival content through one portal.”
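
As a rough illustration of the network data Milligan mentions, the sketch below counts links between hosts, given pairs of source and target URLs extracted from archived pages. The input shape and the `domain_graph` function are assumptions made for this example; this is not the Archives Unleashed toolkit or its API.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_graph(links):
    """Count links between hosts from (source_url, target_url) pairs.

    Links within the same host are skipped so the result describes
    connections between sites rather than internal navigation. Pairs
    where either URL has no hostname (for example mailto: links) are
    also skipped.
    """
    edges = Counter()
    for source_url, target_url in links:
        src = urlparse(source_url).hostname
        dst = urlparse(target_url).hostname
        if not src or not dst or src == dst:
            continue
        edges[(src, dst)] += 1
    return edges
```

The resulting edge counts could be loaded into a graph tool to see which sites in a collection link to one another most often.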

“Researchers, from both the sciences and the humanities, are finally starting to realize the massive trove of archived web materials that can support a wide variety of computational research,” said Bailey. “We are excited to scale up our collaboration with Archives Unleashed to make the petabytes of web and data archives collected by Archive-It partners and other web archiving institutions around the world more useful for scholarly analysis.” 

The project begins in July 2020 and will begin releasing public datasets as part of the integration later in the year. Upcoming and future work includes technical integration of Archives Unleashed and Archive-It, creation and release of new open-source tools, datasets, and code notebooks, and a series of in-person “datathons” supporting a cohort of scholars using archived web data and collections in their data-driven research and analysis. We are grateful to The Andrew W. Mellon Foundation for their support of this integration and collaboration in support of critical infrastructure supporting computational scholarship and its use of the archived web.

Primary contacts:
IA – Jefferson Bailey, Director of Web Archiving & Data Services, jefferson [at] archive.org
AU – Ian Milligan, Professor of History, University of Waterloo, i2milligan [at] uwaterloo.ca

Archiving Information on the Novel Coronavirus (Covid-19)

The Internet Archive’s Archive-It service is collaborating with the International Internet Preservation Consortium’s (IIPC) Content Development Group (CDG) to archive web-published resources related to the ongoing Novel Coronavirus (Covid-19) outbreak. The IIPC Content Development Group consists of curators and professionals from dozens of libraries and archives from around the world that are preserving and providing access to the archived web. The Internet Archive is a co-founder and longtime member of the IIPC. The project will include both subject-expert curation by IIPC members as well as the inclusion of websites nominated by the public (see the nomination form link below).

Due to the urgency of the outbreak, archiving of nominated web content will commence immediately and continue as needed depending on the course of the outbreak and its containment. Web content from all countries and in any language is in scope. Possible topics to guide nominations and collections: 

  • Coronavirus origins 
  • Information about the spread of infection 
  • Regional or local containment efforts
  • Medical/Scientific aspects
  • Social aspects
  • Economic aspects
  • Political aspects

Members of the general public are welcome to nominate websites and web-published materials using the following web form: https://forms.gle/iAdvSyh6hyvv1wvx9. Archived information will also be available soon via the IIPC’s public collections in Archive-It. [March 23, 2020 edit: the public collection can now be found here, https://archive-it.org/collections/13529.]

Members of the general public can also take advantage of the ability to upload non-web digital resources directly to specific Internet Archive collections such as Community Video or Community Texts. For instance, see this collection of “Files pertaining to the 2019–20 Wuhan, China Coronavirus outbreak.” We recommend using a common subject tag, like coronavirus, to facilitate search and discovery. For more information on uploading materials to archive.org, see the Internet Archive Help Center.

A special thanks to Alex Thurman of Columbia University and Nicola Bingham of the British Library, the co-chairs of the IIPC CDG, and to other IIPC members participating in the project. Thanks as well to any and all public nominators assisting with identifying and archiving records about this significant global event.