Author Archives: jefferson

Internet Archive Partners with University of Edinburgh to Provide Historical Web Data Supporting Machine Translation

The Internet Archive will provide portions of its web archive to the University of Edinburgh to support the School of Informatics’ work building open data and tools for advancing machine translation, especially for low-resource languages. Machine translation is the process of automatically converting text in one language to another.

The ParaCrawl project is mining translated text from the web in 29 languages. With over 1 million translated sentences available for several languages, ParaCrawl is often the largest open collection of translations for each language. The project is a collaboration between the University of Edinburgh, University of Alicante, Prompsit, TAUS, and Omniscien with funding from the EU’s Connecting Europe Facility. Internet Archive data is vastly expanding the data mined by ParaCrawl and therefore the number of translated sentences collected. Led by Kenneth Heafield of the University of Edinburgh, the overall project will yield open corpora and open-source tools for machine translation as well as the processing pipeline.

Archived web data from IA’s general web collections will be used in the project.  Because translations are particularly scarce for Icelandic, Croatian, Norwegian, and Irish, the IA will also use customized internal language classification tools to prioritize and extract data in these languages from archived websites in its collections.

The partnership expands on IA’s ongoing effort to provide computational research services to large-scale data mining projects focusing on open-source technical developments for furthering the public good and open access to information and data. Other recent collaborations include providing web data for assessing the state of local online news nationwide, analyzing historical corporate industry classifications, and mapping online social communities. IA is also expanding its work making custom extractions and datasets available from its 20+ years of historical web data. For further information on IA’s web and data services, contact webservices at archive dot org.

“Make It Weird”: Building a collaborative public library web archive in an arts and counterculture community

This post is reposted from the Archive-It blog and written by guest author Dylan Gaffney of the Forbes Library, one of the public libraries participating in the Community Webs program.

Whether documenting the indie music scene of the 1990s, researching the history of local abolitionists and formerly enslaved peoples in the 1840s, or helping patrons research the early LGBT movement in the area, I am frequently reminded of what was not saved or is not physically present in our collections. These gaps or silences often reflect subcultures in our community, stories that were not told on the pages of the local newspaper, or which might not be reflected in the websites of city government or local institutions. In my first sit-down with a fellow staff member to talk about the prospects for a web archive, we brainstormed how we could more completely capture the digital record of today’s community. We discussed including lesser-known elements like video of music shows in house basements, the blog of a small queer farm commune in the hills, the Instagram account of the kid who photographs local graffiti, etc. My colleague Heather whispered to me excitedly: “We could make it weird!” I knew immediately I had found my biggest ally in building our collections.

The Forbes Library was one of a few public libraries chosen nationwide for the Community Webs cohort, a group of public libraries organized by the Internet Archive and funded by the Institute of Museum and Library Services to expand web archiving in local history collections. For a librarian in a small city of 28,000 people, working in a public library with no full-time archivists, building a web archive from scratch that truly reflected our rich, varied and “weird” cultural community, the arts and music scenes, and the rich tradition of activism in Western Massachusetts was a daunting but exciting project to embark on.

We knew we would have to leverage our working relationships with media organizations, nonprofits, city departments, the arts and music community, and our staff if we truly hoped to build something which reflected our community as it is. Our advantage was that we had such relationships, and could pitch the idea not only through traditional means like press releases and social media, but by chatting after meetings typically spent coordinating film screenings, gallery walks, and lawn concerts. We knew that if we became comfortable enough with the basic concepts of archiving the web, we could pick the brains of activists planning events in our meeting rooms, friends at shows, the staff of our local media company who lend equipment to aspiring filmmakers, and the folks who sell crops from small family farms in the community at the farmers’ markets.

We started by training just a few Information Services staff in one-on-one sessions and shared Archive-It training videos. This helped to broaden the number of librarians familiar with the Archive-It software in general, but also got the wheels turning amongst our reference and circulation staffs–our front lines of communication with the public–in particular. We talked a great deal about what we wish we had in our current archive, about filling in gaps and having the archive more accurately reflect and represent our community.

In order to solicit ideas from the community for preservation, we put together a Google form to be posted online, which was almost entirely cribbed from my Community Webs cohort colleagues at East Baton Rouge Parish Library, Queens Public Library and others. We also set up in-person, one-on-one meetings with community partners and academic institutions that were already engaged in web archiving. We put out press releases and generally just talked to and at anyone who would listen. As a result, nearly all of our first web archival acquisitions come directly from recommendations by the public and our community partners.

For instance, one of the first websites that I knew I wanted to preserve was From Wicked to Wedded, a great site which preserves the history of the LGBTQ community in our area. It was gratifying when two of the first responses to our online outreach also mentioned the site and we had a great conversation with its creator, who researches at the library, and who, like all the content creators we’ve approached thus far, was excited to be included.

Creating an accurate and exciting overview of the lively arts scene in Northampton and the surrounding area seemed like a daunting task at first, but by crawling the websites of notable galleries, arts organizations, and Northampton’s monthly gallery walk, we found that we were quickly able to capture a really interesting cross-section of local artists’ work. We have subsequently begun working with the local arts organizations directly to identify artists who may have their own websites worthy of inclusion.

Similarly, Northampton has a rich music scene for a city of its small size. With the number of people already documenting live music these days, we weren’t sure how to contribute with our own selection and curation, and so asked several folks embedded in the scene to curate some of their own favorite content, then reached out to the bands themselves to get their thoughts. We are still early in this process, but the response has been encouraging and the benefits to the library in building relationships with folks who are documenting the music scene have already led to physical donations to the archive as well.

It was important to us from the beginning to also consult with Northampton Community Television. NCTV partners with the library on film programming to preserve a record of all they do for the community–teaching filmmaking, lending equipment, training and empowering citizen journalists. They, in turn, have pointed us to local filmmakers, and through our ongoing collaborations around film programming and the Northampton film festival, we have a platform for outreach in that community as well.

Staff members and local activists pointed us in the direction of other new local radio shows and citizen journalism websites, both of which give personal takes on local politics. One was a wonderful radio show called Out There by Ruthie, one of our bicycle trash pickup workers. In a single episode, Ruthie will talk to everybody from the mayor, environmental activists and farmers, to the random junior high kids that she runs into hanging out on the bike path under a bridge. The other recommendation was for a new citizen journalism site called Shoestring, which asks common-sense questions of people in power in local government and places them in a national context. The folks from Shoestring stopped by the library’s Arts and Music desk to ask about our bi-weekly Zine Club meeting, which gave us an opportunity to talk about including their site in our web archive and led to a physical donation to the archive as well!

At numerous people’s suggestion, we are preserving the Instagram account of our gruff-looking former video store clerk turned City Council president Bill Dwight. Bill has a great camera, a great eye and has the ability to capture a wonderful cross-section of the community in his feed. Dann Vazquez has an Instagram feed dedicated to capturing oddball moments, new building developments and local graffiti (one of the more ephemeral of our community’s arts), which gives a unique day-to-day perspective of change on the streets of our city.

We are a community rich in activism, with a long tradition that, like our LGBTQ history, has not been properly reflected in our archives. For years, the personal and organizational archives of local activists have found homes at the larger colleges and universities in the Five College area. Now, by including the websites of long-running and new nonprofits and activist organizations, we are able to create a richer archive for future generations to learn from their pioneering work.

We have tried to remain conscious of what communities are being left out of the collections we are developing, such as the non-English speaking communities with whom we need to improve our outreach, and individuals and organizations that might not have a digital presence currently. As we have the ability to offer basic training at the library and through our community partners, we have recently been exploring the idea of creating a website or Instagram account designed to give individuals and organizations the opportunity to try out these technologies without the weight of a long-term commitment, but with the assurance that their content would be preserved among our web archives.

It still feels that we are in the earliest phases of this endeavour, but we have tried to build a collaborative system of curation which could be sustained going forward. By spreading the role of curation across the community, we can prevent staff burnout on the project and ensure that the perspectives represented in the archive are broader, more varied, and thus more reflective of our small city as it is.

Additional credits: IA staff Karl-Rainer Blumenthal, who edits the Archive-It blog, and Maria Praetzellis, who manages the Community Webs program.

Internet Archive, Code for Science and Society, and California Digital Library to Partner on a Data Sharing and Preservation Pilot Project

Research and cultural heritage institutions are facing increasing costs to provide long-term public access to historically valuable collections of scientific data, born-digital records, and other digital artifacts. With many institutions moving data to cloud services, data sharing and access costs have become more complex. As leading institutions in decentralization and data preservation, the Internet Archive (IA), Code for Science & Society (CSS) and California Digital Library (CDL) will work together on a proof-of-concept pilot project to demonstrate how decentralized technology could bolster existing institutional infrastructure and provide new tools for efficient data management and preservation. Using the Dat Protocol (developed by CSS), this project aims to test the feasibility of a decentralized network as a new option for organizations to archive and monitor their digital assets.

Dat is already being used by diverse communities, including researchers, developers, and data managers. California Digital Library is building innovative tools for data publication and digital preservation. The Internet Archive is leading efforts to advance the decentralized web community. This joint project will explore the issues that emerge from collecting institutions adopting decentralized technology for storage and preservation activities. The pilot will feature a defined corpus of open data from CDL’s data sharing service. The project aims to demonstrate how members of a cooperative, decentralized network can leverage shared services to ensure data preservation while reducing storage costs and increasing replication counts. By working with the Dat Protocol, the pilot will maximize openness, interoperability, and community input. Linking institutions via cooperative, distributed data sharing networks has the potential to achieve efficiencies of scale not possible through centralized or commercial services. The partners intend to openly share the outcomes of this proof-of-concept work to inform further community efforts to build on this potential.

Want to learn more? Representatives of this project will be at FORCE 2018, Joint Conference on Digital Libraries, Open Repositories, DLF Forum, and the Decentralized Web Summit.

More about CSS: Code for Science & Society is a nonprofit organization committed to building public interest technology and low-cost decentralized tools with the Dat Project to help people share and preserve versioned digital information. Read more about CSS’ Dat in the Lab project, our recent Community Call, and other activities. (Contact: Danielle Robinson)

More about CDL UC3: The University of California Curation Center (UC3) at the California Digital Library (CDL) provides innovative data curation and digital preservation services to the 10-campus University of California system and the wider scholarly and cultural heritage communities. https://uc3.cdlib.org/. (Contact: John Chodacki)

More about IA: The Internet Archive is a non-profit digital library with the mission to provide “universal access to all knowledge.” It works with hundreds of national and international partners providing web, data, and preservation services and maintains an online library comprising millions of freely-accessible books, films, audio, television broadcasts, software, and hundreds of billions of archived websites. https://archive.org/. (Contact: Jefferson Bailey)

Internet Archive and New York Art Resources Consortium Receive Grant for a National Forum to Advance Web Archiving in Art and Museum Libraries

We are pleased to announce that the Institute of Museum and Library Services (IMLS) has recently awarded a collaborative grant to the New York Art Resources Consortium and our Archive-It group to host a national forum event, along with associated workshops and stakeholder meetings, to catalyze collaboration among art libraries in the stewardship of historically valuable art-related materials published on the web. The New York Art Resources Consortium (NYARC) consists of the research libraries and archives of three leading art museums in New York City: The Brooklyn Museum, The Frick Collection, and The Museum of Modern Art. Archive-It is the web archiving service of the Internet Archive that works with hundreds of heritage organizations, including an international set of museums and art libraries, to preserve and provide access to web-published resources. Archive-It and NYARC will jointly run the project, Advancing Art Libraries and Curated Web Archives: A National Forum.

This National Leadership Grant in the Curating Collections program category to conduct a National Forum and affiliated meetings builds on NYARC’s and Archive-It’s work together expanding web archiving amongst art and museum libraries and archives, including through the ARLIS/NA Web Archiving Special Interest Group, as well as their individual efforts to advance born-digital collection building. In Reframing Collections for the Digital Age, NYARC focused on web archiving program development, including technical work to integrate Archive-It and its discovery services that can inform work in similar institutions. Archive-It, with its Community Webs program, is working with dozens of public libraries on cohort building, educational resources, and network development supporting community history web archiving — a model that can be adopted by the national art library community to scale out its coordinated efforts. In addition, Archive-It has led, and NYARC operationalized, collaborative efforts towards joint API-based systems integrations research and development to further joint services and interoperability. 

By mobilizing a broad effort through an invitational forum, the project aims to achieve national scale through network building and shared infrastructure planning that the project team will foster through a program of discussion, training, and strategic roadmapping. The project will include the contribution of a diverse group of members of the art library community, lead to published outputs on strategic directions and community-specific training materials, and launch a multi-institutional effort to scale the extent of web-published, born-digital materials preserved and accessible for art scholarship and research. Thank you to IMLS for their continued support of work advancing web archiving and the overall national digital platform initiative.

Andrew W. Mellon Foundation Awards Grant to the Internet Archive for Long Tail Journal Preservation

The Andrew W. Mellon Foundation has awarded a research and development grant to the Internet Archive to address the critical need to preserve the “long tail” of open access scholarly communications. The project, Ensuring the Persistent Access of Long Tail Open Access Journal Literature, builds on prototype work identifying at-risk content held in web archives by using data provided by identifier services and registries. Furthermore, the project expands on work acquiring missing open access articles via customized web harvesting, improving discovery and access to these materials from within extant web archives, and developing machine learning approaches, training sets, and cost models for advancing and scaling this project’s work.

The project will explore how adding automation to the already highly automated systems for archiving the web at scale can help address the need to preserve at-risk open access scholarly outputs. Instead of specialized curation and ingest systems, the project will work to identify the scholarly content already collected in general web collections, both those of the Internet Archive and collaborating partners, and implement automated systems to ensure at-risk scholarly outputs on the web are well-collected and are associated with the appropriate metadata. The proposal envisages two opposite but complementary approaches:

  • A top-down approach involves taking journal metadata and open data sets from identifier and registry sources such as ISSN, DOAJ, Unpaywall, CrossRef, and others and examining the content of large-scale web archives to ask “is this journal being collected and preserved and, if not, how can collection be improved?”
  • A bottom-up approach involves examining the content of general domain-scale and global-scale web archives to ask “is this content a journal and, if so, can it be associated with external identifier and metadata sources for enhanced discovery and access?”
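As a rough illustration of how the two approaches mirror each other, here is a minimal Python sketch. It is not the project's method: it assumes a registry that maps ISSNs to journal homepage URLs and a plain list of archived URLs, and it matches on host name only, whereas the project describes machine learning approaches for this work. The function names, ISSNs, and URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def top_down(registry, archived_urls):
    """Registry -> archive: is each known journal's host present in the archive?"""
    archived_hosts = {host_of(u) for u in archived_urls}
    return {issn: host_of(home) in archived_hosts for issn, home in registry.items()}


def bottom_up(registry, archived_urls):
    """Archive -> registry: which ISSN, if any, does each archived URL belong to?"""
    issn_by_host = {host_of(home): issn for issn, home in registry.items()}
    return {u: issn_by_host.get(host_of(u)) for u in archived_urls}


# Hypothetical sample data (made-up ISSNs and example domains).
registry = {
    "1234-5678": "https://www.journal-a.example/",
    "8765-4321": "https://journal-b.example/",
}
archived = [
    "http://journal-a.example/vol1/article2.html",
    "https://blog.example/post",
]

coverage = top_down(registry, archived)   # journals missing from the archive map to False
matches = bottom_up(registry, archived)   # archived URLs with no registry match map to None
```

In this toy data, the top-down pass flags the second journal as uncollected, while the bottom-up pass links the first archived URL to a registry identifier and leaves the blog post unmatched.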

The grant will fund work to use the output of these approaches to generate training sets and test them against smaller web collections in order to estimate how effective this approach would be at identifying the long-tail content, how expensive a full-scale effort would be, and what level of computing infrastructure is needed to perform such work. The project will also build a model for better understanding the costs for other web archiving institutions to do similar analysis on their collections using the project’s algorithms and tools. Lastly, the project team, in the Web Archiving and Data Services group with Director Jefferson Bailey as Principal Investigator, will undertake a planning process to determine resource requirements and work necessary to build a sustainable workflow to keep the results up-to-date incrementally as publication continues.

In combination, these approaches will both improve the current state of preservation for long-tail journal materials and develop models for how this work can be automated and applied to existing corpora at scale. Thanks to the Mellon Foundation for its support of this work; we look forward to sharing the project’s open-source tools and outcomes with a broad community of partners.

27 Public Libraries and the Internet Archive Launch “Community Webs” for Local History Web Archiving

The lives and activities of communities are increasingly documented online; local news, events, disasters, celebrations — the experiences of citizens are now largely shared via social media and web platforms. As these primary sources about community life move to the web, the need to archive these materials becomes an increasingly important activity of the stewards of community memory. And in many communities across the nation, public libraries, as one of their many responsibilities to their patrons, serve the vital role of stewards of local history. Yet public libraries have historically been a small fraction of the growing national and international web archiving community.

With generous support from the Institute of Museum and Library Services, as well as the Kahle/Austin Foundation and the Archive-It service, the Internet Archive and 27 public library partners representing 17 different states have launched a new program: Community Webs: Empowering Public Libraries to Create Community History Web Archives. The program will provide education, applied training, cohort network development, and web archiving services for a group of public librarians to develop expertise in web archiving for the purpose of local memory collecting. Additional partners in the program include OCLC’s WebJunction training and education service, and the public libraries of Queens, Cleveland, and San Francisco will serve as “lead libraries” in the cohort. The program will result in dozens of terabytes of public library-administered local history web archives, a range of open educational resources in the form of online courses, videos, and guides, and a nationwide network of public librarians with expertise in local history web archiving and the advocacy tools to build and expand the network. A full listing of the participating public libraries is below and on the program website.

In November 2017, the cohort gathered together at the Internet Archive for a kickoff meeting of brainstorming, socializing, and, of course, talking all things web archiving. Partners shared details on their existing local history programs and ideas for collection development around web materials. Attendees talked about building collections documenting their demographic diversity or focusing on local issues, such as housing availability or changes in community profile. As an example, Abbie Zeltzer from the Patagonia Public Library spoke about the changes in her community of 913 residents as the town redevelops a long-dormant mining industry. Zeltzer intends to develop a web archive documenting this transition and the related community reaction and changes.

Since the kickoff meeting, the Community Webs cohort has been actively building collections, from hyper-local media sites in Kansas City, to neighborhood blogs in Washington D.C., to Mardi Gras in East Baton Rouge. In addition, program staff, cohort members, and WebJunction have been building out an extensive online course space with educational materials for training on web archiving for local history. The full course space and all open educational resources will be released in early 2019 and a second full in-person meeting of the cohort will take place in Fall 2018.

For further information on the Community Webs program, contact Maria Praetzellis, Program Manager, Web Archiving [maria at archive.org] or Jefferson Bailey, Director, Web Archiving [jefferson at archive.org].

Public Library | City | State
Athens Regional Library System | Athens | GA
Birmingham Public Library | Birmingham | AL
Brooklyn Public Library – Brooklyn Collection | New York City | NY
Buffalo & Erie County Public Library | Buffalo | NY
Cleveland Public Library | Cleveland | OH
Columbus Metropolitan Library | Columbus | OH
County of Los Angeles Public Library | Los Angeles | CA
DC Public Library | Washington | DC
Denver Public Library – Western History and Genealogy Department and Blair-Caldwell African American Research Library | Denver | CO
East Baton Rouge Parish Library | East Baton Rouge | LA
Forbes Library | Northampton | MA
Grand Rapids Public Library | Grand Rapids | MI
Henderson District Public Libraries | Henderson | NV
Kansas City Public Library | Kansas City | MO
Lawrence Public Library | Lawrence | KS
Marshall Lyon County Library | Marshall | MN
New Brunswick Free Public Library | New Brunswick | NJ
Schomburg Center for Research in Black Culture (NYPL) | New York City | NY
Patagonia Library | Patagonia | AZ
Pollard Memorial Library | Lowell | MA
Queens Library | New York City | NY
San Diego Public Library | San Diego | CA
San Francisco Public Library | San Francisco | CA
Sonoma County Public Library | Santa Rosa | CA
The Urbana Free Library | Urbana | IL
West Hartford Public Library | West Hartford | CT
Westborough Public Library | Westborough | MA

Military Industrial Powerpoint Complex Karaoke! — Tuesday, March 6

The Internet Archive presents the first ever Military Powerpoint Karaoke: a night of “Powerpoint Karaoke” using presentations in the Military Industrial Powerpoint Complex collection at archive.org that were extracted by the Internet Archive from its public web archive and converted into a special collection of PDFs/epubs. The event will take place on Tuesday, March 6th at 7:30pm at our headquarters in San Francisco. The show will be preceded by a reception at 6:30 pm, when doors will also open.

Get Free Tickets Here

Also known as “Battle Decks,” Powerpoint Karaoke is an improvisational art event where audience members give a presentation using a set of Powerpoint slides that they’ve never seen before. There are three rules: 1) The presenter cannot see the slides before presenting; 2) The presenter delivers each slide in succession without skipping slides or going back; and 3) The presentation ends when all slides are presented, or after 5 minutes (whichever comes first). We’re thrilled to have Rick Prelinger, creator of Lost Landscapes and Prelinger Archive, and Avery Trufelman of 99% Invisible, joining us to deliver headlining Powerpoint decks. The rest of the presentations will be delivered by you — the audience members who sign up.

This event will use, as its source material, a curated collection of the Internet Archive’s Military Industrial Powerpoint Complex, a special project alongside GifCities that was originally created for the Internet Archive’s 20th Anniversary in October 2016. For the project, IA staff extracted all the Powerpoint files from its archive of the government’s public .mil web domain. The collection was expanded in early 2017 to include materials collected during the End of Term project, which archived a snapshot of the .gov and .mil web domains during the administration change. The Military Industrial Powerpoint Complex collection contains over 57,000 Powerpoint decks, each charged with material that ranges from the violent to the banal, featuring attack modes, leadership styles, harness types, and modes for requesting vacation days from the US Military. The project was originally inspired by writer Paul Ford’s article, “Amazing Military Infographics” which can be found in the Wayback Machine. As a whole, this collection forms a unique snapshot into our government’s Military Industrial Complex.

This event is organized by artists/archivists Liat Berdugo and Charlie Macquarie in partnership with the Internet Archive.

Tuesday, March 6
6:30 pm Reception
7:30 pm Program

Internet Archive
300 Funston Avenue
San Francisco, CA 94118


Canadian Library Consortia OCUL and COPPUL Join Forces with Archive-It to Expand Web Archiving in Canada

The Council of Prairie and Pacific University Libraries (COPPUL) and the Ontario Council of University Libraries (OCUL) have joined forces in a multi-consortial offering of Archive-It, the web archiving service of the Internet Archive. Working together, COPPUL and OCUL are considering ways that they can significantly expand web archiving in Canada.

A coordinated subscription to Archive-It builds on the efforts of Canadian universities that have developed web archiving programs over the years, and the past work of Archive-It with both COPPUL and OCUL members.  With 12 COPPUL members and 12 OCUL members (more than half the total membership) now subscribing to Archive-It, there is an opportunity to build a foundation for further collaboration supporting research services and other digital library initiatives. In addition, participation by so many libraries helps lower the barrier of entry for additional member institutions to join in web archiving efforts across Canada.

“OCUL is very pleased to be able to offer Archive-It to our members,” said Ken Hernden, University Librarian at Algoma University and OCUL Chair. “Preservation of information and research is an important aspect of what libraries do to benefit scholars and communities. Preserving information for the future was challenging in a paper-and-print environment. It has become even more so in the digital information environment. We hope that enabling access to this tool will help build capacity for web archiving across Ontario, and beyond.”

“Tools like Archive-It enable libraries and archives of all sizes to build new kinds of collections to support their communities in an environment where more and more of our cultural memory has moved online. We’re absolutely thrilled to be working with our OCUL colleagues in this critically important area,” said Corey Davis, COPPUL Digital Preservation Network Coordinator.

“Archive-It is excited to ramp up its support for web archiving in Canada. The joint subscription is a strategic and cost-effective way to expand web archiving among Canadian universities and to encourage participation from smaller universities who may not have felt they had the institutional resources to develop a web archiving program without the support of the consortiums,” said Lori Donovan, Senior Program Manager for Archive-It.

OCUL is a consortium of Ontario’s 21 university libraries. OCUL provides a range of services to its members, including collection purchasing and a shared digital information infrastructure, in order to support high quality education and research in Ontario’s universities. In 2017, OCUL commemorates its 50th anniversary.

Working together, COPPUL members leverage their collective expertise, resources, and influence, increasing capacity and infrastructure, to enhance learning, teaching, student experiences and research at our institutions. The consortium comprises 22 university libraries located in Manitoba, Saskatchewan, Alberta and British Columbia, as well as 15 affiliate members across Canada. First deployed in 2006, Archive-It is a subscription web archiving service from the Internet Archive that helps organizations to harvest, build, and preserve collections of web-published digital content.

Additionally, the recently created Canadian Web Archiving Coalition (CWAC) will help build a community of practice for Canadian organizations engaging in web archiving and create a network for collaboration, support, and knowledge sharing. Under the auspices of the Canadian Association of Research Libraries (CARL) and in collaboration with Library and Archives Canada (LAC), the CWAC plans to hold an inaugural meeting in conjunction with the Internet Preservation Coalition General Assembly this September at LAC’s Preservation Centre in Gatineau, QC. For more information about the CWAC, including how to join, please contact corey@coppul.ca.

For more information on the consortial subscription, contact carol@coppul.ca or jacqueline.cato@ocul.on.ca or lori@archive.org.

IMLS Grant to Advance Web Archiving in Public Libraries

We are excited to announce that the Institute of Museum and Library Services (IMLS) has recently awarded our Archive-It service a Laura Bush 21st Century Librarian grant from its Continuing Education in Curating Collections program for the project Community Webs: Empowering Public Librarians to Create Community History Web Archives.

Working with partners from Queens Public Library, Cleveland Public Library, and San Francisco Public Library, and with OCLC’s WebJunction, which offers education and training to public libraries nationwide, the “Community Webs” project will provide training, cohort support, and services for a group of librarians at 15 different public libraries to develop expertise in creating collections of historically valuable web materials documenting their local communities. Project outputs will include over 30 terabytes of community history web archives and a suite of open educational resources, from guides to videos, for use by any librarian, archivist, or heritage professional working to preserve collections of local history composed of online materials.

We are now accepting applications from public libraries to participate in the program! Please help us spread the word about this opportunity to the entire public library community. You can also visit the program’s webpage for more information, and the project’s grant materials are available through the IMLS award page.

Curating web archives documenting the lives of their patrons offers public librarians a unique opportunity to position themselves as the natural stewards of web-published local history and solidifies their role as information custodians and community anchors in the era of the web. We owe a debt of thanks to IMLS for supporting innovative tools and training for librarians and look forward to working with our public library friends and colleagues to advance web archiving within their profession and for the benefit of their local communities.

K-12 Web Archivists Capture History in the Making

by Sylvie Rollason-Cass, Web Archivist, Archive-It

This year marked the 9th season of the K-12 web archiving program. Students from 11 schools around the country worked together to think critically about information on the web and to select websites to archive for the future. Their collections are centered around topics that reflect their interests, their day-to-day lives, current events, and topics they studied in class. Each school incorporates web archiving into its curriculum differently. This year 3 teachers generously shared their experiences participating in the K-12 Web Archiving program. Find out more about their year below, and be sure to check out all of the 2016-2017 Student Collections.

Web Archiving in the Civics Classroom at Williams Middle Magnet School 

Elizabeth Smith – Civics Teacher

My name is Elizabeth Smith and I teach Grade 7 Civics at Williams Middle Magnet School in Tampa, Florida. I was so excited when I read about this project and could not wait to apply. I am technologically challenged but always look for ways to integrate technology into my classroom. Our civics curriculum is spiraled with analysis of primary and secondary sources and this project was a great way to enrich what we were already doing. We chose Florida as our focus as many of my students wanted to learn more about the state in which they live. Students chose to research websites of their own personal and academic interest. Several said this project would help them identify areas of research for their upcoming 8th grade community project. We are looking forward to being a part of the project again during the next archiving season!

Check out Williams Middle Magnet School’s 2016-2017 Collections >>>

Using the Wayback Machine at the Rooftop School, Samuel discovers that YouTube was originally a dating website.

“Archive/Opera” – The Studio at Mayeda at Rooftop School

Andi Wong – Teaching Artist

At Rooftop Alternative PreK-8 School in San Francisco, 40 seventh and eighth graders worked with teaching artist Andi Wong to establish The Rooftop ARTchives. The “Archive/Opera” class at the school’s Mayeda Campus gave these students the opportunity to create both the “archives” (“public records,” from the Greek ta arkheia) and the opera (new “work”).

In this tumultuous political climate, the importance of community, civic responsibility and cultural memory became clear to our students. History will record how tribes of water-protectors gathered together at Standing Rock; millions of women marched around the world in pink knitted caps; scientists worked with archivists to save climate data and disappearing government websites; and the National Park Service went rogue on Twitter. The act of archiving requires careful consideration of the past, present and future. As our students ventured beyond the walls of their classrooms to experience the stairways, slides and expansive vistas of Twin Peaks, their open conversations led to many questions. How can we know what is missing, if something has not yet been found? How many students have graduated from Rooftop? How far does the raven fly? Will San Francisco still experience fog in one hundred years? When today’s youth are sixty-four, will they still remember the lyrics to all of the songs from Hamilton?

The Archive/Opera class culminated with a community gathering, an Open House celebration of the Mayeda Campus’ 20th Anniversary, featuring artwork, musical performances, and student speeches about the archiving experience. A tea serving ceremony honored principal Nancy Mayeda and the teachers who first opened the doors to the Mayeda Campus in 1997. The evening’s program closed with the presentation of a City proclamation and the dedication of the Rooftop ARTchives. When our students were asked to reflect on what they valued most about this year’s experience, they spoke of freedom, friendship and community pride in accomplishing something important together. Many thanks to the Internet Archive, the Library of Congress and Archive-It’s K12 Web Archiving Program for helping Rooftop’s students to capture history in the making. The act of archiving gave our students a very real sense of their collective power and responsibility as the keepers of their own stories and memories.

Check out Rooftop School’s 2016-2017 Collections >>>

Reflections on the 2016-17 K12 Web Archiving Project at Mount Dora High School

Patricia Carlton, PhD – Media Specialist

Challenging adult authority may be the bailiwick of teenagers, yet when questioning the authority of the Internet, teens are not as skilled or tenacious. Web archiving presents a fun and empowering way for my high school students to critically examine the authorship and credibility of the Internet, as well as identify what is historically and culturally significant. When this year’s web-archiving students began selecting and creating collections for the archive, I suggested they peer more closely under the hood of each site and object. What did they discover from their crawls that wasn’t immediately apparent from their first “reading” of the website? The following quotes excerpted from a sampling of the students’ final review and evaluation of the project reveal the type of discoveries made regarding their collections and the Internet in general.

The web is an extremely important factor in preserving things in order to view them in later years. The web, in my opinion is also much easier and more accessible to a wider range of people. While on the web, you have to be extremely careful on what you consider a reliable source. – Felicia

I have learned that the web is very contradictory and is filled with differing opinions, facts, and beliefs, and you normally can find an answer you like if you search long enough, despite general public beliefs. – Kacee

Most students assumed greater responsibility for controlling their crawls than my previous web-archivers, evidenced by their attention to their crawl scopes and their carefully crafted descriptions in both collection-level and seed-level metadata. The 2016-17 cohort “authored” their respective collections and even added corresponding MLA citations! They believed not only in the significance of their collections (conspiracy theories, political memes, and chick flicks to mention a few), but they also believed they were contributing new knowledge – real, meaningful content to the Internet that someone, someday might discover. And, their teen voices would be the authority behind their interpretation and curation!

Check out Mount Dora’s 2016-2017 Collections >>>