Tag Archives: End of Term Web Archive

End of Term Web Archive – Preserving the Transition of a Nation

It’s that time again. The 2024 End of Term crawl has officially begun! The End of Term Web Archive #EOTArchive hosts an initiative named the End of Term crawl to archive U.S. government websites in the .gov and .mil web domains — as well as those harder-to-find government websites hosted on .org, .edu, and other top level domains (TLDs) — as one administrative term ends and a new term begins. 

End of Term crawls have been completed for term transitions in 2004, 2008, 2012, 2016, and 2020. The results of these efforts is preserved in the End of Term Web Archive. In total, over 500 terabytes of government websites and data have been archived through the End of Term Web Archive efforts. These archives can be searched full-text via the Internet Archive’s collections search and also downloaded as bulk data for machine-assisted analysis.

The purpose of the End of Term Web Archive is to preserve a record of government websites for historical and research purposes. It is important to capture these websites because they can provide a snapshot of government messaging before and after the transition of terms. The End of Term Web Archive preserves information that may no longer be available on the live web for open access.

The End of Term Archive is a collaborative effort by the Internet Archive along with the University of North Texas (UNT), Stanford University, Library of Congress (LC), U.S. Government Publishing Office (GPO), and National Archives and Records Administration (NARA). Past partners include the University of CA’s California Digital Library (CDL), George Washington University, and the Environmental Data and Governance Initiative (EDGI).

Four images of Whitehouse.gov captured between 2008 and 2020
Whitehouse.gov captures from: 2008 Sept. 15; 2013 Mar. 21; 2017 Feb. 3; and 2021 Feb. 25

We are committed to preserving a record of U.S. government websites. But we need your help to complete the 2024 End of Term crawl. 

How can you help?! 

We have a list of top level domains from the General Services Administration (GSA) and from previous End of term crawls. But we need volunteers to help us out. We are currently accepting nominations for websites to be included in the 2024 End of Term Web Archive.

Submit a url nomination by going to digital2.library.unt.edu/nomination/eth2024/.
We encourage you to nominate any and all U.S. federal government websites that you want to make sure get captured. Nominating urls deep within .gov/.mil websites helps to make our web crawls as thorough and complete as possible. 

Individuals and institutions nominating seed urls are recognized on the individual contributors leaderboard and the institutions leaderboard!

Explore the End of Term Web Archive with full text search and download the data!