Category Archives: Web Archive

Robots.txt Files and Archiving .gov and .mil Websites

The Internet Archive is collecting webpages from over 6,000 government domains, over 200,000 hosts, and feeds from around 10,000 official federal social media accounts. Some have asked if we ignore URL exclusions expressed in robots.txt files. The answer is a …

Posted in News, Wayback Machine, Web Archive | 3 Comments

Please: Help Build the 2016 U.S. Presidential Election Web Archive

Help us build a web archive documenting reactions to the 2016 Presidential Election. You can submit websites and other online materials, and provide relevant descriptive information, via this simple submission form. We will archive and provide ongoing access to these …

Posted in Announcements, Archive-It, News, Web Archive | 8 Comments


Jason Scott presents Internet Memes of the last 20 Years at the Internet Archive’s 20th anniversary celebration.

It’s always going to be an open question as to what parts of culture will survive beyond each generation, but there’s very …

Posted in News, Wayback Machine, Web Archive | 2 Comments

Defining Web pages, Web sites and Web captures

The Internet Archive has been archiving the web for 20 years and has preserved billions of webpages from millions of websites. These webpages are often made up of, and link to, many images, videos, style sheets, scripts and other web objects. …

Posted in Announcements, News, Web Archive | 4 Comments