
Over 200 terabytes of the government web archived!

In our December post, “Preserving U.S. Government Websites and Data as the Obama Term Ends,” we described our participation in the End of Term Web Archive project to preserve federal government websites and data at times of administration changes. We wanted to give a quick update on the project — we have archived a heck of a lot of data!

Between Fall 2016 and Spring 2017, the Internet Archive archived over 200 terabytes of government websites and data: over 100TB of public websites and over 100TB of public data from federal FTP file servers, together totaling over 350 million URLs/files. Among them are over 70 million HTML pages, over 40 million PDFs, and, toward the other end of the spectrum and for semantic web aficionados, 8 files of the text/turtle mime type. Other End of Term partners have also been vigorously preserving websites and data from the .gov/.mil web domains.

Every web page we have archived is accessible through the Wayback Machine, and we are working to add the 2016 harvest to the main End of Term portal soon. While we continue to analyze this collection, we have posted some preliminary statistics from the new Wayback Machine's summary interface on the End of Term (EOT 2016) summary stats page; those and additional stats are served via a public EOT 2016 stats API, and the full collection is also available.
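
For readers who want to poke at the collection themselves, the Wayback Machine's public CDX API is a good starting point. Below is a minimal sketch, assuming Python 3 with the requests package installed; the example URL, date range, and field list are illustrative choices, not project settings:

    import requests

    # Query the Wayback Machine's public CDX API for captures of a URL.
    # The URL and date range below are illustrative examples.
    params = {
        "url": "whitehouse.gov",
        "from": "20160901",              # Fall 2016...
        "to": "20170531",                # ...through Spring 2017
        "output": "json",
        "fl": "timestamp,mimetype,statuscode",
        "limit": "25",
    }
    rows = requests.get("https://web.archive.org/cdx/search/cdx",
                        params=params, timeout=60).json()
    for timestamp, mimetype, statuscode in rows[1:]:   # rows[0] is the header
        print(f"https://web.archive.org/web/{timestamp}/whitehouse.gov",
              mimetype, statuscode)

Swapping in different url or fl parameters is one easy way to reproduce summary counts like those above for other sites or file types.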

Through the EOT project’s public nomination form and through our collaboration with DataRefuge, the Environmental Data and Governance Initiative (EDGI), and other efforts, over 100,000 webpages or government datasets were nominated by citizens and preservationists for archiving. The EOT and community efforts have also garnered notable press (see our End of Term 2016 Press collection). We are working with partners to provide access to the full dataset for use in data mining and computational analysis and hosted a hackathon earlier this year to support use of the Obama White House Social Media datasets.

While the specific End of Term collection has closed, we continue our large-scale, dedicated efforts to preserve the government web. Working with the University of North Texas, we launched the Government Web & Data Archive nomination form so the public can continue to nominate public government websites and data to be archived.

Lastly, archiving government data remains a critical activity of the preservation community. You can support these efforts by continuing to nominate websites, promoting the EOT project via press and outreach (contact the EOT project team for any inquiries), and by donating to the Internet Archive to support our ongoing mission to provide “Universal Access to All Knowledge.”

Join us for a White House Social Media and Gov Data Hackathon!

Join us at the Internet Archive this Saturday, January 7, for a government data hackathon! We are hosting an informal hackathon working with White House social media data, government web data, and data from election-related collections. We will provide more gov data than you can shake a script at! If you are interested in attending, please register using this form. The event will take place at our 300 Funston Avenue headquarters from 10am to 5pm.

We have been working with the White House on their admirable project to provide public access to eight years of White House social media data for research and creative reuse. Read more about their efforts in this blog post. Copies of this data will be publicly accessible at archive.org. We have also been furiously archiving the federal government web as part of our collaborative End of Term Web Archive and have collected a voluminous amount of media and web data as part of the 2016 election cycle. Data from these projects — and others — will be made publicly accessible for folks to analyze, study, and do fun, interesting things with.

At Saturday’s hackathon, we will give an overview of the datasets available, have short talks from affiliated projects and services, and point to tools and methods for analyzing the hackathon’s data. We plan for a loose, informal event. Some datasets that will be available for the event and publicly accessible online:

  • Obama Administration White House social media from 2009-current, including Twitter, Tumblr, Vine, Facebook, and (possibly) YouTube
  • Comprehensive web archive data of current White House websites: whitehouse.gov, petitions.whitehouse.gov, letsmove.gov and other .gov websites
  • The End of Term Web Archives, a large-scale collaborative effort to preserve the federal government web (.gov/.mil) at presidential transitions, including web data from 2008, 2012, and our current 2016 project
  • Special sub-collections of government data, such as every powerpoint in the Internet Archive’s web archive from the .mil web domain
  • Extensive archives of social media data related to the 2016 election, including data from candidates, pundits, and media
  • Full text transcripts of Trump candidate speeches
  • Python notebooks, cluster computing tools, and pointers to methods for playing with data at scale (see the sketch just below this list for one way to get started).
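
To give a flavor of working with the web archive data, here is a minimal sketch of reading a WARC file, assuming Python 3 with the warcio package installed; the filename is hypothetical and stands in for whichever files are distributed at the event:

    from warcio.archiveiterator import ArchiveIterator

    # Iterate the response records in a (hypothetical) WARC file and print
    # each archived URL with its Content-Type.
    with open("example-gov-crawl.warc.gz", "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response":
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            ctype = (record.http_headers.get_header("Content-Type")
                     if record.http_headers else "unknown")
            print(url, ctype)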

Much of this data was collected in partnership with other libraries and with the support of external funders. We thank, foremost, the current White House Office of Digital Strategy staff for their advocacy for open access and for working with us and others to make their social media open to the public. We also thank our End of Term Web Archive partners and related community efforts helping preserve the .gov web, as well as the funders that have supported many of the collecting and engineering efforts that make all this data publicly accessible, including the Institute of Museum and Library Services, Altiscale, the Knight Foundation, the Democracy Fund, the Kahle-Austin Foundation, and others.

Preserving U.S. Government Websites and Data as the Obama Term Ends

Long before the 2016 Presidential election cycle, librarians understood an often-overlooked fact: vast amounts of government data and digital information are at risk of vanishing when a presidential term ends and administrations change. For example, 83% of .gov PDFs disappeared between 2008 and 2012.

That is why the Internet Archive, along with partners from the Library of Congress, University of North Texas, George Washington University, Stanford University, California Digital Library, and other public and private libraries, is hard at work on the End of Term Web Archive, a wide-ranging effort to preserve the entirety of the federal government web presence, especially the .gov and .mil domains, along with federal websites on other domains and official government social media accounts.

While it is not the only project the Internet Archive has underway to preserve government websites, FTP sites, and databases at this time, the End of Term Web Archive is a far-reaching one.

The Internet Archive is collecting webpages from over 6,000 government domains, over 200,000 hosts, and feeds from around 10,000 official federal social media accounts. The effort is likely to preserve hundreds of millions of individual government webpages and files and could end up totaling well over 100 terabytes of archived materials. Over its full history of web archiving, the Internet Archive has preserved over 3.5 billion URLs from the .gov domain, including over 45 million PDFs.

This end-of-term collection builds on similar initiatives in 2008 and 2012 by original partners Internet Archive, Library of Congress, University of North Texas, and California Digital Library to document the “gov web,” which has no mandated, domain-wide single custodian. For instance, here is the National Institute for Literacy (NIFL) website in 2008. The domain went offline in 2011. Similarly, the Sustainable Development Indicators (SDI) site was later taken down. Other websites, such as invasivespecies.gov, were later folded into larger agency domains. Every web page archived is accessible through the Wayback Machine, and past and current End of Term collections are full-text searchable through the main End of Term portal. We have also worked with additional partners to provide access to the full data for use in data-mining research and projects.
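
For the technically inclined, checking whether a vanished site has an archived copy can also be done programmatically via the Wayback Machine's public availability API. This is a minimal sketch assuming Python 3 with the requests package; the domain nifl.gov and the timestamp are illustrative stand-ins for the NIFL example above:

    import requests

    # Ask the Wayback Machine's availability API for the snapshot closest
    # to a given date; NIFL's domain went offline in 2011.
    resp = requests.get("https://archive.org/wayback/available",
                        params={"url": "nifl.gov", "timestamp": "20080601"},
                        timeout=60)
    snap = resp.json().get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        print("Archived copy:", snap["url"], "captured at", snap["timestamp"])
    else:
        print("No snapshot found.")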

The project has received considerable press attention this year, with related stories in The New York Times, Politico, The Washington Post, Library Journal, Motherboard, and others.

“No single government entity is responsible for archiving the entire federal government’s web presence,” explained Jefferson Bailey, the Internet Archive’s Director of Web Archiving.  “Web data is already highly ephemeral and websites without a mandated custodian are even more imperiled. These sites include significant amounts of publicly-funded federal research, data, projects, and reporting that may only exist or be published on the web. This is tremendously important historical information. It also creates an amazing opportunity for libraries and archives to join forces and resources and collaborate to archive and provide permanent access to this material.”

This year has also seen a significant increase in citizen- and librarian-driven “hackathons” and “nomination-a-thons,” where subject experts and concerned information professionals crowdsource lists of high-value or endangered websites for the End of Term archiving partners to crawl. Librarian groups in New York City are holding nomination events to make sure important sites are preserved. And universities such as the University of Toronto are holding events for “guerrilla archiving” focused specifically on preserving climate-related data.

We need your help too! You can use the End of Term Nomination Tool to nominate any .gov or other government website or social media account, and it will be archived by the project team. If you have other ideas, please comment here or send them to info@archive.org. And you can also help by donating to the Internet Archive to support our continued mission to provide “Universal Access to All Knowledge.”

Please: Help Build the 2016 U.S. Presidential Election Web Archive

Help us build a web archive documenting reactions to the 2016 Presidential Election. You can submit websites and other online materials, and provide relevant descriptive information, via this simple submission form. We will archive and provide ongoing access to these materials as part of the Internet Archive Global Events collection.

Since its beginning, the Internet Archive has worked with a global partner community of cultural heritage institutions, researchers and scholars, and citizens to build crowdsourced topical web archives that preserve primary sources documenting significant global events. Past collections include the Occupy Movement, the 2013 US Government Shutdown, the Jasmine Revolution in Tunisia, and the Charlie Hebdo attacks. These collections leverage the power of individual curators and motivated citizens to help expand our collective efforts to diversify and augment the historical record. Any webpages, sites, or other online resources about the 2016 Presidential Election are in scope. This web archive will build upon our affiliated efforts, such as the Political TV Ad Archive, and other collecting strategies, to provide permanent access to current political events.

As we noted in a recent blog post, the Internet Archive is “well positioned, with our mission of Universal Access to All Knowledge, to help inform the public in turbulent times, to demonstrate the power in sharing and openness.” You can help us in this mission by submitting websites that preserve the online record of this unique historical moment.

GifCities: The GeoCities Animated GIF Search Engine

 

[A sampling of GeoCities GIFs: an under-construction banner, the dancing baby, Homer, a divider line, a skeleton, a surfing CPU, and a guitar man]

Try the Internet Archive’s animated GIF search engine at GifCities.org! You can now get your early-web GIF fix and have a fun way to browse the web archive. Search for snowglobes or butterflies or balloons or (naturally) cats. Clicking on a GIF brings you to the original page in the Wayback Machine. (Then please consider donating to the Archive.)

One of the goals for our 20th anniversary event last week was to highlight the amusing and wacky corners of the web, as represented in our web archive, in order to provide a light-hearted, novel perspective on the history of this amazing publication platform that we have worked to preserve over the years.

The animated GIF is perhaps the iconic, indomitable filetype of the early web. Meme-vessel, page-spacer, action-graphic-maker — GIFs are a quintessential feature of the 1990s web aesthetic, but remain just as popular today as they were twenty years ago. GeoCities, the first major web hosting platform for individual users to create their own pages, and once the third most visited site on the web before being shut down in 2009, occupies a similarly notable place in the history of the web.

So we combined these two aspects of web history by extracting every animated GIF from GeoCities in our web archive and building a search engine on top of them. Behold, for your viewing pleasure, over 4,500,000 animated GIFs (1,600,000 unique), searchable by filename and URL path, with most GIFs linking to the archived GeoCities page where they were originally displayed.
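
For the curious, the sketch below illustrates the general shape of such a pipeline; it is an illustration rather than our production code. It assumes Python 3 with the warcio and Pillow packages and a hypothetical sample WARC file, keeps only GIFs that Pillow reports as animated, and indexes them by the tokens in their URLs:

    import io
    import re
    from collections import defaultdict

    from PIL import Image
    from warcio.archiveiterator import ArchiveIterator

    index = defaultdict(set)   # filename/path token -> archived GIF URLs

    # Walk a (hypothetical) GeoCities WARC file, keep only animated GIFs,
    # and index them by the words in their filenames and URL paths.
    with open("geocities-sample.warc.gz", "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response":
                continue
            url = (record.rec_headers.get_header("WARC-Target-URI") or "").lower()
            if not url.endswith(".gif"):
                continue
            payload = record.content_stream().read()
            try:
                animated = getattr(Image.open(io.BytesIO(payload)),
                                   "is_animated", False)
            except Exception:
                continue               # skip truncated or corrupt images
            if animated:
                for token in re.split(r"[^a-z0-9]+", url):
                    if token:
                        index[token].add(url)

    print(len(index), "searchable tokens indexed")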

Some random staff faves:

[A few staff-favorite GIFs: a dinosaur, skull mail, and more]

Soft-launched at our anniversary event on Wednesday, where we also projected GifCities on the side of our headquarters in San Francisco, the project has been featured in The Guardian, BoingBoing, the A.V. Club, CNET, and others. The GeoCities GIF collection was also made available for creative reuse by artists and researchers, and featured in work such as the GifCollider project currently showing at BAMPFA (see the videos online) and the Hall of GIFs data visualization at NCSU. Shout-outs also go to others working with the GeoCities web archive, including the Geocities Research Institute and historians. More details on the project can be found at the GifCities about page.

And yes, like every other upstanding web citizen, we GifCities’ed ourselves.

10 Years of Archiving the Web Together

As the Internet Archive turns 20, the Archive-It community is proud to celebrate an anniversary of its own: 10 years of working with thousands of librarians, archivists, and others to preserve the web and build rich, expansive collections of websites for discovery and use by future generations. Eighteen partners inaugurated the Archive-It service in 2006. Since then, that list has grown to include more than 450 organizations and individuals, each with unique goals and collecting scope. In that time, they have added more than 17 billion (yes, with a “b”) URLs to their collections.


Archive-It partners over the years. Clockwise from top-left: Margaret Maes (Legal Information Preservation Alliance) and Nicholas Taylor (Stanford University); James Jacobs (Stanford University) and Kent Norsworthy (University of Texas at Austin); K12 web archivists at PS 174 in Queens; Renate Giacomuzzi, Elisabeth Sporer (University of Innsbruck), and Kristine Hanna (Internet Archive)

And to give you just a hint of how the overall collection has grown: that’s about 5 billion new URLs in just the last year! They’ve captured some momentous historical events, local community history, and social and cultural activity across more than 7,000 collections to date, everything from 700+ human rights sites to the tea party movement; tobacco industry records to Mormon missionaries’ blogs. And of course who can forget all of the LOLcats? They’ve collaborated on capturing breaking news, opened doors to the next generation of curators in our K12 web archiving program, and explored their own collections in new forms with datasets leveraging our researcher services.


The Archive-It pilot website in 2005

Archive-It is Internet Archive’s web archiving service that helps institutions build, preserve, and provide access to collections of archived web content. It was developed in response to the needs of libraries, archives, historical societies, museums, and other organizations who sought to use the same powerful technology behind the Wayback Machine to curate their own web archives. The service was then the first of its kind, but has grown and expanded to meet the needs of an ever-widening scope of partners dedicated to archiving the web.


Adding a website to a collection in Archive-It 2.0, as released in July 2006.

Our pilot partners, who began testing a beta version of the service in late 2005, helped to develop and improve the essential tools that such a service would provide and used those tools to create collections, documenting local and global histories in a new way. Based on feedback from the pilot partners, the Archive-It web application launched publicly in 2006 with the most basic of curation tools: create a collection, capture content, and make it publicly available. The service and the community grew exponentially from there.


Archive-It 5.0 realtime crawl tracking.

The myriad partner-driven technical (to say nothing of aesthetic!) improvements of the last ten years are reflected in this year’s release of Archive-It 5.0, the first full redesign of the Archive-It web application since its launch. In the meantime, Archive-It continues to work with the community to preserve and provide access to amazing collections and to develop new tools for archiving the web, including new capture technologies, data transfer APIs, and more.
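
As a rough illustration of what a data transfer API enables, here is a hedged sketch of a client that lists downloadable WARC files for a collection. The endpoint, parameters, and response fields follow the WASAPI data-transfer specification being developed in this space and should be read as assumptions rather than final documentation; the collection ID and credentials are hypothetical:

    import requests

    # Assumed WASAPI-style endpoint; treat the URL and fields as assumptions.
    WASAPI_URL = "https://partner.archive-it.org/wasapi/v1/webdata"

    resp = requests.get(WASAPI_URL,
                        params={"collection": 1234},     # hypothetical collection ID
                        auth=("username", "password"),   # account credentials
                        timeout=120)
    for f in resp.json().get("files", []):
        # Each file entry is assumed to carry a name and download location(s).
        locations = f.get("locations") or ["?"]
        print(f.get("filename"), locations[0])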

With year 11 (and Archive-It 5.1) just around the corner, we look forward to helping our partner institutions use new tools, build new collections, and expand the broader community working to archive the web.

Hacking Web Archives

The awkward teenage years of the web archive are over. It is now 27 years since Tim Berners-Lee created the web and 20 years since we at the Internet Archive set out to systematically archive web content. As the web gains ever more “historicity” (i.e., it’s old and getting older — just like you!), it is increasingly recognized as a valuable historical record of interest to researchers and others working to study it at scale.

Thus, it has been exciting to see — and for us to support and participate in — a number of recent efforts in the scholarly and library/archives communities to hold hackathons and datathons focused on getting web archives into the hands of researchers and users. These events have served to help build a collaborative framework encouraging more use, more exploration, more tools and services, and more hacking (and similar levels of the sometimes-maligned-but-ever-valuable yacking) to support research use of web archives. Get the data to the people!

First, in May, in partnership with the Alexandria Project of L3S at the University of Hannover in Germany, we helped sponsor “Exploring the Past of the Web: Alexandria & Archive-It Hackathon” alongside the Web Science 2016 conference. Over 15 researchers came together to analyze almost two dozen subject-based web archives created by institutions using our Archive-It service. Universities, archives, museums, and others contributed web archive collections on topics ranging from the Occupy Movement to Human Rights to Contemporary Women Artists on the Web. Hackathon teams geolocated IP addresses, analyzed sentiments and entities in webpage text, and studied mime type distributions.
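
As one illustration of the last of those analyses, a mime type distribution can be tallied from CDX index data. The sketch below uses the Wayback Machine's public CDX API, assuming Python 3 with the requests package; the domain and sample size are illustrative choices, not what the hackathon teams used:

    from collections import Counter

    import requests

    # Tally the mime types of archived captures under an example .gov domain.
    params = {
        "url": "epa.gov/*",              # prefix match on the whole domain
        "output": "json",
        "fl": "mimetype",
        "filter": "statuscode:200",
        "limit": "5000",                 # a small sample, for sketch purposes
    }
    rows = requests.get("https://web.archive.org/cdx/search/cdx",
                        params=params, timeout=120).json()
    counts = Counter(mime for (mime,) in rows[1:])   # rows[0] is the header
    for mime, n in counts.most_common(10):
        print(f"{mime:30s} {n}")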

Similarly, in June, our friends at the Library of Congress hosted the second Archives Unleashed datathon, a follow-on to a previous event held at the University of Toronto in March 2016. The fantastic team organizing these two Archives Unleashed hackathons has created an excellent model for bringing together transdisciplinary researchers and librarians/archivists to foster work with web data. In both Archives Unleashed events, attendees formed self-selecting teams to work together on specific analytical approaches and with specific web archive collections and datasets provided by the Library of Congress, Internet Archive, University of Toronto, GWU’s Social Feed Manager, and others. The #hackarchives tweet stream gives some insight into the hacktivities, and the top projects were presented at the Save The Web symposium held at LC’s Kluge Center the day after the event.

Both events show a bright future for expanding new access models, scholarship, and collaborations around building and using web archives. Plus, nobody crashed the wi-fi at any of these events! Yay!

Special thanks go to Altiscale (and Start Smart Labs) and Compute Canada for providing cluster computing services to support these events. Thanks also go to the multiple agencies, including NSF and SSHRC, that provided funding, and to the many co-sponsoring and hosting institutions. Super special thanks go to the key organizers, Helge Holzmann and Avishek Anand at L3S and Matt Weber, Ian Milligan, and Jimmy Lin at Archives Unleashed, who made these events a rollicking success.

For those interested in participating in a web archives hackathon/datathon, more are in the works, so stay tuned to the usual social media channels. If you are interested in helping host an event, please let us know. Lastly, for those who can’t make an event but are interested in working with web archives data, check out our Archives Research Services Workshop.

Lastly, be sure to check out the many related blog posts, hackathon projects, and web archive analysis tools that came out of these events.

Here’s to more happy web archives hacking in the future!

IMLS National Digital Platform Grant Awarded to Advance Web Archiving

We are excited to announce that the Institute of Museum and Library Services (IMLS) has recently awarded a National Leadership Grant, in the National Digital Platform category, to a proposal by Internet Archive’s Archive-It, Stanford University Libraries (DLSS and LOCKSS), University of North Texas, and Rutgers University. The $353,221 grant will support the project “Systems Interoperability and Collaborative Development for Web Archiving,” a two-year research project to test economic and community models for collaborative technology development, prototype system integration through development of Export APIs, and build community participation in web archiving development and new research and access tools. In addition to the technical development included in the scope of work, the project will also host a National Symposium on Web Archiving Interoperability in early 2017.

The project supports the National Digital Platform funding priority of IMLS by increasing access to shared services and infrastructure while building capacity for broader community input in technology development. Project outcomes will promote system integration, facilitate increased distributed preservation of archived data, and help support new global and local access models possible through export APIs, with an eye towards modeling post-grant interoperable systems architectures. Archive-It’s status as widely-used, shared web archiving infrastructure ensures broad community impact and makes possible the involvement of institutions of all sizes in project work. The involvement of Stanford University Libraries builds on their work in the Hydra community and with digital preservation services. UNT contributes experience in digital library and web archiving technology development and Rutgers’ work on research uses of web archives ensures the involvement of downstream user communities. Overall, the project will lay the groundwork for future collaboration around interoperability that will enhance the integration of disparate systems, increase local preservation, and improve the discoverability and use of web archives.

The outcomes of the project will build on the past and current collaborations of project partners, as well as Archive-It’s internal API development and related collaborative development work. Project partners’ roles in affiliated groups like the IIPC, LDCX, NDSA, and the Web Science community ensure the involvement of the larger digital library and internet researcher communities. The two-year project will run from January 2016 through December 2017, and Jefferson Bailey, Director, Web Archiving Programs, Internet Archive, will serve as Project Director.

We thank IMLS for their generous support of this project and their ongoing support for libraries and archives working collaboratively toward building a sustained National Digital Platform. The complete list of IMLS-funded projects this award cycle is available online, and the full narratives of all projects funded as part of the National Digital Platform were published on the IMLS blog. Go IMLS!

Two Grants Announced Supporting Web Archiving

We are excited to announce Internet Archive’s participation in two new grant-funded collaborative projects to advance the field of web archiving! Our Archive-It service, which works with libraries, archives, museums, and others to provide the tools for institutions to create their own web archives, will partner with New York University and Old Dominion University on two separate areas of work. We thank both The Andrew W. Mellon Foundation and the Institute of Museum and Library Services (IMLS) for their recognition of the value of web archiving and their support for the continued development of tools and initiatives to expand the quality, accessibility, and extensibility of these collections. We also thank our awesome collaborative partners on these projects: New York University Libraries, NYU’s Moving Image Archiving and Preservation (MIAP) program, and Old Dominion University’s Web Science and Digital Libraries Research Group. We look forward to working with them as part of our broader initiative for “Building Libraries Together.”

For the project “Archiving the Websites of Contemporary Composers,” led by NYU Libraries and funded with a grant of $480,000 from The Andrew W. Mellon Foundation, we will work with the Libraries and MIAP. This project will archive web-based and born-digital audiovisual materials and will research and develop tools for their improved capture and discoverability. Contemporary musical works, as well as the rich secondary materials that accompany them, are increasingly migrating to the web. We outlined a number of current challenges to capturing and replaying online multimedia, such as dynamic and transient URL generation and adaptive bitrate streaming, as well as a need for continued research and development around the integration of web archives and non-web collections.

We have two specific pieces of work in the grant. First, we will build tools to improve the crawling and capture of web-based audiovisual materials, addressing the increasing complexity of streaming audiovisual materials, especially on third-party hosting and sharing platforms. This development work will build on our experience creating “Heritrix helper” tools like Umbra. Our second area of work will explore methods to integrate discovery of high-quality, non-web multimedia content held in external repositories into the Archive-It platform. Linking Archive-It collections with non-web institutional content has great potential to integrate web and non-web archives. This work will build on NYU’s creation of an API for their preservation repository, our increased use of API-based systems integration in Archive-It 5.0, and our continued work on improved content discovery for web collections. See NYU’s press release for more details.

The second recently announced grant project is being led by Old Dominion University’s Web Science and Digital Libraries Research Group, which received a $468,618 National Leadership Grant for Libraries from IMLS for the project “Combining Social Media Storytelling With Web Archives” (grant number LG-71-15-0077). Readers not familiar with ODU’s great history of research and development around web archives are encouraged to check out projects such as WARCreate/WAIL, their work on visualizations and Archive-It, and our recent favorite, the #whatdiditlooklike tool. In this project, ODU will build tools and processes that adapt user-focused, online storytelling methods, such as Storify, to 1) summarize existing collections and 2) bootstrap new or expand existing web archive collections. The project will provide new ways to create unique topical and thematic collections through URLs shared via social media and storytelling platforms.

We will be working with them to integrate these tools in Archive-It, conduct user testing and training, and explore other ways that storytelling and user-generated materials can help build narrative pathways into large, often diffuse, collections of web content. We are excited to work with ODU and continue our increased focus on new models of access for web archives, as many institutional web collections are now of a breadth, volume, and operational maturity to begin focusing on novel ways their web archives can be studied and better understood by users and researchers.

Thanks again to the Mellon Foundation and IMLS for supporting these cooperative efforts to advance web archiving. We are excited to work with our great partners and the broader community to keep preserving and expanding access to the rich historical and cultural record documented on the web.