John Perry Barlow Symposium — Saturday, April 7

Please join us for a celebration of the life and leadership of John Perry Barlow, the recently departed co-founder of EFF. His friends and compatriots in the fight for civil liberties, a fair and open internet, and an open culture will discuss what his ideas mean to them, and how we can follow his example as we continue our fight.


Speaker lineup:

Cindy Cohn, Executive Director of the Electronic Frontier Foundation
Cory Doctorow, celebrated science fiction author and co-editor of Boing Boing
Joi Ito, Director of the MIT Media Lab
John Gilmore, EFF Co-founder, Board Member, entrepreneur and technologist
Trevor Timm, Executive Director of the Freedom of the Press Foundation
Shari Steele, Executive Director of the Tor Project and former EFF Executive Director
Mitch Kapor, Co-founder of EFF and Co-chair of the Kapor Center for Social Impact
Pam Samuelson, Richard M. Sherman Distinguished Professor of Law and Information at the University of California, Berkeley

We suggest a $20 donation for admission to the Symposium, but no one will be turned away for lack of funds. All ticket proceeds will benefit the Electronic Frontier Foundation and the Freedom of the Press Foundation.

John Perry Barlow Symposium
Saturday, April 7, 2018
2 PM to 6 PM

Internet Archive
300 Funston Avenue
San Francisco, CA 94118



TV News Record: How cable TV news reports news, fact-checks on banking, trade, and public lands

A roundup of what’s happening at the TV News Archive by Katie Dahl and Nancy Watzman.

This week, we present a Washington Post analysis of coverage of an alleged affair by the president; a Vox piece examining coverage of Andrew McCabe, the former deputy FBI director; and The Toronto Star’s use of a salient clip to illustrate a point about a presidential appointment. We also show fact-checks from FactCheck.org, PolitiFact, and The Washington Post’s Fact Checker on claims related to banking, public lands, and trade policy.

Chicken-egg question on cable news coverage of alleged affair

CNN and MSNBC hosts and guests are talking a lot more about the alleged past affair between President Donald Trump and Stormy Daniels than Fox News is, according to Philip Bump’s latest analysis for The Washington Post using TV News Archive data via Television Explorer. 

Bump used the analysis as context to dig into a poll released by Suffolk University earlier this month: “One-fifth of Americans said that Fox News was the news or commentary source they trusted the most, a group that was primarily made up of Republicans… There’s a chicken-egg question here. Does Fox give the Stormy Daniels story a light touch because its audience is largely supportive of Trump or is Fox’s audience largely supportive of Trump because of the coverage they see on Fox? Or is it both?”

Did Fox News reporting contribute to perception of fired FBI official?

Vox’s Alvin Chang draws a connection between the firing of Andrew McCabe, former FBI deputy director, and a narrative built up over the course of months by Fox News. Using TV News Archive data via Television Explorer, Chang reports that “long before he was fired, Fox News… constantly referred to McCabe as the quintessential example of the FBI’s corruption and anti-Trump bias. They hinted that he was plotting several schemes against Trump during the election, leaking information to the press, and was bought and paid for by Hillary Clinton and Democrats.” This, he writes, allowed Fox News viewers to think it made “perfect sense for Attorney General Jeff Sessions (perhaps directed by Trump) to fire McCabe.” Chang goes on to warn, “This alternate reality is being fed into the president’s mind.”

What new presidential economic pick had to say about Canadian PM

The Toronto Star embedded a TV news clip in a piece on Larry Kudlow, Trump’s pick to replace Gary Cohn as economic advisor. Kudlow had said of U.S. trade policy: “NAFTA is the key. And unfortunately we’re going after a major NAFTA ally, and perhaps America’s greatest ally, namely Canada. Even with this left-wing crazy guy Trudeau, they’re still our pals. They’re still our pals. Why are we going after them?” The clip has been viewed more than 112,000 times and counting.

Fact-Check: Senate banking bill a big win for Wall Street (Yes and No)

In a floor speech, Sen. Elizabeth Warren, D., Mass., said of the latest proposal to make changes to Dodd-Frank, “This bill is about goosing the bottom line and executive bonuses at the banks that make up the top one half of 1 percent of banks in this country by size. The very tippy-top.”

Manuela Tobias reported for PolitiFact: “The bill raises the bar of what is considered a big bank five-fold, which effectively relaxes the standards for large regional banks. Experts warn this also could open a door for bigger Wall Street bank giveaways.

“The bill also has a few provisions affecting banks above $250 billion in assets. However, the effects would largely depend on the Federal Reserve’s interpretation of the law. The biggest banks might be able to get relaxed regulations, but then again, they might not.”

Fact-Check: Public lands proposal largest in history (False)

In a Senate hearing on the budget for the Department of the Interior, Secretary Ryan Zinke said the president’s proposal “is the largest investment in our public lands infrastructure in our nation’s history. Let me repeat that, this is the largest investment in our public lands infrastructure in the history of this country.”

PolitiFact rates the claim false. Louis Jacobson reported: “It’s far from assured that the maximum figure of $18 billion in the proposal will ever be reached if enacted. Beyond that, though, Roosevelt’s $3 billion investment in the Civilian Conservation Corps would amount to $53 billion today, and it accounted for vastly more than the Trump proposal as a percentage of federal spending at the time.”

Fact-Check: U.S. has trade deficit with Canada (Four Pinocchios)

After a private meeting with Canadian Prime Minister Justin Trudeau, Trump defended his view about U.S.-Canada trade, tweeting, “We do have a Trade Deficit with Canada, as we do with almost all countries (some of them massive). P.M. Justin Trudeau of Canada, a very good guy, doesn’t like saying that Canada has a Surplus vs. the U.S.(negotiating), but they do … they almost all do … and that’s how I know!”

Glenn Kessler reports for The Washington Post’s Fact Checker that the president is not including services in his analysis of the trade relationship with Canada. He adds: “The president frequently suggests the United States is losing money with these deficits, but countries do not ‘lose’ money on trade deficits. A trade deficit simply means that people in one country are buying more goods from another country than people in the second country are buying from the first.” Kessler gives the claim four Pinocchios.

Eugene Kiely reports for FactCheck.org that the president’s claim that figures giving the U.S. a trade surplus with Canada exclude timber and energy is “not accurate. The Census Bureau, which is within the U.S. Department of Commerce, said its trade figures do include timber and energy and referred us to two publications that show that the agency does include timber and energy for imports and exports.”

Follow us @tvnewsarchive, and subscribe to our biweekly newsletter here.


Digital opportunity for the academic scholarly record

[MIT Libraries is holding a workshop on Grand Challenges for the scholarly record.  They asked participants for a problem/solution statement.  This is mine. -brewster kahle]

The problem of the academic scholarly record now:

University library budgets are spent on closed rather than open: we invest dollars in closed/subscription services (Elsevier, JSTOR, HathiTrust) rather than in services open to all users (PLOS, arXiv, Internet Archive, eBird), and for a reason: there is only so much money, our community demands access to the closed services, and the open ones are there whether we pay for them or not.

We want open access AND digital curation and preservation, but we have no means to spend cooperatively.

University libraries funded the building of Elsevier / JSTOR / HathiTrust: closed, subscription services.

We need to invest most University Library acquisition dollars in open: PLOS, arXiv, Wikipedia, Internet Archive, eBird.

We have solved it when:

Anyone anywhere can get ALL information available to an MIT student, for free.

Everyone everywhere has the opportunity to contribute to the scholarly record as if they were MIT faculty, for free.

What should we do now?

Analog-to-digital conversion of all published scholarship must be completed soon, and the results must be completely open and available in bulk.

Curation and Digital Preservation of born-digital research products: papers/websites/research data.

“Multi-homing” digital research product (papers, websites, research data) via peer-to-peer backends.

Who can best implement?

Vision and tech ability: Internet Archive, PLOS, Wikipedia, arXiv.

Funding now comes from researchers, individuals, and wealthy donors.

Funding should come from University Library acquisition budgets.

Why might MIT lead?

OpenCourseWare was bold.  MIT might invest in opening the scholarly record.

How might MIT do this?

Be bold.

Spend differently.



Some Very Entertaining Plastic, Emulated at the Archive

It’s been a little over four years since the Internet Archive started providing emulation in the browser from our software collection; millions of plays of games, utilities, and everything else that shows up on a screen have happened since then. While we continue to refine the technology (including adding WebAssembly as an option for running the emulations), we have also tried to expand to more platforms, computers, and anything else we can, based on the work of the emulation community, especially the MAME Development Team.

For a number of years, the MAME team has been moving towards emulating a class of hardware and software that, for some, stretches the bounds of what emulation can do, and we have now put up a collection of some of their efforts here at the Internet Archive.

Introducing the Handheld History Collection.

This collection of emulated handheld games, tabletop machines, and even board games stretches from the 1970s well into the 1990s. They are attempts to make portable, digital versions of the LCD-, VFD-, and LED-based machines that were sold, often cheaply, at toy stores and booths over the decades.

We have done our best to add instructions, and in some cases we link to scanned versions of the original manuals for these games. They range from notably simplistic efforts to complicated, many-buttoned affairs that are genuinely difficult to learn, much less master.

They are, of course, entertaining in themselves – these are attempts to put together inexpensive versions of the video games of the time, or to bring new properties whole cloth into existence. Often sold cheaply enough that they were sealed in plastic and displayed in the same stores as a screwdriver set or a flashlight, these little systems tried to pack the greatest amount of “game” into a small, custom plastic case, running on batteries. (Some were, of course, better built than others.)

They also represent the difficulty ahead for many aspects of digital entertainment, and are worth experiencing and understanding for that reason alone.

Taking a $2600 machine and selling it for $20

The shocking difference between the original arcade stand-ups and their toy store equivalents can be seen, for example, in the arcade game Q*bert, which you can play at the Archive.

The original arcade machine looks like this:

And the videogame itself looks like this:

Meanwhile, some time after the release of the arcade machine, a plastic tabletop version of the game came out, and it looked like this:

Using VFD (Vacuum Fluorescent Display) technology, the pre-formed art is lit up by circuits that try to act like the arcade game as much as possible, without using an actual video screen or even the same programming. As a result, the “video” is much more abstract, fascinatingly so:

The music and speech synthesis are gone, a small plastic joystick replaces the metal and hard composite of the original, and the colors are a fraction of what they were. But somehow, if you squint, the original Q*bert game is in there.

This sort of Herculean effort to squeeze a major arcade machine into a handful of circuits and a beeping, booping shell of its former self is an old story – just as arcade machines were once squeezed onto home consoles like the 2600 and ColecoVision, so they were squeezed into these plastic toy games. Work of this sort continues today, as mobile games take charge and developers work to bring huge, immersive experiences to a phone while hitting all the same notes.

The work in this area often speaks for itself. Check out some of these “screenshots” in the VFD games and see if you recognize the originals:

Naturally, these simple screens came packed in the brightest, most colorful stickers and plastic available, to lure in customers. The original containers, while not “emulated” in this browser-based version, definitely represent an important part of the experience.

A Major Bow to the Emulation Developers

The efforts behind accurately reflecting video game and computer experiences in an emulator, which the Archive then uses to provide our in-browser Emularity, are impressive in their own right, and deserve the lion’s share of the credit. Groups like the MAME Team, as well as efforts like Dolphin, Higan, and many others, are all poking and prodding code to bring accuracy, speed, and depth to software preservation. They are an often overlooked legion of volunteers addressing technical hurdles that no one else is approaching.

While this entry could be filled with many paragraphs about these efforts, one particularly strong example sticks out: Bringing emulation of LCD-based games to MAME.

Destroying The Artifact to Save It

In the case of most emulation, the chips of a circuit board, as well as storage media connected to a machine, can be read non-destructively: the information is pulled off the original, the original is returned to place, and the copies are used to present emulated versions. An example of this might be an arcade machine, whose chips are pulled from a circuit board, read, and then plugged back into the board, allowing the arcade machine to keep functioning. (Occasionally, an arcade machine or computer will use approaches like glue or batteries to prevent this sort of duplication, but that is generally rare, due to maintenance concerns for operators.)

In the case of an LCD game machine, however, it is sometimes necessary to pull the item completely apart to get all the information from it. On the MAME team, a contributor named Sean Riddle and his collaborator “hap” have been tireless in digging the information out of both LCD games and general computer chips.

To get the information off an LCD game, it has to be pulled apart and all its components scanned, vectorized, and traced to turn them into a software version of themselves. Among the information grabbed is the LCD display itself, which has a pre-formed set of non-overlapping images representing every possible element of visual data in the game. This will make almost no sense without illustrations, so here are some.

When playing the LCD version of “The Nightmare Before Christmas,” the game looks like this:

That is a drawn background (also scanned in this process) with a clear liquid-crystal display over it, showing Jack Skellington, the tree, and an elf. The artistry and intense technical challenge of both the original programming/design and the recovery of this information become clear when you see the LCD layer with all the elements “on” at once:

This sort of intense work is everywhere in the background of these LCD games. Here are some more:


(There are many more examples of these at this page at Sean Riddle’s site.)

Not only must the LCD panel be disassembled, but the circuit board beneath it as well, to determine the programming involved. These are scanned and then studied to work out the cross-connections that tell the game when to light up what. The work has been optimized and can often go relatively quickly, but only because of the years of experience behind the effort, experience which, again, comes from a volunteer force. Unfortunately, the machine does not survive, but the argument is made, quite rightly, that otherwise these toys would fade into oblivion. Now they can be played by thousands or millions, and will remain playable for a significant time to come.
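
To make the rendering half of this concrete, here is a toy sketch (in Python, using the Pillow imaging library) of how segment-based LCD emulation can work once the artwork has been extracted: the emulated chip reports which segments are lit, and drawing a frame is just compositing those pre-traced overlays onto the scanned backdrop. The function names, file layout, and bitmask convention here are illustrative assumptions, not MAME’s actual code.

    # Toy sketch of segment-based LCD rendering; not MAME's real renderer.
    # Assumes each traced segment is an RGBA overlay sized to match the backdrop,
    # and the emulated chip exposes an integer bitmask of lit segments.
    from PIL import Image

    def render_frame(backdrop, segments, state):
        frame = backdrop.copy()
        for i, overlay in enumerate(segments):
            if state & (1 << i):                # bit i set means segment i is lit
                frame.alpha_composite(overlay)  # composite the pre-traced artwork
        return frame

    # Hypothetical usage: segments 0 and 3 lit over the scanned backdrop.
    # backdrop = Image.open("backdrop.png").convert("RGBA")
    # segments = [Image.open(f"seg{i}.png").convert("RGBA") for i in range(8)]
    # render_frame(backdrop, segments, 0b1001).show()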

The Fundamental Question: What Needs to be Emulated?

Floating in the back of this new collection, and in the many new LCD and electronic games being emulated by the MAME Team, is the core concern of “what will bring the most of the old game to life to be able to experience and study it?” With “standard” arcade games, it is often just a case of providing the video and speaker output and accepting the control panel signals either through a keyboard or through connected hardware. While you do not get the full role-play of being inside a dark arcade in the 1980s, you do get both the chance to play the original program and to study its inner workings and the discoveries made in the process. Additional efforts to photograph or reference control panels, outside artwork, and so on are also being made to the extent possible.

This question falls into sharp focus, however, with these electronic toys. The plastic is such a major component of the experience that it may not be enough for some researchers and users to be handed a version of the visual output to really know what the game was like. Compare the output of Bandai Pair Match:

…to what the original toy looked like:

The “core” is there, but a lot is left to the side out of necessity. Documentation, research, and capturing all aspects of these machines will be required if they are ever to be recreated or understood in the future.

We are fortunate to be able to ask these questions while the originals are still around, and it’s a testament to the many great teams and researchers who are bringing these old games into the realm of archives.

So please, take a walk through the Handheld History collection (as well as our other emulation efforts) and relive those plastic days of joy again.

Shout Outs and Thanks

Many different efforts and projects were brought together to make the Handheld History collection what it is. (We intend to expand it over time.) As always, a huge thanks to the MAME Developers for their tireless efforts to emulate our digital history; a special shout-out to Ryan Holtz for his announcements and highlighting of advances in this effort that inspired this collection to be assembled. Thanks to Daniel Brooks for maintenance of The Emularity as well as expanding the capabilities of the system to handle these new emulations. Sources for the photographs of the original plastic systems include The Handheld Games Museum and Electronic Plastic. (It is amazing how few photos of the original toy systems exist; in some cases eBay sales are the only documented photographs of any resolution.) As a reference work for knowing which systems are emulated and how, we relied heavily on the Arcade Italia Database site. Thanks to Azma and Zeether for providing metadata on images and control schemes for these games; and a huge thanks to all the photographers, documenters, scanners, and reviewers who have been chronicling the history of these games for decades.


TV News Record: Glorious ContextuBot making progress

A roundup of what’s happening at the TV News Archive by Katie Dahl and Nancy Watzman.

This week, we present an update on the video context project, the Glorious ContextuBot; two recent news reports that use TV News Archive data; and fact-checks of TV appearances by the DNC chair and the president.

Fueled by the TV News Archive, the Glorious ContextuBot is making progress

Let’s say a friend posts a YouTube video link to a politician’s statement on Facebook, but you have a feeling it’s taken out of context. The clip is tightly edited, and you’re curious to see the rest of the statement. Was the politician answering a question? Was the statement part of a larger discussion?

Enter the Glorious ContextuBot. For the past nine months, veteran media innovators Mark Boas and Laurian Gridinoc of Hyperaudio and Trint, led by the Internet Archive’s own Dan Schultz, senior creative technologist of the TV News Archive, have been building a prototype of the ContextuBot, fueled by the TV News Archive. The ContextuBot is one of 20 winners of the Knight Prototype Fund’s $1 million challenge, announced in June 2017.

With the ContextuBot, it’s possible to use video to search video. Just paste a link to a video snippet into an interface and pull up a transcript that puts the snippet in the context of what came before and after. Built from the Duplitron 5000, an audio fingerprinting tool Schultz developed to track political ads for the Political TV Ad Archive, the ContextuBot demonstrates how open technology built by the TV team can be repurposed and improved by motivated technologists; it has already captured the attention of the University of Iowa Informatics department, which is considering adopting it for researchers.
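
For the curious, here is a toy sketch of the core idea behind audio-fingerprint search: reduce audio to a coarse binary signature, then slide the clip’s signature along the corpus and take the offset with the fewest mismatched bits. The frame size, band count, and scoring below are illustrative assumptions; the Duplitron’s actual fingerprinting is considerably more sophisticated.

    # Toy audio-fingerprint search, in the spirit of (not identical to) the Duplitron.
    import numpy as np
    from scipy.signal import spectrogram

    def fingerprint(samples, rate, frame_sec=0.1, bands=8):
        # Coarse per-band energy signature, binarized against each band's median.
        _, _, sxx = spectrogram(samples, fs=rate, nperseg=int(rate * frame_sec))
        energy = np.stack([b.sum(axis=0) for b in np.array_split(sxx, bands, axis=0)])
        return (energy > np.median(energy, axis=1, keepdims=True)).astype(np.int8)

    def find_clip(corpus_fp, clip_fp):
        # Slide the clip over the corpus; the best offset has the fewest bit flips.
        n = clip_fp.shape[1]
        scores = [int(np.count_nonzero(corpus_fp[:, i:i + n] != clip_fp))
                  for i in range(corpus_fp.shape[1] - n + 1)]
        best = int(np.argmin(scores))
        return best, scores[best]  # frame offset into the corpus, mismatch count

Splitting the corpus fingerprint into chunks and running the search on separate machines is, in miniature, the scaling work described below.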

To date, the team has:

  • Made it easier to scale audio search. It’s now possible to scale up and down audio fingerprint finding within a corpus of TV news by adding or removing individual computers or compute clusters.  Our Duplitron would take eight hours to search a year of television, but the ContextuBot makes it much easier to spread that computing across multiple machines.
  • Built a demo interface. You can see a clip in context with a transcript of what comes before and after. Click on a word in the transcript, and you’ll be able to jump to that point in the video stream.
  • Begun to explore a “comic view.” The team’s biggest goal is to explore ways to communicate the essence of a longer clip in a short amount of time. One approach: converting video into a comic. This would lay the groundwork for automatically extracting (and rendering) a storyboard from a video clip.

The team will present the prototype shortly before the International Symposium of Online Journalism conference in Austin in April 2018.

The Washington Post finds stark differences in cable TV coverage of Jared Kushner

After a heavy news week of developments related to Jared Kushner, President Trump’s son-in-law and a senior adviser, The Washington Post’s Philip Bump dug into the TV News Archive and found that while MSNBC and CNN had numerous mentions of Kushner’s name, Fox News had just ten.

The Washington Post examines coverage of Parkland shooting

Rachel Siegel used the TV News Archive to compare coverage of the Parkland shooting with coverage of several other high-profile shootings, and found that this time cable TV attention spans are a bit longer.

Fact-Check: the DNC raised record amounts in January (Two Pinocchios)

In a recent interview, Democratic National Committee Chairman Tom Perez said, “We raised more money in January… of 2018 than any January in our history. So if the question is, ‘Do we have enough money to implement our game plan?’ Absolutely.”

This claim earned “two Pinocchios” from Salvador Rizzo, reporting for The Washington Post’s Fact Checker:  the “DNC raised $6 million in January 2018… That was below what it raised in January 2014 ($6.6 million), January 2012 ($13.2 million), January 2011 ($7.1 million) and January 2010 ($9.1 million).”  A spokesman for Perez “backed off from those comments when we reached out with FEC figures that told a different story.”

Fact-Check: Senator fears NRA downgrade over gun legislation (misleading)

In a meeting with lawmakers to talk gun legislation, President Donald Trump suggested that an age requirement increase for purchasing guns was not included in a 2013 reform effort by Sen. Pat Toomey, R., Pa., “because you’re afraid of the NRA, right?”

Reporting by FactCheck.org’s Eugene Kiely, Lori Robertson, and Robert Farley calls this statement misleading. “As a result of the legislation, Toomey’s rating with the NRA dropped from an “A” to a “C,” and the endorsements and contributions Toomey got from the NRA in previous House and Senate races disappeared. In 2016, the NRA stayed out of Toomey’s Senate race altogether; his Democratic opponent, Katie McGinty, had an “F” grade from the NRA. In that race, Toomey got the endorsement of a gun-control group, Everytown for Gun Safety, which ran ads supporting him.”

Follow us @tvnewsarchive, and subscribe to our biweekly newsletter here.


Archive video now supports WebVTT for captions

We now support .vtt files (Web Video Text Tracks) for captioning your videos, in addition to the .srt (SubRip) files we have supported for years.

It’s as simple as uploading a “parallel filename” to your video file(s).


  • myvid.mp4
  • myvid.vtt

Multi-lang support:

  • myvid.webm
  • myvid.en.vtt
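
For reference, a caption file is just plain text. A minimal, hypothetical myvid.en.vtt with two cues looks like this (the WEBVTT header line and the blank lines between cues are required by the format):

    WEBVTT

    00:00:00.000 --> 00:00:04.000
    Welcome to the Internet Archive.

    00:00:04.000 --> 00:00:09.500
    Captions are timestamped text cues like these.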

Here’s a nice example item:

VTT with caption picker (and upcoming A/V player too!)

(We will have an updated A/V player with a better “picker” for items with many language tracks in a matter of days. Have no fear! 😎)




10 Ways To Explore The Internet Archive For Free

The Internet Archive is a treasure trove of fascinating media, texts, and ephemera: items that, if they didn’t exist here, would be lost forever. Yet so many of our community members have difficulty describing what exactly it is…that we do here. Most people know us for the Wayback Machine, but we are so much more. To that end, we’ve put together a fun and useful guide to exploring the Archive. So grab your flashlight and pith helmet and let your digital adventure begin…

1. Pick a place & time you want to explore. Search our eBooks and Texts collection and download or borrow one of the 3 million books for free, offered in many formats, including PDF and EPUB.

2. Enter a time machine of old time films. Explore films of historic significance in the Prelinger Archives.

3. Want to listen to a live concert? The Live Music Archive holds more than 12,000 Grateful Dead concerts.

4. Who Knows What Evil Lurks in the Hearts of Men? Only the Shadow knows. You can too. Listen to “The Shadow” as he employs his power to cloud minds to fight crime in Old Time Radio.

5. To read or not to read? Try listening to Shakespeare with the LibriVox Free Audiobook Collection.

6. Need a laugh? Search the Animation Shorts collection for an old time cartoon.

7. Before there was PlayStation 4… there was Atari. Play a classic video game on an emulated old time console, right in the browser. Choose from hundreds of games in the Internet Arcade.

8. Are you a technophile? Take the Oregon Trail or get nostalgic with the Apple II programs. You have instant access to decades of computer history in the Software Library.

9. Find a television news story you missed. Search our Television News Archive for all the channels that presented the story. How do they differ? Quote a clip from the story and share it.

10. Has your favorite website disappeared? Go to the Wayback Machine and type in the URL to see if this website has been preserved across time. Want to save a website? Use “Save Page Now.”
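
Bonus tip: Save Page Now can also be invoked straight from the address bar by prefixing the URL you want archived:

    https://web.archive.org/save/https://example.com/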

What does it take to become an archivist? It’s as simple as creating your own Internet Archive account and diving in. Upload photos, audio, and video that you treasure. Store them for free. Forever.


Sign up for free at archive.org.


Andrew W. Mellon Foundation Awards Grant to the Internet Archive for Long Tail Journal Preservation

The Andrew W. Mellon Foundation has awarded a research and development grant to the Internet Archive to address the critical need to preserve the “long tail” of open access scholarly communications. The project, Ensuring the Persistent Access of Long Tail Open Access Journal Literature, builds on prototype work identifying at-risk content held in web archives by using data provided by identifier services and registries. Furthermore, the project expands on work acquiring missing open access articles via customized web harvesting, improving discovery and access to these materials from within extant web archives, and developing machine learning approaches, training sets, and cost models for advancing and scaling this project’s work.

The project will explore how adding automation to the already highly automated systems for archiving the web at scale can help address the need to preserve at-risk open access scholarly outputs. Instead of specialized curation and ingest systems, the project will work to identify the scholarly content already collected in general web collections, both those of the Internet Archive and collaborating partners, and implement automated systems to ensure at-risk scholarly outputs on the web are well-collected and are associated with the appropriate metadata. The proposal envisages two opposite but complementary approaches:

  • A top-down approach involves taking journal metadata and open data sets from identifier and registry sources such as ISSN, DOAJ, Unpaywall, CrossRef, and others and examining the content of large-scale web archives to ask “is this journal being collected and preserved and, if not, how can collection be improved?” (A rough sketch of this check follows the list.)
  • A bottom-up approach involves examining the content of general domain-scale and global-scale web archives to ask “is this content a journal and, if so, can it be associated with external identifier and metadata sources for enhanced discovery and access?”
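
As a rough illustration of the top-down check, one could take each journal homepage URL from a registry such as DOAJ and ask a web archive’s CDX index how often it has been captured. The sketch below queries the Wayback Machine’s public CDX API; the journal list is hypothetical, and the project’s production tooling and data sources will differ.

    # Hedged sketch: count Wayback captures of a journal homepage via the CDX API.
    import json
    import urllib.parse
    import urllib.request

    def capture_count(url, limit=1000):
        query = urllib.parse.urlencode(
            {"url": url, "output": "json", "fl": "timestamp", "limit": str(limit)})
        with urllib.request.urlopen(
                "https://web.archive.org/cdx/search/cdx?" + query) as resp:
            body = resp.read().decode()
        rows = json.loads(body) if body.strip() else []
        return max(len(rows) - 1, 0)  # first row, when present, is a header

    # Hypothetical journal homepages; real input would come from ISSN/DOAJ/CrossRef.
    for homepage in ["example-oa-journal.org", "journal.example.edu"]:
        n = capture_count(homepage)
        flag = "" if n else "  <- candidate for targeted harvesting"
        print(f"{homepage}: {n} captures{flag}")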

The grant will fund work to use the output of these approaches to generate training sets and test them against smaller web collections in order to estimate how effective this approach would be at identifying the long-tail content, how expensive a full-scale effort would be, and what level of computing infrastructure is needed to perform such work. The project will also build a model for better understanding the costs for other web archiving institutions to do similar analysis upon their collection using the project’s algorithms and tools. Lastly, the project team, in the Web Archiving and Data Services group with Director Jefferson Bailey as Principal Investigator,  will undertake a planning process to determine resource requirements and work necessary to build a sustainable workflow to keep the results up-to-date incrementally as publication continues.

In combination, these approaches will both improve the current state of preservation for long-tail journal materials and develop models for how this work can be automated and applied to existing corpora at scale. We thank the Mellon Foundation for their support of this work, and we look forward to sharing the project’s open-source tools and outcomes with a broad community of partners.


27 Public Libraries and the Internet Archive Launch “Community Webs” for Local History Web Archiving

The lives and activities of communities are increasingly documented online; local news, events, disasters, celebrations — the experiences of citizens are now largely shared via social media and web platforms. As these primary sources about community life move to the web, the need to archive these materials becomes an increasingly important activity of the stewards of community memory. And in many communities across the nation, public libraries, as one of their many responsibilities to their patrons, serve the vital role of stewards of local history. Yet public libraries have historically been a small fraction of the growing national and international web archiving community.

With generous support from the Institute of Museum and Library Services, as well as the Kahle/Austin Foundation and the Archive-It service, the Internet Archive and 27 public library partners representing 17 different states have launched a new program: Community Webs: Empowering Public Libraries to Create Community History Web Archives. The program will provide education, applied training, cohort network development, and web archiving services for a group of public librarians to develop expertise in web archiving for the purpose of local memory collecting. Additional partners in the program include OCLC’s WebJunction training and education service, and the public libraries of Queens, Cleveland, and San Francisco will serve as “lead libraries” in the cohort. The program will result in dozens of terabytes of public-library-administered local history web archives; a range of open educational resources in the form of online courses, videos, and guides; and a nationwide network of public librarians with expertise in local history web archiving and the advocacy tools to build and expand the network. A full listing of the participating public libraries is below and on the program website.

In November 2017, the cohort gathered at the Internet Archive for a kickoff meeting of brainstorming, socializing, and, of course, talking all things web archiving. Partners shared details on their existing local history programs and ideas for collection development around web materials. Attendees talked about building collections documenting their demographic diversity or focusing on local issues, such as housing availability or changes in community profile. As an example, Abbie Zeltzer from the Patagonia Public Library spoke about the changes in her community of 913 residents as the town redevelops a long-dormant mining industry. Zeltzer intends to develop a web archive documenting this transition and the related community reaction and changes.

Since the kickoff meeting, the Community Webs cohort has been actively building collections, from hyper-local media sites in Kansas City, to neighborhood blogs in Washington D.C., to Mardi Gras in East Baton Rouge. In addition, program staff, cohort members, and WebJunction have been building out an extensive online course space with educational materials for training on web archiving for local history. The full course space and all open educational resources will be released in early 2019 and a second full in-person meeting of the cohort will take place in Fall 2018.

For further information on the Community Webs program, contact Maria Praetzellis, Program Manager, Web Archiving [maria at] or Jefferson Bailey, Director, Web Archiving [jefferson at].

Participating public libraries:

  • Athens Regional Library System (Athens, GA)
  • Birmingham Public Library (Birmingham, AL)
  • Brooklyn Public Library – Brooklyn Collection (New York City, NY)
  • Buffalo & Erie County Public Library (Buffalo, NY)
  • Cleveland Public Library (Cleveland, OH)
  • Columbus Metropolitan Library (Columbus, OH)
  • County of Los Angeles Public Library (Los Angeles, CA)
  • DC Public Library (Washington, DC)
  • Denver Public Library – Western History and Genealogy Department and Blair-Caldwell African American Research Library (Denver, CO)
  • East Baton Rouge Parish Library (East Baton Rouge, LA)
  • Forbes Library (Northampton, MA)
  • Grand Rapids Public Library (Grand Rapids, MI)
  • Henderson District Public Libraries (Henderson, NV)
  • Kansas City Public Library (Kansas City, MO)
  • Lawrence Public Library (Lawrence, KS)
  • Marshall Lyon County Library (Marshall, MN)
  • New Brunswick Free Public Library (New Brunswick, NJ)
  • Schomburg Center for Research in Black Culture (NYPL) (New York City, NY)
  • Patagonia Library (Patagonia, AZ)
  • Pollard Memorial Library (Lowell, MA)
  • Queens Library (New York City, NY)
  • San Diego Public Library (San Diego, CA)
  • San Francisco Public Library (San Francisco, CA)
  • Sonoma County Public Library (Santa Rosa, CA)
  • The Urbana Free Library (Urbana, IL)
  • West Hartford Public Library (West Hartford, CT)
  • Westborough Public Library (Westborough, MA)