Author Archives: jeff kaplan

Downloading in bulk using wget

If you’ve ever wanted to download files from many different archive.org items in an automated way, here is one method to do it.

____________________________________________________________

Here’s an overview of what we’ll do:

1. Confirm or install a terminal emulator and wget
2. Create a list of archive.org item identifiers
3. Craft a wget command to download files from those identifiers
4. Run the wget command

____________________________________________________________

Requirements

Required: a terminal emulator and wget installed on your computer. Below are instructions to determine if you already have these.
Recommended but not required: an understanding of basic unix commands and of archive.org item structure and terminology.

____________________________________________________________

Section 1. Determine if you have a terminal emulator and wget.
If not, they need to be installed (they’re free)

1. Check to see if you already have wget installed
If you already have a terminal emulator such as Terminal (Mac) or Cygwin (Windows), you can check whether wget is also installed. If you do not have both installed, skip ahead to part 2 below. Here’s how to check for wget using your terminal emulator:

1. Open Terminal (Mac) or Cygwin (Windows)
2. Type “which wget” after the $ sign
3. If you have wget, the result will show the directory it’s in, such as /usr/bin/wget. If you don’t have it, there will be no results. A sample check is shown below.
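For example, a check might look something like this (the path shown is only an illustration; yours may differ, and there will be no output at all if wget isn’t installed). Running wget --version is another quick confirmation:

$ which wget
/usr/bin/wget
$ wget --version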

2. To install a terminal emulator and/or wget:
Windows: To install a terminal emulator along with wget please read Installing Cygwin Tutorial. Be sure to choose the wget module option when prompted.

Mac OS X: Mac OS X comes with Terminal installed. You should find it in the Utilities folder (Applications > Utilities > Terminal). For wget, there are no official binaries available for Mac OS X. Instead, you must either build wget from source code or download an unofficial binary created elsewhere. The following links may be helpful for getting a working copy of wget on Mac OS X.
Prebuilt binary for Mac OSX Lion and Snow Leopard
wget for Mac OSX leopard

Building from source for Mac OS X: Skip this step if you are able to install from the above links.
To build from source, you must first install Xcode. Once Xcode is installed, there are many tutorials online to guide you through building wget from source, such as How to install wget on your Mac.
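As a rough sketch of what building from source typically involves (the version number in the file name is a placeholder, not a recommendation, and your build may need extra configure flags, for example for SSL support; the tutorials linked above have the details):

curl -O https://ftp.gnu.org/gnu/wget/wget-1.x.tar.gz   # replace 1.x with an actual release number
tar -xzf wget-1.x.tar.gz
cd wget-1.x
./configure
make
sudo make install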

____________________________________________________________

Section 2. Now you can use wget to download lots of files

The method for using wget to download files is:

  1. Create a folder (a directory) to hold the downloaded files
  2. Generate a list of archive.org item identifiers (the tail end of the url for an archive.org item page) from which you wish to grab files
  3. Construct your wget command to retrieve the desired files
  4. Run the command and wait for it to finish

Step 1: Create a folder (directory) for your downloaded files
1. Create a folder named “Files” on your computer Desktop. This is where the downloaded files will go. Create it the usual way using either command-shift-n (Mac) or control-shift-n (Windows), or from the terminal as sketched below.
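If you’d rather create the folder from your terminal emulator (an optional alternative; the Cygwin path below assumes your files live under your Windows user folder, so adjust the user name):

mkdir -p ~/Desktop/Files                             # Mac / Terminal
mkdir -p /cygdrive/c/Users/YOURNAME/Desktop/Files    # Windows / Cygwin; replace YOURNAME with your Windows user name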

Step 2: Create a file with the list of identifiers
You’ll need a text file with the list of archive.org item identifiers from which you want to download files. This file will be used by wget to download the files.

If you already have a list of identifiers, you can paste or type the identifiers into a file, one identifier per line. The other option is to use the archive.org search engine to create a list based on a query. To do this we will use advanced search to build the query and then download the resulting list as a file.

First, determine your search query using the search engine.  In this example, I am looking for items in the Prelinger collection with the subject “Health and Hygiene.”  There are currently 41 items that match this query.  Once you’ve figured out your query:

1. Go to the advanced search page on archive.org. Use the “Advanced Search returning JSON, XML, and more.” section to create a query. Once you have a query that delivers the results you want, click the back button to go back to the advanced search page.
2. Select “identifier” from the “Fields to return” list.
3. Optionally sort the results (sorting by “identifier asc” is handy for arranging them in alphabetical order).
4. In the “Number of results” box, enter a number that matches (or is higher than) the number of results your query returns.
5. Choose the “CSV format” radio button.
This image shows what the advanced query would look like for our example:
(Screenshot: the Advanced Search form filled in for the example query)

6. Click the search button (this may take a while depending on how many results you have). An alert box will ask if you want your results – click “OK” to proceed. You’ll then see a prompt to download the “search.csv” file to your computer. The downloaded file will be in your default download location (often your Desktop or your Downloads folder).
7. Rename the “search.csv” file “itemlist.txt” (no quotes).
8. Drag or move the itemlist.txt file into the “Files” folder you previously created.
9. Open the file in a text program such as TextEdit (Mac) or Notepad (Windows). Delete the first line, which reads “identifier”. Be sure you delete the entire line and that the first line is not left blank. Then remove all the quotes by doing a search and replace, replacing the ” character with nothing. (A command-line alternative for this cleanup step is sketched below.)
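If you’re comfortable in the terminal, that last cleanup step can also be done with one command instead of a text editor. This is just a sketch; it assumes you’ve moved search.csv into your “Files” folder and opened your terminal there:

tail -n +2 search.csv | tr -d '"' > itemlist.txt   # skip the "identifier" header line, strip the quotes, save as itemlist.txt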

The contents of the itemlist.txt file should now look like this:

AboutFac1941
Attitude1949
BodyCare1948
Cancer_2
Careofth1949
Careofth1951
CityWate1941

…………………………………………………………………………………………………………………………
NOTE: You can use this advanced search method to create lists of thousands of identifiers, although we don’t recommend using it to retrieve more than 10,000 or so items at once (it will time out at a certain point).
…………………………………………………………………………………………………………………………

Step 3: Create a wget command
The wget command uses unix terminology. Each symbol, letter, or word represents a different option that wget will apply when it runs.

Below are three typical wget commands for downloading from the identifiers listed in your itemlist.txt file.

To get all files from your identifier list:
wget -r -H -nc -np -nH --cut-dirs=1 -e robots=off -l1 -i ./itemlist.txt -B 'http://archive.org/download/'

If you want to download only certain file formats (in this example pdf and epub), include the -A option, which stands for “accept”. In this example we would download only the pdf and epub files:
wget -r -H -nc -np -nH --cut-dirs=1 -A .pdf,.epub -e robots=off -l1 -i ./itemlist.txt -B 'http://archive.org/download/'

To download all files except specific formats (in this example tar and zip), include the -R option, which stands for “reject”. In this example we would download all files except tar and zip files:
wget -r -H -nc -np -nH --cut-dirs=1 -R .tar,.zip -e robots=off -l1 -i ./itemlist.txt -B 'http://archive.org/download/'

If you want to modify one of these or craft a new one you may find it easier to do it in a text editing program (TextEdit or NotePad) rather than doing it in the terminal emulator.

…………………………………………………………………………………………………………………………
NOTE: To craft a wget command for your specific needs you might need to understand the various options. It can get complicated, so try to get a thorough understanding before experimenting. You can learn more about unix commands at Basic unix commands

An explanation of each option used in our example wget commands follows:

-r   recursive download; required in order to move from the item identifier down into its individual files

-H   enable spanning across hosts when doing recursive retrieving (the initial URL for the directory will be on archive.org, and the individual file locations will be on a specific datanode)

-nc   no clobber; if a local copy already exists of a file, don’t download it again (useful if you have to restart the wget at some point, as it avoids re-downloading all the files that were already done during the first pass)

-np   no parent; ensures that the recursion doesn’t climb back up the directory tree to other items (by, for instance, following the “../” link in the directory listing)

-nH   no host directories; when using -r, wget will create a directory tree to stick the local copies in, starting with the hostname ({datanode}.us.archive.org/), unless -nH is provided

--cut-dirs=1   completes what -nH started by skipping the hostname; when saving files on the local disk (from a URL like http://{datanode}.us.archive.org/{drive}/items/{identifier}/{identifier}.pdf), skip the /{drive}/items/ portion of the URL, too, so that all {identifier} directories appear together in the current directory, instead of being buried several levels down in multiple {drive}/items/ directories

-e robots=off   archive.org datanodes contain robots.txt files telling robotic crawlers not to traverse the directory structure; in order to recurse from the directory to the individual files, we need to tell wget to ignore the robots.txt directive

-i ./itemlist.txt   location of the input file listing all the identifiers to use; “./itemlist.txt” means the list is in the current directory (the “Files” folder created earlier), in a file called “itemlist.txt” (you can call the file anything you want, so long as you specify its actual path after -i)

-B 'http://archive.org/download/'   base URL; gets prepended to the text read from the -i file (this is what allows us to have just the identifiers in the itemlist file, rather than the full URL on each line)
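For example, using an identifier from the sample list above, an itemlist.txt line reading AboutFac1941 combined with this base URL gives wget the starting URL:

http://archive.org/download/AboutFac1941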

Additional options that may be needed sometimes:

-l depth  --level=depth   Specify the maximum recursion depth. The default maximum depth is 5. This option is helpful when you are downloading items that contain external links or URLs in either the item’s metadata or other text files within the item. Here’s an example command that avoids downloading external links contained in an item’s metadata:
wget -r -H -nc -np -nH --cut-dirs=1 -l 1 -e robots=off -i ./itemlist.txt -B 'http://archive.org/download/'

-A  -R   accept-list and reject-list; either limit the download to certain kinds of files, or exclude certain kinds of files. For instance, adding the following options to your wget command would download all files except those whose names end with _orig_jp2.tar or _jpg.pdf:
wget -r -H -nc -np -nH --cut-dirs=1 -R _orig_jp2.tar,_jpg.pdf -e robots=off -i ./itemlist.txt -B 'http://archive.org/download/'

And adding the following options would download all files containing zelazny in their names, except those ending with .ps:
wget -r -H -nc -np -nH --cut-dirs=1 -A "*zelazny*" -R .ps -e robots=off -i ./itemlist.txt -B 'http://archive.org/download/'

See http://www.gnu.org/software/wget/manual/html_node/Types-of-Files.html for a fuller explanation.
…………………………………………………………………………………………………………………………

Step 4: Run the command
1. Open your terminal emulator (Terminal or Cygwin)
2. In your terminal emulator window, move into your folder/directory. To do this:
For Mac: type cd Desktop/Files
For Windows (Cygwin): after the $, type cd /cygdrive/c/Users/archive/Desktop/Files (substituting your own user name for “archive”)
3. Hit return. You have now moved into the “Files” folder (a quick check is sketched after this list).
4. In your terminal emulator, enter or paste your wget command. If you are using one of the commands on this page, be sure to copy the entire command, which may wrap onto two lines. On a Mac you can simply copy and paste. For Cygwin, copy the command, click the Cygwin logo in the upper left corner, select Edit, then select Paste.
5. Hit return to run the command.
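A quick sanity check before step 4 (optional, and assuming you followed the folder layout above) is to confirm where you are and that your identifier list is visible:

pwd   # should end in Desktop/Files
ls    # should list itemlist.txt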

You will see your progress on the screen.  If you have sorted your itemlist.txt alphabetically, you can estimate how far through the list you are based on the screen output. Depending on how many files you are downloading and their size, it may take quite some time for this command to finish running.

…………………………………………………………………………………………………………………………
NOTE: We strongly recommend trying this process with just ONE identifier first as a test to make sure you download the files you want before you try to download files from many items. A quick way to run that test is sketched just below.
…………………………………………………………………………………………………………………………
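One simple way to run that test (a sketch; AboutFac1941 is just an identifier from the example list above, so substitute one of your own) is to make a one-line list and point wget at it:

echo AboutFac1941 > testlist.txt
wget -r -H -nc -np -nH --cut-dirs=1 -e robots=off -l1 -i ./testlist.txt -B 'http://archive.org/download/'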

Tips:

  • You can terminate the command by pressing “control” and “c” on your keyboard simultaneously while in the terminal window.
  • If your command will take a while to complete, make sure your computer is set to never sleep and turn off automatic updates (one way to keep a long download running unattended is sketched below).
  • If you think you missed some items (e.g. due to machines being down), you can simply rerun the command after it finishes.  The “no clobber” option in the command will prevent already retrieved files from being overwritten, so only missed files will be retrieved.
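For very long downloads, one approach not covered above (a sketch; it assumes a unix-like environment where nohup is available, such as Terminal or Cygwin) is to start wget in the background and send its output to a log file, so the download keeps running even if you close the terminal window:

nohup wget -r -H -nc -np -nH --cut-dirs=1 -e robots=off -l1 -i ./itemlist.txt -B 'http://archive.org/download/' -a wget.log &
tail -f wget.log   # watch progress; control-c stops the watching, not the download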

Rick Prelinger’s “Lost Landscapes of San Francisco 6” Tuesday Jan 24 7:30pm at the Internet Archive

Rick Prelinger will be presenting his latest version of Lost Landscapes of San Francisco at the Internet Archive.

Lost Landscapes of San Francisco 6 (2011) is the latest in a series of historical urban explorations, made from home movies, industrial and promotional films and outtakes, and other cinematic ephemera. It sold out the Castro Theatre in December, and this will be its second screening. YOU are the soundtrack. Please come prepared to shout out your identifications, ask questions about what’s on the screen, and share your thoughts with fellow audience members.

Most of the footage in this program has not been shown before. It includes footage of San Francisco’s cemeteries just before their removal, unique drive-thru footage of the Old Produce Market (now Golden Gateway) in the late 1940s, cruising the newly-built Embarcadero Freeway, grungy back streets in North Beach, the sandswept Sunset District in the 1930s, and newly-rediscovered Cinemascope footage of Playland, the Sky Tram and San Francisco scenes, all in Kodachrome.

Suggested admission for the screening: $5, or 5 books, which will be donated to Internet Archive’s book scanning project. Reservations are required; the event frequently sells out.

What: Lost Landscapes of San Francisco 6
When: Jan 24, 2012, 7:30pm
Where: Internet Archive, 300 Funston Avenue, San Francisco, CA 94118
Contact: rsvp@archive.org

Internet Archive joins protest of PIPA / SOPA legislation

San Francisco, CA – On January 18, 2012, Internet Archive joined the thousands of internet websites that went dark in protest of the proposed SOPA and PIPA legislation.
12 Hours Dark: Internet Archive vs. Censorship

Hackers & Founders organized a protest in San Francisco’s Civic Center Plaza. They joined forces with tech meetup organizers around the country to hold rallies in New York, Washington DC, Seattle and Silicon Valley to put a public face to the online protests and blackouts.

Speaking at the San Francisco event were many local luminaries including Internet Archive founder Brewster Kahle, Ron Conway, Jonathan Nelson, MC Hammer, Caterina Fake and others.


Brewster Kahle speaks at the PIPA / SOPA Protest in San Francisco

Internet Archive’s Archive-It team also created a digital collection on the January 2012 web blackout in protest of the SOPA and PIPA legislation being considered by the US Congress. The collection includes websites related to this protest, both sites participating in the blackout and commentary and news surrounding the event. Thanks to the Library of Congress and other colleagues for their url contributions.

Archive-It Team Encourages Your Contributions To The “Occupy Movement” Collection

Since September 17th, 2011 when protesters descended on Wall Street, set up tents, and refused to move until their voices were heard, an impassioned plea for economic and social equality has manifested itself in similar protests and demonstrations around the world. Inspired by “Occupy Wall Street (OWS)”, these global protests and demonstrations are collectively now being referred to as the “Occupy Movement”.

In an effort to document these historic, and politically and socially charged, events as they unfold, IA’s Archive-It team has recently created an “Occupy Movement” collection to begin capturing information about the movement found online. With blogs communicating movement ideals and demands, social media used to coordinate demonstrations, and news-related websites portraying the movement from a dizzying variety of angles, the presence and representation of the Occupy Movement online is hugely valuable to our understanding of the movement as a whole, while also being constantly in flux and at risk.

The value of the collection hinges on the diversity, depth, and breadth of our seeds and websites we crawl. We are asking and encouraging anyone with websites they feel are important to archive, sites that tell a story about the movement, to pass them along and we will add them to the Occupy Movement collection. These might include movement-wide or city-specific websites, sites with images, blogs, YouTube videos, even Twitter accounts of individuals or organizations involved with the movement. No ideas or additions are too small or too large; perhaps your ideas or suggestions will be a unique part of the movement not yet represented in our collection. IA Archive-It friends and partners are already sending in seeds, which we greatly appreciate.

The web content captured in this collection will be included in the General Archive collection at http://www.archive.org/details/occupywallstreet
which has been actively collecting materials on the Occupy Movement for a few months.

Please send any seeds suggestions, questions, or comments to Graham at graham@archive.org.

Wayback Machine Scheduled Outage Friday through Monday

Update Wednesday 5:45pm: We spent the day fixing hardware and brought up 7 of the machines that were down.  About 90% of the Wayback Machine data is back online now.  We do still have a few machines that don’t want to power up again, and will continue to work on getting that remaining data back online as soon as possible.

—————

Update Monday 12PM PT: We should be up now except for a small % of our data (we have a network switch that didn’t make it through the maintenance). About 80% of the data is accessible right now; we should have the other 20% online again some time tomorrow.

—————

Update Monday 8:30AM PT:  Power to most of the Wayback machine’s storage servers has not been restored yet.   Engineers are working on it, and we will keep you informed via this blog post.

—————-

 

The Wayback Machine will be offline from Friday evening, October 7, through Sunday, October 9, 2011.  We expect the Wayback to be back in service by Monday morning (PST), October 10, 2011.

So, what’s up?  Maintenance is being done on the data center and cooling system where a large percent of the Wayback’s content is stored, and we’ll need to shut off the power there for the duration of the work.

We aren’t making any changes to the Wayback Machine.  When we power back up some time on Sunday, things should just start working again.  If you are seeing any issues with the Wayback on Monday morning (PST), please drop us a note at info at archive dot org.

Volunteer – Help us get 200,000 books on Sunday!

We have a windfall: the Friends of the San Francisco Public Library are offering all the unsold books from their yearly book sale to the Internet Archive if we can pack them up.

The Archive will then move them to our Physical Archive in Richmond, California, scan the ones we do not have already, and ready them for physical preservation.  The duplicates will be donated to a charity that helps direct books to those in need.

 

We need volunteers to:

Pack books into boxes
Sunday, September 25, 2011
from 4 pm until midnight
at Fort Mason Festival Pavilion

We will have snacks and refreshments, music, and mementos for all volunteers! Come help us accept the generous donation of approximately 200,000 books from the Friends of the San Francisco Public Library.

You can arrive at any time between 4 pm and midnight, but the earlier the better!    Fort Mason Festival Pavilion is located behind building E.  There is pay parking in the Fort Mason lot, or park outside of Fort Mason for free.  There will be no heavy lifting! Just book sorting and fun for everyone!

If you are also available to help on Monday and Tuesday (9/26-27), we will be there from 7:30 AM – Midnight both days and would love your assistance!

It will be fun and productive.  And you will be helping to build a library!

PLEASE RSVP on the Facebook event page, or by emailing ginger at archive dot org.

Understanding 9/11: A Television News Archive

We are proud to announce the launch of Understanding 9/11: A Television News Archive, a library of news coverage of the events of 9/11/2001 and their aftermath as presented by U.S. and international broadcasters. A resource for scholars, journalists and the public, the library presents one week (3,000 hours from 20 channels over 7 days) of news broadcasts for study, research and analysis, with select analysis by scholars.

Television is our preeminent medium of information, entertainment and persuasion, but until now it has not been a medium of record. Scholars face great challenges in identifying, locating and adequately citing television news broadcasts in their research. This archive attempts to address this gap by making TV news coverage of this critical week in September 2001 available to those studying these events and their treatment in the media.

Background on the Television Archive

Internet Archive is a non-profit library founded in 1996 that started by attempting to collect every webpage from all websites. This is a major task but it is doable even by a non-profit.

Another medium, television, struck us as historically under-appreciated, despite its tremendous importance. Television is pervasive and persuasive, but it is difficult to access programs for research and analysis.  We felt that TV should be a medium of record, a moniker generally reserved for newspaper publishing. As we learned in high school, to effectively understand we need to be able to ‘compare and contrast’. We need to be able to quote.

Talking with the Library of Congress in 2000 we found that they were not systematically recording TV. Talking with the Federal Broadcast Information Service which was collecting TV for the US intelligence community, we found it would probably be difficult to get the recordings from them for library use. The notable Vanderbilt TV News archive at that time was struggling financially and only captured several hours of television news each night. As a result, we decided to create the Television Archive to help preserve this culturally important medium.

Starting in late 2000, we began collecting Russian, Chinese, Japanese, Iraqi, French, Mexican, British, American, and other stations… 20 channels of TV in DVD quality.

When the events of September 11, 2001 occurred, we, like most Americans, urgently wanted international perspectives on the United States. Stunned by the attacks, we tried to figure out what we could do to help.  Seventy-one people and organizations worked together to get one week of TV News up on the Internet to be launched on October 11, 2001. (Bear in mind this is 3 years before YouTube started.) Launched at the Newseum in Washington DC, we made a website that allowed anyone to research the collection of 20 channels for the week of September 11th.

Today, we are relaunching this collection with an updated interface, marking the occasion with a conference at NYU.

LEARNING FROM RECORDED MEMORY: 9/11 TV News Archive Conference

Co-sponsored by Internet Archive and New York University’s Moving Image Archiving and Preservation Program, Tisch School of the Arts

Wednesday, August 24, 4:00-6:00 pm; reception follows

New York University, Tisch School of the Arts, 721 Broadway, 6th Floor, Michelson Theater, New York, NY 10003

This conference highlights work by scholars using television news materials to help us understand how TV news presented the events of 9/11/2001 and the international response. Our collective recollection of 9/11 and the following days has become inseparable from the televised images we have all seen. But while TV news is inarguably the most vivid and pervasive information medium of our time, it has not been a medium of record. As the number of news outlets increases, research and scholarly access to the thousands of hours of TV news aired each day grows increasingly difficult. Scholars face great challenges in identifying, locating and adequately citing television news broadcasts in their research.

The 9/11 Television News Archive (http://archive.org/details/911) contains 3,000 hours of national and international news coverage from 20 channels over the seven days beginning September 11, plus select analysis by scholars. It is designed to assist scholars and journalists researching relationships between news events and coverage, engaging in comparative and longitudinal studies, and investigating “who said what when.” What kinds of research and scholarship will be enabled by access to an online database of TV news broadcasts? How will emerging TV news studies make use of this service? This conference offers contemporary insights and predictions on new directions in television news studies.

SCHEDULE

4:00:  Welcome: Richard Allen, Chair, Department of Cinema Studies, Tisch School of the Arts, NYU
4:05:  Brewster Kahle, Founder and Digital Librarian at the Internet Archive
4:15:  Brian A. Monahan, Iowa State University
4:25:  Deborah Jaramillo, Boston University
4:35:  Marshall Breeding, Vanderbilt Television News Archive
4:45:  Mark J Williams, Department of Film and Media Studies, Dartmouth College
4:55:  Carolyn Brown, American University
5:05:  Michael Lesk, Rutgers University
5:15:  Beatrice Choi, New York University
5:25:  Scott Blake, Artist
5:35:  Discussion
6:00:  Reception (Remarks by Dennis Swanson, President of Station Operations, Fox Television)

SPEAKERS

Welcome: Richard Allen, Chair, Department of Cinema Studies, Tisch School of the Arts, New York University

 

Brewster Kahle, Internet Archive

“Introducing the 9/11 TV News Archive”

Brewster Kahle founded the Internet Archive in 1996 and serves as its Digital Librarian. An entrepreneur and Internet pioneer, Brewster invented the first Internet publishing system and helped put newspapers and publishers online in the 1990s.

 

Brian A. Monahan, Iowa State University

“Mediated Meanings and Symbolic Politics: Exploring the Continued Significance of 9/11 News Coverage”

In-depth analysis of television news coverage of the September 11 attacks and their aftermath reveals how these events were fashioned into “9/11,” the politically and morally charged signifier that has profoundly shaped public perception, policy and practice in the last decade.  The central argument is that patterned representations of 9/11 in news media and other arenas fueled the transformation of September 11 into a morality tale centered on patriotism, victimization and heroes.  The resulting narrow and oversimplified public understanding of 9/11 has dominated public discourse, obscured other interpretations and marginalized debate about the contextual complexities of these events. Understanding how and why the coverage took shape as it did yields new insights into the social, cultural and political consequences of the attacks, while also highlighting the role of news media in the creation, affirmation and dissemination of meanings in modern life.

Brian Monahan has extensively researched news coverage of 9/11, resulting in a number of scholarly presentations and a book, The Shock of the News: Media Coverage and the Making of 9/11 (2010, NYU Press).

 

 

Deborah Jaramillo, Boston University

“Fighting Ephemerality: Seeing TV News through the Lens of the Archive”

The experience of watching the news on TV as events unfold is often complicated by the space of exhibition — typically, the domestic space. When hour upon hour of news is catalogued and archived — placed in a space of focused study — the news and the experience become altogether different. What was meant to be ephemeral acquires permanence, and what is usually a short-term viewing experience becomes a rigorous, frame-by-frame examination. In this presentation I will discuss how the archive challenges researchers to adopt new ways of seeing and explaining TV news.

Deborah L. Jaramillo is Assistant Professor in the Department of Film and Television, Boston University.

 

Marshall Breeding, Vanderbilt Television News Archive

“An Overview of the Vanderbilt Television News Archive”

Marshall Breeding will give a brief overview of the Vanderbilt Television News Archive and how it carries out its mission to preserve and provide access to US national television news.   He will relate the incredibly diverse kinds of use that the archive receives, including: academic scholarly research; individuals seeking coverage of themselves or family members that may have appeared on the news in life-changing events; those needing historic footage for current journalism, documentaries or other creative works; or corporations or non-profits researching news coverage of their vested topics.  Breeding will also outline some of the constraints it faces in how it provides access to its collection.

Marshall Breeding is the Executive Director of the Vanderbilt Television News Archive and the Director for Innovative Technology and Research for the Vanderbilt University Library.

 

Mark J. Williams, Department of Film and Media Studies, Dartmouth College

“Media Ecology and Online News Archives”

Online TV news archives are a crucial digital resource to facilitate the awareness of and critical study of Media Ecology. The 9/11 TV News Archive will fundamentally enhance our capacity for the study of historical TV newscasts. Two significant research and teaching outcomes for this area of study are A) to better understand the role of television news regarding the mediation of society and its popular memory, and B) to underscore the significance of television news to the goal of an informed citizenry. The 9/11 TV News Archive will enhance and ensure the continued study of the indelible tragic events and aftermath of 9/11, and make possible new interventions within journalism history and media history, via online capacities for access and collaboration.

Mark J. Williams is Associate Professor in the Department of Film and Media Studies, Dartmouth College.

 

Carolyn Brown, American University

“Documentation and Access: A Latino/a Studies Perspective on Using Video Archives”

This talk will explore the possibilities and potential of using accessible video news archives in two areas: immigration research in the field of communication and documentary journalism. I will speak of the significance of video news archives in my current film, The Salinas Project, and discuss my continuing research on Latino/as and immigration in the news.

Carolyn Brown is Assistant Professor in the School of Communication and Journalism at American University. She produced daily news shows for MSNBC News and Fox News Channel, and has worked as a producer and senior producer in local news in San Francisco, Washington, D.C., and Phoenix.

 

Michael Lesk, Rutgers University

“Image Analysis for Media Study”

Focusing on television news coverage of the 9/11 attacks, this talk will outline strategies for automatic quantitative analysis of television news imagery.

After receiving a PhD degree in Chemical Physics in 1969, Michael Lesk joined the computer science research group at Bell Laboratories, where he worked until 1984. From 1984 to 1995 he managed the computer science research group at Bellcore, then joined the National Science Foundation as head of the Division of Information and Intelligent Systems, and since 2003 has been Professor of Library and Information Science at Rutgers University, and chair of that department 2005-2008. He is best known for work in electronic libraries, and his book “Practical Digital Libraries” was published in 1997 by Morgan Kaufmann and the revision “Understanding Digital Libraries” appeared in 2004.  He is a Fellow of the Association for Computing Machinery, received the Flame award from the Usenix association, and in 2005 was elected to the National Academy of Engineering. He chairs the NRC Board on Research Data and Information.

 

Beatrice Choi, New York University

“Live Dispatch: The Ethics of Audio Vision Media Coverage in Trauma and the Legacy of Sound from Shell Shock to 9/11”

What experiential narratives—sensory, aesthetic and political—are invisible to those exposed to traumatic events? Considering September 11, 2001, the media coverage of the event is predominantly visual. People drift in and out of news footage, covered in dust and ash as they exclaim that witnessing the attacks was like watching a movie. In contrast, the wailing of sirens, the staccato thud of feet running from the stricken towers, and the chaotic overlap of voices break through—sometimes even swallow—the visual narratives spun for 9/11. For contemporary American traumatic events, this raises the question of how porous the sensory modalities are in experiencing and remembering shock. How, after all, do sensory representations of traumatic events leave in/visible marks on documentation? I address these questions by exploring sound as an alternate modality, evoking a different level of traumatic indexicality. First, I draw attention to the sensory discrepancy between audio and visual content dispersed for American traumatic events, taking 9/11 as the focal event. By investigating the most highly represented media vehicles in the event—television and radio—I delve into a critical visual-acoustic analysis, looking specifically at FDNY radio transmissions and NY1 Aircheck news footage. Finally, I examine the discursive legacy sound imparts in moments of American crisis from shell shock accounts in the late 19th – 20th century to post-9/11 narratives of post-traumatic symptoms. In delineating this legacy, I hope to reveal the ways in which these documented discourses evolve past preconceived sensory boundaries in the experience of trauma.

Beatrice Choi is an NYU MA Graduate from the Media Culture Communication program. She has worked with the 9/11 archives for a year as a Moving Imagery Exhibitions Intern at the National September 11 Memorial & Museum, and recently completed a thesis on Post-Traumatic Landscapes, focusing primarily on post-Katrina New Orleans.

 

Scott Blake, artist

“9/11 Flipbook and Quantitative Media Study”

Scott Blake has created a flipbook consisting of images of United Airlines Flight 175 crashing into the south tower of the World Trade Center. Accompanying the images are essays written by a wide range of participants, each expressing their personal experience of the September 11th attacks. In addition, the authors of the essays were asked to reflect on, and respond to, the flipbook itself. Not surprisingly, the majority of the essayists experienced the events through news network footage. Blake is distributing his 9/11 Flipbooks to encourage a constructive dialog regarding the media’s participation in sensationalizing the tragedy. To further illustrate his point, Blake conducted a media study using the 9/11 TV News Archive to count the number of times major news networks showed the plane crashes, building collapses and people falling from the towers on September 11, 2001.

While best known for his Barcode Art, Scott Blake has created new works that are scandalous, witty, fun, pornographic, humorous and about a thousand other adjectives viewers might use when seeing them for the first time. A self-described “frivolous artist,” he mows over conceptual and visual boundaries to make work that is as thought provoking as it is entertainingly tongue-in-cheek.

RECEPTION

Remarks by Dennis Swanson, President of Station Operations, Fox Television

THANKS TO

We thank the many people at New York University and Internet Archive who have helped to make this conference possible.