Moving Image Archive: New Tools and Digitization Work with the Educational Films Collection

As part of the effort to provide Universal Access to All Knowledge, the Internet Archive has been actively involved in digitizing and curating the world’s audiovisual heritage. Our collections range from films produced or distributed by the US Government to educational films on scientific, historical and civic topics used in classrooms throughout the twentieth century. Our Moving Image Archive also hosts collections of significant regional and topical interest, including California Light and Sound of the California Audiovisual Preservation Project and the Prelinger Archives with particular strengths in amateur, industrial and local films.

Non-theatrical motion pictures – meant to be screened outside the typical commercial theater circuit, in schools and for local groups or specialized audiences – provide an essential glimpse into corners of history that would otherwise remain obscure or distorted. The Internet Archive holds a wide array of physical collections in a variety of gauges (8mm, Super 8, 16mm and 35mm) that range from home movies and stock footage to documentaries and entire teaching film collections, totaling tens of thousands of reels. We aim to scan as many of these films as possible and offer them in a variety of file formats with rich descriptions and links to further resources, always emphasizing access and welcoming user comments and contributions to our metadata. We invite visitors to be part of our mission to gather “Visible Evidence” of our past and present.

Key to this effort has been our work on a corpus of educational films, the majority of which were produced for K-12 and college-aged students from the 1940s to the 1970s. They include films from significant collections and repositories on psychology (e.g. the Psychological Cinema Registry), science (Encyclopedia Cinematographica) and art. Our in-house digitization process involves scanning these films at a high resolution (2K where possible), presenting them in a separate, well-curated collection on our site, and working on ways to make them more easily discoverable and more useful to our visitors. Some tools in progress include:

- Voice transcription that generates a text file through which a film’s voice-over and dialogue can be searched (for an example from a recent experiment see here).

- Links on each film page to online resources, both within the Internet Archive’s book and journal collections and from our partners at the Media History Project.

- Rich metadata descriptions sourced from the physical collections, educational film catalogues, relevant journals and databases.

Rather than aim for preservation-grade copies of our films – a laborious and costly process that often delays or completely prevents user access – we are prioritizing search capabilities, future-proofed video file formats, secure and reliable storage and a user interface that encourages viewing and sharing. Our films will be of interest to educators and researchers, but equally to filmmakers and artists, groups documenting their local history and those who have always wondered how movies became “talkies” or how Norwegian explorer and writer Thor Heyerdahl managed to cross the Pacific on a raft in 1947.

We are always open to comments, suggestions and ideas. Please let us know what films, functionalities and future improvements you would like to see in our film collections. To be a part of the effort of constructing, curating and conserving the world’s largest digital repository of non-theatrical films, you can email the Internet Archive’s film curator at

Dimitrios Latsis

CLIR-Mellon Postdoctoral Fellow in Data Curation for the Visual Studies, Internet Archive